Emulating Goliath Storage Systems with David

Nitin Agrawal, NEC Laboratories America
Leo Arulraj, Andrea C. Arpaci-Dusseau, Remzi H. Arpaci-Dusseau,
Department of Computer Sciences, University of Wisconsin-Madison


Benchmarking file and storage systems on large file-system images is important, but difficult and often infeasible. Running benchmarks on such large disk setups is a frequent source of frustration for file-system evaluators; the scale alone acts as a strong deterrent against using larger, albeit realistic, benchmarks. To address this problem, we develop David: a system that makes it practical to run large benchmarks using the modest storage or memory capacity readily available on most computers. David creates a "compressed" version of the original file-system image by omitting all file data and laying out metadata more efficiently; an online storage model determines the runtime of the benchmark workload on the original uncompressed image. David works under any file system, as demonstrated in this paper with ext3 and btrfs. We find that David reduces storage requirements by orders of magnitude; David is able to emulate a 1 TB target workload using only an 80 GB available disk, while still modeling the actual runtime accurately. David can also emulate newer or faster devices; for example, we show how David can effectively emulate a multi-disk RAID using a limited amount of memory.
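
The two ideas in the abstract, dropping file data while keeping metadata in a compact layout and charging each request to a model of the target device, can be illustrated with a toy sketch. This is not David's implementation: the class names, the is_metadata flag supplied by the caller, and the seek and transfer costs are all hypothetical, chosen only to make the idea concrete.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

BLOCK_SIZE = 4096  # assumed block size for this illustration


@dataclass
class ToyDiskModel:
    """Hypothetical latency model of the large target disk (not David's model)."""
    seek_ms: float = 8.0        # assumed cost of a non-sequential access
    transfer_ms: float = 0.05   # assumed cost of transferring one block
    head: Optional[int] = None  # last target block accessed, if any
    elapsed_ms: float = 0.0     # modeled runtime on the target disk

    def access(self, block: int) -> None:
        # Charge a seek unless this block directly follows the previous one.
        if self.head is None or block != self.head + 1:
            self.elapsed_ms += self.seek_ms
        self.elapsed_ms += self.transfer_ms
        self.head = block


@dataclass
class CompressedStore:
    """Keeps metadata blocks only, packed densely; data blocks are dropped."""
    model: ToyDiskModel = field(default_factory=ToyDiskModel)
    remap: Dict[int, int] = field(default_factory=dict)  # target block -> slot
    slots: List[bytes] = field(default_factory=list)     # stored metadata

    def write(self, block: int, payload: bytes, is_metadata: bool) -> None:
        # Timing is always modeled against the original (target) block address.
        self.model.access(block)
        if not is_metadata:
            return  # file data is omitted from the compressed image
        if block in self.remap:
            self.slots[self.remap[block]] = payload
        else:
            self.remap[block] = len(self.slots)
            self.slots.append(payload)

    def read(self, block: int) -> bytes:
        self.model.access(block)
        slot = self.remap.get(block)
        if slot is None:
            return bytes(BLOCK_SIZE)  # synthetic content for omitted data
        return self.slots[slot]


store = CompressedStore()
store.write(1_000_000, b"inode", is_metadata=True)
store.write(1_000_001, b"x" * BLOCK_SIZE, is_metadata=False)
store.write(5_000_000, b"dir", is_metadata=True)
print(len(store.slots), "blocks stored;", store.model.elapsed_ms, "ms modeled")
```

In this sketch the backing store grows only with the number of metadata blocks written, while the modeled time depends on the target block addresses, which is the separation the abstract describes between the compressed image and the storage model.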

Full Paper: Abstract, PostScript, PDF, BibTeX
Talk Slides: PowerPoint
Talk Video: Hosted by USENIX | Hosted by YouTube
Winner: Best Paper Award