Improving Content Addressable Storage for Databases

Vandhana Selvaprakash, Brandon M. Smith

Department of Computer Sciences, University of Wisconsin-Madison



Abstract

Database clients increasingly use content addressable storage systems (CASs) for near-line storage. While CASs have many attractive properties, such as storage space savings, data integrity, and low network bandwidth requirements, they are not well suited to databases, mainly because of the rigid structure of databases and the way databases intersperse metadata with data.

In this paper, we evaluate where current CAS techniques fail for databases and identify properties of database systems that can be leveraged to improve CAS techniques for databases specifically. We propose three ways in which CASs can be made database-aware, and evaluate the potential strengths and weaknesses of each approach. We find that our techniques improve memory savings, but at the expense of coupling the solution too closely to particular database vendors.


Available as: PDF

Click here to download our software.

Click here to download our PowerPoint slides.