Failure Analysis of SGI XFS File System

Krishna Pradeep Tamma, Shreepadma Venugopalan

Abstract: Commodity file systems expect a fail-stop disk, but today's disks fail in unexpected ways: they exhibit latent sector errors, silent corruption, and transient failures, to name a few. In this paper we study how SGI XFS behaves in the face of such errors. File systems play a key role in handling most data, and hence the failure handling policy of the file system plays a major role in ensuring the integrity of the data. XFS is a highly scalable journaling file system developed by SGI. We analyze the failure handling policy of the XFS file system: we fail reads and writes of various blocks that originate from the file system and analyze the resulting file system behavior. These blocks include the super block, journal header block, commit block, data blocks, and index blocks. We classify our errors according to the extended IRON Taxonomy. We see that XFS is vulnerable to these failures and does not possess a uniform failure handling policy. Further, the file system at times silently corrupts data. We also see that XFS has very little internal redundancy, even for critical structures such as the super block and B+Tree root. Finally, we see that XFS, like most commodity file systems, fails to recover from these failures, putting the data at risk.
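
The idea of failing reads and writes by block type can be illustrated with a small sketch. This is not the software used in the paper; it is a toy in-memory model, and every name in it (FaultInjectingDisk, InjectedIOError, the block type labels) is a hypothetical chosen for illustration. It assumes a layer that sits between the file system and the disk, knows the type of each block, and fails any request whose (operation, block type) pair has been armed.

```python
from dataclasses import dataclass, field
from typing import Dict, Set, Tuple


class InjectedIOError(IOError):
    """Raised when the harness deliberately fails a request."""


@dataclass
class FaultInjectingDisk:
    """Toy in-memory block device that fails selected requests.

    Hypothetical stand-in for a fault-injection layer between a file
    system and the disk; it is not the authors' tool.
    """
    block_size: int = 512
    blocks: Dict[int, bytes] = field(default_factory=dict)
    # Block type label (e.g. "super", "commit") keyed by block number.
    types: Dict[int, str] = field(default_factory=dict)
    # Armed faults as (operation, block type) pairs, e.g. ("read", "super").
    faults: Set[Tuple[str, str]] = field(default_factory=set)

    def _check(self, op: str, blkno: int) -> None:
        # Unlabelled blocks are treated as plain data blocks.
        btype = self.types.get(blkno, "data")
        if (op, btype) in self.faults:
            raise InjectedIOError(
                f"injected {op} failure on {btype} block {blkno}"
            )

    def write(self, blkno: int, payload: bytes) -> None:
        self._check("write", blkno)
        # Pad or truncate the payload to exactly one block.
        padded = payload.ljust(self.block_size, b"\0")
        self.blocks[blkno] = padded[: self.block_size]

    def read(self, blkno: int) -> bytes:
        self._check("read", blkno)
        # Never-written blocks read back as zeros.
        return self.blocks.get(blkno, b"\0" * self.block_size)


def demo() -> str:
    disk = FaultInjectingDisk()
    disk.types[0] = "super"
    disk.write(0, b"XFSB")             # succeeds: no fault armed yet
    disk.faults.add(("read", "super"))  # arm a read failure on the super block
    try:
        disk.read(0)
    except InjectedIOError as exc:
        return str(exc)
    return "no failure"
```

In a harness of this shape, the interesting observation is what the caller does after the injected error: whether it detects the failure, retries, propagates the error, or continues silently.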

Available as: PostScript or PDF

Click here to download our software.