Finding Error-Handling Bugs in Systems Code Using Static Analysis

This research was conducted by Cindy Rubio González and Ben Liblit. The paper appeared in the PhD Forum of the 2011 Grace Hopper Celebration of Women in Computing (GHC 2011).

Abstract

Run-time errors are unavoidable whenever software interacts with the physical world. Unchecked errors are especially pernicious in operating system file management code. Transient or permanent hardware failures are inevitable, and error-management bugs at the file system layer can cause silent, unrecoverable data corruption. Furthermore, even when developers have the best of intentions, inaccurate documentation can mislead programmers and cause software to fail in unexpected ways.

We use static program analysis to understand error handling in large systems and to make it more reliable. We apply our analyses to numerous Linux file systems and drivers, finding hundreds of confirmed error-handling bugs that could lead to serious problems such as system crashes, silent data loss and corruption.
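As an illustration only, the kind of defect described above might look like the following minimal C sketch. It is not taken from the paper or from any Linux source file: the functions write_block, sync_buggy, and sync_checked are hypothetical names invented for this example, and the "device" is simulated by a flag. The sketch assumes the common kernel convention of returning a negative errno value (here -EIO) on failure.

```c
#include <errno.h>

/* Hypothetical stand-in for a low-level block write.
 * Returns 0 on success, or -EIO when the simulated device is faulty. */
static int write_block(int device_ok)
{
    return device_ok ? 0 : -EIO;
}

/* Buggy caller: the error code from write_block is discarded,
 * so this function reports success even when the write failed. */
static int sync_buggy(int device_ok)
{
    write_block(device_ok);   /* return value silently dropped */
    return 0;
}

/* Corrected caller: the error code is checked and propagated upward. */
static int sync_checked(int device_ok)
{
    int err = write_block(device_ok);
    if (err < 0)
        return err;           /* let the caller see the failure */
    return 0;
}
```

With a faulty device, sync_buggy(0) returns 0 while sync_checked(0) returns -EIO. In the buggy version the failed write is invisible to every caller, which is how a hardware fault can turn into silent data loss.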

Full Paper

The full paper is available as a single PDF document. A suggested BibTeX citation record is also available.