Enforcing Murphy’s Law for Advance Identification of Run-time Failures

This research was conducted by Zach Miller, Todd Tannenbaum, and Ben Liblit. The paper appeared at the 2012 USENIX Annual Technical Conference (USENIX ATC 2012).

Abstract

Applications do not typically view the kernel as a source of bad input. However, the kernel can behave in unusual (yet permissible) ways for which applications are badly unprepared. We present Murphy, a language-agnostic tool that helps developers discover and isolate run-time failures in their programs by simulating difficult-to-reproduce but completely-legitimate interactions between the application and the kernel. Murphy makes it easy to enable or disable sets of kernel interactions, called gremlins, so developers can focus on the failure scenarios that are important to them. Gremlins are implemented using the ptrace interface, intercepting and potentially modifying an application’s system call invocation while requiring no invasive changes to the host machine.
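To make the idea of a "permissible but unusual" kernel interaction concrete, here is a minimal sketch of one such behavior: a read that returns fewer bytes than requested, which the read() system call is allowed to do. This is not Murphy's implementation. Murphy intercepts real system calls via ptrace, whereas this sketch wraps an in-process Python read function. The names make_short_read_gremlin and read_exactly are hypothetical and chosen only for illustration.

```python
import io
import random


def make_short_read_gremlin(read, rng):
    """Wrap read(n) so that it may return fewer bytes than requested.

    Hypothetical in-process stand-in for a gremlin. A short read is a
    legitimate outcome of read(), so correct callers must tolerate it.
    """
    def gremlin_read(n):
        if n <= 1:
            return read(n)
        # Ask the underlying reader for a random, possibly smaller, amount.
        return read(rng.randint(1, n))
    return gremlin_read


def read_exactly(read, n):
    """Keep reading until n bytes have arrived or end of input is reached."""
    chunks = []
    remaining = n
    while remaining > 0:
        chunk = read(remaining)
        if not chunk:  # empty result means end of input
            break
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


payload = bytes(range(256)) * 4
rng = random.Random(0)

# A caller that assumes one read returns everything may get truncated data.
naive = make_short_read_gremlin(io.BytesIO(payload).read, rng)(len(payload))

# A caller that loops recovers the full payload despite the short reads.
robust = read_exactly(
    make_short_read_gremlin(io.BytesIO(payload).read, rng), len(payload)
)
```

The naive caller receives only a prefix of the payload whenever the gremlin shortens the read, while the looping caller always reassembles all of it. Code that works only in the naive style is the kind of latent failure this class of testing is meant to expose.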

We show how to use Murphy in a variety of modes to find different classes of errors, present examples of the kernel interactions that are tested, and explain how to apply delta debugging techniques to isolate the code causing the failure. While our primary goal was the development of a tool to assist in new software development, we successfully demonstrate that Murphy also has the capability to find bugs in hardened, widely-deployed software.
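As a rough illustration of the delta debugging idea mentioned above, the following sketch applies a simplified ddmin-style reduction to a list of candidate system call indices at which a gremlin might fire. It is a generic version of the technique, not necessarily the procedure Murphy uses. The predicate fails is a hypothetical stand-in: in practice it would mean rerunning the application with gremlins active only at the given calls and checking whether the failure recurs.

```python
def ddmin(items, fails):
    """Return a 1-minimal sublist of items for which fails() is still True.

    Simplified ddmin-style reduction: split the list into chunks, try each
    chunk and its complement, and refine the granularity when neither helps.
    """
    assert fails(items), "the full input must reproduce the failure"
    n = 2  # current number of partitions
    while len(items) >= 2:
        chunk = max(1, len(items) // n)
        subsets = [items[i:i + chunk] for i in range(0, len(items), chunk)]
        reduced = False
        for i, subset in enumerate(subsets):
            complement = [x for j, s in enumerate(subsets) if j != i for x in s]
            if fails(subset):
                # The failure lives entirely inside this chunk.
                items, n, reduced = subset, 2, True
                break
            if fails(complement):
                # This chunk is irrelevant to the failure; drop it.
                items, n, reduced = complement, max(n - 1, 2), True
                break
        if not reduced:
            if n >= len(items):
                break  # already at single-element granularity
            n = min(len(items), n * 2)
    return items


# Hypothetical scenario: the application fails only when gremlins fire at
# both the 4th and the 18th intercepted call (indices 3 and 17).
def fails(active_calls):
    return 3 in active_calls and 17 in active_calls


culprits = ddmin(list(range(20)), fails)
```

In this toy scenario the reduction narrows twenty candidate calls down to the two that jointly trigger the failure, which is the information a developer needs to locate the responsible code.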

Full Paper

The full paper is available as a single PDF document. A suggested BibTeX citation record is also available.

See also the related technical report.

Implementation

Visit the Murphy repository on GitHub to work with the Murphy source code. Start by reading the instructions for building and running Murphy. You may also download zip or tar archives of the source code for the latest version or any official Murphy release. Note: Murphy is currently implemented only for 64-bit Linux.