Overview
Device drivers are a major source of complexity, unreliability, and cost for modern operating systems. As evidence, drivers account for the majority of system crashes: Microsoft reports that 89% of Windows XP crashes are caused by device drivers, and Linux driver code had up to seven times the bug density of other kernel code. The objective of this research is to improve device drivers by (1) reducing the complexity and cost of implementing device drivers, (2) improving the fault tolerance of device drivers, and (3) improving the performance of device drivers on modern hardware and software architectures.
Driver Static Analysis
Static analysis provides a useful technique for understanding and manipulating large bodies of driver code. We built the DriverSlicer tool, using CIL, to analyze and modify driver code. The tool was originally developed as part of the microdriver project described below, but we have since applied it to two more projects.
Carburizer
Hardware devices can fail, but many drivers assume they do not. When confronted with real devices that misbehave, these assumptions can lead to driver or system failures. Such bugs cannot easily be detected by regular stress testing because the failures are induced by the device and not the software load. We built Carburizer, a code-manipulation tool and associated runtime that improves system reliability in the presence of faulty devices. Carburizer analyzes driver source code to find locations where the driver incorrectly trusts the hardware to behave. Carburizer identified almost 1000 such bugs in Linux drivers with a false positive rate of less than 8 percent. With the aid of shadow drivers for recovery, Carburizer can automatically repair 840 of these bugs with no programmer involvement.
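As an illustration of this kind of misplaced trust, consider a loop that polls a device status register until a bit is set. The sketch below is an assumption made for this page, not output from Carburizer itself: the names STATUS_READY, MAX_POLLS, wait_ready_unbounded, and wait_ready_bounded are hypothetical, and a plain memory word stands in for the device register. The first function hangs if the device never responds; the second bounds the wait and reports the failure so the caller can log it or start recovery.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical status bit and retry budget, chosen only for illustration. */
#define STATUS_READY 0x01u
#define MAX_POLLS    1000u

/* Trusting pattern: never returns if a faulty device leaves the bit clear. */
void wait_ready_unbounded(volatile const uint32_t *status_reg)
{
    assert(status_reg != NULL);
    while ((*status_reg & STATUS_READY) == 0) {
        /* busy-wait with no exit condition */
    }
}

/* Hardened pattern: stop after MAX_POLLS reads and report the failure. */
bool wait_ready_bounded(volatile const uint32_t *status_reg)
{
    assert(status_reg != NULL);
    for (uint32_t i = 0; i < MAX_POLLS; i++) {
        if ((*status_reg & STATUS_READY) != 0)
            return true;
    }
    return false; /* caller can log the fault or start recovery */
}
```

A real driver would likely bound the wait by elapsed time rather than by iteration count; the fixed count here only keeps the sketch self-contained.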
Driver Study
We study the source code of Linux drivers to understand what drivers actually do, how current research applies to them, and what opportunities exist for future research. We develop a set of static-analysis tools to analyze driver code along several axes. We find that many assumptions made by driver research do not apply to all drivers. At least 44% of drivers have code that is not captured by a class definition, 28% of drivers support more than one device per driver, and 15% of drivers do significant computation over data. From the driver interactions study, we find that the USB bus offers an efficient bus interface with significant standardized code and coarse-grained access, ideal for executing drivers in isolation. We also find that drivers for different buses and classes have widely varying levels of device interaction, which indicates that the cost of isolation will vary by class. Finally, from our driver similarity study, we find that 8% of all driver code is substantially similar to code elsewhere and may be removed with new abstractions or libraries.
Microdrivers and DriverSlicer
A major difficulty in writing kernel driver code is the many unenforced rules required by the kernel. Some examples from Windows include:
- Functions that block may not be called at high priority levels or deadlock may occur.
- Locks provide mutual exclusion above a certain priority level but not below. For example, KeAcquireSpinLockForDpc synchronizes all callers except interrupt handlers.
- Code executing at high priority may not access pageable memory, because page faults cannot be satisfied.
- Addresses passed from applications are only accessible when executing on a thread from the application's process but not on kernel worker threads, such as during a timer callback.
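To make the first and third rules concrete, here is a small model of how such a rule could be checked at run time. It is a sketch under stated assumptions, not Windows kernel code: the enum reuses the names and values of the three lowest Windows interrupt request levels, while struct call_site, may_block, may_touch_pageable, and call_is_legal are hypothetical helpers invented for this example.

```c
#include <stdbool.h>

/* The three lowest Windows priority levels (IRQLs); higher values preempt lower ones. */
enum irql { PASSIVE_LEVEL = 0, APC_LEVEL = 1, DISPATCH_LEVEL = 2 };

/* Hypothetical description of one call made by a driver. */
struct call_site {
    enum irql level;       /* level the caller is running at */
    bool blocks;           /* the call may wait */
    bool touches_pageable; /* the call may fault on paged memory */
};

/* Blocking is only safe below DISPATCH_LEVEL. */
bool may_block(enum irql level)
{
    return level < DISPATCH_LEVEL;
}

/* Page faults can only be satisfied below DISPATCH_LEVEL. */
bool may_touch_pageable(enum irql level)
{
    return level < DISPATCH_LEVEL;
}

/* A call is legal when it respects both rules at its current level. */
bool call_is_legal(struct call_site c)
{
    if (c.blocks && !may_block(c.level))
        return false;
    if (c.touches_pageable && !may_touch_pageable(c.level))
        return false;
    return true;
}
```

The kernel does not perform checks like this on every call, which is why violations tend to surface only as deadlocks or crashes.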
We developed a novel hybrid approach to building drivers that provides both high performance and compatibility. Rather than execute all driver code at user level, we propose to extract a kernel-level microdriver from existing driver code. The microdriver contains only the code required for high performance and to satisfy OS requirements. We convert the remaining code to a userdriver that executes in a user-level process. Shared data is marshaled and copied between the two portions on function calls. To maintain compatibility with existing code, we created DriverSlicer, a tool to semi-automatically partition drivers.
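The marshaling step can be sketched as follows. This is a minimal illustration under stated assumptions, not code generated by DriverSlicer: struct nic_state, its fields, and every function name are hypothetical, and both halves run in one process here, so an ordinary function call stands in for the kernel-to-user transition. The stub copies shared state into a flat buffer, rebuilds a private copy for the user-level code, and copies any changes back on return.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical driver state shared by the kernel and user portions. */
struct nic_state {
    uint32_t mtu;
    uint32_t link_up;
    uint8_t  mac[6];
};

#define NIC_STATE_WIRE_SIZE (sizeof(uint32_t) + sizeof(uint32_t) + 6)

/* Copy each field into a flat buffer; returns bytes written, or 0 if buf is too small. */
size_t marshal_nic_state(const struct nic_state *s, uint8_t *buf, size_t len)
{
    assert(s != NULL && buf != NULL);
    if (len < NIC_STATE_WIRE_SIZE)
        return 0;
    memcpy(buf, &s->mtu, sizeof s->mtu);
    memcpy(buf + 4, &s->link_up, sizeof s->link_up);
    memcpy(buf + 8, s->mac, sizeof s->mac);
    return NIC_STATE_WIRE_SIZE;
}

/* Rebuild the structure from the flat buffer; returns 0 on success, -1 on short input. */
int unmarshal_nic_state(struct nic_state *s, const uint8_t *buf, size_t len)
{
    assert(s != NULL && buf != NULL);
    if (len < NIC_STATE_WIRE_SIZE)
        return -1;
    memcpy(&s->mtu, buf, sizeof s->mtu);
    memcpy(&s->link_up, buf + 4, sizeof s->link_up);
    memcpy(s->mac, buf + 8, sizeof s->mac);
    return 0;
}

/* Stand-in for a user-level driver function: it only sees its private copy. */
void user_set_mtu(struct nic_state *copy, uint32_t mtu)
{
    copy->mtu = mtu;
}

/* Kernel-side stub: marshal, "call" the user-level code, then copy updates back. */
int call_user_set_mtu(struct nic_state *kernel_state, uint32_t mtu)
{
    uint8_t wire[NIC_STATE_WIRE_SIZE];
    struct nic_state user_copy;

    if (marshal_nic_state(kernel_state, wire, sizeof wire) == 0)
        return -1;
    if (unmarshal_nic_state(&user_copy, wire, sizeof wire) != 0)
        return -1;

    user_set_mtu(&user_copy, mtu);

    /* Return path: propagate the user-level changes to the kernel copy. */
    if (marshal_nic_state(&user_copy, wire, sizeof wire) == 0)
        return -1;
    return unmarshal_nic_state(kernel_state, wire, sizeof wire);
}
```

Copying whole structures in both directions keeps the sketch short; a real partitioning tool would have reason to copy only the fields a given function reads or writes.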
Decaf Drivers
Decaf Drivers takes a best-effort approach to simplifying driver development by allowing most driver code to be written at user level in languages other than C. Decaf Drivers sidesteps many of the above problems by leaving code that is critical to performance or compatibility in the kernel in C. All other code can move to user level and to another language; we use Java for our implementation, as it has rich tool support for code generation, but the architecture does not depend on any Java features. The Decaf architecture provides common-case performance comparable to kernel-only drivers, but reliability and programmability improve as large amounts of driver code can be written in Java at user level.
Nooks
Nooks is a reliability subsystem that seeks to greatly enhance OS reliability by isolating the OS from driver failures. The Nooks approach is practical: rather than guaranteeing complete fault tolerance through a new (and incompatible) OS or driver architecture, our goal is to prevent the vast majority of driver-caused crashes with little or no change to existing driver and system code. To achieve this, Nooks isolates drivers within lightweight protection domains inside the kernel address space, where hardware and software prevent them from corrupting the kernel. Nooks also tracks a driver's use of kernel resources to hasten automatic clean-up during recovery.
Shadow Drivers
We extended Nooks with shadow drivers to recover from driver failures. A shadow driver is a kernel agent that (1) conceals a driver failure from its clients, including the operating system and applications, and (2) transparently restores the driver back to a functioning state. In this way, applications and the operating system are unaware that the driver failed, and hence continue executing correctly themselves.
Support
This work is supported in part by National Science Foundation (NSF) grants CNS-0915363 and CNS-0745517 and a grant from Google.
Publications
- Asim Kadav, Matthew J. Renzelmann, Michael M. Swift. Fine-Grained Fault Tolerance using Device Checkpoints. In ASPLOS '13: Proceedings of the 18th International Conference on Architectural Support for Programming Languages and Operating Systems, Houston, Texas, March 16-20, 2013.
- Matthew J. Renzelmann, Asim Kadav, and Michael M. Swift. SymDrive: Testing Drivers without Devices. In OSDI '12: Proceedings of the 12th Symposium on Operating System Design and Implementation, October 2012.
- Asim Kadav and Michael M. Swift. Understanding modern device drivers, in ASPLOS '12: Proceedings of the 17th International Conference on Architectural Support for Programming Languages and Operating Systems, March 2012.
- Asim Kadav, Matthew J. Renzelmann and Michael M. Swift. Tolerating Hardware Device Failures in Software. In Proceedings of the Symposium on Operating Systems Principles, Oct. 2009.
- Matthew J. Renzelmann and Michael M. Swift. Decaf: Moving Device Drivers to a Modern Language. In Proceedings of the USENIX Annual Technical Conference, June 2009.
- Vinod Ganapathy, Matthew Renzelmann, Arini Balakrishnan, Michael Swift and Somesh Jha. The Design and Implementation of Microdrivers, in Proceedings of the 13th International Conference on Architectural Support for Programming Languages and Operating Systems, Seattle, WA, March 2008.
- Vinod Ganapathy, Arini Balakrishnan, Michael M. Swift, and Somesh Jha. Microdrivers: A New Architecture for Device Drivers, in Proceedings of the 11th Workshop on Hot Topics in Operating Systems, San Diego, California, May 2007.
- Asim Kadav, Michael M. Swift. Live Migration of Direct-Access Devices. In Operating Systems Review, 43(3), Jul. 2009.
- Asim Kadav, Michael M. Swift. Live Migration of Direct-Access Devices. In Proceedings of the Workshop on I/O Virtualization (WIOV), Dec. 2008.
- Michael M. Swift, Damien Martin-Guillerez, Muthukaruppan Annamalai, Brian N. Bershad and Henry M. Levy. Live Update for Device Drivers, Univ. of Wisconsin Computer Sciences Technical Report CS-TR-2008-1634, Mar. 2008.
- Michael Swift, Muthukaruppan Annamalai, Brian N. Bershad, Henry M. Levy. Recovering Device Drivers, in ACM Transactions on Computer Systems, 24(4), Nov. 2006.
- Michael Swift, Muthukaruppan Annamalai, Brian N. Bershad, Henry M. Levy. Recovering Device Drivers, in Proceedings of the 6th ACM/USENIX Symposium on Operating Systems Design and Implementation, San Francisco, CA, Dec. 2004. Best paper award.
- Michael M. Swift. Device Driver Reliability, chapter in The Handbook of Research on Advanced Operating Systems and Kernel Applications: Techniques and Technologies, edited by Yair Waisman and Song Jiang, 2009.
- Michael Swift. Improving the Reliability of Commodity Operating Systems, Ph.D. Dissertation, Oct. 2005.
- Michael Swift, Brian N. Bershad, and Henry M. Levy. Improving the Reliability of Commodity Operating Systems, in ACM Transactions on Computer Systems, 23(1), Feb. 2005.
- Michael Swift, Brian N. Bershad, and Henry M. Levy. Improving the Reliability of Commodity Operating Systems, in Proceedings of the 19th ACM Symposium on Operating Systems Principles, Bolton Landing, NY, Oct. 2003. Best paper award.
- Michael Swift, Steven Martin, Henry M. Levy, and Susan J. Eggers. Nooks: an architecture for reliable device drivers, in Proceedings of the Tenth ACM SIGOPS European Workshop, Saint-Emilion, France, Sept. 2002.
Michael M. Swift
Professor
Computer Sciences Department
College of Letters and Sciences
University of Wisconsin, Madison
Contact Information
608-890-0131
swift at cs dot wisc dot edu
7369 Computer Sciences
Computer Sciences Department
University of Wisconsin-Madison
1210 West Dayton Street
Madison, WI 53706-1685 USA