Overview
Device drivers are a major source of complexity, unreliability, and
cost for modern operating systems. As evidence, drivers account for
the majority of system crashes: Microsoft reports that 89% of Windows
XP crashes are caused by device drivers, and Linux driver code had up
to seven times the bug density of other kernel code. The objective
of this research is to improve device drivers by (1) reducing the
complexity and cost of implementing device drivers, (2) improving the
fault tolerance of device drivers, and (3) improving the performance
of device drivers on modern hardware and software architectures.
Microdrivers and DriverSlicer
A major difficulty in writing kernel driver code is the many rules
that the kernel imposes but does not enforce. Some examples from
Windows include:
- Functions that block may not be called at high priority levels, or
deadlock may occur.
- Locks provide mutual exclusion above a certain priority level but
not below. For example, KeAcquireSpinLockForDpc synchronizes all
callers except interrupt handlers.
- Code executing at high priority may not access pageable memory,
because page faults cannot be satisfied.
- Addresses passed from applications are only accessible when
executing on a thread from the application's process but not on kernel
worker threads, such as during a timer callback.
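The first and third rules above can be made concrete with a small user-level model. This is only a sketch: the `prio_level` values, `raise_level`, `lower_level`, and `checked_wait` are hypothetical names invented for illustration and are not Windows APIs. The point is that the check shown here is exactly what the kernel leaves to the driver writer.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical priority levels; names and values are illustrative only. */
enum prio_level { PRIO_PASSIVE = 0, PRIO_DISPATCH = 1, PRIO_DEVICE = 2 };

/* Simulated priority level of a single CPU. */
static enum prio_level current_level = PRIO_PASSIVE;

/* Raise the simulated level and return the previous one for later restore. */
enum prio_level raise_level(enum prio_level new_level)
{
    enum prio_level old = current_level;
    assert(new_level >= old);   /* raising must never lower the level */
    current_level = new_level;
    return old;
}

/* Restore a level previously returned by raise_level(). */
void lower_level(enum prio_level old)
{
    assert(old <= current_level);
    current_level = old;
}

/* In this model, blocking and page faults are only safe below dispatch level. */
bool may_block(void)          { return current_level < PRIO_DISPATCH; }
bool may_touch_pageable(void) { return current_level < PRIO_DISPATCH; }

/* A wait that checks the rule the kernel itself leaves unenforced.
 * Returns 0 if the wait would be legal, -1 if it could deadlock. */
int checked_wait(void)
{
    if (!may_block())
        return -1;
    /* A real implementation would block here. */
    return 0;
}
```

A caller that raises the level with `raise_level(PRIO_DISPATCH)` gets -1 from `checked_wait()` until it calls `lower_level()` with the saved value. In a real kernel nothing performs this check, so the mistake surfaces only as a hang or crash.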
Coding at user level simplifies driver development because these rules
do not apply. In addition, many more software engineering tools and
programming languages are available at user level to aid programmers.
There have been attempts to execute driver code in user mode, as in a
microkernel. However, current user-mode techniques suffer from one
of two flaws. User-level driver frameworks that run unmodified
kernel drivers suffer from poor performance because the existing
kernel interface was written assuming fine-grained sharing, trusted
code, and zero-cost invocations. For example, the kernel calls
network drivers once for each packet, rather than batching a set of
packets into a single call. In contrast, user-mode driver frameworks
with good performance require rewriting drivers to a new interface to
avoid these inefficiencies. Furthermore, some user-level driver
systems limit support to devices that do not require DMA or interrupt
handling. Considering the large base of existing drivers,
these problems limit the usefulness of current user-mode driver
frameworks.
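The per-packet example can be illustrated with a toy cost model. The sketch below assumes a fixed cost for each kernel/user crossing and a fixed cost for processing each packet. The structure, function names, and any numbers plugged into it are illustrative assumptions, not measurements from any driver framework.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical cost model, in arbitrary units. */
struct cost_model {
    unsigned long crossing_cost;    /* one kernel/user protection crossing */
    unsigned long per_packet_cost;  /* driver work for a single packet */
};

/* Existing kernel interface: one call, and so one crossing, per packet. */
unsigned long cost_per_packet_calls(const struct cost_model *m, size_t npackets)
{
    return npackets * (m->crossing_cost + m->per_packet_cost);
}

/* Rewritten interface: packets are handed over in batches of `batch`. */
unsigned long cost_batched_calls(const struct cost_model *m,
                                 size_t npackets, size_t batch)
{
    assert(batch > 0);
    /* Round up so a final partial batch still costs one crossing. */
    size_t crossings = (npackets + batch - 1) / batch;
    return crossings * m->crossing_cost + npackets * m->per_packet_cost;
}
```

With a crossing cost of 1000 units and a per-packet cost of 100 units, 64 packets cost 70400 units when sent one call at a time and 10400 units in batches of 16. The gap grows with the ratio of crossing cost to per-packet work, which is why an interface designed around zero-cost invocations performs poorly when each invocation crosses a protection boundary.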
We propose a novel hybrid approach to building drivers that provides
both high performance and compatibility. Rather than execute all
driver code at user level, we propose to extract a kernel-level
microdriver from existing driver code. The microdriver contains only
the code required for high performance and to satisfy OS
requirements. We convert the remaining code to a userdriver that
executes in a user-level process. Shared data is marshaled and copied
between the two portions on function calls. To maintain compatibility
with existing code, we will create DriverSlicer, a tool to
semi-automatically partition drivers.
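As a rough illustration of marshaling shared data on a function call, the sketch below copies a driver structure into a flat buffer, lets a stand-in for the user-level half modify its own copy, and copies the result back. Everything here is hypothetical: `nic_config`, `user_set_mtu`, and `call_user_set_mtu` are invented for the example, both halves run in one process, and the wire format is simply the fields in host byte order.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical driver state shared by the kernel and user halves. */
struct nic_config {
    uint32_t mtu;
    uint32_t link_speed_mbps;
    uint8_t  mac[6];
};

#define NIC_CONFIG_WIRE_SIZE (4 + 4 + 6)

/* Copy the fields into a flat buffer; returns bytes written, 0 if too small. */
size_t marshal_config(const struct nic_config *c, uint8_t *buf, size_t len)
{
    assert(c != NULL && buf != NULL);
    if (len < NIC_CONFIG_WIRE_SIZE)
        return 0;
    memcpy(buf, &c->mtu, 4);
    memcpy(buf + 4, &c->link_speed_mbps, 4);
    memcpy(buf + 8, c->mac, 6);
    return NIC_CONFIG_WIRE_SIZE;
}

/* Rebuild the structure from the flat buffer; returns 0 on success. */
int unmarshal_config(struct nic_config *c, const uint8_t *buf, size_t len)
{
    assert(c != NULL && buf != NULL);
    if (len < NIC_CONFIG_WIRE_SIZE)
        return -1;
    memcpy(&c->mtu, buf, 4);
    memcpy(&c->link_speed_mbps, buf + 4, 4);
    memcpy(c->mac, buf + 8, 6);
    return 0;
}

/* Stand-in for a configuration function that lives in the userdriver. */
void user_set_mtu(struct nic_config *c, uint32_t mtu)
{
    c->mtu = mtu;
}

/* Kernel-side stub: marshal, "cross" to user level, then copy changes back. */
int call_user_set_mtu(struct nic_config *kernel_copy, uint32_t mtu)
{
    uint8_t wire[NIC_CONFIG_WIRE_SIZE];
    struct nic_config user_copy;

    if (marshal_config(kernel_copy, wire, sizeof wire) == 0)
        return -1;
    if (unmarshal_config(&user_copy, wire, sizeof wire) != 0)
        return -1;
    user_set_mtu(&user_copy, mtu);          /* runs on the user-level copy */
    if (marshal_config(&user_copy, wire, sizeof wire) == 0)
        return -1;
    return unmarshal_config(kernel_copy, wire, sizeof wire);
}
```

The user-level function never touches the kernel's structure directly. It sees only a copy, and its changes reach the kernel when the stub copies the data back on return.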
This architecture resembles network routers, in which
dedicated processors perform high-speed switching while complicated
routing and error handling are left to separate control processors. We
leave the code for transmitting data to and from a device in the
microdriver, which ensures high performance. We move the code for
initializing and configuring the device, error handling, and reporting
statistics out of the kernel and into user mode.
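One plausible way to compute such a split is call-graph reachability: mark the performance-critical entry points by hand, keep everything they can reach in the kernel, and move the rest to user level. The sketch below assumes that approach purely for illustration. This page does not describe how DriverSlicer decides, and `call_graph` and `partition` are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_FUNCS 8

/* Hypothetical call graph: calls[a][b] is true when function a calls b. */
struct call_graph {
    int  nfuncs;
    bool calls[MAX_FUNCS][MAX_FUNCS];
};

/* Mark every function reachable from a critical root as kernel-resident.
 * Functions left unmarked are candidates for the user-level half. */
void partition(const struct call_graph *g,
               const bool is_root[MAX_FUNCS],
               bool in_kernel[MAX_FUNCS])
{
    int stack[MAX_FUNCS];
    int top = 0;

    assert(g->nfuncs >= 0 && g->nfuncs <= MAX_FUNCS);

    /* Seed the worklist with the hand-marked roots. */
    for (int f = 0; f < g->nfuncs; f++) {
        in_kernel[f] = is_root[f];
        if (is_root[f])
            stack[top++] = f;
    }

    /* Depth-first traversal; each function is pushed at most once. */
    while (top > 0) {
        int f = stack[--top];
        for (int callee = 0; callee < g->nfuncs; callee++) {
            if (g->calls[f][callee] && !in_kernel[callee]) {
                in_kernel[callee] = true;
                stack[top++] = callee;
            }
        }
    }
}
```

For a network driver, the roots might be the transmit routine and the interrupt handler. A ring-buffer helper they both call would then stay in the kernel, while initialization, configuration, and statistics functions would be left unmarked.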
Decaf Drivers
Decaf Drivers takes a best-effort approach to simplifying driver
development by allowing most driver code to be written at user level
in languages other than C. Decaf Drivers sidesteps many of the above
problems by leaving code that is critical to performance or
compatibility in the kernel in C. All other code can move to user
level and to another language; we use Java for our implementation, as
it has rich tool support for code generation, but the architecture
does not depend on any Java features. The Decaf architecture provides
common-case performance comparable to kernel-only drivers, but
reliability and programmability improve as large amounts of driver
code can be written in Java at user level.
The goal of Decaf Drivers is to provide a clear migration path for
existing drivers to a modern programming language. User-level code can
be written in C initially and converted entirely to Java over
time. Developers can also implement new user-level functionality in
Java.
We implemented Decaf Drivers in the Linux 2.6.18.1 kernel by extending
the Microdrivers infrastructure. Microdrivers provided the mechanisms
necessary to split an existing driver into a user-mode component and a
kernel-mode component. However, both resulting components were written
in C, consisted entirely of preprocessed code, and offered no path to
evolve the driver over time.
The contributions of our work are threefold. First, Decaf Drivers
provides a mechanism for converting the user-mode component of a
microdriver to Java through cross-language marshaling of data
structures. Second, Decaf supports incremental conversion of driver
code from C to Java on a function-by-function basis, which allows a
gradual migration away from C. Finally, the resulting driver code can
be easily modified as the operating system and supported devices
change, through both editing of driver code and modification of the
interface between user and kernel driver portions.
Nooks
Nooks is a reliability subsystem that seeks to greatly enhance OS
reliability by isolating the OS from driver failures. The Nooks
approach is practical: rather than guaranteeing complete fault
tolerance through a new (and incompatible) OS or driver architecture,
our goal is to prevent the vast majority of driver-caused crashes with
little or no change to existing driver and system code. To achieve
this, Nooks isolates drivers within lightweight protection domains
inside the kernel address space, where hardware and software prevent
them from corrupting the kernel. Nooks also tracks a driver's use of
kernel resources to hasten automatic clean-up during recovery.
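Resource tracking of this kind can be sketched as a table of objects allocated on a driver's behalf, which a recovery routine walks to release whatever the failed driver still holds. The code below is a user-level illustration with hypothetical names (`tracker`, `tracked_alloc`, `tracked_free`, `recover`). It uses `malloc` in place of kernel allocators and does not reflect Nooks' actual data structures.

```c
#include <assert.h>
#include <stdlib.h>

#define MAX_TRACKED 16

/* Hypothetical per-driver table of live allocations. */
struct tracker {
    void *objs[MAX_TRACKED];
    int   count;
};

/* Allocate on the driver's behalf and remember the object. */
void *tracked_alloc(struct tracker *t, size_t size)
{
    assert(t != NULL);
    if (t->count == MAX_TRACKED)
        return NULL;
    void *p = malloc(size);
    if (p != NULL)
        t->objs[t->count++] = p;
    return p;
}

/* Normal release path; returns -1 if the object was never tracked. */
int tracked_free(struct tracker *t, void *p)
{
    assert(t != NULL);
    for (int i = 0; i < t->count; i++) {
        if (t->objs[i] == p) {
            t->objs[i] = t->objs[--t->count];  /* fill the hole with the last entry */
            free(p);
            return 0;
        }
    }
    return -1;
}

/* Recovery path: release everything the failed driver still holds.
 * Returns the number of objects reclaimed. */
int recover(struct tracker *t)
{
    int released = 0;
    assert(t != NULL);
    while (t->count > 0) {
        free(t->objs[--t->count]);
        released++;
    }
    return released;
}
```

Because every allocation passes through the tracker, recovery does not need to inspect the failed driver's possibly corrupt state to find what must be released.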
Shadow Drivers
We extended Nooks with shadow drivers to recover from driver
failures. A shadow driver is a kernel agent that (1) conceals a driver
failure from its clients, including the operating system and
applications, and (2) transparently restores the driver to a
functioning state. In this way, applications and the operating system
are unaware that the driver failed, and hence continue executing
correctly themselves.
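One way to picture both roles is an agent that sits between clients and the driver, remembers the state-changing requests it forwards, and on a failure restarts the driver and replays them before retrying. The sketch below assumes this log-and-replay design for illustration. The page itself does not say how shadow drivers restore state, and the sound-card-like `driver`, the `request` type, and `shadow_submit` are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_LOG 8

/* Hypothetical driver with two pieces of configurable state. */
struct driver { bool failed; int volume; int sample_rate; };

enum req_kind { REQ_SET_VOLUME, REQ_SET_RATE };
struct request { enum req_kind kind; int value; };

/* The shadow remembers every request it has successfully forwarded. */
struct shadow {
    struct driver *drv;
    struct request log[MAX_LOG];
    int nlog;
};

/* The driver proper: refuses all work once it has failed. */
static int driver_handle(struct driver *d, struct request r)
{
    if (d->failed)
        return -1;
    if (r.kind == REQ_SET_VOLUME)
        d->volume = r.value;
    else
        d->sample_rate = r.value;
    return 0;
}

/* Restarting brings the driver back with default state. */
static void driver_restart(struct driver *d)
{
    d->failed = false;
    d->volume = 0;
    d->sample_rate = 0;
}

/* Clients call the shadow instead of the driver. A failure is handled
 * here, so the client only ever sees the final result. */
int shadow_submit(struct shadow *s, struct request r)
{
    assert(s != NULL && s->drv != NULL);
    if (s->nlog == MAX_LOG)
        return -1;                       /* sketch: no log compaction */

    if (driver_handle(s->drv, r) != 0) {
        driver_restart(s->drv);          /* (2) restore the driver ... */
        for (int i = 0; i < s->nlog; i++)
            driver_handle(s->drv, s->log[i]);   /* ... by replaying the log */
        if (driver_handle(s->drv, r) != 0)
            return -1;
    }
    s->log[s->nlog++] = r;               /* remember for future recovery */
    return 0;
}
```

If the driver fails between two requests, the next call to `shadow_submit` still returns 0 and the earlier settings are back in place, so the client has no indication that a failure occurred.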
Publications
Microdrivers
- Matthew J. Renzelmann and Michael M. Swift. Decaf: Moving Device Drivers to a Modern Language, in Proceedings of the USENIX Annual Technical Conference, June 2009.
- Vinod Ganapathy, Matthew Renzelmann, Arini Balakrishnan, Michael
Swift, and Somesh Jha. The Design and Implementation of Microdrivers, in
Proceedings of the 13th International Conference on Architectural
Support for Programming Languages and Operating Systems, Seattle,
WA, March 2008.
- Vinod Ganapathy, Arini Balakrishnan, Michael M. Swift, and Somesh
Jha. Microdrivers: A New Architecture for Device Drivers, in
Proceedings of the 11th Workshop on Hot Topics in Operating
Systems, San Diego, California, May 2007.
Shadow Drivers
- Asim Kadav and Michael M. Swift. Live Migration of Direct-Access Devices, in Proceedings of the Workshop on I/O Virtualization (WIOV), Dec. 2008.
- Michael M. Swift, Damien Martin-Guillerez, Muthukaruppan
Annamalai, Brian N. Bershad and Henry M. Levy. Live Update for Device Drivers,
Univ. of Wisconsin Computer Sciences Technical Report CS-TR-2008-1634,
Mar. 2008.
- Michael Swift, Muthukaruppan Annamalai, Brian N. Bershad, Henry M.
Levy. Recovering Device Drivers, in ACM
Transactions on Computer Systems, 24(4), Nov. 2006.
- Michael Swift, Muthukaruppan Annamalai, Brian N. Bershad, Henry M.
Levy. Recovering Device Drivers,
in Proceedings of the 6th ACM/USENIX
Symposium on Operating Systems Design and Implementation, San
Francisco, CA, Dec. 2004.
Nooks
- Michael Swift. Improving the Reliability of Commodity Operating
Systems, Ph.D. Dissertation, Oct. 2005.
- Michael Swift, Brian N. Bershad, and Henry M. Levy. Improving the Reliability of Commodity
Operating Systems, in ACM Transactions on Computer
Systems, 23(1), Feb. 2005.
- Michael Swift, Brian N. Bershad, and Henry M. Levy. Improving the Reliability of Commodity
Operating Systems, in Proceedings of the 19th ACM Symposium
on Operating Systems Principles, Bolton Landing, NY,
Oct. 2003. Best paper award.
- Michael Swift, Steven Martin, Henry M. Levy, and Susan J.
Eggers. Nooks: an architecture for reliable device drivers, in
Proceedings of the Tenth ACM SIGOPS European Workshop, Saint-Emilion,
France, Sept. 2002.
Presentations
Decaf: Moving Device Drivers to a Modern Language talk at USENIX, June 2009. (pdf)
The Design and Implementation of Microdrivers talk at ASPLOS, March 2008. (pdf)
Improving the Reliability of Commodity Operating
Systems talk given at UIUC ACM Reflections/Projections
Conference, October 2006. (pdf)
Improving the Reliability of Commodity Operating
Systems job talk given at various places in 2005 (pdf)
Recovering Device Drivers talk at OSDI 2004, December
2004. (pdf)
Recovering Device Drivers, or Cleaning Up Nooks talk in UW class
CSE551: Graduate Operating Systems (pdf)
Shadow Drivers: Transparent Recovery for Kernel Extensions poster at UW
Industrial Affiliates, February 2004 (pdf)
Improving the Reliability of Commodity Operating Systems talk at
SOSP 2003, October 2003 (pdf)
Nooks poster at UW Industrial Affiliates, February 2003 (pdf)
Nooks: an architecture for reliable device drivers talk at ACM SIGOPS
workshop, September 2002 (ppt)
Nooks: an architecture for reliable device
drivers talk at UW Networking and Systems Retreat, June
2002 (ppt)