Improving the Reliability of Commodity Operating Systems

nook (nʊk)
n.
  1. A small corner, alcove, or recess, especially one in a large room.
  2. A hidden or secluded spot.
  3. A lightweight kernel protection domain for preventing device drivers from crashing an operating system (new).

Overview

Despite decades of research in extensible operating system technology, extensions such as device drivers remain a significant cause of system failures. In Windows XP, for example, drivers account for 85% of recently reported failures. 

Nooks is a reliability subsystem that seeks to greatly enhance OS reliability by isolating the OS from driver failures. The Nooks approach is practical: rather than guaranteeing complete fault tolerance through a new (and incompatible) OS or driver architecture, our goal is to prevent the vast majority of driver-caused crashes with little or no change to existing driver and system code. To achieve this, Nooks isolates drivers within lightweight protection domains inside the kernel address space, where hardware and software prevent them from corrupting the kernel. Nooks also tracks a driver's use of kernel resources to hasten automatic clean-up during recovery.
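The resource-tracking idea can be illustrated with a small user-level sketch. This is only an analogy, not the Nooks implementation: the `ResourceTracker` class, its method names, and the driver and resource names are all hypothetical, and a real kernel tracker would deal with memory, locks, and interrupts rather than strings.

```python
class ResourceTracker:
    """Toy model of per-driver resource tracking (all names are hypothetical)."""

    def __init__(self):
        # Maps a driver name to the (resource, release callback) pairs it holds.
        self._held = {}

    def record(self, driver, resource, release):
        """Remember that `driver` holds `resource`, freed by calling `release`."""
        self._held.setdefault(driver, []).append((resource, release))

    def recover(self, driver):
        """Release everything the failed driver held, newest first."""
        released = []
        for resource, release in reversed(self._held.pop(driver, [])):
            release(resource)
            released.append(resource)
        return released


# Illustrative use: two drivers acquire resources, then one of them fails.
freed = []
tracker = ResourceTracker()
tracker.record("e1000", "irq 11", freed.append)
tracker.record("e1000", "dma buffer", freed.append)
tracker.record("sb16", "irq 5", freed.append)

recovered = tracker.recover("e1000")  # only the failed driver's resources are freed
```

Because every acquisition is recorded as it happens, clean-up after a failure is a walk over a known list rather than a search through kernel state, and resources held by other drivers are left untouched.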

More recently, we have extended Nooks with shadow drivers. A shadow driver is a kernel agent that (1) conceals a driver failure from its clients, including the operating system and applications, and (2) transparently restores the driver to a functioning state. As a result, applications and the operating system are unaware that the driver failed and continue executing correctly.
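The two roles of a shadow driver, concealing a failure and restoring the driver, can be sketched as a user-level proxy. This is a simplified analogy under stated assumptions, not the kernel mechanism itself: `ToyDriver`, `ShadowDriver`, `DriverFailure`, and the `configure`/`send` interface are hypothetical names invented for the illustration, and recovery here is assumed to mean restarting the driver and replaying its logged configuration.

```python
class DriverFailure(Exception):
    """Raised by the toy driver to simulate a crash."""


class ToyDriver:
    """Hypothetical driver that can be told to crash on its first send."""

    def __init__(self, crash_on_first_send=False):
        self.config = {}
        self._crash_pending = crash_on_first_send

    def configure(self, key, value):
        self.config[key] = value

    def send(self, data):
        if self._crash_pending:
            self._crash_pending = False
            raise DriverFailure("simulated crash")
        return f"sent {data} at {self.config.get('speed')}"


class ShadowDriver:
    """Sits between clients and the driver, hiding failures from the clients."""

    def __init__(self, make_driver):
        self._make_driver = make_driver   # factory used to (re)start the driver
        self._driver = make_driver()
        self._config_log = []             # state needed to restore the driver
        self.recoveries = 0

    def configure(self, key, value):
        # Log configuration requests so they can be replayed after a restart.
        self._config_log.append((key, value))
        self._driver.configure(key, value)

    def send(self, data):
        try:
            return self._driver.send(data)
        except DriverFailure:
            # Conceal the failure: restore the driver, then retry the request.
            self._recover()
            return self._driver.send(data)

    def _recover(self):
        self._driver = self._make_driver()
        for key, value in self._config_log:
            self._driver.configure(key, value)
        self.recoveries += 1


# The first driver instance crashes once; its replacement behaves normally.
instances = iter([ToyDriver(crash_on_first_send=True), ToyDriver()])
shadow = ShadowDriver(lambda: next(instances))
shadow.configure("speed", "100Mbps")
result = shadow.send("packet-1")
```

The client calling `shadow.send` receives a normal result and never sees the `DriverFailure`; the replayed configuration log is what lets the restarted driver pick up with the same settings.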

People

Faculty
Hank Levy
Brian Bershad

Graduate Students
Mike Swift
Muthu Annamalai
Brian Milnes
Leo Shum

Undergraduate Students
Micah Brodsky
Eric Kochhar
Jordan Hom
Doug Buxton
Steve Martin

Exchange Students
Christophe Augier
Damien Martin-Guillerez


Publications

 

Presentations

Lessons Learned

Software Downloads

Grant Support

Driver Reliability Links


Last modified 9/30/2004