Project List
- Device drivers
- People don't really understand device
driver code. Analyze a set of device drivers (a few network, disk, etc.
drivers) to figure out what the code does:
- How much code runs at interrupt level?
- How much code runs in response to I/O requests?
- How much code runs in response to configuration requests?
- How much code runs in response to initialization / shutdown / environment change (e.g. power management) events?
- Make recommendations on how drivers could be changed as a result of this analysis
- Build tools for automating this analysis across a large number of drivers
reading: Linux device drivers
Solaris device drivers
Windows device drivers
- Drivers run in kernel mode, where any bug can crash the
system. Nooks solves this problem by running the entire driver
in a protection domain. Another approach is to split the driver
apart and only run critical pieces in the kernel. Come up with a
factorization into code that should be in the kernel
(e.g. interrupt handlers, performance-sensitive I/O code) and
code that need not be in the kernel.
reading: privtrans (where code was factored for security purposes)
Mach I/O model
- A large number of driver failures are caused by device
failures. CSL has a number of pieces of faulty hardware. Take a
few devices and try to understand how the device fails and make
a driver that works with the failing
device.
reading: iron file systems
Solaris hardened drivers
Intel on hardened drivers
- Operating systems currently assume that
you have a small number of devices attached. In a world of ubiquitous
computing, there may be thousands of devices you could potentially use
(every mouse, keyboard, and monitor in the building!). How would the OS
structures for managing device drivers change? How could applications
change to take advantage of these devices, for example having a separate
mouse and keyboard per window for collaboration?
reading: Remote I/O
Device ensembles
- File Systems
- Currently, the only way to scale a file
system is to buy a bigger computer and add disks. A nicer solution would
be to add another computer and then distribute the files over the two
computers. You might want to investigate Samba and NFS as possible
protocols for building such a system.
reading: AFS
- Reliability
- Lots of people have multiple computers at home. However, the extra machines do not provide increased reliability: if one computer fails, typically lots of work or files are lost. Find a way to use multiple computers in a home / small office to build a reliable system. For example, you could mirror the file systems to each other and provide a way to boot up the other OS should one computer fail.
- Most people have a single disk in their computer,
making them vulnerable to disk failures. Create a file system to provide
high reliability on a single disk, for example by storing data in multiple
places.
reading: RAID
Solaris ZFS integrity
Dell Poweredge integrity
- Configuration errors are a major source of system downtime and
management cost. However, fairly little is known about the nature
of configuration data. Survey the configuration data on an operating system to determine its characteristics, such as:
- How much configuration data is there?
- Is it per-user or per machine?
- How is it specified? As a script, as key/value pairs, as XML, or as a database?
reading: Chronus
Strider
Configuration Validation
- It is very hard to tell if one system is more reliable than
another. Fuzz testing (throwing random data at the input
functions) is one approach. Fault injection below the system is
another. Find some interesting systems and compare them using a
few different metrics.
reading: fuzz testing
Ballista
fault injection
dependability benchmarking
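As a rough illustration of the fuzz-testing idea in the reliability-comparison project above, here is a minimal Python sketch. It assumes the system under test can be called as a function on a byte string; `parse_record` is a hypothetical, deliberately fragile input function invented for the example, and counting uncaught exceptions is only one possible reliability metric.

```python
import random


def fuzz(target, trials=1000, seed=0, max_len=64):
    """Throw random byte strings at `target` and record every input that
    makes it raise. Returns a list of (input, exception name) pairs."""
    rng = random.Random(seed)  # fixed seed so a run can be reproduced
    failures = []
    for _ in range(trials):
        length = rng.randrange(max_len + 1)
        data = bytes(rng.randrange(256) for _ in range(length))
        try:
            target(data)
        except Exception as exc:
            failures.append((data, type(exc).__name__))
    return failures


def parse_record(data: bytes) -> int:
    """Hypothetical input function under test (not from any real system).
    The first byte is a length field, followed by the payload."""
    length = data[0]                      # IndexError on empty input
    payload = data[1:1 + length]
    return sum(payload) // len(payload)   # ZeroDivisionError on empty payload


if __name__ == "__main__":
    failures = fuzz(parse_record, trials=500, seed=1)
    kinds = sorted({name for _, name in failures})
    print(f"{len(failures)} failing inputs out of 500; exception kinds: {kinds}")
```

Running the same harness against two implementations of the same interface gives one crude point of comparison; a real study would likely add other metrics, such as hangs or silent data corruption.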
- Management
- Feedback control loops are a promising mechanism for
automated performance tuning. Try to automatically control an
application, such as Samba or Apache, using this technique.
reading: feedback control
Controllable systems
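To make the feedback-control project above more concrete, here is a minimal Python sketch of an integral-style control loop. Everything in it is an assumption made for illustration: the tunable knob is a worker count, the measured output is latency, and `measure_latency_ms` is a toy model standing in for a real measurement of Samba or Apache.

```python
def measure_latency_ms(workers: float) -> float:
    """Toy plant model (assumption): latency falls as workers are added.
    A real experiment would measure the running server instead."""
    return 1000.0 / workers


def tune(target_ms: float, workers: float = 5.0, gain: float = 0.05, steps: int = 100):
    """Repeatedly measure latency and nudge the worker count in proportion
    to the error between measured and target latency."""
    history = []
    for _ in range(steps):
        latency = measure_latency_ms(workers)
        error = latency - target_ms                  # positive means too slow
        workers = max(1.0, workers + gain * error)   # too slow: add workers
        history.append((workers, latency))
    return workers, history


if __name__ == "__main__":
    workers, history = tune(target_ms=50.0)
    print(f"settled at {workers:.2f} workers, last latency {history[-1][1]:.2f} ms")
```

In this toy model the loop settles near 20 workers, where the modeled latency equals the 50 ms target. The gain is the main design choice: too small and the loop converges slowly, too large and it oscillates or diverges, which is part of what the project would have to study on a real application.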
- There are many different approaches to isolating applications so
that they can't interfere with each other, including standard
user-mode processes, VServers, BSD Jails, Solaris Containers, the
Xen hypervisor, and VMware. Experiment with these to see how
they differ in the level of isolation, sharing, and
performance.
reading: Xen
VServers 1, VServers 2, Jails
- Security
- A common problem in security is policy. While a system
may have many ways to enforce protection or security boundaries, finding
what should be inside or outside the boundaries is difficult. Find a way
to generate interesting and useful security policies automatically, such
as monitoring what files are accessed during installation, or by
automatically granting access to resources provided by a user (via common
dialogs or on the command line).
reading: Janus
MAPbox
Polaris (Alan Karp)
Polgen for SELinux
MSR Strider
- Virtual machines provide an opportunity to do things
like virus scanning and spyware detection (and removal) offline,
avoiding interference from the virus or spyware itself. Create a tool
using VMware or Xen to provide this service.
reading: ReVirt