Project List

  • Measurement
    • People don't really understand device driver code. Analyze a set of device drivers (a few network, disk, etc. drivers) to figure out what the code does:
      • How much code runs at interrupt level?
      • How much code runs in response to I/O requests?
      • How much code runs in response to configuration requests?
      • How much code runs in response to initialization / shutdown / environment change (e.g. power management) events?
      • Make recommendations on how drivers could be changed as a result of this analysis
      • Build tools for automating this analysis across a large number of drivers

      reading: Linux device drivers
      Solaris device drivers
      Windows device drivers
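
      One possible seed for the automated analysis -- a deliberately crude heuristic, not a real classifier -- is to tally occurrences of well-known Linux registration idioms in a driver source file and use the counts as a first proxy for how much of the driver is wired to each event class:

        #include <stdio.h>
        #include <string.h>

        /* Map well-known Linux driver idioms to event classes. Counting
         * raw token hits is a deliberate simplification; a real tool
         * would attribute whole functions reachable from each entry
         * point. */
        static const struct { const char *token, *class; } markers[] = {
            { "request_irq",     "interrupt"       },
            { "hard_start_xmit", "I/O (network)"   },
            { "ioctl",           "configuration"   },
            { "module_init",     "init / shutdown" },
            { "module_exit",     "init / shutdown" },
            { "suspend",         "power mgmt"      },
            { "resume",          "power mgmt"      },
        };
        #define NMARK (sizeof markers / sizeof markers[0])

        int main(int argc, char **argv)
        {
            if (argc != 2) {
                fprintf(stderr, "usage: %s driver.c\n", argv[0]);
                return 1;
            }
            FILE *f = fopen(argv[1], "r");
            if (!f) { perror(argv[1]); return 1; }

            int counts[NMARK] = { 0 };
            char line[1024];
            while (fgets(line, sizeof line, f))
                for (size_t i = 0; i < NMARK; i++)
                    if (strstr(line, markers[i].token))
                        counts[i]++;
            fclose(f);

            for (size_t i = 0; i < NMARK; i++)
                printf("%-16s %-16s %d line(s)\n",
                       markers[i].token, markers[i].class, counts[i]);
            return 0;
        }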

    • We don't really understand dataflow through the kernel. For example, when data is read from disk, where does it flow and at what rate? Similarly, when data arrives off the network, what happens? For a couple of major data sources and sinks, measure the bandwidth demand of the flow and of its major components, such as device drivers, file systems, and protocol stacks.
      On top of this, measure the performance impact on common applications of restricting the bandwidth or increasing the latency of these flows. Does it matter? Does this leave room to add additional services without hurting performance?
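
      Before instrumenting the in-kernel stages, it helps to have a baseline number for one end-to-end flow. A minimal sketch, assuming a POSIX system and a large test file whose path you supply, that measures sequential read bandwidth:

        #include <fcntl.h>
        #include <stdio.h>
        #include <sys/time.h>
        #include <unistd.h>

        int main(int argc, char **argv)
        {
            if (argc != 2) {
                fprintf(stderr, "usage: %s bigfile\n", argv[0]);
                return 1;
            }
            int fd = open(argv[1], O_RDONLY);
            if (fd < 0) { perror("open"); return 1; }

            char buf[64 * 1024];
            long long bytes = 0;
            struct timeval start, end;
            ssize_t n;

            gettimeofday(&start, NULL);
            while ((n = read(fd, buf, sizeof buf)) > 0)  /* sequential read */
                bytes += n;
            gettimeofday(&end, NULL);
            close(fd);

            double secs = (end.tv_sec - start.tv_sec) +
                          (end.tv_usec - start.tv_usec) / 1e6;
            printf("%lld bytes in %.3f s = %.1f MB/s\n",
                   bytes, secs, bytes / secs / 1e6);
            return 0;
        }

      Running it twice -- once cold, once with the file already in the page cache -- roughly separates disk bandwidth from in-kernel copy cost.
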
    • Linux and Windows use shared libraries to improve performance. Many people think they are not worth the cost. Build a Linux installation without shared libraries -- with entirely statically linked binaries -- and look at the costs in terms of disk space, physical memory, and performance. You can also look at other alternatives, such as non-shared dynamically loaded libraries.

      reading: Slinky
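
      One cost that static linking changes directly is program start-up, since the dynamic loader drops out of the picture. A small harness (the target binary is whatever you build, e.g. a trivial program compiled once normally and once with gcc -static) that times repeated fork+exec:

        #include <stdio.h>
        #include <sys/time.h>
        #include <sys/wait.h>
        #include <unistd.h>

        int main(int argc, char **argv)
        {
            if (argc != 2) {
                fprintf(stderr, "usage: %s ./binary\n", argv[0]);
                return 1;
            }
            const int runs = 200;
            struct timeval start, end;

            gettimeofday(&start, NULL);
            for (int i = 0; i < runs; i++) {
                pid_t pid = fork();
                if (pid == 0) {
                    execl(argv[1], argv[1], (char *)NULL);
                    _exit(127);            /* exec failed */
                }
                waitpid(pid, NULL, 0);
            }
            gettimeofday(&end, NULL);

            double secs = (end.tv_sec - start.tv_sec) +
                          (end.tv_usec - start.tv_usec) / 1e6;
            printf("%.3f ms per start-up\n", secs * 1000 / runs);
            return 0;
        }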

    • We use the AFS file system in the department, which has very different performance characteristics than other network file systems, such as NFS or CIFS. Examine AFS performance from your machine to understand how it performs. If possible, trace all department AFS access to determine overall access patterns and workload characteristics.
      Reading: NT File System Study
      Sprite FS study
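
      A minimal probe, with the paths left to you: time a metadata operation such as stat() against a file in AFS and against the same file copied to local disk. The first run measures a cold AFS client cache; repeated runs measure a warm one.

        #include <stdio.h>
        #include <sys/stat.h>
        #include <sys/time.h>

        int main(int argc, char **argv)
        {
            if (argc != 2) {
                fprintf(stderr, "usage: %s path\n", argv[0]);
                return 1;
            }
            const int calls = 10000;
            struct stat st;
            struct timeval start, end;

            gettimeofday(&start, NULL);
            for (int i = 0; i < calls; i++)
                if (stat(argv[1], &st) != 0) { perror("stat"); return 1; }
            gettimeofday(&end, NULL);

            double usecs = (end.tv_sec - start.tv_sec) * 1e6 +
                           (end.tv_usec - start.tv_usec);
            printf("%.2f us per stat()\n", usecs / calls);
            return 0;
        }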

  • Transactional Memory
    • Many people think transactional memory is the wave of the future, including several people at UW. One problem is that nobody really knows what transactional programs will look like. Our best guess is that they will look like today's multithreaded programs. Pick a set of multithreaded programs on your favorite operating system, and characterize their behavior in terms of:
      1. Critical sections: duration, frequency
      2. System calls: what calls are made during critical sections
      3. Paging: how much paging happens
      4. Context switching: how often do context switches occur, and how often within critical sections?
      5. Threads: how many threads are used, and how many are active at one time?

      Reading: Transactional Behavior
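
      For item 1, one way to measure critical sections in unmodified binaries is LD_PRELOAD interposition on the pthread mutex calls. A sketch with known limitations (one timestamp per thread, so nested locks are mis-attributed, and the logging itself perturbs the timing):

        #define _GNU_SOURCE
        #include <dlfcn.h>
        #include <pthread.h>
        #include <stdio.h>
        #include <time.h>

        typedef int (*mutex_fn)(pthread_mutex_t *);
        static mutex_fn real_lock, real_unlock;
        static __thread struct timespec entered;  /* per-thread lock time */

        int pthread_mutex_lock(pthread_mutex_t *m)
        {
            if (!real_lock)
                real_lock = (mutex_fn)dlsym(RTLD_NEXT, "pthread_mutex_lock");
            int rc = real_lock(m);
            clock_gettime(CLOCK_MONOTONIC, &entered);  /* section begins */
            return rc;
        }

        int pthread_mutex_unlock(pthread_mutex_t *m)
        {
            struct timespec now;
            clock_gettime(CLOCK_MONOTONIC, &now);      /* section ends */
            fprintf(stderr, "critical section: %ld ns\n",
                    (now.tv_sec - entered.tv_sec) * 1000000000L +
                    (now.tv_nsec - entered.tv_nsec));
            if (!real_unlock)
                real_unlock = (mutex_fn)dlsym(RTLD_NEXT,
                                              "pthread_mutex_unlock");
            return real_unlock(m);
        }

      Build with gcc -shared -fPIC -o lockprof.so lockprof.c -ldl and run the program under test as LD_PRELOAD=./lockprof.so ./app.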

    • Almost all transactional memory systems to date have focused exclusively on user-mode code. One issue with kernel mode code is interrupts: what happens to a transaction when an interrupt occurs? Come up with a model of kernel transactions that handles interrupts, and apply it to one kernel component, such as a device driver, network protocol, or file system.

      Reading: TX in kernel
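
      One candidate model, offered here only as an illustration: the handler always runs non-transactionally, and any transaction it interrupted is conservatively aborted and retried, since the handler may have touched state the transaction read or wrote. A user-level simulation of that policy:

        #include <setjmp.h>
        #include <stdio.h>

        static jmp_buf retry_point;    /* aborted transactions restart here */
        static volatile int in_tx;     /* is a transaction in flight? */
        static int irq_count;

        static void tx_begin(void)  { in_tx = 1; }
        static void tx_commit(void) { in_tx = 0; }

        static void tx_abort(void)     /* discard speculative state, retry */
        {
            in_tx = 0;
            longjmp(retry_point, 1);
        }

        /* Interrupt entry under this model: the handler runs
         * non-transactionally, then the interrupted transaction is
         * conservatively aborted. */
        static void irq_entry(void (*handler)(void))
        {
            int was_in_tx = in_tx;
            in_tx = 0;
            handler();
            if (was_in_tx)
                tx_abort();
        }

        static void tick(void) { irq_count++; }

        int main(void)
        {
            volatile int attempts = 0;     /* survives the longjmp */

            setjmp(retry_point);           /* aborts land here and re-run */
            attempts++;
            tx_begin();
            if (attempts == 1)             /* inject one mid-tx interrupt */
                irq_entry(tick);
            tx_commit();

            printf("committed after %d attempt(s), %d interrupt(s)\n",
                   (int)attempts, irq_count);
            return 0;
        }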

    • Many people think that multi-core chips in the future will have extra, idle cores available for use. One possibility is to run user-mode code on one core and kernel code on another, preserving cache locality. Implement a version of this in an existing operating system, to see if there is any performance benefit. To make things simpler, you can create another thread for each running process, and schedule them together. One thread is for user level code, one for kernel, and they communicate when the user thread takes a trap.

      Reading: Computation Spreading
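
      A user-level analogue of the split, with all structure assumed rather than taken from the paper: a "user" thread pinned to one core ships each system call to a "kernel" proxy thread pinned to another, over a mailbox that stands in for the trap path:

        #define _GNU_SOURCE
        #include <pthread.h>
        #include <sched.h>
        #include <stdio.h>
        #include <string.h>
        #include <unistd.h>

        static pthread_mutex_t mu = PTHREAD_MUTEX_INITIALIZER;
        static pthread_cond_t  cv = PTHREAD_COND_INITIALIZER;
        static const char *pending;    /* request awaiting the "kernel" */
        static int done;

        static void pin(int cpu)       /* best effort; may fail on 1 core */
        {
            cpu_set_t s;
            CPU_ZERO(&s);
            CPU_SET(cpu, &s);
            pthread_setaffinity_np(pthread_self(), sizeof s, &s);
        }

        static void *kernel_side(void *arg)   /* all syscalls run here */
        {
            (void)arg;
            pin(1);
            pthread_mutex_lock(&mu);
            while (!done) {
                if (pending) {
                    write(1, pending, strlen(pending)); /* proxied "trap" */
                    pending = NULL;
                    pthread_cond_broadcast(&cv);
                } else {
                    pthread_cond_wait(&cv, &mu);
                }
            }
            pthread_mutex_unlock(&mu);
            return NULL;
        }

        static void remote_syscall(const char *msg)  /* instead of trapping */
        {
            pthread_mutex_lock(&mu);
            pending = msg;
            pthread_cond_broadcast(&cv);
            while (pending)
                pthread_cond_wait(&cv, &mu);
            pthread_mutex_unlock(&mu);
        }

        int main(void)
        {
            pthread_t kt;
            pin(0);
            pthread_create(&kt, NULL, kernel_side, NULL);
            remote_syscall("hello from the user core\n");
            pthread_mutex_lock(&mu);
            done = 1;
            pthread_cond_broadcast(&cv);
            pthread_mutex_unlock(&mu);
            pthread_join(kt, NULL);
            return 0;
        }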

  • File Systems
    • Currently, the only way to scale a file system is to buy a bigger computer and add disks. A nicer solution would be to add another computer and then distribute the files over the two computers. You might want to investigate Samba and NFS as possible protocols for building such a system.

      reading: AFS
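
      The core decision such a system makes is file placement. A sketch of the simplest policy -- hash the pathname to pick an owner, so every client computes the same answer with no directory service (server names are placeholders):

        #include <stdio.h>

        /* Both server names are placeholders. */
        static const char *servers[] = { "fs0.example.edu",
                                         "fs1.example.edu" };

        static unsigned long djb2(const char *s)   /* classic string hash */
        {
            unsigned long h = 5381;
            while (*s)
                h = h * 33 + (unsigned char)*s++;
            return h;
        }

        int main(int argc, char **argv)
        {
            /* Every client computes the same owner for a given path. */
            for (int i = 1; i < argc; i++)
                printf("%-32s -> %s\n", argv[i],
                       servers[djb2(argv[i]) % 2]);
            return 0;
        }

      The obvious weakness is that adding a third server reshuffles nearly every file; consistent hashing is the standard fix.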

  • Reliability
    • Lots of people have multiple computers at home. However, these machines do not provide increased reliability: if one computer fails, lots of work or files are typically lost. Find a way to use multiple computers in a home / small office to build a reliable system. For example, you could mirror the file systems to each other and provide a way to boot up the other OS should one computer fail.

      reading: Symantec LiveState
      Virtual Machine Migration

    • One of the most difficult parts of building a reliable system is dealing with failures. However, not much is known about failures internal to an operating system. For this project, instrument an OS kernel to measure how often things fail, and where. Standardized interfaces are a natural place to add such checks:
      • device drivers: character, block, network
      • system calls
      • internal file system interfaces
      • signals
      Based on this study, look for opportunities to enhance failure handling in the kernel, for example by making it more useful or more uniform.
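
      A sketch of what the instrumentation could look like at one of these interfaces (the macro and the toy driver entry point are both hypothetical; a real version would bump per-site counters rather than print):

        #include <stdio.h>

        /* Wrap any call that returns a negative errno-style value and
         * log each failure with its call site. Uses a GCC statement
         * expression, as the Linux kernel itself does. */
        #define TRACE_FAIL(call)                                        \
            ({                                                          \
                long __rc = (long)(call);                               \
                if (__rc < 0)                                           \
                    fprintf(stderr, "FAIL %s:%d %s -> %ld\n",           \
                            __FILE__, __LINE__, #call, __rc);           \
                __rc;                                                   \
            })

        /* Toy interface standing in for, say, a block-driver entry. */
        static long fake_disk_read(int sector)
        {
            return sector < 100 ? 0 : -5;   /* -EIO past end of disk */
        }

        int main(void)
        {
            TRACE_FAIL(fake_disk_read(7));      /* succeeds quietly */
            TRACE_FAIL(fake_disk_read(5000));   /* logged with file:line */
            return 0;
        }
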
    • It is very hard to tell if one system is more reliable than another. Fuzz testing (throwing random data at the input functions) is one approach. Fault injection below the system is another. Find some interesting systems and compare them using a few different metrics.

      reading: fuzz testing
      Ballista
      fault injection
      dependability benchmarking
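
      The fuzz-testing side needs little machinery. A minimal harness in the spirit of the fuzz paper: pipe random bytes into a target program's standard input and count crashes (the default target is just a stand-in):

        #include <signal.h>
        #include <stdio.h>
        #include <stdlib.h>
        #include <sys/wait.h>
        #include <unistd.h>

        int main(int argc, char **argv)
        {
            const char *target = argc > 1 ? argv[1] : "/bin/cat";
            int crashes = 0;

            signal(SIGPIPE, SIG_IGN);  /* target may die before reading */
            srandom(1);                /* fixed seed: reproducible runs */

            for (int trial = 0; trial < 100; trial++) {
                int fd[2];
                if (pipe(fd) != 0) { perror("pipe"); return 1; }

                pid_t pid = fork();
                if (pid == 0) {                    /* child: stdin <- pipe */
                    dup2(fd[0], 0);
                    close(fd[0]);
                    close(fd[1]);
                    freopen("/dev/null", "w", stdout);
                    execl(target, target, (char *)NULL);
                    _exit(127);
                }
                close(fd[0]);

                char buf[512];
                for (size_t i = 0; i < sizeof buf; i++)
                    buf[i] = random() & 0xff;      /* the random input */
                write(fd[1], buf, sizeof buf);
                close(fd[1]);

                int status;
                waitpid(pid, &status, 0);
                if (WIFSIGNALED(status)) {
                    crashes++;
                    fprintf(stderr, "trial %d: killed by signal %d\n",
                            trial, WTERMSIG(status));
                }
            }
            printf("%d/100 random inputs crashed %s\n", crashes, target);
            return 0;
        }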

  • Management
    • Feedback control loops are a promising mechanism for automated performance tuning. Try to automatically control an application, such as Samba or Apache, using this technique.

      reading: feedback control
      Controllable systems
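
      The shape of the controller is simple even when the plant is not. A toy proportional controller against a simulated server -- the plant model, the gain, and the knob itself (think Apache's MaxClients) are all made up for illustration:

        #include <stdio.h>

        int main(void)
        {
            const double target = 0.050;   /* desired response time: 50 ms */
            const double gain   = 100.0;   /* proportional gain */
            double limit = 10.0;           /* the knob being actuated */

            for (int step = 0; step < 15; step++) {
                /* Fake plant: response time falls as the limit grows.
                 * A real controller would measure the live server here. */
                double measured = 0.500 - 0.003 * limit;
                if (measured < 0.001)
                    measured = 0.001;

                limit += gain * (measured - target);   /* P control law */
                if (limit < 1.0)
                    limit = 1.0;

                printf("step %2d: latency %.3f s, new limit %.1f\n",
                       step, measured, limit);
            }
            return 0;
        }

      Watching the limit converge (to 150 with these numbers) shows the basic loop; the project is replacing the fake plant with Samba or Apache.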

    • There are many different approaches to isolating applications so that they can't interfere with each other, including standard user-mode processes, VServers, BSD Jails, Solaris Containers, the Xen hypervisor, and VMware. Experiment with these to see how they differ in their level of isolation, sharing, and performance.

      reading: Xen
      VServers 1
      VServers 2
      Jails
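
      A first measurement that works identically in every one of these environments is the cost of a trivial system call. Run the same binary natively, in a container/jail, and in a VM, and compare:

        #include <stdio.h>
        #include <sys/time.h>
        #include <unistd.h>

        int main(void)
        {
            const int calls = 1000000;
            struct timeval start, end;

            gettimeofday(&start, NULL);
            for (int i = 0; i < calls; i++)
                getppid();                 /* a near-null system call */
            gettimeofday(&end, NULL);

            double usecs = (end.tv_sec - start.tv_sec) * 1e6 +
                           (end.tv_usec - start.tv_usec);
            printf("%.3f us per null system call\n", usecs / calls);
            return 0;
        }

      getppid() is used rather than getpid() because some C libraries cache the latter in user space.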

  • Security
    • A common problem in security is policy. While a system may have many ways to enforce protection or security boundaries, deciding what belongs inside or outside those boundaries is a harder problem. Find a way to generate interesting and useful security policies automatically, for example by monitoring what files are accessed during installation, or by automatically granting access to resources the user explicitly names (via common dialogs or on the command line).

      Reading: Janus
      MAPbox
      Polaris (Alan Karp)
      Polgen for SELinux
      MSR Strider
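
      For the installation-monitoring flavor, one low-tech pipeline is to run the installer under strace (strace -f -e trace=open -o install.log ./install.sh) and boil the log down to a draft allowlist. A naive converter -- no path canonicalization, no openat handling, just enough to show the idea:

        #include <stdio.h>
        #include <string.h>

        #define MAX_PATHS 4096

        static char seen[MAX_PATHS][256];
        static int nseen;

        static int already_seen(const char *p)
        {
            for (int i = 0; i < nseen; i++)
                if (strcmp(seen[i], p) == 0)
                    return 1;
            return 0;
        }

        int main(int argc, char **argv)
        {
            if (argc != 2) {
                fprintf(stderr, "usage: %s strace.log\n", argv[0]);
                return 1;
            }
            FILE *f = fopen(argv[1], "r");
            if (!f) { perror(argv[1]); return 1; }

            char line[2048];
            while (fgets(line, sizeof line, f)) {
                char *q = strstr(line, "open(\"");   /* find open("path" */
                if (!q)
                    continue;
                q += strlen("open(\"");
                char *end = strchr(q, '"');
                if (!end || end - q >= 256)
                    continue;
                *end = '\0';
                if (!already_seen(q) && nseen < MAX_PATHS) {
                    strcpy(seen[nseen++], q);
                    printf("allow %s\n", q);         /* draft policy rule */
                }
            }
            fclose(f);
            return 0;
        }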

    • Root kits inject code into the OS to hide their tracks while retaining control over the system. They are difficult to detect because they modify the OS to return false information, for example removing themselves from the list of processes returned by the ps command. However, it may be possible to detect them either (a) when the OS serializes its state to disk for hibernation, or (b) via an external device that can read and write memory without OS intervention. Try one of these techniques for detecting root kits.

      Resources:
      Suspend for Linux
      Rootkit articles
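
      Neither (a) nor (b) needs to be built before experimenting with the underlying idea, which is cross-view detection: compare two views of the same state and flag discrepancies. A much weaker, user-level cousin of the same idea -- probe every pid with kill(pid, 0) and flag processes that answer but are hidden from /proc:

        #include <dirent.h>
        #include <errno.h>
        #include <signal.h>
        #include <stdio.h>
        #include <stdlib.h>

        #define MAX_PID 32768   /* default Linux pid_max; an assumption */

        static char in_proc[MAX_PID + 1];

        int main(void)
        {
            /* View 1: what /proc admits to. */
            DIR *d = opendir("/proc");
            if (!d) { perror("/proc"); return 1; }
            struct dirent *e;
            while ((e = readdir(d)) != NULL) {
                int pid = atoi(e->d_name);
                if (pid > 0 && pid <= MAX_PID)
                    in_proc[pid] = 1;
            }
            closedir(d);

            /* View 2: what the kernel answers for directly. A rootkit
             * that filters /proc but not kill() shows up as a
             * discrepancy. Normal process creation/exit between the two
             * scans causes false positives, so rescan before believing
             * a hit. */
            for (int pid = 1; pid <= MAX_PID; pid++) {
                int alive = (kill(pid, 0) == 0 || errno == EPERM);
                if (alive && !in_proc[pid])
                    printf("pid %d is alive but hidden from /proc\n", pid);
            }
            return 0;
        }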
