CS 838: Special Topics in Operating Systems -- Gray-Box System Design

Fall 2002

Instructor: Andrea Arpaci-Dusseau
office: 7375 Computer Sciences

Lecture:
time: 2:30-3:45
days: Tuesday, Thursday
place: Tuesday: 2310
place: Thursday: 7331
Note the room change!

Office Hours:
4:00 - 5:00 Tuesday
11:00 - 12:00 Thursday

Overview

Defining interfaces is the most important part of system design. Usually it is also the most difficult... -- Butler Lampson

As systems are increasingly built out of other systems, defining system interfaces well becomes ever more important. Unfortunately, complex systems often provide the wrong interface, hiding useful internal information and limiting the functionality available to systems built on top.

In this seminar course, we will investigate how to adapt when the original developers chose the wrong underlying interfaces. Over the semester, we will consider two different scenarios. In the first part of the course, we will assume the common case in which interfaces cannot be changed; thus, we will learn about strategies for inferring information when the desired interfaces do not exist. In the second part of the course, we will "build" a new operating system with new interfaces; this part of the course is more open-ended, but our general goal will be to expose information that was previously hidden (both to applications and to subsystems within the OS).

This semester is likely to be the only time this course is ever offered. This is a once-in-a-lifetime opportunity!

Motivation

Living in an Imperfect World

Encapsulation has long been a guiding principle when designing large and complex software systems. The advantages of encapsulating functionality into components are well known: simplicity of organization, code re-use, and the freedom to change the underlying implementation. However, the dangers of encapsulation are not as well publicized: namely, interfaces are often chosen incorrectly, thereby hiding some of the potential power of the underlying implementation. For many important components, once these interfaces have been chosen, they are standardized and are subsequently almost impossible to change (e.g., TCP/IP, SCSI, POSIX). Thus, developers using these components must employ creative solutions to uncover the hidden information they need or to control the component in unanticipated ways.

To make this discussion more concrete, consider a developer writing a memory-intensive application. With this type of application, it is important to avoid thrashing the virtual memory system; thus, developers often structure these applications to adapt to the amount of currently available memory. However, there are two hurdles that must be overcome in current systems. First, many operating systems do not provide an interface for accurately reporting the amount of available memory (especially due to interactions with the file cache). Second, even when the OS does provide this information, the OS does not provide a control interface to ensure that this memory can be allocated atomically (i.e., before another application grabs it). Therefore, a developer interested in this type of information and control over available memory must work around the existing interfaces using covert means.

In the first part of this course, we will study examples in which developers have cleverly worked around limited interfaces. We will begin by understanding some of the techniques that are potentially useful, such as microbenchmarking, fingerprinting, reverse engineering, and self-simulation. We will then explore how these techniques have been applied in a wide selection of case studies including TCP and RED, implicit coscheduling, MS Manners, semantically smart disks, control of server utilization, and breaking cryptosystems.
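
As a rough illustration of the microbenchmarking idea, the sketch below times small reads at page-sized strides through a file and treats a fast read as a hint that the page was already in the OS file cache. It is not taken from any paper on the reading list: the function names, the 4096-byte page size, and the 100-microsecond threshold are all assumptions made for illustration. A real probe would calibrate the threshold on each machine and would have to account for the fact that probing itself pulls pages into the cache.

```python
import os
import time


def probe_read_seconds(path, offset, nbytes=1):
    """Time one small unbuffered read at `offset` (hypothetical helper)."""
    with open(path, "rb", buffering=0) as f:
        start = time.perf_counter()
        f.seek(offset)
        f.read(nbytes)
        return time.perf_counter() - start


def looks_cached(latency_seconds, threshold_seconds=1e-4):
    """Guess 'cached' when the read was faster than an assumed threshold."""
    return latency_seconds < threshold_seconds


def estimate_cached_fraction(path, page_size=4096, threshold_seconds=1e-4):
    """Probe one byte per page and return the fraction that looked cached.

    Caveat: each probe may load its page into the cache, so repeated
    calls perturb the very state they are trying to observe.
    """
    offsets = range(0, os.path.getsize(path), page_size)
    if not offsets:
        return 0.0
    hits = sum(
        looks_cached(probe_read_seconds(path, off), threshold_seconds)
        for off in offsets
    )
    return hits / len(offsets)
```

Calling estimate_cached_fraction on a large data file before deciding how to access it is one way an application might infer cache contents without any OS support; the inference is only as good as the timing threshold.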

Creating a New World

The observation that operating systems often implement the wrong interface or the wrong policy for some application or workload is not new; this is the major tenet of extensible systems and microkernels. Specifically, given that no single policy is ideal for all applications and workloads, such an OS is designed to be extended by applications. This extensibility is traditionally provided in one of two ways. First, the OS may be implemented with a microkernel that exposes only the base mechanisms (or units of protection) and application libraries are expected to provide the policies. Second, the OS may allow safe code with new policies to be downloaded into the kernel.

However, the existing set of extensible systems still does not expose all useful information or enable all types of control. Consider one of the more extreme examples: Exokernel. Exokernel has the goal of removing all abstractions and of securely exposing the available hardware resources to applications. However, Exokernel still limits information and control. First, Exokernel does not explicitly expose the cost of each operation (e.g., fetching a particular page from disk at this time). Second, Exokernel multiplexes resources across competing applications in a simple and static manner (e.g., processes may choose to run in fixed time-slices on the CPU). The combined result is that applications cannot easily adapt to changing costs (e.g., by prefetching more pages from disk when disk and memory utilization are low or by running for more consecutive time-slices when they have a large working set loaded).

In the second part of this course we will determine how a new operating system could be built to expose as much information and control as possible. This will be much more open-ended than the first part of the course. We will begin this step by understanding some of the developments in extensible systems and microkernels: Synthesis, SPIN, Exokernel, VINO, and Scout. We will then explore how a new OS could expose all information (i.e., policies, internal state, and the cost of operations) to both applications and other subsystems within the OS. We will then study how applications or the OS itself could use this information to make better decisions (e.g., applications and/or the OS compare the cost and the benefit of performing an operation and only perform those operations with the greatest win). Thus, we will study the research in the area of exposing cost models, performing cost/benefit analysis, and more general economic computations within systems.
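
The cost/benefit comparison mentioned above can be sketched in a few lines. This is only an illustration under stated assumptions: the Operation record, the choose_operations function, the single shared budget, and the greedy ordering by net win are all hypothetical and are not drawn from any system covered in the course.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Operation:
    """A candidate operation with an exposed cost and an estimated benefit."""
    name: str
    cost: float
    benefit: float

    @property
    def win(self):
        # Net gain from performing the operation.
        return self.benefit - self.cost


def choose_operations(candidates, budget):
    """Greedily pick operations with the largest positive win within budget."""
    chosen = []
    remaining = budget
    for op in sorted(candidates, key=lambda o: o.win, reverse=True):
        if op.win <= 0:
            break  # Everything after this point costs more than it returns.
        if op.cost <= remaining:
            chosen.append(op)
            remaining -= op.cost
    return chosen
```

For instance, given a prefetch with cost 2 and benefit 5, an extra time-slice with cost 1 and benefit 1.5, and a flush with cost 4 and benefit 3, a budget of 3 selects the prefetch and the time-slice and skips the flush. Greedy selection is a simplification; it does not guarantee the best total win under a budget.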

Format of Course

This course is based on a large set of readings as well as a substantial project. Your grade in the course will be based upon the notes you write up to summarize lecture, your class participation, and your project performance. I do not plan on having any exams.

Readings

Since this course is a seminar, the class meetings will be discussion based. Every student is expected not only to participate regularly but also to occasionally take lecture notes (that are distributed back to the class) and to lead the discussion. Although you will be leading discussion multiple times throughout the course, you will never be responsible for an entire lecture: either you will have a relatively short discussion topic (e.g., roughly 20 minutes on one of the microbenchmark papers) or you will be in a group (e.g., multiple people will help lead the discussion of each of the extensible systems). The presentations should be informal; once again, you are leading a discussion that everyone should be participating in, not giving a lecture. Whether you use slides or the blackboard is up to you.

A tentative reading list is available.

Projects

This course will involve a substantial set of projects; however, the format/topic of these projects is still an open question.

One option is that the course could involve two projects. The first project would be associated with the first half of the course and would involve uncovering new information or control for some existing component (e.g., writing a new fingerprinting routine to extract the layout policy for a RAID system). Students would be free to choose any new information they think would be useful. The second project would involve modifying an OS to directly expose this new information or control and comparing the complexity and performance of the two approaches. Students would be encouraged to work in small groups of two.

The second option is to have a single goal that the entire class is working toward. For example, we could all work on having a version of Linux that exposes as much information/control as possible, both to applications/libraries and to other subsystems within Linux. The steps here could be: remove as many policies as possible from Linux so that we have a base framework to work with; design and implement straightforward policies such that applications/subsystems can choose if or when operations occur (these policies should be simple to express); determine how descriptions of the policy should be exposed to other layers; implement layers that adapt to the exposed information. Different groups would be expected to be responsible for different subsystems: for example, CPU scheduling, networking, memory management, and the file system.

Which project style is chosen will depend upon both student and instructor interest.

Tentative Schedule

The numbers in parentheses are the numbers of the papers from this list that should be read for class. Come prepared to discuss!
Tuesday                                      Thursday
09/03 Introduction                           09/05 System Design (1, 2)
09/10 Gray-Box Systems (3)                   09/12 Case Study: TCP and RED (22, 23)
09/17 Microbenchmarks (4, 5)                 09/19 Buffer Cache Fingerprinting (13)
09/24 Disk Microbenchmarks (7, 8)            09/26 TCP Fingerprinting (9, 10, 11, 12)
10/01 Scheduler Fingerprinting (14)          10/03 Reverse Engineering Instructions (18, 19)
10/08 SSD (28)                               10/10 Implicit Coscheduling + MS Manners (24, 25)
10/15 Status                                 10/17 Cryptosystems (29, 30)
10/22 Visual Proxies (26) (and some Status)  10/24 Summary
10/29 Summary                                10/31 Project Discussion
11/05 Project Discussion                     11/07 No class
11/12 Open Implementation (38)               11/14 No class
11/19 Exokernel (33)                         11/21 Exokernel (34)
11/26 VINO (35)                              11/28 Thanksgiving
12/03 SPIN (32)                              12/05 u-Kernel (37)
12/10 Gray-Box layout (39)                   12/12 Wrap-Up
We will schedule a day or two for final project presentations at the end of the semester as well.

Prerequisites

To enroll in this course, students must either have taken CS736 (Advanced Operating Systems) or have my permission. Highly motivated first-year graduate students who are taking CS736 concurrently should talk to me for permission to enroll. Other students in Computer Science who have not taken CS736 should also speak with me about enrolling.