UNIVERSITY OF WISCONSIN-MADISON
Computer Sciences Department

CS 736
Fall 2016

Barton Miller
Project List
(Project Assigned: Monday, October 10)
(Project Proposal Due: Friday, October 21, in class)
(Midway Interviews: Thursday and Friday, November 10 and 11)
(Draft Report to Referees: Friday, December 9, noon)
(Referee Reports to Authors: Wednesday, December 14 noon)
(Final Reports Due: Monday, December 19, noon)
(Project Presentation Session: Wednesday, December 21, 10am-noon)
General Comments
The projects are intended to give you an opportunity to study a particular
area related to operating systems. Your project will require a test implementation,
measurement study or analysis, literature search, and evaluation.
The project suggestions below are briefly stated. They are intended
to guide you into particular areas, and you are expected to expand these
suggestions into full project descriptions. This gives you more freedom
in selecting an area and more burden in defining your own project. There
may be more issues listed for a project than you can cover. If you have
a topic of your own that is not listed below, you should come and talk
with me so we can work out a reasonable project description.
You will write a paper that reports on your project. This paper will be
structured as if you were going to submit it to a conference. I will provide
more details on the project report later in the semester.
You should work in teams of two people (and, in certain cases, three people)
on the project and report.
Projects
Here are a few project suggestions. I have mentioned other ideas
in class and I encourage you to develop an idea of your own.
(1) Benchmarking Revisited
For your first assignment, you did a series of measurements.
In many of these measurements, your results had a great deal of variation that
you could speculate on but not precisely identify.
The goal of this assignment is to get to the source of the variation by measuring
activities inside the Linux kernel.
You have a variety of approaches available, from inserting your own print
statements (printk) to the new perf_events software and hardware counter
interface, the function tracer (ftrace), and the Red Hat SystemTap tool.
Precise identification of what happens in a given run of a benchmark program
is the key to this project.
Note that you will likely have to run this project on your own computer (though
I will see if we can get access to a system where you can have root access).
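To make the counter-based approach concrete, here is a minimal sketch that
counts retired instructions around a region of code using the perf_events
interface. The calls follow the perf_event_open(2) man page; the measured loop
is just a placeholder for whatever benchmark region you want to attribute:

    // Minimal sketch: count retired instructions around a code region with
    // perf_events. There is no glibc wrapper for perf_event_open, so we
    // invoke the system call directly (see the perf_event_open(2) man page).
    #include <linux/perf_event.h>
    #include <sys/ioctl.h>
    #include <sys/syscall.h>
    #include <unistd.h>
    #include <cstdint>
    #include <cstdio>
    #include <cstring>

    static int perf_event_open(perf_event_attr *attr, pid_t pid, int cpu,
                               int group_fd, unsigned long flags) {
        return syscall(__NR_perf_event_open, attr, pid, cpu, group_fd, flags);
    }

    int main() {
        perf_event_attr attr;
        memset(&attr, 0, sizeof(attr));
        attr.type = PERF_TYPE_HARDWARE;
        attr.size = sizeof(attr);
        attr.config = PERF_COUNT_HW_INSTRUCTIONS; // hardware instruction count
        attr.disabled = 1;                        // start disabled; enable below

        // Measure this process on any CPU; no event group.
        int fd = perf_event_open(&attr, 0, -1, -1, 0);
        if (fd == -1) { perror("perf_event_open"); return 1; }

        ioctl(fd, PERF_EVENT_IOC_RESET, 0);
        ioctl(fd, PERF_EVENT_IOC_ENABLE, 0);

        volatile long sink = 0;                   // placeholder benchmark region
        for (long i = 0; i < 1000000; i++) sink += i;

        ioctl(fd, PERF_EVENT_IOC_DISABLE, 0);
        uint64_t count = 0;
        read(fd, &count, sizeof(count));
        printf("instructions: %llu\n", (unsigned long long)count);
        close(fd);
        return 0;
    }

The same pattern works for cache misses, software events such as context
switches (PERF_TYPE_SOFTWARE), and other counters, which is one way to pin
down where the variation in your Assignment 1 numbers comes from.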
(2) More Random Testing of Programs:
Join a distinguished line of CS736 projects!
In the past (1990, 1995, 2000, and 2006), we have used random input to test
the reliability of application programs on UNIX and NT. These techniques have
been quite effective in finding bugs, and have been widely cited.
Fuzz testing has become a fundamental technique in both the testing and
security communities, and it all started with CS736 projects.
Here are a couple of different fuzz projects that you might undertake (you are
free to think of others):
A. Comparing the Free Unix Variants
The goal of this project is to take the previous study of applications on
various platforms and repeat the study on command-line and X-window
applications across the various free/open UNIX systems, such as Linux, BSDi,
FreeBSD, and even Xv6, the simple V6-based UNIX version used for operating
system class projects.
As a product of this project, we would like to provide:
(1) an update of the tool set that can be used by other developers,
(2) bug reports for the software vendors, and
(3) fixes for those bugs.
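The heart of the tool set, a random-input generator to pipe into each utility
under test, can be quite small. Here is a minimal sketch in the spirit of the
original fuzz tool; the byte count and the printable-only option are
illustrative choices, not the original tool's exact interface:

    // Minimal fuzz input generator: write random bytes to stdout, to be
    // piped into a command-line utility, e.g.:  ./fuzzgen 10000 | utility
    #include <cstdio>
    #include <cstdlib>
    #include <random>

    int main(int argc, char **argv) {
        long nbytes = (argc > 1) ? atol(argv[1]) : 10000;  // output length
        bool printable = (argc > 2);  // any extra argument: printable ASCII only
        std::mt19937 rng(std::random_device{}());
        std::uniform_int_distribution<int> any(0, 255);     // all byte values
        std::uniform_int_distribution<int> print(32, 126);  // printable ASCII

        for (long i = 0; i < nbytes; i++)
            putchar(printable ? print(rng) : any(rng));
        return 0;
    }

A real tool set would also record the random seed for each run so that any
crash it finds can be reproduced and reported.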
B. Extend the Testing of Unchecked Kernel Call Return Values
As you heard me mention, unchecked returned values are a significant
contributor to problems with software reliability and security.
In the Fuzz Revisited paper, we included tests for this problem in programs that
called malloc and its cousins.
For this project, the goal would be to focus exclusively on the problem of
unchecked return values, and extend the coverage to a wide variety of
kernel calls and library routines.
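One way to provoke these bugs is failure injection: interpose on a library
call and make it fail occasionally, so that any caller that never checks the
return value crashes on the injected NULL. The sketch below assumes a
glibc-style dynamic linker with LD_PRELOAD; it is an illustrative technique,
not necessarily the exact methodology of the Fuzz Revisited work:

    // Failure injection for unchecked return values: interpose on malloc()
    // via LD_PRELOAD and fail roughly 1 in 100 allocations.
    // Build:  g++ -shared -fPIC -o libfailmalloc.so failmalloc.cpp -ldl
    // Run:    LD_PRELOAD=./libfailmalloc.so program-under-test
    // (RTLD_NEXT requires _GNU_SOURCE, which g++ defines by default on Linux.)
    #include <dlfcn.h>
    #include <cstddef>

    extern "C" void *malloc(size_t size) {
        // Look up the real malloc once. (Caveat: a production wrapper needs
        // a bootstrap path, since dlsym may itself allocate.)
        static void *(*real_malloc)(size_t) =
            (void *(*)(size_t))dlsym(RTLD_NEXT, "malloc");
        static unsigned long calls = 0;  // not thread-safe; fine for a sketch
        if (++calls % 100 == 0)
            return nullptr;              // injected failure
        return real_malloc(size);
    }

The same interposition pattern extends to read(), write(), and other kernel
calls and library routines whose return values are commonly ignored.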
(3) Dynamic Instrumentation API:
The Paradyn project has developed a library for the analysis and patching of
binary (executable) programs and libraries, for both executable files and
running programs. This library, called the Dyninst API, allows programmers to
write tools that can patch code into unmodified programs in a
machine-independent way.
So, a tool that uses Dyninst can work on x86 and Power (32 and 64 bit).
Many simple tools have been built
using Dyninst for tracing and debugging, and more complex tools for doing
such things as checkpointing and process migration.
With Dyninst, you can extract structural information such as control flow and
call graphs, and identify functions, basic blocks, and instructions.
You can instrument code and patch it to change its behavior, either statically
(before execution) or dynamically (during execution).
The goal is to choose some analysis, instrumentation, or control task, and
build a Dyninst-based tool.
The task could be in the area of tracing, program analysis, modeling,
cyber-security and forensics, performance profiling, or debugging.
An interesting example of such a project is to build a tool that can
perform "hot patching" of a running server.
You can check out the example of such a tool in the paper linked from the
online version of this list.
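To give a flavor of the API, here is a sketch of a minimal entry-point tracer
written against Dyninst's BPatch interface. The class and method names follow
the Dyninst Programmer's Guide, but treat this as an unverified sketch: the
target function name ("compute") is illustrative, and all error checking is
omitted:

    // Sketch of a Dyninst-based tracer: create a process, find a target
    // function, and patch a call to puts() into its entry point.
    #include "BPatch.h"
    #include "BPatch_process.h"
    #include "BPatch_image.h"
    #include "BPatch_function.h"
    #include "BPatch_point.h"
    #include "BPatch_snippet.h"

    int main(int argc, const char *argv[]) {
        BPatch bpatch;
        // Launch the program to be instrumented (argv[1] onward).
        BPatch_process *proc = bpatch.processCreate(argv[1], argv + 1);
        BPatch_image *image = proc->getImage();

        // Locate the function we want to trace and its entry point(s).
        std::vector<BPatch_function *> funcs;
        image->findFunction("compute", funcs);
        std::vector<BPatch_point *> *entry = funcs[0]->findPoint(BPatch_entry);

        // Build a snippet equivalent to: puts("entered compute")
        std::vector<BPatch_function *> putsFunc;
        image->findFunction("puts", putsFunc);
        std::vector<BPatch_snippet *> args;
        BPatch_constExpr msg("entered compute");
        args.push_back(&msg);
        BPatch_funcCallExpr traceCall(*putsFunc[0], args);

        // Insert the snippet and let the program run to completion.
        proc->insertSnippet(traceCall, *entry);
        proc->continueExecution();
        while (!proc->isTerminated())
            bpatch.waitForStatusChange();
        return 0;
    }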
(4) Extreme Scale: The Multicast-Reduce Network (MRNet):
Infrastructures for scalable computing are intended to ease the task of
distributed programming, reducing errors and making it easier to bulk process
large amounts of data.
We have a locally developed scalability infrastructure, called MRNet (Multicast
Reduction Network), which is good for large-scale control and monitoring, and
for stream data processing.
This infrastructure has been used to build tools that control more than a million processes or
cluster 10 billion data points.
The goal of this assignment is to build a scalable program or tool based on
MRNet.
One possible project is to start with the collection of up to a billion (!)
Tweets that we can provide to you; you will then design and implement a program
that analyzes these Tweets to extract some common characteristics of the data.
You can also develop your own idea (in consultation with Bart) for a scalable program or tool.
Whichever idea you choose, you will get the program working and benchmark its performance to
understand its limits on scalability.
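To make the reduction idea concrete, here is a small concept sketch (this is
not the MRNet API) showing how a k-ary tree combines partial results level by
level, so that no node ever aggregates more than its fan-in; the fan-in of 16
and the leaf values are arbitrary illustrations:

    // Concept sketch only (NOT the MRNet API): a k-ary reduction tree
    // combines partial results level by level, so no node ever aggregates
    // more than `fanin` inputs. That bounded fan-in is what makes
    // million-leaf reductions tractable.
    #include <algorithm>
    #include <cstdio>
    #include <numeric>
    #include <vector>

    int main() {
        const size_t fanin = 16;              // children per internal node
        std::vector<long> level(1u << 20, 1); // ~1M leaves, each holding "1"

        // Repeatedly collapse groups of `fanin` values into their parent's
        // sum, mimicking the upward reduction wave in the overlay tree.
        int depth = 0;
        while (level.size() > 1) {
            std::vector<long> parents;
            for (size_t i = 0; i < level.size(); i += fanin) {
                size_t end = std::min(i + fanin, level.size());
                parents.push_back(std::accumulate(level.begin() + i,
                                                  level.begin() + end, 0L));
            }
            level.swap(parents);
            depth++;
        }
        printf("total = %ld after %d levels\n", level[0], depth);
        return 0;
    }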
(5) Windows vs. Unix File System Performance
NTFS and the Unix FFS are both designed to provide good performance.
However, each file system has been optimized to be good at different things.
The goal of this project is to do a side-by-side comparison of the basic on-disk
structures and performance strategies
(e.g., memory caching of data and meta-data, pre-fetching, replication),
and to generate workloads that test these features.
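On the workload side, here is a minimal sketch of one microbenchmark: timing
the creation and deletion of many small files. The file count and size are
arbitrary, and a real study would also force data to disk (fsync on Unix,
FlushFileBuffers on Windows) before taking timings:

    // Sketch of one microbenchmark workload: create, write, and delete many
    // small files, timing each phase. Portable C++, so the same source can
    // run against both NTFS and FFS-style file systems. Note that writes may
    // still sit in the OS cache; a real study must force them to disk.
    #include <chrono>
    #include <cstdio>
    #include <fstream>
    #include <string>

    int main() {
        const int nfiles = 10000;
        const std::string payload(4096, 'x');  // one 4 KB block per file
        using clock = std::chrono::steady_clock;

        auto t0 = clock::now();
        for (int i = 0; i < nfiles; i++) {
            std::ofstream f("bench." + std::to_string(i), std::ios::binary);
            f << payload;                      // small sequential write
        }
        auto t1 = clock::now();
        for (int i = 0; i < nfiles; i++)
            std::remove(("bench." + std::to_string(i)).c_str());
        auto t2 = clock::now();

        auto ms = [](auto d) {
            return std::chrono::duration_cast<std::chrono::milliseconds>(d).count();
        };
        printf("create+write: %lld ms, delete: %lld ms\n",
               (long long)ms(t1 - t0), (long long)ms(t2 - t1));
        return 0;
    }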
Project Proposal
The project descriptions listed above are intentionally brief. You will need
to expand and refine these to develop your own project. Different groups
may choose the same topic, but have significantly different emphases.
Your project proposal will describe your goals, methods, implementation
outline, evaluation criteria, and resources needed. First, you should describe
the basic problem that you will be addressing. Next, you should provide
a more detailed description of how you will approach the problem. This
description should contain much more detail than the brief paragraphs
given above. You will provide an outline of features that your project will
include, an implementation plan, and an evaluation plan.
Project proposals will typically be 3 to 4 double-spaced pages.
Referee Report
Each of you will referee the paper from another group. I am providing you with
a sample referee report form, which is similar to one that I used for a recent
conference. I am also providing you with two sample referee reports, to give
you an idea of what might appear in a real report.
In general, a referee report starts with a summary of the main idea(s)
in the paper, and then has an overall statement of the quality. You should
then review the main technical ideas. In addition, either a marked-up copy
of the paper, with typos and related errors, or a summary of typos and
related errors should be given to the authors.
Last modified: Wed Nov 30 17:07:29 CST 2016 by bart