CS 537 - Spring 2004
Assignment 4: File Systems, Part I - Disk Scheduling and Buffer Cache

Due: Friday, April 16, 2004 at 1:00 am.


Introduction

Assignments 4 and 5 will use a common software structure called MiniKernel, which simulates a simple operating system. In this assignment, you will manage multiple programs accessing a disk and improve their performance by adding a disk buffer cache. In the next assignment, you will build a file system.

Frequently Asked Questions

There is a FAQ page for this project. Check it frequently for updates.

Getting Started

First, read and understand the MiniKernel documentation. Download the code from ~cs537-1/project4 and run the examples. To make sure you understand how all the pieces fit together, implement getTime() as described in the documentation.

Adding System Calls

Disks come in many shapes and sizes, so user programs need a way to query the geometry of the disk. Add two system calls that report the number of blocks on the disk and the number of bytes in a disk block. The library interface to the system calls should look like this:

Add two more system calls that allow reading and writing of individual blocks.

The data array must be allocated by the caller and it must be at least BLOCK_SIZE bytes long, where BLOCK_SIZE is the value returned by getDiskBlockSize.

These system calls are blocking: the calling process is blocked until the operation completes. They return 0 on success and a non-zero value to indicate an error. However, the Disk has only the non-blocking methods beginRead and beginWrite, so you will need to write a monitor to schedule requests to the disk. For reasons that will become clear later, call this monitor Elevator. At first, make this class very simple. It has three methods: read, write, and endIO. The read method waits until the disk is idle, records information about the current disk operation, calls Disk.beginRead, and waits for the disk operation to complete. The write method is similar. The endIO method is called by Kernel.interrupt() when it gets an INTERRUPT_DISK interrupt indicating that the current disk operation has completed; it uses notifyAll to tell any waiting threads that they should re-check whether it is time for them to do something.

You will need to modify the Kernel to create an Elevator instance in Kernel.doPowerOn and to call the read, write, and endIO methods at the appropriate points.
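As a rough illustration of the simple first version of the monitor described above, here is a minimal sketch. The names SimpleDisk and SimpleElevator are hypothetical, and the beginRead/beginWrite signatures are assumptions; check the real Disk class in the MiniKernel code before borrowing anything from it.

```java
// Hypothetical stand-in for the MiniKernel Disk class; the real method
// signatures may differ.
interface SimpleDisk {
    void beginRead(int blockNumber, byte[] buffer);
    void beginWrite(int blockNumber, byte[] buffer);
}

// First, unscheduled version of the monitor: one disk operation at a time.
class SimpleElevator {
    private final SimpleDisk disk;
    private boolean busy = false;      // an operation is currently in progress
    private boolean completed = false; // endIO has been called for that operation

    SimpleElevator(SimpleDisk disk) {
        this.disk = disk;
    }

    synchronized int read(int blockNumber, byte[] buffer) {
        return doIO(blockNumber, buffer, false);
    }

    synchronized int write(int blockNumber, byte[] buffer) {
        return doIO(blockNumber, buffer, true);
    }

    // Called (by assumption) from the kernel's disk interrupt handler.
    synchronized void endIO() {
        completed = true;
        notifyAll(); // wake everyone; each thread re-checks its own condition
    }

    // Only called from the synchronized methods above, so the monitor is held.
    private int doIO(int blockNumber, byte[] buffer, boolean isWrite) {
        if (buffer == null) {
            return -1; // non-zero result signals an error
        }
        while (busy) {
            await(); // wait until the disk is idle
        }
        busy = true;
        completed = false;
        if (isWrite) {
            disk.beginWrite(blockNumber, buffer);
        } else {
            disk.beginRead(blockNumber, buffer);
        }
        while (!completed) {
            await(); // wait for endIO to report completion
        }
        busy = false;
        notifyAll(); // let a waiting thread claim the idle disk
        return 0;
    }

    private void await() {
        try {
            wait();
        } catch (InterruptedException e) {
            // Sketch only: ignore the interrupt and re-check the loop condition.
        }
    }
}
```

Both waits sit inside while loops, so a thread woken by notifyAll for the wrong reason simply goes back to waiting.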

Disk Scheduling

Modify your Elevator class to implement the (two-way) elevator scheduling algorithm. If the read or write method finds the disk busy, it should enter its request in a queue of pending operations and wait until it is chosen. The endIO method should notify the thread that started the I/O operation, and then, if there are other requests in the queue, choose one and allow it to make another beginRead or beginWrite call. The request "queue" can be any data structure you think is appropriate. Each time an I/O operation finishes, endIO needs to choose the request that is closest to the operation that just completed, where distance is simply the difference between block numbers. Don't get too fancy. The queue should never be very long, so the cost of searching is unlikely to be important.
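The selection step can be sketched separately from the synchronization. The class below is a hypothetical helper, not part of MiniKernel. It assumes one common reading of the two-way elevator policy: take the closest pending block in the current direction of travel, and reverse direction only when nothing is left ahead. A linear search is used because the queue is expected to stay short.

```java
import java.util.List;

// Hypothetical helper that chooses the next pending block number.
class ElevatorPicker {
    private int head;          // block number of the operation that just completed
    private boolean up = true; // current sweep direction

    ElevatorPicker(int startBlock) {
        this.head = startBlock;
    }

    // Removes and returns the next block to service, or -1 if none is pending.
    int pickNext(List<Integer> pending) {
        if (pending.isEmpty()) {
            return -1;
        }
        int index = closestInDirection(pending, up);
        if (index < 0) {
            up = !up; // nothing ahead of the head: reverse the sweep
            index = closestInDirection(pending, up);
        }
        head = pending.remove(index); // remove(int) removes by position
        return head;
    }

    // Linear search for the nearest request at or beyond the head.
    private int closestInDirection(List<Integer> pending, boolean goingUp) {
        int best = -1;
        for (int i = 0; i < pending.size(); i++) {
            int block = pending.get(i);
            boolean ahead = goingUp ? block >= head : block <= head;
            if (!ahead) {
                continue;
            }
            if (best < 0 || Math.abs(block - head) < Math.abs(pending.get(best) - head)) {
                best = i;
            }
        }
        return best;
    }
}
```

For example, with the head at block 50 and pending requests for blocks 10, 60, 55, 90, and 40, this sketch services them in the order 55, 60, 90, 40, 10.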

Add a method flush() that delays the caller until all pending disk operations have completed. Add a call to this method to Kernel.doShutdown() just before the call to disk.flush().

Buffer Cache

Programs tend to request the same blocks of a disk over and over again. The kernel can help speed things up by making a buffer cache of Kernel.bufferSize disk blocks. Create a class BufferPool that maintains an array of buffers, each of which has space to store the contents of a disk block as well as information about the block, such as its location on disk and whether it is "dirty". Define methods BufferPool.read and BufferPool.write and have the Kernel call these methods instead of Elevator.read and Elevator.write.

Use the LRU algorithm to allocate buffers in the cache. For each block in the cache, you will need a byte[] array to hold its data, and indications of the corresponding block number on disk and whether the block is "clean", "dirty", or "empty". Whenever read or write is called and the requested block is already in the cache, simply copy the data to or from the cached array of bytes (you may find System.arraycopy handy for this). If the desired block is not in the cache, grab the least recently used buffer from the cache and use it instead. If that buffer is dirty, you first have to write its contents out to disk. For a read operation, you also have to read the requested block from disk into the allocated buffer. In any case, once the desired buffer has been found or allocated, it should be marked as "recently used".
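The hit and miss paths described above might look roughly like the following sketch. BlockStore is a hypothetical stand-in for the blocking Elevator.read and Elevator.write calls, and LruBufferPool is an illustrative name. Recency is tracked here with a simple counter of calls. The sketch holds the monitor during disk I/O, which is easy to reason about but serializes all callers.

```java
// Hypothetical stand-in for the blocking Elevator.read / Elevator.write calls.
interface BlockStore {
    int read(int blockNumber, byte[] data);
    int write(int blockNumber, byte[] data);
}

class LruBufferPool {
    private static final int EMPTY = -1; // marks a buffer that holds no block

    private static class Buffer {
        int block = EMPTY;
        boolean dirty = false;
        long lastUsed = 0; // 0 means never used, so empty buffers are reused first
        final byte[] data;
        Buffer(int blockSize) { data = new byte[blockSize]; }
    }

    private final Buffer[] buffers;
    private final BlockStore store;
    private final int blockSize;
    private long clock = 0; // counts buffer references so far

    LruBufferPool(int count, int blockSize, BlockStore store) {
        this.buffers = new Buffer[count];
        for (int i = 0; i < count; i++) {
            buffers[i] = new Buffer(blockSize);
        }
        this.blockSize = blockSize;
        this.store = store;
    }

    synchronized int read(int blockNumber, byte[] out) {
        Buffer b = find(blockNumber, true);
        if (b == null) return -1;
        System.arraycopy(b.data, 0, out, 0, blockSize);
        return 0;
    }

    synchronized int write(int blockNumber, byte[] in) {
        // A whole block is overwritten, so a miss need not read the old contents.
        Buffer b = find(blockNumber, false);
        if (b == null) return -1;
        System.arraycopy(in, 0, b.data, 0, blockSize);
        b.dirty = true;
        return 0;
    }

    // Returns the buffer for blockNumber, evicting the LRU buffer on a miss.
    private Buffer find(int blockNumber, boolean loadFromDisk) {
        if (blockNumber < 0) return null;
        Buffer victim = buffers[0];
        for (Buffer b : buffers) {
            if (b.block == blockNumber) {   // hit
                b.lastUsed = ++clock;
                return b;
            }
            if (b.lastUsed < victim.lastUsed) victim = b;
        }
        // Miss: write back the victim if it is dirty, then reuse it.
        if (victim.dirty && store.write(victim.block, victim.data) != 0) return null;
        victim.dirty = false;
        victim.block = EMPTY;
        if (loadFromDisk && store.read(blockNumber, victim.data) != 0) return null;
        victim.block = blockNumber;
        victim.lastUsed = ++clock;
        return victim;
    }
}
```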

There are two alternative strategies for implementing LRU. One technique is to keep a LinkedList of buffers, sorted in decreasing order of age. Whenever a buffer is "touched", remove it from the list and add it to the tail of the list. The other is to associate a "last reference time" with each buffer, where "time" is simply the total number of read and write calls thus far. Use whichever strategy you find easiest.
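The list-based strategy can be sketched in a few lines. LruList is a hypothetical helper name; it keeps the least recently used item at the head of the list and the most recently used item at the tail.

```java
import java.util.LinkedList;

// Hypothetical helper: head of the list is the least recently used item.
class LruList<T> {
    private final LinkedList<T> order = new LinkedList<>();

    // Move the item to the tail, adding it if it is not yet in the list.
    void touch(T item) {
        order.remove(item); // remove(Object) is a no-op if the item is absent
        order.addLast(item);
    }

    // Returns the least recently used item, or null if the list is empty.
    T leastRecentlyUsed() {
        return order.peekFirst();
    }
}
```

Removal from a LinkedList by value is a linear search, which should not matter for a cache of only a few buffers.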

Your BufferPool class also needs a method flush() that writes all dirty blocks back to disk. Add a call to this method to Kernel.doShutdown(). Note that this method doesn't have to be very fancy. Because it is called only at system shutdown, it doesn't have to be very efficient, and it doesn't have to worry about new requests arriving. When the MiniKernel shuts down cleanly it leaves behind a Unix file called DISK that contains the contents of the whole simulated disk. The next time you run MiniKernel, it will read this file to restore the contents of the simulated disk. If there is no DISK file, the simulated disk will start out with random data.

Errors

"User" programs have a habit of misbehaving. The Library and the Kernel should be vigilant in detecting invalid operations and returning an appropriate error value. You may define new error values if you like. For debugging purposes, you can use System.err.println to print error messages, but you should be aware that the only way a "real" operating system kernel indicates errors is by returning error result codes from system calls (unless you count the blue screen of death as "indicating errors":-).

Be very careful about race conditions and deadlocks. Use all the skills you learned from project 2. The synchronization for this project is surprisingly hard to get right. See the "Hints" section below and the FAQ for more advice.

Testing

Write a user-level test program called DiskTester.java that exercises the disk. Your program should do the following things:
Test for correctness:
Write test patterns to various disk blocks and then read back the data in a different order and check that you get what you expect. Different blocks should contain different test data.
Test a uniform distribution:
Read and write many blocks randomly across the disk.
Test a localized distribution:
Read and write a small selection of blocks many times, thus making good use of the buffer cache. For example, you might designate a small number of disk blocks as "hot" and choose to read or write hot blocks with much higher probability than cold blocks. This simulates a program with a small "working set".
You may either write three separate disk tester programs or one program that alters its behavior depending on command-line arguments.
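The helpers below sketch one way to generate per-block test data and to choose between hot and cold blocks. TestPatterns and its method names are hypothetical, and the sketch deliberately leaves out the actual read and write calls, since those depend on the library interface you define. It assumes the hot set is blocks 0 through hotBlocks - 1.

```java
import java.util.Arrays;
import java.util.Random;

// Hypothetical helpers for a disk test program.
class TestPatterns {
    // Fill the buffer with data derived from the block number, so that
    // different blocks hold different, reproducible contents.
    static void fill(byte[] data, int blockNumber) {
        new Random(blockNumber).nextBytes(data);
    }

    // Check that the buffer holds the pattern expected for this block.
    static boolean matches(byte[] data, int blockNumber) {
        byte[] expected = new byte[data.length];
        fill(expected, blockNumber);
        return Arrays.equals(data, expected);
    }

    // Pick a hot block with probability hotProbability, otherwise a cold one.
    // Requires 0 < hotBlocks < diskBlocks.
    static int pickBlock(Random rng, int diskBlocks, int hotBlocks, double hotProbability) {
        if (rng.nextDouble() < hotProbability) {
            return rng.nextInt(hotBlocks);
        }
        return hotBlocks + rng.nextInt(diskBlocks - hotBlocks);
    }
}
```

Setting hotProbability to a value such as 0.9 with a handful of hot blocks gives the localized distribution; the uniform distribution needs only a single random choice over all blocks.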

Run your test program with various arguments and note the behavior. In contrast to project 3, the emphasis in this project is on a correct implementation rather than performance analysis, but if your performance enhancements don't have a noticeable effect, you should try to find out why. Once you can correctly run one copy of DiskTester at a time, try running several copies at once, using the & feature of the Shell we supplied to you. For example, you might run several simultaneous tests of localized distributions with overlapping or non-overlapping working sets. The performance should be good if the total size of the buffer cache is big enough to hold all the working sets.

Hints

Grading

Your grade will be 70% for correctness, 20% for completeness of testing, and 10% for style. Don't forget the following:

What to Hand In

Place in the handin directory of the "senior" member of your team (the one whose login name comes first in alphabetical order): You may add new classes as you see fit. Do not modify Disk.java. As written, the Kernel automatically starts the program Shell. Do not change this feature. We will run your Kernel with our own test program and then examine your test program to verify the results.
Last modified: Wed Mar 31 11:49:31 CST 2004