May 10th at 11:59pm.
You are to design and implement a simple file system on top of a simulated
disk.
Simulated Disk
The simulated disk uses a Unix file named DISK to simulate a disk with NUM_BLOCKS blocks of BLOCK_SIZE bytes per block. It supports three methods: read, write, and stop.
You will need to implement files on the simulated disk and some way of
allocating disk blocks.
You will use an adaptation of the method used by Unix. (In fact, the scheme
described below is very similar to the one used by the original
version of Unix, the so-called ``Sixth Edition'' version for the
PDP-11.)
Super Block
Block 0 of the disk is the so-called ``super block'', which contains information about the rest of the disk. You will want to keep a copy of this block in memory at all times. It should be read in when the file system starts up, and written back out before shutting down. The super block should hold the following variables:
    class SuperBlock {
        public int size;      // size of the file system
        public int isize;     // number of blocks holding inodes
        public int freeList;  // first block of the free list
    }
The size of the file system is recorded in the super block to allow the file
system to use less than the whole disk and to support various sizes of disk.
In all the data structures on disk, a ``pointer'' to a disk block is an
integer in the range 1..NUM_BLOCKS-1.
Since block 0 is treated specially, you can use a block number of zero
to represent a null pointer.
Free Space
You will use the free list space management technique discussed
in Section 11.3.3 of the text on page 384.
More specifically, each block of the free
list contains Disk.POINTERS_PER_BLOCK
block numbers, where
POINTERS_PER_BLOCK = BLOCK_SIZE/4
(4 is the size of an integer in bytes).
The first of these is the block number of the next block on the
free list. The remaining entries are block numbers of additional free blocks
(whose contents are assumed to be meaningless). While the system is running,
you will want to keep a copy of the first block of the free list in memory.
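The free-list layout described above can be sketched as follows. This is only one possible reading of the scheme, not the required implementation: the class name FreeListSketch, the in-memory map standing in for the disk, the BLOCK_SIZE value of 512, and the policy of handing out the list block itself once its extra entries are used up are all assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;

class FreeListSketch {
    static final int BLOCK_SIZE = 512;                     // assumed value
    static final int POINTERS_PER_BLOCK = BLOCK_SIZE / 4;

    // Hypothetical stand-in for the disk: block number -> free-list block contents.
    final Map<Integer, int[]> disk = new HashMap<>();
    int freeList;   // block number of the first free-list block (0 = empty list)
    int[] head;     // in-memory copy of that block, or null when the list is empty

    // Builds a free list holding blocks firstData .. size-1.
    FreeListSketch(int firstData, int size) {
        for (int b = size - 1; b >= firstData; b--) {
            free(b);
        }
    }

    // Returns a free block number, or -1 if no blocks are left.
    int allocate() {
        if (freeList == 0) return -1;
        // Entries 1..N-1 name additional free blocks; 0 marks an empty slot.
        for (int i = 1; i < POINTERS_PER_BLOCK; i++) {
            if (head[i] != 0) {
                int b = head[i];
                head[i] = 0;
                return b;
            }
        }
        // No extra entries left: hand out the list block itself and
        // advance to the next list block, named by entry 0.
        int b = freeList;
        freeList = head[0];
        head = (freeList == 0) ? null : disk.remove(freeList);
        return b;
    }

    // Puts a block back on the free list.
    void free(int block) {
        if (freeList != 0) {
            for (int i = 1; i < POINTERS_PER_BLOCK; i++) {
                if (head[i] == 0) {
                    head[i] = block;
                    return;
                }
            }
            disk.put(freeList, head);   // head is full: write it back
        }
        // The freed block becomes the new first block of the list.
        int[] newHead = new int[POINTERS_PER_BLOCK];
        newHead[0] = freeList;
        head = newHead;
        freeList = block;
    }
}
```

Only the in-memory head is touched on most calls; the stand-in disk is accessed only when the head fills up or runs empty, which is the reason for keeping a copy of the first free-list block in memory.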
File Structure
The technique used is the third method described in Section 11.2.3 of the text on page 379. Each file in the system is described by an index node (inode for short). [1]
    class Inode {
        public final static int SIZE = 64; // size in bytes
        public int flags;
        public int owner;
        public int size;
        public int ptr[] = new int[13];
    }
If the flags field is zero, the inode is unused. In a real file system, the bits of this int distinguish different types of file (directories, data files, etc.) and indicate permissions. You do not have to implement these features. Similarly, you may ignore the owner field. The size field indicates the current size of the file, in bytes.
Block 0 of the disk is the super block. Blocks 1 through isize are packed with inodes. [1]
    class InodeBlock {
        public Inode node[] = new Inode[Disk.BLOCK_SIZE/Inode.SIZE];
    }
The remaining blocks may be allocated as direct or indirect blocks, or placed on the free list. They are collectively known as data blocks.
The data blocks that contain the contents of the files are called direct blocks. The ptr fields in an inode point (directly or indirectly) to these blocks. The first 10 pointers point to the first 10 direct blocks. The 11th pointer (ptr[10]) points to an indirect block. This block contains pointers to the next Disk.BLOCK_SIZE/4 direct blocks of the file. [1]
    class IndirectBlock {
        public int ptr[] = new int[Disk.BLOCK_SIZE/4];
    }
(Note that the blocks on the free list have the same format.) Pointer ptr[11] points to a ``doubly indirect'' block. It is filled with pointers to indirect blocks, each of which contains pointers to data blocks. Similarly, the final pointer points to a ``triply indirect'' block. The size of the file is determined by the size field of the inode, not by the pointers.
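To make the direct, indirect, doubly indirect, and triply indirect layout concrete, here is a small sketch that translates a block's position within a file into the chain of indices to follow. The class name BlockPath, the method name path, and the BLOCK_SIZE value of 512 are hypothetical and chosen only for illustration; your own design may organize this differently.

```java
class BlockPath {
    static final int BLOCK_SIZE = 512;      // assumed value
    static final long P = BLOCK_SIZE / 4;   // pointers per indirect block
    static final int DIRECT = 10;           // ptr[0..9] are direct pointers

    // Returns the indices to follow for the n-th block of a file (n counts
    // from 0). The first entry indexes the inode's ptr[] array; each later
    // entry indexes the indirect block reached by the previous step.
    // Returns null if n is negative or beyond what a triply indirect
    // block can address.
    static int[] path(long n) {
        if (n < 0) return null;
        if (n < DIRECT) return new int[] { (int) n };
        n -= DIRECT;
        if (n < P) return new int[] { 10, (int) n };
        n -= P;
        if (n < P * P) return new int[] { 11, (int) (n / P), (int) (n % P) };
        n -= P * P;
        if (n < P * P * P) {
            return new int[] { 12, (int) (n / (P * P)), (int) ((n / P) % P), (int) (n % P) };
        }
        return null;
    }
}
```

With 128 pointers per block, for example, block 10 of a file is the first entry of the indirect block, and block 138 is the first entry of the first indirect block under the doubly indirect block.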
A null pointer (either in the inode or in one of the indirect blocks) may indicate a hole in the file. For example, if the size field indicates that there should be five blocks, but ptr[2]==0, then the third block constitutes a hole. Similarly, if the file is large enough and ptr[10]==0, then blocks 11 through POINTERS_PER_BLOCK + 10 are all holes. Attempts to read from a hole act as if the hole were filled with zeros; an attempt to write into a hole causes the hole to be ``filled in'': Blocks are allocated as necessary and added to the file. Holes are created by seeking beyond the end of the file and then writing.
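The read side of hole handling can be illustrated with a deliberately small sketch that covers only the direct pointers. The class name HoleRead, the byte[][] array standing in for the disk, and the BLOCK_SIZE value of 512 are assumptions for illustration; a complete version would also have to follow the indirect pointers.

```java
class HoleRead {
    static final int BLOCK_SIZE = 512;   // assumed value

    // Returns the contents of the n-th block of a file, for n in 0..9 only
    // (direct pointers). A null pointer (0) is a hole and reads as zeros.
    static byte[] readFileBlock(byte[][] disk, int[] ptr, int n) {
        byte[] out = new byte[BLOCK_SIZE];   // Java zero-fills new arrays
        int block = ptr[n];
        if (block != 0) {
            System.arraycopy(disk[block], 0, out, 0, BLOCK_SIZE);
        }
        return out;
    }
}
```

Because the buffer starts out zero-filled, the hole case needs no extra work: the method simply skips the disk read.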
Inodes are numbered consecutively starting at 1 (not zero!),
so block 1 of the disk contains inodes 1..Disk.BLOCK_SIZE/Inode.SIZE,
and so on.
Files are
referenced by these numbers (called ``index numbers'', or inumbers
for short). In a real file system, directory files are used to
translate mnemonic names to inumbers, but for this project, we will use
inumbers directly.
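Since inumbers start at 1 and inode blocks start at disk block 1, locating an inode is a matter of simple arithmetic. The sketch below shows one way to do it; the class and method names are hypothetical, and BLOCK_SIZE is assumed to be 512 (so eight 64-byte inodes fit in a block).

```java
class InodeLocator {
    static final int BLOCK_SIZE = 512;   // assumed value
    static final int INODE_SIZE = 64;    // Inode.SIZE
    static final int INODES_PER_BLOCK = BLOCK_SIZE / INODE_SIZE;

    // Disk block that holds the inode with the given inumber.
    static int blockOf(int inumber) {
        return 1 + (inumber - 1) / INODES_PER_BLOCK;
    }

    // Index of that inode within the node[] array of its InodeBlock.
    static int slotOf(int inumber) {
        return (inumber - 1) % INODES_PER_BLOCK;
    }

    // True if the inumber falls inside inode blocks 1..isize.
    static boolean isValid(int inumber, int isize) {
        return inumber >= 1 && inumber <= isize * INODES_PER_BLOCK;
    }
}
```

Under these assumptions, inumbers 1 through 8 live in block 1 and inumber 9 is the first inode of block 2.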
Other Disk Operations
The data structures SuperBlock, InodeBlock, and IndirectBlock are all the same size, so any one of them can be written to or read from any disk block. For your convenience, we have added three ``overloaded'' versions of read and write to the Disk interface.
    class Disk {
        public Disk() {}
        public void read(int blocknum, byte[] buffer) {}
        public void read(int blocknum, SuperBlock block) {}
        public void read(int blocknum, InodeBlock block) {}
        public void read(int blocknum, IndirectBlock block) {}
        public void write(int blocknum, byte[] buffer) {}
        public void write(int blocknum, SuperBlock block) {}
        public void write(int blocknum, InodeBlock block) {}
        public void write(int blocknum, IndirectBlock block) {}
        public void stop() {}
    }
You must implement the class FileSystem that contains the following ten methods.
    class FileSystem {
        public int formatDisk(int size, int isize){}
        public int shutdown(){}
        public int create(){}
        public int inumber(int fd){}
        public int open(int inumber){}
        public int read(int fd, byte[] buffer){}
        public int write(int fd, byte[] buffer){}
        public int seek(int fd, int offset, int whence){}
        public int close(int fd){}
        public int delete(int inumber){}
    }
In the tradition of C programming, each function returns an integer value, with -1 meaning ``error'' and a non-negative value (0 unless specified otherwise) meaning ``success.'' [2]
The argument to open is the inumber of an existing file. [4] The method inumber returns the inumber of the file corresponding to an open file descriptor.
The method write transfers buffer.length bytes from buffer to the file starting at the current seek pointer and advances the seek pointer by that amount. It is not an error if the seek pointer is greater than the size of the file. In this case, holes may be created.
The method seek modifies the seek pointer as follows.
    public static final int SEEK_SET = 0;
    public static final int SEEK_CUR = 1;
    public static final int SEEK_END = 2;
    ...
    switch (whence) {
        case SEEK_SET: seekPointer = offset; break;
        case SEEK_CUR: seekPointer += offset; break;
        case SEEK_END: seekPointer = file_size + offset; break;
    }
In case 0 (SEEK_SET), the offset is from the beginning of the file. In case 1 (SEEK_CUR), offset is relative to the current seek pointer. In case 2 (SEEK_END), offset is relative to the end of the file. For cases 1 and 2, the value of the parameter offset can be positive or negative; however, the resulting seekPointer must always be positive or zero. If a call to seek would result in a negative value for the seek pointer, the seek pointer is unchanged and the call returns -1. Otherwise, the value returned is the new seek pointer (distance in bytes from the start of the file).
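The rule that a failed seek leaves the pointer unchanged is easiest to honor by computing the target first and assigning it only after it has been checked. Here is a minimal sketch of that idea for a single open file. The class name SeekSketch and its fields are hypothetical, and returning -1 for an unrecognized whence value or for a target too large for an int are assumptions that the rules above do not spell out.

```java
class SeekSketch {
    static final int SEEK_SET = 0;
    static final int SEEK_CUR = 1;
    static final int SEEK_END = 2;

    int seekPointer;   // current position in the file, in bytes
    int fileSize;      // size field of the file's inode

    // Returns the new seek pointer, or -1 (pointer unchanged) on error.
    int seek(int offset, int whence) {
        long target;   // long, so the additions below cannot overflow
        switch (whence) {
            case SEEK_SET: target = offset; break;
            case SEEK_CUR: target = (long) seekPointer + offset; break;
            case SEEK_END: target = (long) fileSize + offset; break;
            default: return -1;   // assumption: unknown whence is an error
        }
        if (target < 0 || target > Integer.MAX_VALUE) {
            return -1;
        }
        seekPointer = (int) target;
        return seekPointer;
    }
}
```

Seeking past the end of the file is allowed here, since holes are created by seeking beyond the end and then writing.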
Although this is a large project, it should be manageable if you break it down into small pieces. Here is one way (but not the only possible way!) to decompose the problem. The tasks are listed roughly in the order they are needed, although in some cases they are inter-dependent.
When you are done, you should thoroughly test all ten required
functions, including creating, reading, writing, closing, reopening, and
clearing all sorts of files (small, large, filled with holes, etc.). You
should also test the error checking in your code.
The main program we supply should be very handy in helping you to do this.
Extra Credits
If you get everything working and thoroughly tested early, you might consider adding the following two extra-credit features.
Dirty InodeBlocks must be written back to disk at shutdown.
Dirty blocks must be written back to disk at shutdown.
Each of the two parts counts for 10% extra credit for the project.
I cannot stress too strongly, however, that you should not even think
of adding these features until the required part of the project is
completely written and debugged.
Program Structure
We have provided several files, all of which may be found in the directory
~cs537-2/public/project5/
The main method in class Proj5 implements a simple command interpreter. You can either use it interactively by invoking the program as
    java Proj5
or you can have it run a test script by typing, for example,
    java Proj5 test1
Input lines starting with ``/*'' or ``//'' are ignored (the latter are echoed to the output). Other lines have the format
    [ var = ] command [ args ]
The optional prefix var = causes the result of the command to be assigned to a variable. In any case, the result of the command is printed. There is one command for each of the ten methods of class FileSystem as well as three additional commands: help, vars, and quit. The help command prints a list of commands, the vars command lists the current values of all interpreter variables that have been assigned values, and the quit command terminates the program. With the exception of the second argument to write, each argument can be either an integer or the name of a variable. The command
    write fd pattern size
writes size bytes to the indicated file at the current offset. The data is generated by repeating pattern over and over the required number of times.
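If you want to generate the same kind of test data in your own test code, the repetition can be sketched as below. This is not the code in the supplied Proj5; the class name PatternFill, the use of ASCII encoding, and the choice to cut the last repetition short when size is not a multiple of the pattern length are assumptions for illustration.

```java
import java.nio.charset.StandardCharsets;

class PatternFill {
    // Builds size bytes by repeating pattern; the final copy may be cut short.
    static byte[] fill(String pattern, int size) {
        byte[] src = pattern.getBytes(StandardCharsets.US_ASCII);
        if (src.length == 0 || size < 0) {
            throw new IllegalArgumentException("need a non-empty pattern and size >= 0");
        }
        byte[] out = new byte[size];
        for (int i = 0; i < size; i++) {
            out[i] = src[i % src.length];
        }
        return out;
    }
}
```

A buffer built this way can be passed directly to write, and a recognizable pattern makes it easy to spot misplaced blocks when reading the data back.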
You are to prepare a report describing the design and structure of your
directory and file system. The report should be not more than two
typewritten pages, not including diagrams. You should carefully describe
all design decisions you made and explain how these decisions affect the
performance of your file/directory system. You may assume that this handout
is part of the documentation of your program. Thus you need not repeat
information that is in this handout.
Handing In
You must work in groups of 2 for this project.
~cs537-2/handin/{your-login-name}/p5
As always, points will be deducted for code that fails to satisfy the minimal criteria for comments and structure specified in the hand-in directions for project number 2.
[1] There is also an artifact of Java here that would not be present in a real system. In Java, the Inode structure is stored in memory as three integers followed by a pointer to an array of thirteen more integers. There would also be additional information to indicate the type of the Inode structure and the size of the array. On disk, however, the Inode structure is simply 16 integers in a row, like the C structure
    struct inode {
        int flags;
        int owner;
        int size;
        int ptr[13];
    };
Unfortunately, there's no easy way to create exactly this structure in memory in Java, but fortunately, you will probably never notice the difference. Similar remarks apply to InodeBlock and IndirectBlock.
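For the curious, the ``16 integers in a row'' layout can be reproduced in Java with java.nio.ByteBuffer, as in the sketch below. The overloaded Disk.read and Disk.write methods already handle this for you, so nothing like this is required for the project. The class name InodeCodec is hypothetical, and the byte order (ByteBuffer's default, big-endian) is an assumption; the actual DISK file may use a different one.

```java
import java.nio.ByteBuffer;

class InodeCodec {
    static final int SIZE = 64;   // 16 ints of 4 bytes each, as in Inode.SIZE

    // Writes flags, owner, size, and ptr[0..12] as 16 consecutive ints
    // into block, starting at the given byte offset.
    static void pack(int flags, int owner, int size, int[] ptr, byte[] block, int offset) {
        ByteBuffer buf = ByteBuffer.wrap(block, offset, SIZE);
        buf.putInt(flags).putInt(owner).putInt(size);
        for (int i = 0; i < 13; i++) {
            buf.putInt(ptr[i]);
        }
    }

    // Reads the 16 ints back: [0]=flags, [1]=owner, [2]=size, [3..15]=ptr[0..12].
    static int[] unpack(byte[] block, int offset) {
        ByteBuffer buf = ByteBuffer.wrap(block, offset, SIZE);
        int[] fields = new int[16];
        for (int i = 0; i < 16; i++) {
            fields[i] = buf.getInt();
        }
        return fields;
    }
}
```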
[2] A real system would need some way to indicate what sort of error occurred. In Unix, the nature of the error is indicated by an integer error code placed in a global variable called errno. For this project, you can just print an error message. A more ``Java-like'' design would use exceptions to indicate errors.
[3] In real Unix, this array is split into three parts. Each process has its own table of open files. There is a single system-wide table of so-called ``in-core inodes'' shared among all processes. Each entry in this table has a reference count so that it can be removed when the last process closes the file. Seek pointers are kept in yet another system-wide table so that there can be multiple seek pointers into the same file, and multiple processes can share a seek pointer. For this project, you can combine all this information into one table.
[4] In real Unix, the argument is a pathname. The file system uses the directories to translate this name into an inumber.
[5] In real Unix, deletion is delayed until all processes that have the file open close it.