Project #2: A Pipe Dream

Project Due Date: Thursday, March 7 by 11:59 PM

0.0 Updates and Notes

3/1: Just as in the last project, people have been sending questions to cs537-question@cs and they are getting logged here. Sorry if people didn't realize this earlier.

3/1: Tip: To copy strings, use strcpy(). To copy data, use memcpy() or bcopy(). Do not use strcpy() to copy data! In other words, realize that the data written to a pipe may not be a string: all strings terminate in '\0', whereas arbitrary data may not (and may contain '\0' bytes in the middle).
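To make the tip concrete, here is a minimal sketch of copying one fixed-size message. The helper name copy_message() is made up for illustration, and the value given to BUFFER_SIZE is only a stand-in; the real constant comes from LibOS.h.

```c
#include <assert.h>
#include <string.h>

#define BUFFER_SIZE 512  /* stand-in value; the real constant comes from LibOS.h */

/* Hypothetical helper: copy exactly one fixed-size message.
 * memcpy() copies all BUFFER_SIZE bytes, even if some of them are '\0';
 * strcpy() would stop at the first '\0' and silently drop the rest. */
void copy_message(char *dst, const char *src)
{
    assert(dst != NULL && src != NULL);
    memcpy(dst, src, BUFFER_SIZE);
}
```

With a message whose fourth byte is '\0', strcpy() would copy only four bytes, whereas copy_message() still copies the whole buffer.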

3/1: Tip: Within the OS, make sure to keep track of which processes open a pipe, so that you can later check during reads and writes that the same process that opened it for reading is indeed reading it. The way to do this is to save the sockaddr structures of the processes somewhere during open and compare them on subsequent reads and writes. If you don't do this, someone could just call Pipe_Read(), and if they guess a valid descriptor, they will be able to read the data from a pipe they haven't opened! A good way to compare two arbitrary structures may be to use memcmp().
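One way the comparison might look is sketched below. The helper names fill_addr() and same_process() are hypothetical (they are not part of Domain.c), and the sketch assumes that every address is zeroed before it is filled in, since memcmp() also compares padding and unused bytes of the path.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>

/* Hypothetical helper: build an address with every byte in a known state,
 * so that a later memcmp() is not confused by padding or stale bytes. */
void fill_addr(struct sockaddr_un *addr, const char *path)
{
    assert(strlen(path) < sizeof addr->sun_path);
    memset(addr, 0, sizeof *addr);
    addr->sun_family = AF_UNIX;
    memcpy(addr->sun_path, path, strlen(path));
}

/* Returns 1 if both addresses are byte-for-byte identical, 0 otherwise. */
int same_process(const struct sockaddr_un *a, const struct sockaddr_un *b)
{
    return memcmp(a, b, sizeof *a) == 0;
}
```

The same caution applies to addresses that are received rather than built by hand: zeroing the structure before each read keeps leftover bytes from a previous, longer path out of the comparison.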

3/1: Tip: Within the OS, be careful when passing data from the main OS thread to the worker threads. For example, if you call Domain_Read() to read a message into a structure, and then pass a pointer to that structure to a thread, you could get into trouble if the main thread calls Domain_Read() again before the worker thread is done with the structure. Thus, if passing data to a thread, make sure to allocate it on the heap, and then have the thread free it when it is finished with it.

2/25: A new error code known as E_INVALID_OP has been introduced. This error should be signalled when a pipe open for reading is written to, or a pipe open for writing is read from. LibOS.h has been updated to reflect this change. The project description has also been updated to reflect this change (see below), and highlighted in red for your viewing pleasure.

2/18: A number of small discrepancies between the LibOS example code and this document have been resolved. However, please assume that this document provides the exact specification of functionality required.

2/18: Document corrected to say osErrno instead of osError.

1.0 Overview

Project 2 is also about pipes (yes!), but this time you are going to build your own pipe abstraction within your user-level microkernel-based operating system. That is, processes that are communicating with your OS process will be able to tell your OS to create a pipe, open it, and read and write from it just like Unix pipes. Well, not quite like Unix pipes (actually a restricted, simpler version), but you get the picture.

A couple of things will make this project tricky. First, processes will have to communicate with your OS via domain sockets. These sockets provide a simple and quite general interprocess communication method (more general than the pipes you used in the previous assignment), but require you to learn how to use a new form of IPC. Second, your OS will have to be multi-threaded. Specifically, whenever your OS receives a request (such as a "pipe create" or "pipe open", etc.), instead of doing whatever work is needed to fulfill the request immediately, it will instead hand-off the request to one of a pool of threads inside the OS. One of those threads will then service the request, and send the result back to the requesting process.

Fortunately, you will get a little help with some of the code. In particular, we will give you example code that communicates via domain sockets, and some skeleton code to help programs access your OS. We will also describe what basic structure your OS and processes should take on. Finally, we will give some example code that starts a thread to do some work, and point you to some useful synchronization primitives to use. More details on all of these things are found below.

This project is hard, so start early! A lot of coding is involved, as well as careful design work to make sure your structure will be satisfactory.

2.0 Details

We'll now describe some details about the project. First, we'll tell you the basic structure your programs should have, and what functionality is required by the pipes you are implementing. Then, we'll tell you about a number of things that will be useful in doing the assignment: IPC with domain sockets, how to create a dynamically-linked library, the message format, multi-threading and synchronization, and error checking. After that, you'll find some hints on how to split up the work, and finally some details on handing it all in when you are finished.

2.1 Basic Structure

In this project, you will implement two pieces of code. The first is the OS process, similar to the OS in the last project (although obviously much more complex). The second is a library called LibOS: programs that wish to communicate with your OS will link with this library. The library internally will communicate via domain sockets to the OS to provide whatever service the application is requesting.

Your OS will provide a very limited set of pipe-like services. A process, by linking with the LibOS library, will be able to call routines to create, open, read, write, and close pipes. These pipes are much more restricted than the typical Unix pipe, in that you cannot write arbitrary-sized messages to a pipe. Instead, a write writes a fixed-sized message to the pipe (the size is a pre-determined constant BUFFER_SIZE), and reads retrieve a single fixed-sized message that has been written to the pipe. Otherwise, many of the semantics are similar to the traditional pipe.

Let's go through a simple example of how this will all work. Let's say a program called main.c wants to use the pipe functionality provided by the OS. What you would do in main.c is this: first, you would include the LibOS header file, called LibOS.h. As you can see, we are providing this header file, and you should not be adding to it. Then, you would compile main and link it with the LibOS library (this process is described further below). Assuming that the OS is already running (after all, it is a separate process), you would run the main program. When this program calls into the LibOS library (let's say to open a pipe with Pipe_Open()), the library will contact the OS process (via domain sockets) and tell the OS what the request was. The OS will then figure out how to service the request, and reply to the library, which will somehow convey the results of the call (success or failure) to the main program. More details on all of these steps are found below.

In the general case, you will probably run at least three processes: the OS itself, a program that reads from a pipe, and a program that writes to a pipe. Of course, while always running a single OS, you should be able to run many programs that are concurrently reading and writing to different pipes.

2.2 How To Run The Programs

You should make the OS runnable as follows:



prompt> ./os [-d] filename poolSize

The filename that is passed to the OS is the name that it will be bound to; other processes will use this name to direct their messages to the OS process. The poolSize is used to specify the number of service threads that the OS should start up. More details on each of these is found below. The -d flag is optional (that is, it can be anywhere on the command line, or not on the command line at all); it is used to specify that the OS should take preventative action to make sure a deadlock does not occur (more details on this below). Of course, if the arguments are not specified correctly, your OS should print an error message and exit.
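Since -d may appear anywhere on the command line, the parsing is slightly more than reading argv[1] and argv[2]. The following is a sketch of one possible approach; the type Options_t and the function parse_args() are invented for this example, and the sketch assumes that poolSize must be a positive integer.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>

typedef struct {
    int         avoid_deadlock; /* 1 if -d was given */
    const char *filename;       /* name the OS binds to */
    int         pool_size;      /* number of worker threads */
} Options_t;                    /* hypothetical type, not part of the handout */

/* Returns 0 on success, -1 if the command line is malformed. */
int parse_args(int argc, char *argv[], Options_t *opt)
{
    const char *positional[2];
    int npos = 0;

    assert(opt != NULL);
    opt->avoid_deadlock = 0;
    for (int i = 1; i < argc; i++) {
        if (strcmp(argv[i], "-d") == 0) {
            opt->avoid_deadlock = 1;    /* flag may appear in any position */
        } else if (npos < 2) {
            positional[npos++] = argv[i];
        } else {
            return -1;                  /* too many arguments */
        }
    }
    if (npos != 2)
        return -1;                      /* filename or poolSize missing */

    char *end;
    long n = strtol(positional[1], &end, 10);
    if (end == positional[1] || *end != '\0' || n <= 0 || n > INT_MAX)
        return -1;                      /* poolSize is not a positive integer */

    opt->filename = positional[0];
    opt->pool_size = (int)n;
    return 0;
}
```

In main(), a return value of -1 would be the cue to print a usage message and exit.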

As for LibOS, you are just creating a library, which should be a file called libOS.so. This library has a pre-defined set of interfaces, as defined below. Libraries don't have a main routine, so you will have to create your own test programs in order to test if your library (and your OS) are working correctly. Thus, for testing, you might write two programs: pipe_reader.c (which opens a pipe and reads from it) and pipe_writer.c (which opens a pipe and writes to it). Both of these will have to be linked with the LibOS, as described below.

Important Note: for testing, we will be linking our own programs with your library. Thus, it is very important that you don't change any of the interfaces specified for the LibOS!

2.3 LibOS Functionality

Your OS will provide service for six different routines, as shown here. These routines will be available to applications as a part of your LibOS. Internally (that is, inside of each of these routines), the LibOS may contact your OS process in order to actually do the real work. Upon an error, the appropriate error code should be returned (as described below), and osErrno should be set to the appropriate value (also as described below). Applications can then access osErrno to find out what went wrong.

int Pipe_Create(char *name, int numBuffers);

int Pipe_Open(char *name, Pipe_t mode);

int Pipe_Read(int pipe, char *buffer);

int Pipe_Write(int pipe, char *buffer);

int Pipe_Close(int pipe);

int OS_Init(char *filename);

int Pipe_Create(char *name, int numBuffers);
Pipe_Create takes two arguments: the first is a string that names the pipe, and the second is the number of buffers that the OS should create for buffering data for this particular pipe. Each buffer is of a fixed size, which is defined by BUFFER_SIZE in LibOS.h. The OS will use the numBuffers argument in order to create some buffer space inside of the OS for the pipe, specifically of size (numBuffers * BUFFER_SIZE). If the pipe does not already exist, a pipe of that name should be created. If the pipe is successfully created, Pipe_Create() should return a 0 to the user. If the pipe already exists or cannot otherwise be created, -1 should be returned, and the library global variable osErrno should be set appropriately. Specifically, if the name is a null pointer or too long (as compared to MAX_STR) or numBuffers is less than or equal to 0, -1 should be returned and osErrno should be set to E_INVALID_ARGS. If the pipe already exists, osErrno should be set to E_PIPE_EXISTS. Finally, if some other failure occurs in the process, osErrno should be set to E_CREATE_PIPE. All errors are defined as a part of an enumerated type in LibOS.h.
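The argument checks can be done in the library before any message is sent. Here is a rough sketch; check_create_args() is a made-up helper, the definitions of MAX_STR, E_INVALID_ARGS, and osErrno below are stand-ins for the real ones in LibOS.h, and "too long" is assumed to mean that the name plus its terminating '\0' does not fit in MAX_STR bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Stand-ins for definitions that really live in LibOS.h. */
#define MAX_STR 64
enum { E_INVALID_ARGS = 1 };
int osErrno = 0;

/* Hypothetical helper: returns 0 if the arguments are acceptable;
 * otherwise sets osErrno to E_INVALID_ARGS and returns -1. */
int check_create_args(const char *name, int numBuffers)
{
    if (name == NULL || numBuffers <= 0 || strlen(name) >= MAX_STR) {
        osErrno = E_INVALID_ARGS;
        return -1;
    }
    return 0;
}
```

Errors such as E_PIPE_EXISTS can only be detected by the OS itself, so they would travel back to the library in the reply message.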

int Pipe_Open(char *name, Pipe_t mode);
Pipe_Open takes two arguments, the name of the pipe to be opened, and the mode, which is an enumerated type found in LibOS.h. There are two modes that an application can specify: PIPE_READER, and PIPE_WRITER. Pipe_Open should block until a reader and writer have both opened it; in other words, when the reader has opened it (but no writer yet), the call to Pipe_Open should not return until the writer has also opened the pipe. Upon success, Pipe_Open() should return a file descriptor that the application can use in subsequent read and write calls. Upon any failure, -1 should be returned, and osErrno set appropriately. For any pipe, there can be at most one reader and one writer. Thus, if a second reader or writer tries to open the given pipe, -1 should be returned and osErrno set to E_PIPE_FULL. If the name is NULL or that pipe doesn't exist or the mode is not appropriate, -1 should be returned and osErrno set to E_INVALID_ARGS.

int Pipe_Read(int pipe, char *buffer);
Pipe_Read takes two arguments: the first is a descriptor (as returned by Pipe_Open), the second is a pointer to a buffer where data read from the pipe will be placed. When the user calls Pipe_Read, the routine will block until data is available, at which point a single buffer of size BUFFER_SIZE will be copied into buffer and the size of the read (in our case, always BUFFER_SIZE) will be returned to the user. Once the pipe is closed by the writer, the reader should still be able to read any data that has been placed in the pipe, but once that data is gone, Pipe_Read should return 0 to indicate that the pipe has been shutdown. If a bad descriptor is passed to Pipe_Read, -1 should be returned and osErrno set to E_INVALID_ARGS. If the pipe has been opened for writing, calling Pipe_Read should return -1 and osErrno should be set to E_INVALID_OP.

int Pipe_Write(int pipe, char *buffer);
Pipe_Write is quite similar to Pipe_Read: it takes two arguments, a descriptor and a buffer that points to the data to be written to the pipe. If the data is successfully written to the pipe, Pipe_Write will return the number of bytes written (again, this should be BUFFER_SIZE, because we are only dealing with fixed-sized buffers). Pipe_Write also can block in the OS, though, if the buffers are all full and the reader has not yet read the data from the pipe. In this case, Pipe_Write should not return until the data has been written to the pipe. If the reader closes the pipe, any subsequent writes should return 0. If a bad descriptor is passed to Pipe_Write, -1 should be returned and osErrno set to E_INVALID_ARGS. If the pipe has been opened for reading, calling Pipe_Write should return -1 and osErrno should be set to E_INVALID_OP.

int Pipe_Close(int pipe);
Pipe_Close takes a descriptor and closes the pipe associated with that descriptor. Pipe_Close should block until the OS has informed the library that the pipe is indeed closed. Closing the pipe has certain side-effects on subsequent reads or writes by the other pipe member, as described in the read and write sections above. If the close is successful, 0 should be returned. If a bad descriptor is passed to Pipe_Close, -1 should be returned and osErrno set to E_INVALID_ARGS.

int OS_Init(char *filename);
This routine is called once by any program that wishes to use the LibOS services, and it takes a single argument. This argument is the name of the file that the OS is bound to, and should be used by the LibOS to send messages to the OS in the Pipe_XXX() routines. Internally, OS_Init() should probably call Domain_Open() and Domain_FillSockAddr(filename) to prepare for upcoming communication that will occur within the Pipe_XXX() routines. Upon success, this should return 0, but if for some reason this initialization fails, the routine should return -1 (no setting of osErrno is necessary here).

Note 1: If the Pipe_XXX() routines are called before OS_Init() has been called, that is an error; the pipe routines should check that OS_Init() has been called, and if not, return -1 and set osErrno to E_INIT.

Note 2: For all calls, if the library can't communicate successfully with the OS, -1 should be returned and osErrno should be set to E_COMM_FAILURE.
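The check described in Note 1 could be as small as the sketch below. The names lib_initialized, lib_mark_initialized(), and lib_require_init() are hypothetical, and the enum values and osErrno definition are stand-ins for what LibOS.h actually provides.

```c
#include <assert.h>

/* Stand-ins for definitions that really live in LibOS.h. */
enum { E_INIT = 1, E_COMM_FAILURE = 2 };
int osErrno = 0;

static int lib_initialized = 0;  /* hypothetical library-private flag */

/* Would be called at the end of a successful OS_Init(). */
void lib_mark_initialized(void)
{
    lib_initialized = 1;
}

/* Each Pipe_XXX() routine could begin with:
 *     if (lib_require_init() < 0) return -1;                       */
int lib_require_init(void)
{
    if (!lib_initialized) {
        osErrno = E_INIT;
        return -1;
    }
    return 0;
}
```

Note 2 would follow the same pattern: wherever a send or receive to the OS fails, set osErrno to E_COMM_FAILURE and return -1.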

2.4 Interprocess Communication with Domain Sockets

For this project, you will be using domain sockets to communicate between the library LibOS and the OS process. Unfortunately, you may not know how to use these sockets! Therefore, we are providing some sample code to get you going. In particular, check out Domain.c and Domain.h, because they contain exactly what you need to set up a communication channel between your library and the OS process.
There are five routines that are of interest:

int Domain_FillSockAddr(struct sockaddr_un *addr, char *file);

int Domain_Open(char *file);

int Domain_Close(int fd);

int Domain_Write(int fd, struct sockaddr_un *addr, char *vptr, int n);

int Domain_Read(int fd, struct sockaddr_un *addr, char *vptr, int n);

When you want to set up a socket to send and receive messages with, you should call Domain_Open(). Notice that this routine takes a file name: this name should be of a file located in /tmp, such as /tmp/os.fifo. Note that the OS program will have to call Domain_Open() with the file name that is passed in on the command line. Also note that each library has to call Domain_Open() with a file name too, although it does not need to be a well-known file name. For this, you should use the tmpnam() library call (do a "man tmpnam" for more info) to return a temporary filename to use inside of Domain_Open(). Domain_Open() returns an integer which is a socket descriptor, and you can think of this in many ways as quite similar to a file descriptor: you will be using it to send and receive messages from other processes. Here's an example of how you might use this in your process:



int socket = Domain_Open(tmpnam(NULL)); // library might use this code
if (socket < 0) {
    // signal error
}

Once opened, to send a message to a process, you need to know the name of the file that the other process has specified when it called Domain_Open(). In this project, this means that the library LibOS will somehow have to be informed of the file name that the OS process has used. Let's assume that the OS is running, and has called Domain_Open("/tmp/os.fifo"); in this case, we will say that the OS has bound itself to the file name /tmp/os.fifo. For the library to send a message to the OS process that is bound to /tmp/os.fifo, it has to construct a proper address which can then be passed to the Domain_Write() routine. Fortunately, this functionality has been provided in Domain_FillSockAddr(). Before trying to send a message, use Domain_FillSockAddr() to set up the address of the message recipient. Here's an example:



struct sockaddr_un addr; 

Domain_FillSockAddr(&addr, "/tmp/os.fifo");

Once this address has been filled in, you can use Domain_Write() to send a buffer to the remote process (in this case, the OS):



char buffer[512];
int rc = Domain_Write(socket, &addr, buffer, 512);
if (rc != 512) {
    // there was some trouble!
}

Of course, on the receive side, you have to be able to receive a message, as well as to identify whom it was from. This is all achieved quite readily with Domain_Read(), as in the following example:



char buffer[512];
struct sockaddr_un receiveAddr;
int rc = Domain_Read(socket, &receiveAddr, buffer, 512);
if (rc != 512) {
    // didn't read as much as expected (but only an error if < 0)
}

Once you have received a message (and Domain_Read() will block until one has been received), you can then use receiveAddr to send a response back to the sender, which may be quite handy!
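If you want to see the whole request/reply round trip in one place, the sketch below does it with the raw socket calls directly. It does not use Domain.c; dgram_open() and round_trip() are invented for this example, and the sketch assumes datagram (SOCK_DGRAM) sockets in the AF_UNIX domain. Both ends live in a single process here purely to keep the example short.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

/* Create a datagram socket bound to `path`; returns the descriptor or -1. */
int dgram_open(const char *path)
{
    struct sockaddr_un addr;

    assert(path != NULL);
    if (strlen(path) >= sizeof addr.sun_path)
        return -1;
    memset(&addr, 0, sizeof addr);
    addr.sun_family = AF_UNIX;
    strcpy(addr.sun_path, path);

    int fd = socket(AF_UNIX, SOCK_DGRAM, 0);
    if (fd < 0)
        return -1;
    unlink(path);                       /* remove a stale socket file, if any */
    if (bind(fd, (struct sockaddr *)&addr, sizeof addr) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Send "ping" from the client socket to the server socket, then send it back
 * to whatever address recvfrom() reported. Returns 0 if the reply arrived. */
int round_trip(const char *server_path, const char *client_path)
{
    int rc = -1;
    int server = dgram_open(server_path);
    int client = dgram_open(client_path);

    if (server >= 0 && client >= 0) {
        struct sockaddr_un to, from;
        socklen_t fromlen = sizeof from;
        char req[] = "ping", buf[16], reply[16];

        memset(&to, 0, sizeof to);
        to.sun_family = AF_UNIX;
        strcpy(to.sun_path, server_path);
        memset(&from, 0, sizeof from);

        /* client -> server */
        ssize_t n = sendto(client, req, sizeof req, 0,
                           (struct sockaddr *)&to, sizeof to);
        if (n == (ssize_t)sizeof req)
            n = recvfrom(server, buf, sizeof buf, 0,
                         (struct sockaddr *)&from, &fromlen);
        /* server -> client, using the sender's address from recvfrom() */
        if (n == (ssize_t)sizeof req)
            n = sendto(server, buf, (size_t)n, 0,
                       (struct sockaddr *)&from, fromlen);
        if (n == (ssize_t)sizeof req)
            n = recv(client, reply, sizeof reply, 0);
        if (n == (ssize_t)sizeof req && memcmp(reply, req, sizeof req) == 0)
            rc = 0;
    }
    if (server >= 0) close(server);
    if (client >= 0) close(client);
    unlink(server_path);
    unlink(client_path);
    return rc;
}
```

The important detail is that the client binds to its own file name too: that is what gives the server a return address to reply to.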

To learn more about the code, read the code - it is pretty simple. To learn more about what Domain_Read and Domain_Write may return, read the sendto() and recvfrom() man pages, which are the OS routines used to implement Domain_Read and Domain_Write.

2.5 Message Format

One thing that is quite important is settling on a message format between the library and the OS. This message has to have enough structure in order for the OS to be able to receive and interpret the message, and of course pull out the necessary information. Thus, spend some time and figure out what information needs to flow between the library and the OS on all of the Pipe_XXX() calls (note that OS_Init() should not need to contact the OS at all). For simplicity, you might decide on a single struct which is used to transfer messages between library and OS, or you could decide on a different struct per message type.

For example, the following struct could be used to pack the necessary information to perform a Pipe_Create():



typedef struct __MsgCreate_t {
    int type;            // used to set the type of this message
    char name[MAX_STR];
    int numBuffers;
} MsgCreate_t;

Important: Remember, the OS just receives a set of bytes from the library. It has to be able to figure out what type of message it is in a simple and reliable way! Think about this before implementing something less than perfect.
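As one illustration of a "simple and reliable" scheme, the sketch below uses a single fixed-size struct for every request, with the type always in the first field. Everything here is hypothetical (the MsgType_t values, the Msg_t layout, and both helper functions), and the values of MAX_STR and BUFFER_SIZE are stand-ins for the ones in LibOS.h. It is meant as a starting point for your own design, not as the required format.

```c
#include <assert.h>
#include <string.h>

#define MAX_STR     64   /* stand-in; the real value is in LibOS.h */
#define BUFFER_SIZE 512  /* stand-in; the real value is in LibOS.h */

typedef enum { MSG_CREATE = 1, MSG_OPEN, MSG_READ, MSG_WRITE, MSG_CLOSE } MsgType_t;

/* One fixed-size struct for every request, so the receiver always reads
 * sizeof(Msg_t) bytes and then looks at the first field. */
typedef struct {
    int  type;               /* a MsgType_t value, always first */
    int  pipe;               /* descriptor for read/write/close */
    int  arg;                /* numBuffers for create, mode for open */
    char name[MAX_STR];      /* pipe name for create/open */
    char data[BUFFER_SIZE];  /* payload for write */
} Msg_t;

/* Pack a create request; returns 0, or -1 if the name does not fit. */
int msg_pack_create(Msg_t *m, const char *name, int numBuffers)
{
    assert(m != NULL && name != NULL);
    if (strlen(name) >= MAX_STR)
        return -1;
    memset(m, 0, sizeof *m);   /* no stray bytes go over the wire */
    m->type = MSG_CREATE;
    m->arg = numBuffers;
    strcpy(m->name, name);
    return 0;
}

/* Receiver side: accept only whole messages that carry a known type. */
int msg_is_valid(const void *bytes, size_t len)
{
    int type;

    if (len != sizeof(Msg_t))
        return 0;
    memcpy(&type, bytes, sizeof type);  /* type sits at offset 0 */
    return type >= MSG_CREATE && type <= MSG_CLOSE;
}
```

A single struct wastes some bytes on small requests, but it means the OS never has to guess how much to read; a struct per message type saves space at the cost of a two-step decode.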

2.6 Multi-threading the OS

One of the major challenges of this assignment is to use multiple threads inside of the OS process to handle requests. The structure of the OS process is thus as follows. When the OS starts up, it should create a pool of threads, the number of which is specified on the command line (with the poolSize argument). Threads can be created with the pthread_create routine. Read the manual page for details. Also, read the manual page of pthread_attr_init to find out more about what flags to pass to pthread_create. We may hand out some sample code on this, but don't wait for it!
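To give a feel for pthread_create, here is a small sketch that starts a pool and waits for it. The names worker(), pool_start(), pool_join(), and pool_started_count() are made up for this example. It passes NULL as the attribute argument, which requests the default thread attributes, and its worker just bumps a counter and exits so that the sketch terminates; a real worker would loop forever servicing requests.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

static pthread_mutex_t started_lock = PTHREAD_MUTEX_INITIALIZER;
static int started = 0;   /* how many workers have run, guarded by started_lock */

/* Hypothetical worker body: records that it ran, then exits. */
static void *worker(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&started_lock);
    started++;
    pthread_mutex_unlock(&started_lock);
    return NULL;
}

/* Start poolSize workers; returns a malloc'd array of thread ids, or NULL. */
pthread_t *pool_start(int poolSize)
{
    assert(poolSize > 0);
    pthread_t *tids = malloc((size_t)poolSize * sizeof *tids);
    if (tids == NULL)
        return NULL;
    for (int i = 0; i < poolSize; i++) {
        if (pthread_create(&tids[i], NULL, worker, NULL) != 0) {
            for (int j = 0; j < i; j++)     /* clean up the ones that started */
                pthread_join(tids[j], NULL);
            free(tids);
            return NULL;
        }
    }
    return tids;
}

/* Wait for every worker to finish, then release the id array. */
void pool_join(pthread_t *tids, int poolSize)
{
    for (int i = 0; i < poolSize; i++)
        pthread_join(tids[i], NULL);
    free(tids);
}

/* Number of workers that have run so far. */
int pool_started_count(void)
{
    pthread_mutex_lock(&started_lock);
    int n = started;
    pthread_mutex_unlock(&started_lock);
    return n;
}
```

In the OS itself the main thread would typically never join the workers, since both it and they run until the process is killed.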

The typical operation of the OS is as follows. The main thread (the one that starts up all the other threads) will start running inside of the main() routine. It should parse arguments, initialize data structures, and then create the thread pool. At that point, the main thread should enter a loop where it just waits for messages from processes using LibOS.

When a message is received, the OS should pass the message (or some description of the work that needs to be done to complete the request) to one of the threads. This should be done through some sort of shared data structure. You might recognize the producer-consumer relationship here, in that the main thread that reads messages from the network is a producer of work for the thread pool, which consists of a bunch of consumers.
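A sketch of such a shared structure follows: an unbounded linked-list queue protected by one mutex and one condition variable. The types Work_t and WorkQueue_t and the queue_*() functions are invented for illustration. Note that queue_put() copies the message to the heap, so the main thread is free to reuse its receive buffer right away; the worker that dequeues the item is expected to free() it.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical work item: a private heap copy of one received message. */
typedef struct Work {
    struct Work *next;
    size_t       len;
    char         bytes[];
} Work_t;

typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t  nonempty;
    Work_t         *head, *tail;
} WorkQueue_t;

void queue_init(WorkQueue_t *q)
{
    pthread_mutex_init(&q->lock, NULL);
    pthread_cond_init(&q->nonempty, NULL);
    q->head = q->tail = NULL;
}

/* Producer (main thread): copy the message and append it to the queue. */
int queue_put(WorkQueue_t *q, const void *msg, size_t len)
{
    assert(q != NULL && msg != NULL);
    Work_t *w = malloc(sizeof *w + len);
    if (w == NULL)
        return -1;
    w->next = NULL;
    w->len = len;
    memcpy(w->bytes, msg, len);

    pthread_mutex_lock(&q->lock);
    if (q->tail != NULL)
        q->tail->next = w;
    else
        q->head = w;
    q->tail = w;
    pthread_cond_signal(&q->nonempty);   /* wake one waiting worker */
    pthread_mutex_unlock(&q->lock);
    return 0;
}

/* Consumer (worker thread): block until work exists; caller frees the result. */
Work_t *queue_get(WorkQueue_t *q)
{
    pthread_mutex_lock(&q->lock);
    while (q->head == NULL)              /* re-check after every wakeup */
        pthread_cond_wait(&q->nonempty, &q->lock);
    Work_t *w = q->head;
    q->head = w->next;
    if (q->head == NULL)
        q->tail = NULL;
    pthread_mutex_unlock(&q->lock);
    return w;
}
```

A real work item would also need to carry the sender's address, so that the worker knows where to send the reply.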

One of the threads in the thread pool (and exactly one) will get the request, and try to execute the needed functionality. For example, if the message was a Pipe_Create() message, the thread would update whatever global data structures in the OS keep track of the pipes that are currently active. When done, the thread should go ahead and send a response to the LibOS, of course packaging up the response in a way that the library will be able to interpret.

Remember that multiple threads may be accessing those shared data structures at the same time. Thus, you will need to use synchronization primitives to correctly implement access to those structures. The synchronization primitives that are available to you are as follows:

pthread_mutex_lock

pthread_mutex_unlock

pthread_cond_signal

pthread_cond_wait

pthread_mutex_lock and pthread_mutex_unlock
These are just like the lock and unlock primitives we talked about in class. Locking a particular mutex will guarantee that any other thread calling mutex_lock will wait until the mutex is unlocked before returning. Using these thus allows you to build a critical section.

pthread_cond_signal and pthread_cond_wait
Unfortunately, locks are not quite good enough: you need to be able to wait for a condition to come true. For this, you will need to use pthread_cond_wait and pthread_cond_signal. These should be used in combination with a lock in order to implement the necessary waiting that a Pipe_Read() or Pipe_Write() may incur because the pipe is empty or full, respectively. Read the man pages for more details.
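Here is a sketch of how the lock and the two conditions might fit together for a single pipe's buffers. The type PipeBuf_t and the pipebuf_*() functions are hypothetical, BUFFER_SIZE is given a stand-in value, and the sketch deliberately leaves out everything to do with opening and closing the pipe (so it never returns 0 for a shut-down pipe).

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

#define BUFFER_SIZE 512  /* stand-in; the real value is in LibOS.h */

typedef struct {
    pthread_mutex_t lock;       /* one lock per pipe */
    pthread_cond_t  not_empty;  /* signalled after a write */
    pthread_cond_t  not_full;   /* signalled after a read */
    char *slots;                /* numBuffers * BUFFER_SIZE bytes */
    int   numBuffers, count, head, tail;
} PipeBuf_t;

int pipebuf_init(PipeBuf_t *p, int numBuffers)
{
    assert(p != NULL && numBuffers > 0);
    p->slots = malloc((size_t)numBuffers * BUFFER_SIZE);
    if (p->slots == NULL)
        return -1;
    p->numBuffers = numBuffers;
    p->count = p->head = p->tail = 0;
    pthread_mutex_init(&p->lock, NULL);
    pthread_cond_init(&p->not_empty, NULL);
    pthread_cond_init(&p->not_full, NULL);
    return 0;
}

/* Blocks while every buffer is full, then stores one message. */
void pipebuf_write(PipeBuf_t *p, const char *msg)
{
    pthread_mutex_lock(&p->lock);
    while (p->count == p->numBuffers)
        pthread_cond_wait(&p->not_full, &p->lock);
    memcpy(p->slots + (size_t)p->tail * BUFFER_SIZE, msg, BUFFER_SIZE);
    p->tail = (p->tail + 1) % p->numBuffers;
    p->count++;
    pthread_cond_signal(&p->not_empty);
    pthread_mutex_unlock(&p->lock);
}

/* Blocks while the pipe is empty, then removes the oldest message. */
void pipebuf_read(PipeBuf_t *p, char *out)
{
    pthread_mutex_lock(&p->lock);
    while (p->count == 0)
        pthread_cond_wait(&p->not_empty, &p->lock);
    memcpy(out, p->slots + (size_t)p->head * BUFFER_SIZE, BUFFER_SIZE);
    p->head = (p->head + 1) % p->numBuffers;
    p->count--;
    pthread_cond_signal(&p->not_full);
    pthread_mutex_unlock(&p->lock);
}
```

Two details are worth noticing: the wait sits inside a while loop, because a woken thread must re-check the condition before proceeding, and pthread_cond_wait releases the mutex while the thread sleeps, so the other side can get in and change the state.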

Important: The goal of implementing a multi-threaded OS is to provide for as much concurrency as possible. Thus, in your design and implementation, you need to think about how to provide as fine-grained locking as possible. Just putting one big honkin' lock around everything is not good and will be marked down.

A Note About Deadlock: Note that when a request to perform a Pipe_Read or Pipe_Write or even a Pipe_Open comes into the OS and gets handed off to a thread, it may block waiting for the state of the pipe to change (more specifically, if a thread executing a read request finds the pipe empty, it should block, and if a write finds a pipe full, it also should block, and if an open finds that the other process has not yet opened the pipe, it should block too). Thus, threads get used up by requests that are blocked like this.

Now consider a system that has 2 threads in the OS thread pool, and 2 readers and 2 writers (this of course means there are 2 pipes open). If both pipes are full and 2 writers come to perform a write in the OS, both writers will have to block waiting for a reader to remove a message. However, this will never happen because the two threads in the system are blocked - one was assigned to each writer, and there are no threads left to service any read requests. This system would be in a state of deadlock.

As Extra Credit: Your program is to accept an optional command line argument that indicates whether or not the above situation should be dealt with. In other words, if the -d option is enabled, your OS should never deadlock. However, if the -d option is NOT enabled, your program should deadlock in the above and similar situations. This part of the assignment is considered extra credit: that is, you can certainly still get a very strong A in the course if you do not do it. In general, we recommend that you implement the -d option (the "no deadlock" option) when everything else is done and working correctly.

If you decide to try for the extra credit, make sure to make a copy of your working program before you start modifying the code. More generally, make copies of every version where you get some new substantial functionality working. If you have lots of free time, use CVS to do this; otherwise, just make a subdirectory under your working directory for each copy (i.e., 01, 02, 03), and keep good notes on what version is in each directory (in a file called version.doc, for example).

2.7 Compilation Tips

Your LibOS needs to be built into a dynamically-linked shared library. In this subsection, we tell you how to do that. There are two basic steps: first, how to compile each .c (or .cc) file that comprises your library, and then how to link them all together to create a shared library. Once you have your shared library, you need to know how to compile a program (say main.c) so that it can link with your library, so we will describe that too.

You also need to learn how to compile a multi-threaded application. We'll tell you how to do that too.

All of these things should be done within your makefile, to make your life easy!

2.7.1 Compiling A Dynamically-Linked Library

Let's say you have implemented your library in a single file, called LibOS.c. This is how you should compile that within your makefile:



gcc -c LibOS.c -g -Wall -fpic

The -c flag tells the compiler to create an object file (in this case, LibOS.o), the -g flag is good to have on when debugging (so you can use the debugger gdb), the -Wall flag should always be used, and finally the -fpic flag tells the compiler to use something called "position-independent" code, which is good to use when building shared libraries. We'll learn more about what this means later in the course; of course, if you are curious, read the gcc info page for more details.

Note that any other files that are going to be linked into the LibOS (e.g., Domain.c) should be compiled in the same way.

Now that you have LibOS.o, you'll want to make a shared library out of it. The way you do that is with the following line:



gcc -o libOS.so LibOS.o -shared

It's that easy! Of course, you should be including Domain.o with your library, too, so you'll probably want a line more like this:



gcc -o libOS.so LibOS.o Domain.o -shared -lnsl -lsocket

When linking with Domain.o, it's a good idea to also add the -lnsl and -lsocket flags, which tell the compiler to link with the socket libraries.

2.7.2 Linking A Program With Your Library

Let's say you have a program, main.c, that calls OS_Init(), Pipe_Create(), and all of those other great functions that your OS provides. In order to compile main.c, you need to link it with your library. This is how you would do that, assuming all of your code is in the same directory:



gcc -o main main.c -L. -R. -lOS

The -lOS flag tells the compiler to look for a library called libOS.so (or libOS.a, but don't worry about that), and the -L. flag tells the compiler to look for the library in the "." directory ("." is a way to refer to the current directory in Unix). Finally, the -R. flag tells the compiler to include information in the executable that tells the program, when running, to also look in the "." directory to find the library. It should be as easy as that!

2.7.3 Compiling A Multi-Threaded Application

Finally, let's talk about compiling your multithreaded OS. This is fairly straight-forward, too. First, when compiling a .c or .cc file, do the following:



g++ -c os.cc -D_REENTRANT -g -Wall

The only new flag here is -D_REENTRANT, which defines the constant _REENTRANT for the file being compiled. This is equivalent to including the line "#define _REENTRANT" inside of your code. The reason you need to do this is that a number of libraries and such that you link with require this to be defined to work properly in multithreaded mode. Failing to do this will cause strange and hard to track down bugs to occur.

Finally, to link your OS all together, do something like this:



g++ -o os os.o Domain.o -lnsl -lsocket -lpthread

The -lnsl and -lsocket flags are included because Domain.o requires them (again), so the only new flag is the -lpthread flag, which links with the pthread library.

2.9 Summary of Useful Stuff

Domain sockets code is found in

Domain.c

Domain.h

The skeleton code for the library is found in

LibOS.cc

LibOS.h

LibOS.c

Compilation tips are found right above

3.0 Hints For A Successful Endeavor

One might consider breaking down the project into smaller parts, each of which is not too difficult. There are a number of obvious divisions:

Get interprocess communication to work with domain sockets. Make sure you can send a simple message to the OS, and then respond to whomever sent the message.
Decide on a message format. How are you going to pass arguments to the OS, and results back to the process? Get that settled upon, and then implement it. Make sure that all of the arguments are passed correctly to the OS, and that any results are passed back successfully too, all without actually implementing any of the pipe functionality.
Add multi-threading to the OS: learn how to start a thread, how to get the main OS thread to hand-off work to the workers, and so forth.
Implement the pipe functionality: what data structures are needed to track these things, what synchronization is needed to make sure that access to shared data structures is synchronized, and so forth.

There are some obvious ways to split this among two people. One person could work on the library and communication, while the other could work on multi-threading the OS and implementing the pipe functionality. However, I recommend that both have a good idea of what is going on in the other's code, because you both are responsible for a correct, working implementation.

4.0 Grading

No late assignments will be accepted. This project is due on Thursday, March 7 at 11:59 PM. This assignment will be graded based on correctness of implementation as well as robustness. To this end, we will not just be looking at the output of your program, but also the code. Points will also be deducted if all the proper files are not submitted. In this case, the minimum files will be LibOS.c, os.c, and whatever other files you have modified, a Makefile, and a README file describing your program, how to run it, and any other notes you consider relevant.

5.0 Handing in Your Project

The directory for handing in your program can be found at:

~cs537-1/handin/(username)/p2

where (username) is your login. Because you are working in groups of two, please go ahead and put copies of your code in both partners' directories.

Obviously, you should submit all source files that are needed to compile your program (just the ones you wrote, not standard headers that will be found on any machine). You should also include a README file and a Makefile for this project. Your README file should contain information about how to run your program, any known bugs it may have, and any other information that is important to your project. Your makefile should successfully compile the program and create the os executable and the LibOS library.

Also, be sure to comment your code well so that it can be easily browsed to see what is going on. Code that is excessively difficult to read may be penalized.