CS 537: Fall 2005

Programming Assignment 1: The Unix Shell


Due: Thursday, September 22 at 9 pm
You are to do this project BY YOURSELF
This project must be implemented in C (and not C++ or anything else)

Updates

9/16: Handling Control-D and EOF. Your shell should recognize the end of the input stream and terminate when it gets an "end of file" (EOF) notification. However, if the EOF comes at the end of a line of commands, you should first execute those commands before exiting.

9/16: Exiting when processes are running. You should never exit the shell while jobs are running. Hence, if some jobs are running and you receive an exit or EOF, you should wait until they complete and then exit.

9/16: You can absolutely receive multiple built-in commands on a single line, such as "style" or even "exit".

9/13: Due to popular demand, project due time changed to 9pm.

9/8: Send most project questions to: 537proj@cs. Answers to questions will be archived here. Note: you should read through the archives before sending a question -- perhaps it is already answered! If questions are particularly private in nature (should be the rare case), just send it to remzi or the TAs or (better yet) all of us.

9/8: Any updates to the project specification will go here.

Objectives

There are four objectives to this assignment:

Overview

In this assignment, you will implement a command line interpreter or shell. The shell should operate in this basic way: when you type in a command (in response to its prompt), the shell creates a child process that executes the command you entered and then prompts for more user input when it has finished.

The shells you implement will be similar to, but much simpler than, the one you run every day in Unix. You can find out which shell you are running by typing echo $SHELL at a prompt. You may then wish to look at the man pages for sh or the shell you are running (more likely tcsh or bash) to learn more about all of the functionality that can be present. For this project, you do not need to implement much functionality; specifically, you will not implement pipes, or re-direction of standard input and standard output. You will need to be able to handle running multiple commands simultaneously.

Your shell can be run in two ways: interactive and batch. In interactive mode, you will display a prompt (any string of your choosing) and the user of the shell will type in a command at the prompt. In batch mode, your shell is started by specifying a batch file on its command line; the batch file contains the list of commands (each on its own line) that should be executed. In batch mode, you should not display a prompt. In batch mode you should echo each line you read from the batch file back to the user before executing it; tis will help you when you debug your shells (and us when we test your programs). In both interactive and batch mode, your shell terminates when it sees the exit command on a line or reaches the end of the input stream (i.e., the end of the batch file or the user types 'Ctrl-D').

More than one job can be run on a single command line; a semi-colon ";" separates each separate job to run. For example, if the user types /bin/ls; /bin/ps the shell should run both programs. However, how those jobs are run depends on the style the shell is in. If the shell is in sequential style, the jobs should be run one at a time, in left-to-right order. Hence, in our previous example ( /bin/ls; /bin/ps ), first ls should run to completion, then ps. In contrast, in parallel style, the jobs should all be started at the same time. In both styles, the prompt should not be shown again until all jobs are complete (the wait() and waitpid() system calls may be useful here). Note that more than two jobs may be combined on a single command line, with semi-colons in-between.

To switch into sequential style, the user types style sequential. Similarly, to switch into parallel style, the user types style parallel. Shortcuts should be available (e.g., "style s" and "style p" should work, too). The shell should begin in sequential style.

To quit the shell, the user can type exit. This should just quit the shell and be done with it (the exit() system call will be useful here). Note that exit and style are all built-in shell commands. They are not to be executed like other programs the user types in.

You will also do one other thing in this project: turn in a digital picture of yourself. Put this in your handin directory (a jpg or a gif is fine), in the form Firstname.Lastname.gif (or whatever). We want to get to know you better, and what better way than to be able to associate a name with a face?

This project is not as hard as it may seem at first reading (or perhaps it doesn't seem that hard at all, which is good!); in fact, the code you write will be much smaller than this specification. Writing your shell in a simple manner is a matter of finding the relevant library routines and calling them properly. Your finished programs will probably be under 200 lines, including comments. If you find that you are writing a lot of code, it probably means that you are doing something wrong and should take a break from hacking and instead think about what you are trying to do.

Program Specifications

Your C program must be invoked exactly as follows:


shell [batchFile]


The command line arguments to your shell are to be interpreted as follows.
batchFile: an optional argument (often indicated by square brackets as above). If present, your shell will read each line of the batchFile for commands to be executed. If not present, your shell will run in interactive mode by printing a prompt to the user at stdout and reading the command from stdin.
For example, if you run your program as


shell /u/j/v/batchfile


then your program will read commands from /u/j/v/batchfile until it sees the exit command.

Defensive programming is an important concept in operating systems: an OS can't simply fail when it encounters an error; it must check all parameters before it trusts them. In general, there should be no circumstances in which your C program will core dump, hang indefinately, or prematurely terminate. Therefore, your program must respond to all input in a reasonable manner; by "reasonable", we mean print an understandable error message and either continue processing or exit, depending upon the situation.

You should consider the following situations as errors; in each case, your shell should print a message (to stderr) and exit gracefully:

For the following situation, you should print a message to the user (stderr) and continue processing:

Optionally, to make coding your shell easier, you may print an error message and continue processing in the following situation:

Your shell should also be able to handle the following scenarios, which are not errors:

All of these requirements will be tested extensively!

Hints

Your shell is basically a loop: it repeatedly prints a prompt (if in interactive mode), parses the input, executes the command specified on that line of input, and waits for the command to finish, if it is in the foreground. This is repeated until the user types "exit" or ends their input.

You should structure your shell such that it creates a new process for each new command. There are two advantages of creating a new process. First, it protects the main shell process from any errors that occur in the new command. Second, it allows easy concurrency; that is, multiple commands can be started and allowed to execute simultaneously (i.e., in parallel style).

To simplify things for you in this first assignment, we will suggest a few library routines you may want to use to make your coding easier. (Do not expect this detailed of advice for future assignments!) You are free to use these routines if you want or to disregard our suggestions.

To find information on these library routines, look at the manual pages (using the Unix command man ). You will also find man pages useful for seeing which header files you should include.

Parsing

For reading lines of input, you may want to look at fgets(). To open a file and get a handle with type FILE * , look into fopen(). Be sure to check the return code of these routines for errors! (If you see an error, the routine perror() is useful for displaying the problem.) You may find the strtok() routine useful for parsing the command line (i.e., for extracting the arguments within a command separated by whitespace or a tab or ...).

Executing Commands

Look into fork(), execvp(), and wait/waitpid().

The fork() system call creates a new process. After this point, two processes will be executing within your code. You will be able to differentiate the child from the parent by looking at the return value of fork; the child sees a 0, the parent sees the pid of the child.

You will note that there are a variety of commands in the exec family; for this project, you must use execvp(). Remember that if execvp() is successful, it will not return; if it does return, there was an error (e.g., the command does not exist). The most challenging part is getting the arguments correctly specified. The first argument specifies the program that should be executed, with the full path specified; this is straight-forward. The second argument, char *argv[] matches those that the program sees in its function prototype:

int main(int argc, char *argv[]);

Note that this argument is an array of strings, or an array of pointers to characters. For example, if you invoke a program with:

foo 205 535

then argv[0] = "foo", argv[1] = "205" and argv[2] = "535". Important: the list of arguments must be terminated with a NULL pointer; that is, argv[3] = NULL. We strongly recommend that you carefully check that you are constructing this array correctly!

The wait()/waitpid() system calls allow the parent process to wait for its children. Read the man pages for more details.

Miscellaneous Hints

Remember to get the basic functionality of your shell working before worrying about all of the error conditions and end cases. For example, first focus on interactive mode, and get a single command running in sequential style (probably first a command with no arguments, such as "ls"). Then, add in the functionality to work in batch mode (most of our test cases will use batch mode, so make sure this works!). Next, try working on sequential style for multiple jobs, then on parallel style. Finally, make sure that you are correctly handling all of the cases where there is miscellaneous white space around commands or missing commands.

We strongly recommend that you check the return codes of all system calls from the very beginning of your work. This will often catch errors in how you are invoking these new system calls.

Beat up your own code! You are the best (and in this case, the only) tester of this code. Throw lots of junk at it and make sure the shell behaves well. Good code comes through testing -- you must run all sorts of different tests to make sure things work as desired. Don't be gentle -- other users certainly won't be. Break it now so we don't have to break it later.

Keep versions of your code. More advanced programmers will use a source control system such as CVS. Minimally, when you get a piece of functionality working, make a copy of your .c file (perhaps a subdirectory with a version number, such as v1, v2, etc.). By keeping older, working versions around, you can comfortably work on adding new functionality, safe in the knowledge you can always go back to an older, working version if need be.

Grading

Hand in your source code. We have created a directory ~cs537-SECTION/handin/NAME, where SECTION is either 1 or 2 and NAME is your login name. For example, if you are in section 2 and your login is jimmyv, your handin directory is ~cs537-2/handin/jimmyv. Your handin directory has subdirectories: p1, p2, p3, etc. For this assignment, use directory p1.

Copy your picture into the directory too. If you don't turn in a picture, your project will not be graded!

Copy all of your .c source files into the appropriate subdirectory. Do not submit any .o files. After the deadline for this project, you will be prevented from making any changes in these directory. Remember: No late projects will be accepted!

To ensure that we compile your C correctly for the demo, you will need to create a simple makefile; this way our scripts can just run make to compile your code with the right libraries and flags. If you don't know how to write a makefile, you might want to look at the man pages for make, or better yet, read this little tutorial. Otherwise, check out this very simple sample makefile.

The majority of your grade for this assignment will depend upon how well your implementation works. We will run your program on a suite of about 20 test cases, some of which will exercise your programs ability to correctly execute commands and some of which will test your programs ability to catch error conditions. Be sure that you thoroughly exercise your program's capabilities on a wide range of test suites, so that you will not be unpleasantly surprised when we run our tests.

For testing your code, you will probably want to run commands that take awhile to complete. Try compiling and running this very simple C program; when multiple copies are run in parallel you should see the output from each process interleaved. See the code for more details.

Even though you will not be heavily graded on style for this assignment, you should still follow all the principles of software engineering you learned in CS 302, CS 367, and elsewhere, such as top-down design, good indentation, meaningful variable names, modularity, and helpful comments. Don't be sloppy! You should be following these principles for your own benefit.

Finally, while you can develop your code on any system that you want, make sure that your code runs correctly on a machine that runs the Linux operating system. Specifically, since libraries and environments sometimes vary in small and large ways across systems, you should verify your code on the linux machines in the 13XX labs (e.g., the royal or emperor clusters). These machines are where your projects will be tested.

Other Shell Fun

The shell you are building is very simplistic; it doesn't have a PATH variable, it doesn't support changing directories, there is no shell history of previous commands you have run, the user can't customize the prompt, etc. Feel free to play around with adding such functionality; it will give you some more insight into how things really work. However, please don't let your fun ruin your code -- keep a working copy of the shell before playing around.