Programming Assignment 1: The Unix Shell
Due: Thursday, September 20 at 9 pm
You are to do this project BY YOURSELF
This project must be implemented in C (and not C++ or anything else)
Updates
8/25: Any updates to the project specification will go here. You should read through the archives before sending a question -- perhaps it is already answered!
Objectives
There are five objectives to this assignment:
- To familiarize yourself with the Linux programming environment.
- To develop your defensive programming skills in C.
- To gain exposure to the necessary functionality in shells.
- To learn how processes are handled (i.e., starting and waiting for their termination).
- To turn in a picture of yourself.
Overview
In this assignment, you will implement a command line interpreter or shell. The shell should operate in this basic way: when you type in a command (in response to its prompt), the shell creates a child process that executes the command you entered and then prompts for more user input when it has finished.
The shells you implement will be similar to, but much simpler than,
the one you run every day in Unix. You can find out which shell you
are running by typing
echo $SHELL at a prompt. You may then wish to look at the man
pages for
sh or the shell you are running (more likely tcsh or bash) to
learn more about all of the functionality that can be present. For
this project, you do not need to implement much functionality; but you
will need to be able to handle running multiple commands
simultaneously.
Your shell can be run in two ways: interactive and
batch. In interactive mode, you will display a prompt (any
string of your choosing) and the user of the shell will type in a
command at the prompt. In batch mode, your shell is started by
specifying a batch file on its command line; the batch file contains
the list of commands that should be
executed. In batch mode, you should
not display a prompt. In batch mode you should echo each line
you read from the batch file back to the user before executing it;
this will help you when you debug your shells (and us when we test
your programs). In both interactive and batch mode, your shell
stops accepting new commands when it sees the quit command on a line or
reaches the end of the input stream (i.e., the end of the batch file
or the user types 'Ctrl-D'). The shell should then exit after
all running processes have terminated.
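The difference between the two modes can be sketched as a single read loop. This is only an illustration under assumptions: the function name read_loop, the BUF_SIZE constant, and the "prompt> " string are made up for this example, and parsing, execution, quit handling, and over-long lines are all left out.

```c
#include <stdio.h>

#define BUF_SIZE 513  /* room for 512 characters plus the terminating NUL */

/*
 * Reads lines from `in` until end of input and returns how many were read.
 * interactive != 0: print a prompt before each read.
 * interactive == 0: batch mode, echo each line instead of prompting.
 */
int read_loop(FILE *in, FILE *out, int interactive)
{
    char line[BUF_SIZE];
    int count = 0;

    for (;;) {
        if (interactive) {
            fputs("prompt> ", out);
            fflush(out);          /* the prompt has no newline, so flush it */
        }
        if (fgets(line, sizeof line, in) == NULL)
            break;                /* end of the batch file, or Ctrl-D */
        if (!interactive) {
            fputs(line, out);     /* echo the batch line before running it */
            fflush(out);
        }
        count++;
        /* ... parse and execute the commands on this line here ... */
    }
    return count;
}
```

A shell built this way might call read_loop(stdin, stdout, 1) in interactive mode and read_loop(batch, stdout, 0) for a batch file opened with fopen().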
Each line (of the batch file or typed at the prompt) may contain
multiple commands separated with the ; character. Each of the
commands separated by a
; should be run simultaneously, or concurrently.
prompt>
prompt> ls
prompt> /bin/ls
prompt> ls -l
prompt> ls -l ; cat file
prompt> ls -l ; cat file ; grep foo file2
For example, on the last line, the commands ls -l , cat file and grep foo file2 should all be running at the same time; as a result, you may see that their output is intermixed.
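One possible way to get this behavior is to start every command on the line first and only then wait for them. The sketch below assumes each command has already been parsed into a NULL-terminated argument array; the name run_all and the exit status 127 for a failed exec (a convention borrowed from common shells) are choices made for this example, not requirements of the assignment.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Starts every command in cmds[0..n-1] (each a NULL-terminated argv array)
 * without waiting in between, then waits for all of them.
 * Returns the number of children that were started.
 */
int run_all(char **cmds[], int n)
{
    int started = 0;

    fflush(stdout);  /* avoid duplicating buffered output in the children */
    for (int i = 0; i < n; i++) {
        pid_t pid = fork();
        if (pid < 0) {
            perror("fork");
            continue;
        }
        if (pid == 0) {
            execvp(cmds[i][0], cmds[i]);
            perror(cmds[i][0]);   /* only reached if execvp() failed */
            _exit(127);
        }
        started++;                /* parent: move on to the next command */
    }

    /* Every command is now running; collect them in whatever order they end. */
    for (int i = 0; i < started; i++) {
        if (wait(NULL) < 0) {
            perror("wait");
            break;
        }
    }
    return started;
}
```

Because the waiting happens only after the last fork(), the children run at the same time and their output may interleave.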
To exit the shell, the user can type quit. This should just
exit the shell and be done with it (the exit() system call will be
useful here). Note that quit is a built-in shell
command; it is not to be executed like other programs the user types
in. If the "quit" command is on the same line with other commands,
you should ensure that the other commands execute (and finish) before
you exit your shell.
These are all valid examples for quitting the shell.
prompt> quit
prompt> quit ; cat file
prompt> cat file ; quit
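One way to handle quit as a built-in is to pull it out of the parsed line before anything is executed. The helper below is a sketch under assumptions: the name extract_quit is made up, each command is assumed to be a NULL-terminated argument array, and any command whose first word is quit counts as the built-in.

```c
#include <string.h>

/*
 * Removes every "quit" command from cmds[0..*n-1], keeping the remaining
 * commands in their original order and updating *n.
 * Returns 1 if at least one quit was found, 0 otherwise.
 */
int extract_quit(char **cmds[], int *n)
{
    int seen = 0;
    int kept = 0;

    for (int i = 0; i < *n; i++) {
        if (cmds[i][0] != NULL && strcmp(cmds[i][0], "quit") == 0)
            seen = 1;                 /* built-in: never passed to exec */
        else
            cmds[kept++] = cmds[i];   /* ordinary command: keep it */
    }
    *n = kept;
    return seen;
}
```

The caller would then run the remaining commands, wait for all of them to finish, and call exit() only if the function returned 1.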
This project is not as hard as it may seem at first reading (or
perhaps it doesn't seem that hard at all, which is good!); in fact,
the code you write will be much smaller than this
specification. Writing your shell in a simple manner is a matter of
finding the relevant library routines and calling them properly. Your
finished programs will probably be under 200 lines, including
comments. If you find that you are writing a lot of code, it probably
means that you are doing something wrong and should take a break from
hacking and instead think about what you are trying to do.
Program Specifications
Your C program must be invoked exactly as follows:
shell [batchFile]
The command line arguments to your shell are to be interpreted as follows.
batchFile: an optional argument (often indicated by square brackets as above). If present, your shell will read each line of the batchFile for commands to be executed. If not present, your shell will run in interactive mode by printing a prompt to the user at stdout and reading the command from stdin.
For example, if you run your program as
shell /u/j/v/batchfile
then your program will read commands from /u/j/v/batchfile until it sees the quit command or reaches the end of the file.
Defensive programming is an important concept in operating
systems: an OS can't simply fail when it encounters an error; it must
check all parameters before it trusts them. In general, there should
be no circumstances in which your C program will core dump, hang
indefinitely, or prematurely terminate. Therefore, your program must
respond to all input in a reasonable manner; by "reasonable", we mean
print an understandable error message and either continue processing
or exit, depending upon the situation.
You should consider the following situations as errors; in each case,
your shell should print a message (to stderr) and exit gracefully:
- An incorrect number of command line arguments to your shell program.
- The batch file does not exist or cannot be opened.
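These two startup checks might look like the following. It is a sketch only: the function name open_input, the interactive flag, and the wording of the usage message are assumptions made for this example.

```c
#include <stdio.h>

/*
 * Chooses the input stream from the command line.
 * Returns stdin (interactive) or the opened batch file, and sets
 * *interactive accordingly. Returns NULL after printing a message to
 * stderr if there are too many arguments or the file cannot be opened.
 */
FILE *open_input(int argc, char *argv[], int *interactive)
{
    if (argc > 2) {
        fputs("usage: shell [batchFile]\n", stderr);
        return NULL;
    }
    if (argc < 2) {
        *interactive = 1;
        return stdin;
    }

    FILE *in = fopen(argv[1], "r");
    if (in == NULL) {
        perror(argv[1]);   /* e.g. "batchfile: No such file or directory" */
        return NULL;
    }
    *interactive = 0;
    return in;
}
```

A main() using this helper could exit with a non-zero status whenever it returns NULL.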
For the following situation, you should print a message to the user (stderr) and
continue processing:
- A command does not exist or cannot be executed.
Optionally, to make coding your shell easier, you may print an error message and continue processing in the following situation:
- A very long command line (for this project, over 512 characters including the '\n').
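If you take this option, one way to detect an over-long line with fgets() is to check whether the buffer filled up before a newline arrived, and then discard the rest of that line. The names read_line and MAX_LINE and the return codes are assumptions made for this sketch.

```c
#include <stdio.h>
#include <string.h>

#define MAX_LINE 512   /* longest accepted line, counting the '\n' */

/*
 * Reads one line into buf.
 * Returns 1 if a line was read, 0 at end of input, and -1 if the line
 * was longer than MAX_LINE characters (the remainder is discarded).
 */
int read_line(FILE *in, char buf[MAX_LINE + 1])
{
    if (fgets(buf, MAX_LINE + 1, in) == NULL)
        return 0;

    /* A newline, or a buffer that is not full, means the line fit. */
    if (strchr(buf, '\n') != NULL || strlen(buf) < MAX_LINE)
        return 1;

    /* The buffer is full with no newline: look at what follows. */
    int c = getc(in);
    if (c == EOF)
        return 1;          /* exactly MAX_LINE characters at end of input */
    while (c != '\n' && c != EOF)
        c = getc(in);      /* throw away the rest of the long line */
    return -1;
}
```

On a return value of -1 the shell can print its error message to stderr and go on to the next line.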
Your shell should also be able to handle the following scenarios, which are
not errors (i.e., your shell should not print an error message):
- An empty command line.
- Extra white spaces within a command line.
- The batch file ends without a quit command, or the user types 'Ctrl-D' in interactive mode.
prompt> ; cat file ; grep foo file2
prompt> cat file ; ; grep foo file2
prompt> cat file ; ls -l ;
prompt> cat file ;;;; ls -l
prompt> ;; ls -l
prompt> ;
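One way to cope with lines like these is to split in two passes: first cut the line at every ; and then cut each piece at whitespace, skipping pieces that turn out to contain no words. Because strtok() keeps hidden state and cannot be nested, the sketch below finishes the first pass before starting the second. The function names and the caller-supplied array sizes are assumptions made for this example.

```c
#include <string.h>

/*
 * Pass 1: cuts `line` at ';' (and the trailing newline) and stores up to
 * `max` pieces in cmds. Returns the number of pieces. The line is
 * modified in place.
 */
int split_commands(char *line, char *cmds[], int max)
{
    int n = 0;

    for (char *c = strtok(line, ";\n"); c != NULL && n < max;
         c = strtok(NULL, ";\n"))
        cmds[n++] = c;
    return n;
}

/*
 * Pass 2: cuts one piece at whitespace and fills argv, which is always
 * NULL-terminated. Returns the number of words; 0 means the piece was
 * empty and should simply be skipped.
 */
int split_args(char *cmd, char *argv[], int max)
{
    int argc = 0;

    for (char *a = strtok(cmd, " \t\r\n"); a != NULL && argc < max - 1;
         a = strtok(NULL, " \t\r\n"))
        argv[argc++] = a;
    argv[argc] = NULL;
    return argc;
}
```

With this approach a line such as "cat file ;;;; ls -l" yields two commands, and a line holding only ";" yields none, so no error message is needed.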
Hints
Your shell is basically a loop: it repeatedly prints a prompt (if in interactive mode), parses the input, executes the command specified on that line of input, and waits for the command to finish, if it is in the foreground. This is repeated until the user types "quit" or ends their input.
You should structure your shell such that it creates a new process for
each new command. There are two advantages of creating a new
process. First, it protects the main shell process from any errors
that occur in the new command. Second, it allows easy concurrency;
that is, multiple commands can be started and allowed to execute
simultaneously (i.e., in parallel).
To simplify things for you in this first assignment, we will suggest a
few library routines you may want to use to make your coding
easier. (Do not expect advice this detailed for future
assignments!) You are free to use these routines if you want or to
disregard our suggestions.
To find information on these library routines, look at the manual pages (using the Unix command man). You will also find man pages useful for seeing which header files you should include.
Parsing
For reading lines of input, you may want to look at fgets(). To open a file and get a handle with type FILE * , look into fopen(). Be sure to check the return code of these routines for errors! (If you see an error, the routine perror() is useful for displaying the problem.) You may find the strtok() routine useful for parsing the command line (i.e., for extracting the arguments within a command separated by whitespace or a tab or ...).
Executing Commands
Look into fork(), execvp(), and wait/waitpid().
The fork() system call creates a new process. After this point, two processes will be executing within your code. You will be able to differentiate the child from the parent by looking at the return value of fork(); the child sees a 0, the parent sees the pid of the child.
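The three possible outcomes of fork() can be seen in a small demonstration like the one below. The function name fork_demo and the messages are made up for this sketch; the child calls _exit() so that it never runs the parent's cleanup code.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Forks once and reports what each process sees. Returns 0 on success. */
int fork_demo(void)
{
    fflush(stdout);            /* do not let the child inherit pending output */

    pid_t pid = fork();
    if (pid < 0) {             /* failure: no child was created */
        perror("fork");
        return -1;
    }
    if (pid == 0) {            /* child: fork() returned 0 */
        printf("child: fork() returned 0, my pid is %d\n", (int)getpid());
        fflush(stdout);
        _exit(0);
    }

    /* parent: fork() returned the child's pid */
    printf("parent: fork() returned %d, the child's pid\n", (int)pid);
    if (waitpid(pid, NULL, 0) < 0) {
        perror("waitpid");
        return -1;
    }
    return 0;
}
```

The order of the two printed lines is not guaranteed, since both processes run independently after the fork.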
You will note that there are a variety of commands in the exec family; for this project, you must use execvp().
Remember that if execvp() is successful, it will not return;
if it does return, there was an error (e.g., the command does not
exist). The most challenging part is getting the arguments correctly
specified. The first argument names the program that should be
executed; execvp() searches the directories in PATH, so either a bare
name such as ls or a full path such as /bin/ls works. The
second argument, char *argv[], matches what the program
sees in its function prototype:
int main(int argc, char *argv[]);
Note that this argument is an array of strings, or an array of
pointers to characters. For example, if you invoke a program with:
foo 205 535
then argv[0] = "foo", argv[1] = "205" and argv[2] = "535".
Important:
the list of arguments must be terminated with a NULL pointer; that is,
argv[3] = NULL. We strongly recommend that you carefully check that
you are constructing this array correctly!
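For the foo 205 535 example, the array and the call could be written as below. This is a sketch, not a required design: the names example_argv and exec_command are made up, and returning 127 after a failed execvp() is a convention borrowed from common shells.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* argv for the command line "foo 205 535"; the NULL at index 3 is required. */
char *example_argv[] = { "foo", "205", "535", NULL };

/*
 * Replaces the current process with the program named by argv[0].
 * Intended to be called in the child after fork(). If execvp() returns
 * at all, it failed: report the reason and hand back an exit status.
 */
int exec_command(char *const argv[])
{
    execvp(argv[0], argv);

    /* Only reached on failure, e.g. the command does not exist. */
    fprintf(stderr, "%s: %s\n", argv[0], strerror(errno));
    return 127;
}
```

In the child, a call such as _exit(exec_command(argv)) makes sure a failed command ends the child instead of leaving a second copy of the shell running.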
The wait()/waitpid() system calls allow the parent process to wait for its children. Read the man pages for more details.
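As a rough illustration, the parent can wait for specific children with waitpid() and inspect how each one ended. The names wait_for_children and spawn_exit are made up for this sketch, and spawn_exit exists only to create children for the demonstration.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Demo helper (hypothetical): forks a child that exits at once with `code`. */
pid_t spawn_exit(int code)
{
    pid_t pid = fork();
    if (pid == 0)
        _exit(code);
    return pid;    /* child's pid in the parent, or -1 if fork() failed */
}

/*
 * Waits for each child in pids[0..n-1]. Returns how many of them did not
 * exit normally with status 0.
 */
int wait_for_children(const pid_t pids[], int n)
{
    int failures = 0;

    for (int i = 0; i < n; i++) {
        int status;
        if (waitpid(pids[i], &status, 0) < 0) {
            perror("waitpid");
            failures++;
            continue;
        }
        /* WIFEXITED: normal exit; WEXITSTATUS: the value passed to exit(). */
        if (!WIFEXITED(status) || WEXITSTATUS(status) != 0)
            failures++;
    }
    return failures;
}
```

For example, after spawning one child with spawn_exit(0) and another with spawn_exit(3), wait_for_children() on both pids returns 1.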
Miscellaneous Hints
Remember to get the basic functionality
of your shell working before worrying about all of the error
conditions and end cases. For example, first focus on interactive
mode, and get a single command running (probably
first a command with no arguments, such as "ls"). Then, add in the
functionality to work in batch mode
(most of our test cases will use batch mode, so make sure this
works!).
Next, try working on multiple jobs separated with the ;
character. Finally, make sure that you are correctly handling all of
the cases where there is miscellaneous white space around commands or
missing commands.
We strongly recommend that you check the return codes of all system
calls from the very beginning of your work. This will often catch
errors in how you are invoking these new system calls.
Beat up your own code!
You are the best (and in this case, the only) tester of this
code. Throw lots of junk at it and make sure the shell behaves
well. Good code comes through testing -- you must run all sorts of
different tests to make sure things work as desired. Don't be gentle
-- other users certainly won't be. Break it now so we don't have to
break it later.
Keep versions of your code. More advanced programmers will use a
source control system such as CVS.
Minimally, when you get a piece of functionality working, make a copy
of your .c file (perhaps a subdirectory with a version number, such as
v1, v2, etc.). By keeping older, working versions around, you can
comfortably work on adding new functionality, safe in the knowledge
you can always go back to an older, working version if need be.
For testing your code, you will probably want to run commands that
take a while to complete. Try compiling and running this very simple C
program; when multiple copies are run in
parallel you should see the output from each process interleaved. See
the code for more details.
Grading
For this project, you need to hand in four distinct items:
- Your source code (no object files or executables, please!)
- A Makefile for compiling your source code
- A picture of yourself
- A README file with some basic documentation about your code
Hand in your source code. We have created a directory
~cs537-SECTION/handin/NAME, where SECTION is either 1 or 2 and NAME is
your login name. For example, if you are in section 2 and your login
is jimmyv, your handin directory is ~cs537-2/handin/jimmyv. Your
handin directory has subdirectories: p1, p2, p3, etc. For this
assignment, use directory p1.
Copy all of your .c source files into the appropriate subdirectory. Do
not submit any .o files. After the deadline for this project,
you will be prevented from making any changes in this directory.
Remember: No late projects will be accepted!
To ensure that we compile your C correctly for the demo, you will need to create a simple makefile; this way our scripts can just run make to compile your code with the right libraries and flags. If you don't know how to write a makefile, you might want to look at the man pages for make, or better yet, read this little tutorial. Otherwise, check out this very simple sample makefile.
You will also need to turn in a digital picture of yourself.
Put this in your handin directory (a jpg or a gif is fine), in the
form Firstname.Lastname.gif (or whatever). We want to get to know you
better, and what better way than to be able to associate a name with a
face?
If you don't turn in a picture, your project will not be graded!
Finally, we would like to see a file called README describing your code. This file should contain the following four sections:
- Your name and login information
- Design overview: A few paragraphs describing the overall structure of your code and any important structures.
- Complete specification: Describe how you handled any ambiguities in the specification. For example, for this project, explain how your shell will handle lines that have no commands between semi-colons.
- Known bugs or problems: A list of any features that you did not implement or that you know are not working correctly
Even though you will not be heavily graded on style for this
assignment, you should still follow all the principles of software
engineering you learned in CS 302, CS 367, and elsewhere, such as
top-down design, good indentation, meaningful variable names,
modularity, and helpful comments.
Don't be sloppy!
You should be following these principles for your own benefit.
Finally, while you can develop your code on any system that you want,
make sure that your code runs correctly on a machine that runs the
Linux operating system. Specifically, since libraries and environments
sometimes vary in small and large ways across systems, you should
verify your code on the Linux machines in the 13XX labs (e.g., the
royal or emperor clusters). These machines are where your projects
will be tested.