There are four objectives to this assignment:
In this assignment, you will implement a command line interpreter or shell. The shell should operate in this basic way: when you type in a command (in response to its prompt), the shell creates a child process that executes the command you entered and then prompts for more user input when it has finished.
The shell you implement will be similar to, but much simpler than, the one you run every day in Unix. You can find out which shell you are running by typing echo $SHELL at a prompt. You may then wish to look at the man pages for sh or the shell you are running to learn more about all of the functionality that can be present. For this project, you do not need to implement much functionality; specifically, you will not implement pipes or redirection of standard input and standard output. You will need to be able to handle running multiple commands simultaneously.
Your shell can be run in two modes: interactive and batch. In interactive mode, you will display a prompt (any string of your choosing) and the user of the shell will type in a command at the prompt. In batch mode, your shell is started by specifying a batch file on its command line; the batch file contains the list of commands (each on its own line) that should be executed. In batch mode, you should not display a prompt. In batch mode, you should echo each line you read from the batch file back to the user before executing it; this will help you when you debug your shell (and us when we test your programs). In both interactive and batch mode, your shell terminates when it sees the exit command on a line or reaches the end of the input stream (i.e., the end of the batch file or the user types 'Ctrl-D').
Jobs may be executed in either the foreground or the background. When a job is run in the foreground, your shell waits until that job completes before it proceeds and displays the next prompt. When a job is run in the background (as denoted with the '&' character as the last non-whitespace character on the line), your shell starts the job, but then immediately returns and displays the next prompt with the background job still running.
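One way to detect a background request is to scan backward from the end of the line, skip any whitespace, and check whether the last remaining character is '&'. The helper below is only a sketch of that idea; the name strip_background is hypothetical and not part of the assignment.

```c
#include <ctype.h>
#include <string.h>

/* Hypothetical helper: returns 1 if the last non-whitespace character of
 * line is '&' and removes it (plus surrounding whitespace) from the line;
 * returns 0 otherwise. Trailing whitespace is trimmed in both cases. */
int strip_background(char *line)
{
    size_t len = strlen(line);

    /* Trim trailing whitespace, including the newline left by fgets(). */
    while (len > 0 && isspace((unsigned char)line[len - 1]))
        line[--len] = '\0';

    if (len > 0 && line[len - 1] == '&') {
        line[--len] = '\0';
        /* Trim whitespace that preceded the '&'. */
        while (len > 0 && isspace((unsigned char)line[len - 1]))
            line[--len] = '\0';
        return 1;
    }
    return 0;
}
```

After the call, the line holds only the command and its arguments, which is also the form the 'j' command needs to print.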
Each job that is executed by your shell should be given its own unique job id (jid). Your shell must assign each new command (whether it completes successfully or not) the next integer, starting with jid 1. Empty lines and built-in shell commands (such as 'j' and 'w' described below) should not advance the jid count.
For example, given this sequence of input:
prompt> /bin/ls
prompt>
prompt> /bin/ls -l
prompt> output &
prompt> nonexistentjob
prompt> output -o 10 &
prompt> output -o 20 &
The jobs are given jids as follows:
Command          | jid
/bin/ls          | 1
/bin/ls -l       | 2
output &         | 3
nonexistentjob   | 4
output -o 10 &   | 5
output -o 20 &   | 6
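The jid rule can be sketched as a small function that looks at the first word of each line. The function name assign_jid and the 512-byte buffer are assumptions made for this illustration only.

```c
#include <string.h>

/* Hypothetical helper: returns the jid assigned to line, or 0 if the line
 * is empty or a built-in (exit, j, w) and therefore gets no jid.
 * *next_jid holds the jid the next job will receive (start it at 1). */
int assign_jid(const char *line, int *next_jid)
{
    char copy[512];                    /* assumed maximum line length */
    char *cmd;

    strncpy(copy, line, sizeof(copy) - 1);
    copy[sizeof(copy) - 1] = '\0';

    cmd = strtok(copy, " \t\n");       /* first word of the line */
    if (cmd == NULL)
        return 0;                      /* empty or all-whitespace line */
    if (strcmp(cmd, "exit") == 0 || strcmp(cmd, "j") == 0 ||
        strcmp(cmd, "w") == 0)
        return 0;                      /* built-ins do not advance the count */

    return (*next_jid)++;              /* even commands that later fail get a jid */
}
```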
Users are able to perform very limited job control with your shell. First, users are able to find out which of their jobs are still running by using the shell built-in command 'j'. When your shell sees the 'j' command, it should print to standard output the jid of each of the currently running jobs (i.e., the background jobs that have not yet finished) followed by a colon (:) and the job name and its arguments (without the '&'). For your shell to find out which jobs are still running, you may find the Unix system call waitpid useful; more details are given below. Be careful that your shell returns the current information about which jobs are really running and doesn't simply report the background jobs that the user hasn't explicitly waited for.
Given the previous sequence of input commands and assuming that all of the background jobs are still running, when the user types j, your shell should print:
3 : output
5 : output -o 10
6 : output -o 20
Note that, for us to test your code correctly, your output must match this format exactly!
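Because the format is checked exactly, it may help to keep the formatting in one place. The sketch below reproduces the sample output (jid, space, colon, space, command); the helper name format_job is hypothetical.

```c
#include <stdio.h>

/* Hypothetical helper: formats one line of 'j' output as "<jid> : <command>",
 * matching the sample output. cmd is the job's command line with the
 * trailing '&' already removed. Returns the result of snprintf(). */
int format_job(char *buf, size_t size, int jid, const char *cmd)
{
    return snprintf(buf, size, "%d : %s", jid, cmd);
}
```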
Second, users are able to tell the shell to wait for a particular
job to terminate by using the built-in command 'w'. When
your shell sees the 'w' command along with the jid of a job,
it waits for the specified job to complete before accepting more
input. When the job completes, your shell should print the message
"Job JID terminated", where JID is the jid of the job. Continuing our example from above, if the user types the command
'w 3', then your shell waits for jid 3 to terminate and then prints:
Job 3 terminated
If the 'w' command is used with the jid of a job that has
already completed, then your shell should immediately return and print
the same message as above. Note that this case can occur even when
the user was just informed that a given job is executing, if the job
terminates before the user enters the 'w' command. In other
words, there is a race condition between these two events.
Continuing from our example, if the user types the command
'w 1', then your shell returns immediately and prints:
Job 1 terminated
Finally, if the 'w' command is used with an invalid jid,
then your shell should immediately return and print the message,
"Invalid jid JID", where JID is the jid that was given. In our example, if the user types the command 'w 20', then
your shell returns immediately and prints:
Invalid jid 20
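The three cases of 'w' (job still running, job already finished, invalid jid) can be sketched as follows. The struct bg_job record and the function do_wait are hypothetical names, and the sketch assumes that any jid between 1 and the most recently assigned jid is valid.

```c
#include <errno.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>

/* Hypothetical record for a background job that has not been reaped yet. */
struct bg_job {
    int   jid;
    pid_t pid;      /* set to -1 once the job has been reaped */
};

/* Sketch of the 'w' built-in. next_jid is the jid the next job would get,
 * so valid jids are 1 .. next_jid-1. Returns 0 on success and -1 for an
 * invalid jid. */
int do_wait(struct bg_job *jobs, int njobs, int next_jid, int jid, FILE *out)
{
    int i;

    if (jid < 1 || jid >= next_jid) {
        fprintf(out, "Invalid jid %d\n", jid);
        return -1;
    }
    for (i = 0; i < njobs; i++) {
        if (jobs[i].jid == jid && jobs[i].pid > 0) {
            /* Block until this job exits; retry if a signal interrupts. */
            while (waitpid(jobs[i].pid, NULL, 0) < 0 && errno == EINTR)
                ;
            jobs[i].pid = -1;
            break;
        }
    }
    /* A valid jid that is not in the table has already completed. */
    fprintf(out, "Job %d terminated\n", jid);
    return 0;
}
```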
Note that exit, j, and w are all
built-in shell commands. They are not to be executed and cannot be
placed in the background (you should just ignore the '&' if it is
specified). These commands are not given jids.
This project is not as hard as it may seem at first reading; in fact,
the code you write will be much, much smaller than this specification.
Writing your shell in a simple manner is a matter of finding the
relevant library routines and calling them properly. Your finished
programs will probably be under 200 lines, including comments. If you
find that you are writing a lot of code, it probably means that you
are doing something wrong and should take a break from hacking and
instead think about what you are trying to do.
You will probably want to check the log of questions (and answers)
posted to cs537-2help frequently; you should get in the habit of
checking this page daily, as new hints and clarifications may be added
at any time.
Defensive programming is an important concept in operating systems:
an OS can't simply fail when it encounters an error; it must check all
parameters before it trusts them. In general, there should be no
circumstances in which your C program will core dump, hang
indefinitely, or prematurely terminate. Therefore, your program must
respond to all input in a reasonable manner; by "reasonable", we mean
print an understandable error message and either continue processing
or exit, depending upon the situation.
You should consider the following situations as errors; in each
case, your shell should print a message (to stderr) and exit gracefully:
For the following situation, you should print a message to the user
(stderr) and continue processing:
For simplicity, you can limit the number of background jobs that are
currently running to 32. However, you must allow an unlimited number
of background jobs to be started over the lifetime of your shell.
You should structure your shell such that it creates a new process for
each new command. There are two advantages of creating a new process.
First, it protects the main shell process from any errors that occur
in the new command. Second, it allows easy concurrency; that is,
multiple commands can be started and allowed to execute
simultaneously.
For each running job, you will want to track some information in a
data structure, similar to a process control block (PCB). It is up to
you to determine what information you need to keep in your PCB and the
list structure you want to use (e.g., an array or a linked list).
This information will allow you to wait for the appropriate job to
complete, as requested by the user.
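As one possible layout, a fixed array of 32 slots is enough for the running jobs, because a slot can be reused once its job has finished. Everything in this sketch (field names, MAX_CMD, the helper functions) is an assumption, not a required design.

```c
#include <string.h>
#include <sys/types.h>

#define MAX_JOBS 32          /* limit on simultaneously running background jobs */
#define MAX_CMD  512         /* assumed maximum command length */

struct job {
    int   in_use;            /* 1 if this slot holds a running job */
    int   jid;               /* shell-assigned job id */
    pid_t pid;               /* pid returned by fork() */
    char  cmd[MAX_CMD];      /* command line without the trailing '&' */
};

static struct job jobs[MAX_JOBS];

/* Stores a new background job; returns its slot, or -1 if the table is full. */
int job_add(int jid, pid_t pid, const char *cmd)
{
    int i;

    for (i = 0; i < MAX_JOBS; i++) {
        if (!jobs[i].in_use) {
            jobs[i].in_use = 1;
            jobs[i].jid = jid;
            jobs[i].pid = pid;
            strncpy(jobs[i].cmd, cmd, MAX_CMD - 1);
            jobs[i].cmd[MAX_CMD - 1] = '\0';
            return i;
        }
    }
    return -1;
}

/* Maps a pid (for example, one returned by waitpid) back to its job. */
struct job *job_by_pid(pid_t pid)
{
    int i;

    for (i = 0; i < MAX_JOBS; i++)
        if (jobs[i].in_use && jobs[i].pid == pid)
            return &jobs[i];
    return NULL;
}

/* Frees a slot so that a later job can reuse it. */
void job_remove(struct job *j)
{
    j->in_use = 0;
}
```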
To simplify things for you in this first assignment, we will suggest a
few library routines you may want to use to make your coding easier.
(Do not expect advice this detailed for future assignments!) You
are free to use these routines if you want or to disregard our
suggestions.
To find information on these library routines, look at the manual
pages (using the Unix command man). You will also find man
pages useful for seeing which header files you should include.
The fork system call creates a new process. After this point,
two processes will be executing within your code. You will be able to
differentiate the child from the parent by looking at the return value
of fork; the child sees a 0, the parent sees the pid of
the child. Note that you will need to map between your jid and
this pid.
You will note that there are a variety of commands in the exec
family; for this project, you must use execv. Remember that if
execv is successful, it will not return; if it does return,
there was an error (e.g., the command does not exist). The most
challenging part is getting the arguments correctly specified. The
first argument specifies the program that should be executed, with the
full path specified; this is straightforward. The second argument,
char *argv[], matches the one that the program sees in its
function prototype:
int main(int argc, char *argv[]);
Note that this argument is an array of strings, or an array of
pointers to characters. For example, if you invoke a program with:
/bin/foo 205 535
then argv[0] = "/bin/foo", argv[1] = "205" and argv[2] = "535". Note
the list of arguments must be terminated with a NULL pointer; that is,
argv[3] = NULL. We strongly recommend that you carefully check that
you are constructing this array correctly!
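A foreground command could be run roughly as sketched below: fork, call execv in the child, and wait in the parent. The function name run_foreground is hypothetical, and the exit code 127 for a failed execv is a convention borrowed from sh rather than something this assignment specifies.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Runs argv[0] (a full path) in a child process and waits for it.
 * argv must end with a NULL pointer. Returns the child's exit status,
 * or -1 if fork or waitpid fails or the child did not exit normally. */
int run_foreground(char *argv[])
{
    int status;
    pid_t pid = fork();

    if (pid < 0) {                 /* fork failed: no child was created */
        perror("fork");
        return -1;
    }
    if (pid == 0) {                /* child: fork() returned 0 */
        execv(argv[0], argv);
        perror(argv[0]);           /* only reached if execv failed */
        _exit(127);
    }
    /* parent: pid is the child's process id */
    if (waitpid(pid, &status, 0) < 0) {
        perror("waitpid");
        return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

The child calls _exit rather than returning, so that a failed execv never leaves a second copy of the shell reading input.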
The waitpid system call allows the parent process to wait for
one of its children. Note that it returns the pid of the completed child;
again, you will need to map between your jid and this
pid. You will want to investigate the different options that
can be passed to waitpid so that your shell can query the OS
about which jobs are still running without having to wait for a job to
terminate.
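The option in question is WNOHANG, which makes waitpid return immediately instead of blocking. A minimal sketch of a status query, using the hypothetical helper name child_running:

```c
#include <sys/types.h>
#include <sys/wait.h>

/* Returns 1 if the child pid is still running, 0 if it has terminated
 * (and has now been reaped), or -1 on error (e.g., pid is not our child). */
int child_running(pid_t pid)
{
    int status;
    pid_t r = waitpid(pid, &status, WNOHANG);

    if (r == 0)
        return 1;      /* child exists but has not terminated yet */
    if (r == pid)
        return 0;      /* child has terminated; its status was collected */
    return -1;         /* waitpid failed */
}
```

Once this returns 0 the child has been reaped, so a later waitpid on the same pid will fail; your job table should record that the job is finished.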
We strongly recommend that you check the return codes of all
system calls from the very beginning of your work. This will often
catch errors in how you are invoking these new system calls.
Hand in your source code. We have created a directory
~cs537-SECTION/handin/NAME, where SECTION is either 1 or 2 and NAME is
your login name. For example, if you are in section 2 and your login
is lab, your handin directory is ~cs537-2/handin/lab. Your handin
directory has five subdirectories: p1, p2, p3, p4, and p5. For this
assignment, use directory p1.
Copy all of your .c source files into the appropriate
subdirectory. Do not submit any .o files. After the deadline
for this project, you will be prevented from making any changes in
this directory. Remember: No late projects will be accepted!
To ensure that we compile your C correctly for the
demo, you will need to create a simple makefile; this way our
scripts can just run make to compile your code with the right
libraries and flags. If you don't know how to write a makefile, you
might want to look at the man pages for make. Otherwise, check
out this very simple sample makefile.
The majority of your grade for this assignment will depend upon how
well your implementation works. We will run your program on a suite
of about 20 test cases, some of which will exercise your program's
ability to correctly execute commands and some of which will test your
program's ability to catch error conditions. Be sure that you
thoroughly exercise your program's capabilities on a wide range of
test suites, so that you will not be unpleasantly surprised when we
run our tests.
For testing your code, you will probably want to run commands that
take a while to complete. Try compiling and running this very simple C
program; when multiple copies are run in the
background you should see the output from each process interleaved.
See the code for more details.
Even though you will not be heavily graded on style for this assignment,
you should still follow all the principles of software engineering you
learned in CS 302 and CS 367, such as top-down design, good indentation,
meaningful variable names, modularity, and helpful comments. After all,
you should be following these principles for your own benefit.
Finally, while you can develop your code on any system that you want,
make sure that your code runs correctly on a machine that runs the
Linux operating system. Specifically, since libraries and
environments sometimes vary in small and large ways across systems,
you should verify your code on the royal cluster (which is where we
will test your code).
The first script is t1.exp. Please download it
and use 'chmod u+x t1.exp' to make it executable. Make sure you run the script
in a directory where it is able to find your 'shell' command. The basic idea of
the script is to run a command with your shell interactively (so you should
have interactive mode supported before you try this script) and compare its result
with what we get by running the command directly without using your shell.
The second script is t2.sh. It requires two other files:
jobs2.txt and jobs2.sh.
So you need to download all three files into the same directory. Then use 'chmod'
to make t2.sh and jobs2.sh executable. The idea of this script is to run the commands
in the batch file jobs2.txt using your shell and then check your results against what is
generated by jobs2.sh, which runs the commands without using your shell. Notice that
in order to pass this test, you have to echo each command in the batch file exactly as it is
(nothing more, nothing less) before executing it. If you get a failure message, you can
run 'shell jobs2.txt' and 'jobs2.sh' directly in your terminal to compare where the
results are different.
Program Specifications
Your C program must be invoked exactly as follows:
shell [batchFile]
The command line arguments to your shell are to be interpreted as
follows.
For example, if you run your program as
shell /p/course/cs537-1/file1.txt
then your program will read commands from
/p/course/cs537-1/file1.txt until it sees the exit command.
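Choosing between interactive and batch mode can be sketched as below. The function name open_input is hypothetical, and treating more than one argument as a usage error is an assumption of this sketch.

```c
#include <stdio.h>

/* Selects the input stream for "shell [batchFile]".
 * Returns stdin for interactive mode, the opened batch file for batch
 * mode, or NULL on error. *interactive is set to 1 or 0 on success. */
FILE *open_input(int argc, char *argv[], int *interactive)
{
    FILE *in;

    if (argc <= 1) {                   /* no batch file: read from the user */
        *interactive = 1;
        return stdin;
    }
    if (argc > 2) {                    /* assumed to be a usage error */
        fprintf(stderr, "usage: %s [batchFile]\n", argv[0]);
        return NULL;
    }
    in = fopen(argv[1], "r");
    if (in == NULL) {
        perror(argv[1]);               /* e.g., file missing or unreadable */
        return NULL;
    }
    *interactive = 0;
    return in;
}
```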
Optionally, to make coding your shell easier, you may print an error
message and continue processing in the following situation:
Your shell should also be able to handle the following scenarios,
which are not errors:
All of these requirements will be tested extensively!
C Hints
Your shell is basically a loop: it repeatedly prints a prompt (if in
interactive mode), parses the input, executes the command specified on
that line of input, and waits for the command to finish, if it is in
the foreground. This is repeated until the user types "exit" or ends
their input.
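The loop could be organized roughly as follows. This is only a sketch: shell_loop, the handle callback, and MAX_LINE are hypothetical names, and lines longer than MAX_LINE are not treated specially here.

```c
#include <stdio.h>
#include <string.h>

#define MAX_LINE 512   /* assumed maximum line length */

/* Reads commands from in until "exit" or end of input. handle is a
 * hypothetical callback that parses and runs one line (it may be NULL).
 * Returns the number of lines read before the loop ended. */
int shell_loop(FILE *in, int interactive, void (*handle)(char *line))
{
    char line[MAX_LINE];
    char copy[MAX_LINE];
    char *first;
    int count = 0;

    for (;;) {
        if (interactive) {
            fputs("prompt> ", stdout);
            fflush(stdout);
        }
        if (fgets(line, sizeof(line), in) == NULL)
            break;                       /* end of file or Ctrl-D */
        if (!interactive)
            fputs(line, stdout);         /* batch mode echoes each line */

        strcpy(copy, line);              /* strtok modifies its argument */
        first = strtok(copy, " \t\n");
        if (first != NULL && strcmp(first, "exit") == 0)
            break;

        if (handle != NULL)
            handle(line);                /* parse and run the command */
        count++;
    }
    return count;
}
```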
Parsing
For reading lines of input, you may want to look at fgets().
To open a file and get a handle with type FILE *, look into
fopen(). Be sure to check the return code of these routines
for errors! (If you see an error, the routine perror() is
useful for displaying the problem.) You may find the
strtok() routine useful for parsing the command line (i.e., for
extracting the arguments within a command separated by ' ').
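For instance, strtok can split a line in place and build the NULL-terminated array that execv expects. The helper name split_args is hypothetical; tabs and newlines are treated as separators in addition to spaces, which is an assumption of this sketch.

```c
#include <string.h>

/* Splits line in place on spaces, tabs, and newlines, filling args with
 * pointers into line and terminating the array with NULL.
 * max_args is the capacity of args and must be at least 1.
 * Returns the number of arguments, or -1 if there are too many. */
int split_args(char *line, char *args[], int max_args)
{
    int n = 0;
    char *tok = strtok(line, " \t\n");

    while (tok != NULL) {
        if (n >= max_args - 1)
            return -1;            /* no room left for the NULL terminator */
        args[n++] = tok;
        tok = strtok(NULL, " \t\n");
    }
    args[n] = NULL;
    return n;
}
```

Because strtok skips runs of separators, extra white space around a command or between its arguments does not produce empty arguments, and a blank line yields a count of 0.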
Executing Commands
Look into fork, execv, and waitpid.
Miscellaneous Hints
Remember to get the basic functionality of your shell working
before worrying about all of the error conditions and end cases. For
example, first focus on interactive mode, and get a single command
running in the foreground working (probably first command with no
arguments, such as "ls"). Then, add in the functionality to work in
batch mode (most of our test cases will use batch mode, so make sure
this works!). Next, handle starting up jobs in the background;
waiting for them and then listing their status should be next.
Finally, make sure that you are correctly handling all of the cases
where there is miscellaneous white space around commands or missing
commands.
Grading
Sample Test Scripts