CS537 Student Wiki


Page: PmWiki.Project1 - Last Modified : Thu, 24 Sep 09

Programming Assignment 1: The Unix Shell

Due Thursday, 9/24 at 9 pm.
REVISED: Due Tuesday, 9/29 at 9pm. If turned in by 9/24 at 9pm, you can receive up to 20% extra credit.
You are to do this project BY YOURSELF
This project must be implemented in C (and not C++ or anything else)

Update

Thursday, 9/24:

  • Clarification: regarding the file name for executables. I said in class that you should be able to find executables in the local directory. People have pointed out that execvp() searches the path, which does not include the local directory. So ... You can rely on execvp() to do the right thing. If we want to find executables in the current directory, we will modify your path before running that test.
  • Clarification: It has been pointed out that "cd ~" does not work. As long as you faithfully pass in the directory name from the command line to "cd", it is chdir()'s problem to do the change. If it cannot handle something, that is not your problem. In addition, it is optional to handle directory names with spaces in them (meaning we won't test it).

Wednesday, 9/23:

  • Hint: you can use gdb to debug child processes as well. Here is a link to instructions
  • Several people have asked whether there has to be a space before ampersands (&). The assignment does not specify that a space is required. HOWEVER, you can safely assume there will be a space and do not need to handle the case where there is not -- you can treat the ampersand as part of the last word.
  • Hint: make sure to call exit() after executing a shell command if you are doing it in the background -- otherwise you will leave two copies of the shell around.
  • A student pointed out that you can get the web page for Purify help by running "purify -onlinehelp".
  • Apparently the mysterious "detach()" system call I mentioned to prevent zombies does not exist in Linux. You can ignore zombies for this project. However, you do need to wait for all child processes to exit before the shell itself exits.
  • Hint: if you do "cd something &" it is not supposed to have any effect. The reason is that the current working directory is different for each process. When you do "cd something &" - it changes the directory of the background process, not the shell itself.

Monday, 9/21:

  • Here is a link to a simple program for testing background code: simple.c. It takes three parameters: a name to print, the number of seconds to sleep, and the number of iterations to execute.
  • Your shell only needs to execute programs in the current directory or with a full file name. So, a user must enter "/bin/ls" to find the ls program. Running "ls" would only find ls if it was in the current directory.

Sunday, 9/20:

  • You must use fork() and execvp() for launching programs -- do not use system(). The goal of this assignment is to write a shell, and system() invokes the already-existing shell to do things.
  • Every process has its own working directory. So, if you fork a process and the new process executes the chdir() system call, it does not change the directory for the parent process. Hence, for chdir() to change directory for the shell itself, the shell process must make the system call itself. In fact, the shell should execute all built-in commands itself. However, if a built-in command is followed by an '&', then you need to make it run in the background. To accomplish this, you can call fork(), and then execute the command in the new process. Note that chdir() will not change the shell's working directory this time, but as specified below, that is o.k.

Friday, 9/18:

  • As I described in section, there can be multiple '&'s on a line. If there is no final '&', then your shell needs to wait for just the final command. Otherwise, it doesn't have to wait for any of the commands. This is now described in the "Process Control" section below.
  • Hint for executing builtin commands with '&': fork the shell, as you would for a regular command, and then execute the builtin command.
  • A development hint: start by developing the parsing code separately, just to parse the commands and print out the executable name and arguments. Write the fork/exec code separately using hard-coded commands for testing. Once these two pieces work on their own, then combine them.

Objectives

There are four objectives to this assignment:

  • To familiarize yourself with the Linux programming environment.
  • To develop your defensive programming skills in C.
  • To gain exposure to the necessary functionality in shells.
  • To learn how processes are handled (i.e., starting and waiting for their termination).

Overview

In this assignment, you will implement a command line interpreter or shell. The shell should operate in this basic way: when you type in a command (in response to its prompt), the shell creates a child process that executes the command you entered and then prompts for more user input when it has finished.

The shells you implement will be similar to, but much simpler than, the one you run every day in Unix. You can find out which shell you are running by typing echo $SHELL at a prompt. You may then wish to look at the man pages for sh or the shell you are running (more likely tcsh or bash) to learn more about all of the functionality that can be present. For this project, you do not need to implement much functionality; but you will need to be able to handle running multiple commands simultaneously.

Your shell can be run in two ways: interactive and batch. In interactive mode, you will display a prompt (any string of your choosing) and the user of the shell will type in a command at the prompt. In batch mode, your shell is started by specifying a batch file on its command line; the batch file contains the list of commands (each on its own line) that should be executed. In batch mode, you should not display a prompt. In batch mode you should echo each line you read from the batch file back to the user before executing it; this will help you when you debug your shells (and us when we test your programs). In both interactive and batch mode, your shell stops accepting new commands when it sees the exit command on a line or reaches the end of the input stream (i.e., the end of the batch file or the user types 'Ctrl-D'). The shell should then exit after all running processes have terminated.

Process Control

Each line (of the batch file or typed at the prompt) contains a single command. In addition, programs can be launched asynchronously with the "&" separator. This causes the command to execute without waiting for the result. This should be implemented by calling fork() and then executing the command.

For example:

  prompt> ls -l &
  prompt> cat foo

will execute "cat foo" without waiting for "ls -l" to complete. This is simple to code: just don't wait for the first process to complete before starting the second.

You can also have multiple commands on a line with this:

  prompt> cat foo & sleep 10 &

This causes both 'cat' and 'sleep' to execute in the background, and the shell immediately continues and prints another prompt.

  prompt> cat foo & sleep 10

This causes 'cat foo' to execute in the background, but the shell must wait for the 'sleep 10' to complete because it does not have a '&' after it.
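The rules above can be sketched as a small line splitter. This is only one possible approach, not a required design: the name split_jobs and its parameters are hypothetical, and silently skipping segments that contain no command is just one of the choices the specification leaves open.

```c
#include <string.h>

/* Hypothetical helper: split a command line in place at each '&'.
 * jobs[i] is set to point at the text of the i-th command.
 * *wait_last is set to 1 when the line does not end in '&', meaning the
 * shell must wait for the final command; otherwise it is set to 0.
 * Returns the number of commands found (at most max_jobs). */
int split_jobs(char *line, char *jobs[], int max_jobs, int *wait_last)
{
    /* Look at the last non-whitespace character before strtok()
       starts overwriting the separators. */
    size_t len = strlen(line);
    while (len > 0 && strchr(" \t\n", line[len - 1]) != NULL)
        len--;
    *wait_last = !(len > 0 && line[len - 1] == '&');

    int n = 0;
    for (char *tok = strtok(line, "&"); tok != NULL && n < max_jobs;
         tok = strtok(NULL, "&")) {
        /* Assumption: segments holding only whitespace are skipped. */
        if (tok[strspn(tok, " \t\n")] != '\0')
            jobs[n++] = tok;
    }
    if (n == 0)
        *wait_last = 0;   /* nothing to run, so nothing to wait for */
    return n;
}
```

For "cat foo & sleep 10 &" this yields two jobs with *wait_last set to 0; for "cat foo & sleep 10" it yields two jobs with *wait_last set to 1. Because strtok() keeps internal state, finish splitting the line into jobs before using strtok() again on an individual job.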

Built-in commands

Whenever your shell accepts a command, it should check whether the command is a built-in command or not. If it is, it should not be executed like other programs. Instead, your shell will invoke your implementation of the built-in command. For example, to implement the exit built-in command, you simply call exit(0); in your C program.

The UNIX shell has many built-in commands such as cd, echo, pwd, etc. In this project, you should implement:

  • exit - exit the shell
  • echo - print the remaining arguments (up to "&" if there is one)
  • cd - change the current directory
  • pwd - print the current directory

Your shell users will be happy with this feature because they can change their working directory. Without this feature, your user is stuck in a directory.

For the 'exit' built-in command, you should simply call 'exit(0);'. The corresponding process will exit, and the parent (i.e. your shell) will be notified. If the "exit" command is on the same line with other commands, no commands after the "exit" should execute. If not followed by an '&', built-in commands should execute in your shell program, so that side effects such as changing directories affect the shell and future programs that it launches. If a built-in command is followed by an '&', then it should not affect the shell.

Invoking "exit" without waiting (i.e., followed by an '&') does not cause the shell to exit.

The shell should wait for all child processes to terminate when the user hits ctrl-D or the shell reaches the end of the batch file. This is true for the "exit" command as well.

For managing the current working directory, you should use chdir() and getcwd(). You do not have to parse the directory for cd; just pass it to the chdir() system call. You do not have to manage the $PWD environment variable. The getcwd() call returns the current working directory, so if a user types pwd, you simply call getcwd() and print the result. chdir() changes the working directory of the calling process.
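A minimal sketch of the two directory built-ins, assuming hypothetical helper names builtin_cd and builtin_pwd. Treating "cd" with no argument as an error is an assumption made here; the specification does not say what should happen in that case.

```c
#include <limits.h>
#include <string.h>
#include <unistd.h>

#ifndef PATH_MAX
#define PATH_MAX 4096   /* fallback for systems that do not define it */
#endif

/* Hypothetical helper for "cd": hands the name straight to chdir().
 * Returns 0 on success, -1 after printing an error to stderr. */
int builtin_cd(const char *dir)
{
    if (dir == NULL || chdir(dir) != 0) {
        const char msg[] = "An error has occurred\n";
        write(STDERR_FILENO, msg, strlen(msg));
        return -1;
    }
    return 0;
}

/* Hypothetical helper for "pwd": prints the result of getcwd().
 * Returns 0 on success, -1 if getcwd() failed. */
int builtin_pwd(void)
{
    char cwd[PATH_MAX];
    if (getcwd(cwd, sizeof(cwd)) == NULL)
        return -1;
    write(STDOUT_FILENO, cwd, strlen(cwd));
    write(STDOUT_FILENO, "\n", 1);
    return 0;
}
```

Both helpers must be called from the shell process itself (not from a forked child) if the change of directory is meant to affect later commands.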

Built-in commands can also execute as if they were separate programs: when followed by "&", they run asynchronously in a forked copy of the shell.

This project is not as hard as it may seem at first reading (or perhaps it doesn't seem that hard at all, which is good!); in fact, the code you write will be much smaller than this specification. Writing your shell in a simple manner is a matter of finding the relevant library routines and calling them properly. Your finished programs will probably be under 200 lines, including comments. If you find that you are writing a lot of code, it probably means that you are doing something wrong and should take a break from hacking and instead think about what you are trying to do.

Program Specifications

Your C program must be invoked exactly as follows:

  shell [batchFile] 

The command line arguments to your shell are to be interpreted as follows.

batchFile
an optional argument (often indicated by square brackets as above). If present, your shell will read each line of the batchFile for commands to be executed. If not present, your shell will run in interactive mode by printing a prompt to the user at stdout and reading the command from stdin.

For example, if you run your program as

  shell /u/j/v/batchfile 

then your program will read commands from /u/j/v/batchfile until it sees the exit command.
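One way to handle the invocation rules is a small helper that main() calls first. This is a sketch only: open_input is a hypothetical name, and the wording of the error message is an assumption.

```c
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: choose the command source from main()'s arguments.
 * Returns stdin for interactive mode, an open FILE * for batch mode,
 * or NULL after printing an error (the caller should then exit). */
FILE *open_input(int argc, char *argv[])
{
    const char msg[] = "An error has occurred\n";

    if (argc == 1)
        return stdin;                 /* no batch file: interactive mode */
    if (argc == 2) {
        FILE *f = fopen(argv[1], "r");
        if (f != NULL)
            return f;                 /* batch mode */
    }
    /* Wrong number of arguments, or the batch file could not be opened. */
    write(STDERR_FILENO, msg, strlen(msg));
    return NULL;
}
```

main() can compare the result with stdin to decide whether to print a prompt (interactive mode) or echo each line (batch mode).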

Defensive programming is an important concept in operating systems: an OS can't simply fail when it encounters an error; it must check all parameters before it trusts them. In general, there should be no circumstances in which your C program will core dump, hang indefinitely, or prematurely terminate. Therefore, your program must respond to all input in a reasonable manner; by "reasonable", we mean print an understandable error message and either continue processing or exit, depending upon the situation.

You should consider the following situations as errors; in each case, your shell should print a message (to stderr) and exit gracefully:

  • An incorrect number of command line arguments to your shell program.
  • The batch file does not exist or cannot be opened.

For the following situation, you should print a message to the user (stderr) and continue processing:

  • A command does not exist or cannot be executed.

Optionally, to make coding your shell easier, you may print an error message and continue processing in the following situation:

  • A very long command line (for this project, over 512 characters).

If a line is too long, you should print an error without executing any commands.
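Detecting an over-long line with fgets() could look like the sketch below. The names read_line, MAX_LINE and LINE_BUF are hypothetical, and counting the 512-character limit without the trailing newline is an assumption.

```c
#include <stdio.h>
#include <string.h>

#define MAX_LINE 512              /* longest accepted line, newline excluded */
#define LINE_BUF (MAX_LINE + 2)   /* room for '\n' and the terminating '\0' */

/* Hypothetical helper: read one command line into buf.
 * Returns 1 if a line was read, 0 at end of input, and -1 if the line
 * was too long (the remainder of that line is read and discarded). */
int read_line(FILE *in, char buf[LINE_BUF])
{
    if (fgets(buf, LINE_BUF, in) == NULL)
        return 0;
    if (strchr(buf, '\n') != NULL)
        return 1;                     /* a complete line fit in the buffer */
    if (strlen(buf) < (size_t)(LINE_BUF - 1))
        return 1;                     /* short final line without a newline */

    /* The buffer filled up before a newline appeared: skip the rest. */
    int c;
    while ((c = getc(in)) != EOF && c != '\n')
        ;
    return -1;
}
```

Discarding the rest of the long line matters: otherwise the next call would return the tail of that line as if it were a new command.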

Do not use printf! When you want to print a command line or error message, your shell code should *never use printf* . Instead, use write(STDOUT_FILENO, ...). For example:

    write(STDOUT_FILENO, cmdline, strlen(cmdline));

    char error_message[30] = "An error has occurred\n";
    write(STDERR_FILENO, error_message, strlen(error_message)); 

You can use sprintf() to get the same formatting features as printf; snprintf() is the safer variant because it never writes past the end of the buffer:

   char error_message[200];
   snprintf(error_message, sizeof(error_message), "Error occurred: invalid command %s\n", command);
   write(STDERR_FILENO, error_message, strlen(error_message));

The reason is quite complicated. In short, your shell will be tested in an automated way, and using printf in your shell code will make the output nondeterministic. If you want to know more, email us.

Your shell should also be able to handle the following scenarios, which are not errors (i.e., your shell should not print an error message):

  • An empty command line.
  • Extra white spaces within a command line.
  • Batch file ends without exit command or user types 'Ctrl-D' as command in interactive mode.

In no case should any input or any command line format cause your shell program to crash or to exit prematurely. You should think carefully about how you want to handle oddly formatted command lines (e.g., lines with no commands between '&'s). In these cases, you may choose to print a warning message and/or execute some subset of the commands. However, in all cases, your shell should continue to execute!

You can assume that files end in a newline and a ctrl-D only happens at the beginning of a line.

Hints

Your shell is basically a loop: it repeatedly prints a prompt (if in interactive mode), parses the input, executes the command specified on that line of input, and waits for the command to finish, if it is in the foreground. This is repeated until the user types "exit" or ends their input.

You should structure your shell such that it creates a new process for each new command. There are two advantages of creating a new process. First, it protects the main shell process from any errors that occur in the new command. Second, it allows easy concurrency; that is, multiple commands can be started and allowed to execute simultaneously (i.e., in parallel style).

To simplify things for you in this first assignment, we will suggest a few library routines you may want to use to make your coding easier. (Do not expect this level of detailed advice for future assignments!) You are free to use these routines if you want or to disregard our suggestions.

To find information on these library routines, look at the manual pages (using the Unix command man). You will also find man pages useful for seeing which header files you should include.

Hint: to make sure that all child processes have terminated before you exit the shell, you can rely on this information from the wait man page:

    If wait() returns due to a stopped or terminated child process, the
     process ID of the child is returned to the calling process.  Otherwise, a
     value of -1 is returned and errno is set to indicate the error.

Hint: to make sure the "cd" takes effect on the shell itself, make sure that you do not fork before executing builtin commands (unless they are followed by an '&').
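To see why forking first defeats "cd", the following small experiment may help. It is an illustration only, with a hypothetical function name: the child changes its directory and exits, and the parent then checks that its own working directory is unchanged.

```c
#include <limits.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#ifndef PATH_MAX
#define PATH_MAX 4096   /* fallback for systems that do not define it */
#endif

/* Hypothetical demo: returns 1 if a chdir() performed in a child process
 * left the parent's working directory unchanged, 0 if it changed,
 * and -1 if any step (including the child's chdir) failed. */
int cd_in_child_is_isolated(const char *dir)
{
    char before[PATH_MAX], after[PATH_MAX];
    if (getcwd(before, sizeof(before)) == NULL)
        return -1;

    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0)
        _exit(chdir(dir) == 0 ? 0 : 1);   /* child: change directory, exit */

    int status;
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    if (!WIFEXITED(status) || WEXITSTATUS(status) != 0)
        return -1;                        /* the child's chdir() failed */

    if (getcwd(after, sizeof(after)) == NULL)
        return -1;
    return strcmp(before, after) == 0;
}
```

This is the behavior "cd something &" relies on: the directory change happens, but only inside the background process.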

Basically, when you enter control-D, the standard I/O libraries translate this to "end of file". Thus, you should make sure you can handle the end of a file. If you have this working for batch files, it should work the same for interactive usage.

Parsing

For reading lines of input, you may want to look at fgets(). To open a file and get a handle with type FILE *, look into fopen(). Be sure to check the return code of these routines for errors! (If you see an error, the routine perror() is useful for displaying the problem.) You may find the strtok() routine useful for parsing the command line (i.e., for extracting the arguments within a command separated by whitespace or a tab or ...).
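A minimal sketch of argument splitting with strtok(), assuming a hypothetical helper named split_args. Silently dropping arguments that do not fit in the array is a simplification; a real shell might report an error instead.

```c
#include <string.h>

/* Hypothetical helper: split one command in place on spaces, tabs and
 * newlines. argv must have room for max_args pointers, including the
 * terminating NULL. Returns the number of arguments stored. */
int split_args(char *command, char *argv[], int max_args)
{
    int argc = 0;
    char *tok = strtok(command, " \t\n");
    while (tok != NULL && argc < max_args - 1) {
        argv[argc++] = tok;
        tok = strtok(NULL, " \t\n");
    }
    argv[argc] = NULL;   /* execvp() needs the list to end with NULL */
    return argc;
}
```

Because strtok() treats a run of delimiters as a single separator, extra white space and empty command lines fall out naturally: an empty line simply produces zero arguments.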

Executing Commands

Look into fork(), execvp(), and wait/waitpid().

The fork() system call creates a new process. After this point, two processes will be executing within your code. You will be able to differentiate the child from the parent by looking at the return value of fork; the child sees a 0, the parent sees the pid of the child.

You will note that there are a variety of commands in the exec family; for this project, you must use execvp(). Remember that if execvp() is successful, it will not return; if it does return, there was an error (e.g., the command does not exist). The most challenging part is getting the arguments correctly specified. The first argument specifies the program that should be executed; if the name contains no slash, execvp() searches the directories in PATH for it, so this is straightforward. The second argument, char *argv[], matches what the program sees in its function prototype:

  int main(int argc, char *argv[]); 

Note that this argument is an array of strings, or an array of pointers to characters. For example, if you invoke a program with:

  foo 205 535 

then argv[0] = "foo", argv[1] = "205" and argv[2] = "535". Important: the list of arguments must be terminated with a NULL pointer; that is, argv[3] = NULL. We strongly recommend that you carefully check that you are constructing this array correctly!
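Putting fork(), execvp() and waitpid() together might look like the sketch below. The function name launch and its background flag are hypothetical, and the error text is an assumption. It expects an argv array that is already NULL-terminated as described above.

```c
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run argv[0] in a child process.
 * If background is 0, block until that child has exited.
 * Returns the child's pid, or -1 if fork() failed. */
pid_t launch(char *argv[], int background)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        /* Child: fork() returned 0. Replace this process with the command. */
        execvp(argv[0], argv);
        /* Only reached if execvp() failed, e.g. the command does not exist. */
        const char msg[] = "An error has occurred\n";
        write(STDERR_FILENO, msg, strlen(msg));
        _exit(1);   /* do not fall back into a second copy of the shell */
    }
    /* Parent: fork() returned the child's pid. */
    if (!background) {
        int status;
        waitpid(pid, &status, 0);
    }
    return pid;
}
```

For the example above, an array holding "foo", "205", "535" and a final NULL would be passed as argv. The child exits explicitly after a failed execvp() so that it never returns into the shell's main loop.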

The wait()/waitpid() system calls allow the parent process to wait for its children. Read the man pages for more details.

Miscellaneous Hints

Remember to get the basic functionality of your shell working before worrying about all of the error conditions and end cases. For example, first focus on interactive mode, and get a single command running (probably first a command with no arguments, such as "ls"). Then, add in the functionality to work in batch mode (most of our test cases will use batch mode, so make sure this works!). Next, try working on multiple jobs separated with the & character. Finally, make sure that you are correctly handling all of the cases where there is miscellaneous white space around commands or missing commands. We strongly recommend that you check the return codes of all system calls from the very beginning of your work. This will often catch errors in how you are invoking these new system calls.

Beat up your own code! You are the best (and in this case, the only) tester of this code. Throw lots of junk at it and make sure the shell behaves well. Good code comes through testing -- you must run all sorts of different tests to make sure things work as desired. Don't be gentle -- other users certainly won't be. Break it now so we don't have to break it later.

Keep versions of your code. More advanced programmers will use a source control system such as CVS. Minimally, when you get a piece of functionality working, make a copy of your .c file (perhaps a subdirectory with a version number, such as v1, v2, etc.). By keeping older, working versions around, you can comfortably work on adding new functionality, safe in the knowledge you can always go back to an older, working version if need be.

For testing your code, you will probably want to run commands that take a while to complete. Try compiling and running this very simple C program; when multiple copies are run in parallel you should see the output from each process interleaved. See the code for more details.

Grading

For this project, you need to hand in three distinct items:

  • Your source code (no object files or executables, please!)
  • A Makefile for compiling your source code
  • A README file with some basic documentation about your code

Hand in your source code. We have created a directory ~cs537-2/handin/NAME, where NAME is your login name. For example, if your login is jimmyv, your handin directory is ~cs537-2/handin/jimmyv. Your handin directory has subdirectories: p1, p2, p3, etc. For this assignment, use directory p1.

Copy all of your .c source files into the appropriate subdirectory. Do not submit any .o files. After the deadline for this project, you will be prevented from making any changes in this directory. Remember: No late projects will be accepted! To ensure that we compile your C correctly for the demo, you will need to create a simple makefile; this way our scripts can just run make to compile your code with the right libraries and flags. If you don't know how to write a makefile, you might want to look at the man pages for make, or better yet, read this little tutorial. Otherwise, check out this very simple sample makefile.

Finally, we would like to see a file called README describing your code. This file should contain the following four sections:

  • Your name and login information
  • Design overview: A few paragraphs describing the overall structure of your code and any important structures.
  • Complete specification: Describe how you handled any ambiguities in the specification. For example, for this project, explain how your shell will handle lines that have no commands between '&'s.
  • Known bugs or problems: A list of any features that you did not implement or that you know are not working correctly

Due to the simplicity of this project, the documentation for this project is fairly minimal. It may be more extensive for future projects. The majority of your grade for this assignment will depend upon how well your implementation works and a smaller portion will be given for documentation and style. We will run your program on a suite of about 20 test cases, some of which will exercise your program's ability to correctly execute commands and some of which will test your program's ability to catch error conditions. Be sure that you thoroughly exercise your program's capabilities on a wide range of test suites, so that you will not be unpleasantly surprised when we run our tests.

Even though you will not be heavily graded on style for this assignment, you should still follow all the principles of software engineering you learned in CS 302, CS 367, and elsewhere, such as top-down design, good indentation, meaningful variable names, modularity, and helpful comments. Don't be sloppy! You should be following these principles for your own benefit. Finally, while you can develop your code on any system that you want, make sure that your code runs correctly on a machine that runs the Linux operating system. Specifically, since libraries and environments sometimes vary in small and large ways across systems, you should verify your code on the Linux machines in the 13XX labs (e.g., the royal or emperor clusters). These machines are where your projects will be tested.

Other Shell Fun

The shell you are building is very simplistic; it doesn't have a PATH variable, there is no shell history of previous commands you have run, the user can't customize the prompt, etc. Feel free to play around with adding such functionality; it will give you some more insight into how things really work. However, please don't let your fun ruin your code -- please turn in a copy of the shell that implements only the functionality described in this specification for the project.
