CS 537: Project 1.2 Clarifications/Advice

Parsing

A command is the path to a file, optionally followed by a variable number of arguments and by input and output redirection. The entire command is optionally followed by an ampersand. The path, arguments, and file names will not contain the '&', ';', '|', '>', or '<' characters. Input and output redirection may appear before, after, or in between arguments. For example, the following are all valid and equivalent:
A list of commands is a sequence of commands separated by semicolons. There may be several semicolons separating two commands. A line is either empty or consists of any number of lists of commands separated by at most one '|'. Lines are terminated by newline characters. The following examples display correct shell output (assuming the only file in the current directory is shell.c):

    prompt> /bin/ls
    shell.c
    prompt> /bin/ls ; /bin/ls
    shell.c
    shell.c
    prompt> /bin/ls ; /bin/ls ; /bin/ls
    shell.c
    shell.c
    shell.c
    prompt>

Input/Output Redirection

A command cannot redirect input or output without naming a file, and a file used for input redirection must exist. You do not have to print the exact error messages shown below, but your code should not segfault on the following examples.

    prompt> /bin/ls <
    Missing name for redirect.
    prompt> /bin/ls >
    Missing name for redirect.
    prompt>
    prompt> /bin/ls < nonexistant
    nonexistant: No such file or directory
    prompt>

If the output file does not exist you should create it:

    prompt> /bin/ls newfile
    /bin/ls: newfile: No such file or directory
    prompt> /bin/ls > newfile
    prompt> /bin/cat newfile
    shell.c
    prompt>

If the output file already exists you should not overwrite it:

    prompt> /bin/ls > shell.c
    shell.c: File exists
    prompt>

For the above you should use the open system call; look at the various arguments it takes (the O_CREAT and O_EXCL flags are relevant here).

Input or output redirection should not be specified more than once for the same command:

    prompt> /bin/ls < shell.c < shell.c
    Ambiguous input redirect.
    prompt>

Pipes

Pipes are the oldest form of UNIX interprocess communication (IPC). They are half-duplex: data flows in only one direction. The pipe system call takes an array of two integers; it puts the file descriptor for reading from the pipe in the first element and the file descriptor for writing to the pipe in the second. Pipes are not useful within a single process, but if a process forks a child (or children) after creating a pipe, then both processes have copies of the pipe file descriptors.
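The pipe-then-fork pattern just described can be sketched as follows. This is only a minimal illustration, not the project's own example code: the helper name pipe_roundtrip and its parameters are hypothetical, and instead of running two programs the child simply writes a message that the parent reads back.

```c
#define _POSIX_C_SOURCE 200809L
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: the child writes msg into a pipe and the parent
 * reads it back into buf (always NUL-terminated).
 * Returns the number of bytes read, or -1 if pipe() or fork() fails.
 */
ssize_t pipe_roundtrip(const char *msg, char *buf, size_t bufsz)
{
    int fds[2];                 /* fds[0]: read end, fds[1]: write end */

    if (bufsz == 0 || pipe(fds) == -1)
        return -1;

    pid_t pid = fork();         /* both processes now hold both descriptors */
    if (pid == -1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    if (pid == 0) {
        /* Child (producer): close the unused read end, then write. */
        close(fds[0]);
        size_t len = strlen(msg);
        ssize_t written = write(fds[1], msg, len);
        close(fds[1]);
        _exit(written == (ssize_t)len ? 0 : 1);
    }

    /* Parent (consumer): close the write end so that read() reports
     * end-of-file once the child has closed its copy. */
    close(fds[1]);

    size_t total = 0;
    ssize_t n;
    while (total < bufsz - 1 &&
           (n = read(fds[0], buf + total, bufsz - 1 - total)) > 0)
        total += (size_t)n;
    buf[total] = '\0';
    close(fds[0]);

    waitpid(pid, NULL, 0);      /* reap the child */
    return (ssize_t)total;
}
```

A caller would pass a message and a buffer, for example pipe_roundtrip("hello\n", buf, sizeof buf). In a shell, each child would instead use dup2 to attach its end of the pipe to standard input or standard output before calling exec, so that the executed programs use the pipe without knowing about it.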
The process producing data closes the first file descriptor and writes to the second. The consuming process closes the second file descriptor and reads from the first. This code illustrates how your shell could set up a pipe between the two processes in the command "ls | grep foo". Make sure you understand what it does.

The program pnums takes a single argument n and prints the numbers 0 to n-1 on separate lines. The program clines counts the number of lines of input it receives and prints this information. In the following example the output of pnums is piped to clines.

    prompt> /u/e/l/eli/537/p1/pnums 3
    0
    1
    2
    prompt> /u/e/l/eli/537/p1/pnums 3 | /u/e/l/eli/537/p1/clines
    3 lines
    prompt>

Pipes and input/output redirection can be combined on the same line:

    prompt> /u/e/l/eli/537/p1/pnums 3 | /u/e/l/eli/537/p1/clines > /u/e/l/eli/537/p1/newfile
    prompt> /bin/cat /u/e/l/eli/537/p1/newfile
    3 lines
    prompt>

A process cannot receive input from both a pipe and input redirection. A process cannot send its output to a pipe and also have its output redirected to a file. You do not have to print the errors shown in the following examples, but your code should not segfault on them (we will not assume that part of the command executes successfully).

    prompt> /u/e/l/eli/537/p1/pnums 5 | /u/e/l/eli/537/p1/clines < /u/e/l/eli/537/p1/some_file
    Ambiguous input redirect.
    prompt> /u/e/l/eli/537/p1/pnums 5 > /u/e/l/eli/537/p1/newfile | /u/e/l/eli/537/p1/clines
    Ambiguous output redirect.
    prompt>

Waiting

For part 2, when your shell exits (either with Ctrl-d or with the "exit" command) you should not wait for all children to exit before your shell exits. A command that is executed in the background should not be immediately waited on by the shell. But eventually this process must be waited on, because after it exits it will stick around until its parent waits on it (it sticks around so that its return code is available).
When a process exits, its parent is sent the SIGCHLD signal. A signal is like a software interrupt. Signals are received asynchronously (we do not know when the child will die). To handle this you need to set up a signal handler for SIGCHLD so that processes that exit can be waited on (if you do not wait on them they show up as "defunct" in the ps command's output). The following illustrates what code you need to add.

    #include <errno.h>
    #include <signal.h>
    #include <sys/types.h>
    #include <sys/wait.h>

    void handle_sigchld(int s)
    {
        int saved_errno = errno;  /* waitpid may overwrite errno */

        /* execute non-blocking waitpid, loop because we may only receive
         * a single signal if multiple processes exit around the same time. */
        while (waitpid(0, NULL, WNOHANG) > 0)
            ;
        errno = saved_errno;
    }

    int main()
    {
        ...
        /* register the handler */
        signal(SIGCHLD, handle_sigchld);
        ...
    }

Your shell should still execute a (blocking) waitpid to wait on a process that is not to be executed in the background. When it does, it should ignore the error return code (waitpid returns -1 with errno set to ECHILD, indicating the process was already waited on) because it is possible that the signal handler waited on the child first. This is an example of a race condition (the race is between these two calls to waitpid).

Background Processes

If a sequence of commands ends with an ampersand, then all commands in the pipe chain should execute in the background. The program endless loops endlessly. In the following example note that endless is executing in the background even though we only use one ampersand at the end of the line.

    prompt> /u/e/l/eli/537/p1/endless | /u/e/l/eli/537/p1/clines &
    prompt> /bin/ps
      PID TTY          TIME CMD
     5799 pts/2    00:00:00 tcsh
     6050 pts/2    00:00:00 shell
     6051 pts/2    00:00:03 endless
     6052 pts/2    00:00:00 clines
     6053 pts/2    00:00:00 ps
    prompt>

This is not true for commands in a list. In the following example we never execute the second command because the first never terminates.

    prompt> /u/e/l/eli/537/p1/endless ; /u/e/l/eli/537/p1/endless &
    ...
    ...
    ...

The following works:

    prompt> /u/e/l/eli/537/p1/endless & ; /u/e/l/eli/537/p1/endless &
    prompt> /bin/ps
      PID TTY          TIME CMD
     8899 pts/0    00:00:00 tcsh
     9841 pts/0    00:00:00 shell
     9842 pts/0    00:00:19 endless
     9843 pts/0    00:00:19 endless
     9847 pts/0    00:00:00 ps
    prompt>

Advice

Come up with a high-level algorithm for parsing the command line. As with any assignment, think about the problem before you start diving into code. For example, do you want to go through each token sequentially, or make several passes over the string with different delimiters each time? You might find strtok_r useful for this.

Your life will be easier, and your code will be much cleaner, if you use data structures and functions to modularize your code. If you are not familiar with structs, memory allocation (or other C basics), get a copy of "The C Programming Language" or use the online tutorials. If you need practice, try doing some simple exercises like implementing a linked list.

If you (1) check the return values of all library and system calls for errors, and (2) check pointer values for NULL before using them, you will find the source of many bugs.

Use default values. For example, if you create an array it will initially contain junk values. If you initially set all array elements to NULL and later observe junk values, you know these values are due to a bug in your program.

If you are getting a segfault, look at one of the gdb tutorials to see how to debug your program. Here's the transcript of a shell session using gdb to see which line in a program caused the segfault. The program should be compiled with the -g flag so that gcc will include debug information (like line numbers) in the binary.
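To make the parsing advice above more concrete, here is a minimal sketch of the "several passes with different delimiters" approach using strtok_r. The names split_args, parse_list, and MAX_ARGS are hypothetical, and the sketch only handles semicolons and whitespace; redirection, pipes, and the ampersand are left out.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <string.h>

#define MAX_ARGS 16   /* hypothetical limit, including the NULL terminator */

/* Inner pass: split one command on whitespace. Modifies cmd in place,
 * fills argv with pointers into it, NULL-terminates argv, and returns
 * the number of arguments found. */
int split_args(char *cmd, char *argv[], int max_args)
{
    int argc = 0;
    char *save = NULL;

    for (char *tok = strtok_r(cmd, " \t\n", &save);
         tok != NULL && argc < max_args - 1;
         tok = strtok_r(NULL, " \t\n", &save)) {
        argv[argc++] = tok;
    }
    argv[argc] = NULL;
    return argc;
}

/* Outer pass: split a line on ';' and hand each piece to split_args.
 * Each pass keeps its own save pointer, which is why strtok_r is used
 * rather than strtok. Returns the number of non-empty commands. */
int parse_list(char *line)
{
    int ncommands = 0;
    char *save = NULL;

    for (char *cmd = strtok_r(line, ";", &save);
         cmd != NULL;
         cmd = strtok_r(NULL, ";", &save)) {
        char *argv[MAX_ARGS];
        int argc = split_args(cmd, argv, MAX_ARGS);

        if (argc == 0)
            continue;   /* only whitespace between two semicolons */

        printf("command %d:", ncommands);
        for (int i = 0; i < argc; i++)
            printf(" [%s]", argv[i]);
        printf("\n");
        ncommands++;
    }
    return ncommands;
}
```

Because strtok_r treats a run of delimiters as a single separator, several semicolons in a row do not produce empty commands. Both functions write into the string they are given, so they must be called on a modifiable buffer (for example a char array filled by fgets), not on a string literal.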