A shell script, in its simplest form, is a series of shell commands put
together in a text file, which can then be executed in order. The commands used
in the scripting language are the same ones the user types in when they are
using Unix through a command line. To those of you who are used to writing code
in languages like C and Java, shell scripting can seem strange because a script
doesn't need to be compiled before it is run and because of its relative lack
of structure, but it can prove to be a surprisingly powerful tool.
Shell scripting is useful in situations where you want to get a program up and running without wasting too much time. Often a shell script will only have a fraction of the number of lines of an equivalent C or Java program, and hence, it only takes a fraction of the time to write. In this tutorial, we will look at how to use the Bourne shell, sh, but there are many other shell scripting languages available, such as ksh, csh and bash.
In this tutorial, code is set apart from the surrounding text, and comments
begin with a #, as shown below:
    #!/bin/sh
    ls    # Comment
In keeping with Computer Science tradition, the first Unix script we will
look at is the "Hello World" example. This is a very simple example, but it is
worth seeing because it demonstrates a number of interesting aspects of Unix
scripting.
    #!/bin/sh
    printf "Hello World\n"    # Writes string to standard output
The program above is a valid shell script. You can run it by typing it into a text file, making the text file executable (for example with chmod +x), and then running it just like any other program. Note that you do not need to compile it, as you would for C, or explicitly run an interpreter, as with Java; it runs as is.
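As a rough sketch of those steps, the commands below create the script, make it executable and run it. The file name hello.sh and the scratch directory made with mktemp -d are assumptions chosen for illustration; any file name and directory would do.

```shell
#!/bin/sh
# Hypothetical scratch directory, so the example does not touch your own files
WORKDIR=`mktemp -d`

# Step 1: put the script into a text file (a text editor works just as well)
cat > "$WORKDIR/hello.sh" << 'END_OF_SCRIPT'
#!/bin/sh
printf "Hello World\n"
END_OF_SCRIPT

# Step 2: make the text file executable
chmod +x "$WORKDIR/hello.sh"

# Step 3: run it like any other program, keeping what it printed
RESULT=`"$WORKDIR/hello.sh"`
printf "%s\n" "$RESULT"

rm -r "$WORKDIR"
```

In everyday use you would normally just type ./hello.sh at the prompt; the result is captured in a variable here only so that it can be inspected.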
When you look at the program, you will see three distinct parts to it. We will look at each of these separately.
The first part is the line #!/bin/sh, which should be the first line of each and
every shell script you write. "#!" (pronounced "hash bang") tells the computer
that what comes after it is the path of the interpreter needed to execute the
script. In this case (and with all of the scripts presented here), the
interpreter is /bin/sh, the Bourne shell. If you are going to write a program in
a different language, you change the path of the interpreter; for example,
Python scripts might start with #!/local/usr/bin/python, depending on where
Python is installed on your system.
The second part, the printf command, writes Hello World to the standard output.
The printf command operates in a similar way to the printf() function in C, and
can be used to write formatted output to the standard output. There is a similar
command, echo, which also writes text to the standard output, but it isn't as
powerful as printf.
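To give a feel for what "formatted output" means, here is a small sketch. The names and numbers are made up for illustration.

```shell
#!/bin/sh
# %s is replaced by a string and %d by a whole number, much as in C
printf "%s is %d years old\n" Fred 42

# A width pads the value: %-6s left-justifies in 6 columns,
# %3d right-justifies in 3 columns
printf "%-6s|%3d|\n" ab 7
```

Each % placeholder in the first argument is filled in, in order, by the arguments which follow it.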
The third part is the comment. In sh, anything between a # and the end of the line is considered to be a comment and is ignored by the interpreter.
One of the features of Unix scripting is that any command you can use in
Unix is valid code in a shell script. All of the commands you have learnt in
Unix can be used in a shell script, for example:
    #!/bin/sh
    ls -alp | more    # Displays the contents of the current directory, and displays them one page at a time
Try writing this code into a file and executing it. Compare the output from this program with the output produced when you type ls -alp | more into the command line. You should see that they produce the same output.
One thing to note in the above example is the use of the pipe (|). All of the
usual Unix redirects are available for use in a shell script. As an example of
how these might be used, here is a program which writes out the login names of
everyone connected to our machine in lexicographic order.
    #!/bin/sh
    who > /tmp/tempfile    # Writes the names of who is logged in to the file /tmp/tempfile
    sort /tmp/tempfile     # Sorts the file and writes it to standard output
    rm /tmp/tempfile       # Deletes the temporary file
This program runs the 'who' command, which displays a list of everyone logged
in, and redirects its output to a temporary file. Note that the temporary file
is put in the /tmp directory; it is common practice to put any temporary files
in this directory when writing a Unix script. The next line writes a sorted
version of this file to standard output, and the final line removes the
temporary file which was created by 'who'. We could write this without needing
to create a temporary file by piping the output from who into sort, as shown
below.
    #!/bin/sh
    who | sort    # Displays logged in users in lexicographic order
If you want a command to continue onto the next line, you can put a '\' at
the end of the line. You can use this as many times as you want, and it is
possible to have commands which run over many lines.
    #!/bin/sh
    ls \
      -alp
One final thing to remember is that if you don't want error messages from the
commands to appear on the screen, you can close the error stream by writing
2>&- after the command (redirecting it to the null device with 2>/dev/null has
a similar effect and is the more common idiom). This will stop error messages
from the command from reaching the terminal. Take, for example, this script
which removes any temporary files which may have been created by a text editor.
Temporary filenames generally end with a ~, so the script will delete anything
ending in ~, but if there are no temporary files, we don't want to display an
error message; we should just quit normally.
    #!/bin/sh
    rm *~ 2>&-    # The error stream is closed, i.e. error messages are not displayed
Try running this script with and without the redirect, and notice the difference when there is nothing to remove.
We will take a closer look at redirecting input and output later on.
I'm sure that by now you are wondering how you can declare a variable in a
shell script. Creating a variable in a shell script is very easy, much easier
than in C or Java: all you need to do is write VARIABLENAME=VALUE. (Note that
there are no spaces on either side of the equals sign. If you put in a space,
the shell will treat the line as a command to run rather than as an
assignment.) To access a variable, put a dollar sign, $, before the variable
name, e.g. $VARIABLENAME.
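Variables in shell scripts operate in a very different way to variables in other computer languages. Some of the things you must remember are:
- You do not need to declare a variable before you use it; assigning a value to it creates it.
- Variables have no type. Every value is stored as a string.
- A variable which has never been given a value is replaced with an empty string when it is accessed.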
Here we have the example from the previous page, altered to include a
variable.
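    #!/bin/sh
    TMPFILE=/tmp/tempfile
    who > $TMPFILE    # Writes the names of who is logged in to the file /tmp/tempfile
    sort $TMPFILE     # Sorts the file and writes it to standard output
    rm $TMPFILE       # Deletes the temporary file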
When the shell gets to a line in this program with '$TMPFILE' in it, it
replaces '$TMPFILE' with '/tmp/tempfile', and then executes the command.
Hopefully you can see that if we replace '$TMPFILE' with '/tmp/tempfile' on each
line, the same three commands are executed as on the previous page. One
important thing to remember about variables is that they can be used anywhere in
a command. For example, a script which simply executes the who command might
look like this:
    #!/bin/sh
    COMMAND=who
    $COMMAND
When accessing variables, you can optionally put the variable name in { },
for example, the code below runs the ls command, but does so in a strange way.
If you can't see how putting the variable name in braces would be useful, don't
worry, its purpose will become clear later in the tutorial.
    #!/bin/sh
    COMMAND=l
    ${COMMAND}s
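One situation where the braces make a difference is when a variable is followed directly by characters which could be part of a variable name. The sketch below uses the made-up names FILE and FILE_old to show this.

```shell
#!/bin/sh
FILE=report    # made-up value for illustration

# With braces, the shell knows that the variable name ends at FILE
WITH_BRACES="${FILE}_old"

# Without braces, the shell looks for a variable called FILE_old, which has
# never been set, so it is replaced with nothing
WITHOUT_BRACES="$FILE_old"

printf "with braces:    [%s]\n" "$WITH_BRACES"
printf "without braces: [%s]\n" "$WITHOUT_BRACES"
```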
One of the more interesting things you can do with variables is to assign
the output of a command to a variable. This is done by enclosing the expression
in backticks (`). The backtick is the key at the top left of your keyboard,
next to the number 1. When a script reaches a line containing a command inside
backticks, it executes this command, replaces that command with its output, and
then executes the line as per normal. As with most aspects of Unix scripting,
this is best explained with an example.
    #!/bin/sh
    OUTPUT=`echo HELLO`
    printf "$OUTPUT\n"
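This example executes in the following order:
1. The command inside the backticks, echo HELLO, is executed.
2. The backticked expression is replaced with its output, so the line becomes OUTPUT=HELLO, and this assignment is executed.
3. On the next line, $OUTPUT is replaced with HELLO, and printf writes HELLO to standard output.
This was a fairly long-winded method of getting something written to the screen, but it does demonstrate how backticks work. Later we will look at more interesting and useful ways of using backticks.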
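As a slightly more practical sketch, backticks can capture the output of a whole pipeline. The commands tr and wc are standard Unix tools; the text fed to them is made up for illustration.

```shell
#!/bin/sh
# tr converts lower case letters to upper case; its output is stored in SHOUT
SHOUT=`echo hello | tr 'a-z' 'A-Z'`
printf "%s\n" "$SHOUT"

# wc -w counts words; the count is stored in WORDS
WORDS=`echo one two three | wc -w`
printf "There are %s words in the list\n" $WORDS
```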
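When you start a Unix script, there are already a number of special
variables which you can use. The first ones we will look at are the ones
relating to command line arguments. The important variables are:
- $0: the name of the script, as it was typed on the command line.
- $1, $2, ... $9: the first to ninth command line arguments.
- $#: the number of command line arguments.
- $*: all of the command line arguments.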
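    #!/bin/sh
    # This script requires at least two command line arguments to run properly
    printf "The script's name is $0\n"
    printf "The first argument is $1\n"
    printf "The second argument is $2\n"
    printf "The number of arguments is $#\n"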
Type in and execute this script, with different numbers of command line options.
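If you would like to experiment without retyping a command line each time, the sketch below uses set --, which replaces the command line arguments of the running script. Here it imitates running the script with the made-up arguments alpha, beta and gamma.

```shell
#!/bin/sh
# Pretend the script was started with three command line arguments
set -- alpha beta gamma

SUMMARY="count=$# first=$1 second=$2"
printf "%s\n" "$SUMMARY"

ALL="$*"    # every argument, joined together with spaces
printf "%s\n" "$ALL"
```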
Another important special variable holds the Process ID (PID). In Unix, each
process, or running program, is allocated a PID which is unique among the
processes currently running (traditionally a number between 1 and 32767,
although many modern systems allow larger values). You can find out the value
of the PID from the $$ variable.
    #!/bin/sh
    printf "The Process ID is $$\n"
Because the PID is unique, it is often used to help name temporary files.
Here we have the example given in the Using Variables tutorial, where we used a
temporary file.
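    #!/bin/sh
    TMPFILE=/tmp/tempfile
    who > $TMPFILE    # Writes the names of who is logged in to the file /tmp/tempfile
    sort $TMPFILE     # Sorts the file and writes it to standard output
    rm $TMPFILE       # Deletes the temporary file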
The problem with this is that if two copies of this script are run at once,
they will both try to access the same temporary file, and errors could
occur. A way to avoid this problem would be to ensure that each
invocation of the script gave its temporary file a unique name. This can be done
using the Process ID as shown below:
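    #!/bin/sh
    TMPFILE=/tmp/tempfile-$$    # This is a unique filename
    who > $TMPFILE              # Writes the names of who is logged in to the temporary file
    sort $TMPFILE               # Sorts the file and writes it to standard output
    rm $TMPFILE                 # Deletes the temporary file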
The final variable which you should know about is $?. This contains the exit
status of the last command executed. When a program finishes, it passes a number
back to the shell telling it how it exited. Generally, if the program finishes
normally, the program will return a value of 0, and if the program ended due to
an error, it will return a non-zero value. The example below uses cat to
display the file $1, and then writes out the exit status of cat.
    #!/bin/sh
    # Requires at least 1 command line argument
    cat $1
    EXIT_STAT=$?    # Save the exit status of cat before another command overwrites $?
    printf "\n\ncat had an exit status of $EXIT_STAT\n"
Try running this script with different command line arguments. If the file exists and cat can display it, you will find that the exit status is 0, but if you give it a file it can't read, or one which doesn't exist, the exit status will be something else.
Almost every programming language has a feature that allows you to
execute code based on the truth of a condition, and shell scripting is no
different. The if statement allows you to check if a given expression is true,
and if it is, execute a certain piece of code. The syntax for the if statement
is as follows:
    if EXPRESSION ; then
        DO THIS
        ...
    elif EXPRESSION ; then    # else if is optional, and there can be more than 1 of them
        DO THIS
        ...
    elif ...
    else                      # else is optional, only one per if statement
        DO THIS
        ...
    fi                        # fi = end of the if statement (fi is if backwards)
The first important part of the if statement is the expression. The
expression is any valid Unix command, and the expression is considered to be
true if and only if it has an exit status of zero. The if statement can be a
good way to check if a program has worked. An example of how this might work is:
    #!/bin/sh
    if cat $1 2>&- ; then
        printf "\nThe file was displayed successfully\n"
    else
        printf "The file could not be displayed\n"
    fi
This program will attempt to display the file given to it as an argument. If it was displayed successfully, it will write a message to standard output, and if it failed, a different message is written.
While it is nice to see if a program completed successfully, often when we write an if statement we want to perform some sort of check on the value of a variable. The test command provides us with this ability. We can run the test command by typing the word test but, to make the code a little more readable, we can just put the thing we want to test in square brackets (with a space after the [ and before the ]). In other words, the following commands are equivalent:
    test EXPRESSION
    [ EXPRESSION ]
The test command will return 0 if the test evaluated to true, and it will return a non-zero value if the test evaluated to false, making it very useful as the expression part of an if statement.
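There are many different ways to use the test command (run man test to see them all), but some of the more useful ones include:
- [ -r FILE ]: true if FILE exists and can be read. -w and -x check whether it can be written to or executed, and -f and -d check whether it is a regular file or a directory.
- [ STRING1 = STRING2 ]: true if the two strings are the same. Use != to check that they differ.
- [ -z STRING ]: true if STRING is empty. -n is true if it is not empty.
- [ NUM1 -eq NUM2 ]: true if the two numbers are equal. -ne, -lt, -le, -gt and -ge mean not equal, less than, less than or equal, greater than, and greater than or equal.
- [ ! EXPRESSION ]: true if EXPRESSION is false.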
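    #!/bin/sh
    # This script will display the file given as the first argument.
    if [ $# -eq 0 ] ; then
        # If no command line arguments were given...
        printf "No file was given\n"
    elif [ $# -gt 1 ] ; then
        printf "Too many arguments were given\n"
    elif [ -r $1 ] ; then
        # If the file to be displayed is readable
        cat $1
    else
        printf "$1 cannot be read\n"
    fi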
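Here is a small self-contained sketch of test at work. The value of N is made up for illustration, and the second check relies only on the fact that / always exists and is a directory.

```shell
#!/bin/sh
N=8    # made-up value for illustration

# Numeric comparisons use -lt, -eq, -gt and friends
if [ $N -lt 0 ] ; then
    SIGN=negative
elif [ $N -eq 0 ] ; then
    SIGN=zero
else
    SIGN=positive
fi
printf "%s is %s\n" "$N" "$SIGN"

# Writing test in full does the same job as the square brackets
if test -d / ; then
    ROOT_KIND=directory
else
    ROOT_KIND=other
fi
printf "/ is a %s\n" "$ROOT_KIND"
```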
At this stage, we have learnt how we can treat variables as strings, but
often we need to treat a variable like it is a number. The expr command allows
us to evaluate integer arithmetic expressions, and by using backticks, we
can store the result. For example:
    #!/bin/sh
    NUM=5
    printf "$NUM + 1 = `expr $NUM + 1`\n"    # Prints NUM + 1 to the screen
    NUM=`expr $NUM \* 2`                     # NUM equals 2 times the old value of NUM (must put a backslash before the *)
Before we continue, you should make a note of the '\*' in the last command. A star on its own in a command line is known as the 'glob' character, and if this is in a command, it will be replaced by the names of all of the visible files in the current directory. If you want to see how this is done, write a script which simply writes out the number of arguments ($#) to the screen and run this script with * as the argument. You will see that the number of arguments isn't necessarily 1; it is the number of visible files in the directory. To stop the shell from replacing the * with filenames, we put the escape character \ before it. You will also need to put a \ before question marks, opening and closing brackets, less-than and greater-than signs, dollar signs, etc. if you want them to be treated as normal characters.
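The effect of the backslash can be seen in the sketch below. It assumes a scratch directory made with mktemp -d and three made-up file names, and it uses set -- to stand in for passing arguments to a script.

```shell
#!/bin/sh
START_DIR=`pwd`
WORKDIR=`mktemp -d`    # hypothetical scratch directory holding exactly three files
cd "$WORKDIR"
touch a.txt b.txt c.txt

set -- *     # the shell replaces * with a.txt b.txt c.txt
UNESCAPED=$#
set -- \*    # the backslash makes * an ordinary character
ESCAPED=$#

printf "unescaped: %s argument(s), escaped: %s argument(s)\n" "$UNESCAPED" "$ESCAPED"

cd "$START_DIR"
rm -r "$WORKDIR"
```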
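Now that that's out of the way, let's look at the above example line by line:
- NUM=5 creates the variable NUM and gives it the value 5.
- The printf line runs expr 5 + 1 inside the backticks and writes 5 + 1 = 6 to the screen.
- The last line runs expr 5 \* 2 and stores the result, 10, back in NUM.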
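You can also use the expr command to compare two expressions. If the comparison
is true, expr writes 1 to standard output and has an exit status of 0; otherwise
it writes 0 and has an exit status of 1. Because of its exit status, it can be
used as the expression in an if statement.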
    #!/bin/sh
    # Compares (5 * 6) + 3 to 5 + (6 * 3)
    if expr 5 \* 6 + 3 \> 5 + 6 \* 3 ; then
        printf "(5 * 6) + 3 is greater than 5 + (6 * 3)\n"
    fi
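It is easy to confuse what expr prints with its exit status, so here is a sketch which looks at both. For a true comparison, expr prints 1 but exits with status 0, and the exit status is what if checks. The variable names are made up for illustration.

```shell
#!/bin/sh
LEFT=`expr 5 \* 6 + 3`     # multiplication is done before addition: 33
RIGHT=`expr 5 + 6 \* 3`    # 23

# A true comparison makes expr print 1
TRUE_OUTPUT=`expr $LEFT \> $RIGHT`

# The if statement looks at the exit status; > /dev/null hides the printed 1
if expr $LEFT \> $RIGHT > /dev/null ; then
    VERDICT="greater"
else
    VERDICT="not greater"
fi
printf "%s > %s: expr printed %s, verdict: %s\n" "$LEFT" "$RIGHT" "$TRUE_OUTPUT" "$VERDICT"
```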
The only major topic we haven't yet covered in shell scripts is looping.
The first type we will look at is the while loop. Those of you who have
programmed in Java or C will already be familiar with this type of loop. When a
while loop is reached, the condition is tested. If the condition is false, we
continue running the script from below the end of the loop; however, if it
evaluates to true, the code in the body of the loop is executed. Once the body
has been executed, we return to the top of the loop and test the condition
again. The while loop is useful in situations where we must execute a piece of
code a certain number of times. The syntax for the while loop is as follows:
    while EXPRESSION ; do
        BODY OF LOOP
        ...
    done
Consider the following piece of code which writes out the numbers from 1 to
$1 to the terminal.
    #!/bin/sh
    COUNT=1
    while [ $COUNT -le $1 ]; do
        printf "$COUNT\n"
        COUNT=`expr $COUNT + 1`
    done
    printf "FINISHED\n"
On the first iteration through the loop, COUNT has a value of 1. If COUNT is greater than $1, the test will evaluate to false, the loop is finished and the program continues from after the done, but if it is less than or equal to $1, we execute the two commands in the body of the loop and try the test again.
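The same counting pattern can be used to do some work on each pass. As a sketch, the loop below adds up the numbers from 1 to LIMIT, where LIMIT is a made-up value standing in for a command line argument.

```shell
#!/bin/sh
LIMIT=5    # made-up upper bound for illustration
COUNT=1
SUM=0
while [ $COUNT -le $LIMIT ] ; do
    SUM=`expr $SUM + $COUNT`    # add the current number to the running total
    COUNT=`expr $COUNT + 1`     # move on to the next number
done
printf "The sum of 1 to %s is %s\n" "$LIMIT" "$SUM"
```

When the loop ends, COUNT is one more than LIMIT, which is what made the test fail.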
The other way to write a loop in a shell script is to use a for loop. The
for loop in a shell script allows you to loop over a list of items, and it works
in a completely different way to the for loop in languages like Java and C. The
syntax for the for loop is as follows:
    for VARIABLE in LIST ; do
        BODY OF LOOP
        ...
    done
When a for loop is executed, on the first iteration through the for loop,
VARIABLE is the first item in the list. On the next iteration, VARIABLE is the
second item on the list, on the third iteration it is the third item, etc...,
until the end of the list is reached. When the end of the list is reached, the
loop is finished and we continue executing from after the end of the loop. One
question you may have at this point is "How do I make a list?". In shell
scripts, a list is simply a string of items separated by spaces. For example, in
the code fragment below, NAMES is a list with three elements, where 'John' is
the first element of NAMES, 'Bill' is the second element and 'Fred' is the third.
    NAMES="John Bill Fred"    # A list with three elements
A simple example of how a for loop might be used is shown below.
    #!/bin/sh
    KEYBOARD="Q W E R T Y U I O P A S D F G H J K L Z X C V B N M"    # Letters from a QWERTY keyboard
    COUNT=1
    for LETTER in $KEYBOARD ; do
        printf "Letter $COUNT is [$LETTER]\n"
        COUNT=`expr $COUNT + 1`
    done
In this example, KEYBOARD is a list which contains all of the letters on a QWERTY keyboard. The for loop is used to loop over each item in this list, and while we are inside the loop, the variable LETTER contains the current list item.
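As a second sketch, here is a loop over a list of names which builds up a string and counts the items as it goes. The variables GREETINGS and COUNT are made up for illustration.

```shell
#!/bin/sh
NAMES="John Bill Fred"
COUNT=0
GREETINGS=""
for NAME in $NAMES ; do    # $NAMES is left unquoted so that it is split at the spaces
    GREETINGS="${GREETINGS}Hello $NAME. "
    COUNT=`expr $COUNT + 1`
done
printf "%s\n" "$GREETINGS"
printf "Greeted %s people\n" "$COUNT"
```

Note that writing "$NAMES" in double quotes after the word in would make the loop run only once, with the whole string as a single item.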
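The for loop can be a very powerful tool when used in conjunction with ls. Given a wildcard pattern (which the shell expands before ls runs; it is not a regular expression), ls can provide you with a list of matching filenames. Some examples of how you might use this include:
- ls *.txt lists the files whose names end in .txt.
- ls report?.txt lists files such as report1.txt and reportA.txt, where ? stands for any single character.
- ls [abc]*.c lists the files whose names start with a, b or c and end in .c.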
An example of how you might use this is shown below. This script writes out the
names of all files in the current directory which end in .c (C source files), .a
(C Archive files) and .o (C Object files).
    #!/bin/sh
    printf "C files in the current directory are:\n"
    COUNT=0
    for FILENAME in `ls *.c *.a *.o 2>&-`; do    # If ls gives an error message, suppress it.
        printf "$FILENAME\n"
        COUNT=`expr $COUNT + 1`
    done
    printf "\nCounted $COUNT C files\n"
The final control structure available in a Unix script is the case
statement. This is similar to the switch statement in Java and C, and it allows
you to choose between multiple paths depending upon the value of a variable. The
syntax for a case statement is:
    case VARIABLE in
        OPTION1)
            DO THIS
            ;;
        OPTION2)
            DO THIS
            ;;
        OPTION3)
            ...
    esac    # esac = case spelled backwards
When a case statement is executed, the variable is compared to each of the options in order. If the variable matches one of the options, the code after the option is executed, up until the two semicolons. Once the two semicolons are reached, the case statement is completed and we continue from after esac. If two options are found which match the variable, only the first of these is executed. If none of the options match, no code is executed inside the case statement.
Below is an example of how a case statement can be used. This script, like a
Magic 8-Ball, will produce an answer to a yes or no question. The random number
is generated from the Process ID.
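    #!/bin/sh
    printf "Ask the Magic 8-Ball a yes or no question out loud...\n"
    printf "Shaking the eight ball...\n"
    case `expr $$ % 6` in    # $$ is virtually a random number, $$ mod 6 is a random number from 0 to 5 inclusive
        0) printf "Yes\n" ;;
        1) printf "No\n" ;;
        2) printf "Maybe\n" ;;
        3) printf "Definitely\n" ;;
        4) printf "Definitely not\n" ;;
        5) printf "Ask again later\n" ;;
    esac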
An interesting feature of case statements is that you can use wildcards like
? and * in the options. This next example is an edited version of the script
from 'The for Loop' which, as well as writing out all of the C files, writes a
comment next to them about what type of file they are.
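    #!/bin/sh
    printf "C files in the current directory are:\n"
    COUNT=0
    for FILENAME in `ls *.c *.a *.o 2>&-`; do    # If ls gives an error message, suppress it.
        case $FILENAME in
            *.c) printf "$FILENAME\tC source file\n" ;;
            *.a) printf "$FILENAME\tC archive file\n" ;;
            *.o) printf "$FILENAME\tC object file\n" ;;
        esac
        COUNT=`expr $COUNT + 1`
    done
    printf "\nCounted $COUNT C files\n"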
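To see the wildcard matching on its own, here is a sketch which classifies a single made-up file name. Because * matches anything, an option consisting of just * at the end acts as a default.

```shell
#!/bin/sh
FILENAME=main.c    # made-up file name for illustration
case $FILENAME in
    *.c) KIND="C source file" ;;
    *.o) KIND="C object file" ;;
    ?)   KIND="file with a one character name" ;;
    *)   KIND="unknown type of file" ;;    # matches anything not matched above
esac
printf "%s is a %s\n" "$FILENAME" "$KIND"
```

Try changing FILENAME to x or to notes.txt to see the other options being chosen.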
We have already had a brief look at how we can use pipes and redirect output
in Unix scripts, and how we can suppress error messages using a redirect like
2>&-. Now we will take a deeper look at how streams can be redirected.
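In Unix, each process starts with 3 standard file descriptors. These are:
- 0: standard input (stdin)
- 1: standard output (stdout)
- 2: standard error (stderr)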
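While all of these files default to the user's terminal, it is possible to redirect them to other places. Some ways of doing this include:
- COMMAND > FILE: writes standard output to FILE, overwriting anything already in it.
- COMMAND >> FILE: adds standard output to the end of FILE.
- COMMAND < FILE: reads standard input from FILE.
- COMMAND 2> FILE: writes standard error to FILE.
- COMMAND1 | COMMAND2: uses the standard output of COMMAND1 as the standard input of COMMAND2.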
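A minimal sketch of the >, >> and < redirects is shown here. It assumes a scratch file made with mktemp, and it uses the read command to fetch one line of input into a variable.

```shell
#!/bin/sh
LOGFILE=`mktemp`    # hypothetical scratch file

printf "first line\n"  > "$LOGFILE"     # > creates the file, or overwrites it if it exists
printf "second line\n" >> "$LOGFILE"    # >> adds to the end of the file

read FIRST < "$LOGFILE"                 # < makes read take its input from the file
LINE_COUNT=`wc -l < "$LOGFILE"`         # wc reads the file through standard input

printf "first line read: %s\n" "$FIRST"
printf "lines in file: %s\n" $LINE_COUNT

rm "$LOGFILE"
```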
Below is an example which demonstrates how redirecting input and output can be
used. This is a script which tells the user how many times it has been executed
by keeping a counter in a file.
    #!/bin/sh
    COUNTFILE=$HOME/.countfile    # This file contains the number of times the script has been executed.
    if [ ! -r $COUNTFILE ] ; then
        # If the count file doesn't exist, the script has never been run before,
        # so this is the first time.
        NUM_TIMES=1
    else
        read NUM_TIMES < $COUNTFILE    # Otherwise, read the old count and add 1 to it
        NUM_TIMES=`expr $NUM_TIMES + 1`
    fi
    printf "$NUM_TIMES\n" > $COUNTFILE    # Updating $COUNTFILE
    printf "This script has been executed $NUM_TIMES times\n"
The next example demonstrates how we can use << to redirect input to
come from the script itself. This seems quite strange because it has C code
written directly into a Unix script, but it is valid because the C code is being
used as input to another program.
    #!/bin/sh
    SRC=/tmp/temp-$$.c
    EXE=/tmp/temp-$$
    cat > $SRC << END_C_CODE    # The script itself is used as input to cat, until the line END_C_CODE is reached
    #include <stdio.h>
    int main(void) { printf("Hello World\n"); return 0; }
    END_C_CODE
    gcc -o $EXE $SRC
    $EXE
    rm $EXE $SRC 2>&-
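Some other useful redirection commands include:
- COMMAND 2>&1: sends standard error to wherever standard output is currently going.
- COMMAND >&2: sends standard output to wherever standard error is currently going.
- COMMAND > /dev/null: throws the output away.
- COMMAND 2>&-: closes standard error (>&- closes standard output, and <&- closes standard input).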
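In shell scripts it is possible to use the logical operators &&
and || to group together commands. These logical operators also allow us to
execute commands based on exit values much like an if statement. These logical
operators work as follows:
- COMMAND1 && COMMAND2: COMMAND2 is executed only if COMMAND1 has an exit status of zero (it succeeded).
- COMMAND1 || COMMAND2: COMMAND2 is executed only if COMMAND1 has a non-zero exit status (it failed).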
One use for these commands is to link together conditions in an if statement,
for example:
    #!/bin/sh
    if [ 5 -eq $1 ] || [ $1 -gt 7 ] ; then
        # $1 must be either 5 or greater than 7.
        printf "$1 is either 5 or greater than 7\n"
    fi
But another way to use them is as a replacement for the if statement.
    #!/bin/sh
    # Tells the user if $1 is a readable file
    [ -r $1 ] && printf "$1 is a readable file.\n" || printf "$1 is not a readable file. That's a shame...\n"
If the file $1 is readable, then [ -r $1 ] returns true. This means that the part after the && must be executed. If the file wasn't readable, the test will return false, so the part after the || must be executed. This piece of code behaves like an if-then-else statement, provided that the command after the && succeeds; if that command were to fail, the part after the || would be executed as well.
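The sketch below shows each operator on its own and then the two chained together. The variable names are made up, and /no/such/directory is assumed not to exist.

```shell
#!/bin/sh
# && runs the second command only if the first one succeeds
[ -d / ] && ROOT_IS_DIR=yes

# || runs the second command only if the first one fails
[ -d /no/such/directory ] || MISSING_IS_DIR=no

# Chained together, they read like "if ... then ... else ..."
[ 3 -gt 7 ] && ANSWER="3 is bigger" || ANSWER="7 is bigger"

printf "%s %s %s\n" "$ROOT_IS_DIR" "$MISSING_IS_DIR" "$ANSWER"
```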
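We already know how to create and use variables, but there are still some
aspects of them which we haven't covered. The first thing we will look at is
variable constructs. These provide different ways of accessing a variable. The
constructs we can use are:
- ${VARIABLE:-VALUE}: if VARIABLE exists and isn't empty, use its value; otherwise use VALUE.
- ${VARIABLE:=VALUE}: as above, but VARIABLE is also set to VALUE if it didn't exist or was empty.
- ${VARIABLE:?MESSAGE}: if VARIABLE exists and isn't empty, use its value; otherwise write MESSAGE to standard error and exit the script.
- ${VARIABLE:+VALUE}: if VARIABLE exists and isn't empty, use VALUE; otherwise use nothing.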
These variable constructs can help to make code much shorter and often remove
the need for cumbersome if statements. To see how they work, type in and execute
the following example.
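    #!/bin/sh
    MY_VAR="Value"
    printf "MY_VAR = ${MY_VAR}\n"                # Write out MY_VAR
    printf "NEW_VAR = ${NEW_VAR:-New}\n"         # NEW_VAR doesn't exist yet, should write New
    printf "NEW_VAR = ${NEW_VAR:=Variable}\n"    # Create NEW_VAR
    printf "NEW_VAR = ${NEW_VAR:-New}\n"         # NEW_VAR does exist now, should write Variable
    printf "Checking if MY_VAR exists..."
    : ${MY_VAR:?"MY_VAR does not exist"}         # MY_VAR exists, so the script carries on
    printf " it does\n"
    printf "Checking if SOME_VAR exists..."
    : ${SOME_VAR:?"SOME_VAR does not exist"}     # SOME_VAR doesn't exist, so the script stops with this message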
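For a side-by-side view, the sketch below applies three of the constructs to one made-up variable, COLOUR: ${COLOUR:-red} substitutes a default, ${COLOUR:=blue} substitutes a default and stores it, and ${COLOUR:+set} substitutes an alternative only when the variable has a value.

```shell
#!/bin/sh
unset COLOUR    # make sure the variable does not exist yet

FIRST="${COLOUR:-red}"                  # COLOUR is unset, so the default red is used...
AFTER_FIRST="${COLOUR:-still unset}"    # ...but COLOUR itself was not changed

SECOND="${COLOUR:=blue}"                # := uses the default and also stores it in COLOUR
THIRD="${COLOUR:-red}"                  # COLOUR now exists, so the default is ignored
FOURTH="${COLOUR:+set}"                 # :+ gives the alternative only when COLOUR has a value

printf "%s, %s, %s, %s, %s\n" "$FIRST" "$AFTER_FIRST" "$SECOND" "$THIRD" "$FOURTH"
```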
Another interesting feature of variables is the ability to export them. When
a variable is exported, any process started by the shell in which the script is
running can also see the variable. Consider the following two scripts:
    #!/bin/sh
    # script1
    VAR=abc
    ./script2

    #!/bin/sh
    # script2
    printf "$VAR\n"
When we run script1, it creates a variable VAR and then executes script2.
Even though script2 is called from inside script1, it cannot read the variables
in script1, but if we alter script1 so that it exports the variable, script2
will be able to see it, and we will get abc written to standard output.
    #!/bin/sh
    # script1
    VAR=abc
    export VAR
    ./script2

    #!/bin/sh
    # script2
    printf "$VAR\n"
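The same effect can be seen in a single file. In this sketch, sh -c starts a new shell process which stands in for script2, and TUTORIAL_VAR is a made-up variable name.

```shell
#!/bin/sh
unset TUTORIAL_VAR    # hypothetical variable name; start from a clean slate
TUTORIAL_VAR=abc

# The child process cannot see the variable before it is exported
BEFORE=`sh -c 'printf "%s" "$TUTORIAL_VAR"'`

export TUTORIAL_VAR
# After the export, the child process receives a copy of it
AFTER=`sh -c 'printf "%s" "$TUTORIAL_VAR"'`

printf "before export: [%s], after export: [%s]\n" "$BEFORE" "$AFTER"
```

The single quotes around the command given to sh -c stop the outer script from replacing $TUTORIAL_VAR itself, so it is the child process which looks the variable up.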
The final thing we will look at in variables is the difference between single
and double quotes. When a variable is written inside double quotes, it is
replaced with its value, but when a variable is written in single quotes, it is
not replaced. To get an idea of how this works, type in and execute this script.
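    #!/bin/sh
    MY_VAR="Value"
    printf "Variable replacement...\n\n"
    printf "Double quotes: $MY_VAR\n"
    printf 'Single quotes: $MY_VAR\n'
    printf "\n\nUsing backticks...\n\n"
    printf "Double quotes: `echo hello`\n"
    printf 'Single quotes: `echo hello`\n'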
You will see that when something is written in single quotes, the shell doesn't replace commands in backticks or variables. It will only do the replacement when they are unquoted or in double quotes.
There are a number of commands and built-in functions which often prove
useful when writing a Unix script. Here we will look at a few of these commands,
describe what they do, and give examples of how they can be used.
    #!/bin/sh
    printf "What is your name? "
    read NAME    # Reads the next line from standard input and sets NAME to be this value.
    printf "Hello $NAME\n"
    #!/bin/sh
    printf "Enter a java file's path (must end in .java): "
    read NAME
    printf "The class in this file is `basename $NAME .java`\n"
    #!/bin/sh
    cut -d : -f 1,5 /etc/passwd
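Because the contents of /etc/passwd differ from machine to machine, here is a sketch which runs cut and basename on made-up input, so the results are predictable. The line of text and the path are invented for illustration.

```shell
#!/bin/sh
# A made-up line in the same colon-separated layout as /etc/passwd
LINE="fred:x:1001:1001:Fred Bloggs:/home/fred:/bin/sh"

# -d : splits the line at the colons, and -f 1,5 keeps the first and fifth fields
NAME_AND_FULLNAME=`printf "%s\n" "$LINE" | cut -d : -f 1,5`

# basename strips the directory and, if a second argument is given, that suffix
CLASS=`basename /home/fred/Hello.java .java`

printf "%s\n%s\n" "$NAME_AND_FULLNAME" "$CLASS"
```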