Due Monday, November 19, at the start of class.
Run a simulation program many times, each time extracting and saving key data.
By now, we have learned as much Python as we are going to in the class. It is time to put your skills to the test!
Your script will run another scientific application (a silly little queue simulator that I wrote), gather data from it, save that data, and (optionally) plot the results using gnuplot (a free visualization tool available on most Unix/Linux systems).
This is information about the script that I wrote. You will not write this program!
I wrote a very simple simulator of a “single queue system”. That is, the program simulates a location like a bank, where there is a single line of people waiting for one of a group of employees to help them. Using generic terminology, we call the line “a queue”, the people “clients”, and the employees “servers”. Thus, when a server becomes free, they help the next client waiting in the queue.
Clients arrive in pseudo-random fashion, based on an average arrival rate (e.g., 20 per hour). Upon arrival, a client enters the queue at the end and waits for an available server. When a server is available, they help the client at the front of the queue. Once being served, each client takes a pseudo-random amount of time (normal distribution, mean=300s, s.d.=100s) with the server. Optionally, the simulator can allow a client to get mad after waiting in line for too long (normal distribution again, mean=600s, s.d.=30s), and leave without being helped. The program simulates one business day (eight hours), and then displays various statistics.
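To make the random quantities above concrete, here is a minimal sketch of how such values could be drawn in Python. This is not the simulator's actual code. The function names are hypothetical, the clamping of negative samples to zero is my own choice, and the exponential gap between arrivals is an assumption (the description only says arrivals are pseudo-random around an average rate).

```python
import random


def service_time(rng):
    """Seconds a client spends with a server: normal, mean 300 s, s.d. 100 s."""
    # Assumption: clamp at zero so a rare negative sample is not used.
    return max(0.0, rng.gauss(300, 100))


def patience(rng):
    """Seconds a client will wait before leaving: normal, mean 600 s, s.d. 30 s."""
    return max(0.0, rng.gauss(600, 30))


def next_arrival_gap(rng, clients_per_hour):
    """Seconds until the next client arrives."""
    # Assumption: exponentially distributed gaps, one common way to model
    # arrivals at a given average rate. The real simulator may differ.
    return rng.expovariate(clients_per_hour / 3600.0)


rng = random.Random(368)  # fixed seed so repeated runs give the same numbers
gaps = [next_arrival_gap(rng, 20) for _ in range(1000)]
print("Average gap between arrivals: %.1fs" % (sum(gaps) / len(gaps)))
```

At 20 clients per hour, the average gap should come out near 3600/20 = 180 seconds.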
The simulator accepts some command-line arguments to affect its behavior. Here is its built-in help:
Usage: queue_simulator [options] SERVERS CLIENTS-PER-HOUR

Options:
  --version         show program's version number and exit
  -h, --help        show this help message and exit
  -v, --verbose     Turn on lots of extra debugging output
  -d, --departures  Allow clients to leave queue after waiting too long
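There are two or three main tasks you need to accomplish:
1. Run the simulator yourself
2. Write a script that runs the simulator many times and saves the data
3. (Optional) Run gnuplot on the saved data
The first task is very easy. First, download the simulator. Once downloaded, you will need to unpack the actual program. Use this command: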
tar xzf queue-simulator-0.2.tar.gz
You will get a single Python script named “queue_simulator”. Take a look through it, if you like. It is reasonably straightforward Python, I hope.
Running the program is very easy. Here is how to get its built-in help:
./queue_simulator -h
A typical run (2 servers, 24 clients/hour on average, clients allowed to leave early) looks like this:
./queue_simulator --departures 2 24
Here is sample output from the command above:
Run Conditions: 2 servers, 24 clients/hour (average)
Total clients served: 208
Total clients who waited: 128 (61.5%)
Total clients who bailed: 3
Total wait time: 22960s
Amortized wait time: 110.38s
Mean wait time: 179.38s
What do the two average wait times mean? The “Amortized wait time” is the average wait time over all clients who were served, including those who did not wait at all (i.e., total wait time divided by total clients served); one could think of it as the expected wait time of a client coming in. The “Mean wait time” is the average wait time of just those clients who had to wait at all (i.e., total wait time divided by total clients who waited). Note that the wait times of clients who left early (“bailed”) are not included in the total wait time; that was an arbitrary implementation decision on my part.
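As a quick check of those two definitions, this short sketch recomputes both averages from the totals in the sample output above. The variable names are mine; only the three totals come from the sample run.

```python
# Totals copied from the sample run shown above.
total_wait = 22960   # seconds
served = 208         # total clients served
waited = 128         # total clients who waited

# float() keeps the division exact under both Python 2 and Python 3.
amortized = float(total_wait) / served   # averaged over every client served
mean = float(total_wait) / waited        # averaged over clients who waited

print("Amortized wait time: %.2fs" % amortized)
print("Mean wait time: %.2fs" % mean)
```

Both results agree with the figures the simulator reported for that run.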
Play around with this script! Try turning on verbose mode to see a detailed accounting of when clients arrive, are served, etc. Try different values for the number of servers and arrival rate. Try allowing and not allowing early departures.
OK, that was my script, now it is your turn!
Write a script that runs queue_simulator a given number of times. Each time the simulator is run, pass it the same arguments. Because the simulator uses random variables to drive the process, the goal here is to gather data from many runs under the same conditions.
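Your script must accept three command-line values of its own: the number of servers, the average number of clients per hour, and the number of times to run the simulation, in that order.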
For example, to run the simulator 1000 times with 2 servers, 24 clients/hour (average), you would run your script like this:
python homework_08.py 2 24 1000
Obviously, the simulation should be run the given number of times, with the servers and rate passed to the simulation each time. You must decide whether to include the “--departures” option to the simulation; either include or omit it right in your script, as you wish.
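Here is one possible sketch of the argument handling and of a single simulator run. It is only a starting point, not a required design. The function names and the USE_DEPARTURES switch are hypothetical, and I am assuming that all three values are positive integers and that queue_simulator sits in the current directory.

```python
import subprocess
import sys

SIMULATOR = "./queue_simulator"  # assumption: unpacked into the current directory
USE_DEPARTURES = True            # your choice: include or omit "--departures"


def parse_args(argv):
    """Return (servers, rate, runs) from a list such as ['2', '24', '1000']."""
    if len(argv) != 3:
        raise ValueError("expected SERVERS CLIENTS-PER-HOUR RUNS")
    servers, rate, runs = [int(arg) for arg in argv]
    if servers < 1 or rate < 1 or runs < 1:
        raise ValueError("all three values must be positive integers")
    return servers, rate, runs


def build_command(servers, rate):
    """Build the argument list for one simulator run."""
    command = [SIMULATOR]
    if USE_DEPARTURES:
        command.append("--departures")
    command.extend([str(servers), str(rate)])
    return command


def run_once(command):
    """Run a command once and return its standard output as text.

    subprocess.check_output raises CalledProcessError if the command
    exits with a non-zero status.
    """
    return subprocess.check_output(command, universal_newlines=True)
```

A main program would call parse_args(sys.argv[1:]) once, build the command once, and then call run_once in a loop for the requested number of runs.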
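After each simulation run, your script must use the standard output from the simulator and extract the following three or four values (depending on whether you used the “--departures” option): total clients served, total clients who waited, total clients who bailed (only with “--departures”), and total wait time.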
Extract only the values, not the labels, units, or anything else. We want to be able to give this data as input to a graphing program, and it does not like extraneous text.
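One way to pull out just the numbers is a regular expression per label. The sketch below is a suggestion, not the only approach. It assumes the labels appear exactly as in the sample output shown earlier, and the names LABELS and extract_values are hypothetical.

```python
import re

# Labels as they appear in the sample simulator output.
LABELS = [
    "Total clients served",
    "Total clients who waited",
    "Total clients who bailed",
    "Total wait time",
]


def extract_values(output):
    """Return the leading integer after each label found, as strings.

    Labels that are missing from the output are skipped, so the result
    may be shorter than LABELS.
    """
    values = []
    for label in LABELS:
        pattern = r"^[ \t]*" + re.escape(label) + r":\s*(\d+)"
        match = re.search(pattern, output, re.MULTILINE)
        if match:
            values.append(match.group(1))
    return values


sample = """Run Conditions: 2 servers, 24 clients/hour (average)
Total clients served: 208
Total clients who waited: 128 (61.5%)
Total clients who bailed: 3
Total wait time: 22960s
Amortized wait time: 110.38s
Mean wait time: 179.38s
"""
print(" ".join(extract_values(sample)))
```

Because the pattern captures only the digits right after the colon, the percentage and the trailing "s" unit are left behind.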
Save the data to a file named “homework-08-output.txt”. It is more efficient if you keep all of the data in memory, and then write it out to the file all at once at the end (but suit yourself). The format of the data is very simple: Just write the three or four values that came from one run on a single line, with each value separated by a single space character. For example, here is a snippet from one of my data files (note that there are four values, because I had the “--departures” option turned on):
311 254 0 50084
313 251 0 48759
328 275 0 54669
304 259 0 47234
324 283 0 58722
318 268 0 49488
319 267 0 53683
307 252 0 46503
307 259 0 46964
289 225 0 41670
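If you collect the values in memory first, writing the file can be a few lines. This is a minimal sketch. The function name write_rows is hypothetical, and it assumes each row is a list holding the three or four values from one run.

```python
def write_rows(rows, path="homework-08-output.txt"):
    """Write one run per line, with values separated by single spaces."""
    with open(path, "w") as handle:
        for row in rows:
            handle.write(" ".join(str(value) for value in row) + "\n")


# Example: two runs taken from the data snippet above.
# write_rows([[311, 254, 0, 50084], [313, 251, 0, 48759]])
```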
Once your script works, try it for a large number of runs. On my machine, 1000 iterations took just over two minutes to run, so it is not too bad.
If you are relatively new to programming, you may want some extra hints for what steps to do and how to organize your code. If you think you need the extra help, I have written a separate page of hints for you.
If you like, and if you are running on a system that has gnuplot installed, try having your script also use gnuplot to produce a graph of the data. Yes, you could run gnuplot on the data file by hand, but the whole point of this class is automating repetitive tasks!
Here is a very simple gnuplot file that will make a scatter plot of the total clients served versus total wait time:
set terminal png size 1024, 768
set output 'homework-08-plot.png'
plot 'homework-08-output.txt' using 1:4 with points
This particular script is very simple. It assumes that the data file has four values per run (change the “1:4” to “1:3” if yours has three values). It creates a PNG-format image file, 1024×768 pixels in size. Everything else uses gnuplot defaults. I am sure that there are ways to make a more attractive plot!
Anyway, save that text to a file called something like homework-08.gnuplot. Then, your script just needs to run a command like this:
gnuplot homework-08.gnuplot
Just let the standard output and error from gnuplot go straight to the terminal window. But do check for and report on errors!
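As a sketch of that error checking, the function below runs a command without capturing its output and reports two kinds of failure: the program could not be started at all, or it exited with a non-zero status. The function name run_plot is hypothetical, and the file name in the usage comment is the one suggested above.

```python
import subprocess
import sys


def run_plot(command):
    """Run a command, leaving its output on the terminal.

    Returns True on success, False if it could not start or failed.
    """
    try:
        # subprocess.call does not capture output, so stdout and stderr
        # from the command go straight to the terminal.
        status = subprocess.call(command)
    except OSError as error:
        # Raised, for example, when the program is not installed.
        sys.stderr.write("Could not start %s: %s\n" % (command[0], error))
        return False
    if status != 0:
        sys.stderr.write("%s exited with status %d\n" % (command[0], status))
        return False
    return True


# Example:
# ok = run_plot(["gnuplot", "homework-08.gnuplot"])
```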
Here are some specific tests to consider, for manual or automated testing:
queue_simulator output? How do you know?
Start your script the right way! Here is a suggestion:
#!/usr/bin/env python
"""Homework for CS 368-4 (2012 Fall)
Assigned on Day 08, 2012-11-15
Written by <Your Name>
"""
Do the work yourself, consulting reasonable reference materials as needed. Any resource that provides a complete solution or offers significant material assistance toward a solution is not OK to use. Asking the instructor for help is OK; asking other students for help is not. All standard UW policies concerning student conduct (esp. UWS 14) and information technology apply to this course and assignment.
All homework assignments must be turned in by email! Attach your Python script to the email as a text .py file. See the email page for more details about formatting, etc.