CS 368-2 (2012 Spring) — Day 8 Homework

Due Tuesday, April 17, at the start of class.

Goal

Run a simulation program many times, each time extracting and saving key data.

By now, we have learned as much Python as we are going to in the class. It is time to put your skills to the test! Your script will run another scientific application (a silly little queue simulator that I wrote), gather data from it, save that data, and (optionally) plot the results using gnuplot (a free visualization tool available on most Unix/Linux systems).

Background Information

This is information about the script that I wrote. You will not write this program!

I wrote a very simple simulator of a “single queue system”. That is, the program simulates a location like a bank, where there is a single line of people waiting for one of a group of employees to help them. Using generic terminology, we call the line “a queue”, the people “clients”, and the employees “servers”. Thus, when a server becomes free, they help the next client waiting in the queue.

Clients arrive in pseudo-random fashion, based on an average arrival rate (e.g., 20 per hour). Upon arrival, a client enters the queue at the end and waits for an available server. When a server is available, they help the client at the front of the queue. Once service begins, each client spends a pseudo-random amount of time with the server (normal distribution, mean=300s, s.d.=100s). Optionally, the simulator can allow a client to get mad after waiting in line for too long (normal distribution again, mean=600s, s.d.=30s) and leave without being helped. The program simulates one business day (eight hours), and then displays various statistics.
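To make those distributions concrete, here is a minimal sketch of how such values could be drawn with Python's standard random module. It is not taken from the simulator itself: the function name draw_client is hypothetical, and the exponential gap between arrivals is an assumption, since the description above only says that arrivals are pseudo-random around an average rate.

```python
import random

SERVICE_MEAN_S, SERVICE_SD_S = 300, 100    # from the description above
PATIENCE_MEAN_S, PATIENCE_SD_S = 600, 30   # from the description above


def draw_client(rng, clients_per_hour):
    """Return (gap_s, service_s, patience_s) for one hypothetical client."""
    # ASSUMPTION: exponential gaps between arrivals; the handout does not say.
    gap_s = rng.expovariate(clients_per_hour / 3600.0)
    # A normal draw can be negative in principle, so clamp it at zero.
    service_s = max(0.0, rng.gauss(SERVICE_MEAN_S, SERVICE_SD_S))
    patience_s = max(0.0, rng.gauss(PATIENCE_MEAN_S, PATIENCE_SD_S))
    return gap_s, service_s, patience_s


rng = random.Random(368)   # fixed seed so repeated runs give the same draws
for _ in range(3):
    print("gap=%.0fs service=%.0fs patience=%.0fs" % draw_client(rng, 20))
```

Because every run draws fresh values like these, two runs with the same arguments will generally produce different statistics.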

The simulator accepts some command-line arguments to affect its behavior. Here is its built-in help:

Usage: queue_simulator [options] SERVERS CLIENTS-PER-HOUR

Options:
  --version         show program's version number and exit
  -h, --help        show this help message and exit
  -v, --verbose     Turn on lots of extra debugging output
  -d, --departures  Allow clients to leave queue after waiting too long

Tasks

There are two or three main tasks you need to accomplish:

  1. Get the simulator and play around with it briefly to learn how it works
  2. Write a Python script to run the simulator many times and save key data from it
  3. [Optional] Make your Python script run gnuplot on the saved data

Part 1: The Simulator

This part is very easy. First, download the simulator. Once downloaded, you will need to unpack the actual program. Use this command:

tar xzf queue-simulator-0.2.tar.gz

You will get a single Python script named “queue_simulator”. Take a look through it, if you like. It is reasonably straightforward Python, I hope.

Running the program is very easy. Here is how to get its built-in help:

./queue_simulator -h

A typical run (2 servers, an average of 24 clients/hour, clients allowed to leave early) looks like this:

./queue_simulator --departures 2 24

Here is sample output from the command above:

Run Conditions: 2 servers, 24 clients/hour (average)

Total clients served:     208
Total clients who waited: 128 (61.5%)
Total clients who bailed: 3
Total wait time:          22960s
Amortized wait time:      110.38s
Mean wait time:           179.38s

What do the two average wait times mean? The “Amortized wait time” is the average wait time over all clients who were served, including those who did not wait at all (i.e., total wait time divided by total clients served); one could think of it as the expected wait time of a client coming in. The “Mean wait time” is the average wait time of just those clients who had to wait at all (i.e., total wait time divided by total clients who waited). Note that the wait times of clients who left early (“bailed”) are not included in the total wait time; that was an arbitrary implementation decision on my part.
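As a quick check of those two definitions, the arithmetic can be redone from the figures in the sample output. This is only a worked example; the variable names are made up for illustration.

```python
# Figures copied from the sample output above.
total_wait_s = 22960
clients_served = 208
clients_waited = 128

# Amortized: spread the total wait over every client who was served.
amortized_wait_s = total_wait_s / float(clients_served)
# Mean: spread the total wait over only the clients who actually waited.
mean_wait_s = total_wait_s / float(clients_waited)
waited_percent = 100.0 * clients_waited / clients_served

print("Amortized wait time: %.2fs" % amortized_wait_s)
print("Mean wait time:      %.2fs" % mean_wait_s)
print("Clients who waited:  %.1f%%" % waited_percent)
```

The printed values agree with the 110.38s, 179.38s, and 61.5% shown in the sample output.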

Play around with this script! Try turning on verbose mode to see a detailed accounting of when clients arrive, are served, etc. Try different values for the number of servers and arrival rate. Try allowing and not allowing early departures.

Part 2: Your Python Script

OK, that was my script; now it is your turn!

Write a script that runs the queue_simulator a given number of times. Each time the simulator is run, pass it the same arguments. Because the simulator uses random variables to drive the process, the goal here is to gather data from many runs under the same conditions.
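One possible way to launch the simulator from Python is the standard subprocess module. The following is a sketch, not a required approach: the names build_command and run_once are hypothetical, and the path ./queue_simulator assumes the simulator sits in the current directory, as in Part 1.

```python
import subprocess


def build_command(servers, clients_per_hour, departures=True):
    """Build the argument list for one simulator run."""
    command = ["./queue_simulator"]   # assumed location of the simulator
    if departures:
        command.append("--departures")
    command.extend([str(servers), str(clients_per_hour)])
    return command


def run_once(command):
    """Run the command and return its standard output as text."""
    process = subprocess.Popen(command, stdout=subprocess.PIPE,
                               universal_newlines=True)
    output, _ = process.communicate()
    if process.returncode != 0:
        raise RuntimeError("%r exited with status %d"
                           % (command, process.returncode))
    return output


print(build_command(2, 24))
```

Calling run_once(build_command(2, 24)) in a loop would then give one block of simulator output per run.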

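Your script must accept three command-line values of its own, in this order:

  1. The number of servers
  2. The average number of clients per hour
  3. The number of times to run the simulator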

For example, to run the simulator 1000 times with 2 servers, 24 clients/hour (average), you would run your script like this:

python homework_08.py 2 24 1000

Obviously, the simulation should be run the given number of times, with the servers and rate passed to the simulation each time. You must decide whether to include the “--departures” option to the simulation; either include or omit it right in your script, as you wish.
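Reading the three values might look something like the sketch below. The function name parse_arguments is hypothetical, the argument order is taken from the example command above, and treating all three values as positive whole numbers is an assumption.

```python
def parse_arguments(argv):
    """Return (servers, clients_per_hour, runs) from a list of strings."""
    if len(argv) != 3:
        raise ValueError("expected 3 arguments: SERVERS CLIENTS-PER-HOUR RUNS")
    # int() raises ValueError itself if any argument is not a whole number.
    servers, clients_per_hour, runs = [int(text) for text in argv]
    if servers < 1 or clients_per_hour < 1 or runs < 1:
        raise ValueError("all three values must be positive")
    return servers, clients_per_hour, runs


# In a real script, the list would come from sys.argv[1:].
print(parse_arguments(["2", "24", "1000"]))
```

Passing the list in as a parameter, instead of reading sys.argv inside the function, makes the function easy to try out by hand.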

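After each simulation run, your script must use the standard output from the simulator and extract the following three or four values (depending on whether you used the “--departures” option):

  1. Total clients served
  2. Total clients who waited (the count only, not the percentage)
  3. Total clients who bailed (only if you used “--departures”)
  4. Total wait time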

Extract only the values, not the labels, units, or anything else. We want to be able to give this data as input to a graphing program, and it does not like extraneous text.
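One way to pull bare numbers out of the simulator's output is a regular expression per label. The sketch below is only an illustration: extract_values and LABELS are hypothetical names, and the choice of labels is inferred from the sample output in Part 1 and from the data file format described in this handout.

```python
import re

# Sample text copied from Part 1; a real script would use the captured output.
SAMPLE_OUTPUT = """Run Conditions: 2 servers, 24 clients/hour (average)

Total clients served:     208
Total clients who waited: 128 (61.5%)
Total clients who bailed: 3
Total wait time:          22960s
Amortized wait time:      110.38s
Mean wait time:           179.38s
"""

# ASSUMPTION: these are the labels of interest when --departures is on.
LABELS = ["Total clients served", "Total clients who waited",
          "Total clients who bailed", "Total wait time"]


def extract_values(output, labels):
    """Return the leading integer that follows each label, in label order."""
    values = []
    for label in labels:
        pattern = r"^%s:\s+(\d+)" % re.escape(label)
        match = re.search(pattern, output, re.MULTILINE)
        if match is None:
            raise ValueError("label not found: %s" % label)
        values.append(int(match.group(1)))
    return values


print(extract_values(SAMPLE_OUTPUT, LABELS))
```

Capturing only the digits after the label drops the trailing “s” and the percentage in parentheses, which is what the graphing program needs.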

Save the data to a file named “homework-08-output.txt”. It is more efficient if you keep all of the data in memory, and then write it out to the file all at once at the end (but suit yourself). The format of the data is very simple: Just write the three or four values that came from one run on a single line, with each value separated by a single space character. For example, here is a snippet from one of my data files (note that there are four values, because I had the “--departures” option turned on):

311 254 0 50084
313 251 0 48759
328 275 0 54669
304 259 0 47234
324 283 0 58722
318 268 0 49488
319 267 0 53683
307 252 0 46503
307 259 0 46964
289 225 0 41670
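Producing lines in that format could be done along these lines. This is a sketch with hypothetical function names (format_rows, save_rows); it assumes each run's values are kept in memory as a list of integers.

```python
def format_rows(rows):
    """Turn a list of per-run value lists into space-separated lines."""
    lines = [" ".join(str(value) for value in row) for row in rows]
    return "".join(line + "\n" for line in lines)


def save_rows(rows, path="homework-08-output.txt"):
    """Write all rows to the output file in one go."""
    with open(path, "w") as output_file:
        output_file.write(format_rows(rows))


# First two rows of the snippet above, used as example data.
print(format_rows([[311, 254, 0, 50084], [313, 251, 0, 48759]]))
```

Keeping the formatting separate from the file writing makes it possible to check the text on screen before anything is saved.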

Once your script works, try it for a large number of runs. On my machine, 1000 iterations took just over two minutes to run, so it is not too bad.

If you are relatively new to programming, you may want some extra hints for what steps to do and how to organize your code. If you think you need the extra help, I have written a separate page of hints for you.

Optional Part 3: Running gnuplot

If you like, and if you are running on a system that has gnuplot installed, try having your script also use gnuplot to produce a graph of the data. Yes, you could run gnuplot on the data file by hand, but the whole point of this class is automating repetitive tasks!

Here is a very simple gnuplot file that will make a scatter plot of the total clients served versus total wait time:

set terminal png size 1024, 768
set output 'homework-08-plot.png'
plot 'homework-08-output.txt' using 1:4 with points

This particular script is very simple. It assumes that the data file has four values per run (change the “1:4” to “1:3” if yours has three values). It creates a PNG-format image file, 1024×768 pixels in size. Everything else uses gnuplot defaults. I am sure that there are ways to make a more attractive plot!

Anyway, save that text to a file called something like homework-08.gnuplot. Then, your script just needs to run a command like this:

gnuplot homework-08.gnuplot

Just let the standard output and error from gnuplot go straight to the terminal window. But do check for and report on errors!
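Checking for errors could look roughly like the sketch below. The function name run_gnuplot is hypothetical; the sketch assumes that gnuplot is on the PATH and that, like most command-line programs, it exits with a nonzero status on failure.

```python
import subprocess
import sys


def run_gnuplot(plot_file="homework-08.gnuplot", program="gnuplot"):
    """Run gnuplot on plot_file; return True on success, False otherwise."""
    try:
        # Output and errors are not captured, so they go to the terminal.
        status = subprocess.call([program, plot_file])
    except OSError as error:
        # Raised when the program cannot be started (e.g., not installed).
        sys.stderr.write("Could not start %s: %s\n" % (program, error))
        return False
    if status != 0:
        sys.stderr.write("%s exited with status %d\n" % (program, status))
        return False
    return True
```

A script could call run_gnuplot() after saving the data file and report a problem if it returns False.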

Reminders

Start your script the right way! Here is a suggestion:

#!/usr/bin/env python

"""Homework for CS 368-2 (2012 Spring)
Assigned on Day 08, 2012-04-12
Written by <Your Name>
"""

Do the work yourself, consulting reasonable reference materials as needed. Any resource that provides a complete solution or offers significant material assistance toward a solution is not OK to use. Asking the instructor for help is OK; asking other students for help is not. All standard UW policies concerning student conduct (esp. UWS 14) and information technology apply to this course and assignment.

Hand In

A printout of your code, on a single sheet of paper if possible (it may not be possible in this case). Be sure to put your own name in the initial comment block of the code. Identifying your work is important, or you may not receive appropriate credit.