Due Monday, December 3, at the start of class.
Run a simulation program many times under a few different conditions, then analyze the raw data using R.
At heart, this assignment uses the same queue simulator from homework #8. But this time, we want to run different experimental conditions and then run a bunch of statistics on the raw data, all by running a single command! That is, we will create a simple workflow to handle all of the work.
This is information about the script that I wrote. You do not need to write this program!
For background on the queue simulator itself, see homework #8. The script for this assignment is a little bit different:
--verbose option

Here is the new usage information (note the new script name):
    Usage: queue_sim_loop [options] SERVERS CLIENTS-PER-HOUR COUNT

    Options:
      --version         show program's version number and exit
      -h, --help        show this help message and exit
      -d, --departures  Allow clients to leave queue after waiting too long
Also, I wrote a simple (simplistic?) R script that analyzes the raw data. To learn more about R, visit the R Project website.
There are two main tasks you need to complete:
In this part, you will create and test four separate submit files that correspond to the four nodes of our workflow. Because each submit file is similar to ones you have written before, I am providing minimal guidance here.
To get started, follow these steps:
    mkdir Homework_11_First_Last
    cd Homework_11_First_Last

That is, name your directory according to the email submission rules.
    tar xzf homework-11.tar.gz

You will have two files:

queue_sim_loop: The new simulation program
qsl-analyze.r: The R analysis script

Now, write and test HTCondor submit files for each step of the process. A few details to note:
To run the R script by hand (say, on submit-368, except it will not work there, because R is not installed on our submit machine), you would use this command:

    qsl-analyze.r

That is, qsl-analyze.r is the program to run and it takes no arguments. Also, see below about input filenames. How do you tell HTCondor about the input files? Note that the R script writes to standard output and it creates a separate PDF file with a data plot.
qsl.1-20-2000.out
qsl.1-25-2000.out
qsl.1-30-2000.out
The numbers in each filename correspond to the experimental condition. They are formatted as S-CC-TTTT, where S is the number of servers, CC is the number of clients per hour, and TTTT is the number of trials. Either make your simulation jobs create output files with those names, or else change the R script to expect the filenames you choose.
.log file, but suit yourself.
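If the S-CC-TTTT naming convention described above is easier to see in code, here is a minimal Python sketch of it. This is not part of the provided scripts: the names Condition, output_name, and parse_name are made up for illustration, and the three conditions are simply read off the filenames listed earlier.

```python
from typing import NamedTuple


class Condition(NamedTuple):
    """One experimental condition, as encoded in an output filename."""
    servers: int           # S
    clients_per_hour: int  # CC
    trials: int            # TTTT


def output_name(cond: Condition) -> str:
    # Hypothetical helper: build "qsl.S-CC-TTTT.out" from a condition.
    return f"qsl.{cond.servers}-{cond.clients_per_hour}-{cond.trials}.out"


def parse_name(filename: str) -> Condition:
    # Inverse of output_name: split "qsl.S-CC-TTTT.out" back into numbers.
    prefix, numbers, suffix = filename.split(".")
    if prefix != "qsl" or suffix != "out":
        raise ValueError(f"unexpected filename: {filename}")
    servers, clients, trials = (int(part) for part in numbers.split("-"))
    return Condition(servers, clients, trials)


# The three conditions implied by the filenames the R script expects.
CONDITIONS = [
    Condition(1, 20, 2000),
    Condition(1, 25, 2000),
    Condition(1, 30, 2000),
]

if __name__ == "__main__":
    for cond in CONDITIONS:
        print(output_name(cond))
```

Run on its own, the script prints the three expected filenames, which is a quick way to check that the names in your submit files match what the R script will look for.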
Make sure that you can successfully run all four jobs before moving on to the next part!
OK, now you have four working submit files. It is time to link them together into a single workflow. Obviously, we will use DAGMan to do this part.
First, draw a picture of the overall workflow. Pencil and paper is OK, no need to hand it in. But make sure you understand what each node does, what its inputs and outputs are, and how the nodes are connected. If you are unsure about this step, write to me!
Then, write the DAGMan submit file itself. It is not very complicated, and if you have done things right up to this point, you will not need to modify your HTCondor submit files from Part I at all.
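If you want to sanity-check your picture before writing anything for DAGMan, a rough Python sketch like the one below can help. It is not DAGMan syntax and not a solution file. It only models a workflow as "each node lists the nodes it depends on" and asks Python's standard graphlib module for a valid run order. The node names, and the assumption that the analysis node waits on all three simulation nodes, are mine for illustration; adjust them to match your own drawing.

```python
from graphlib import TopologicalSorter

# Hypothetical node names. Each node maps to the set of nodes that must
# finish before it can start (assumed shape: three simulations, one analysis).
workflow = {
    "sim_20": set(),
    "sim_25": set(),
    "sim_30": set(),
    "analyze": {"sim_20", "sim_25", "sim_30"},
}


def run_order(graph):
    # Return one valid execution order; raises graphlib.CycleError if the
    # graph contains a cycle (which a DAG must not).
    return list(TopologicalSorter(graph).static_order())


order = run_order(workflow)

if __name__ == "__main__":
    print(order)
```

The simulation nodes can come out in any order relative to each other, since nothing connects them, but the analysis node always comes last. That is the same guarantee you are asking DAGMan to enforce.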
And now, the moment we have been waiting for… Submit the entire workflow, stand back, and wait for the results!
Some ideas for extra learning:
VARS statement. Now, merge your three simulation submit files into one, changing it and the DAGMan submit file to work together.
Do the work yourself, consulting reasonable reference materials as needed. Any resource that provides a complete solution or offers significant material assistance toward a solution is not OK to use. Asking the instructor for help is OK; asking other students for help is not. All standard UW policies concerning student conduct (esp. UWS 14) and information technology apply to this course and assignment.
All homework assignments must be turned in by email! See the email page for more details about formatting, etc.
For this assignment, you must submit several files in order for your assignment to be complete:
Please do not include your output files this time!