CS 368-2 (2012 Spring) — Day 11 Homework

Due Thursday, April 26, at the start of class.

Goal

Run a simulation program many times under a few different conditions, then analyze the raw data using R.

At heart, this assignment uses the same queue simulator from homework #8. But this time, we want to run different experimental conditions and then run a bunch of statistics on the raw data, all by running a single command! That is, we will create a simple workflow to handle all of the work.

Background Information

This is information about the script that I wrote. You do not need to write this program!

For background on the queue simulator itself, see homework #8. The script for this assignment is a little bit different:

- It takes an extra command-line argument, the number of times to run the simulation
- It no longer takes the --verbose option
- Its output is ready-made for input into R

Here is the new usage information (note the new script name):

Usage: queue_sim_loop [options] SERVERS CLIENTS-PER-HOUR COUNT

Options:
  --version         show program's version number and exit
  -h, --help        show this help message and exit
  -d, --departures  Allow clients to leave queue after waiting too long

Also, I wrote a simple (simplistic?) R script that analyzes the raw data. To learn more about R, visit the R Project website.

Tasks

There are two main tasks you need to complete:

- Write submit files for each part of the workflow and make sure they work
- Write a DAGMan submit file for the whole workflow and get it to work

Part I: Individual Jobs

In this part, you will create and test four separate submit files that correspond to the four nodes of our workflow. Because each submit file is similar to ones you have written before, I am providing minimal guidance here.

To get started, follow these steps:

Create and change to a new directory (ultimately, this assignment yields lots of files)
```
mkdir homework-11
cd homework-11
```
Name the directory whatever you like!
Download the new simulator and associated R script
Unpack the scripts:
```
tar xzf homework-11.tar.gz
```
You will have two files:
- queue_sim_loop: The new simulation program
- qsl-analyze.r: The R analysis script

Now, write and test Condor submit files for each step of the process. A few details to note:

Write three separate (but highly similar) submit files to run the simulator. Each submit file should cover a different experimental condition. What is similar in each submit file? What is different? For what it’s worth, my submit files were set up to use 1 server in all cases, and 20, 25, and 30 clients per hour respectively. About 2000 trials per condition seemed to be a good number.
Write one submit file to run the R analysis script. On a machine with R installed, you would run the script from a regular command line with the command below (this will not work on submit-368, because R is not installed on our submit machine):
```
qsl-analyze.r
```
That is, qsl-analyze.r is the program to run and it takes no arguments. Also, see below about input filenames. How do you tell Condor about the input files? Note that the R script writes to standard output and it creates a separate PDF file with a data plot.
The R script is hardcoded to expect three input files:
- qsl.1-20-2000.out
- qsl.1-25-2000.out
- qsl.1-30-2000.out
The numbers in each filename correspond to the experimental condition. They are formatted as S-CC-TTTT, where S is the number of servers, CC is the number of clients per hour, and TTTT is the number of trials. Either make your simulation jobs create output files with those names, or else change the R script to expect the filenames you choose.
It is probably easiest to have all jobs log to the same .log file, but suit yourself.
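
The S-CC-TTTT naming scheme described above can be sketched as a small shell helper. This is only an illustration: the function name `qsl_output_name` is hypothetical and not part of the provided scripts, and the three conditions in the loop are simply the ones mentioned in this assignment.

```shell
#!/bin/sh
# Hypothetical helper (not part of the provided scripts):
# build the output filename for one experimental condition.
# Arguments: servers, clients per hour, trials (the S-CC-TTTT scheme).
qsl_output_name() {
    printf 'qsl.%s-%s-%s.out\n' "$1" "$2" "$3"
}

# Print the names for 1 server, 2000 trials, and 20, 25, 30 clients per hour.
for clients in 20 25 30; do
    qsl_output_name 1 "$clients" 2000
done
```

Comparing these names against the output filenames in your submit files is a quick way to catch a mismatch before the R script fails to find its input.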

Make sure that you can successfully run all four jobs before moving on to the next part!
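
One possible way to check the simulation results before testing the analysis job is a short shell function like the one below. It is a sketch, not a required step: the function name `check_outputs` is hypothetical, and the filenames assume you kept the names that the R script expects.

```shell
#!/bin/sh
# Hypothetical helper: report which of the named files are missing or empty.
# Returns 0 only if every named file exists and is non-empty.
check_outputs() {
    rc=0
    for f in "$@"; do
        if [ ! -s "$f" ]; then
            echo "missing or empty: $f" >&2
            rc=1
        fi
    done
    return "$rc"
}

# Assumes the default filenames expected by qsl-analyze.r.
if check_outputs qsl.1-20-2000.out qsl.1-25-2000.out qsl.1-30-2000.out; then
    echo "all simulation outputs present"
fi
```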

Part II: DAGMan

OK, now you have four working submit files. It is time to link them together into a single workflow. Obviously, we will use DAGMan to do this part.

First, draw a picture of the overall workflow. Pencil and paper is fine, and there is no need to hand it in. But make sure you understand what each node does, what its inputs and outputs are, and how the nodes are connected. If you are unsure about this step, write to me!

Then, write the DAGMan submit file itself. It is not very complicated, and if you have done things right up to this point, you will not need to modify your Condor submit files from Part I at all.

And now, the moment we have been waiting for… Submit the entire workflow, stand back, and wait for the results!

Extra Challenges

Some ideas for extra learning:

Experiment with different DAGMan options. Try throttling back the number of jobs running at once. Try removing the DAG run partway through, then restarting it. Try different logging and monitoring options to see what you like best.
In the Condor manual, in the DAGMan section, read about the VARS statement. Now, merge your three simulation submit files into one, changing it and the DAGMan submit file to work together.
Design and implement your own workflow, including the jobs, submit files, DAGMan file, etc.

Reminders

Do the work yourself, consulting reasonable reference materials as needed. Any resource that provides a complete solution or offers significant material assistance toward a solution is not OK to use. Asking the instructor for help is OK; asking other students for help is not. All standard UW policies concerning student conduct (esp. UWS 14) and information technology apply to this course and assignment.

Hand In

A printout of:

- One of the simulation submit files
- The analysis submit file
- The DAGMan submit file

If you can squeeze that onto a single sheet of paper (double-sided is great!), the planet and I will thank you. Be sure to put your own name at the top of each piece of paper, regardless; identifying your work is important, or you may not receive appropriate credit.