CS 547: Computer System Modeling Fundamentals

Fall 2006

Assignment 1:  Workload Measures


Due Wednesday, Sept 13 at 5pm.   Hand in to Ting Chen’s mailbox.

This assignment involves computing measures that characterize the jobs that arrived at a production system – the NCSA Origin 2000 (O2K) – which executed large science and engineering programs.  The system contained 1472 processors for executing the jobs in the trace you will analyze.  In a typical month, the average job runtime was on the order of 20 hours and the average processor utilization was above 90%.

The trace of the jobs that arrived during a one-week period in 2001 is provided in ~cs547-1/public/html/06/traces/O2Kweektrace.txt.  Documentation of the fields in the trace is in README.O2Ktraces.txt in the same directory.   Also in the directory is a Java program called O2Kreadtrace.java.  This program reads each line of the trace file and prints the total number of jobs that arrived during the week, the maximum job runtime and average job wait time (in hours), and the number of jobs that arrived each day.  You can modify this program to compute the measures below, or you can write a program in any language you like to use. 

For the graphs you will produce, you can use any plotting program you are familiar with, such as Excel or Matlab.  Please make sure the axes are clearly labeled (including the units) in your graphs.

For each measure you compute, report only two or three digits to the right of the decimal point.  This is for readability and because further digits are not meaningful since the measures are only for jobs that arrive during a particular (typical) week.

You are required to do this assignment with a partner from the class.  You can use the class mailing list to find a partner.  Turn in a separate sheet with your answers and graphs, as well as the code that you used to compute the measures.  Make sure both partners’ names are on the answers and on the code.  Also, please be sure that each partner does about 50% of the work to obtain the answers.

1.       Let the hours of each day be numbered 0 through 23, where hour 0 is midnight to 1:00am and hour 23 is 11:00pm to midnight.  Provide the following measures.

(a)  The overall average job arrival rate for the one-week period, in jobs/hour.

(b)  The number of jobs that arrived each hour during the fifth day.

(c)  For all jobs that arrived during hours 12 through 17 on the fifth day, give the average time since the previous job arrival, in minutes.

(d)  Repeat part (c) for the first day and for the seventh day.
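As a starting point for the measures in question 1, here is a minimal Python sketch.  It is not the provided O2Kreadtrace.java program, and it makes several assumptions that are not in the handout: arrival times have already been read into a list of seconds since midnight at the start of day 1, the week is exactly 168 hours long, and the "previous arrival" of a job may fall outside the hour window being examined.  The function names are made up for illustration; check the field definitions in README.O2Ktraces.txt before adapting it.

```python
from collections import Counter

SECONDS_PER_HOUR = 3600
SECONDS_PER_DAY = 24 * SECONDS_PER_HOUR


def day_and_hour(arrival_s):
    """Map seconds since the start of the trace to (day 1..7, hour 0..23)."""
    day = int(arrival_s // SECONDS_PER_DAY) + 1
    hour = int((arrival_s % SECONDS_PER_DAY) // SECONDS_PER_HOUR)
    return day, hour


def arrival_rate_per_hour(arrivals_s, period_hours=7 * 24):
    """Overall arrival rate in jobs/hour (assumes a 168-hour trace)."""
    return len(arrivals_s) / period_hours


def hourly_counts(arrivals_s, day):
    """Number of arrivals in each hour 0..23 of the given day."""
    counts = Counter(h for d, h in map(day_and_hour, arrivals_s) if d == day)
    return [counts.get(h, 0) for h in range(24)]


def mean_gap_minutes(arrivals_s, day, first_hour, last_hour):
    """Average time since the previous arrival, in minutes, for jobs that
    arrive on `day` during hours first_hour..last_hour (inclusive).

    Assumption: the previous arrival may lie outside the window.  The very
    first job in the trace has no predecessor and is skipped.
    """
    ordered = sorted(arrivals_s)
    gaps = []
    for prev, cur in zip(ordered, ordered[1:]):
        d, h = day_and_hour(cur)
        if d == day and first_hour <= h <= last_hour:
            gaps.append(cur - prev)
    if not gaps:
        return float("nan")
    return sum(gaps) / len(gaps) / 60.0
```

For part (c), a call would look like mean_gap_minutes(arrivals, 5, 12, 17), where arrivals is the hypothetical list described above.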

2.       For the jobs that arrived during hours 12 through 17 on days 1, 4, and 5, collectively,

(a)  Give the average time since the previous job arrival, the standard deviation of the time since the previous arrival, and the ratio of standard deviation to the average.  Note that this ratio is called the coefficient of variation or CV.

(b)  Give the fraction of times since the previous arrival that are greater than each of the following values (in seconds):

      0, 1, 2, 4, 8, 16, 32, 64, 128, 256, 512, 1024

(c)  Graph the curve of the measured fraction greater than x seconds vs x.  Use a linear scale for each axis in the graph.  On the same graph plot the curve for e^(-x/t), where e is approx. 2.718 and t is the average value given in (a).   Also plot a vertical dashed line at the average value, t.

(d)  Repeat the same graph with a log base 10 scale (ranging from 0.001 to 1) for the y axis.
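The statistics in question 2 can be sketched in Python as follows.  This is only an illustration under stated assumptions: the times since the previous arrival are already in a list of seconds, the standard deviation is the population form (divide by n rather than n - 1), and the comparison curve in part (c) is taken to be the exponential tail e^(-x/t) with mean t.  The function names are hypothetical.

```python
import math


def summarize(values):
    """Return (mean, standard deviation, coefficient of variation).

    Assumption: population standard deviation (divides by n).
    """
    n = len(values)
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / n
    std = math.sqrt(variance)
    return mean, std, std / mean


def fraction_greater(values, thresholds):
    """Fraction of values strictly greater than each threshold."""
    n = len(values)
    return [sum(1 for v in values if v > x) / n for x in thresholds]


def exponential_tail(x, mean):
    """P(X > x) for an exponential distribution with the given mean."""
    return math.exp(-x / mean)


THRESHOLDS_S = [0, 1, 2, 4, 8, 16, 32, 64, 128, 256, 512, 1024]
```

With gaps as the hypothetical list of interarrival times, fraction_greater(gaps, THRESHOLDS_S) gives the measured points and [exponential_tail(x, mean) for x in THRESHOLDS_S] gives the matching points on the comparison curve.  A CV close to 1 is what an exponential distribution would produce.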

3.       For all jobs that arrived during the week,

(a)  Give the average time since the previous job arrival.

(b)  Give the fraction of times since the previous arrival that are greater than each of the following values (in seconds):

      0, 1, 2, 4, 8, 16, 32, 64, 128, 256, 512, 1024

(c)    Graph the measured fraction greater than x seconds vs x.  Use a log base 10 scale (ranging from 0.001 to 1) for the y axis and a linear scale for the x axis in the graph.  On the same graph plot the measured fractions in 2(b).

(d)  How are the two curves in part (c) similar?  How are they different? 

4.       For all jobs that arrived during the week:

(a)  Give the average job runtime (in hours), the standard deviation of the runtime, and the coefficient of variation.

(b)  Give the maximum job runtime.

(c)  Give the fraction of job runtimes greater than each of the following values, in hours:

      0,  0.25, 0.5, 1, 5, 20, 40, 100, 200

(d)  Plot the fraction of job runtimes greater than x hours vs x.  Use a log base 10 scale for the y axis.  Include a vertical dashed line at the average value given in part (a).

5.       For jobs that requested (and ran on) one processor,

(a)  Give the average runtime (in hours) and the coefficient of variation in runtime.

(b)  Give the fraction of jobs with runtime < 30 minutes.

(c)  Give the fraction of jobs with runtime > 50 hours.

(d)  Give the average wait time (in hours) for all jobs that request one processor, the average wait for the single-processor jobs with runtime less than 30 minutes, and the average wait for the single-processor jobs that run for more than 50 hours.

6.       For jobs that requested and ran on 16 processors, provide the measures in 5 (a) through (d).   Briefly discuss these results and the results in question 5.  Provide whatever observations you feel are most important about the results.
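The per-processor-count measures in questions 5 and 6 follow one pattern: select a subset of jobs, then compute fractions and conditional averages over it.  The Python sketch below shows that pattern only.  It assumes each job has already been reduced to a (processors, runtime in hours, wait in hours) tuple, which is a made-up representation and not the trace file's actual layout, and the function name is hypothetical.

```python
def _mean(values):
    """Arithmetic mean, or NaN when the list is empty."""
    return sum(values) / len(values) if values else float("nan")


def subset_measures(jobs, processors, short_h=0.5, long_h=50.0):
    """Fractions and average waits for jobs that used `processors` CPUs.

    `jobs` is an iterable of (processors, runtime_hours, wait_hours) tuples;
    this layout is an assumption made for the example.
    """
    selected = [(r, w) for p, r, w in jobs if p == processors]
    n = len(selected)
    short = [(r, w) for r, w in selected if r < short_h]
    long_ = [(r, w) for r, w in selected if r > long_h]
    return {
        "jobs": n,
        "mean_runtime_h": _mean([r for r, _ in selected]),
        "frac_short": len(short) / n if n else float("nan"),
        "frac_long": len(long_) / n if n else float("nan"),
        "mean_wait_h": _mean([w for _, w in selected]),
        "mean_wait_short_h": _mean([w for _, w in short]),
        "mean_wait_long_h": _mean([w for _, w in long_]),
    }
```

Calling subset_measures(jobs, 1) and subset_measures(jobs, 16) on the same job list gives the two sets of numbers to compare side by side.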


vernon@cs.wisc.edu