Due Wednesday, Sept 13 at
This assignment involves computing measures that characterize the jobs that arrived at a production system – the NCSA Origin 2000 (O2K) – which executed large science and engineering programs. The system contained 1472 processors for executing the jobs in the trace you will analyze. In a typical month, the average job runtime was on the order of 20 hours and the average processor utilization was above 90%.
The trace of the jobs that arrived during a one-week
period in 2001 is provided in
~cs547-1/public/html/06/traces/O2Kweektrace.txt. Documentation of the fields in the trace
is in README.O2Ktraces.txt in the same directory. Also in the directory is a Java
program called O2Kreadtrace.java.
This program reads each line of the trace file and prints the total
number of jobs that arrived during the week, the maximum job runtime and
average job wait time (in hours), and the number of jobs that arrived each
day. You can modify this program to
compute the measures below, or you can write your own program in any language you
like.
For the graphs you will produce, you can use any
plotting program you are familiar with, such as Excel or Matlab. Please make sure the axes are clearly
labeled (including the units) in your graphs.
For each measure you compute, report only two or
three digits to the right of the decimal point. This is for readability and because
further digits are not meaningful since the measures are only for jobs that
arrive during a particular (typical) week.
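One way to apply this rounding in Java is sketched below. The class and method names are hypothetical and are not part of O2Kreadtrace.java; the sketch assumes three digits after the decimal point.

```java
import java.util.Locale;

class MeasureFormat {
    /** Formats a measure with exactly three digits after the decimal point. */
    static String threeDecimals(double value) {
        // Locale.US keeps the decimal separator a period on every machine.
        return String.format(Locale.US, "%.3f", value);
    }

    public static void main(String[] args) {
        System.out.println(threeDecimals(2.0 / 3.0));
    }
}
```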
You are required to do this assignment with a
partner from the class. You can use
the class mailing list to find a partner.
Turn in a separate sheet with your answers and graphs, as well as the
code that you used to compute the measures. Make sure both partners’ names are
on the answers and on the code.
Also, please be sure that each partner does about 50% of the work to
obtain the answers.
1. Let the hours of each day be
numbered 0 through 23, where hour 0 is midnight to 1:00am
and hour 23 is 11:00pm to midnight. Compute the following:
(a) The overall average job arrival rate for
the one-week period, in jobs/hour.
(b) The number of jobs that arrived each
hour during the fifth day.
(c) For all jobs that arrived during hours
12 through 17 on the fifth day, give the average time since the previous job
arrival, in minutes.
(d) Repeat part (c) for the first day and
for the seventh day.
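The bookkeeping behind parts (a) and (b) can be sketched as follows. This is only an illustration under stated assumptions: it assumes each job's arrival time is already available as seconds since midnight at the start of the first day (the actual trace fields are documented in README.O2Ktraces.txt), and the class and method names are hypothetical. Days are indexed from 0 in the array, so the fifth day is index 4.

```java
import java.util.Arrays;

class HourlyArrivals {
    static final int SECONDS_PER_HOUR = 3600;
    static final int HOURS_PER_DAY = 24;

    /**
     * Returns counts[d][h], the number of arrivals in hour h (0..23) of day d (0-based).
     * ASSUMPTION: arrivalSeconds holds seconds since midnight at the start of day 1.
     * Arrivals before time 0 or after the last day are ignored.
     */
    static int[][] countByDayAndHour(long[] arrivalSeconds, int days) {
        int[][] counts = new int[days][HOURS_PER_DAY];
        for (long t : arrivalSeconds) {
            if (t < 0) {
                continue;
            }
            long hourIndex = t / SECONDS_PER_HOUR;          // hours since start of trace
            long day = hourIndex / HOURS_PER_DAY;
            int hour = (int) (hourIndex % HOURS_PER_DAY);
            if (day < days) {
                counts[(int) day][hour]++;
            }
        }
        return counts;
    }

    /** Overall arrival rate in jobs/hour over a window of the given number of days. */
    static double arrivalRatePerHour(int jobCount, int days) {
        return (double) jobCount / (days * HOURS_PER_DAY);
    }

    public static void main(String[] args) {
        // Made-up arrival times, for illustration only.
        long[] arrivals = {30L, 3700L, 129605L, 130500L};
        int[][] counts = countByDayAndHour(arrivals, 7);
        System.out.println("Day 2 by hour: " + Arrays.toString(counts[1]));
        System.out.printf("Rate: %.3f jobs/hour%n", arrivalRatePerHour(arrivals.length, 7));
    }
}
```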
2. For the jobs that arrived
during hours 12 through 17 on days 1, 4, and 5, collectively,
(a) Give the average time since the previous job arrival, the standard deviation of the time since the previous arrival, and the ratio of standard deviation to the average. Note that this ratio is called the coefficient of variation or CV.
(b) Give the fraction of times since the previous arrival that are greater than each of the following values (in seconds):
0, 1, 2, 4, 8, 16, 32, 64, 128, 256,
512, 1024
(c) Graph the curve of the
measured fraction greater than x seconds vs x. Use a linear scale for each axis
in the graph. On the same graph
plot the curve for $e^{-x/t}$, where e is
approx. 2.718 and t is the average value given in (a). Also plot a vertical dashed line
at the average value, t.
(d) Repeat the same graph with a log base
10 scale (ranging from 0.001 to 1) for the y axis.
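The measures in this question can be sketched in Java as below. The sketch makes several assumptions that are not stated in the handout: arrival times are given as a sorted array of seconds, the standard deviation is the population form (dividing by n rather than n - 1), and the comparison curve is taken to be the exponential tail e^(-x/t) with mean t. All names are hypothetical, and the sample data in main is made up.

```java
class InterarrivalStats {
    /** Times between consecutive arrivals; the input must be sorted ascending. */
    static double[] gaps(double[] sortedArrivals) {
        if (sortedArrivals.length < 2) {
            return new double[0];
        }
        double[] g = new double[sortedArrivals.length - 1];
        for (int i = 1; i < sortedArrivals.length; i++) {
            g[i - 1] = sortedArrivals[i] - sortedArrivals[i - 1];
        }
        return g;
    }

    /** Arithmetic mean; NaN for an empty array. */
    static double mean(double[] v) {
        double sum = 0.0;
        for (double x : v) {
            sum += x;
        }
        return sum / v.length;
    }

    /** Population standard deviation (divides by n). ASSUMPTION: not the n - 1 form. */
    static double stdDev(double[] v) {
        double m = mean(v);
        double sumSq = 0.0;
        for (double x : v) {
            sumSq += (x - m) * (x - m);
        }
        return Math.sqrt(sumSq / v.length);
    }

    /** Coefficient of variation: standard deviation divided by the mean. */
    static double cv(double[] v) {
        return stdDev(v) / mean(v);
    }

    /** Fraction of values strictly greater than x; NaN for an empty array. */
    static double fractionGreaterThan(double[] v, double x) {
        int count = 0;
        for (double value : v) {
            if (value > x) {
                count++;
            }
        }
        return (double) count / v.length;
    }

    /** Exponential tail e^(-x/t) for a distribution with mean t. */
    static double exponentialTail(double x, double t) {
        return Math.exp(-x / t);
    }

    public static void main(String[] args) {
        double[] arrivals = {0, 3, 10, 12, 40, 41, 100};   // made-up data
        double[] g = gaps(arrivals);
        double t = mean(g);
        System.out.printf("mean=%.3f sd=%.3f cv=%.3f%n", t, stdDev(g), cv(g));
        for (double x = 1; x <= 1024; x *= 2) {
            System.out.printf("x=%.0f measured=%.3f exponential=%.3f%n",
                    x, fractionGreaterThan(g, x), exponentialTail(x, t));
        }
    }
}
```

Printing the measured and exponential columns side by side gives the two series to plot on the same axes.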
3. For all jobs that arrived
during the week,
(a) Give the average time since the previous job arrival.
(b) Give the fraction of times since the previous arrival that are greater than each of the following values (in seconds):
0, 1, 2, 4, 8, 16, 32, 64, 128, 256,
512, 1024
(c)
Graph the measured fraction greater than x seconds vs
x. Use a log base 10 scale
(ranging from 0.001 to 1) for the y axis and a linear scale for the x
axis in the graph. On the same
graph plot the measured fractions in 2(b).
(d) How are the two curves in part (c)
similar? How are they
different?
4. For all jobs that arrived
during the week:
(a) Give the average job runtime (in hours),
the standard deviation of the runtime, and the coefficient of variation.
(b) Give the maximum job runtime.
(c) Give the fraction of job runtimes
greater than each of the following values, in hours:
0, 0.25, 0.5, 1, 5, 20, 40, 100, 200
(d) Plot the fraction of job runtimes
greater than x hours vs x. Use a log base 10 scale for the y
axis. Include a vertical dashed
line at the average value given in part (a).
5. For jobs that requested (and
ran on) one processor,
(a) Give
the average runtime (in hours) and the coefficient of variation in runtime.
(b) Give
the fraction of jobs with runtime < 30 minutes.
(c) Give
the fraction of jobs with runtime > 50 hours.
(d) Give
the average wait time (in hours) for all jobs that request one processor, the
average wait for the single-processor jobs with runtime less than 30 minutes,
and the average wait for the single-processor jobs that run for more than 50
hours.
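The filtering and conditional averages in this question can be sketched as follows. The Job class and its fields (processors, runtimeHours, waitHours) are hypothetical stand-ins for whatever the trace actually provides, and the sample jobs in main are made up.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;
import java.util.stream.Collectors;

class JobFilterStats {
    /** Hypothetical job record; the field names are assumptions, not trace fields. */
    static final class Job {
        final int processors;
        final double runtimeHours;
        final double waitHours;

        Job(int processors, double runtimeHours, double waitHours) {
            this.processors = processors;
            this.runtimeHours = runtimeHours;
            this.waitHours = waitHours;
        }
    }

    /** Jobs that used exactly n processors. */
    static List<Job> withProcessors(List<Job> jobs, int n) {
        return jobs.stream().filter(j -> j.processors == n).collect(Collectors.toList());
    }

    /** Fraction of jobs satisfying the condition; NaN if the list is empty. */
    static double fraction(List<Job> jobs, Predicate<Job> condition) {
        if (jobs.isEmpty()) {
            return Double.NaN;
        }
        long matches = jobs.stream().filter(condition).count();
        return (double) matches / jobs.size();
    }

    /** Average wait (hours) over jobs satisfying the condition; NaN if none match. */
    static double averageWait(List<Job> jobs, Predicate<Job> condition) {
        return jobs.stream().filter(condition)
                .mapToDouble(j -> j.waitHours).average().orElse(Double.NaN);
    }

    public static void main(String[] args) {
        List<Job> jobs = Arrays.asList(                     // made-up data
                new Job(1, 0.25, 1.0), new Job(1, 60.0, 5.0),
                new Job(1, 2.0, 3.0), new Job(16, 0.1, 0.5));
        List<Job> single = withProcessors(jobs, 1);
        System.out.printf("short fraction: %.3f%n", fraction(single, j -> j.runtimeHours < 0.5));
        System.out.printf("long fraction:  %.3f%n", fraction(single, j -> j.runtimeHours > 50.0));
        System.out.printf("avg wait, all:  %.3f%n", averageWait(single, j -> true));
        System.out.printf("avg wait, long: %.3f%n", averageWait(single, j -> j.runtimeHours > 50.0));
    }
}
```

Passing the condition as a predicate lets the same two helpers be reused for any processor count by changing only the argument to withProcessors.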
6. For jobs that requested and
ran on 16 processors, provide the measures in 5 (a) through (d). Briefly discuss
these results and the results in question 5. State the observations about the
results that you consider most important.