
Instructor

  • who: Michael Swift
  • where: Room 7369
  • when: Thurs. 1:30-2:30
  • email: swift 'at' cs.wisc.edu

TA

  • who: Ceyhun Alp
  • where: 1306
  • when: Wednesday 1:15-2:15
  • email: e.ceyhun.alp 'at' gmail.com

Lecture:

  • when: Tues./Thur. 11:00 -- 12:15
  • where: Computer Sciences 1325
  • list: compsci736-1-s16 'at' lists.wisc.edu

Measurement

Warm-Up Project: Benchmarking Interprocess Communications

Overview

The goal of this assignment is to get some experience doing benchmarking by measuring the performance of various operating system and interprocess communication mechanisms. You will design experiments, build simple tools, carry out the experiments methodically, summarize the results, and draw conclusions.

Be careful! Benchmarking is a subtle and tricky business; things that look simple on first glance will often turn out to be quite intricate.

Description

You will compare the performance of three communication mechanisms:

Mechanism 1: Unix Pipe

The most basic IPC mechanism on UNIX is the pipe; it has been around since the earliest versions of UNIX. The pipe system call is executed by a process to create both ends of a uni-directional communication channel. This channel is a stream of bytes that ensures ordering and correct delivery. The pipe, when combined with a fork operation, allows two processes to pass messages.
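As a rough illustration of combining pipe with fork, the sketch below sends a message to a child process over one pipe and reads the echo back over a second pipe (a pipe is uni-directional, so two are needed for a reply). The names pipe_echo, read_full, and write_full are hypothetical helpers invented for this example, and error handling is kept minimal.

```c
#define _GNU_SOURCE
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Read exactly len bytes. A pipe is a byte stream, so a single read()
 * may return fewer bytes than requested. */
static int read_full(int fd, void *buf, size_t len) {
    char *p = buf;
    while (len > 0) {
        ssize_t n = read(fd, p, len);
        if (n <= 0) return -1;          /* error or unexpected end of stream */
        p += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Write exactly len bytes, retrying after partial writes. */
static int write_full(int fd, const void *buf, size_t len) {
    const char *p = buf;
    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n <= 0) return -1;
        p += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Hypothetical helper: send len bytes to a forked child and read the
 * echoed bytes into reply. Returns 0 on success, -1 on failure. */
int pipe_echo(const void *msg, void *reply, size_t len) {
    int to_child[2], to_parent[2];
    if (pipe(to_child) < 0 || pipe(to_parent) < 0) return -1;

    pid_t pid = fork();
    if (pid < 0) return -1;
    if (pid == 0) {                     /* child: echo the message back */
        close(to_child[1]);
        close(to_parent[0]);
        char *buf = malloc(len);
        int ok = buf != NULL
              && read_full(to_child[0], buf, len) == 0
              && write_full(to_parent[1], buf, len) == 0;
        _exit(ok ? 0 : 1);
    }

    close(to_child[0]);                 /* parent keeps the opposite ends */
    close(to_parent[1]);
    int rc = 0;
    if (write_full(to_child[1], msg, len) < 0 ||
        read_full(to_parent[0], reply, len) < 0)
        rc = -1;
    close(to_child[1]);
    close(to_parent[0]);

    int status;
    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status) ||
        WEXITSTATUS(status) != 0)
        rc = -1;
    return rc;
}
```

A caller might use it as pipe_echo("hello", out, 5) with a 5-byte buffer out. Each process closes the pipe ends it does not use so that an end-of-stream can be detected if the peer exits.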

Mechanism 2: Internet (INET) Stream Sockets

The stream socket, based on TCP/IP, is the backbone of the Internet. These sockets can be used for remote (inter-host) and local communication, providing much the same abstraction as a pipe: a reliable, ordered byte stream. Almost every operating system supports communication over these sockets.

Mechanism 3: Bounded-buffer in Shared Memory

Perhaps the fastest way to communicate is to bypass the kernel altogether and use shared memory. This mechanism uses a simple bounded buffer to communicate in each direction. The shared memory can be created with the mmap system call, and synchronization can be done with pthread mutexes or spinning.
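One possible shape for such a buffer is sketched below, under several assumptions: the region is an anonymous shared mapping inherited across fork, there is exactly one producer and one consumer, and synchronization is done by spinning on C11 atomics rather than with pthread mutexes. The names ring, ring_put, ring_get, ring_demo, and the capacity RING_SLOTS are hypothetical. A two-way channel would use one such buffer per direction.

```c
#define _GNU_SOURCE
#include <stdatomic.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/wait.h>
#include <unistd.h>

#define RING_SLOTS 64                   /* hypothetical capacity in bytes */

struct ring {
    _Atomic size_t head;                /* bytes written so far (producer only) */
    _Atomic size_t tail;                /* bytes read so far (consumer only) */
    unsigned char data[RING_SLOTS];
};

/* Map the buffer MAP_SHARED so parent and child see the same memory. */
struct ring *ring_create(void) {
    void *p = mmap(NULL, sizeof(struct ring), PROT_READ | PROT_WRITE,
                   MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED) return NULL;
    struct ring *r = p;
    atomic_init(&r->head, 0);
    atomic_init(&r->tail, 0);
    return r;
}

/* Producer: spin while the buffer is full, then publish one byte. */
void ring_put(struct ring *r, unsigned char byte) {
    size_t head = atomic_load_explicit(&r->head, memory_order_relaxed);
    while (head - atomic_load_explicit(&r->tail, memory_order_acquire)
           == RING_SLOTS)
        ;                               /* full: busy-wait */
    r->data[head % RING_SLOTS] = byte;
    atomic_store_explicit(&r->head, head + 1, memory_order_release);
}

/* Consumer: spin while the buffer is empty, then take one byte. */
unsigned char ring_get(struct ring *r) {
    size_t tail = atomic_load_explicit(&r->tail, memory_order_relaxed);
    while (atomic_load_explicit(&r->head, memory_order_acquire) == tail)
        ;                               /* empty: busy-wait */
    unsigned char byte = r->data[tail % RING_SLOTS];
    atomic_store_explicit(&r->tail, tail + 1, memory_order_release);
    return byte;
}

/* Demo: the child produces n bytes (values i % 256), the parent consumes
 * them and returns their sum, or -1 on failure. */
long ring_demo(size_t n) {
    struct ring *r = ring_create();
    if (r == NULL) return -1;
    pid_t pid = fork();
    if (pid < 0) return -1;
    if (pid == 0) {
        for (size_t i = 0; i < n; i++)
            ring_put(r, (unsigned char)(i % 256));
        _exit(0);
    }
    long sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += ring_get(r);
    waitpid(pid, NULL, 0);
    munmap(r, sizeof *r);
    return sum;
}
```

Spinning keeps the kernel out of the data path but burns a core while waiting; a mutex-based variant trades some latency for lower CPU use, and comparing the two is a reasonable experiment in itself.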

The Measurements

You will perform the measurements on one of the Linux systems available in one of the Computer Sciences Department's instructional labs. On this platform, you will measure the following features:

1. Clock precision

The accuracy and granularity of the timer you use will often have a large effect on your measurements. Therefore, you should use the best timer available. Two APIs you can try are gettimeofday and clock_gettime.

In addition, on x86 platforms a highly accurate cycle counter is available. The instruction that reads it is rdtsc (see the Wikipedia article), and it returns a 64-bit cycle count. Knowing the cycle time, you can easily convert the result of rdtsc into a useful time.

A few caveats:

  1. If the processor can automatically vary the clock speed, the timestamp counter may not reflect real time.
  2. On a multicore system, different processor cores may have different values for the timestamp; you can only compare values on a single core and not across cores.

Hence, the first thing you should do is figure out how to use rdtsc or its analogue (you can use Google to find out more about it). Once you know how to call it and get a cycle count, convert the result to seconds and measure how long something takes (e.g., a program that calls sleep(10) and exits should run for about 10 seconds). Confirm that your results make sense by comparing them to a less accurate but reliable timer such as gettimeofday. Note that confirmation of timer accuracy is hugely important! If you don't trust your timer, how can you trust the results of your measurements?
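A minimal sketch of that calibration step follows. It assumes GCC or Clang, where the __rdtsc() intrinsic from x86intrin.h reads the timestamp counter; on other architectures it falls back to clock_gettime nanoseconds as the "analogue". The names read_ticks and ticks_per_second are hypothetical. It does not pin the process to one core, which the caveats above suggest you would want in a real measurement.

```c
#define _GNU_SOURCE
#include <stdint.h>
#include <sys/time.h>
#include <time.h>

#if defined(__x86_64__) || defined(__i386__)
#include <x86intrin.h>
/* x86: read the 64-bit timestamp counter. */
static inline uint64_t read_ticks(void) { return __rdtsc(); }
#else
/* Assumption for non-x86 builds: use monotonic nanoseconds instead. */
static inline uint64_t read_ticks(void) {
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}
#endif

/* Estimate the tick rate by counting ticks across a sleep whose real
 * length is measured with gettimeofday. */
double ticks_per_second(int sleep_ms) {
    struct timespec req = { sleep_ms / 1000,
                            (long)(sleep_ms % 1000) * 1000000L };
    struct timeval a, b;

    gettimeofday(&a, NULL);
    uint64_t c0 = read_ticks();
    nanosleep(&req, NULL);
    uint64_t c1 = read_ticks();
    gettimeofday(&b, NULL);

    double elapsed = (double)(b.tv_sec - a.tv_sec)
                   + (double)(b.tv_usec - a.tv_usec) / 1e6;
    return (double)(c1 - c0) / elapsed;
}
```

Dividing a later tick difference by the value of ticks_per_second(1000) gives seconds. If repeated calibrations disagree noticeably, that is a hint that the clock speed is varying or the process is migrating between cores.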

One way to determine a timer's resolution is to read the clock value at the start and end of a simple loop. Start with a single loop iteration, then increase the iteration count of the loop until the difference between the before and after samples is greater than zero. Try to get the smallest non-zero positive difference. If a single iteration of a loop takes too much time, try putting simple statements between the two timer calls.

You should try at least 2 mechanisms for measuring time, and determine the resolution (smallest time value) you can accurately measure.
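The sketch below is one variant of the resolution search, not the only way to do it: instead of growing a loop, it re-reads the clock until the value changes and keeps the smallest positive difference seen. It is written against two timers, clock_gettime with CLOCK_MONOTONIC and gettimeofday; the function names are hypothetical.

```c
#define _GNU_SOURCE
#include <stdint.h>
#include <sys/time.h>
#include <time.h>

/* clock_gettime reading, in nanoseconds. */
uint64_t read_clock_gettime_ns(void) {
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/* gettimeofday reading, converted from microseconds to nanoseconds. */
uint64_t read_gettimeofday_ns(void) {
    struct timeval tv;
    gettimeofday(&tv, NULL);
    return (uint64_t)tv.tv_sec * 1000000000ull
         + (uint64_t)tv.tv_usec * 1000ull;
}

/* Smallest positive difference between two successive readings of the
 * given timer, over the requested number of samples (must be > 0). */
uint64_t min_delta_ns(uint64_t (*read_ns)(void), int samples) {
    uint64_t best = UINT64_MAX;
    for (int i = 0; i < samples; i++) {
        uint64_t start = read_ns();
        uint64_t end;
        do {
            end = read_ns();            /* re-read until the clock advances */
        } while (end <= start);
        if (end - start < best)
            best = end - start;
    }
    return best;
}
```

Note that the result is bounded below by the cost of one timer call as well as by the timer's granularity, so it is an upper bound on the true resolution.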

More information on time here.

2. Inter-process communication time

For each of the communication mechanisms listed above, you will measure the following characteristics:

Message latency: Latency is the time for some activity to complete, from beginning to end. For message passing, it is the time from the start of a send to the completion of a receive. Since the clocks on two different cores (if using rdtsc) may not be sufficiently aligned, the easiest way to measure message latency is to measure the time it takes to complete a round-trip communication (and divide by two).

Measure latency for a variety of message sizes: 4, 16, 64, 256, 1K, 4K, 16K, 64K, 256K, and 512K bytes.
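A round-trip measurement over pipes might look like the sketch below. It assumes clock_gettime as the timer, a forked child that echoes every message, and reports the minimum observed round trip divided by two. The names xfer and pipe_latency_us are hypothetical, and a real harness would also pin the processes and discard warm-up rounds.

```c
#define _GNU_SOURCE
#include <stdlib.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Move exactly len bytes through fd, writing if writing != 0, else reading. */
static int xfer(int fd, char *buf, size_t len, int writing) {
    while (len > 0) {
        ssize_t n = writing ? write(fd, buf, len) : read(fd, buf, len);
        if (n <= 0) return -1;
        buf += n;
        len -= (size_t)n;
    }
    return 0;
}

static double now_us(void) {
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return ts.tv_sec * 1e6 + ts.tv_nsec / 1e3;
}

/* Minimum one-way latency estimate (round trip / 2) in microseconds for
 * messages of the given size, or -1 on error. */
double pipe_latency_us(size_t size, int rounds) {
    int down[2], up[2];
    char *buf = calloc(1, size);
    if (buf == NULL || pipe(down) < 0 || pipe(up) < 0) return -1.0;

    pid_t pid = fork();
    if (pid < 0) return -1.0;
    if (pid == 0) {                     /* child: echo each message back */
        close(down[1]);
        close(up[0]);
        for (int i = 0; i < rounds; i++)
            if (xfer(down[0], buf, size, 0) < 0 ||
                xfer(up[1], buf, size, 1) < 0)
                _exit(1);
        _exit(0);
    }

    close(down[0]);
    close(up[1]);
    double best = -1.0;
    for (int i = 0; i < rounds; i++) {
        double t0 = now_us();
        if (xfer(down[1], buf, size, 1) < 0 ||
            xfer(up[0], buf, size, 0) < 0) {
            best = -1.0;
            break;
        }
        double rtt = now_us() - t0;
        if (best < 0 || rtt < best)
            best = rtt;                 /* keep the fastest round trip */
    }
    close(down[1]);
    close(up[0]);
    waitpid(pid, NULL, 0);
    free(buf);
    return best < 0 ? -1.0 : best / 2.0;
}
```

Calling pipe_latency_us once per size in the list above (for example 4, 16, 64, and so on up to 512K) yields one data point per message size.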

Note 1: Beware of the Nagle algorithm in the Internet stream socket experiments. This mechanism can cause unexpected delays. You can disable nagling with the TCP_NODELAY socket option.

Note 2: Transmission times in TCP can be affected by the MTU (maximum transmission unit) size (typically 1500 bytes) and by the read and write buffer sizes (typically 128KB). If you have root access to a Linux system, you can experiment with increasing these sizes. Of course, you will not be allowed to do that on the instructional Linux systems.
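For reference, the two notes above map onto socket options roughly as follows. This is a sketch assuming Linux headers; tcp_socket_nodelay and get_int_sockopt are hypothetical helper names. It only creates and configures a socket and does not connect anywhere.

```c
#define _GNU_SOURCE
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>
#include <unistd.h>

/* Create a TCP socket with the Nagle algorithm disabled.
 * Returns the descriptor, or -1 on error. */
int tcp_socket_nodelay(void) {
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0) return -1;
    int one = 1;
    if (setsockopt(fd, IPPROTO_TCP, TCP_NODELAY, &one, sizeof one) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Read back an integer-valued socket option, or -1 on error. */
int get_int_sockopt(int fd, int level, int name) {
    int value = 0;
    socklen_t len = sizeof value;
    if (getsockopt(fd, level, name, &value, &len) < 0) return -1;
    return value;
}
```

Reading SO_SNDBUF and SO_RCVBUF at level SOL_SOCKET with get_int_sockopt is a cheap way to record the buffer sizes actually in effect on the test machine, which is worth reporting alongside the results.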

Throughput: Throughput is the amount of data that is sent per unit time. In this case, a round-trip measure is not necessary; you can send a return message when the entire transfer amount has been sent. Send a large enough total quantity of data that the single "ack" response contributes a small amount of time compared to the whole transfer.

Measure throughput for a variety of message sizes, the same as above for latency.
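The throughput scheme described above (stream the data one way, then wait for a single ack) can be sketched over pipes as follows. The function name pipe_throughput_MBps and the choice of reporting in units of 10^6 bytes per second are assumptions of this example.

```c
#define _GNU_SOURCE
#include <stdlib.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

static double now_s(void) {
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return ts.tv_sec + ts.tv_nsec / 1e9;
}

/* Send `count` messages of `msg_size` bytes to a child, wait for a
 * one-byte ack, and return throughput in 10^6 bytes/second (-1 on error). */
double pipe_throughput_MBps(size_t msg_size, size_t count) {
    int down[2], up[2];
    char *buf = calloc(1, msg_size);
    if (buf == NULL || pipe(down) < 0 || pipe(up) < 0) return -1.0;

    pid_t pid = fork();
    if (pid < 0) return -1.0;
    if (pid == 0) {                     /* child: drain all data, then ack */
        close(down[1]);
        close(up[0]);
        size_t left = msg_size * count;
        while (left > 0) {
            ssize_t n = read(down[0], buf, left < msg_size ? left : msg_size);
            if (n <= 0) _exit(1);
            left -= (size_t)n;
        }
        char ack = 'A';
        _exit(write(up[1], &ack, 1) == 1 ? 0 : 1);
    }

    close(down[0]);
    close(up[1]);
    int ok = 1;
    double t0 = now_s();
    for (size_t i = 0; ok && i < count; i++) {
        size_t off = 0;                 /* handle partial writes */
        while (off < msg_size) {
            ssize_t n = write(down[1], buf + off, msg_size - off);
            if (n <= 0) { ok = 0; break; }
            off += (size_t)n;
        }
    }
    char ack;
    if (ok && read(up[0], &ack, 1) != 1) ok = 0;
    double elapsed = now_s() - t0;      /* includes the single ack */

    close(down[1]);
    close(up[0]);
    waitpid(pid, NULL, 0);
    free(buf);
    return ok ? (double)(msg_size * count) / elapsed / 1e6 : -1.0;
}
```

Choosing count so that the total transfer is many megabytes keeps the cost of the ack negligible relative to the transfer time.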

The Experimental Method

Computer scientists are notably sloppy experimentalists. While we do a lot of experimental work, we typically do not follow good experimental practice. The experimental method is a well-established regimen, used in all areas of science. The use of the experimental method keeps us honest and gives form to the work that we do.

The basic parts of an experiment are:

  1. Identify your variables:
    Variables are things that you can observe and quantify. You need to identify which variables might be related and whether a variable is a cause (e.g., the message size of a send operation) or an effect (e.g., the time to complete the send). Even though this sounds obvious, you should consciously identify the variables in each experiment that you perform.
  2. Hypothesis:
    The hypothesis is a guess (we hope, an educated guess) about the outcome of the experiment. The hypothesis needs to be worded in a way that can be tested in an experiment, so it should be stated in terms of the experimental variables.
  3. Experimental apparatus:
    You need to obtain the necessary equipment for your experiment. In this case, it will be the needed computer and software.
  4. Perform the experiment and record the results:
    This part is the one that we typically think of as the real work. Note that several important steps come before it.
  5. Summarize the results:
    Summarization means putting the data in a form that you can understand. You might put the data in tables, graphs, or use statistical techniques to understand the raw data. If you are using averages, make sure to read Jim Smith's paper in the October 1988 issue of CACM, Jose Albertal's Lecture, or Gernot Heiser's web page (there are many types of means, and you need to use the right one)! However, as a warning, you probably do not want to use averages; taking the minimum makes much more sense in this case.
  6. Draw conclusions:
    Note that performing the experiment and summarizing the results are separate steps and both come before you draw conclusions. To present honest and understandable results, we must present the basic data first (so that the reader can draw their own conclusions) before we insert our bias.

    The experimental method has more subtleties than this (such as trying to account for experimenter and subject biases), but the above description is sufficient for basic computer measurement experiments.
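To illustrate the remark in step 5 about preferring the minimum to an average: timing noise such as interrupts and context switches can only add time to a sample, so one outlier shifts the arithmetic mean far more than the minimum. The sketch below uses made-up sample values and hypothetical function names.

```c
#include <stddef.h>

/* Smallest sample; n must be at least 1. */
double sample_min(const double *x, size_t n) {
    double best = x[0];
    for (size_t i = 1; i < n; i++)
        if (x[i] < best)
            best = x[i];
    return best;
}

/* Arithmetic mean; n must be at least 1. */
double sample_mean(const double *x, size_t n) {
    double sum = 0.0;
    for (size_t i = 0; i < n; i++)
        sum += x[i];
    return sum / (double)n;
}
```

For the illustrative samples {10, 12, 11, 250, 10} (say, microseconds), the minimum is 10 while the mean is 58.6, dominated by the single disturbed run.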

Learning about Sockets

If you need help with using the various socket calls, here are some resources.

What to turn in

Please write a 4-5 page paper, 2-column, 11-point font, 1-inch margins (conference style!).

The paper must contain the following parts:

  • Title:
    The title should be descriptive and fit in one line across the page. Interesting titles are acceptable, but avoid overly cute ones.
  • Abstract:
    This is the paper in brief; it is not a description of what is in the paper. It should state the basic ideas, techniques, results, and conclusions of the paper. The abstract is not the introduction, but a summary of everything. It is an advertisement that will draw the reader to your paper, without being misleading. It should be complete enough to understand what will be covered in the paper. Avoid phrases such as "The paper describes...." This is a technical paper and not a mystery novel; do not be afraid of giving away the ending.
  • Body:
    This is the main part of the paper. It should include an introduction that prepares the reader for the remainder of the paper. Assume that the reader is knowledgeable about operating systems. The introduction should motivate the rest of the discussion and outline the approach. The main part of the paper should be split into reasonable sections that follow the basics of the experimental method. The body should end with a conclusion: a discussion of what the reader should have learned from the paper. You can repeat things stated earlier in the paper, but only to the extent that they contribute to the final discussion.
  • References:
    You must cite each paper that you have referenced. This section appears at the end of the paper.
  • Figures:
    A paper without figures, graphs, or diagrams is boring. This paper will certainly need several performance tables and graphs. Your paper must have figures. Your figures should be easy to understand (legible fonts, labeled axes, etc.).

Do not re-describe the assignment; address the issues described above. The paper must be written using correct English grammar. There should be no spelling mistakes.

Note that your paper will be evaluated on both the technical content and the presentation (as would any paper submitted to a journal or conference).

Here is a list of writing suggestions available to help you avoid common mistakes. It is essential that you take a look at these suggestions as you prepare your paper.

A grading rubric is available for the paper.

How to turn it in

Please put the paper in your handin directory or your partner's, and make sure both of your names are on the paper. Turn in only one copy of the paper per team; if you are not the one turning it in, put a text file named "partner.txt" containing your partner's name in your handin directory.

Page last modified on February 24, 2016, at 04:04 PM