Project 2: Thread Scheduling, Synchronization

  ...where the disposal of time is surrendered merely to the chance of incidence, chaos will soon reign.     -Victor Hugo
Due: Friday April 1st, Midnight.
Etc: This project is to be done with your partner in C (not C++).

Goals

  • Practice using common synchronization primitives
  • Understand the creation, and lifestyles of user-level threads
  • Understand the concept of overlapping I/O and computation
  • Understand the tradeoffs between user and kernel-level threads

Background

Programs are commonly parallelized using multiple processes or multiple threads within a process. Processes are more strongly isolated from each other because they do not share an address space, but data sharing is more difficult (also due to the fact that they do not share an address space). A hybrid approach, using multiple processes and multiple threads within a process, is also reasonable; the latest Apache web server uses this approach. Here is a short history of threads. Threads may be implemented entirely at user-level, in the kernel, or a combination of the two. In this assignment you will learn about the tradeoffs between these approaches, and gain some concurrent programming experience.

Simplethreads

For this project you will be using simplethreads. Simplethreads is a user-level thread package which borrows a fair amount of code from from the GNU portable threads package. While you will only have to add code in small/well-defined sections, you will have to develop a general idea of how the package works. See these instructions on how to build, modify, and add tests to the Simplethreads package.

Credits

This assignment and much of the writeup are borrowed from a CSE 451 project at the University of Washington. Rick Cox is the author of Simplethreads.

The Assignment

If you choose to divy up parts between members make sure that each understands the other's work.

Part 1: Thread Scheduling

Simplethreads already contains:
  1. An implementation of a simple queue (sthread_queue.h)
  2. A context-switch function that, given a new stack pointer, will switch contexts (sthread_ctx.h)
  3. A new-stack function that will create and initialize a new stack such that it is ready to run
  4. Stub routines for a user-level thread system (sthread_user.c)
Your job is to complete the thread system, implement:
  1. Data structures to represent threads (struct _sthread in sthread_user.c)
  2. A thread creation routine (sthread_user_create())
  3. A thread destruction routine (sthread_user_exit())
  4. A simple non-preemptive thread scheduler
  5. A mechanism for a thread to voluntarily yield, allowing another thread to run (sthread_user_yield())
  6. A routine to initialize your data structures (sthread_user_init())
The routines in sthread_ctx.h do all of the stack manipulation, PC changing, register storing, and nasty stuff like that. Rather than focusing on that, this assignment focuses on managing the thread contexts. Viewed as a layered system, you need to implement the grey box below:



At the top layer, applications will use the sthread package (through the API defined in sthread.h). sthread.c will then call either the routines in sthread_pthread.c or your routines in sthread_user.c (it chooses based on the value passed to sthread_init()). Your sthread_user.c, in turn, builds on the sthread_ctx functions (as described in sthread_ctx.h).

Applications (the top-layer) may not use any routines except those listed in sthread.h. They must not know how the threads are implemented; they simply request threads be created (after initializing the library), and maybe request yields/exits. For example, they have no access to sthread_queue. Nor do they keep lists of running threads around; that is the job of the grey box.

Similarly, your grey box - sthread_user.c - should not need to know how sthread_ctx is implemented. Do not attempt to modify the sthread_ctx_t directly; use the routines declared in sthread_ctx.h.

Recommended Procedure:
  1. Figure out what each function you need to implement does. Look at some of the test programs to see usage examples.
  2. Examine the supporting routines we've provided (primarily in sthread_queue.h and sthread_ctx.h).
  3. Design your threading algorithm: When are threads put on the ready queue? When are threads removed? Where is sthread_switch called?
  4. Figure out how to start a new thread and what to do about the initial thread (the one that calls sthread_user_init).
  5. Talk to your fellow students and the TAs about your design.
  6. Implement it.
  7. Test it. The test programs provided are not adequate; for full confidence in your implementation, you'll need to create some of your own.
Hints
  • sthread_create should not immediately run the new thread.
  • Use the provided routines in sthread_ctx.h. You don't need to write any assembly or try to directly manipulate registers for this assignment, nor is an understanding of the exact layout of the stack required.
  • If the routine passed to sthread_create() finishes (returns), you need to make sure the thread's resources are cleaned up (i.e. sthread_exit() is called).
  • The start routine passed to sthread_new_ctx does not take any arguments (unlike the routine that your sthread_user_create is passed). So you can't create a new stack directly with the user-supplied routine; you need to supply a routine that takes no arguments but somehow winds up invoking the user's routine with the user's argument.
  • You should free any resources allocated when a thread exits (whether it exits explicitly, by calling sthread_exit(), or implicitly, because the start routine returns). However, you should not attempt to free the stack of a running thread (note: to free a stack, use shread_free_ctx()). This requires a few tricks.
  • Dealing with the initial thread is tricky. You need to make sure that an sthread_t struct is created for it at some point, so it can be scheduled like the other threads. However, it wasn't created by your sthread_user_create() function. Remember that, while a thread is running, the state stored in the sthread_t is garbage (though it is probably important that you have an sthread_t, so you've got some place to put the state when you want to stop the thread).
  • Be very careful using any local variables after calling sthread_switch() as their values are quite likely different from what they were before (you're on a different stack).
  • While globals are bad in general, there will be places in this assignment where they are necessary.
  • When debugging with gdb(1), you may see messages like [New Thread 1024 (LWP 18771)]. These messages refer to kernel threads.

Part 2: Mutexes and Condition Variables

For part 2, you'll use the thread system you wrote in part 1, extending it to provide support for mutexs (a.k.a. locks) and condition variables. Skeleton functions are again provided in lib/sthread_user.c. You need to:
  • Design data structures for mutexs (struct _sthread_mutex) and condition variables (struct _sthread_cond)
  • Implement the mutex operations (sthread_user_mutex_*())
  • Implement the condition variable operations (sthread_user_cond_*())
  • sthread_user_cond_wait atomically releases the given mutex, and causes the calling thread to block on the condition variable. Upon successful completion, the mutex should be locked (owned) by the calling thread. It is assumed that the calling thread owns the mutex when it calls sthread_user_cond_wait.
The non-preemptive nature of this assignment aids you greatly in your tasks (since it gives you atomic critical sections). However, you should think about what you would need to do if your threads were preemptive; add comments to the routines indicating precisely where the critical sections are and what appropriate protection might be applicable (e.g. how you would use a spinlock or other atomic operation).

Hints
  • Figure out how to block a thread, making it wait on some queue. How do you get the calling thread? How do you switch out of it? How will you wind up back in it?
  • Locks do not immediately switch threads when unlocked.

Part 3: Synchronization Problem

There are several famous synchronization problems in computer science. For part 3, your job is to implement a "food services" problem, an instance of the better-known multiple-producer, multiple-consumer problem. There are N cooks (each a separate thread) that constantly produce burgers. We'll assume for debugging purposes that each burger has a unique id assigned to it at the time it is created. The cooks place their burgers on a stack. Each time a cook produces a burger, she prints a message containing the id of the burger just cooked. There are M hungry students (each a separate thread) that constantly grab burgers from the stack and eat them. Each time a student eats a burger, she prints a message containing the id of the burger she just ate. Ensure, for health and sanity reasons, that a given burger is consumed at most once by at most one student.

Note that you are implementing an application now. That means the only interface to the thread system that you should use is that described by sthread.h (as distributed in the tar.gz). Do not use functions internal to sthreads directly from your solution to this problem.

Place your solution in a new file called test-burgers.c in the test directory. The program should take 3 command-line arguments: N, M, and the total number of burgers to cook. For examples, test-burgers 5 10 100 should simulate 5 cooks, 10 students, and 100 burgers. The Simplethreads instructions tell you how to add a test program.

Hints
  • Review the textbook sections on synchronization.
  • Due to non-preemptive nature of user-level sthreads, you should think about critical points in your code that must contain sthread_yield() calls, as well as ways to simulate an unexpected timer interrupt and context switch (perhaps using randomness and sthread_yield() calls).
  • To test your threads functionality, don't forget that you can recompile the sthreads library to use POSIX threads (use the --with-pthreads argument as shown above).

Part 4: Multithreaded Web Server

Every web server has the following basic algorithm:
  1. Web server starts and listens (see listen(2)) for incoming connections.
  2. A client (i.e. web browser) opens a connection.
  3. The server accepts (see accept(2)) the connection.
  4. The client sends an http request.
  5. The server services the request by:
    1. Parsing the request.
    2. Finding the requested file.
    3. Reading the file.
    4. Sending a set of http headers, followed by the contents of the file, to the client.
    5. Closing the connection.
  6. The client displays the file to the user.
The sioux webserver in the web directory implements the server side of the above algorithm.

For part 4 you will make sioux into a multithreaded web server. It must use a thread pool approach; it should not create a new thread for each request. Instead, incoming requests should be distributed to a pool of waiting threads. Make sure your threads are properly and efficiently synchronized. Use the routines you implemented in part 2.

Don't forget that the only interface to the thread system that you should use is that described by sthread.h. Do not use functions internal to sthreads directly from sioux.

You should accept a command-line argument indicating how many threads to use ("sioux 10" should use 10 threads).

Because we have not used asynchronous IO in sioux, it will be very difficult to obtain good performance using the user-level threads. In some cases, it may be difficult to even get correct behavior at all times (e.g., if no new requests are sent, existing requests may not be serviced at all). We recommend using kernel-level (i.e. pthreads) threads for this part.

Testing your webserver

To test the webserver you will need to have multiple clients access it simultaneously. To do this use the wget utility, which downloads a file using its URL. You can use the following shell script test.csh, but change the hostname and port number so that wget requests a file from your webserver. In the file below my server was running on machine royal21 using port 9493. Log onto several different machines (e.g. ssh royal13 in one XTerm and ssh royal27 in another) and run test.csh (simultaneously) from each. You can adjust the number of iterations to make the script run longer or shorter.
#!/s/std/bin/tcsh

set i=0
while ( $i < 100 )
        wget http://royal21.cs.wisc.edu:9493 -O temp &
        set i = `expr $i + 1`
end
Type chmod u+x test.csh at the prompt to make test.csh executable. Type ./test.csh to run the script.

Handin

In your top-level simplethreads directory, run make dist. This will produce a file named simplethreads-1.10.tar.gz. Run tar tvzf simplethreads-1.10.tar.gz and check to make sure all simplethreads files, and any new files you have added, are listed. Copy the tar.gz file to ~cs537-2/handin/NAME/p2 where NAME is the login name of one member of your team. Hand in one copy per group. After the deadline you will lose permissions to write to your handin directory. Remember that late projects are not accepted. Plan accordingly.

As with the first project, most of your grade will be determined by how well your implementation works. We will run your code on a suite of test cases, some will test correct execution, other error cases. Exercise your program's capabilities thoroughly (read try to break it). Develop on whichever system you like, though make sure your code works on an emperor or royal machine as that is where it will be tested.