In which entirely too many things happen at once.

In this project you'll be experimenting with everyone's favorite topic: concurrency.

In part 1 you'll add a pair of new system calls to support multithreaded processes in xv6.

In part 2 you'll write some user library functions to provide a friendlier threading interface and some basic synchronization facilities.

In part 3 you'll add some more system calls to support more sophisticated synchronization mechanisms for your threads.

In part 4 you'll add more userspace scaffolding code on top of the park() family of syscalls so that your threads can use blocking synchronization.

For bonus points you can instead implement more of clone()'s functionality in userspace.

Important Notes

For this project you may work in pairs! See this page for details.

See also this for how to set permissions to collaborate with your partner.

Remember to start from a fresh copy of xv6! Download it here.

[UPDATE] Due: Friday, April 14 at 11:59 PM (Late policy)

Hand-in instructions are at the bottom of this page.

Kernel support for multithreading

The first step in this project is to add a system call by which a process can create a new thread:

int clone(void (*fn)(void*), void* arg, void* ustack);

clone() should return the PID of the newly created thread to the parent and start the new thread with a call to the function fn, passing it the provided argument arg. The ustack argument specifies the base address of the region of memory to be used as the new thread's stack (it should be one page in size, like the stack created by the exec() syscall). Recall that the stack grows down, however, so the new thread's stack pointer should start at the opposite end of this region. The calling process must allocate space for the new thread's stack before calling clone(). On failure, clone() should simply return -1 to the original thread.

The new thread should share the virtual address space of the parent thread. As in fork(), file descriptors should also be copied into the new thread. As in exec(), the new thread's stack should be initialized with a fake return address of 0xffffffff so that if the new thread returns from its outermost function it triggers a trap. To avoid this, a thread should call exit() (like a regular process does) when it has nothing left to do. (The arg argument to fn will of course also need to be placed on its initial stack.)
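The initial stack layout described above can be modeled on an ordinary host machine. The following is only a sketch, not the kernel code itself: the helper name init_thread_stack is hypothetical, the argument is modeled as a 32-bit value to mirror 32-bit x86, and the stack is just a byte buffer rather than memory in a real address space.

```c
#include <stdint.h>
#include <string.h>

#define PGSIZE   4096u          /* one page, the stack size required above */
#define FAKE_RET 0xffffffffu    /* fake return address from the text */

/* Hypothetical helper: lays out the initial frame at the top of a one-page
   stack and returns the initial stack pointer. Words are 32 bits wide. */
unsigned char *init_thread_stack(unsigned char *ustack, uint32_t arg)
{
    uint32_t ret = FAKE_RET;
    unsigned char *sp = ustack + PGSIZE;   /* stack grows down: start at the high end */

    sp -= sizeof(uint32_t);
    memcpy(sp, &arg, sizeof(arg));         /* fn's argument, found at sp+4 on entry */
    sp -= sizeof(uint32_t);
    memcpy(sp, &ret, sizeof(ret));         /* fake return address, found at sp */
    return sp;
}
```

With this layout, fn sees its argument where the C calling convention on 32-bit x86 expects the first parameter, and a plain return from fn jumps to 0xffffffff and traps.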

After implementing clone(), you should then create a corresponding system call that is to clone() what wait() is to fork():

int join(void** ustack);

Like wait(), join() returns the PID of the child thread that has exited (or -1 if the process has no child threads). When a thread exits, the base address of its stack (the address that was passed as the ustack argument to the clone() call that created it) should be passed back to the calling process via the ustack argument to join() (note that this is a void**, not a void*). join() should not act on other processes, only on threads (i.e. processes that share the same virtual address space as the calling process). Similarly, you should modify wait() so that it operates only on processes in different virtual address spaces, not on threads in the same one.
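The split between join() and wait() comes down to one test: does the child share the caller's address space? The sketch below models that rule on a toy process table. Everything in it (struct proc_model, model_join, the field names, and returning 0 to mean "would block") is a hypothetical simplification for illustration, not xv6's actual proc structure or scheduler logic.

```c
#include <stddef.h>

enum pstate { UNUSED, RUNNABLE, ZOMBIE };

/* Toy stand-in for a process table entry (field names are assumptions). */
struct proc_model {
    int pid;
    int parent_pid;
    enum pstate state;
    const void *pgdir;   /* identifies the address space */
    void *ustack;        /* stack base saved at clone() time */
};

/* Model of join(): reaps one exited child thread of self.
   Returns its pid and stores its stack base in *ustack,
   returns -1 if self has no child threads at all, and
   returns 0 where the real call would sleep and retry. */
int model_join(struct proc_model *table, size_t n,
               const struct proc_model *self, void **ustack)
{
    int have_threads = 0;

    for (size_t i = 0; i < n; i++) {
        struct proc_model *p = &table[i];
        if (p->state == UNUSED || p->parent_pid != self->pid)
            continue;
        if (p->pgdir != self->pgdir)
            continue;              /* a forked process: left for wait() */
        have_threads = 1;
        if (p->state == ZOMBIE) {
            int pid = p->pid;
            *ustack = p->ustack;   /* hand the stack base back to the caller */
            p->state = UNUSED;     /* slot is free again */
            return pid;
        }
    }
    return have_threads ? 0 : -1;
}
```

A matching model of wait() would use the same loop with the address-space comparison inverted.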

Userspace multithreading support

With clone() and join() in place, the next thing to do is to write some userspace library code to make writing multithreaded programs more pleasant.

First create a pair of helper library functions as wrappers for clone() and join():

int thread_create(void (*fn)(void*), void* arg);
int thread_join(void);

These functions should be short and easy to write: all they have to do is take care of the allocation and deallocation of the stacks used for the threads that they create and join, respectively.

[UPDATE: if you run into errors with missing malloc and free symbols when trying to link the forktest program, you can simply remove that entry from the USER_PROGS variable in user/]
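One possible shape for these wrappers is sketched below. So that it compiles and runs outside xv6, the syscalls are replaced by hypothetical stand-ins (clone_standin and join_standin) that only record stacks and never run fn; in your xv6 code you would call the real clone() and join() instead. Using malloc() for a one-page stack is also an assumption of this sketch.

```c
#include <stdlib.h>

#define PGSIZE   4096
#define MAX_FAKE 8

/* --- Hypothetical host-side stand-ins for the clone()/join() syscalls --- */
static void *fake_stacks[MAX_FAKE];   /* stacks of the fake "threads" */
static int fake_pids[MAX_FAKE];
static int fake_count = 0;
static int fake_next_pid = 10;

static int clone_standin(void (*fn)(void *), void *arg, void *ustack)
{
    (void)fn; (void)arg;              /* the stand-in never runs fn */
    if (fake_count == MAX_FAKE)
        return -1;
    fake_stacks[fake_count] = ustack;
    fake_pids[fake_count] = fake_next_pid++;
    return fake_pids[fake_count++];
}

static int join_standin(void **ustack)
{
    if (fake_count == 0)
        return -1;                    /* no child threads */
    fake_count--;
    *ustack = fake_stacks[fake_count];
    return fake_pids[fake_count];
}

/* --- The wrappers themselves --- */
int thread_create(void (*fn)(void *), void *arg)
{
    void *stack = malloc(PGSIZE);     /* caller must supply the stack */
    if (stack == NULL)
        return -1;

    int pid = clone_standin(fn, arg, stack);
    if (pid < 0)
        free(stack);                  /* no thread was created: do not leak */
    return pid;
}

int thread_join(void)
{
    void *stack = NULL;
    int pid = join_standin(&stack);
    if (pid >= 0)
        free(stack);                  /* the exited thread no longer needs it */
    return pid;
}
```

The key point is ownership: thread_create() allocates the stack, and the only place it is released after a successful clone() is thread_join(), using the address that join() passes back.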

Next, add a simple spinlock implementation so that your threads can synchronize themselves:

struct spinlock;
void spin_init(struct spinlock* lk);
void spin_lock(struct spinlock* lk);
void spin_unlock(struct spinlock* lk);

[Hint: look at the existing spinlock implementation used in the xv6 kernel (kernel/spinlock.c and kernel/spinlock.h) -- your userspace spinlock can just be a simplified version of that code.]

[UPDATE: use the files spinlock.{c,h} and threads.{c,h} from the P4 user templates for this.]
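As a rough illustration of how small a userspace spinlock can be, here is a minimal sketch. It uses GCC's __sync_lock_test_and_set and __sync_lock_release builtins as the atomic primitive; that choice is an assumption of this sketch, and the kernel's own spinlock uses an inline-assembly xchg instead. The struct layout shown is likewise only one possibility.

```c
struct spinlock {
    volatile unsigned int locked;   /* 0 = free, 1 = held */
};

void spin_init(struct spinlock *lk)
{
    lk->locked = 0;
}

void spin_lock(struct spinlock *lk)
{
    /* Atomically write 1 and fetch the previous value.
       If the previous value was 1, someone else holds the lock: retry. */
    while (__sync_lock_test_and_set(&lk->locked, 1) != 0)
        ;   /* spin */
}

void spin_unlock(struct spinlock *lk)
{
    /* Atomically write 0 with release semantics. */
    __sync_lock_release(&lk->locked);
}
```

Unlike the kernel version, nothing here disables interrupts or records which CPU holds the lock; a userspace lock has no need for either.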

Lastly, use your new spinlocks to make the existing xv6 user library code thread-safe. Look at the files indicated in the USER_LIBS variable in user/ -- most of it won't need any changing because it doesn't use any global variables or static locals (which cause problems when multiple threads access them concurrently), but there are some small changes you'll need to make.


The next things to add are some kernel mechanisms for blocking synchronization, in the form of three new system calls:

void park(void);
int setpark(void);
int unpark(int pid);

The park() system call simply suspends the calling thread until another thread wakes it up by calling unpark() with its PID as the argument. The setpark() syscall is an auxiliary function to help avoid race conditions; it causes the next park() call to return immediately instead of suspending the thread if another thread calls unpark() on it in between the setpark() and park() calls.

setpark() should fail if the thread has already called it without calling park() (e.g. it makes back-to-back calls to setpark()). unpark() should fail if the thread whose PID is passed as the argument has not called setpark() or park().

[UPDATE: setpark() and unpark() should return -1 on failure and 0 on success.]

(You may want to review chapter 28 of OSTEP for a refresher on these calls.)
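The interaction between the three calls is easier to see as a small state machine. The sketch below is a single-threaded model of the rules stated above, not kernel code: struct park_state, its field names, and the model_ functions are all hypothetical, and model_park() returns 1 where the real park() would suspend the thread. Where the text is silent (for example, a second unpark() before park()), the model simply succeeds again; that is an assumption.

```c
/* Per-thread bookkeeping (hypothetical field names). */
struct park_state {
    int armed;     /* setpark() called, park() not yet called */
    int pending;   /* an unpark() arrived while armed */
    int parked;    /* currently suspended inside park() */
};

/* Returns 0 on success, -1 if setpark() was already called without a park(). */
int model_setpark(struct park_state *t)
{
    if (t->armed)
        return -1;
    t->armed = 1;
    t->pending = 0;
    return 0;
}

/* Returns 0 on success, -1 if the target has called neither setpark() nor park(). */
int model_unpark(struct park_state *target)
{
    if (target->parked) {
        target->parked = 0;    /* wake the suspended thread */
        return 0;
    }
    if (target->armed) {
        target->pending = 1;   /* remember the wakeup for the upcoming park() */
        return 0;
    }
    return -1;
}

/* Returns 0 if park() would return immediately, 1 if it would suspend. */
int model_park(struct park_state *t)
{
    int wake_now = t->armed && t->pending;

    t->armed = 0;
    t->pending = 0;
    if (wake_now)
        return 0;
    t->parked = 1;
    return 1;
}
```

The case that setpark() exists for is the sequence setpark(), unpark() from another thread, park(): without the armed and pending flags, the wakeup would be lost and the thread would sleep with nobody left to wake it.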

Blocking synchronization

Finally, use your park() system calls to provide userspace library support for three blocking synchronization mechanisms we covered in class.

Blocking mutexes:

struct mutex;
void mutex_init(struct mutex* mtx);
void mutex_lock(struct mutex* mtx);
void mutex_unlock(struct mutex* mtx);

Condition variables:

struct condvar;
void cv_init(struct condvar* cv);
void cv_wait(struct condvar* cv, struct mutex* mtx);
void cv_signal(struct condvar* cv); /* wake one waiting thread */
void cv_broadcast(struct condvar* cv); /* wake all waiting threads */


Semaphores:

struct semaphore;
void sem_init(struct semaphore* sem, int initval);
void sem_post(struct semaphore* sem);
void sem_wait(struct semaphore* sem);

You may choose to implement some of these in terms of the others, but you must of course implement at least one of them using park() and friends directly (infinite mutual recursion is not a useful synchronization operation).
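To illustrate building one mechanism on top of the others, here is a sketch of a semaphore written in terms of a mutex and a condition variable. So that it runs on an ordinary host system, it uses POSIX pthread mutexes and condition variables as stand-ins for the mutex_* and cv_* functions above, and the names are prefixed with demo_ to avoid clashing with the host's own sem_* functions; both choices are assumptions of this sketch and not part of the assignment's interface.

```c
#include <pthread.h>

struct demo_semaphore {
    int value;                /* number of available "permits" */
    pthread_mutex_t mtx;      /* stands in for struct mutex */
    pthread_cond_t cv;        /* stands in for struct condvar */
};

void demo_sem_init(struct demo_semaphore *sem, int initval)
{
    sem->value = initval;
    pthread_mutex_init(&sem->mtx, NULL);
    pthread_cond_init(&sem->cv, NULL);
}

void demo_sem_wait(struct demo_semaphore *sem)
{
    pthread_mutex_lock(&sem->mtx);
    /* Re-check in a loop: a woken thread may find the permit already taken. */
    while (sem->value <= 0)
        pthread_cond_wait(&sem->cv, &sem->mtx);
    sem->value--;
    pthread_mutex_unlock(&sem->mtx);
}

void demo_sem_post(struct demo_semaphore *sem)
{
    pthread_mutex_lock(&sem->mtx);
    sem->value++;
    pthread_cond_signal(&sem->cv);   /* wake one waiter, if any */
    pthread_mutex_unlock(&sem->mtx);
}
```

If you take this route in xv6, the mutex or the condition variable underneath still has to be built directly on park(), setpark(), and unpark().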

Testing tip: pressing control-P at the xv6 console will cause it to dump the current state of all tasks. You should be able to use this to verify that your threads are sleeping instead of spinning when using the mutexes, condvars, and semaphores you implement (e.g. by having one thread block in mutex_lock() while the thread holding the lock calls sleep()).

[UPDATE: So that grading can be done in a more fine-grained fashion allowing for more partial credit in the face of bugs, you should implement each of these pieces in separate files. Use mutex.{c,h}, condvar.{c,h}, and semaphore.{c,h} from the P4 user templates for this part.]

Bonus: clone() in hard mode

For bonus points, implement clone() instead with a different signature that makes things a bit trickier:

int clone(void* ustack);

[UPDATE: the return value should still be the PID of the newly-created child thread in the parent, but in the new thread it should return zero (like fork()).]

The signature and operation of your thread_create() userspace library function should remain unchanged. You'll need to write some assembly for this.

[UPDATE: if you do the bonus, make sure to hand it in as a separate version of your code in its own separate directory (see below). For grading purposes, your main xv6 handin directory should have the non-bonus implementation with the regular three-argument version of clone(). You should also make sure to mention the bonus in your README if you do it.]

[UPDATE 2: Don't just work around it by stashing the other two arguments in the stack buffer and then retrieving them from it in your clone syscall -- that would be defeating the purpose.]

[UPDATE 3: The only thing that should happen in the kernel with the ustack argument is saving its value somewhere in your proc struct so that you can pass it back to userspace later from join(). Done properly, there should thus be very little difference between the ustack-only clone() and a clone() that, like fork(), takes no arguments at all.]


Handing in your code

The handin procedure is similar to our other projects thus far. If you are working with a partner only ONE of you should hand in the code (whichever one of you was designated as the one to do so in your initial partner-declaration email).

$ make clean
$ cp -r . ~cs537-2/handin/$USER/p4/xv6
# create a brief README file (mentioning bonus if you did it)
$ cp README ~cs537-2/handin/$USER/p4

If you do the bonus, hand it in in a separate handin directory:

$ make clean
$ mkdir ~cs537-2/handin/$USER/p4/xv6-bonus
$ cp -r . ~cs537-2/handin/$USER/p4/xv6-bonus