Project 3: A Real Page Turner


Project Due Date: Tuesday, April 16 at 11:59 PM


1.0 Objective
When you are finished with this project you should have a much better understanding of how to schedule processes and how to manage virtual memory. You will also expand your Unix knowledge to include signals, memory mapping, and memory protection. If all of this is starting to make you a little dizzy, don't worry - it's all explained in detail later in this write-up.

2.0 Signals, mmap(), and mprotect()
It is essential that you have some understanding of what signals are and how to map and protect memory before reading the project description. The following three subsections will describe each of these in detail.

2.1 Signals
Signals are a software mechanism that is very similar to interrupts (in fact, many signals originate from hardware interrupts). Whenever some major event happens - say a segmentation fault, a timer interrupt, a keyboard interrupt (Control-C) - the operating system sends a process a signal. The program has three different options when dealing with signals. These are:

  1. Ignore the signal. Some signals (SIGKILL for example) cannot be ignored.
  2. Let the default handler defined by the operating system deal with it. This is what happens most of the time. For example, on a SIGSEGV (segmentation fault) signal the default handler terminates the offending process.
  3. Define a specific function to run when the signal arrives. This is what you will do for this project for several signals (including SIGSEGV). Now, whenever a signal arrives that has a handler, whatever the process is doing is stopped and the signal handling function is run. If the handling function finishes and returns, the process continues its execution from where it was interrupted. There are several signals that don't quite work this way but you do not need to worry about them for this project.

For a complete list of all the different signals, you can simply type man signal inside a nova shell.

To define a function as a signal handler, you need to write the function and then register it. Depending on how much information you want to pass the interrupt handling routine you have 2 options for how to declare a handling function and how to register it with the operating system. These two methods are described below:

Method 1

Handling function prototype:
void functionName(int sig);

Registration:
struct sigaction action;
sigemptyset(&action.sa_mask);   // block no extra signals while the handler runs
action.sa_handler = (functionName);
action.sa_flags = SA_RESTART | SA_RESETHAND;
if(sigaction(SIGNAL, &action, NULL) < 0) { error(); }

Method 2

Handling function prototype:
void functionName(int sig, siginfo_t* sigInfo, void* ucontext);

Registration:
struct sigaction action;
sigemptyset(&action.sa_mask);   // block no extra signals while the handler runs
action.sa_sigaction = (functionName);
action.sa_flags = SA_RESTART | SA_SIGINFO | SA_RESETHAND;
if(sigaction(SIGNAL, &action, NULL) < 0) { error(); }

You will need to replace functionName with the actual name of your handling function. You will also need to replace SIGNAL with whatever signal you are trying to catch. The big difference between these two methods is that in the first case, your function accepts only a single integer - the signal number. In the second case, your function must accept 3 arguments. The first kind of function must be registered using the sa_handler field of the sigaction structure. The second kind uses the sa_sigaction field. If you are using the siginfo_t structure in your function, there are several fields of interest. These are:

There are many more fields in this structure and you can look them up if you want.

Whenever a signal handler gets invoked, you need to reregister it. The reason is that the Solaris operating system automatically resets the signal handler to the default after a signal is delivered. Hence, all of your signal handlers will have code similar to the following in them (this handler is for a SIGSEGV signal):

void foo(int sig) {
   struct sigaction action;

   // reregister handler
   sigemptyset(&action.sa_mask);   // block no extra signals while the handler runs
   action.sa_handler = (foo);
   action.sa_flags = SA_RESTART | SA_RESETHAND;
   if(sigaction(SIGSEGV, &action, NULL) < 0) {
      perror("registering seg fault handler");
      exit(1);
   }

   // do the actual handling of the signal now
}

Of course, if your function requires the information provided by the siginfo_t structure, your handling declaration and registration is going to look slightly different but the concept is exactly the same.
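As an illustration of the three-argument form, here is a minimal sketch in C. The names info_handler, install_info_handler, and last_signal_seen are made up for this example and are not part of the provided files. The sketch leaves out SA_RESETHAND so the handler stays installed without reregistering; if you keep that flag as shown above, you would reregister inside the handler.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <signal.h>
#include <string.h>

/* Written by the handler, read by the rest of the program. */
static volatile sig_atomic_t last_signo = 0;

/* Three-argument handler: sigInfo carries details about the signal. */
static void info_handler(int sig, siginfo_t *sigInfo, void *ucontext) {
    (void)sig;
    (void)ucontext;
    last_signo = sigInfo->si_signo;   /* si_signo is the signal number */
}

/* Registers info_handler for signo (hypothetical helper).
 * Returns 0 on success and -1 on failure, like sigaction(). */
int install_info_handler(int signo) {
    struct sigaction action;

    assert(signo > 0);
    memset(&action, 0, sizeof action);
    sigemptyset(&action.sa_mask);
    action.sa_sigaction = info_handler;          /* not sa_handler */
    action.sa_flags = SA_RESTART | SA_SIGINFO;   /* SA_SIGINFO selects the 3-argument form */
    return sigaction(signo, &action, NULL);
}

/* Returns the number of the last signal the handler saw, or 0 if none. */
int last_signal_seen(void) {
    return last_signo;
}
```

A process could call install_info_handler(SIGUSR1) once at start-up; after a SIGUSR1 arrives, last_signal_seen() reports it.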

There is an important thing to realize about signals - they can happen at any time. This means your program can be in the middle of manipulating some data structure when the signal arrives. Your process will stop what it is doing and jump to the signal handler. If your signal handler is also going to manipulate that data structure, you will most likely have problems, because the data structure is in an inconsistent state. You have no idea where the interrupted function was when the signal occurred (this should sound eerily similar to having multiple threads manipulating the same data). To prevent this case from happening, Unix provides the programmer a means of masking signals. When a signal is masked, the signal is blocked until it becomes unmasked. The signal is not ignored, it is just blocked from delivery. As soon as you unmask the signal, it will get delivered to the process - no matter how long it has been waiting (blocked).

To mask signals you will want to use the sigprocmask function as well as several other functions. Their prototypes are defined as follows:

#include <signal.h>

int sigprocmask(int how, const sigset_t* set, sigset_t* oldset);
int sigemptyset(sigset_t* set);
int sigfillset(sigset_t* set);
int sigaddset(sigset_t* set, int sig);
int sigdelset(sigset_t* set, int sig);
int sigismember(const sigset_t* set, int sig);

In the section on the basic OS structure you will find an example of how to use some of these functions. For more details, check out the man pages or look on-line. The basic purpose of these is to prevent the delivery of a signal until some future time. Be careful, though, and don't forget to unblock a signal if you want it to be delivered at a later time.
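The following sketch shows that a blocked signal is held back, not lost. It blocks SIGUSR1, raises it, confirms with sigpending() that it is waiting, and then unblocks it. The function name demo_block_and_unblock and the choice of SIGUSR1 are assumptions made for this example only.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <signal.h>
#include <string.h>

/* Counts how many times the handler has run. */
static volatile sig_atomic_t delivered = 0;

static void count_handler(int sig) {
    (void)sig;
    delivered++;
}

/* Returns 1 if SIGUSR1 stayed pending while blocked and was delivered
 * right after being unblocked, 0 if not, and -1 on a system call error. */
int demo_block_and_unblock(void) {
    struct sigaction action;
    sigset_t mask, pending;
    int held_back, was_pending;

    memset(&action, 0, sizeof action);
    sigemptyset(&action.sa_mask);
    action.sa_handler = count_handler;
    if (sigaction(SIGUSR1, &action, NULL) < 0) return -1;

    sigemptyset(&mask);
    sigaddset(&mask, SIGUSR1);
    assert(sigismember(&mask, SIGUSR1) == 1);
    if (sigprocmask(SIG_BLOCK, &mask, NULL) < 0) return -1;

    delivered = 0;
    raise(SIGUSR1);                    /* signal is generated but blocked */
    held_back = (delivered == 0);      /* handler has not run yet */

    if (sigpending(&pending) < 0) return -1;
    was_pending = (sigismember(&pending, SIGUSR1) == 1);

    /* The pending signal is delivered before this call returns. */
    if (sigprocmask(SIG_UNBLOCK, &mask, NULL) < 0) return -1;

    return held_back && was_pending && delivered == 1;
}
```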

2.2 The mmap Function
One of the useful features of Unix is the ability to map a file into memory and then access the file just as if you were accessing memory. To do this, you use the mmap() function. Its prototype is as follows:

#include <sys/mman.h>

void* mmap(void* addr, size_t len, int prot, int flags, int fd, off_t off);

Don't worry, this function is not as daunting as it looks. First of all, you should be able to tell from the fd parameter that you are going to have to pass it the file descriptor of an open file. This means you have to open a file before you can call this function (there are ways around that but we'll not discuss them here). So here is what the rest of the parameters mean:

On success, mmap() will return the starting address of a memory region that starts on a page boundary. On failure, mmap() returns MAP_FAILED (not NULL) and sets errno. You should always check this return value. If you want more information on mmap(), check out the man pages or look on-line. You will find an example of how to use this function in the next subsection.
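Here is a small sketch of opening a file and mapping it. The helper name map_file is hypothetical, and the choice of PROT_READ | PROT_WRITE with MAP_PRIVATE is an assumption for illustration; your project may need different values. Note how the MAP_FAILED result is checked.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Maps the first len bytes of the file at path as a private,
 * readable and writeable region (hypothetical helper).
 * Returns the mapped address, or NULL if open() or mmap() fails;
 * the MAP_FAILED result of mmap() is translated to NULL here. */
void *map_file(const char *path, size_t len) {
    int fd;
    void *addr;

    assert(path != NULL && len > 0);

    fd = open(path, O_RDONLY);
    if (fd < 0) return NULL;

    /* Passing NULL as addr lets the system choose where to place the mapping. */
    addr = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0);

    close(fd);   /* the mapping remains valid after the descriptor is closed */

    return (addr == MAP_FAILED) ? NULL : addr;
}
```

Because the mapping is private, writes through the returned pointer change only this process's copy and never reach the file. Release the region with munmap() when you are done with it.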

2.3 The mprotect() Function
The mmap() function allows you to declare an initial protection value for a memory region. The mprotect() function allows you to change this value. It is fairly simple to use. Here is the prototype:

#include <sys/mman.h>

int mprotect(void* addr, size_t len, int prot);

The addr is the starting address of the page you want to protect and the len parameter is the number of bytes to protect. The len parameter should be a multiple of the page size of the system. The prot parameter is the new protection for the region. Again, these are PROT_READ, PROT_WRITE, PROT_NONE, and a few others. On success, mprotect() returns 0. On failure, it returns -1 and sets errno appropriately.

The only tricky part about using this is making sure the address you pass is the starting address of a page. Fortunately, this is quite simple. All you have to do is bitwise AND (&) the address with the page mask (PAGEMASK), which clears the offset bits within the page. A full example of how to use mmap(), mprotect(), and signals is provided in the following file:

example.c
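Independently of example.c, the page-alignment step can be sketched as follows. This version builds the mask from sysconf(_SC_PAGESIZE) instead of using the PAGEMASK macro, which is an assumption made so that the sketch does not depend on sys/param.h; the helper names page_start and protect_page are hypothetical.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/* Rounds addr down to the start of the page that contains it. */
void *page_start(const void *addr) {
    long page_size = sysconf(_SC_PAGESIZE);
    uintptr_t mask;

    assert(page_size > 0);
    /* Page sizes are powers of two, so this clears the offset bits. */
    mask = ~((uintptr_t)page_size - 1);
    return (void *)((uintptr_t)addr & mask);
}

/* Changes the protection of the single page that contains addr.
 * Returns 0 on success and -1 on failure, like mprotect(). */
int protect_page(const void *addr, int prot) {
    size_t page_size = (size_t)sysconf(_SC_PAGESIZE);
    return mprotect(page_start(addr), page_size, prot);
}
```

For example, protect_page(ptr, PROT_NONE) makes the whole page around ptr untouchable, and protect_page(ptr, PROT_READ | PROT_WRITE) opens it up again.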

3.0 Project Description
This project can easily be broken up into two separate pieces. First, there is the scheduler part. This will be a round robin scheduler with a fixed quantum. Second, you will need to implement the page replacement code. Before describing these two parts of the OS, let us examine the basic structure of your OS.

3.1 Basic OS Structure
The structure of the OS for this project will be less sophisticated than project 2 because your OS will be single threaded. In other words, you will not be using pthreads at all for this project (insert applause and cheers here).

Basically, all your OS has to do is wait for a request to come from some process. It can do this using the Domain_Read() function from project 2. When it receives a request, it then handles that request and sends a return message to the requesting process. There are only three requests that a process can generate and send to the OS: T_CREATE_PROC, T_TERM_PROC, and T_SEG_FAULT. These stand for create process, terminate process, and segmentation fault, respectively. The purpose of the first two messages should be fairly obvious. The last request, T_SEG_FAULT, is used by the paging part of the OS to determine which pages a process has referenced and when the process tried to access a page that is currently "swapped out".

There is one other event the OS must be able to deal with - the SIGALRM signal. This signal is used to implement the round-robin scheduler (discussed in the next subsection). You must be careful when receiving this signal. Remember from above that a signal can be received at any time. If the function that handles the signal needs to manipulate some data, you need to make sure that some other function was not in the middle of manipulating this data when the signal arrived.

For example, assume the OS has received a create process request from some process and is in the middle of manipulating the runnable queue to add this new process (more on this queue in the next subsection). Now the timer interrupt goes off and the SIGALRM signal is sent to the OS. This would mean that the function that does scheduling would get invoked. This could be bad news if this function also manipulates the runnable queue (which is now in an inconsistent state from the process create operation). To prevent this from happening, you will need to mask the interrupt signal while you are manipulating the runnable queue (or any other data the signal handler uses). Here is a short snippet of code that should hopefully make your life much easier:

sigset_t mask;
sigemptyset(&mask);
// start with no signals blocked
sigprocmask(SIG_SETMASK, &mask, NULL);
// from here on, mask contains only SIGALRM
sigaddset(&mask, SIGALRM);
...
for(;;) {
   if(Domain_Read(sock, addr, buf, bufSize) < bufSize) {
      error();
   }

   // block the SIGALRM signal
   sigprocmask(SIG_BLOCK, &mask, NULL);

   // handle the request
   do_request(buf);

   // unblock the timer
   sigprocmask(SIG_UNBLOCK, &mask, NULL);
}

Of course, your code can be done differently from that and there will probably be more to it, but that is the basic idea. If the timer goes off while the SIGALRM signal is masked, your process will delay handling the signal until it is unmasked. Recall from the previous section that if a signal is masked, it does not mean it is lost. As soon as you unmask it (the last line in the code above), the pending SIGALRM signal will be delivered and the handler for it will be run.

3.2 Basic Process Structure
Like project 2, this project will require you to create a shared library called libOS.so. You will do this in the same manner you did for project 2. However, with the exception of the OS_Init() function, the code in LibOS.c will be much different. It still contains the osErrno variable for defining errors but you will now implement the following functions:

Function: int OS_Init(char* file)
Description: Create a socket for the process and obtain the address of the OS socket. Any other initialization you need to do for your library should also be done in here. If this call fails for any reason, simply return a -1 (osErrno does not get set). If the call is successful, return 0.

Function: int Proc_Create(char* file)
Description: This function will allocate memory for the process and then contact OS to get itself added to the list of runnable processes. The file is a data file that this process needs to map to memory (using mmap). This function should map (NUM_PAGES * PAGESIZE) bytes of the file to memory. If the file is smaller than that, any future reads to a location larger than the file size will simply return 0. This process should then send the starting address of its memory region to the OS along with the process's id number (which can be gotten by calling the getpid() function). The last important thing about this function is that it should not return until the process has been scheduled by the OS. The way to accomplish this is to have the function call pause() after sending a message to the OS. The pause() function blocks a process until it receives a signal, so when the OS sends a SIGCONT signal, the Proc_Create() function will return and the process can continue. If for some reason mmap fails, Proc_Create() should return -1 and set osErrno to E_MEM_ALLOC. If it should fail for any other reason it should return -1 and set osErrno to E_GENERAL.

Function: int Proc_Term()
Description: Send a message to the OS indicating this process has terminated and would like to be removed from the list of active processes. If this call fails because the process is not currently in the list of active processes, return -1 and set osErrno to E_NO_SUCH_PROC. Otherwise return 0.

You will also be required to write a second file called Handlers.c. This file will contain the following functions:

Function: void RemovePage_Handler(int sig);
Description: This should be registered by the process as the signal handler for dealing with the SIGRMV signal. SIGRMV is defined inside Handlers.h and is actually SIGUSR1. The OS generates this signal when some page belonging to this process has been revoked and given to another process. When a process receives this signal, it should make all of the memory mapped by the mmap() call untouchable. It can do this by using the mprotect() function with the PROT_NONE flag. This ensures that any future reference to any page, including the revoked page, goes through the OS first.

Function: void SegFault_Handler(int sig, siginfo_t* sigInfo, void* context);
Description: This should be registered by the process as the signal handler for dealing with the SIGSEGV signal. When a process receives this signal it should send a message to the operating system notifying it of the signal and telling it what address (or page) caused the segmentation fault. The OS will examine its page tables to determine why the fault occurred and will then send back a message to indicate what the fault was all about. There are three possibilities:

  • The process has accessed a piece of memory that is in its address space, is considered in memory by the OS, but has its reference bit cleared. If this is the case, the OS only needs to know about this signal so it can mark the page as referenced. The process should then use the mprotect() function to make this page readable and writeable.
  • The process has accessed a piece of memory that is in its address space but is not considered to be in memory by the OS. In this case, the OS has updated the page tables, maybe picked another page for replacement, and then sends back a message. The process should then use the mprotect() function to make this page readable and writeable.
  • The process has accessed a piece of memory outside its address space. In other words, this is a real segmentation fault (remember, segmentation faults that occur because you actually try to reference an invalid piece of memory also get sent to this handler). If this is the case, you can either catch this error inside Handlers.c or have the OS catch it and tell you about it. Either way, your program should print out an error message and terminate.
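The process-side mechanics of these two handlers can be sketched as follows. This is only an illustration under several assumptions: it maps anonymous memory instead of a data file, it does not talk to the OS at all (a real handler would send the faulting address to the OS and wait for the reply before calling mprotect()), and the names setup_region, revoke_all, and faults_seen are made up. Calling mprotect() from a signal handler is what the project requires, but it is not on the POSIX list of async-signal-safe functions, so treat this as a sketch rather than portable code.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <signal.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

static char *region = NULL;       /* start of the mapped region */
static size_t region_len = 0;     /* its length in bytes */
static size_t page_size = 0;      /* cached so the handler need not look it up */
static volatile sig_atomic_t fault_count = 0;

/* Stand-in for SegFault_Handler: re-enables the page that faulted. */
static void segv_handler(int sig, siginfo_t *info, void *context) {
    char *addr = (char *)info->si_addr;   /* address that caused the fault */
    char *page;

    (void)sig;
    (void)context;
    if (addr < region || addr >= region + region_len) {
        _exit(1);                          /* a real segmentation fault */
    }
    page = (char *)((uintptr_t)addr & ~((uintptr_t)page_size - 1));
    fault_count++;
    if (mprotect(page, page_size, PROT_READ | PROT_WRITE) < 0) _exit(2);
}

/* Maps npages untouchable pages and installs the SIGSEGV handler.
 * Returns the start of the region, or NULL on failure. */
char *setup_region(size_t npages) {
    struct sigaction action;
    void *addr;

    assert(npages > 0);
    page_size = (size_t)sysconf(_SC_PAGESIZE);
    addr = mmap(NULL, npages * page_size, PROT_NONE,
                MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (addr == MAP_FAILED) return NULL;
    region = addr;
    region_len = npages * page_size;

    memset(&action, 0, sizeof action);
    sigemptyset(&action.sa_mask);
    action.sa_sigaction = segv_handler;
    action.sa_flags = SA_RESTART | SA_SIGINFO;
    if (sigaction(SIGSEGV, &action, NULL) < 0) return NULL;
    return region;
}

/* Stand-in for the body of RemovePage_Handler: protect everything again. */
int revoke_all(void) {
    return mprotect(region, region_len, PROT_NONE);
}

/* Number of faults handled so far. */
int faults_seen(void) {
    return fault_count;
}
```

The first touch of each page triggers one fault; later touches of the same page proceed without the handler until revoke_all() is called.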

Any program you write that needs to use LibOS.so should be linked to it just as in project 2. You should also compile Handlers.c into Handlers.o and link any testing process to this object file. In the section on provided materials you will find a very simple test program that will allow you to test your OS.

3.3 The Scheduler
So what does the signal handler for SIGALRM do? Well, it schedules another process to run (if there is another process available to run). Your scheduler will be a simple round-robin scheduler. The quantum is to be passed to the OS through the command line. To implement the scheduler, you will need to set the timer and then catch the signal generated when this timer expires. To set the timer, you will use the ualarm() function. The prototype for this function is as follows:

#include <unistd.h>

useconds_t ualarm(useconds_t useconds, useconds_t interval);

The useconds_t type is simply an unsigned integer. The useconds argument indicates how many microseconds the timer should be set to. The timer then ticks backwards until it reaches zero. At this point, it sends the SIGALRM signal to your process (which you will need to catch). You should be able to always set the second argument, interval, to zero. For your own knowledge, though, this argument would be set if you wanted the timer to go off at set intervals after the first useconds have passed. Setting it to zero simply means that the timer will only go off once - after useconds. If you want the timer to go off again, you will have to make a second call to ualarm().
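A one-shot timer built on ualarm() might look like the sketch below. The function names install_timer_handler and wait_one_quantum are hypothetical, and waiting with sigsuspend() is just one way to show the signal arriving; your OS will be blocked in Domain_Read() instead.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <signal.h>
#include <string.h>
#include <unistd.h>

/* Number of SIGALRM signals handled so far. */
static volatile sig_atomic_t ticks = 0;

static void alarm_handler(int sig) {
    (void)sig;
    ticks++;
}

/* Installs the SIGALRM handler. Returns 0 on success, -1 on failure. */
int install_timer_handler(void) {
    struct sigaction action;

    memset(&action, 0, sizeof action);
    sigemptyset(&action.sa_mask);
    action.sa_handler = alarm_handler;
    action.sa_flags = SA_RESTART;
    return sigaction(SIGALRM, &action, NULL);
}

/* Arms a one-shot timer for quantum_us microseconds and sleeps until the
 * resulting SIGALRM has been handled. Returns the total tick count. */
int wait_one_quantum(useconds_t quantum_us) {
    sigset_t block, old, wait_mask;
    int before;

    assert(quantum_us > 0 && quantum_us < 1000000);

    /* Block SIGALRM first so it cannot slip in between the check and the wait. */
    sigemptyset(&block);
    sigaddset(&block, SIGALRM);
    sigprocmask(SIG_BLOCK, &block, &old);

    wait_mask = old;
    sigdelset(&wait_mask, SIGALRM);

    before = ticks;
    ualarm(quantum_us, 0);            /* interval of 0: fire only once */
    while (ticks == before) {
        sigsuspend(&wait_mask);       /* atomically unblock SIGALRM and wait */
    }

    sigprocmask(SIG_SETMASK, &old, NULL);
    return ticks;
}
```

Each call to wait_one_quantum() re-arms the timer, which mirrors the point above that a second call to ualarm() is needed for a second expiry.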

To make this work, you will have to write a function to deal with the timer signal. You will then have to register this function with the OS (as described above) so that it gets invoked whenever there is a timer interrupt. Inside this function (or other functions that it calls) you will need to stop the current running process and start a new one running. To stop a process, simply send it the SIGSTOP signal. To restart it, send it the SIGCONT signal. You do not need to define a special signal handler for either of these two signals - the defaults provided by Unix will work just fine. To send a signal to a process, use the sigsend() function (check it out in the man pages).
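As a rough sketch of stopping and restarting a process, the code below uses kill(), the standard POSIX call for sending a signal to a process, in place of the sigsend() function named above. All helper names are made up for this example. The wait_until_stopped() helper relies on waitpid(), which only works when the target is a child of the caller, so it is useful for experiments but not for the project's OS, whose processes are not its children.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Suspends pid. Returns 0 on success, -1 on failure. */
int stop_process(pid_t pid) {
    assert(pid > 0);
    return kill(pid, SIGSTOP);
}

/* Lets pid run again. Returns 0 on success, -1 on failure. */
int resume_process(pid_t pid) {
    assert(pid > 0);
    return kill(pid, SIGCONT);
}

/* Context switch: stop the process that was running, start the next one.
 * Pass 0 as current when nothing is running yet. */
int switch_to(pid_t current, pid_t next) {
    if (current > 0 && stop_process(current) < 0) return -1;
    return resume_process(next);
}

/* Demonstration helper: waits until the child pid is reported as stopped.
 * Only valid when pid is a child of the calling process. */
int wait_until_stopped(pid_t pid) {
    int status;

    if (waitpid(pid, &status, WUNTRACED) != pid) return -1;
    return WIFSTOPPED(status) ? 0 : -1;
}
```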

How does the OS know which process is next to run? All you have to do here is keep a simple linked list of processes - a runnable queue. Any time a new process is created, it gets added to the tail of the queue. Any time a process gets stopped, it is placed at the tail of the queue. Whenever you want to run a new process, grab the head of the queue. Whenever a process terminates, remove it from the queue altogether.
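One possible shape for the runnable queue is sketched below: a singly linked list with head and tail pointers. The type and function names (RunQueue, rq_enqueue, and so on) are assumptions for illustration; your own design may store more per-process state than just the pid.

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/types.h>

typedef struct Node {
    pid_t pid;
    struct Node *next;
} Node;

typedef struct {
    Node *head;   /* next process to run */
    Node *tail;   /* most recently added or stopped process */
} RunQueue;

void rq_init(RunQueue *q) {
    assert(q != NULL);
    q->head = NULL;
    q->tail = NULL;
}

/* Adds pid at the tail. Returns 0 on success, -1 if out of memory. */
int rq_enqueue(RunQueue *q, pid_t pid) {
    Node *n = malloc(sizeof *n);

    if (n == NULL) return -1;
    n->pid = pid;
    n->next = NULL;
    if (q->tail != NULL) q->tail->next = n;
    else q->head = n;
    q->tail = n;
    return 0;
}

/* Removes and returns the pid at the head, or -1 if the queue is empty. */
pid_t rq_dequeue(RunQueue *q) {
    Node *n = q->head;
    pid_t pid;

    if (n == NULL) return -1;
    pid = n->pid;
    q->head = n->next;
    if (q->head == NULL) q->tail = NULL;
    free(n);
    return pid;
}

/* Removes pid wherever it sits (used when a process terminates).
 * Returns 0 if it was found and removed, -1 otherwise. */
int rq_remove(RunQueue *q, pid_t pid) {
    Node *prev = NULL;

    for (Node *n = q->head; n != NULL; prev = n, n = n->next) {
        if (n->pid != pid) continue;
        if (prev != NULL) prev->next = n->next;
        else q->head = n->next;
        if (q->tail == n) q->tail = prev;
        free(n);
        return 0;
    }
    return -1;
}
```

Remember that any code touching this queue outside the SIGALRM handler should run with SIGALRM blocked, as described in section 3.1.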

3.4 The Page Replacer
This will be by far the most substantial piece of code for this project. First of all, you are required to have one page table for each process. The format of this page table is completely up to you and it should be created when the LibOS function Proc_Create contacts the OS. The OS should create a page table that has NUM_PAGES entries. Every process should have its own page table. This means that the OS is going to have to keep track of which page table it should currently use. You need to be careful, however, because on a page fault you are going to have to find a page to replace based on all the pages from all the processes currently in memory.

On a page fault you are required to find the page to replace using the Clock algorithm (Second Chance algorithm) discussed in class. The following is a very brief review of how the Clock algorithm works:

  1. Accept a page fault.
  2. Check the reference bit of the page currently being pointed to by the "hand". The hand should be pointing to the first page after the page the last search replaced. Also, make sure the page is actually in memory - it doesn't do any good to decide to replace a page that is currently out to disk.
  3. If the reference bit is zero, replace this page and make the hand point to the next page in memory. This is where the next search will start.
  4. If the reference bit is one, set it to zero and go on to the next page.
  5. Repeat steps 2, 3, and 4 until a page with a zero reference bit is found or until all the pages in memory have been checked (every page had been referenced since the last page fault). In this latter case, replace the first page you examined.
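The steps above can be sketched in C as follows. This is a simplified illustration, not a required design: it assumes all pages from all processes are kept in one flat array, and the names PageEntry and clock_pick_victim are made up. Sweeping the array at most twice covers step 5, because after one full sweep every reference bit has been cleared and the first page examined is then chosen.

```c
#include <assert.h>
#include <stddef.h>

typedef struct {
    int in_memory;    /* 1 if the page is currently considered in memory */
    int referenced;   /* reference bit */
} PageEntry;

/* Picks a page to replace among pages[0..n-1], starting at *hand.
 * Returns the index of the victim and advances *hand to the page after it,
 * or returns -1 if no page is in memory. The victim is marked as no longer
 * in memory. */
int clock_pick_victim(PageEntry *pages, size_t n, size_t *hand) {
    assert(pages != NULL && hand != NULL);
    if (n == 0) return -1;
    assert(*hand < n);

    for (size_t step = 0; step < 2 * n; step++) {
        size_t i = (*hand + step) % n;

        if (!pages[i].in_memory) continue;   /* cannot replace a page that is out */
        if (pages[i].referenced) {
            pages[i].referenced = 0;          /* give it a second chance */
            continue;
        }
        pages[i].in_memory = 0;               /* replace this page */
        *hand = (i + 1) % n;                  /* next search starts after it */
        return (int)i;
    }
    return -1;
}
```

After choosing a victim, the OS would still need to notify the owning process (for example with SIGRMV) and update its page tables; that part depends on your own data structures.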

To implement the above algorithm, the OS is going to need a few data structures and some state information. How you do this is completely up to you but here are a few suggestions (you do not have to follow these if you don't want to).

Keep in mind that the OS and the other processes share no memory. Therefore, if the OS wants to have a process activate some frame (or deactivate it), it needs to send the process the starting address of that page and then the process needs to do the work itself. There are a few useful macros defined in sys/param.h that can be used by your processes. These are:

4.0 Grading
No late assignments will be accepted. This project is due on Tuesday, April 16 at 11:59 PM.

This assignment will be graded based on correctness of implementation as well as robustness. This means your program should work under all the test cases all the time. Programs that only partially work or fail intermittently will be penalized. The following is a breakdown of the grading for this project:

Requirement Points
compilation of non-trivial program 10
Sending and Catching Signal 10
Creating processes at the OS 5
Terminating processes at the OS 5
Scheduling processes 20
Mapping a file to memory(mmap) 5
Changing access permissions to memory region(mprotect) 5
Setting and clearing reference bits correctly (at OS and at the process) 10
Implementing the Clock Algorithm 30

If you do not have a fully functional program, it is your responsibility to be able to quickly and efficiently show which of the above functionality is working properly. For example, to show that you are creating and terminating processes correctly at the OS, you could print out the entire runnable queue every time a new process enters or leaves the system.

5.0 Program Design and Implementation
Before writing a single line of code, both partners should sit down together and design the entire system. This is so important that it will be repeated - this time in italics. Before writing a single line of code, both partners should sit down together and design the entire system.

Now that we have that out of the way, here are some suggestions on how to approach this - with a little time line attached.

Date What should be done.
Tuesday, March 18 Read the entire project description at least once.
Thursday, March 20 Meet with partner at least once to come up with some preliminary designs. At this point, both partners should be looking at how all the major components should work and fit together. This should be from a very high level. Don't worry about the tiny details here. Some people feel that doing a flow chart at this stage (showing how the different components are to interact) is very helpful in doing the next step.
Friday, March 21 Decide on which partner should do the detailed designs for which parts. Some good suggestions at this point are for one person to analyze how the scheduling mechanisms will work and for another to look at the details in the paging mechanisms. There is a little cross-over here so make sure you talk to your partner regularly to stay on the same page with each other.
Thursday, April 4 Completed detailed design. Both partners should have a written description of all the functions they see being needed. This description should also include pseudo-code describing what each function should do. Remember, no code has been written at this point - you are designing the system first! When both people have their halves done, the partners should sit down and review each other's work, make suggestions, corrections, modifications, etc. Of course, if you've been in touch with each other throughout, this part goes much more smoothly - there are few surprises in store for your partner.
Friday, April 5 Start coding. If you've really taken the time before this to design a good system, you will be amazed at how easily this step goes. Of course you will run into a few snags but they should be few and quickly dispatched. Believe it or not, many professional software developers will tell you that this is the easiest stage in any software development. The only reason for that, however, is because they do their homework up front designing the system.
Tuesday, April 16 Hand it in.

There are three major points about the above strategy. Number one, stay in touch with your partner. Do not divide the work on March 18 and then speak to each other on April 9. Stay in touch. Number two, work hard to develop a good design before writing any code. It is very hard to over-emphasize this point (although I'm trying hard). And lastly, get started early. Don't wait until the last week. We give you three weeks to do these projects for a reason - they take that long.

6.0 Provided Materials
The following files have been provided here for you. The first four are most definitely required for your project. The example.c program shows how to use signals, mmap(), and mprotect(). The random-10000.dat file can be used as an argument to pass to the example program after it has been compiled. To download any of these, simply right mouse click on the file name and select "save as" from the popup menu.

Domain.c
Domain.h
LibOS.c
LibOS.h
Handlers.c
Handlers.h
example.c
random-10000.dat

Process.c (testing program)

7.0 Handing in Your Project
The directory for handing in your program can be found at:

~cs537-2/public/section2/(username)/p3

where (username) is your login. You only need to put copies of your code into one partner's handin directory.

You should only hand in the files that you created and/or modified. This would include os.c, LibOS.c, etc. However, you should not submit Domain.h, Domain.c, LibOS.h, or any of the test programs we provide for you. We will copy these into your directory at compile time. You should also submit the Makefile needed to build your program. Lastly, don't forget to hand in a README file that indicates how to run your program, known bugs, the names of both partners, and any other information that is important to running your program.