Homework 3: From Crawling To Walking
If you have a question, just send email to email@example.com and we'll try to get back to you quickly. Don't worry, it's just to the TAs and professor.
How to send a good email: Put a lot of information in it! For example, cut and paste what you typed on the screen, and what was printed on the screen as a result. Don't just say that something didn't work! We know it didn't work already; no one sends mail saying that it's all going great.
If your program is taking too long on some of the measurements, it is OK to use a smaller data set size (i.e., one that takes tens of seconds or a few minutes).
If malloc() returns NULL, it is OK to just halt the program. Usually, this is achieved by using assert(), e.g., if the return value from malloc() is placed into variable p, just put assert(p != NULL); in your code. A more general list implementation might return -1 on error and 0 when everything is fine, but we're not doing that here.
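As a minimal sketch of the assert() idiom described above (the helper name alloc_ints is made up for illustration and is not part of list.h):

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: allocate room for n ints and halt the
 * program if malloc() fails. */
int *alloc_ints(size_t n) {
    int *p = malloc(n * sizeof(int));
    assert(p != NULL);   /* aborts with a message if p is NULL */
    return p;
}
```

Keep in mind that assert() is compiled out when NDEBUG is defined, so this check only fires in builds that leave NDEBUG undefined.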
Example makefile to build liblist.a here.
Relevant Book Chapters
Relevant reading is all from K+R, particularly Chapters 5 and 6 (Pointers and Arrays, Structures).
The main purpose of this project is for you to write more C, gaining familiarity with basic libraries and simple data structures. You should also gain some experience with caring about the performance of the code you write via timing.
Due Date: Sunday Feb 26th at some time
Part 1: An Alternative Linked List Library
As you know, a linked list is a basic structure for storing data within programs. In class, we saw one way to construct such a list: every time a new node is inserted, call malloc() to allocate space for a new struct, and then link said struct into the list (either at the beginning, or the end, or in order, for example).
In this part of the project, you will implement the linked list in an alternative manner, using arrays. Specifically, your list will initially allocate some space for contents of the list as an array of node_t structures; then, subsequent inserts, deletes, etc., will simply move elements around the array as need be to function as a list. The header file you should use is here and shows what basic data structures you should use as well as which functions you need to implement.
The way this would work: assume there is a chunk size. Your array can grow and shrink by chunk_size * sizeof(node_t). When the list is initialized, allocate this much space and use it to hold contents of your list. A list insert, then, would just fill the 0th entry of the array. Subsequent inserts (assuming an insert at end) would fill up slots 1 ... chunk_size-1. At this point, however, the array is full, so what happens when the next insert takes place?
In this case, what you should do is make more space in your array. There are two ways to do this. The simplest would be to call realloc() with the next bigger size, 2 * chunk_size * sizeof(node_t); realloc() takes your existing chunk of heap space and either grows your allocation in place or finds a new space, copies the data from the original array there, and frees the old space. Read the man page (e.g., type man realloc) to learn more.
The other option would be to simply call malloc() to get a bigger space, copy the existing array into that space, and then free() the original array. Either way is fine, but realloc() is likely to be more efficient. In this manner, your array will grow in chunks as more data is inserted.
Note that the chunk_size parameter is set when you call list_init().
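The growth step might look roughly like the sketch below. The struct layouts and the names sketch_list_t, sketch_init, and sketch_insert_end are assumptions made for illustration; the real node_t, list_t, and function prototypes are the ones in list.h.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical layouts; use the definitions from list.h in your code. */
typedef struct { int key; void *value; } node_t;
typedef struct {
    node_t *nodes;     /* heap-allocated array holding the list */
    int count;         /* slots currently in use */
    int capacity;      /* slots currently allocated */
    int chunk_size;    /* growth increment, fixed at init time */
} sketch_list_t;

/* Allocate the first chunk. */
void sketch_init(sketch_list_t *l, int chunk_size) {
    assert(chunk_size > 0);
    l->nodes = malloc((size_t)chunk_size * sizeof(node_t));
    assert(l->nodes != NULL);
    l->count = 0;
    l->capacity = chunk_size;
    l->chunk_size = chunk_size;
}

/* Insert at the end, growing by one chunk when the array is full. */
void sketch_insert_end(sketch_list_t *l, int key, void *value) {
    if (l->count == l->capacity) {
        int new_capacity = l->capacity + l->chunk_size;
        /* Assign to a temporary so the old pointer is not lost
         * if realloc() fails. */
        node_t *bigger = realloc(l->nodes,
                                 (size_t)new_capacity * sizeof(node_t));
        assert(bigger != NULL);
        l->nodes = bigger;
        l->capacity = new_capacity;
    }
    l->nodes[l->count].key = key;
    l->nodes[l->count].value = value;
    l->count++;
}
```

Because realloc() may move the array, any pointer into the old array must be treated as invalid after the call.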
Of course, you have to think about how to implement other operations. For example, how do you insert at the front of the list in an array? Or, how do you keep the array ordered? These operations require a bunch of shuffling around of the contents of the array.
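One possible way to do the shuffling is with memmove(), which is safe for overlapping regions. This is only a sketch: the node_t layout and the function names are hypothetical, and the caller is assumed to have already made sure the array has at least one free slot.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical layout; use the definition from list.h in your code. */
typedef struct { int key; void *value; } node_t;

/* Insert at index pos, shifting nodes[pos..count-1] one slot right.
 * Assumes the array has room for count + 1 nodes.
 * Returns the new element count. */
int sketch_insert_at(node_t *nodes, int count, int pos,
                     int key, void *value) {
    assert(pos >= 0 && pos <= count);
    memmove(&nodes[pos + 1], &nodes[pos],
            (size_t)(count - pos) * sizeof(node_t));
    nodes[pos].key = key;
    nodes[pos].value = value;
    return count + 1;
}

/* Ordered insert: place the new node before the first larger key. */
int sketch_insert_ordered(node_t *nodes, int count,
                          int key, void *value) {
    int pos = 0;
    while (pos < count && nodes[pos].key <= key)
        pos++;
    return sketch_insert_at(nodes, count, pos, key, value);
}
```

Inserting at the front is then sketch_insert_at() with pos set to 0, which shifts every existing element; that cost is worth keeping in mind when you look at your timing results.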
You may also notice that, in all the list operations in list.h, when you put something into the list you specify not only a key (an integer) but also a value (which is of type void*). The void* value just allows the user of the list to insert any arbitrary pointer to something into the list to store along with the key; this is C's way of making a generic data type. You don't have to worry about what is in there; you just have to store and retrieve it as the interface demands. For example, the insert functions will put the key and value into a node_t and store it in the array; a subsequent lookup of that key will return the value (void*) associated with it.
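To illustrate how an opaque void* travels through the list, here is a small sketch of a lookup. The node_t layout and the name sketch_lookup are assumptions, and returning NULL for a missing key is just one possible convention; follow whatever list.h specifies.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical layout; use the definition from list.h in your code. */
typedef struct { int key; void *value; } node_t;

/* Linear search for key. Returns the stored pointer untouched,
 * or NULL if the key is absent (an assumed convention). */
void *sketch_lookup(const node_t *nodes, int count, int key) {
    assert(count >= 0);
    for (int i = 0; i < count; i++) {
        if (nodes[i].key == key)
            return nodes[i].value;
    }
    return NULL;
}
```

The list never dereferences the value. Only the caller knows the real type, so the caller casts the result back, e.g. `double *d = (double *) sketch_lookup(nodes, count, 1);`.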
Don't forget to also shrink your array as elements are deleted. Specifically, if you grow the array chunk_size at a time, you should also free space chunk_size at a time. Thus, if chunk_size were set to 100, your initial array would be size 100. When the 101st element is inserted, your array is grown to size 200. And so forth. However, when an element is deleted, you may have to shrink the array. For example, if the 101st element is deleted and the array now contains 100 elements, the array size should now be 100 (not 200). Note that the array should always hold at least one chunk (and thus is never zero-sized).
To implement this, you should use realloc() again (or malloc(), copy, and free(), if you prefer).
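The shrinking rule can be sketched as follows: round the element count up to a whole number of chunks, never go below one chunk, and call realloc() only when that target is smaller than the current allocation. The node_t layout and both function names are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical layout; use the definition from list.h in your code. */
typedef struct { int key; void *value; } node_t;

/* Capacity (in nodes) needed for count elements: a whole number
 * of chunks, and never fewer than one chunk. */
int sketch_needed_capacity(int count, int chunk_size) {
    assert(chunk_size > 0 && count >= 0);
    int chunks = (count + chunk_size - 1) / chunk_size;  /* round up */
    if (chunks < 1)
        chunks = 1;
    return chunks * chunk_size;
}

/* Call after a delete. Returns the (possibly moved) array and
 * updates *capacity if the array was shrunk. */
node_t *sketch_shrink(node_t *nodes, int count, int chunk_size,
                      int *capacity) {
    int needed = sketch_needed_capacity(count, chunk_size);
    if (needed < *capacity) {
        node_t *smaller = realloc(nodes,
                                  (size_t)needed * sizeof(node_t));
        assert(smaller != NULL);
        nodes = smaller;
        *capacity = needed;
    }
    return nodes;
}
```

With chunk_size set to 100, this gives a capacity of 200 for 101 elements, 100 for 100 elements, and 100 for an empty list, matching the example above.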
In building this list, you should keep the function prototypes in list.h unchanged (although it is OK to add more information to the list_t structure). You should put your C code into a list.c file, and then compile your list into a statically-linked library called liblist.a as we did in class. You should also provide a makefile called makefile that has the rules in it needed to build your library. Finally, you should build your code with optimization turned on, i.e., use the -O (dash capital O) flag.
Some Other Details
Here are some other details relevant to your list:
Part 2: Timing Your List
The last part of this project is to use timing to measure the performance of your list, as compared to the standard malloc-based linked list we saw in class. Thus, you'll have to get that working as well to perform these experiments.
In this part of the project, you should create the following graphs, with the names shown in italics:
Each of these graphs should be a single PDF that shows clearly labeled X and Y axes.
For this part of the project, you will have to use gettimeofday() in a main program to time the thing you're trying to time, and random() to generate random keys to search for in the last part.
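A minimal sketch of timing with gettimeofday() is shown below. The helper names sketch_stamp and sketch_elapsed_seconds are made up for illustration; only gettimeofday() and struct timeval come from the system headers.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/time.h>

/* Take a wall-clock timestamp, halting if gettimeofday() fails. */
struct timeval sketch_stamp(void) {
    struct timeval tv;
    int rc = gettimeofday(&tv, NULL);
    assert(rc == 0);
    (void)rc;   /* avoid an unused-variable warning under NDEBUG */
    return tv;
}

/* Seconds between two timestamps, combining the seconds field
 * with the microseconds field. */
double sketch_elapsed_seconds(struct timeval start, struct timeval end) {
    return (double)(end.tv_sec - start.tv_sec)
         + (double)(end.tv_usec - start.tv_usec) / 1e6;
}

/* Typical use inside main():
 *
 *     struct timeval start = sketch_stamp();
 *     ... the operations being measured ...
 *     struct timeval end = sketch_stamp();
 *     printf("%f\n", sketch_elapsed_seconds(start, end));
 */
```

A single list operation is usually too fast to measure reliably this way, so it tends to work better to time a whole batch of operations between the two timestamps.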
Knowing how to write a little shell script may be very useful for this part of the project; spend some time learning how to write a csh or bash script, or even Python, to launch a bunch of experiments and gather the results.
You should also think about this: what did you expect from the results? What surprised you? Do you have any explanation for the relative performance of these two different approaches? Put your thoughts into a README file for this project. Did the timing help you decide on what chunk size your list should grow and shrink by?
An important part of programming is testing your code to make sure it works. This means writing more code usually!
Write a main.c (much like we did in class) that can be used to test your library. Make it easy to insert and delete and push through all the corner cases you can think of.
For this project, you must use flags -Wall and -Werror when compiling; failure to do so will be a problem, so don't forget!
Handing It In
To hand in these programs, you just have to put the C source code files and makefiles and graphs into your handin directory, under the subdirectory hw3/ (naturally).
At the end of putting everything into your directory, you should check that all the files are there:
prompt> ls -l ~cs354-3/handin/remzi/hw3/
The program ls lists files in a directory and thus should show all the above source files therein.