Project 3: Malloc and Free

Important Dates

Questions about the project? Send them to 537-help@cs.wisc.edu.

Final Deadline: Friday, October 17th @ Whatever time works for you.

Clarifications

10/16: Grading standard for Performance: Please go to ~cs537-1/public/MemoryAllocator/ and read the READTHISFIRST.TXT file first. This page now includes the performance score you will get. Full performance score is 40.

10/16: Grading standard for Correctness: Here is the correctness standard. You now know how to get the full correctness score (i.e. 60).

10/14: Performance code: The performance code is now available. Please go to ~cs537-1/public/MemoryAllocator/. Results are available here.

10/13: Mem_Init return values: Return 0 on success (when the call to mmap succeeds). Otherwise, return -1. Some cases where Mem_Init should return a failure: Mem_Init is called again after a successful call, sizeOfRegion is 0 or not larger than your free block header, sizeOfRegion is negative, etc.
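
The failure cases listed above could be checked before mmap is ever called. The following is only a sketch: region_initialized, validate_init, and the header_size parameter are hypothetical names that are not part of mem.h, and your own Mem_Init may organize these checks differently.

```c
#include <stddef.h>

/* Hypothetical flag: set to 1 once a region has been mapped successfully. */
static int region_initialized = 0;

/* Return 0 if Mem_Init may go on to call mmap, -1 otherwise.
   header_size stands in for the size of your free block header. */
static int validate_init(int sizeOfRegion, size_t header_size)
{
    if (region_initialized)
        return -1;   /* Mem_Init was already called successfully */
    if (sizeOfRegion <= 0)
        return -1;   /* zero or negative size */
    if ((size_t)sizeOfRegion <= header_size)
        return -1;   /* no room for anything beyond the header */
    return 0;
}
```

In this sketch, Mem_Init would set region_initialized to 1 only after mmap succeeds, so that a failed first call can still be retried.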

10/8: 4-byte aligned. For better performance, Mem_Alloc() should return 4-byte aligned chunks of memory. For example, if a user allocates 1 byte of memory, your Mem_Alloc() implementation should return 4 bytes of memory so that the next free block will be 4-byte aligned too. To check whether you return 4-byte aligned pointers, you could print the pointer this way: printf("%08x", ptr). The last hex digit should be a multiple of 4 (i.e. 0, 4, 8, or c). For example, this is okay: b7b2c04c, and this is not okay: b7b2c043.
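
One way to get this behavior is to round every request up to a multiple of 4 before searching the free list. The helpers below are a minimal sketch; align4 and is_aligned4 are hypothetical names, and they take size_t and a pointer rather than the int used by Mem_Alloc.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Round a requested size up to the next multiple of 4 bytes. */
static size_t align4(size_t size)
{
    assert(size <= SIZE_MAX - 3);   /* guard against overflow in the addition */
    return (size + 3u) & ~(size_t)3u;
}

/* Return 1 if ptr is 4-byte aligned, 0 otherwise. */
static int is_aligned4(const void *ptr)
{
    return ((uintptr_t)ptr & 3u) == 0;
}
```

Checking is_aligned4 on every pointer that Mem_Alloc returns in your own test program is a quick alternative to reading printed addresses by eye.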

Objectives

There are three objectives to this assignment:

  • To understand the nuances of building a memory allocator.
  • To understand the art of performance tuning for different workloads.
  • To create a shared library.

Overview

In this project, you will be implementing a memory allocator for the heap of a user-level process. Your task is to build your own versions of malloc() and free().

Memory allocators have two distinct tasks. First, the memory allocator asks the operating system to expand the heap portion of the process's address space by calling either sbrk or mmap. Second, the memory allocator doles out this memory to the calling process. This involves managing a free list of memory and finding a contiguous chunk of memory that is large enough for the user's request; when the user later frees memory, it is added back to this list.

This memory allocator is usually provided as part of a standard library and is not part of the OS. To be clear, the memory allocator operates entirely within the address space of a single process and knows nothing about which physical pages have been allocated to this process or the mapping from logical addresses to physical addresses.

When implementing this basic functionality in your project, we have a few guidelines. First, when requesting memory from the OS, you must use mmap() (which we think is easier to use than sbrk()). Second, although a real memory allocator requests more memory from the OS whenever it can't satisfy a request from the user, your memory allocator must call mmap only one time (when it is first initialized). Third, you are free to use any data structures you want to manage the free list as well as any policy for choosing a chunk of memory. Note that you will be graded partially on performance in this project, so think about these aspects very carefully.
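
To make the idea of a free list concrete, here is one possible layout together with a first-fit search. This is a sketch of one policy among many, not a required design: the free_block struct and find_first_fit are hypothetical names, and a real allocator would also need to split and coalesce blocks.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical header stored at the start of every free block. */
typedef struct free_block {
    size_t size;               /* usable bytes that follow this header */
    struct free_block *next;   /* next free block, or NULL at the end */
} free_block;

/* First fit: return the first free block with at least `size` usable
   bytes, or NULL if no block is large enough. The list is not modified. */
static free_block *find_first_fit(free_block *head, size_t size)
{
    assert(size > 0);   /* this sketch treats a zero-byte request as a caller error */
    for (free_block *b = head; b != NULL; b = b->next) {
        if (b->size >= size)
            return b;
    }
    return NULL;
}
```

First fit is simple and fast on short lists; best fit or segregated lists trade more bookkeeping for less fragmentation, which may matter for the performance portion of the grade.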

Classic malloc() and free() are defined as follows:

  • void *malloc(size_t size): malloc() allocates size bytes and returns a pointer to the allocated memory. The memory is not cleared.
  • void free(void *ptr): free() frees the memory space pointed to by ptr, which must have been returned by a previous call to malloc() (or calloc() or realloc()). Otherwise, or if free(ptr) has already been called before, undefined behaviour occurs. If ptr is NULL, no operation is performed.

For simplicity, your implementations of Mem_Alloc(int size) and Mem_Free(void *ptr) should basically follow what malloc() and free() do; see below for details.

You will also provide a supporting function, Mem_Dump(), described below; this routine simply prints which regions are currently free and should be used by you for debugging purposes.
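
A debugging routine of this kind can be a simple walk over the free list. The sketch below assumes a singly linked list of free blocks; the free_block struct and dump_free_list are hypothetical, and the output format is only a suggestion since Mem_Dump() is for your own use.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical header stored at the start of every free block. */
typedef struct free_block {
    size_t size;               /* usable bytes that follow this header */
    struct free_block *next;   /* next free block, or NULL at the end */
} free_block;

/* Print one line per free region and return how many were printed. */
static int dump_free_list(free_block *head)
{
    int count = 0;
    for (free_block *b = head; b != NULL; b = b->next) {
        printf("free block %d: addr=%p size=%zu\n", count, (void *)b, b->size);
        count++;
    }
    return count;
}
```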

Program Specifications

For this project, you will be implementing several different routines as part of a shared library. Note that you will not be writing a main() routine for the code that you hand in (but you should implement one for your own testing). We have provided the prototypes for these functions in the file mem.h (which is available at ~cs537-1/public/mem.h); you should include this header file in your code to ensure that you are adhering to the specification exactly. You should not change mem.h in any way! We now define each of these routines more precisely.

  • int Mem_Init(int sizeOfRegion): Mem_Init is called one time by a process using your routines. sizeOfRegion is the number of bytes that you should request from the OS using mmap. Note that you may need to round up this amount so that you request memory in units of the page size (see the man pages for getpagesize()). Note that you need to use this allocated memory for your own data structures as well; that is, your infrastructure for tracking the mapping from addresses to memory objects has to be placed in this region as well. If you call malloc(), or any other related function, in any of your routines, we will deduct a significant number of points. Similarly, you should not allocate global arrays! However, you may allocate a few global variables (e.g., a pointer to the head of your free list.)
  • void *Mem_Alloc(int size): Mem_Alloc() is similar to the library function malloc(). Mem_Alloc takes as input the size in bytes of the object to be allocated and returns a pointer to the start of that object. The function returns NULL if there is not enough contiguous free space within sizeOfRegion allocated by Mem_Init to satisfy this request.
  • int Mem_Free(void *ptr): Mem_Free frees the memory object that ptr points to. The function returns 0 on success and -1 if ptr was not allocated by Mem_Alloc(). Just like with the standard free(), if ptr is NULL, then no operation is performed; in that case, also return -1.
  • void Mem_Dump(): This is just a debugging routine for your own use. Have it print the regions of free memory to the screen.
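
The rounding mentioned for Mem_Init can be done with integer arithmetic. This is a minimal sketch: round_up_to_multiple and round_up_to_pages are hypothetical helper names, and the size is assumed to be positive and small enough that the addition does not overflow.

```c
#include <assert.h>
#include <stddef.h>
#include <unistd.h>

/* Round size up to the next multiple of unit (unit must be nonzero). */
static size_t round_up_to_multiple(size_t size, size_t unit)
{
    assert(unit > 0);
    return ((size + unit - 1) / unit) * unit;
}

/* Round a region size up to a whole number of pages. */
static size_t round_up_to_pages(size_t size)
{
    return round_up_to_multiple(size, (size_t)getpagesize());
}
```

With a 4096-byte page, a request for 4097 bytes would be rounded up to 8192 bytes before being passed to mmap.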

You must provide these routines in a shared library named "libmem.so". Placing the routines in a shared library instead of a simple object file makes it easier for other programmers to link with your code. There are further advantages to shared (dynamic) libraries over static libraries. When you link with a static library, the code for the entire library is merged with your object code to create your executable; if you link to many static libraries, your executable will be enormous. However, when you link to a shared library, the library's code is not merged with your program's object code; instead, a small amount of stub code is inserted into your object code and the stub code finds and invokes the library code when you execute the program. Therefore, shared libraries have two advantages: they lead to smaller executables and they enable users to use the most recent version of the library at run-time. To create a shared library named libmem.so, use the following commands (assuming your library code is in a single file "mem.c"):

gcc -c -fpic mem.c
gcc -shared -o libmem.so mem.o

To link with this library, you simply specify the base name of the library with "-lmem" and the path so that the linker can find the library "-L.".

gcc mymain.c -lmem -L. -o myprogram

Of course, these commands should be placed in a Makefile. Before you run "myprogram", you will need to set the environment variable, LD_LIBRARY_PATH, so that the system can find your library at run-time. Assuming you always run myprogram from this same directory, you can use the command:

setenv LD_LIBRARY_PATH ${LD_LIBRARY_PATH}:.

If the setenv command returns an error "LD_LIBRARY_PATH: Undefined variable", do not panic. The error implies that your shell has not defined the environment variable. In this case, you simply need to run:

setenv LD_LIBRARY_PATH .

Unix Hints

We are providing you with a lot of flexibility in how you implement this project. In particular, you can choose any suitable data structure for tracking memory and any policy for choosing the memory allocated to each request. The one place where you do not have any flexibility is this: you must use mmap() for allocating more space on the heap for this process. In this project, you will use mmap to map zero'd pages (i.e., allocate new pages) into the address space of the calling process. Note there are a number of different ways that you can call mmap to achieve this same goal; we give one working example here:

// open the /dev/zero device
int fd = open("/dev/zero", O_RDWR);
if (fd < 0) {
    perror("open");
    return -1;
}

// size (in bytes) needs to be evenly divisible by the page size
void *ptr = mmap(NULL, sizeOfRegion, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0);
if (ptr == MAP_FAILED) {
    perror("mmap");
    close(fd);
    return -1;
}

// close the device (don't worry, mapping should be unaffected)
close(fd);
return 0;

Grading

Your implementation will be graded along two main axes:

  • Functionality: Approximately 60% of your project grade will be devoted to how well your implementation matches the specification above; that is, this part of your grade depends upon correctly implementing the specified functions.
  • Performance: Approximately 40% of your project grade will be devoted to the quality of your implementation on a range of workloads. By quality, we mean both how quickly your routines execute and how well you minimize fragmentation and space overhead (i.e., are able to satisfy requests from the user). Therefore, you will want to think about how to optimize for various workloads from the very beginning of your design process. When grading quality, your implementation will be compared directly against others in the class.

We will provide you with sample workloads so that you can optimize your memory allocator appropriately. Note that you should feel free to implement policies that adapt to the workload; that is, you can use a different policy when your memory allocator observes different requests. However, so that you tune your implementation to general workload characteristics (and not any specific anomalies in these workloads), the exact workload traces that we use for grading your project will be slightly different (e.g., the requested sizes or the order of requests will be changed slightly). The sample workloads will be made available soon.

The following is a short description of the workloads. These descriptions should be sufficient for you to optimize your design. Each workload can be characterized by two properties: the size of the requested objects and the order of allocates and frees.

The workloads vary the size of their requests in two different ways:

  • Size 1: The user requests all small objects (the sizes are random, uniformly distributed between 8 bytes and 256 bytes).
  • Size 2: The user (roughly) alternates between requesting a small object (approximately 64 bytes) and a large object (approximately 64 KB).

The workloads vary the order of allocates and frees in two different ways:

  • Order A: The user allocates N objects and then frees all N of them; the user then allocates N more objects and frees all N of them.
  • Order B: The user repeatedly allocates N objects and then frees N/2 of the objects until the end at which point it frees all remaining objects.

In summary, to test your code, we will construct 4 different workloads (2*2) that combine each of these characteristics.
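
Until the sample workloads are available, you can approximate the two size patterns in your own test program. The generators below are a rough sketch under stated assumptions: the function names are hypothetical, the small and large sizes are fixed at exactly 64 bytes and 64 KB rather than "approximately", and rand() with a modulus is only roughly uniform.

```c
#include <stdlib.h>

#define SMALL_MIN 8
#define SMALL_MAX 256

/* Size 1: a small request, roughly uniform in [8, 256] bytes. */
static int next_size_small(void)
{
    return SMALL_MIN + rand() % (SMALL_MAX - SMALL_MIN + 1);
}

/* Size 2: even-numbered requests are small (64 bytes),
   odd-numbered requests are large (64 KB). */
static int next_size_alternating(int request_index)
{
    return (request_index % 2 == 0) ? 64 : 64 * 1024;
}
```

A test main() could feed these sizes to Mem_Alloc() in Order A or Order B and count how many requests return NULL, which gives a rough measure of fragmentation.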

Handing in your Code

Hand in your source code and a README file. We will create a directory ~cs537-SECTION/handin/NAME/p3/, where NAME is your login name, and SECTION is the section you are in (1 or 2).

You should copy all of your source files (*.c and *.h) and a Makefile to your p3 handin directory. Do not submit any .o files.

In your README file you should have the following three sections:

  • Design overview: A few simple paragraphs describing the overall structure of your code and any important structures. For example, include a brief description of the data structures you use to map between addresses and memory objects and a brief description of the policy that you use to perform allocations (e.g., first-fit, best-fit, rotating first-fit, etc.).
  • Specification: Describe how you handled any ambiguities in the specification.
  • Known bugs or problems: A list of any features that you did not implement or that you know are not working correctly.

After the deadline for this project, you will be prevented from making any changes in this directory. Remember: No late projects will be accepted!