In general, the heap is used for dynamically allocated objects.
However, it might be used for other kinds of objects, too.
For example, activation records might be allocated on the heap
for a multi-threaded language, where calls and returns do not
follow a stack protocol (i.e., a "return" is not necessarily
from the most recently called subprogram, because the most recently
called subprogram could be in one thread, while the return was
in another).
Different languages use different syntax for the
allocation of storage for dynamically created objects:
In some languages, deallocation is done by the programmer:
We will first look at basic techniques for implementing the low-level
operations on the heap (how to satisfy requests for storage, and what
to do when storage is freed).
Then we will consider some of the problems of programmer-controlled
and of automatic deallocation.
Finally, we will look at some different techniques for doing automatic
deallocation.
Available storage is managed using a free list: a list of
available "chunks" of free storage.
Some special location is used to hold the address of the first item
on the list;
each item includes:
Here is a series of pictures to illustrate the way the freelist works.
Note that alignment issues are ignored in this example (we assume that
an allocated chunk of storage can start at any address).
Also, we assume that the heap starts at location 0, which is not a
realistic assumption, but is fine for the purposes of this example.
Initially, the freelist might look like this:
Now assume that a request for 10 bytes is received.
Here is the situation after that request has been satisfied:
Some questions to consider are:
Best Fit: Find the chunk on the freelist with the smallest size
greater than or equal to n.
The idea is to preserve larger chunks (i.e., do not break them up if
it is not necessary).
However, it has several disadvantages:
First Fit: Use the first chunk with size greater than
or equal to n.
This technique will generally be faster than Best-Fit;
however, it may produce little pieces of free storage at the front
of the list, which will slow down later searches.
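As a rough illustration of the difference between Best Fit and First Fit, the sketch below searches a singly linked, NULL-terminated freelist under each policy.
The Chunk struct and the function names are assumptions made for this example; they are not taken from any particular allocator.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical freelist item: the chunk's size and a link to the next item. */
typedef struct Chunk {
    size_t size;
    struct Chunk *next;
} Chunk;

/* First Fit: return the first chunk with size >= n, or NULL if none fits. */
Chunk *first_fit(Chunk *first_free, size_t n) {
    assert(n > 0);
    for (Chunk *c = first_free; c != NULL; c = c->next) {
        if (c->size >= n) return c;
    }
    return NULL;
}

/* Best Fit: return the smallest chunk with size >= n, or NULL if none fits.
   Unless an exact match turns up, the whole list has to be scanned. */
Chunk *best_fit(Chunk *first_free, size_t n) {
    assert(n > 0);
    Chunk *best = NULL;
    for (Chunk *c = first_free; c != NULL; c = c->next) {
        if (c->size < n) continue;
        if (best == NULL || c->size < best->size) best = c;
        if (best->size == n) break;   /* an exact match cannot be improved */
    }
    return best;
}
```

For a list holding chunks of sizes 8, 50, and 12 (in that order), a request for 10 bytes would be satisfied from the 50-byte chunk under First Fit and from the 12-byte chunk under Best Fit.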
Circular First Fit: Make the freelist circular
(i.e., have the last item point back to the first item).
When a request for n bytes is made, satisfy it using the first chunk
with size greater than or equal to n, but then change the "first free"
pointer to point to the chunk following the one that was returned.
Note: if the list is singly linked, then it will not, in general, be
possible to return the very first chunk, because there will be no way
to fix the "next" pointer of the previous item.
This problem can be solved by making the list doubly linked
(which does not lower the amount of available storage, since the pointer
fields are part of the chunk used to satisfy an allocation request).
Another possibility is to have special-case code for the case where
there is just one item on the list, and otherwise to start the search
from the second item, keeping a "trailing" pointer to permit the previous
item's "next" field to be updated.
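The second possibility (special-case code for a one-item list, plus a trailing pointer) might look roughly like the sketch below.
It is a simplified illustration: the Chunk struct and function name are hypothetical, and the whole chunk is removed from the list rather than being split.

```c
#include <assert.h>
#include <stddef.h>

typedef struct Chunk {
    size_t size;
    struct Chunk *next;   /* circular: the last item points back to the first */
} Chunk;

/* Remove and return the first chunk of size >= n, searching a circular list
   from the item AFTER *first_free. On success *first_free is moved to the
   chunk following the one returned (or set to NULL if the list becomes
   empty). Returns NULL if no chunk is large enough. */
Chunk *circular_first_fit(Chunk **first_free, size_t n) {
    assert(first_free != NULL);
    Chunk *start = *first_free;
    if (start == NULL) return NULL;

    /* Special case: exactly one item on the list. */
    if (start->next == start) {
        if (start->size < n) return NULL;
        *first_free = NULL;
        return start;
    }

    /* General case: examine prev->next, so prev trails the candidate. */
    Chunk *prev = start;
    do {
        Chunk *cand = prev->next;
        if (cand->size >= n) {
            prev->next = cand->next;     /* unlink the candidate */
            *first_free = cand->next;    /* the next search resumes after it */
            cand->next = NULL;
            return cand;
        }
        prev = cand;
    } while (prev != start);
    return NULL;
}
```

Because the search starts at the second item, the item that "first free" points to is the last one examined, and by then the trailing pointer refers to its predecessor.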
There are also several possible ways to solve the second problem (how to
coalesce freed storage).
One approach is to use a doubly linked list (i.e., each list
item has a "previous" as well as a "next" pointer).
Also, one bit of the "size" field is reserved to indicate whether the
chunk is "free" or "in-use".
Now when a chunk is freed, we can check the "free-bit" of the storage
that immediately follows the freed chunk (using the freed chunk's
"size" field to locate the "size" field of the following chunk of storage).
If that following storage is free, then the two chunks can be
coalesced.
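A minimal sketch of this check, using a small simulated heap, is given here.
Several details are assumptions made for the example: addresses are byte offsets into the array mem, the free-bit is the top bit of a 4-byte size field, a free chunk keeps its "prev" and "next" fields in its first 8 data bytes, and freed chunks are pushed on the front of the list.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define NIL      0xFFFFFFFFu     /* stands in for a null link */
#define FREE_BIT 0x80000000u     /* top bit of the size field: 1 = free */

uint8_t  mem[128];
uint32_t mem_end = 128;
uint32_t first_free = NIL;

uint32_t rd(uint32_t a)             { uint32_t v; memcpy(&v, mem + a, 4); return v; }
void     wr(uint32_t a, uint32_t v) { memcpy(mem + a, &v, 4); }

uint32_t chunk_size(uint32_t c) { return rd(c) & ~FREE_BIT; }
int      is_free(uint32_t c)    { return (rd(c) & FREE_BIT) != 0; }

/* Take chunk c out of the doubly linked freelist (prev at c+4, next at c+8). */
void unlink_chunk(uint32_t c) {
    uint32_t prev = rd(c + 4), next = rd(c + 8);
    if (prev == NIL) first_free = next; else wr(prev + 8, next);
    if (next != NIL) wr(next + 4, prev);
}

void push_front(uint32_t c) {
    wr(c + 4, NIL);
    wr(c + 8, first_free);
    if (first_free != NIL) wr(first_free + 4, c);
    first_free = c;
}

/* Free the chunk whose size field is at c, merging it with the chunk that
   immediately follows it in memory if that chunk is free. */
void free_chunk(uint32_t c) {
    uint32_t size = chunk_size(c);
    assert(size >= 8);                    /* room for the prev and next fields */
    uint32_t follow = c + 4 + size;       /* size field of the following chunk */
    if (follow < mem_end && is_free(follow)) {
        unlink_chunk(follow);
        size += 4 + chunk_size(follow);   /* its size field is reclaimed too */
    }
    wr(c, size | FREE_BIT);
    push_front(c);
}
```

Freeing a 10-byte chunk that is followed by a free 20-byte chunk yields a single free chunk of 10 + 20 + 4 = 34 bytes, since the second chunk's size field becomes usable storage.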
For example, suppose the situation is like this:
To allow a newly freed chunk to be coalesced with a free chunk that
precedes it in memory (as well as with one that follows it)
we need to maintain two "size" fields in every chunk: one at the end
of the chunk as well as the one at the beginning.
In that case, when a chunk is freed, we will know that the immediately
preceding 4 bytes are a "size" field (with a "free-bit");
we can use the free-bit to tell whether the preceding memory is available
for coalescing, and we can use the value of the size field to know the
extent of the previous list item.
Here is an example. Assume that we start with this situation:
This has the following advantages over the previously discussed approaches:
Recall that in some languages (Pascal, C, C++), deallocation is
"explicit" (under programmer control), while in other languages (Java)
it is done "automatically".
The main reason to prefer automatic deallocation is that it is easy for
the programmer to make mistakes in their deallocation code, which can
lead to errors that are very hard to track down.
Here is an example of C code that causes a storage leak:
Here is an example of C code that illustrates a dangling pointer:
The technique works as follows:
Note also that this technique requires that every pointer
have a key field, including pointers that are inside dynamically
allocated objects.
This means that allocation must be done according to the type of
the object being allocated (as is done in Pascal, C++, and Java)
so that space for the key fields can be included.
In C, it is not only possible to allocate storage by requesting
a specific number of bytes (rather than using the "sizeof" operator),
it is also possible to store pointers in non-pointer variables such
as integers (via casting).
These kinds of language features make it difficult for a compiler
to ensure that techniques like this lock-and-key approach work
correctly.
There are two important problems with reference counting:
The Mark and Sweep technique has two phases:
The sweep phase starts with an empty freelist.
It looks at every chunk of storage in the heap in order
(note that those chunks can be recognized because we know where the
heap starts, and each chunk starts with a "size" field).
If the mark bit for a chunk is 0, it means that it is inaccessible.
The chunk can simply be added to the freelist, but a better idea
(to reduce fragmentation) is to first check the following chunks.
If there is a sequence of two or more free chunks, then they can be coalesced,
and the coalesced chunk is then added to the freelist.
If the mark bit for a chunk is 1, it means that it is accessible.
Therefore, it is not added to the freelist, but its mark bit is set
back to zero so that it will be processed the next time the
mark-and-sweep garbage collector is started up again.
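The sweep phase described above can be sketched over a small simulated heap.
The layout is an assumption made for the example: each chunk starts with a 4-byte size field whose top bit is the mark bit, addresses are byte offsets into mem, and a chunk on the freelist stores its "next" link in its first 4 data bytes.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define NIL      0xFFFFFFFFu     /* stands in for a null link */
#define MARK_BIT 0x80000000u     /* top bit of the size field */

uint8_t  mem[64];
uint32_t first_free;

uint32_t rd(uint32_t a)             { uint32_t v; memcpy(&v, mem + a, 4); return v; }
void     wr(uint32_t a, uint32_t v) { memcpy(mem + a, &v, 4); }

/* Visit every chunk in [0, heap_end) in address order and rebuild the
   freelist from the unmarked ones. */
void sweep(uint32_t heap_end) {
    assert(heap_end <= sizeof mem);
    first_free = NIL;                        /* start with an empty freelist */
    uint32_t c = 0;
    while (c < heap_end) {
        uint32_t size = rd(c) & ~MARK_BIT;
        if (rd(c) & MARK_BIT) {
            wr(c, size);                     /* accessible: just clear the mark */
            c += 4 + size;
            continue;
        }
        /* Inaccessible: absorb any unmarked chunks that follow directly. */
        uint32_t next = c + 4 + size;
        while (next < heap_end && !(rd(next) & MARK_BIT)) {
            size += 4 + (rd(next) & ~MARK_BIT);
            next = c + 4 + size;
        }
        wr(c, size);
        wr(c + 4, first_free);               /* push on the front of the freelist */
        first_free = c;
        c = next;
    }
}
```

Chunks that were already free before the collection have mark bits of 0 as well, so they are swept up and coalesced in exactly the same way as garbage.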
Below is an example to illustrate the mark-and-sweep process.
Assume that memory looks like this when the garbage collector
is called;
the numbers in the chunks are mark bits, all initially 0.
Note that there is just one free chunk
and that some of the (non-free) objects contain pointers.
For the Stop and Copy technique, the heap is divided into two parts:
"old" space and "new" space.
Old space is used for allocation, and new space is used for garbage
collection.
There is no free list;
instead, a "first-free" pointer is maintained that points to the
first free location in "old" space.
When a chunk of n bytes is requested, the location pointed
to by the first-free pointer is returned, and the first-free pointer
is incremented by n (actually, "invisible" size fields are still
maintained as part of each allocated chunk, so allocating a chunk
would have to include maintaining that field).
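Allocation by bumping the first-free pointer is short enough to sketch in full.
The Space struct and the name bump_alloc are hypothetical, and a 4-byte size field is assumed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef struct {
    uint8_t *first_free;   /* next unallocated byte in "old" space */
    uint8_t *limit;        /* one past the last byte of "old" space */
} Space;

/* Return n bytes of storage, or NULL when "old" space cannot hold the
   request (the point at which a collection would be started). */
void *bump_alloc(Space *s, uint32_t n) {
    assert(s->first_free <= s->limit);
    size_t room = (size_t)(s->limit - s->first_free);
    if (room < (size_t)n + sizeof n) return NULL;
    memcpy(s->first_free, &n, sizeof n);      /* the "invisible" size field */
    void *result = s->first_free + sizeof n;
    s->first_free += sizeof n + n;
    return result;
}
```

No search is involved: each request costs one comparison and one pointer update, which is the main attraction of this scheme over a freelist.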
When the "old" space is full, or almost full, the stop and copy
garbage collection begins.
It finds all accessible objects (by following pointers from the
static-data area, etc. as for the mark and sweep technique),
but instead of marking them, it copies them to "new" space.
Once all accessible objects have been copied, the roles of the
"old" and "new" space are reversed;
the first-free pointer points to the first free location in the
"old" space (the location just after the last copied object).
Below are two pictures to illustrate the idea.
The stack is shown on the left; it contains 2 pointers to
heap objects.
The heap is shown on the right.
Initially, it contains 6 chunks of allocated storage (labeled A - F)
in the "old" space.
(The first-free pointer points to the small remaining chunk of
storage in the "old" space.)
Chunk C itself contains a pointer (pointing to chunk D).
In the second picture, the three accessible chunks have been
copied to what used to be "new" space, leaving behind all garbage.
The first-free pointer now points to the first free location
in what used to be "new" space, and is now "old" space.
The example given above is repeated below, but this time we assume that
object F contains a pointer to C (as well as there being a pointer to C
from the stack).
The first picture shows the situation before garbage collection.
The second picture shows the situation after the top-most stack
pointer (the one pointing to C) has been followed;
C has been copied to "new" space, a forwarding pointer has
been left behind, and the stack pointer has been updated.
The third picture shows the final situation after garbage
collection has finished;
all accessible storage has been copied,
all pointers to accessible storage have been updated, and the roles
of "old" and "new" space reversed.
If normal reference counting is used, then before the loop (when the
value in L is copied into tmp), the reference count of the first item
on the list is incremented.
The assignment "tmp = tmp->next" inside the loop causes the following
changes to be made on each iteration:
To avoid this kind of extra work, deferred reference counting works
as follows:
Overview
In other languages (e.g., Java), deallocation is done "automatically"
(not under the programmer's control): storage is reclaimed (for later
reuse) when it is "dead"; i.e., when it is no longer accessible via
some variable in the program.
Basic Techniques
Actually, the field that holds the address of the next list item is
also part of the chunk itself.
The size field, however, is not;
that field stays "attached" to the chunk, but should not be overwritten
by the programmer's code.
(In some languages, like C, the programmer can actually overwrite the
value in this field;
this is usually the result of a logical error, but could also be a
deliberate attempt to breach some kind of security.)
               0     4             ...                103
+---+         +------------------------------------------+
|   |         |     |   |                                |
| o---------->| 100 | \ |          ...                   |
|   |         |     |   |                                |
+---+         +------------------------------------------+
first          size  next
free
Now assume that a request to allocate 20 bytes is received.
The first 20 bytes (after the size field) would be used
to satisfy the request (i.e., the address "4" would be returned),
and the heap updated to look like this:
               0    4   ...    23   24   28     ...        103
+---+         +------------------+  +-------------------------+
|   |         |    |             |  |    |   |                |
| o---+       | 20 |             |  | 76 | \ |                |
|   | |       |    |             |  |    |   |                |
+---+ |       +------------------+  +-------------------------+
first |        size                  size next
free  |                             ^
      |                             |
      +-----------------------------+
The single chunk of available storage has been split into two parts:
the first part was used to satisfy the storage request;
it still has a "size" field, but the value has been updated to reflect
the size of the allocated chunk.
The second part is the storage that is now available.
The "first free" pointer has been updated to point to this chunk, and
its "size" and "next" fields have been set.
               0    4   ...    23   24   28 ...  37   38   42  46...103
+---+         +------------------+  +--------------+  +---------------+
|   |         |    |             |  |    |         |  |    |   |      |
| o---+       | 20 |             |  | 10 |         |  | 62 | \ |      |
|   | |       |    |             |  |    |         |  |    |   |      |
+---+ |       +------------------+  +--------------+  +---------------+
first |        size                  size              size next
free  |                                               ^
      |                                               |
      +-----------------------------------------------+
Finally, assume that the first chunk of storage that was allocated is
now freed (the chunk starting at location 4).
That chunk of storage would be added to the front of the
freelist (since that is cheaper than adding it to the middle or the
end), and the picture would be like this:
               0    4   8 ...  23   24   28 ...  37   38   42  46...103
+---+         +------------------+  +--------------+  +---------------+
|   |         |    |   |         |  |    |         |  |    |   |      |
| o---+       | 20 | o |         |  | 10 |         |  | 62 | \ |      |
|   | |       |    | | |         |  |    |         |  |    |   |      |
+---+ |       +------|-----------+  +--------------+  +---------------+
first |        size next             size              size next
free  |       ^      |                                ^
      |       |      |                                |
      +-------+      +--------------------------------+
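The sequence of pictures can be reproduced with a small simulation.
This is only a sketch under the same simplifying assumptions as the pictures (no alignment, heap starting at address 0), plus some of its own: addresses are byte offsets into the array heap, size and next fields are 4 bytes each, the constant NIL stands in for the null link, allocation is first-fit, and all names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define HEAP_BYTES 104
#define NIL 0xFFFFFFFFu          /* stands in for the "\" null link */

uint8_t  heap[HEAP_BYTES];
uint32_t first_free;             /* address of the first free chunk's size field */

uint32_t get32(uint32_t a)             { uint32_t v; memcpy(&v, heap + a, 4); return v; }
void     put32(uint32_t a, uint32_t v) { memcpy(heap + a, &v, 4); }

/* One free chunk: size field at 0, 100 usable bytes at 4..103. */
void heap_init(void) {
    put32(0, 100);
    put32(4, NIL);
    first_free = 0;
}

/* First-fit allocation. Returns the address just after the chunk's size
   field, or NIL if no chunk on the freelist is large enough. */
uint32_t heap_alloc(uint32_t n) {
    assert(n >= 4);                       /* a freed chunk must hold a next field */
    uint32_t prev = NIL;
    for (uint32_t c = first_free; c != NIL; prev = c, c = get32(c + 4)) {
        uint32_t size = get32(c), next = get32(c + 4);
        if (size < n) continue;
        uint32_t replacement = next;      /* default: hand out the whole chunk */
        if (size >= n + 8) {              /* room to split off a usable remainder */
            uint32_t rest = c + 4 + n;
            put32(rest, size - n - 4);    /* the remainder's size field */
            put32(rest + 4, next);        /* the remainder's next field */
            put32(c, n);
            replacement = rest;
        }
        if (prev == NIL) first_free = replacement;
        else put32(prev + 4, replacement);
        return c + 4;
    }
    return NIL;
}

/* Free: push the chunk on the front of the freelist (no coalescing). */
void heap_free(uint32_t addr) {
    put32(addr, first_free);     /* the next field reuses the chunk's own storage */
    first_free = addr - 4;
}
```

Calling heap_init(), then heap_alloc(20), heap_alloc(10), and heap_free(4) passes through the same four states as the pictures: the requests return addresses 4 and 28, the remaining free chunk has sizes 76 and then 62, and the freed chunk ends up at the front of the list with its next field holding 38.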
Operations on the Freelist
The operations on the freelist that need to be supported are:
A good implementation of those operations should satisfy the following goals:
Techniques for allocation
The answer to the first question is that there are a number of different
schemes for deciding how to allocate a chunk of size n:
+------------------------------------+ +--------+
| | | |
v | | v
+---+ +-----------------+ +---------+ +------|---|----+ +--------------+
| | | | | | | | | | | | | | | | | | | | | |
| o---+ | | \ | o | | | 10 | | | 20 | o | o | | | | o | \ | |
| | | | | | | | | | | | | | | | | | | | | | |
+---+ | +---------|-------+ +---------+ +---------------+ +-----|--------+
first | size prev next size size prev next size prev next
free | ^ | ^ ^ |
| | | | | |
+----+ +----------------------+ +--------------------+
and now the chunk of size 10 is freed.
That chunk can be coalesced with the following chunk (of size 20),
producing this situation:
+------------------------+ +--------------+
| | | |
v | | v
+---+ +-----------------+ +------|----|----------+ +----------------+
| | | | | | | | | | | | | | | | | | |
| o---+ | | \ | o | | | 34 | o | o | | | | o | \ | |
| | | | | | | | | | | | | | | | | | | |
+---+ | +---------|-------+ +----------------------+ +-----|----------+
first | size prev next size prev next size prev next
free | ^ | ^ ^ |
| | | | | |
+------+ +----------+ +---------------------------+
Note:
+-------------------------------------+
| |
v |
+---+ +-----------------+ +-----------+ +-----|-----------+ +-----------+
| | | | | | | | | | | | | | | | | | | | | | |
| o---+ | | \ | o | | | | | | | |20 | o | \ | |20| |16 | |16 |
| | | | | | | | | | | | | | | | | | | | | | | |
+---+ | +--------|--------+ +-----------+ +-----------------+ +-----------+
first | size prev next size size size size prev next size size size
free | ^ | ^
| | | |
+----+ +--------------------------+
Now assume that the last chunk of memory in the picture is freed.
The "free-bit" in the 4 bytes immediately to the left of the size field of the
newly freed chunk will indicate that the preceding chunk is also free, and can
be coalesced.
The result is shown below.
+-------------------------------------+
| |
v |
+---+ +-----------------+ +-----------+ +-----|-------------------------+
| | | | | | | | | | | | | | | | | | |
| o---+ | | \ | o | | | | | | | |44 | o | \ | |44 |
| | | | | | | | | | | | | | | | | | | |
+---+ | +--------|--------+ +-----------+ +-------------------------------+
first | size prev next size size size size prev next size
free | ^ | ^
| | | |
+----+ +--------------------------+
Note that doing the coalesce only requires updating two size fields (the left
field of the preceding chunk, and the right field of the newly freed chunk).
The new size is the sum of the two old sizes + 8 (because the right size
field of the first chunk and the left size field of the second chunk get
"reclaimed").
No pointers need to be changed at all, so this is a faster operation than
coalescing with a following chunk.
However, it has the disadvantage of requiring an extra size field in every
chunk.
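Coalescing with a preceding chunk can be sketched as follows.
The layout is an assumption made for the example: every chunk has a 4-byte size field at each end, the top bit of each field is the free-bit, and addresses are byte offsets into mem. Freelist handling is left out because, as noted above, no pointers change when the merge happens.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define FREE_BIT 0x80000000u     /* top bit of a size field: 1 = free */

uint8_t mem[128];

uint32_t rd(uint32_t a)             { uint32_t v; memcpy(&v, mem + a, 4); return v; }
void     wr(uint32_t a, uint32_t v) { memcpy(mem + a, &v, 4); }

/* Write both size fields of the chunk whose left size field is at c. */
void set_sizes(uint32_t c, uint32_t size, int is_free) {
    uint32_t v = size | (is_free ? FREE_BIT : 0u);
    wr(c, v);                /* left field */
    wr(c + 4 + size, v);     /* right field, just past the data */
}

/* Free chunk c. If the chunk just before it in memory is free, merge c into
   it and return 1. Otherwise mark c free and return 0; the caller would then
   add c to the freelist. */
int free_with_backward_coalesce(uint32_t c, uint32_t heap_start) {
    assert(c >= heap_start);
    uint32_t size = rd(c) & ~FREE_BIT;
    if (c > heap_start) {
        uint32_t left = rd(c - 4);       /* right size field of the preceding chunk */
        if (left & FREE_BIT) {
            uint32_t prev_size = left & ~FREE_BIT;
            uint32_t p = c - 8 - prev_size;          /* start of the preceding chunk */
            set_sizes(p, prev_size + size + 8, 1);   /* two fields updated */
            return 1;
        }
    }
    set_sizes(c, size, 1);
    return 0;
}
```

With a free 20-byte chunk followed by an allocated 16-byte chunk, freeing the second produces one chunk of 20 + 16 + 8 = 44 bytes, matching the example.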
Freelists for Fixed-Size Chunks
For languages like Pascal, storage is allocated for fixed-size
chunks whose sizes correspond to the pointer types in the program.
It is possible to determine at compile time exactly what size
chunks may be requested when the program runs.
In this case, another strategy can be used:
If there are N different possible chunk sizes,
divide the heap into N "mini-heaps".
Maintain a separate freelist for each possible chunk size, and return
the first chunk from that freelist when a chunk of the appropriate size
is requested.
The freelists can be maintained as usual (using a linked list), or
a set of bitmaps can be kept (one for each "mini-heap") with each bit
corresponding to one chunk.
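The bitmap variant might be sketched like this for a single mini-heap.
The Pool struct, the limit of 32 chunks per mini-heap, and the function names are all assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CHUNKS_PER_POOL 32

/* Hypothetical mini-heap for one chunk size. Bit i of used is 1 when
   chunk i is currently allocated. */
typedef struct {
    uint32_t used;
    uint32_t chunk_size;
    uint8_t *base;          /* storage for CHUNKS_PER_POOL chunks */
} Pool;

/* Return the lowest-numbered free chunk, or NULL if the mini-heap is full. */
void *pool_alloc(Pool *p) {
    for (uint32_t i = 0; i < CHUNKS_PER_POOL; i++) {
        uint32_t bit = (uint32_t)1 << i;
        if ((p->used & bit) == 0) {
            p->used |= bit;
            return p->base + i * p->chunk_size;
        }
    }
    return NULL;
}

/* Freeing a chunk just clears its bit; no list pointers are involved. */
void pool_free(Pool *p, void *ptr) {
    uint32_t offset = (uint32_t)((uint8_t *)ptr - p->base);
    assert(offset % p->chunk_size == 0);
    uint32_t i = offset / p->chunk_size;
    assert(i < CHUNKS_PER_POOL);
    p->used &= ~((uint32_t)1 << i);
}
```

Since all chunks in a mini-heap have the same size, no size fields, splitting, or coalescing are needed.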
Deallocation
Problems with Explicit Deallocation
Storage Leaks
One potential problem is storage leaks;
i.e., some storage is never freed, although it is inaccessible
(and so will never be used again by the program).
The problem with storage leaks is that they can cause a program
to use more memory than necessary.
This can slow down execution, or, in the worst case, if the program
runs out of memory completely, can cause it to crash.
Listnode *p = malloc( sizeof(Listnode) );
.
. // no copy from p in this code
.
p = ...;
When the second assignment to p is executed it over-writes
the address of the allocated chunk of storage that was stored in p.
That storage becomes inaccessible;
the program can no longer use it, but it cannot be freed for reuse.
Dangling pointers
A second potential problem is the use of dangling pointers.
A dangling pointer is one that points to storage that has been freed.
This is a problem because if the pointer is dereferenced for reading,
garbage may be read (causing incorrect behavior at some future point
in the execution);
if the pointer is dereferenced for writing, it may mess up the freelist,
or (if the storage has been re-allocated since it was freed) may corrupt
other, seemingly unrelated values.
This kind of error is especially difficult to track down.
Listnode *p, *q;
p = malloc( sizeof(Listnode) );
q = p;
.
. // no assignment to q in this code
.
free(p);
.
. // no assignment to q in this code
.
*q = ...
In this example, q becomes a dangling pointer when p
is freed.
The final write into the memory pointed to by q might corrupt
the freelist, or (if the storage was reallocated between the free of
p and the dereference of q) might corrupt some object
pointed to by another pointer.
A technique for detecting uninitialized and dangling pointers
In some languages,
the compiler can generate code to detect (at run time) an attempt to
dereference an uninitialized or dangling pointer.
One way to do this is by including a new "invisible" field (like the
size field) as part of every chunk of storage, as well as including
a new "invisible" field associated with every pointer.
The two fields are called the lock and the key,
respectively.
Note that uninitialized pointers can either have their keys set to some
special value (e.g., -1), or the key fields can be uninitialized.
In the former case, we are sure to catch an attempt to dereference an
uninitialized pointer (since a -1 key won't match any lock);
in the latter case we may miss some errors (if by coincidence the value
in the uninitialized pointer is an address whose "lock" field happens to
match the value in the pointer's (uninitialized) key field).
However, that is unlikely, and it may be preferable to save the
time that would be needed to initialize all key fields.
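One way the lock and key fields might be used is sketched below.
It assumes that allocation stores a fresh lock value in the chunk and copies it into the pointer's key, that freeing resets the lock, and that every dereference first compares the two. The fixed pool, the CheckedPtr type, the UNINIT_KEY value, and the function names are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define POOL_SIZE  8
#define UNINIT_KEY 0xFFFFFFFFu   /* the special "-1" key; never used as a lock */

typedef struct { unsigned lock; int data; } Chunk;         /* lock 0 = free */
typedef struct { Chunk *chunk; unsigned key; } CheckedPtr;  /* pointer + key */

Chunk    pool[POOL_SIZE];        /* zero-initialized: every chunk starts free */
unsigned next_lock = 1;

/* Allocate a chunk, give it a fresh lock, and return a pointer whose key
   matches. Returns a pointer with a NULL chunk if the pool is exhausted. */
CheckedPtr checked_new(void) {
    CheckedPtr p = { NULL, UNINIT_KEY };
    for (int i = 0; i < POOL_SIZE; i++) {
        if (pool[i].lock == 0) {
            assert(next_lock != UNINIT_KEY);
            pool[i].lock = next_lock++;
            p.chunk = &pool[i];
            p.key = pool[i].lock;
            return p;
        }
    }
    return p;
}

/* Dereference with a run-time check: NULL signals an uninitialized or
   dangling pointer (the key does not fit the lock). */
int *checked_deref(CheckedPtr p) {
    if (p.chunk == NULL || p.chunk->lock != p.key) return NULL;
    return &p.chunk->data;
}

/* Free the chunk; every outstanding copy of the pointer now fails the check. */
void checked_free(CheckedPtr p) {
    if (checked_deref(p) != NULL) p.chunk->lock = 0;
}
```

Copying a CheckedPtr copies its key, so in the earlier dangling-pointer example the copy q would fail the check after free(p), even if the storage has since been reallocated with a new lock.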
Automatic Deallocation
There are two basic problems that must be solved in order to do
automatic storage deallocation:
And there are two basic approaches to doing automatic deallocation:
Reference counting
Reference counting involves including yet another "invisible" field
in every chunk of storage: its reference count field.
The value of that field is the number of pointers that point to
the chunk.
The value is initialized to 1 when the chunk is allocated, and is
updated as follows:
When a reference count becomes zero, it means that no pointers
are pointing to the object, so it can be returned to free storage.
At that time, if the object itself contains pointers, then the
reference counts of the objects that they point to must in turn be
decremented.
Note that this requires being able to recognize pointers in a
chunk of storage (e.g., by knowing its type).
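A small sketch of these count updates for a linked-list node is given here.
It assumes that the compiler would route every pointer assignment through a helper such as assign; the Node type, the helper names, and the live_nodes counter (kept only to make the effect observable) are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct Node {
    int refcount;          /* the "invisible" reference count field */
    struct Node *next;
} Node;

int live_nodes = 0;        /* bookkeeping for the example only */

Node *node_new(void) {
    Node *n = malloc(sizeof *n);
    assert(n != NULL);
    n->refcount = 1;       /* one pointer (the one receiving the result) */
    n->next = NULL;
    live_nodes++;
    return n;
}

/* Drop one reference to n. If the count reaches zero, free the node and
   in turn drop the reference held by its next field. */
void release(Node *n) {
    while (n != NULL && --n->refcount == 0) {
        Node *next = n->next;
        free(n);
        live_nodes--;
        n = next;
    }
}

/* The pointer assignment "*dst = src" with reference count updates. */
void assign(Node **dst, Node *src) {
    if (src != NULL) src->refcount++;   /* increment first: safe if *dst == src */
    Node *old = *dst;
    *dst = src;
    release(old);
}
```

The loop in release is where the node's type matters: it has to know that next is the only pointer field in a Node.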
var p: Nodeptr;   (* p is a pointer to a node *)
new(p);           (* p points to newly allocated storage
                     for one node; its reference count is 1 *)
p^.next := p;     (* the next field of the node also points to the
                     node itself, so now its reference count is 2 *)
p := nil;         (* p's value is over-written, so the node's
                     reference count is decremented (from 2 to 1).
                     In fact, it is inaccessible (it points to itself,
                     no other pointer points to it), but we can't tell
                     that just from the reference count. *)
Garbage collection
The basic idea behind garbage collection is to wait until there is
little or no storage left, then:
There are many different approaches to doing garbage collection
(this is an active area of current research).
We will discuss two:
The mark and sweep technique requires a new "invisible" bit in each
chunk of storage: its mark bit (this can be one bit of the
chunk's "size" field).
This bit is:
The mark phase works as follows:
When the mark phase has finished, all accessible objects have mark bits
set to one, and all inaccessible objects have mark bits set to zero.
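A recursive version of the mark phase might look like the sketch below.
The Obj layout (a mark field plus a fixed number of pointer fields) and the array of roots are assumptions made for the example; a real collector would find the roots on the stack and in static data, and would need some way to recognize pointer fields.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_PTRS 2

typedef struct Obj {
    int mark;                       /* the mark bit */
    struct Obj *ptrs[MAX_PTRS];     /* pointer fields (NULL if unused) */
} Obj;

/* Mark o and everything reachable from it. The mark bit doubles as a
   "visited" flag, so cycles and shared objects are handled once. */
void mark(Obj *o) {
    if (o == NULL || o->mark) return;
    o->mark = 1;
    for (int i = 0; i < MAX_PTRS; i++) mark(o->ptrs[i]);
}

/* Start the traversal from every root pointer. */
void mark_from_roots(Obj **roots, size_t nroots) {
    assert(roots != NULL || nroots == 0);
    for (size_t i = 0; i < nroots; i++) mark(roots[i]);
}
```

Note that the recursion itself needs stack space proportional to the longest chain of pointers, at a moment when storage is nearly exhausted; the sketch ignores that concern.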
+---------------------------------------------+
| +----------------+ |
| | v v
+-----|-+ +---|---+ +-------+ +-------+ +-------+ +-------+
| 0 o | | 0 o | | o 0 | | 0 | | 0 | | 0 |
+-------+ +-------+ +-|-----+ +-------+ +-------+ +-------+
^ | ^ ^
| | | |
+------------------+ | |
| |
ptr on stack: ----------+ first-free
Here's the situation after just the mark phase (note that all
chunks reachable from the stack pointer now have mark-bits = 1):
+---------------------------------------------+
| +----------------+ |
| | v v
+-----|-+ +---|---+ +-------+ +-------+ +-------+ +-------+
| 1 o | | 0 o | | o 1 | | 0 | | 0 | | 1 |
+-------+ +-------+ +-|-----+ +-------+ +-------+ +-------+
^ | ^ ^
| | | |
+------------------+ | |
| |
ptr on stack: ----------+ first-free
Finally, here's the situation after the sweep phase has finished;
the second inaccessible chunk has been coalesced with the chunk
that was free all along, all inaccessible chunks are now on the freelist,
and all of the mark bits have been set to 0.
+---------------------------------------------+
| |
| +------------------------+ |
| v | v
+-----|-+ +-------+ +-------+ +-----|-----------+ +-------+
| 0 o | | 0 |\| | | o 0 | | 0 | o | | | 0 |
+-------+ +-------+ +-|-----+ +-----------------+ +-------+
^ | ^ ^
| | | |
| | | +----------+
+------------------+ | |
| |
ptr on stack: ----------+ first-free
<-------- old space --------> <-------- new space -------->
+---------------------------------------------------------+
| A | B | C o | D | E | F | | |
+-----------|---------------------------------------------+
^ | ^ ^ ^
| | | | | | |
| o-------------+ +--+ | first
| | | free
| o---------------------------+
| |
+---+
stack
<------ new space -------> <------- old space ------>
+---------------------------------------------------+
| | C o | D | F | |
+---------------------------------------------------+
^ | ^ ^ ^
| | | | | | |
| o-----------------------------------+ +--+ | first
| | | free
| o--------------------------------------------+
| |
+---+
stack
We have glossed over an important part of the stop-and-copy approach:
when a chunk of accessible storage is copied, it is vital that all
pointers pointing to that storage be updated (to point to its new
location in "new" space).
It is easy enough to update the pointer that we follow to find the
accessible chunk, but what about other pointers (either on the stack,
or in accessible heap objects) that point to the same object?
The answer is that when an object is copied from "old" to "new"
space, a forwarding pointer is left behind; i.e., the address
of the object in "new" space.
When we follow a pointer P that points to the same object, we must
recognize that it has been replaced with a forwarding pointer, and
we must copy the value of the forwarding pointer into pointer P.
One way to distinguish an object from a forwarding pointer is to
set the invisible size field to 0 to indicate a forwarding pointer
(this works because an object will never have size 0, and because we
don't need the size field in "old" space any more once the object has
been copied to "new" space).
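The copying step with forwarding pointers can be sketched as follows.
To keep the example short it assumes fixed-size objects held in an array, a separate forward field (a real implementation would overwrite part of the old object instead), and an explicit array of roots; all names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_PTRS   2
#define SPACE_OBJS 8

typedef struct Obj {
    size_t size;                   /* 0 means "this is a forwarding pointer" */
    struct Obj *forward;           /* new address; valid only when size == 0 */
    struct Obj *ptrs[MAX_PTRS];    /* pointer fields (NULL if unused) */
    int payload;
} Obj;

typedef struct {
    Obj slots[SPACE_OBJS];
    size_t first_free;             /* index of the first unused slot */
} Space;

/* Copy o into the destination space (unless that was already done) and
   return its new address. */
Obj *copy(Obj *o, Space *to) {
    if (o == NULL) return NULL;
    if (o->size == 0) return o->forward;      /* already copied: follow it */
    assert(to->first_free < SPACE_OBJS);
    Obj *n = &to->slots[to->first_free++];
    *n = *o;
    o->size = 0;                              /* leave a forwarding pointer */
    o->forward = n;
    for (int i = 0; i < MAX_PTRS; i++)        /* update the copy's own pointers */
        n->ptrs[i] = copy(n->ptrs[i], to);
    return n;
}

/* Copy everything reachable from the roots, updating the roots themselves. */
void collect(Obj **roots, size_t nroots, Space *to) {
    to->first_free = 0;
    for (size_t i = 0; i < nroots; i++) roots[i] = copy(roots[i], to);
}
```

The forwarding pointer is installed before the object's own pointer fields are followed, so an object that is reachable along two paths (or that lies on a cycle) is copied only once.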
<-------- old space ---------> <------ new space ------>
+----------------+
| |
v |
+-------------------------|----------------------------+
| A | B | C o | D | E | F o | | |
+-----------|------------------------------------------+
^ | ^ ^ ^
| | | | | | |
| o-------------+ +--+ | first
| | | free
| o---------------------------+
| |
+---+
stack
<--------- old space ---------> <------ new space ------>
+---------------+
| +----------|-----+
v v | |
+------------------------|-------|---------------------+
| A | B | o | D | E | F o | | C o | |
+----------|-------------------------------------------+
| ^ ^^ ^
+----------|-------+| |
| | first
| | | | free
| o--------------------------|--------+
| | |
| o--------------------------+
| |
+---+
stack
<-------- new space --------> <-------- old space ------>
+------------+
| |
| +--+ |
v | v |
+-------------------------------|---------|--------------+
| | C o | D | F o | |
+--------------------------------------------------------+
^ ^ ^
| | |
| | first
| | | | free
| o---------------------------------+ |
| | |
| o-------------------------------------------+
| |
+---+
stack
Stop and Copy garbage collection is currently considered the best approach.
It has a number of advantages compared to mark and sweep:
Deutsch-Bobrow deferred reference counting
There is a technique called deferred reference counting
that combines some of the features of (normal) reference counting and
garbage collection.
An important insight behind this technique is that much of the (time)
overhead of reference counting happens because of traversals of
heap data structures, using a local variable as a "temporary" pointer.
For example, consider the following code that traverses the linked list
pointed to by L:
Listptr tmp = L;
while (tmp != NULL) {
    /* ... do something with tmp->data ... */
    tmp = tmp->next;
}
(Note: "tmp->next" is C syntax; it refers to the "next" field of
the object pointed to by tmp.)
After the loop finishes, all reference counts are back to where
they started;
a lot of extra work has been done for nothing!
Note that this approach requires the compiler to generate different
code for different kinds of assignments:
For example, if p is a local variable of type "pointer to list", then
assignments to p itself (e.g., "p = new list;", or "p = q;") do not
involve any updates to reference counts.
However, assignments like "p->next = new list;" do
require reference count updates, since "p->next" is a location in
the heap.
How to identify pointers
Most of the automatic deallocation techniques discussed above require
that it be possible to recognize pointers at runtime.
There are several possible ways to do this: