Class notes for Wednesday 04/21/99 lecture
by Ali Contractor (contract)
Topics: Garbage Collection
        Memory Allocation for Data Structures
Garbage Collection:
Mark/Sweep (a popular garbage collection algorithm)
Flaws of this algorithm:
(i) Expensive - the sweep looks at the entire heap.
(ii) Doesn't compact the heap (it leaves a lot of holes).
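The two flaws above can be seen in a minimal sketch of mark/sweep. The Cell type, the index-based "pointers", and the function names are assumptions made for illustration only; they are not from the lecture.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical heap cell: a mark bit plus the indices of the cells it points to.
struct Cell {
    bool allocated = false;
    bool marked = false;
    std::vector<std::size_t> refs;
};

// Mark phase: flag every cell reachable from cell i.
void mark(std::vector<Cell>& heap, std::size_t i) {
    if (heap[i].marked) return;
    heap[i].marked = true;
    for (std::size_t r : heap[i].refs) mark(heap, r);
}

// Sweep phase: visits EVERY cell of the heap (flaw i) and frees the
// unmarked ones in place. Survivors are never moved, so the freed cells
// remain as holes between them (flaw ii). Returns the number of cells freed.
std::size_t markSweep(std::vector<Cell>& heap,
                      const std::vector<std::size_t>& roots) {
    for (std::size_t r : roots) mark(heap, r);
    std::size_t freed = 0;
    for (Cell& c : heap) {
        if (c.allocated && !c.marked) {
            c.allocated = false;
            c.refs.clear();
            ++freed;
        }
        c.marked = false;  // reset the mark bit for the next collection
    }
    return freed;
}
```

With five allocated cells where only the chain 0 -> 2 -> 4 is reachable, markSweep frees cells 1 and 3 and leaves holes at exactly those positions.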
Area Copy: Break the heap into two pieces of equal size (FROM and TO).
At the very beginning it looks like:
      FROM            TO
| free          |               |
             HEAP
- When we hit a threshold (almost all the free space in FROM is allocated), we do a garbage
collection: it is time to copy the live objects from the FROM space to the TO space.
- Make an adjacent physical copy in TO of all live data reachable through pointers.
- Leave a forwarding pointer in each copied FROM object, pointing to its new copy in TO.
- Then release FROM; now TO becomes FROM and FROM becomes TO.
So, finally it looks like:
      TO              FROM
| free          | allocated     |
             HEAP
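A minimal sketch of the copy step, assuming objects with at most two pointer fields and using vector indices in place of real addresses. The Obj and Heap types and the function names are hypothetical, chosen only to illustrate the forwarding pointer and the FROM/TO swap.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

constexpr int NONE = -1;  // stands in for a null pointer

// Hypothetical object: two pointer fields (indices into a space) plus a
// forwarding pointer that is set once the object has been copied.
struct Obj {
    int value = 0;
    int left = NONE;
    int right = NONE;
    int forward = NONE;
};

struct Heap {
    std::vector<Obj> from;  // space currently used for allocation
    std::vector<Obj> to;    // empty space that receives the live objects
};

// Copy the object at index i of FROM into the next free slot of TO and
// return its new index. A second visit only follows the forwarding pointer.
int copyObj(Heap& h, int i) {
    if (i == NONE) return NONE;
    if (h.from[i].forward != NONE) return h.from[i].forward;
    int dst = static_cast<int>(h.to.size());
    h.to.push_back(h.from[i]);  // adjacent physical copy
    h.from[i].forward = dst;    // leave a forwarding pointer behind
    int l = copyObj(h, h.from[i].left);
    int r = copyObj(h, h.from[i].right);
    h.to[dst].left = l;         // pointers now refer to the copies in TO
    h.to[dst].right = r;
    return dst;
}

// Copy everything reachable from the roots, release FROM, then swap roles.
// Dead objects are never visited.
void collect(Heap& h, std::vector<int>& roots) {
    h.to.clear();
    for (int& r : roots) r = copyObj(h, r);
    h.from.clear();           // release FROM
    std::swap(h.from, h.to);  // TO becomes FROM and FROM becomes TO
}
```

The forwarding pointer is what keeps shared or cyclic structures from being copied twice: the second time an object is reached, its new location is returned instead of a fresh copy.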
Crucial differences from Mark/Sweep:
- The cost of garbage collection depends only on the live data.
- The more dead allocations there are, the cheaper the collection is.
- Most of the heap is empty after a collection because most allocations are dead.
- It does not have to inspect everything.
Global Table: It tells how many live objects point into the heap.
Generational Garbage Collection: Objects are grouped into generations by age. Objects
that remain in the heap for a long time form an older generation, which is collected
less often than newly allocated objects.
Memory Allocation (at run-time) for Data Structures:
- Structs
- Classes
- Records
- Unions
- Arrays
Structs:
e.g.:
struct { int a;
char b;
double c;
} R;
From the SymbolTable aspect, just create one ST for the fields of the struct (a, b, c) and
one entry for the struct R itself in the global ST.
fields of the struct:
a |
b |
c |
global ST for the struct R:
R |
- For every field of the struct present in the SymbolTable we keep track of its
offset (the field's position relative to the start of the struct), the type of the field
(in this case int for a, char for b and double for c), and the fact that it is a field.
- Other than that, we store the overall size of the struct (in this case it is 16:
'a' takes 4 bytes, 'b' takes 1 byte plus 3 bytes of free space, and 'c' takes 8 bytes).
The layout of the struct looks like:
 _____
|  c  | offsets 8-15
|     | offsets 5-7 (free space)
|  b  | offset 4
|  a  | offsets 0-3
 _____
So, the overall size is 16 bytes.
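The offset and size bookkeeping above can be sketched as follows. This assumes each field is aligned to a multiple of its own size, which reproduces the 0 / 4 / 8 offsets and the total of 16; real compilers may use different alignment rules. FieldEntry, alignUp and layoutStruct are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// One symbol-table entry per field (illustrative, not the course's actual ST).
struct FieldEntry {
    std::string name;
    std::string type;
    std::size_t size;    // bytes occupied by the field
    std::size_t offset;  // filled in by layoutStruct
};

// Round n up to the next multiple of align.
std::size_t alignUp(std::size_t n, std::size_t align) {
    return (n + align - 1) / align * align;
}

// Assign offsets in declaration order, assuming every field is aligned to
// its own size. Returns the overall size of the struct.
std::size_t layoutStruct(std::vector<FieldEntry>& fields) {
    std::size_t offset = 0;
    std::size_t maxAlign = 1;
    for (FieldEntry& f : fields) {
        offset = alignUp(offset, f.size);  // skip free space if needed
        f.offset = offset;
        offset += f.size;
        if (f.size > maxAlign) maxAlign = f.size;
    }
    return alignUp(offset, maxAlign);
}
```

For the fields (a: int, 4 bytes), (b: char, 1 byte), (c: double, 8 bytes), layoutStruct assigns offsets 0, 4 and 8 and returns 16.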
Now, let's make R global (assuming that the address of R is 200):
e.g.: R.a = 10 (the address of R.a = address of R + offset of a = 200 + 0 = 200).
e.g.: R.c = 0.0 (the address of R.c = address of R + offset of c = 200 + 8 = 208).
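The "address of struct + offset of field" rule can be checked against a real C++ compiler with offsetof. This is a sketch only: the type name RType and the helper functions are assumptions, and the actual address of R will of course not be 200.

```cpp
#include <cassert>
#include <cstddef>

struct RType {
    int a;
    char b;
    double c;
};

RType R;  // global, as in the example above

// The rule from the notes: field address = struct address + field offset.
std::size_t fieldAddress(std::size_t structAddress, std::size_t fieldOffset) {
    return structAddress + fieldOffset;
}

// Compare base + offsetof(...) with the address the compiler itself uses.
bool offsetsMatchCompiler() {
    char* base = reinterpret_cast<char*>(&R);
    return base + offsetof(RType, a) == reinterpret_cast<char*>(&R.a) &&
           base + offsetof(RType, c) == reinterpret_cast<char*>(&R.c);
}
```

fieldAddress(200, 8) gives 208, matching the hand computation for R.c.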
Classes and records work pretty much the same way.
Class:
e.g.:
class C { int a;
          int sum(int b) { return (a+b); }
        };
C varC = new C(); (the new object is 4 bytes long because its only field, 'a', is of type
int, which takes 4 bytes).
varC.sum(10); => sum(varC, 10); (the object is passed to the method as an extra first argument)
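A sketch of that translation in C++: the member call and a hand-written free function that takes the object explicitly compute the same thing. The name C_sum and its "self" parameter are illustrative assumptions; a real compiler chooses its own internal names.

```cpp
#include <cassert>

class C {
public:
    int a = 0;
    int sum(int b) { return a + b; }
};

// Hand-lowered version of C::sum: the object is an explicit first parameter,
// and the field access 'a' becomes 'self->a'.
int C_sum(C* self, int b) {
    return self->a + b;
}
```

With a set to 5, both varC->sum(10) and C_sum(varC, 10) return 15. Since the methods are not stored in the object, the object only needs room for the int field.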
Union: (the fields of a union overlap; they are mutually exclusive)
e.g.:
union { int a;
double d;
} U;
- The size of the union is the size of its largest field (in this case the size of U
is 8 because 'd' takes 8 bytes, which is greater than the 4 bytes of 'a'). Thus, 'd' will
overlap with 'a'.
Suppose we have:
U.d = 12.34;
U.a = 1234;
cout << U.d;
- This will print a meaningless number, because the store to U.a overwrote part of U.d.
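A small sketch of the overlap. Reading a union member other than the one last written is not defined in C++, so this version copies the raw bytes out with memcpy instead of reading U.d directly; the type name UType and the function name are assumptions for illustration.

```cpp
#include <cassert>
#include <cstring>

union UType {
    int a;
    double d;
};

// Store 12.34 into d, overwrite the overlapping bytes through a, then
// return whatever the 8 bytes of the union now look like as a double.
double clobberedValue() {
    UType u;
    u.d = 12.34;
    u.a = 1234;  // reuses the same storage as d
    double out;
    std::memcpy(&out, &u, sizeof out);  // inspect the raw bytes
    return out;
}
```

The returned value is no longer 12.34: four of its eight bytes were replaced by the bytes of the integer 1234.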
Array: Today Prof. Fischer talked only about 1-dimensional, statically bounded
arrays.
e.g.: int a[10];
- First it stores the size of 'a' (a 4-byte field).
- So, the storage needed for (int a[10]) would be:
size field of 'a' + (10)(4) = 4+40 = 44 bytes
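The 4 + 40 = 44 computation can be sketched as below, assuming the array's storage begins with a 4-byte size field followed by the elements. The element-offset formula is an extrapolation from that layout rather than something stated in the lecture, and all names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>

// Assumed: a 4-byte field at the start of the array holds its size.
constexpr std::size_t kSizeField = 4;

// Total storage: size field + (number of elements)(bytes per element).
constexpr std::size_t arrayBytes(std::size_t count, std::size_t elemSize) {
    return kSizeField + count * elemSize;
}

// Offset of element 'index' from the start of the storage, under the
// same assumed layout (elements follow the size field).
constexpr std::size_t elementOffset(std::size_t index, std::size_t elemSize) {
    return kSizeField + index * elemSize;
}
```

For int a[10], arrayBytes(10, 4) gives 44, and under this layout a[0] would sit at offset 4 and a[9] at offset 40.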
<--- End_of_notes --->