Class notes for Wednesday 04/21/99 lecture
by Ali Contractor (contract)

Topics: Garbage Collection
             Memory Allocation for Data Structures

Garbage Collection:

Mark/Sweep (a popular garbage collection algorithm)
Flaws with this algorithm:
(i) Expensive - looks at the entire heap.
(ii) Doesn't compact the heap (leaves a lot of holes).
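
To make the two flaws concrete, here is a minimal sketch of Mark/Sweep. The Obj type, the index-based "pointers" and the function names are assumptions made for illustration, not something given in lecture. Note that sweep walks every slot of the heap, and that dead objects are freed in place, so the holes stay where they are.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical heap object: a mark bit plus the indices of the objects it points to.
struct Obj {
    bool marked = false;
    bool freed = false;
    std::vector<std::size_t> refs;  // "pointers" are indices into the heap vector
};

// Mark phase: flag everything reachable from object i.
void mark(std::vector<Obj>& heap, std::size_t i) {
    if (heap[i].marked) return;
    heap[i].marked = true;
    for (std::size_t r : heap[i].refs) mark(heap, r);
}

// Sweep phase: visits EVERY slot, live or dead. Returns how many objects were freed.
std::size_t sweep(std::vector<Obj>& heap) {
    std::size_t freedCount = 0;
    for (Obj& o : heap) {
        if (o.freed) continue;
        if (o.marked) {
            o.marked = false;        // reset for the next collection
        } else {
            o.freed = true;          // freed in place: leaves a hole, nothing moves
            o.refs.clear();
            ++freedCount;
        }
    }
    return freedCount;
}

// One full collection starting from the given roots.
std::size_t collect(std::vector<Obj>& heap, const std::vector<std::size_t>& roots) {
    for (std::size_t r : roots) mark(heap, r);
    return sweep(heap);
}
```

With a four-object heap where 0 points to 1, 2 points to 3, and only 0 is a root, collect frees objects 2 and 3 and leaves 0 and 1 at their original positions.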

Area Copy:  Break the heap into two pieces of equal size (FROM and TO).

At the very beginning it looks like:
   FROM         TO

   free

            HEAP

- When we hit a threshold (almost all the free space in FROM is allocated), we do a garbage collection: it is time to copy the live objects from the FROM space to the TO space.
- Make a contiguous (adjacent) physical copy of all live data in the TO space, and update the pointers so they refer to the copies.
- Leave a forwarding pointer in each old object that points to its new copy.
- Then release FROM; now TO becomes FROM and FROM becomes TO.
So, finally it looks like:
     TO             FROM

   free allocated

            HEAP
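
The copy steps above can be sketched roughly as follows. This is only an illustration under assumptions: the Cell type, the NONE marker, and the use of vector indices in place of real pointers are all made up here, and the recursion stands in for whatever traversal a real collector would use.

```cpp
#include <cstddef>
#include <vector>

constexpr std::size_t NONE = static_cast<std::size_t>(-1);

// Hypothetical heap cell with one outgoing reference.
struct Cell {
    int value = 0;
    std::size_t next = NONE;     // index of another cell in the same space, or NONE
    std::size_t forward = NONE;  // forwarding pointer: index of the copy in TO
};

// Copy cell i from FROM into TO (unless already copied) and return its new index.
std::size_t copyCell(std::vector<Cell>& from, std::vector<Cell>& to, std::size_t i) {
    if (i == NONE) return NONE;
    if (from[i].forward != NONE) return from[i].forward;  // already moved
    std::size_t j = to.size();       // copies are placed adjacent to each other
    Cell c;
    c.value = from[i].value;
    to.push_back(c);
    from[i].forward = j;             // set before following references, so cycles terminate
    std::size_t n = copyCell(from, to, from[i].next);
    to[j].next = n;
    return j;
}

// One collection: copy everything reachable from the roots, then swap the spaces.
void collect(std::vector<Cell>& from, std::vector<Cell>& to,
             std::vector<std::size_t>& roots) {
    to.clear();
    for (std::size_t& r : roots) r = copyCell(from, to, r);
    from.clear();   // release FROM in one step; dead cells are never inspected
    from.swap(to);  // TO becomes FROM and FROM becomes TO
}
```

Only cells reachable from the roots are ever touched, so the work done is proportional to the live data and not to the size of the heap.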

Crucial differences with regard to Mark/Sweep:
- The cost of garbage collection depends only on the live data.
   - The more dead objects there are, the cheaper the collection is.
   - Most of the heap ends up free, because most allocated objects are dead.
- It does not have to inspect everything.

Global Table: It tells how many live objects point into the heap.
Generational Garbage Collection: Objects are grouped by age. Objects which remain in the heap for a long time form an older generation, which is collected less often than the young objects.

Memory Allocation (at run-time) for Data Structures:
- Structs
- Classes
- Records
- Unions
- Arrays

Structs:
e.g.:
struct { int a;
            char b;
            double c;
          } R;

  From the SymbolTable point of view, just create one ST for the fields of the struct (a, b, c), and enter the struct itself (R) in the global ST.

fields of the struct:

a
b
c

global ST for the struct R:

R
 

- For every field of the struct present in the SymbolTable we keep track of its offset (its position relative to the start of the struct), the type of the field (in this case int for a, char for b and double for c), and the fact that it is a field.
- Other than that, we store the overall size of the struct (in this case it is 16: size of 'a' is 4, size of 'b' is 1+3 (the 3 bytes are free space, i.e. padding), size of 'c' is 8).
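
As a rough sketch of how these offsets and the overall size could be computed, assuming each scalar field must be aligned to a multiple of its own size (the FieldEntry type and the function names are hypothetical, not from lecture):

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical symbol-table entry for one field of a struct.
struct FieldEntry {
    std::string name;
    std::string type;
    std::size_t size;    // bytes occupied by the field
    std::size_t align;   // required alignment (assumed equal to size for scalars)
    std::size_t offset;  // filled in by layoutStruct
};

// Round n up to the next multiple of align.
std::size_t roundUp(std::size_t n, std::size_t align) {
    return (n + align - 1) / align * align;
}

// Assign offsets in declaration order and return the overall struct size.
std::size_t layoutStruct(std::vector<FieldEntry>& fields) {
    std::size_t offset = 0;
    std::size_t maxAlign = 1;
    for (FieldEntry& f : fields) {
        offset = roundUp(offset, f.align);  // skip free space (padding) if needed
        f.offset = offset;
        offset += f.size;
        if (f.align > maxAlign) maxAlign = f.align;
    }
    return roundUp(offset, maxAlign);       // pad the end to the strictest alignment
}
```

For the fields a (int, 4 bytes), b (char, 1 byte) and c (double, 8 bytes) this assigns offsets 0, 4 and 8 and returns an overall size of 16, matching the numbers above.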

The layout of the struct looks like:
_________
|  c      | offset 8 (bytes 8-15)
|  (free) | bytes 5-7
|  b      | offset 4 (byte 4)
|  a      | offset 0 (bytes 0-3)
_________
So, the overall size is 16 bytes.

Now, let us make R global: (assuming that the address of R is 200)

e.g.: R.a = 10 (the address of R.a = address of R + offset of a => the address of R.a = 200+0 = 200).
e.g.: R.c = 0.0 (the address of R.c = address of R + offset of c => the address of R.c = 200+8 = 208).
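
The same "address of R + offset" rule can be checked against a real compiler. In this sketch the type name RType and the helper fieldAddress are made up for illustration; the actual address of R will of course not be 200, and the exact offsets depend on the platform.

```cpp
#include <cstddef>
#include <cstdint>

struct RType {
    int a;
    char b;
    double c;
};

RType R;  // global, so its address is fixed for the whole run

// Address of a field computed by hand: address of R plus the field's offset.
std::uintptr_t fieldAddress(std::size_t offset) {
    return reinterpret_cast<std::uintptr_t>(&R) + offset;
}
```

Here fieldAddress(offsetof(RType, c)) gives the same value as the address the compiler uses for &R.c, and likewise for a and b.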

Classes and records work pretty much the same way.

Class:
e.g.:

class C { int a;
               int sum(int b) { return (a+b); }
             };

C varC = new C(); (the new object is 4 bytes long, because 'a' is of type int, which takes 4 bytes; the method sum takes no space in the object).
varC.sum(10); => sum(varC,10); (the object is passed as an extra first argument)
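
A small sketch of this translation in C++ syntax. The free function C_sum and its parameter name self are hypothetical names for what the compiler effectively generates.

```cpp
struct C {
    int a;
    int sum(int b) { return a + b; }
};

// The method rewritten as an ordinary function: the object the method
// was called on becomes an explicit first parameter.
int C_sum(C* self, int b) {
    return self->a + b;
}
```

Calling varC->sum(10) and C_sum(varC, 10) on the same object gives the same result, and sizeof(C) equals sizeof(int) because only the field 'a' is stored in the object.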

Union: (fields in the union overlap; they are mutually exclusive)
e.g.:
union { int a;
           double d;
          } U;
- The size of the union is the size of its largest field (in this case the size of U is 8, because 'd' takes 8 bytes, which is greater than the 4 bytes of 'a'). Thus, 'd' overlaps with 'a'.

suppose, we have:
U.d = 12.34;
U.a = 1234;
cout << U.d;
- This will print an unreasonable number, because the assignment to U.a overwrote part of the bytes of U.d.
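
The overlap can be demonstrated with the sketch below. Reading U.d directly after writing U.a is undefined behavior in C++, so this version imitates the shared storage with a byte buffer and memcpy; the names UType and clobbered are made up for illustration.

```cpp
#include <cstring>

union UType {
    int a;
    double d;
};

// Imitate "U.d = 12.34; U.a = 1234;" on a buffer the size of the union,
// then read the buffer back as a double.
double clobbered() {
    unsigned char bytes[sizeof(UType)];
    double d = 12.34;
    int a = 1234;
    std::memcpy(bytes, &d, sizeof d);  // U.d = 12.34
    std::memcpy(bytes, &a, sizeof a);  // U.a = 1234 overwrites the first sizeof(int) bytes
    double out;
    std::memcpy(&out, bytes, sizeof out);
    return out;
}
```

The value returned by clobbered() is no longer 12.34, since some of its bytes now belong to the int 1234. Which bytes are affected depends on the machine's byte order.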

Array: Today Prof. Fischer talked only about 1-dimensional, statically bounded arrays.
e.g.: int a[10];
- First it will look at the size of 'a'.

- So, the size of (int a[10]) would be:
size of 'a' + (10)(4) = 4+40 = 44 bytes (4 bytes for the size of 'a', plus 10 elements of 4 bytes each)
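
A minimal sketch of this arithmetic, reading "size of 'a'" as a 4-byte word that holds the array length and sits in front of the elements. That reading, the element-offset formula and both function names are assumptions for illustration.

```cpp
#include <cstddef>

// Total bytes for an array: one size field followed by the elements.
std::size_t arrayBytes(std::size_t sizeField, std::size_t count, std::size_t elemSize) {
    return sizeField + count * elemSize;
}

// Offset of element i from the start of the array's storage,
// assuming the elements come right after the size field.
std::size_t elementOffset(std::size_t sizeField, std::size_t i, std::size_t elemSize) {
    return sizeField + i * elemSize;
}
```

For int a[10], arrayBytes(4, 10, 4) gives 44, and under the same assumption a[3] would start at offset elementOffset(4, 3, 4) = 16.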

<--- End_of_notes --->