Classes with Pointer Data Members

Overview
Destructor Functions
- Test Yourself #1
Copy Constructor Functions
Operator =
Wrap-up
Answers to Self-Study Questions

Overview

Every class that has a pointer data member should include the following member functions:

a destructor,
a copy constructor,
operator= (assignment)

The IntList class, defined in the "Introduction to C++ Classes" notes, includes a pointer to a dynamically allocated array. Here is the declaration of the IntList class again, augmented to include declarations of the class's destructor, copy constructor, and assignment operator (in red for emphasis):

class IntList {
  public:
    IntList();                         // constructor; initialize the list to
                                       // be empty
    ~IntList();                            // destructor
    IntList(const IntList &L);             // copy constructor
    IntList & operator=(const IntList &L); // assignment
    void AddToEnd(int k);              // add k to the end of the list
    void Print(ostream &output) const; // print the list to output

  private:
    static const int SIZE = 10;      // initial size of the array
    int *Items;                      // Items will point to the dynamically
                                     // allocated array
    int numItems;                    // number of items currently in the list
    int arraySize;                   // the current size of the array
};

Destructor Functions

An object's destructor function is called when that object is about to "go away"; i.e., when:

a class object (a value parameter or a local variable) goes out of scope, or
a pointer to a class object is deleted (the dynamically allocated storage pointed to by the pointer is freed by the programmer using the delete operator)

The main purpose of the destructor function is to free any dynamically allocated storage pointed to only by a data member of that object. (Note that it is up to the programmer to ensure that no other pointers are pointing to that storage.)

For example, consider the following function, with line numbers included for reference:

[1]  void f(IntList L) {
[2]    IntList *p = new IntList;
[3]    while (...) {
[4]      IntList L1;
[5]      ...
[6]    }
[7]    delete p;
[8]  }

In this example, the scope of value parameter L is the whole function; L goes out of scope at the end of the function (line 8). So when function f ends, L's destructor function is called. (Note: if f had one or more return statements, L's destructor function would be called when a return was executed).

The scope of variable L1 is the body of the while loop (lines 4 to 6). L1's constructor function is called at the beginning of every iteration of the loop, and its destructor function is called at the end of every iteration of the loop. Note that if the loop included a break or continue statement, the destructor would still be called.

Variable p is a pointer to an IntList. When an IntList object is allocated using new at line 2, that object's constructor function is called. When the storage is freed at line 7, the object's destructor function is called (and then the memory for the Intlist itself is freed).

TEST YOURSELF NOW

Why isn't the destructor function of a reference parameter called at the end of the function?

solution

Destructor functions are defined using syntax similar to that used for the constructor function (the name of the class followed by a double colon followed by the name of the function). For example, the definition of the Intlist destructor function would look like this:

IntList::~IntList() {
  delete [] Items; // free the dynamically allocated array pointed to by Items
}

NOTE: If you don't write a destructor function for a class that includes pointers to dynamically allocated storage, your code will still work, but you will probably have some storage leaks.

To understand more about storage management and destructor functions, let's consider a simpler version of the example code given above:

[1]  void f() {
[2]    IntList *p = new IntList;
[3]      ...
[4]    delete p;
[5]  }

Assume that just before line 4, we have the following situation:

p: ---------> +---------------+
              |               |    +---+
              | Items: ----------> | 2 |
              |               |    |---|
              | numItems: 10  |    | 6 |
              |               |    |---|
              | arraySize: 10 |    | . |
              |               |    | . |
              +---------------+    | . |
                                   +---+

If there is no IntList destructor, then when delete p is executed, the storage for the IntList object pointed to by p (which was allocated at line 2) is freed. However, the array pointed to by the IntList's Items field is not freed, and will never be freed, so that is a storage leak. If the IntList destructor given above (that deletes the array pointed to by Items) is provided, then it is called when line 4 is executed. That call frees the array storage, and then the delete operator frees the storage pointed to by p (namely, the storage for the IntList itself), and there is no storage leak.

Copy Constructor Functions

An object's copy constructor is called (automatically, not by the programmer) when it is created, and needs to be initialized to be a copy of an existing object. This happens when an object is:

passed as a value parameter to a function,
returned (by value) as a function result,
declared with initialization from an existing object of the same class.

The purpose of the copy constructor is to make a copy of the

actual parameter,
value being returned,
existing object.

Example

[1]   IntList f( IntList L );
[2] 
[3]   int main() {
[4]     IntList L1, L2;
[5]     ...
[6]     L2 = f( L1 );           // copy constructor called here to copy L1
[7]   }
[8] 
[9]   IntList f( IntList L ) {
[10]    IntList tmp1 = L;       // copy constructor called here to copy L
[11]    IntList tmp2(L);        // copy constructor called here to copy L
[12]    ...
[13]    return tmp1;            // copy constructor called here to copy tmp1
[14]  }

On line 6, variable L1 is passed as a value parameter to function f. The corresponding formal parameter is L. When the call is executed, L's copy constructor is called to initialize L to be a copy of the actual parameter, L1.

On line 10, variable tmp1 is declared to be an IntList, initialized to be the same as variable L. When line 10 is executed, tmp1's copy constructor is called to initialize tmp1 to be a copy of L. Similarly, when line 11 is executed, tmp2's copy constructor is called to initialize tmp2 to be a copy of L.

On line 13, variable tmp1 is returned as the result of calling function f. When line 13 is executed, a copy constructor is called to make a copy of tmp1 to be returned. (Later, that copy is used as the right-hand side of the assignment on line 6.)

If you don't write a copy constructor, the compiler will provide one that just copies the value of each data member (this is sometimes called a shallow copy). If some data member is a pointer, this causes aliasing (both the original pointer and the copy point to the same location), and may lead to trouble. For example, consider the following code:

void f(IntList L) {
  L.AddToEnd(11);
}

int main() {
  IntList I;

  for (int k=1; k<11; k++) I.AddToEnd(k);

  // I's array "Items" is now full
  f(I);
  ...
}

If the IntList class does not include a copy constructor, the compiler will supply one that just copies the value of the pointer Items. Here are pictures illustrating the result of the call to I's copy constructor, which initializes the formal parameter L to be a copy of I. Note that both I.Items and L.Items point to the same array.

   +---------------+
   |               |     +----+
I: | Items: -----------> |  1 |
   |               | +-> |----|
   |               | |   |  2 |
   | numItems: 10  | |   |----|
   |               | |   |  3 |
   | arraySize: 10 | |   |----|
   +---------------+ |   |  . |
                     |   |  . |
   +---------------+ |   |  . |
   |               | |   |----|
L: | Items: ---------+   | 10 |
   |               |     +----+
   |               |
   | numItems: 10  |
   |               |
   | arraySize: 10 |
   +---------------+

Now think about what happens when the body of function f executes. L.AddToEnd discovers that the array is full, so it allocates a new array, copies the values from the old array to the new array, and returns the old array to free storage. Unfortunately, L.AddToEnd doesn't know that I.Items is also pointing to the old array, so when that array is returned to free storage, I.Items becomes a dangling pointer, and any attempt to access the array it points to is likely to lead to trouble.

TEST YOURSELF NOW

Consider the StrList class defined below. A StrList stores a list of strings in a linked list pointed to by the StrList's head field. The Lookup operation determines whether a given string is in the list; if it is there, it is moved to the front of the list, and the value true is returned (otherwise, the list is unchanged, and the value false is returned).

class StrList {
  public:
    // constructor
    StrList();

    // modifiers
    void AddToFront(string s);
    bool Lookup(string s);

    // other operations
    void Print(ostream &output) const;

  private:
    struct ListNode {
       string data;
       ListNode *next;
    };
    
    // pointer to the first node of the list
    ListNode *head;
};

Consider the following code:

void f(StrList L) {
  L.Lookup("b");
}

int main() {
  StrList S;

  S.AddToFront("c"); S.AddToFront("b"); S.AddToFront("a");
  // S.head points to the linked list:
  //     "a" -> "b" -> "c"
  f(S);
  ...
}

Note that there is no StrList copy constructor (so the compiler will supply one). Draw variables S and L as they would appear at the very beginning of function f (just after L's copy constructor is called to initialize it to be a copy of S). Draw a second picture to illustrate what happens as a result of the call to L.Lookup in function f. What goes wrong because there is no StrList copy constructor?

solution

The Copy Constructor Declaration

Recall that the declaration of a class's copy constructor is similar to that of its default (no-argument) constructor: the function has no return type (not even void), and its name is the same as the name of the class. However, unlike the default constructor, the copy constructor has one argument: its type is the class, and it is a const reference parameter. The argument is the object that the copy constructor is supposed to copy. For example:

class IntList {
  public:
    IntList();                 // default constructor
    IntList(const IntList &L)  // copy constructor
  ...
};

The Copy Constructor Definition

The definition of the copy constructor (the actual code for the function) should be put in a ".C" file, along with the code for the other class member functions. The copy constructor should copy the values of all non-pointer data members, and should copy the objects pointed to by all pointer data members (this is sometimes called a deep copy). For example, the copy constructor for the IntList class should perform the following tasks:

allocate a new array of ints of size L.arraySize (L is the copy constructor's IntList parameter); set Items to point to the new array;
copy the values in the array pointed to by L.Items to the new array;
initialize the numItems and arraySize fields to have the same values as the ones in L.numItems and L.arraySize.

Here is the code for the IntList copy constructor (note that, like the other constructor functions, the copy constructor can use a member initialization list to initialize data members, as well as using code in the body of the function):

IntList::IntList(const IntList & L): Items(new int[L.arraySize]),
                                     numItems(L.numItems),
                                     arraySize(L.arraySize)
                                     {
  for (int k=0; k<numItems; k++) {
    Items[k] = L.Items[k];
  }
}

Operator=

In C++ you can assign from one class object to another (of the same type). For example:

IntList L1, L2;
...
L1 = L2;  // this assignment is OK

By default, class assignment is just field-by-field assignment (i.e., a shallow copy is done). For example, the above assignment is equivalent to:

L1.Items = L2.Items;
L1.numItems = L2.numItems;
L1.arraySize = L2.arraySize;

(Of course, the three field assignments could not be written outside an IntList member function, since they are private fields; however, they illustrate the effect of the assignment L1 = L2.)

If a class includes pointer fields, the default class assignment causes aliasing, and as we have seen in the case of the copy constructor, this can lead to trouble! For example, if the L2.Items array is full when the assignment is done, then a subsequent call to L1.AddToFront will cause the array to be returned to free storage (so L2.Items will become a dangling pointer).

The default assignment can also cause storage leaks when the class has a pointer field. For example, when L1 = L2; is executed, L1.Items is simply overwritten with the value in L2.Items, the array that L1 was pointing to is not returned to free storage (and that storage is now lost).

To prevent these problems, you should always define operator= as a class member function for a class with a pointer field, and you should define operator= to do a deep copy. The declaration of the member function looks like this for the IntList class:

IntList & operator=(const IntList &L);

The idea is that when the assignment L1 = L2; is executed, L1's member function operator= is called, and L2 is passed as the argument to that function.

Note that IntList's operator= function returns an IntList. This is to permit chained assignment, for example: L1 = L2 = L3;. When this statement is executed, the expression L2 = L3 is evaluated first; the result of evaluating that expression is used as the right-hand side of the assignment to L1. The operator= function returns its result by reference (that's what the ampersand means). This is done for efficiency, to prevent the IntList copy constructor being called to make a copy of the returned value.

Note that operator= differs from the copy constructor in three important ways:

The object being assigned to has already been initialized; therefore, if it has a pointer field, the storage pointed to must be freed to prevent a storage leak.
It is possible for a programmer to assign from a variable into itself; for example: L1 = L1. The operator= code must check for this case, and do nothing.
The operator= code must return a value.

Here is the definition of operator= for the IntList class:

IntList & IntList::operator=(const IntList &L) {
  // check for "self assignment" and do nothing in that case
  if (this == &L) return *this;
  else {
    delete [] Items;                // free the storage pointed to by Items
    Items = new int[L.arraySize];   // allocate a new array
    arraySize = L.arraySize;        // set the arraySize field

    // copy the items from L's array to the new array
    // also sets the numItems field
    for (numItems=0; numItems < L.numItems; numItems++) {
      Items[numItems] = L.Items[numItems];
    }

    return *this;                   // return this IntList
    
}

Note that, as in Java, every member function has access to a variable named this that is a pointer to the object whose member function was called. So for example, when L1 = L2; is executed, L1's member function operator= is called, so this is a pointer to L1.

To check whether the assignment was L1 = L1, we compare the pointer this with the address of the parameter, L; in the case of L1 = L1, the parameter is L1, so its address is the same as the address that is the value of this. Be sure to include this test every time you write an operator= function!

We also make use of this for the returned value; the type to be returned is IntList (not pointer to IntList) so we return *this (the IntList pointed to by this) rather than plain this.

Wrap-up

Every class that has a pointer data member should include the following member functions:

a destructor,
a copy constructor,
operator= (assignment)

If you don't write a destructor, your code will probably still work, but it may have storage leaks (some uses of the new operator will have no corresponding use of delete).

If you don't write a copy constructor, or you don't write operator=, your code may not work correctly; there may be attempts to dereference dangling pointers (which may cause runtime errors, or may cause garbage values to be assigned to some variables), and/or some data may be lost or corrupted.

A kind of compromise is to forbid the use of the copy constructor and the assignment of two class objects. You can do this by declaring the copy constructor and operator= as private member functions (just declaring them is enough; you do not need to write the actual code). In this case, any code that would normally cause the copy constructor or operator= to be called will instead cause a compile-time error.

Solutions to Self-Study Questions

Test Yourself #1

A reference parameter's destructor function is not called at the end of the function because the corresponding actual parameter refers to the same object. The object does not "go away" when the function ends, so its dynamically allocated storage should not be freed.

Test Yourself #2

Here are actual parameter S and formal parameter L at the beginning of function f:

   +-------------+
   |       +---+ |   +---------+  +---------+  +---------+
S: | head: | ------->| "a" | ---->| "b" | ---->| "c" | \ |
   |       +---+ |   +---------+  +---------+  +---------+
   +-------------+   ^
                     |
   +-------------+   |
   |       +---+ |   |
L: | head: | --------+
   |       +---+ |
   +-------------+

And here are S and L just after the call to Lookup in function f:

   +-------------+
   |       +---+ |   +---------+  +---------+
S: | head: | ------->| "a" | ---->| "c" | \ |
   |       +---+ |   +---------+  +---------+
   +-------------+   ^
                     |
   +-------------+   +-----------+
   |       +---+ |   +---------+ |
L: | head: | ------->| "b" | ----+
   |       +---+ |   +---------+
   +-------------+

Because the StrList class has no copy constructor, the head fields of both S and L point to the same linked-list. The call to Lookup changes that list (by moving the node that contains "b" to the front of the list); it also changes the head field of L to point to the node that was moved. However, it doesn't know anything about S, so S's head field still points to the node that contains "a" (and the node that contains "b" has effectively been removed from the list represented by S).

Classes with Pointer Data Members

Contents

Example