                    Ownership of Memory
                            and
           Constructors and Assignment Operators

                            Bolo


Background

     In the first note, I wrote about the 4 methods that any
C++ class always has, in one form  or  another.   This  note
tries  to go beyond that and explain what the philosophy for
pointer ownership should be.

Everything that is a pointer is owned

     The basic assumption that makes things work correctly is
that the ownership of every pointer in the system is known.
This does not mean reference counting; it means that object
ownership is designed: designed into the system, and designed
into the code that is written to implement that system.

     For example, a class is given a 'char *' to use as a
C-style string, so it has a pointer member that points to the
string. At a minimum, this means the class needs a Default
Constructor, a Copy Constructor, an Assignment Operator, and a
Destructor defined. It also needs any comparison operators
properly defined. These methods must all work in the face of
arbitrary usage by the C++ language and users of that
language: not how you expect the class to be used, but how the
class could be used.
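
     To see why those methods matter, here is a minimal
sketch of what a copy does with and without them. The names
ShallowString, OwnedString, and c_str() are hypothetical and
exist only for this illustration. ShallowString deliberately
has no destructor so that the demonstration itself cannot
delete the same memory twice.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical class that relies on the compiler-generated copy
// operations. It has no destructor on purpose: a real class that did
// 'delete [] str' in its destructor would free the buffer twice.
struct ShallowString {
    char *str;
    explicit ShallowString(const char *s) : str(new char[std::strlen(s) + 1]) {
        std::strcpy(str, s);
    }
};

// Hypothetical class that defines all four methods, so every object
// owns its own buffer.
class OwnedString {
    char *str;
    static char *dup(const char *s) {
        char *p = new char[std::strlen(s) + 1];
        std::strcpy(p, s);
        return p;
    }
public:
    OwnedString(const char *s = "") : str(dup(s)) {}
    OwnedString(const OwnedString &r) : str(dup(r.str)) {}
    ~OwnedString() { delete [] str; }
    OwnedString &operator=(const OwnedString &r) {
        if (this != &r) {
            char *copy = dup(r.str);   // copy first, then release
            delete [] str;
            str = copy;
        }
        return *this;
    }
    const char *c_str() const { return str; }
};

// True when a compiler-generated copy shares the original's buffer.
bool shallow_copy_shares_buffer() {
    ShallowString a("abc");
    ShallowString b = a;               // memberwise copy: pointer only
    bool shared = (a.str == b.str);
    delete [] a.str;                   // exactly one delete for one buffer
    return shared;
}

// True when user-defined copies each get their own buffer.
bool owned_copy_has_own_buffer() {
    OwnedString a("abc");
    OwnedString b = a;                 // copy constructor
    OwnedString c;                     // default constructor
    c = a;                             // assignment operator
    return a.c_str() != b.c_str() && a.c_str() != c.c_str()
        && std::strcmp(b.c_str(), "abc") == 0
        && std::strcmp(c.c_str(), "abc") == 0;
}

void check_ownership() {
    assert(shallow_copy_shares_buffer());
    assert(owned_copy_has_own_buffer());
}
```

     In the shallow case two objects point at one buffer and
neither can say who owns it. That is exactly the question the
design is supposed to answer up front.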



Simple Example

     Here is a simple class that has a pointer to a string in
it. This is not the only way to implement such a class, but
it illustrates that the class always knows what memory it
owns, and always knows what to do with it.

     This simple, self-contained strategy works well in most
cases, and should be the minimum that is done with any
pointer-containing class.

#include <string.h>

class HasString {
     /* memory pointed to is owned by the class; pointer is always valid */
     char *str;

     /* allocate str as a private copy of s (empty string if s is null) */
     void construct(const char *s);

public:
     HasString(const char *s = 0);
     HasString(const HasString &r);
     ~HasString();

     const HasString &operator=(const HasString &r);

     bool operator==(const HasString &r) const;
     bool operator<(const HasString &r) const;
};

void HasString::construct(const char *s)
{
     // if nothing, provide a null string
     size_t    len = s ? strlen(s) : 0;

     str = new char[len + 1];

     if (len)
          strcpy(str, s);
     else
          str[0] = '\0';
}

HasString::HasString(const char *s)
: str(0)
{
     construct(s);
}

HasString::HasString(const HasString &r)
: str(0)
{
     /* Example setup to show that either could work */
#if 1
     /* Works because delete 0 is a no-op */
     /* If you do something like this, it would be good to make a note
        about it. */
     *this = r;
#else
     /* The obvious way */
     construct(r.str);
#endif
}

HasString::~HasString()
{
     /* XXX two things going on here:
        1) make sure the pointer won't be deleted again if this
           object is destroyed again.
        2) make our state invalid upon destruction.
      */
     delete [] str;
     str = 0;
}

const HasString &HasString::operator=(const HasString &r)
{
     /* Self-assignment: r.str is our own buffer, so deleting it
        first would leave nothing to copy from. */
     if (this == &r)
          return *this;

     delete [] str;
     str = 0;

     construct(r.str);

     return *this;
}

bool HasString::operator==(const HasString &r) const
{
     return strcmp(str, r.str) == 0;
}

bool HasString::operator<(const HasString &r) const
{
     return strcmp(str, r.str) < 0;
}
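
     As a rough check that a class built this way survives
"how the class could be used", the sketch below pushes a
condensed stand-in through self-assignment, pass by value, and
a standard container. The name Text, the c_str() accessor, and
the helper functions are hypothetical, added only so the
example is self-contained. It is one possible exercise, not an
exhaustive test.

```cpp
#include <algorithm>
#include <cassert>
#include <cstring>
#include <vector>

// Condensed stand-in for the pattern above. The name Text and the
// c_str() accessor are assumptions made for this sketch.
class Text {
    char *str;                              // owned, never null
    static char *dup(const char *s) {
        if (!s) s = "";
        char *p = new char[std::strlen(s) + 1];
        std::strcpy(p, s);
        return p;
    }
public:
    Text(const char *s = 0) : str(dup(s)) {}
    Text(const Text &r) : str(dup(r.str)) {}
    ~Text() { delete [] str; }
    Text &operator=(const Text &r) {
        if (this != &r) {                   // guard against x = x
            char *copy = dup(r.str);
            delete [] str;
            str = copy;
        }
        return *this;
    }
    bool operator==(const Text &r) const { return std::strcmp(str, r.str) == 0; }
    bool operator<(const Text &r) const { return std::strcmp(str, r.str) < 0; }
    const char *c_str() const { return str; }
};

// Pass by value: the language calls the copy constructor for us.
Text echo(Text t) { return t; }

// Uses the class the way the language and library might, not just the
// way its author planned.
void exercise_text() {
    Text a("pear");
    Text &alias = a;
    a = alias;                              // self-assignment via an alias
    assert(a == Text("pear"));

    Text empty;                             // default: empty string, not null
    assert(empty == Text(""));
    assert(empty < a);

    assert(echo(a) == a);                   // copied in, copied back out

    std::vector<Text> v;                    // containers copy and assign freely
    v.push_back(Text("pear"));
    v.push_back(Text("apple"));
    v.push_back(Text());
    std::sort(v.begin(), v.end());          // uses operator< and assignment
    assert(v[0] == empty);
    assert(v[1] == Text("apple"));
    assert(v[2] == a);
}
```

     None of these calls needs to know anything about the
pointer inside the object. The class owns its memory, so the
callers don't have to think about it.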


Why use a Pointer at all?

     C++ has references, and that is one of the big
advantages it has over C. In C, you always need to use
pointers to refer to a data structure in memory, regardless of
whether it exists or not. In some situations the pointed-to
object will always exist, and the pointer simply refers to
it. In other cases the pointed-to object is "optional": it
may or may not exist, and a check for a NULL pointer must
always be performed.

     In C++, references allow one to differentiate between
these two cases. If a pointer is always going to be valid in
the scope of a class, use a reference instead! The pointer
doesn't do anything for you, because there is always the
question: is it valid or not? A reference is a certainty. It
also says that the thing is guaranteed (or should be :) to
exist for the lifetime of this object.
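
     A small sketch of that distinction, assuming a class
with one collaborator that always exists and one that is
optional. Logger, Cache, Worker, and demo_worker are all
hypothetical names invented for this illustration.

```cpp
#include <cassert>
#include <string>

// Hypothetical collaborator types, used only for illustration.
struct Logger {
    int lines;
    Logger() : lines(0) {}
    void write(const std::string &) { ++lines; }
};

struct Cache {
    int hits;
    Cache() : hits(0) {}
    bool lookup(const std::string &) { ++hits; return false; }
};

class Worker {
    Logger &log;      // always present: a reference, never checked
    Cache  *cache;    // optional: a pointer, checked before every use

public:
    // The Logger must outlive the Worker; the Cache may be omitted.
    // Worker owns neither object.
    explicit Worker(Logger &l, Cache *c = 0) : log(l), cache(c) {}

    void run(const std::string &job) {
        if (cache && cache->lookup(job))   // NULL check only where optional
            return;
        log.write(job);                    // no check needed
    }
};

void demo_worker() {
    Logger log;
    Cache cache;
    Worker plain(log);             // no cache supplied
    Worker cached(log, &cache);    // cache supplied
    plain.run("a");
    cached.run("b");
    assert(log.lines == 2);        // this Cache never reports a hit
    assert(cache.hits == 1);       // consulted only by 'cached'
}
```

     One trade-off to be aware of: a class with a reference
member cannot use the compiler-generated assignment operator,
because a reference cannot be reseated. If the class must be
assignable, that has to be designed for as well.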

     This also goes along with having "stateless" classes:
objects that either exist and are valid, or don't exist at
all. Contrast that with a class that has more states: file
opened, XYZ done, etc. Some objects need states. But if you
can reduce statefulness in an object, you make it that much
simpler. In the previous example, there is never a state in
which you have to check whether a HasString object has a
string. It always does, and that simplifies the code
considerably.
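
     The cost of one extra state can be sketched with two
hypothetical types, MaybeLabel and Label. Both only point at
string literals, so ownership is not the issue in this
example. The point is how many cases each method, and each
caller, has to handle.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Two states: 'str' may be null, meaning "no string".
struct MaybeLabel {
    const char *str;
    explicit MaybeLabel(const char *s = 0) : str(s) {}

    std::size_t length() const { return str ? std::strlen(str) : 0; }
    bool equals(const MaybeLabel &r) const {
        if (!str || !r.str)              // every method repeats the check
            return str == r.str;
        return std::strcmp(str, r.str) == 0;
    }
};

// One state: 'str' always points at a valid string.
struct Label {
    const char *str;
    explicit Label(const char *s = 0) : str(s ? s : "") {}

    std::size_t length() const { return std::strlen(str); }
    bool equals(const Label &r) const { return std::strcmp(str, r.str) == 0; }
};

void compare_states() {
    // The extra state leaks out: "no string" and "empty string" differ,
    // even though both have length zero.
    assert(!MaybeLabel().equals(MaybeLabel("")));
    assert(MaybeLabel().length() == MaybeLabel("").length());

    // With one state there is nothing to distinguish and nothing to check.
    assert(Label().equals(Label("")));
    assert(Label("abc").length() == 3);
}
```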


Designing in Optimizations

     You can't hack in optimizations for the long term. It
creates a real mess. The next time you want to change the
thing that is hacked, it comes back to bite you. A hack can
be good for a prototype, but not for something that is going
to be checked into the repository. So, instead, optimization
has to be designed into a system. This note talks about
trying to optimize memory use in particular.

     I see lots of hacks to try to minimize memory usage.
They might work for a while, but eventually they become
unmaintainable. The problem comes when you want to do
something different from what the hack allows. Oops, you
can't do it the right way, so you need to hack the hack to
make it work. Or you have to do something non-obvious because
of the hack. Then everyone has to know and remember how to
use the hacked object correctly. One slip in using it and
you've created a problem that is lurking in the shadows. The
whole process is just like digging your own grave. So, to
stop digging, just do stuff right from the beginning.

     I have a note about Premature Optimization is the Root
of all Evil, and trying to optimize memory use via hacks is
one of the topics it covers. Suffice it to say that, in most
cases, you just won't know ahead of time whether your
optimization will actually have any effect on system
performance. Code it simply, and use memory whose ownership
is clearly defined by the class that uses it. If profiling
and other statistics show that it is a problem, then that
problem will need to be addressed. And the solution to the
problem will have to be a real design, one that will work
robustly and be correct. I'll address that type of thing
later, since we are worried about correctness for now.

     What to do in the meantime?  Well,  fall  back  on  the
Prime  Directive  that I mentioned at the start.  All Memory
is Owned.

Your Mission, Should you Choose to Accept it

     This goes hand-in-hand with the 4 "always" methods from
my first note. Go through the system and get rid of hacks
about memory ownership. In all probability, some of these
kinds of problems will not be removed easily. In that case,
don't hack up the system removing the hacks. Instead,
document their odd behaviors very precisely:

     o    Where do the tendrils of the hack reach?

     o    What performance problem was the hack trying to
          solve?

     o    Was the performance problem ever  documented  with
          profiling tools?

That will provide the kind of information we need to plan
and design a good solution for the problem.