Ownership of Memory
and
Constructors and Assignment Operators
Bolo
Background
In the first note, I wrote about the 4 methods that any
C++ class always has, in one form or another. This note
tries to go beyond that and explain what the philosophy for
pointer ownership should be.
Everything that is a pointer is owned
The basic assumption to make things work correctly is
that the ownership of every pointer in the system should be
known. This does not mean reference counting; it means
that object ownership is designed: designed into the system,
and designed into the code that is written to implement
that system.
For example, a class is given a 'char *' to use as a
C-style string, so it has a pointer member that points to
the string. At a minimum, this means the class will need to
have a Default Constructor, a Copy Constructor, an
Assignment Operator, and a Destructor defined. It will also
need to have any comparison operators properly defined.
These methods must all work in the face of arbitrary usage
by the C++ language and users of that language: not how you
expect the class to be used, but how the class could be
used.
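To see why the compiler-generated versions are not enough,
here is a minimal sketch of what the implicit copy
constructor does with a raw pointer member. The names
Shallow and copies_share_buffer are hypothetical and exist
only for illustration. The struct deliberately has no
destructor, so the shared buffer can be freed exactly once
by hand.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical class that relies on the compiler-generated copy constructor.
// No destructor on purpose: with one, both copies would delete the same buffer.
struct Shallow {
    char *str;
};

// Returns true if a copy points at the very same buffer as the original.
bool copies_share_buffer()
{
    Shallow a;
    a.str = new char[6];
    std::strcpy(a.str, "hello");

    Shallow b = a;                  // implicit copy: copies the pointer, not the string
    bool shared = (a.str == b.str);

    b.str[0] = 'j';                             // writing through b ...
    assert(std::strcmp(a.str, "jello") == 0);   // ... changes a as well

    delete [] a.str;                // free once; b.str now dangles and must not be used
    return shared;
}
```

Neither object can claim to own the buffer here, which is
exactly the situation the four methods are meant to rule
out.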
Simple Example
Here is a simple class that has a pointer to a string
in it. This is not the only way of going about implementing
such a class, but it illustrates that the class always knows
what memory it owns, and always knows what to do with it.
This simple self-contained strategy works really well
in most cases, and should be the minimum that is done with
any pointer-containing class.
#include <string.h>

class HasString {
    /* memory pointed to owned by class, pointer always valid */
    char *str;
    void construct(const char *s);
public:
    HasString(const char *s = 0);
    HasString(const HasString &r);
    ~HasString();
    const HasString &operator=(const HasString &r);
    bool operator==(const HasString &r) const;
    bool operator<(const HasString &r) const;
};

void HasString::construct(const char *s)
{
    // if nothing, provide a null string
    size_t len = s ? strlen(s) : 0;
    str = new char[len + 1];
    if (len)
        strcpy(str, s);
    else
        str[0] = '\0';
}

HasString::HasString(const char *s)
: str(0)
{
    construct(s);
}

HasString::HasString(const HasString &r)
: str(0)
{
    /* Example setup to show that either could work */
#if 1
    /* Works because delete 0 is a no-op */
    /* If you do something like this, it would be good to make a note
       about it. */
    *this = r;
#else
    /* The obvious way */
    construct(r.str);
#endif
}

HasString::~HasString()
{
    /* XXX two things going on here:
       1) make sure pointer won't be deleted again if this is deleted again.
       2) make our state invalid upon destruction.
    */
    delete [] str;
    str = 0;
}

const HasString &HasString::operator=(const HasString &r)
{
    /* Self-assignment must not free the string we are about to copy */
    if (this != &r) {
        delete [] str;
        str = 0;
        construct(r.str);
    }
    return *this;
}

bool HasString::operator==(const HasString &r) const
{
    return strcmp(str, r.str) == 0;
}

bool HasString::operator<(const HasString &r) const
{
    return strcmp(str, r.str) < 0;
}
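The requirement that a class work for how it could be used,
not just how you expect it to be used, can be checked with
a small exercise. This is a sketch under stated assumptions:
OwnedString is a hypothetical stand-in that follows the same
ownership pattern as HasString, c_str() is an accessor added
only so results can be inspected, and
survives_arbitrary_usage is an illustrative name.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical stand-in for HasString with the same ownership rule:
// str always points at memory this object owns.
class OwnedString {
    char *str;
    void construct(const char *s) {
        std::size_t len = s ? std::strlen(s) : 0;
        str = new char[len + 1];
        if (len)
            std::strcpy(str, s);
        else
            str[0] = '\0';
    }
public:
    OwnedString(const char *s = 0) : str(0) { construct(s); }
    OwnedString(const OwnedString &r) : str(0) { construct(r.str); }
    ~OwnedString() { delete [] str; str = 0; }
    OwnedString &operator=(const OwnedString &r) {
        if (this != &r) {           // self-assignment must not free our own source
            delete [] str;
            str = 0;
            construct(r.str);
        }
        return *this;
    }
    const char *c_str() const { return str; }   // accessor assumed for this demo
};

// Exercises uses the class author may not have planned for.
bool survives_arbitrary_usage()
{
    OwnedString empty;              // default: owns an empty string
    OwnedString a("abc");
    OwnedString b(a);               // the copy owns its own buffer
    OwnedString c;
    c = b = a;                      // chained assignment
    OwnedString &alias = a;
    a = alias;                      // self-assignment through an alias

    return std::strcmp(empty.c_str(), "") == 0
        && std::strcmp(a.c_str(), "abc") == 0
        && std::strcmp(c.c_str(), "abc") == 0
        && a.c_str() != b.c_str();
}
```

Self-assignment through an alias is the case most easily
overlooked: without the guard, the object would delete the
very string it is about to copy.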
Why use a Pointer at all?
C++ has references, and that is one of the big
advantages it has over C. In C, you always need to use
pointers to refer to a data structure in memory, regardless
of whether it exists or not. There are some situations in
which a pointed-to object will always exist, and the pointer
refers to this object. In other cases, the pointed-to
object may be "optional": it may or may not exist, and a
check for a NULL pointer must always be performed.
In C++, references allow one to differentiate between
these two cases. If a pointer is always going to be valid
in the scope of a class, use a reference instead! The
pointer doesn't do anything for you, because there is always
the question of whether it is valid or not. A reference is
a certainty. It also says that the thing is guaranteed (or
should be :) to exist for the lifetime of this object.
This also goes along with having "stateless" classes,
which either exist and are valid or don't exist, as opposed
to classes that have more states: file opened, XYZ done,
etc. Some objects need states. But if you can reduce
statefulness in an object, you make it that much simpler.
In the previous example, there is never a state you have to
examine to see whether a HasString object has a string. It
always does, and that simplifies the code considerably.
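The distinction between a required and an optional object
can be sketched as follows. All names here (Logger, Worker,
QuietWorker, lines_logged) are hypothetical and chosen only
for illustration. Neither class owns the Logger; the point
is only what each member type says about whether it exists.

```cpp
#include <cassert>

struct Logger {                     // hypothetical collaborator
    int lines;
    Logger() : lines(0) {}
    void write() { ++lines; }
};

// Reference member: the Logger is required and must outlive the Worker.
class Worker {
    Logger &log;
public:
    explicit Worker(Logger &l) : log(l) {}
    void run() { log.write(); }     // no validity check needed
};

// Pointer member: the Logger is optional, so every use must check it.
class QuietWorker {
    Logger *log;                    // not owned; may be null
public:
    explicit QuietWorker(Logger *l = 0) : log(l) {}
    void run() { if (log) log->write(); }
};

// Runs one Worker and two QuietWorkers against a single Logger.
int lines_logged()
{
    Logger shared;
    Worker w(shared);
    QuietWorker with(&shared), without;
    w.run();
    with.run();
    without.run();                  // safely does nothing
    return shared.lines;            // two writes reached the Logger
}
```

One consequence worth knowing: a class with a reference
member does not get a usable compiler-generated assignment
operator, since a reference cannot be reseated. That is
another place where the choice has to be designed rather
than discovered.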
Designing in Optimizations
You can't hack in optimizations for the long term. It
creates a real mess. The next time you want to change the
thing that is hacked, it comes back to bite you. A hack can
be good for a prototype, but not for something that is going
to be checked into the repository. So, instead,
optimization has to be designed into a system. This note
talks about trying to optimize memory use in particular.
I see lots of hacks to try to minimize memory usage.
They might work for a while, but after a while they become
unmaintainable. The problem is wanting to do something
different from what the hack allows. Oops, you can't do it
the right way, so you need to hack the hack to make it work.
Or, you have to do something non-obvious because of the
hack. Then everyone has to know and remember how to use the
hacked object correctly. One slip in using it and you've
created a problem that is lurking in the shadows. The whole
process is just like digging your own grave. So, to stop
digging your own grave, just do stuff right from the
beginning.
I have a note about how Premature Optimization is the
Root of all Evil, and trying to optimize memory use via
hacks is one of the topics it covers. Suffice it to say
that, in most cases, you just won't know ahead of time if
your optimization will actually have any effect on system
performance. Code it simply, and use memory whose ownership
is clearly defined by the class that uses it. If profiling
and other statistics show that it is a problem, then that
problem will need to be addressed. And the solution to the
problem will have to be a real design that will work
robustly and be correct. I'll address that type of thing
later, since we are worried about correctness for now.
What to do in the meantime? Well, fall back on the
Prime Directive that I mentioned at the start. All Memory
is Owned.
Your Mission, Should you Choose to Accept it
This goes hand-in-hand with the 4 "always" methods from
my first note. Go through the system and get rid of hacks
about memory ownership. In all probability, some of these
kinds of problems will not be removed easily. In that case,
don't hack up the system removing the hacks. Instead,
document their odd behaviors very precisely:
o Where do the tendrils of the hack reach?
o What performance problem was the hack trying to
  solve?
o Was the performance problem ever documented with
  profiling tools?
That will provide the kind of information we need to plan
and design a good solution for the problem.