Ownership of Memory and Constructors and Assignment Operators

Bolo

Background

In the first note, I wrote about the 4 methods that any C++ class always has, in one form or another. This note tries to go beyond that and explain what the philosophy for pointer ownership should be.

Everything that is a pointer is owned

The basic assumption to make things work correctly is that the ownership of every pointer in the system should be known. This does not mean reference counting; it means that object ownership is designed: designed into the system, and designed into the code that is written to implement that system.

For example, a class is given a 'char *' to use as a C-style string, so it has a pointer in the class to point to the string. At the minimum, this means the class will need to have a Default Constructor, a Copy Constructor, an Assignment Operator, and a Destructor defined. On another note, it will also need to have any comparison operators properly defined. These methods must all work in the face of arbitrary usage by the C++ language and users of that language. Not how you expect the class to be used, but how the class could be used.

Simple Example

Here is a simple class that has a pointer to a string in it. This is not the only way of going about implementing such a class, but it illustrates that the class always knows what memory it owns, and always knows what to do with it. This simple self-contained strategy works really well in most cases, and should be the minimum that is done with any pointer-containing class.

    #include <string.h>

    class HasString {
        /* memory pointed to owned by class, pointer always valid */
        char *str;

        void construct(const char *s);

    public:
        HasString(const char *s = 0);
        HasString(const HasString &r);
        ~HasString();

        const HasString &operator=(const HasString &r);

        bool operator==(const HasString &r) const;
        bool operator<(const HasString &r) const;
    };

    void HasString::construct(const char *s)
    {
        // if nothing, provide a null string
        size_t len = s ? strlen(s) : 0;
        str = new char[len + 1];
        if (len)
            strcpy(str, s);
        else
            str[0] = '\0';
    }

    HasString::HasString(const char *s)
    : str(0)
    {
        construct(s);
    }

    HasString::HasString(const HasString &r)
    : str(0)
    {
        /* Example setup to show that either could work */
    #if 1
        /* Works because delete [] of a null pointer is a no-op */
        /* If you do something like this, it would be good
           to make a note about it. */
        *this = r;
    #else
        /* The obvious way */
        construct(r.str);
    #endif
    }

    HasString::~HasString()
    {
        /* XXX two things going on here:
           1) make sure pointer won't be deleted again if
              this is deleted again.
           2) make our state invalid upon destruction. */
        delete [] str;
        str = 0;
    }

    const HasString &HasString::operator=(const HasString &r)
    {
        /* guard against self-assignment: without it, 'a = a'
           would delete the string before construct() could
           copy it, leaving an empty string */
        if (this != &r) {
            delete [] str;
            str = 0;
            construct(r.str);
        }
        return *this;
    }

    bool HasString::operator==(const HasString &r) const
    {
        return strcmp(str, r.str) == 0;
    }

    bool HasString::operator<(const HasString &r) const
    {
        return strcmp(str, r.str) < 0;
    }

Why use a Pointer at all?

C++ has references, and that is one of the big advantages it has over C. In C, you always need to use pointers to refer to a data structure in memory, regardless of whether it exists or not. There are some situations in which a pointed-to object will always exist, and the pointer refers to this object. In other cases, the pointed-to object may be "optional": it may or may not exist, and a check for a NULL pointer must always be performed.

In C++, references allow one to differentiate between these two cases. If a pointer is always going to be valid in the scope of a class, use a reference instead! The pointer doesn't do anything for you, because there is always the question: is it valid or not? A reference is a certainty. It also says that the thing is guaranteed (or should be :) to exist for the lifetime of this object.

This also goes along with having "stateless" classes, which either exist and are valid, or don't exist, versus a class that has more states: file opened, XYZ done, etc. Some objects need states.
But if you can reduce statefulness in an object, you make it that much simpler. In the previous example, there is never a state where you have to examine a HasString object to see if it has a string. It always does, and that simplifies the code considerably.

Designing in Optimizations

You can't hack in optimizations for the long term. It creates a real mess. The next time you want to change the thing that is hacked, it comes back to bite you. A hack can be good for a prototype, but not for something that is going to be checked into the repository. So, instead, optimization has to be designed into a system. This note talks about trying to optimize memory use in particular.

I see lots of hacks to try to minimize memory usage. They might work for a while, but eventually they become unmaintainable. The problem is wanting to do something different than what the hack allows. Oops, you can't do it the right way, so you need to hack the hack to make it work. Or you have to do something non-obvious because of the hack. Then everyone has to know and remember how to use the hacked object correctly. One slip in using it and you've created a problem that is lurking in the shadows. The whole process is just like digging your own grave. So, to stop digging your own grave, just do stuff right from the beginning.

I have a note about Premature Optimization is the Root of all Evil, and trying to optimize memory use via hacks is one of the topics it covers. Suffice it to say that, in most cases, you just won't know ahead of time if your optimization will actually have any effect on system performance. Code it simply; use memory whose ownership is clearly defined by the class that uses it. If profiling and other statistics show that it is a problem, then that problem will need to be addressed. And the solution to the problem will have to be a real design, one that will work robustly and be correct.
I'll address that type of thing later, since we are worried about correctness for now.

What to do in the meantime? Well, fall back on the Prime Directive that I mentioned at the start: All Memory is Owned.

Your Mission, Should you Choose to Accept it

This goes hand-in-hand with the 4 always methods from my first note. Go through the system and get rid of hacks about memory ownership. In all probability, some of these kinds of problems will not be removed easily. In that case, don't hack up the system removing the hacks. Instead, document their odd behaviors very precisely:

    o Where do the tendrils of the hack reach?

    o What performance problem was the hack trying to solve?

    o Was the performance problem ever documented with profiling tools?

That will provide the kind of information we need to plan and design a good solution for the problem.
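
As a closing illustration of the advice in "Why use a Pointer at all?", here is a minimal sketch of the two cases: a member that always exists (held by reference) and a member that is optional (held by pointer, checked on every use). The names Logger, Worker, OptionalWorker, and demo are hypothetical and exist only for this example. In both classes the Logger is owned by the caller, not by the class that refers to it.

```cpp
#include <cassert>

// Hypothetical collaborator, for illustration only.
struct Logger {
    int lines;
    Logger() : lines(0) {}
    void write(const char *) { ++lines; }
};

// Case 1: the Logger is guaranteed to exist for the lifetime of
// the Worker, so the member is a reference. There is no "is it
// valid?" question and no NULL check anywhere.
class Worker {
    Logger &log;    /* not owned; always valid */
public:
    explicit Worker(Logger &l) : log(l) {}
    void run() { log.write("running"); }
};

// Case 2: the Logger is optional, so the member is a pointer and
// every use has to check it first.
class OptionalWorker {
    Logger *log;    /* not owned; may be 0 */
public:
    explicit OptionalWorker(Logger *l = 0) : log(l) {}
    void run() { if (log) log->write("running"); }
};

// Exercises both cases and returns the number of lines written.
int demo()
{
    Logger log;                    // owned by this scope
    Worker always(log);
    OptionalWorker without;        // no logger supplied
    OptionalWorker with(&log);

    always.run();                  // writes one line
    without.run();                 // writes nothing
    with.run();                    // writes a second line

    assert(log.lines == 2);
    return log.lines;
}
```

One design consequence worth noting: a reference member cannot be reseated, so the compiler does not generate a usable copy assignment operator for Worker. If assignment is needed, that is a sign the member may really be a pointer with designed ownership rather than a reference.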