Chapter 7: Classes and memory allocation

We're always interested in getting feedback. E-mail us if you like this guide, if you think that important material is omitted, if you encounter errors in the code examples or in the documentation, if you find any typos, or generally just if you feel like e-mailing. Send your email to Frank Brokken.
Please state the document version you're referring to, as found in the title (in this document: 5.2.0a) and please state the paragraph you're referring to.
All mail received is seriously considered, and new (sub)releases of the Annotations will normally reflect your suggestions for improvements. Except for the incidental case I will not otherwise acknowledge the receipt of suggestions for improvements. Please don't misinterpret this for lack of appreciation.

In contrast to the set of functions which handle memory allocation in C (i.e., malloc() etc.), the operators new and delete are specifically meant to be used with the features that C++ offers. Important differences between malloc() and new are:

The function malloc() doesn't `know' what the allocated memory will be used for. E.g., when memory for ints is allocated, the programmer must supply the correct expression using a multiplication by sizeof(int). In contrast, new requires the use of a type; the sizeof expression is implicitly handled by the compiler.
The only way to obtain initialized memory from the C allocation functions is to use calloc(), which allocates memory and sets all of its bytes to zero. In contrast, new can call the constructor of an allocated object where initial actions are defined. This constructor may be supplied with arguments.
All C-allocation functions must be inspected for NULL-returns. In contrast, the new-operator provides a facility called a new_handler (cf. section 7.2.2) which can be used instead of the explicit checks for NULL-returns.

A comparable relationship exists between free() and delete: delete makes sure that when an object is deallocated, a corresponding destructor is called.
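The difference can be made visible with a small class that counts its own constructor and destructor calls. The following is merely a sketch: the class Counted, its static counters and the two functions are hypothetical names, introduced for this illustration only.

```cpp
#include <cstdlib>

// Hypothetical class, for illustration only: it counts how often its
// constructor and destructor are called.
class Counted
{
    public:
        static int s_constructed;
        static int s_destroyed;

        Counted()
        {
            ++s_constructed;
        }
        ~Counted()
        {
            ++s_destroyed;
        }
};

int Counted::s_constructed = 0;
int Counted::s_destroyed = 0;

// Allocates raw memory the C way: no constructor or destructor is called.
void useMalloc()
{
    void *raw = std::malloc(sizeof(Counted));
    std::free(raw);
}

// Allocates an object the C++ way: constructor and destructor are called.
void useNew()
{
    Counted *cp = new Counted;
    delete cp;
}
```

After calling useMalloc() both counters are still zero. After calling useNew() each counter has been incremented once.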

The automatic calling of constructors and destructors when objects are created and destroyed has a number of consequences which we shall discuss in this chapter. Many problems encountered during C program development are caused by incorrect memory allocation or memory leaks: memory is not allocated, not freed, not initialized, boundaries are overwritten, etc. C++ does not `magically' solve these problems, but it does provide a number of handy tools.

Unfortunately, the str...() functions that allocate memory, like the very frequently used strdup(), are all malloc() based, and should therefore preferably not be used anymore in C++ programs. Instead, a new set of corresponding functions, based on the operator new, is preferred. Also, since the class string is available, there is less need for these functions in C++ than in C. In cases where operations on char * are preferred or necessary, comparable functions based on new could be developed. E.g., for the function strdup() a comparable function char *strdupnew(char const *str) could be developed as follows:

    char *strdupnew(char const *str)
    {
        return str ? strcpy(new char [strlen(str) + 1], str) : 0;
    }
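Since strdupnew() allocates its memory using new[], the caller should eventually release that memory using delete[] rather than free(). A minimal sketch of its use follows. The function duplicatesCorrectly() is a hypothetical helper, not part of the Annotations, and strdupnew() is repeated to keep the fragment self-contained.

```cpp
#include <cstring>

// Same definition as above: duplicate a C-string using new[], not malloc().
char *strdupnew(char const *str)
{
    return str ? std::strcpy(new char[std::strlen(str) + 1], str) : 0;
}

// Hypothetical helper: returns true if strdupnew() produced an independent
// copy holding the same text (or 0 when given a 0-pointer).
bool duplicatesCorrectly(char const *text)
{
    char *copy = strdupnew(text);

    if (text == 0)
        return copy == 0;       // a 0-pointer is passed on unaltered

    bool ok = copy != text && std::strcmp(copy, text) == 0;

    delete[] copy;              // new[] was used, so delete[] is required
    return ok;
}
```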

In this chapter the following topics will be covered:

the assignment operator (and operator overloading in general),
the this pointer,
the copy constructor.

7.1: The operators `new' and `delete'

The C++ language defines two operators which are specific for the allocation and deallocation of memory. These operators are new and delete.

The most basic example of the use of these operators is given below. An int pointer variable is used to point to memory which is allocated by the operator new. This memory is later released by the operator delete.

    int
        *ip;

    ip = new int;
    delete ip;

Note that new and delete are operators and therefore do not require parentheses, as required for functions like malloc() and free(). The operator delete returns void, the operator new returns a pointer to the kind of memory that's asked for by its argument (e.g., a pointer to an int in the above example). Note that the operator new uses a type as its operand, which has the benefit that the correct amount of memory, given the type of the object to be allocated, becomes automatically available. Furthermore, this is a type safe procedure as new returns a pointer to the type that was given as its operand, and this pointer must match the type of the variable receiving the pointer value.

The operator new can be used to allocate primitive types and to allocate objects. When a primitive type is allocated without an initializer, the value of the allocated memory is not defined. When empty parentheses follow the type, the memory is initialized to 0. Alternatively, an initialization expression may be provided:

    int
        *v1 = new int(),        // initialized to 0
        *v2 = new int(3),       // initialized to 3
        *v3 = new int(3 * *v2); // initialized to 9

When objects are allocated, the constructor must be mentioned, and the allocated memory will be initialized according to the constructor that is used. For example, to allocate a string object the following statement can be used:

    string *s = new string();

Here, the default constructor was used, and s will point to the newly allocated, but empty, string. If overloaded forms of the constructor are available, these can be used as well. E.g.,

    string *s = new string("hello world");

which results in s pointing to a string containing the text hello world.

Memory allocation may fail. What happens then is unveiled in section 7.2.2.

7.1.1: Allocating arrays

Operator new[] is used to allocate arrays. The generic notation new[] is an abbreviation used in the Annotations. Actually, the number of elements to be allocated is specified as an expression between the square brackets, which are prefixed by the type of the values or class of the objects that must be allocated:

    int *intarr = new int[20]; // allocates 20 ints

Note well that operator new is a different operator than operator new[]. In section 9.8 redefining operator new[] is covered.

Arrays allocated by operator new[] are called dynamic arrays. They are constructed during the execution of a program, and their lifetime may exceed the lifetime of the function in which they were created. Dynamically allocated arrays may last for as long as the program runs.

When new[] is used to allocate an array of primitive values or an array of objects, new[] must be specified with a type and an (unsigned) expression between square brackets. The type and expression together are used by the compiler to determine the required size of the block of memory to make available. With the array allocation, all elements are stored consecutively in memory. The array index notation can be used to access the individual elements: intarr[0] will be the very first int value, immediately followed by intarr[1], and so on until the last element: intarr[19].

To allocate arrays of objects, the new[]-bracket notation is used as well. For example, to allocate an array of 20 string objects the following construction is used:

    string *strarr = new string[20]; // allocates 20 strings

Note here that, since objects are allocated, constructors are automatically used. So, whereas new int[20] results in a block of 20 uninitialized int values, new string[20] results in a block of 20 initialized string objects. With arrays of objects the default constructor is used for the initialization. Unfortunately it is not possible to use a constructor having arguments when arrays of objects are allocated. However, it is possible to overload operator new[] and provide it with arguments which may be used for a non-default initialization of arrays of objects. Overloading operator new[] is discussed in section 9.8.

Similar to C, and without resorting to the operator new[], arrays of variable size can also be constructed as local arrays within functions. Note that this is a compiler extension (offered by, e.g., the Gnu compiler) rather than a feature of standard C++. Such arrays are not dynamic arrays, but local arrays, and their lifetime is restricted to the lifetime of the block in which they were defined.

Once allocated, all arrays are fixed size arrays. There is no simple way to enlarge or shrink arrays: there is no renew operator. In section 7.1.3 an example is given showing how to enlarge an array.

7.1.2: Deleting arrays

A dynamically allocated array may be deleted using operator delete[]. Operator delete[] expects a pointer to a block of memory, previously allocated using operator new[].

When an object is deleted, its destructor (see section 7.2) is called automatically, comparably to the calling of the object's constructor when the object was created. It is the task of the destructor, as discussed in depth later in this chapter, to do all kinds of cleanup operations that are required for the proper destruction of the object.

The operator delete[] (empty square brackets) expects as its argument a pointer to an array of objects. This operator will first call the destructors of the individual objects, and will then delete the allocated block of memory. So, the proper way to delete an array of Objects is:

    Object 
        *op = new Object[10];
    delete[] op;

Realize that delete[] only has an additional effect if the block of memory to be deallocated contains objects. With any other type of element normally no special action is performed: following int *it = new int[10], the statement delete[] it merely returns the memory occupied by all ten int values to the common pool. Nothing special happens.

Note especially that an array of pointers to objects is not handled as an array of objects by delete[]: the array of pointers to objects doesn't contain objects, so the objects are not properly destroyed by delete[], whereas an array of objects contains objects, which are properly destroyed by delete[]. In section 7.2 several examples of the use of delete versus delete[] will be given.

The operator delete is a different operator than operator delete[]. In section 9.8 redefining delete[] is discussed. For now, the rule of thumb is: if new[] was used, also use delete[].

7.1.3: Enlarging arrays

Once allocated, all arrays are fixed size arrays. There is no simple way to enlarge or shrink arrays: there is no renew operator. In this section an example is given showing how to enlarge an array. Enlarging arrays is only possible with dynamic arrays. Local and global arrays cannot be enlarged. When an array must be enlarged, the following procedure can be used:

Allocate a new block of memory, of larger size
Copy the old array contents to the new array
Delete the old array (see section 7.1.2)
Have the old array pointer point to the newly allocated array

The following example focuses on the enlargement of an array of string objects:

#include <string>

using namespace std;

string *enlarge(string *old, unsigned oldsize, unsigned newsize)
{
    string
        *tmp = new string[newsize];     // allocate larger array

    for (unsigned idx = 0; idx < oldsize; ++idx)
        tmp[idx] = old[idx];            // copy old to tmp

    delete[] old;                       // delete old, using [] due to objects

    return tmp;                         // return new array    
}


int main()
{
    string
        *arr = new string[4];           // initially: array of 4 strings

    arr = enlarge(arr, 4, 6);           // enlarge arr to 6 elements.

    delete[] arr;                       // done: release the enlarged array
}

7.2: The destructor

Comparable to the constructor, classes may define a destructor. This function is the opposite of the constructor in the sense that it is invoked when an object ceases to exist. For objects which are local non-static variables, the destructor is called when the block in which the object is defined is left: the destructors of objects that are defined in nested blocks of functions are therefore usually called before the function itself terminates. The destructors of objects that are defined somewhere in the outer block of a function are called just before the function returns (terminates). For static or global variables the destructor is called before the program terminates.

However, when a program is interrupted using an exit() call, the destructors are called only for global objects which exist at that time. Destructors of objects defined locally within functions are not called when a program is forcefully terminated using exit().
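The order in which the destructors of local objects are called (leaving exit() aside) can be made visible with a class that records its own destruction. This is only a sketch: the class Tracer and the function destructionOrder() are hypothetical names, and the enclosing block inside the function merely serves to keep the log alive until both objects are gone.

```cpp
#include <string>
#include <vector>

// Hypothetical helper class: records its own destruction in a log.
class Tracer
{
    std::string d_name;
    std::vector<std::string> &d_log;

    public:
        Tracer(std::string const &name, std::vector<std::string> &log)
        :
            d_name(name),
            d_log(log)
        {}
        ~Tracer()
        {
            d_log.push_back(d_name);
        }
};

// Returns the recorded events in the order in which they occurred.
std::vector<std::string> destructionOrder()
{
    std::vector<std::string> log;
    {
        Tracer outer("outer", log);
        {
            Tracer inner("inner", log);
        }                                   // inner's destructor is called here
        log.push_back("nested block left");
    }                                       // outer's destructor is called here
    return log;
}
```

The returned log holds three entries, in this order: inner, nested block left, outer. The object defined in the nested block is thus destroyed before the object defined in the surrounding block.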

The definition of a destructor must obey the following rules:

The destructor has the same name as the class but its name is prefixed by a tilde.
The destructor has no arguments and no return value.

The destructor for the class Person could thus be declared as follows:

    class Person
    {
        public:
            Person();               // constructor
            ~Person();              // destructor
    };

The position of the constructor(s) and destructor in the class definition is dictated by convention: First the constructors are declared, then the destructor, and only then any other members follow.

The main task of a destructor is to make sure that memory allocated by the object (e.g., by its constructor) is properly deleted when the object goes out of scope. Consider the following definition of the class Person:

    class Person
    {
        char *d_name;
        char *d_address;
        char *d_phone;

        public:
            Person()                // no data yet: all pointers are 0
            :
                d_name(0),
                d_address(0),
                d_phone(0)
            {}
            Person(char const *name, char const *address,
                   char const *phone);
            ~Person();

            char const *getName() const;
            char const *getAddress() const;
            char const *getPhone() const;
    };
/*
    person.ih contains:

    #include "person.h"
    char *strdupnew(char const *org);
*/

The task of the constructor is to initialize the data fields of the object. E.g., the constructor is defined as follows:

    #include "person.ih"
    
    Person::Person(char const *name, char const *address, char const *phone)
    :
        d_name(strdupnew(name)),
        d_address(strdupnew(address)),
        d_phone(strdupnew(phone))
    {}

In this class the destructor is necessary to prevent memory, once allocated for the fields d_name, d_address and d_phone, from becoming unreachable when an object ceases to exist, which would produce a memory leak. The destructor of an object is called automatically:

When an object goes out of scope;
When a dynamically allocated object is deleted;
When a dynamically allocated array of objects is deleted using the delete[] operator (see section 7.1.2).

Since it is the task of the destructor to delete all memory that was dynamically allocated and used by the object, the task of the Person's destructor would be to delete the memory pointed to by its three data members. The implementation of the destructor would therefore be:

    #include "person.ih"
    
    Person::~Person()
    {
        delete[] d_name;        // allocated by new[] in strdupnew(),
        delete[] d_address;     // so delete[] must be used
        delete[] d_phone;
    }

In the following example a Person object is created, and its data fields are printed. After this the showPerson() function stops, which leads to the deletion of memory. Note that in this example a second object of the class Person is created and destroyed dynamically by, respectively, the operators new and delete.

    #include "person.h"
    #include <iostream>

    using namespace std;

    void showPerson()
    {
        Person
            karel("Karel", "Marskramerstraat", "038 420 1971"),
            *frank = new Person("Frank", "Oostumerweg", "050 403 2223");

        cout << karel.getName()     << ", " <<
                karel.getAddress()  << ", " <<
                karel.getPhone()    << endl <<
                frank->getName()    << ", " <<
                frank->getAddress() << ", " <<
                frank->getPhone()   << endl;

        delete frank;
    }

The memory occupied by the object karel is deleted automatically when showPerson() terminates: the C++ compiler makes sure that the destructor is called. Note, however, that the object pointed to by frank is handled differently. The variable frank is a pointer, and a pointer variable is itself no Person. Therefore, before showPerson() terminates, the memory occupied by the object pointed to by frank should be explicitly deleted; hence the statement delete frank. The operator delete will make sure that the destructor is called, thereby deleting the three char * strings of the object.

7.2.1: New and delete and object pointers

The operators new and delete are used when an object of a given class is allocated. As we have seen, the advantage of the operators new and delete over functions like malloc() and free() lies in the fact that new and delete call the corresponding constructors or destructor. This is illustrated in the next example:

    Person
        *pp = new Person();     // ptr to Person object

    delete pp;                  // now destroyed

The allocation of a new Person object pointed to by pp is a two-step process. First, the memory for the object itself is allocated. Second, the constructor is called which initializes the object. In the above example the constructor is the argument-free version; it is however also possible to use a constructor having arguments:

    frank = new Person("Frank", "Oostumerweg", "050 403 2223");
    delete frank;

Note that, analogously to the construction of an object, the destruction is also a two-step process: first, the destructor of the class is called to delete the memory allocated and used by the object; then the memory which is used by the object itself is freed.

Dynamically allocated arrays of objects can also be manipulated by new and delete. In this case the size of the array is given between the [] when the array is created:

    Person
        *personarray = new Person [10];

The compiler will generate code to call the default constructor for each object which is created. As we have seen in section 7.1.2, the delete[] operator must here be used to destroy such an array in the proper way:

    delete[] personarray;

The presence of the [] ensures that the destructor is called for each object in the array.

What happens if delete rather than delete[] is used? Consider the following situation, in which the destructor ~Person() is modified so that it will tell us that it's called. In a main() function an array of two Person objects is allocated by new, to be deleted by delete []. Next, the same actions are repeated, albeit that the delete operator is called without []:

    #include <iostream>
    #include "person.h"
    
    using namespace std;

    Person::~Person()
    {
        cout << "Person destructor called" << endl;
    }
    
    int main()
    {
        Person *a  = new Person[2];
                 
        cout << "Destruction with []'s" << endl;
        delete [] a;
        
        a = new Person[2];
        
        cout << "Destruction without []'s" << endl;
        delete a;
    
        return 0;
    }
/*
    Generated output:
Destruction with []'s
Person destructor called
Person destructor called
Destruction without []'s
Person destructor called
*/

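Looking at the generated output, we see that the destructors of the individual Person objects are called if the delete[] syntax is used, whereas only one destructor call is seen when the [] is omitted. Note that, formally, using delete on memory allocated by new[] has undefined behavior: the output shown is merely what this particular run produced, and other systems may behave differently (or crash).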

If no destructor is defined, it is not called. This may seem to be a trivial statement, but it has severe implications: objects which allocate memory will result in a memory leak when no destructor is defined. Consider the following program:

    #include <iostream>
    #include "person.h"

    using namespace std;
    
    Person::~Person()
    {
        cout << "Person destructor called" << endl;
    }
    
    int main()
    {
        Person **a;
    
        a = new Person* [2];
                 
        a[0] = new Person [2];
        a[1] = new Person [2];
                             
        delete [] a;
    
        return 0;
    }

This program produces no output at all. Why is this? The variable a is defined as a pointer to a pointer: the elements of the array it points to are pointers, and pointers, being primitive values, have no destructors. Consequently, the [] has no additional effect here.

So only the array of pointers itself is deleted: `delete[] a' returns the memory pointed to by a, but leaves the Person objects pointed to by its elements untouched. That's all there is to it.

Of course, we don't want this, but require the Person objects pointed to by the elements of a to be deleted too. In this case we have two options:

Explicitly walk all the elements of the a array, deleting them in turn. Each element is a pointer to an array of Person objects, so using delete[] on it calls the destructor of each of the objects in that array, as in:

    #include <iostream>
    #include "person.h"

    using namespace std;

    Person::~Person()
    {
        cout << "Person destructor called" << endl;
    }
    
    int main()
    {
        Person **a;
    
        a = new Person* [2];
                 
        a[0] = new Person [2];
        a[1] = new Person [2];
    
        for (int index = 0; index < 2; index++)
            delete [] a[index];
                 
        delete[] a;
    }
    /*
        Generated output:
Person destructor called
Person destructor called
Person destructor called
Person destructor called
    */

Define a wrapper class containing a pointer to Person objects, and allocate a pointer to this class, rather than a pointer to a pointer to Person objects. The topic of containing classes in classes, composition, was discussed in section 6.4. Here is an example showing the deletion of pointers to memory using such a wrapper class:

    #include <iostream>

    using namespace std;
    
    class Informer
    {
        public:
            ~Informer()
            {
                cout << "destructor called\n";
            }
    };

    class Wrapper
    {
        Informer *d_i;

        public:
            Wrapper()
            :
                d_i(new Informer())
            {}
            ~Wrapper()
            {
                delete d_i;
            }
    };

    int main()
    {
        delete [] new Informer *[4];    // pointers only: no destructor called

        cout << "===========\n";

        delete [] new Wrapper[4];       // ok: 4 x destructor called
    }
    /*
        Generated output:
    ===========
    destructor called
    destructor called
    destructor called
    destructor called
    */

7.2.2: The function set_new_handler()

The C++ run-time system makes sure that when memory allocation fails, an error function is activated, if one has been installed. If none was installed, a failing new throws a std::bad_alloc exception in standard C++ (older implementations instead returned the value 0 to the caller of new, so that the pointer which is assigned by new was set to zero). The error function can be defined by the programmer, but it must comply with a few prerequisites:

it has no arguments, and
it returns no value

Please make sure you use this function: it saves you a lot of checks (and problems with a failing allocation that you just happened to forget to protect with a check...).

The redefined error function might, e.g., print a message and terminate the program. The user-written error function becomes part of the allocation system through the function set_new_handler().

The implementation of an error function is illustrated below. This implementation applies to the Gnu C/C++ requirements (actually trying out the program is not encouraged, as it will slow down the computer enormously due to the resulting occupation of Unix' swap area):

    #include <iostream>
    #include <cstdlib>
    #include <new>

    using namespace std;

    void outOfMemory()
    {
        cout << "Memory exhausted. Program terminates." << endl;
        exit(1);
    }

    int main()
    {
        long allocated = 0;
            
        set_new_handler(outOfMemory);       // install error function
        
        while (true)                        // eat up all memory
        {
            new int [100000];
            allocated += 100000 * sizeof(int);
            cout << "Allocated " << allocated << " bytes\n";
        }
    }

The advantage of an allocation error function lies in the fact that once installed, new can be used without wondering whether the allocation succeeded or not: upon failure the error function is automatically invoked and the program exits. It is good practice to install a new handler in each C++ program, even when the actual code of the program does not allocate memory. Memory allocation can also fail in not directly visible code, e.g., when streams are used or when strings are duplicated by low-level functions.
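The example above can only be observed by exhausting memory. A less drastic sketch is shown below, under two assumptions: the system refuses a single request for half of the addressable memory, and the handler gives up by throwing std::bad_alloc instead of calling exit(), so that the caller can observe the failure. The names g_handlerCalled, noteFailure() and allocationFails() are hypothetical.

```cpp
#include <cstddef>
#include <limits>
#include <new>

bool g_handlerCalled = false;       // hypothetical flag, set by the handler

// Hypothetical new handler: notes that it was called, and gives up by
// throwing std::bad_alloc. A handler should not simply return unless it
// has made more memory available, as the allocation is then retried.
void noteFailure()
{
    g_handlerCalled = true;
    throw std::bad_alloc();
}

// Requests `bytes' bytes and reports whether the request failed.
bool allocationFails(std::size_t bytes)
{
    std::new_handler previous = std::set_new_handler(noteFailure);
    bool failed = false;

    try
    {
        // volatile: keeps the compiler from optimizing the allocation away
        void *volatile block = ::operator new(bytes);
        ::operator delete(block);
    }
    catch (std::bad_alloc const &)
    {
        failed = true;
    }

    std::set_new_handler(previous);         // restore the earlier handler
    return failed;
}
```

A small request, like allocationFails(16), is expected to return false without the handler being called. A request for std::numeric_limits<std::size_t>::max() / 2 bytes is expected to activate the handler on common systems.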

Note that it may not be assumed that the standard C functions which allocate memory, such as strdup(), malloc(), realloc() etc. will trigger the new handler when memory allocation fails. This means that once a new handler is installed, such functions should not automatically be used in an unprotected way in a C++ program. As an example of the use of new to duplicate a string, a rewrite of the function strdup() using the operator new is given in the introduction to this chapter (section 7). It is strongly advised to switch to this approach, rather than to keep using functions like strdup(), when the allocation of memory is required.

7.3: The assignment operator

Variables which are structs or classes can be directly assigned in C++ in the same way that structs can be assigned in C. The default action of such an assignment for non-class type data members is a straight byte-by-byte copy from one data member to another. Now consider the consequences of this default action in a function such as the following:

    void printperson(Person const &p)
    {
        Person tmp;

        tmp = p;
        cout << "Name:     " << tmp.getName()       << endl <<
                "Address:  " << tmp.getAddress()    << endl <<
                "Phone:    " << tmp.getPhone()      << endl;
    }

We shall follow the execution of this function step by step.

The function printperson() expects a reference to a Person as its parameter p. So far, nothing extraordinary is happening.
The function defines a local object tmp. This means that the default constructor of Person is called, which -if defined properly- resets the pointer fields name, address and phone of the tmp object to zero.
Next, the object referenced by p is copied to tmp. By default this means that sizeof(Person) bytes from p are copied to tmp.
Now a potentially dangerous situation has arisen. Note that the actual values in p are pointers, pointing to allocated memory. Following the assignment this memory is addressed by two objects: p and tmp.
The potentially dangerous situation develops into an acutely dangerous situation when the function printperson() terminates: the object tmp is destroyed. The destructor of the class Person releases the memory pointed to by the fields name, address and phone: unfortunately, this memory is also in use by p. The incorrect assignment is illustrated in figure 5.

figure 5: Private data and public interface functions of the class Person, using byte-by-byte assignment

Having executed printperson(), the object which was referenced by p now contains pointers to deleted memory.

This situation is undoubtedly not a desired effect of a function like the above. The deleted memory will likely become occupied during subsequent allocations: the pointer members of p have effectively become wild pointers, as they don't point to allocated memory anymore. In general it can be concluded that

every class containing pointer data members is a potential candidate for trouble. Fortunately, it is possible to prevent these troubles, as discussed in the next section.
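That the default assignment merely copies the pointer values can be verified directly. The following sketch uses a hypothetical struct Shallow as a stand-in for Person. It deliberately has no destructor, so that nothing is deleted twice in this illustration.

```cpp
#include <cstring>

// Hypothetical stand-in for Person: one pointer member and, deliberately,
// no destructor.
struct Shallow
{
    char *d_name;
};

// Returns true if, after the default assignment, both objects point to the
// same block of memory.
bool sharesMemoryAfterAssignment()
{
    Shallow original;
    original.d_name = std::strcpy(new char[6], "Frank");

    Shallow copy;
    copy.d_name = 0;
    copy = original;                    // default: the pointer value is copied

    bool shared = copy.d_name == original.d_name;

    delete[] original.d_name;           // release the single block exactly once
    return shared;
}
```

The function returns true: only one block of memory exists, addressed by two objects. If Shallow had a destructor deleting d_name, that block would be deleted twice.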

7.3.1: Overloading the assignment operator

Obviously, the right way to assign one Person object to another, is not to copy the contents of the object bytewise. A better way is to make an equivalent object; one with its own allocated memory, but which contains the same strings.

The `right' way to duplicate a Person object is illustrated in figure 6.

figure 6: Private data and public interface functions of the class Person, using the `correct' assignment.

There are several ways to duplicate a Person object. One way would be to define a special function to handle assignments of objects of the class Person. The purpose of this function would be to create a copy of an object, but one with its own name, address and phone strings. Such a member function might be:

    void Person::assign(Person const &other)
    {
        // delete our own previously used memory
        delete[] d_name;
        delete[] d_address;
        delete[] d_phone;

        // now copy the other Person's data
        d_name = strdupnew(other.d_name);
        d_address = strdupnew(other.d_address);
        d_phone = strdupnew(other.d_phone);
    }

Using this tool we could rewrite the offending function printperson():

    void printperson(Person const &p)
    {
        Person tmp;

        // make tmp a copy of p, but with its own allocated memory
        tmp.assign(p);
        
        cout << "Name:     " << tmp.getName()       << endl <<
                "Address:  " << tmp.getAddress()    << endl <<
                "Phone:    " << tmp.getPhone()      << endl;

        // now it doesn't matter that tmp gets destroyed..
    }

In itself this solution is valid, although it is a purely symptomatic solution. This solution requires the programmer to use a specific member function instead of the operator =. The basic problem, however, remains if this rule is not strictly adhered to. Experience teaches that errare humanum est: a solution which doesn't enforce special actions is therefore preferable.

The problem of the assignment operator is solved using operator overloading: the syntactic possibility C++ offers to redefine the actions of an operator in a given context. Operator overloading was mentioned earlier, when the operators << and >> were redefined for use with streams such as cin, cout and cerr (see section 3.1.2).

Overloading the assignment operator is probably the most common form of operator overloading. However, a word of warning is appropriate: the fact that C++ allows operator overloading does not mean that this feature should be used at all times. A few rules are:

Operator overloading should be used in situations where an operator has a defined action, but when this action is not desired as it has negative side effects. A typical example is the above assignment operator in the context of the class Person.
Operator overloading can be used in situations where the use of the operator is common and when no ambiguity in the meaning of the operator is introduced by redefining it. An example may be the redefinition of the operator + for a class which represents a complex number. The meaning of a + between two complex numbers is quite clear and unambiguous.
In all other cases it is preferable to define a member function, instead of redefining an operator.

Using these rules, operator overloading is minimized, which helps keep source files readable. An operator simply does what it is designed to do. Therefore, in our view, the insertion (<<) and extraction (>>) operators in the context of streams are unfortunate: the stream operations do not have anything in common with the bitwise shift operations.

7.3.1.1: The function 'operator=()'

To achieve operator overloading in the context of a class, the class is simply expanded with a public function stating the particular operator. A corresponding function, the implementation of the overloaded operator, is thereupon defined.

For example, to overload the addition operator +, a function operator+() must be defined. The function name consists of two parts: the keyword operator, followed by the operator itself.
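As an illustration of this naming scheme, here is a sketch of an operator+() for a class representing complex numbers. The class Complex and its members are hypothetical and were defined for this example only. The standard library offers its own, more complete, complex class.

```cpp
// Hypothetical class, for illustration only.
class Complex
{
    double d_re;
    double d_im;

    public:
        Complex(double re, double im)
        :
            d_re(re),
            d_im(im)
        {}

        // The function's name: the keyword operator followed by +
        Complex operator+(Complex const &rhs) const
        {
            return Complex(d_re + rhs.d_re, d_im + rhs.d_im);
        }

        double real() const
        {
            return d_re;
        }
        double imag() const
        {
            return d_im;
        }
};
```

Given this class, the expression Complex(1, 2) + Complex(3, 4) produces a Complex object whose real part is 4 and whose imaginary part is 6.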

In our case we define a new function operator=() to redefine the actions of the assignment operator. A possible extension to the class Person could therefore be:

    class Person
    {
        public:                             // extension of the class Person
                                            // earlier members are assumed.
            void operator=(Person const &other);
    };

and the implementation could be

    void Person::operator=(Person const &other)
    {
        delete[] d_name;                    // delete old data
        delete[] d_address;
        delete[] d_phone;
        
        d_name = strdupnew(other.d_name);   // duplicate other's data
        d_address = strdupnew(other.d_address);
        d_phone = strdupnew(other.d_phone);
    }

The function operator=() presented here is the first version of the overloaded assignment operator. We shall present improved versions shortly.

The actions of this member function are similar to those of the previously proposed function assign(), but now its name makes sure that this function is also activated when the assignment operator = is used. There are actually two ways to call overloaded operators:

    Person
        pers("Frank", "Oostumerweg", "403 2223"),
        copy;

    copy = pers;                // first possibility
    copy.operator=(pers);       // second possibility

It is obvious that the second possibility, in which operator=() is explicitly stated, is not used often. However, the code fragment does illustrate the two ways of calling the same function.

7.4: The this pointer

As we have seen, a member function of a given class is always called in the context of some object of the class. There is always an implicit `substrate' for the function to act on. C++ defines a keyword, this, to address this substrate. (Note that `this' is not available in the not yet discussed static member functions.)

The this keyword is a pointer variable, which always contains the address of the object in question. The this pointer is implicitly declared in each member function (whether public, protected or private). Therefore, it is as if each member function of the class Person contains the following declaration:

    extern Person *this;

A member function like getName(), which returns the name field of a Person, could therefore be implemented in two ways: with or without the this pointer:

    char const *Person::getName()   // implicit usage of `this'
    {
        return d_name;
    }
                            
    char const *Person::getName()   // explicit usage of `this'
    {
        return this->d_name;
    }

The this pointer is not often used explicitly. However, several situations exist where the this pointer is really needed.

7.4.1: Preventing self-destruction with this

As we have seen, the operator = can be redefined for the class Person in such a way that two objects of the class can be assigned, resulting in two copies of the same object.

As long as the two variables are different ones, the previously presented version of the function operator=() will behave properly: the memory of the assigned object is released, after which it is allocated again to hold new strings. However, when an object is assigned to itself (which is called auto-assignment), a problem occurs: the allocated strings of the receiving object are first deleted, resulting in the deletion of the memory of the right-hand side variable, which we call self-destruction. An example of this situation is illustrated here:

    void fubar(Person &p)
    {
        p = p;          // auto-assignment!
    }

In this example it is perfectly clear that something unnecessary, possibly even wrong, is happening. But auto-assignment can also occur in more hidden forms:

    Person
        one,
        two,
        *pp = &one;

    *pp = two;
    one = *pp;

The problem of auto-assignment can be solved using the this pointer. In the overloaded assignment operator function we simply test whether the address of the right-hand side object is the same as the address of the current object: if so, no action needs to be taken. The definition of the function operator=() thus becomes:

    void Person::operator=(Person const &other)
    {
        // only take action if address of the current object
        // (this) is NOT equal to the address of the other object

        if (this != &other)
        {
            delete[] d_name;
            delete[] d_address;
            delete[] d_phone;

            d_name = strdupnew(other.d_name);
            d_address = strdupnew(other.d_address);
            d_phone = strdupnew(other.d_phone);
        }
    }

This is the second version of the overloaded assignment function. One still better version remains to be discussed.

As a subtlety, note the usage of the address operator '&' in the statement

    if (this != &other)

The variable this is a pointer to the `current' object, while other is a reference, which is an `alias' for an actual Person object. The address of the other object is therefore &other, while the address of the current object is this.
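The effect of the test can be tried out with a small stand-alone sketch. The class Text, the helper function duplicate() (standing in for strdupnew(), and assumed to allocate with new[]) and the function survivesSelfAssignment() are hypothetical names introduced for this example only.

```cpp
#include <cstring>

// hypothetical helper, standing in for strdupnew(): copies str into
// memory allocated by new[]
char *duplicate(char const *str)
{
    return std::strcpy(new char[std::strlen(str) + 1], str);
}

class Text
{
    public:
        Text(char const *str)
        :
            d_str(duplicate(str))
        {}

        Text(Text const &other)
        :
            d_str(duplicate(other.d_str))
        {}

        ~Text()
        {
            delete[] d_str;
        }

        void operator=(Text const &other)
        {
            if (this != &other)     // skip when both sides are one object
            {
                delete[] d_str;
                d_str = duplicate(other.d_str);
            }
        }

        char const *str() const
        {
            return d_str;
        }

    private:
        char *d_str;
};

// hidden auto-assignment: `alias' points at the object being assigned to
bool survivesSelfAssignment()
{
    Text one("hello");
    Text *alias = &one;

    one = *alias;                   // this == &other inside operator=()

    return std::strcmp(one.str(), "hello") == 0;
}
```

Without the test, operator=() would delete d_str and would then try to duplicate the memory that was just deleted.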

7.4.2: Associativity of operators and this

According to C++'s syntax, the associativity of the assignment operator is to the right-hand side. I.e., in statements like:

    a = b = c;

the expression b = c is evaluated first, and the result is assigned to a.

The implementation of the overloaded assignment operator so far does not permit such constructions, as an assignment using the member function returns nothing (void). We can therefore conclude that the previous implementation does solve an allocation problem, but still prevents concatenated assignments.

The problem can be illustrated as follows. When we rewrite the expression a = b = c to the form which explicitly mentions the overloaded assignment member functions, we get:

    a.operator=(b.operator=(c));

This variant does not compile: the sub-expression b.operator=(c) yields void, and the class Person contains no member function with the prototype operator=(void).

This problem too can be remedied using the this pointer. The overloaded assignment function expects as its argument a reference to a Person object. It can also return a reference to such an object. This reference can then be used as an argument for a concatenated assignment.

It is customary to let the overloaded assignment return a reference to the current object (i.e., *this). The (final) version of the overloaded assignment operator for the class Person thus becomes:

    Person &Person::operator=(Person const &other)
    {
        if (this != &other)
        {
            delete[] d_address;
            delete[] d_name;
            delete[] d_phone;

            d_address = strdupnew(other.d_address);
            d_name = strdupnew(other.d_name);
            d_phone = strdupnew(other.d_phone);
        }
        // return current object. The compiler will make sure
        // that a reference is returned
        return *this;
    }
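The effect of returning *this can be shown with a deliberately small sketch. The class Value and the function chained() are hypothetical; a plain int replaces the allocated strings of Person so that only the return value of the assignment operator matters.

```cpp
// A minimal sketch: Value and chained() are hypothetical names.
class Value
{
    public:
        Value(int value)
        :
            d_value(value)
        {}

        Value &operator=(Value const &other)
        {
            if (this != &other)
                d_value = other.d_value;

            return *this;           // makes a = b = c possible
        }

        int get() const
        {
            return d_value;
        }

    private:
        int d_value;
};

int chained()
{
    Value a(1);
    Value b(2);
    Value c(3);

    a = b = c;                      // a.operator=(b.operator=(c))

    return a.get() + b.get();       // both now hold c's value
}
```

Here b = c is performed first; the reference to b that it returns is the argument of the assignment to a.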

7.5: The copy constructor: Initialization vs. Assignment

In the following sections we shall take a closer look at another usage of the operator =. Consider, once again, the class Person. The class has the following characteristics:

The class contains several pointers, possibly pointing to allocated memory. As discussed, such a class needs a constructor and a destructor.
A typical action of the constructor would be to set the pointer members to 0. A typical action of the destructor would be to delete the allocated memory.
For the same reason the class requires an overloaded assignment operator.
The class has, besides a default constructor, a constructor which expects the name, address and phone number of the Person object.
For now, the only remaining interface functions return the name, address or phone number of the Person object.

Now consider the following code fragment. The statement references are discussed following the example:

    Person
        karel("Karel", "Marskramerstraat", "038 420 1971"), // see (1)
        karel2,                                             // see (2)
        karel3 = karel;                                     // see (3)

    int main()
    {
        karel2 = karel3;                                    // see (4)
        return 0;
    }

Statement 1: this statement shows an initialization. The object karel is initialized with appropriate texts. This construction of the object karel therefore uses the constructor which expects three char const * arguments.
Assume a Person constructor is available having only one char const * parameter, e.g., Person::Person(char const *n). It should be noted that the initialization `Person frank("Frank")' is identical to
```
    Person frank = "Frank";
```
Even though this piece of code uses the operator =, it is no assignment: rather, it is an initialization, and hence, it's done at construction time by a constructor of the class Person.
Statement 2: here a second Person object is created. Again a constructor is called. As no special arguments are present, the default constructor is used.
Statement 3: again a new object karel3 is created. A constructor is therefore called once more. The new object is also initialized. This time with a copy of the data of object karel.
This form of initialization has not yet been discussed. As we can rewrite this statement in the form
```
    Person karel3(karel);
```
it is suggested that a constructor is called, having a reference to a Person object as its argument. Such constructors are quite common in C++ and are called copy constructors. More properties of these constructors are discussed below.
Statement 4: here one object is assigned to another. No object is created in this statement. Hence, this is just an assignment, using the overloaded assignment operator.
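Which member is activated by each of the four statements can be made visible with a small sketch. The class Tracer, its member d_origin and the function trace() are assumptions made for this illustration; each member merely records its own name.

```cpp
#include <string>

// A minimal sketch: Tracer and trace() are hypothetical names.
class Tracer
{
    public:
        Tracer()
        :
            d_origin("default constructor")
        {}

        Tracer(char const *)
        :
            d_origin("char const * constructor")
        {}

        Tracer(Tracer const &)
        :
            d_origin("copy constructor")
        {}

        Tracer &operator=(Tracer const &)
        {
            d_origin = "assignment operator";
            return *this;
        }

        std::string const &origin() const
        {
            return d_origin;
        }

    private:
        std::string d_origin;   // name of the member that last set this object
};

std::string trace()
{
    Tracer first("text");       // (1) constructor expecting an argument
    Tracer second;              // (2) default constructor
    Tracer third = first;       // (3) copy constructor, despite the `='

    std::string result = first.origin();
    result += ", " + second.origin();
    result += ", " + third.origin();

    second = third;             // (4) assignment: no object is created

    return result + ", " + second.origin();
}
```

The string returned by trace() names, in order, the constructor expecting an argument, the default constructor, the copy constructor and the assignment operator.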

The simple rule emanating from these examples is that whenever an object is created, a constructor is needed. All constructors have the following characteristics:

Constructors have no return values.
Constructors are functions having the same name as the class to which they belong.
The argument list of constructors can be deduced from the code. The argument is either present between parentheses or (if there is only one argument) following a =.

Therefore, we conclude that, given the above statement (3), the class Person must be augmented with a copy constructor:

    class Person
    {
        public:
            Person(Person const &other);
    };

The implementation of the Person copy constructor is:

    Person::Person(Person const &other)
    {
        d_name    = strdupnew(other.d_name);
        d_address = strdupnew(other.d_address);
        d_phone   = strdupnew(other.d_phone);
    }

The actions of copy constructors are comparable to those of the overloaded assignment operators: an object is duplicated, so that it contains its own allocated data. The copy constructor, however, is simpler in the following respects:

A copy constructor doesn't need to delete previously allocated memory: since the object in question has just been created, it cannot already have its own allocated data.
A copy constructor never needs to check whether auto-duplication occurs. No variable can be initialized with itself.

Besides the above mentioned quite obvious usage of the copy constructor, the copy constructor has other important tasks. All of these tasks are related to the fact that the copy constructor is always called when an object is created and initialized with another object of its class. The copy constructor is called even when this new object is a hidden or temporary variable.

When a function takes an object as argument, instead of, e.g., a pointer or a reference, the copy constructor is called to pass a copy of an object as the argument. This argument, which usually is passed via the stack, is therefore a new object. It is created and initialized with the data of the passed argument. This is illustrated in the following code fragment:
```
    void nameOf(Person p)       // no pointer, no reference
    {                           // but the Person itself
        cout << p.getName() << endl;
    }

    int main()
    {
        Person frank("Frank");

        nameOf(frank);
        return 0;
    }
```
In this code fragment frank itself is not passed as an argument, but instead a temporary (stack) variable is created using the copy constructor. This temporary variable is known inside nameOf() as p. Note that if nameOf() had been given a reference parameter, the extra stack usage and the call to the copy constructor would have been avoided.
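The difference between the two ways of passing an argument can be counted. In the following sketch the global counter g_copies, the class Counted and the four functions are hypothetical names; a global variable is used only to keep the example short.

```cpp
int g_copies = 0;               // hypothetical: counts copy constructor calls

class Counted
{
    public:
        Counted()
        {}

        Counted(Counted const &)
        {
            ++g_copies;
        }
};

void byValue(Counted)           // the parameter is a new object
{}

void byReference(Counted const &)   // the parameter is an alias
{}

int copiesByValue()
{
    Counted object;

    g_copies = 0;
    byValue(object);

    return g_copies;
}

int copiesByReference()
{
    Counted object;

    g_copies = 0;
    byReference(object);

    return g_copies;
}
```

Calling copiesByValue() reports one call of the copy constructor, whereas copiesByReference() reports none.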

The copy constructor is also implicitly called when a function returns an object:

    Person getPerson()
    {
        string
            name,
            address,
            phone;

        cin >> name >> address >> phone;

        Person p(name.c_str(), address.c_str(), phone.c_str());

        return p;           // returns a copy of `p'.
    }

Here a hidden object of the class Person is initialized, using the copy constructor, as the value returned by the function. The local variable p itself ceases to exist when getPerson() terminates.

To demonstrate that copy constructors are not called in all situations, consider the following. We could rewrite the above function getPerson() to the following form:

    Person getPerson()
    {
        string
            name,
            address,
            phone;

        cin >> name >> address >> phone;

        return Person(name.c_str(), address.c_str(), phone.c_str());
    }

This code fragment is perfectly valid, and illustrates the use of an anonymous object. Anonymous objects should be treated as const objects: their data members should not be changed. The use of an anonymous object in the above example illustrates the fact that object return values should be considered constant objects, even though the keyword const is not explicitly mentioned in the return type of the function (as in Person const getPerson()).

As another example, once again assuming the availability of a Person(char const *name) constructor, consider:

    Person getNamedPerson()
    {
        string name;

        cin >> name;

        return name.c_str();
    }

Here, even though the return value name.c_str() doesn't match the return type Person, there is a constructor available to construct a Person from a char const *. Since such a constructor is available, the (anonymous) return value can be constructed by promoting a char const * type to a Person type using an appropriate constructor.

Contrary to the situation we encountered with the default constructor, the default copy constructor remains available once a constructor (any constructor) is defined explicitly. The copy constructor can be redefined, but it will not disappear once another constructor is defined.
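This difference can be seen in a small sketch. The class Point and the function copiedSum() are hypothetical; Point defines exactly one constructor and no copy constructor.

```cpp
// A minimal sketch: Point and copiedSum() are hypothetical names.
class Point
{
    public:
        Point(int x, int y)     // the only explicitly defined constructor
        :
            d_x(x),
            d_y(y)
        {}

        int x() const
        {
            return d_x;
        }

        int y() const
        {
            return d_y;
        }

    private:
        int d_x;
        int d_y;
};

int copiedSum()
{
    // Point origin;            // would not compile: the default
                                // constructor is no longer available
    Point p(3, 4);
    Point q(p);                 // compiles: the default copy constructor
                                // is still available
    return q.x() + q.y();
}
```

As Point has no pointer data members, the member-by-member copy made by the default copy constructor is all that is needed here.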

7.5.1: Similarities between the copy constructor and operator=()

In this section the similarities between the copy constructor and the overloaded assignment operator are investigated once more. We present two primitive functions which often occur in our code, and which we think are quite useful. Note the following features of copy constructors, overloaded assignment operators, and destructors:

The copying of (private) data occurs (1) in the copy constructor and (2) in the overloaded assignment function.
The deletion of allocated memory occurs (1) in the overloaded assignment function and (2) in the destructor.

The above two actions (duplication and deletion) can be coded in two private functions, say copy() and destroy(), which are used in the overloaded assignment operator, the copy constructor, and the destructor. When we apply this method to the class Person, we can implement this approach as follows:

First, the class definition is expanded with two private functions copy() and destroy(). The purpose of these functions is to copy the data of another object or to delete the memory of the current object unconditionally. Hence these functions implement `primitive' functionality:

    // class definition, only relevant functions are shown here
    class Person
    {
        public:
            Person(Person const &other);
            ~Person();
            Person &operator=(Person const &other);
        private:
            void copy(Person const &other);     // new members
            void destroy(void);

            char 
                *d_name, 
                *d_address, 
                *d_phone;
    };

Next, the functions copy() and destroy() are constructed:

    void Person::copy(Person const &other)
    {
        d_name = strdupnew(other.d_name);       // unconditional copying
        d_address = strdupnew(other.d_address);
        d_phone = strdupnew(other.d_phone);
    }

    void Person::destroy()
    {
        delete[] d_name;                      // unconditional deletion
        delete[] d_address;
        delete[] d_phone;
    }

Finally, the public functions in which another object's memory is copied or in which memory is deleted are rewritten:

    Person::Person (Person const &other)    // copy constructor
    {
        copy(other);
    }

    Person::~Person()                       // destructor
    {
        destroy();
    }
                                            // overloaded assignment
    Person &Person::operator=(Person const &other)
    {                                       
        if (this != &other)
        {
            destroy();
            copy(other);
        }
        return *this;
    }

What we like about this approach is that the destructor, copy constructor and overloaded assignment functions are now completely standard: they are independent of a particular class, and their implementations can therefore be used in every class. Any class dependencies are reduced to the implementations of the private member functions copy() and destroy().
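Put together, the pieces may be assembled into one compilable sketch. The definition of strdupnew() given here is an assumption (allocation using new[], with a 0-pointer mapped to a 0-pointer), as are the three-argument constructor body, the const qualifier of getName() and the function independentCopies(), which was added only to exercise the class.

```cpp
#include <cstring>

// assumed definition, standing in for the strdupnew() used in the text
char *strdupnew(char const *str)
{
    return str ? std::strcpy(new char[std::strlen(str) + 1], str) : 0;
}

class Person
{
    public:
        Person(char const *name, char const *address, char const *phone);
        Person(Person const &other);
        ~Person();
        Person &operator=(Person const &other);

        char const *getName() const;

    private:
        void copy(Person const &other);
        void destroy();

        char *d_name;
        char *d_address;
        char *d_phone;
};

Person::Person(char const *name, char const *address, char const *phone)
:
    d_name(strdupnew(name)),
    d_address(strdupnew(address)),
    d_phone(strdupnew(phone))
{}

Person::Person(Person const &other)
{
    copy(other);
}

Person::~Person()
{
    destroy();
}

Person &Person::operator=(Person const &other)
{
    if (this != &other)
    {
        destroy();
        copy(other);
    }
    return *this;
}

char const *Person::getName() const
{
    return d_name;
}

void Person::copy(Person const &other)
{
    d_name = strdupnew(other.d_name);
    d_address = strdupnew(other.d_address);
    d_phone = strdupnew(other.d_phone);
}

void Person::destroy()
{
    delete[] d_name;            // delete[] matches new[] in strdupnew()
    delete[] d_address;
    delete[] d_phone;
}

bool independentCopies()
{
    Person frank("Frank", "Oostumerweg", "403 2223");
    Person copy1(frank);                    // copy constructor
    Person copy2("x", "y", "z");

    copy2 = frank;                          // overloaded assignment

    return std::strcmp(copy1.getName(), "Frank") == 0 &&
           std::strcmp(copy2.getName(), "Frank") == 0 &&
           copy1.getName() != frank.getName() &&    // separate memory
           copy2.getName() != frank.getName();
}
```

In this sketch each object owns its own strings, so all three destructors can safely delete their memory when independentCopies() ends.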

Note that the copy() member function is responsible for copying the other object's data fields to the current object. We've shown the situation in which a class only has pointer data members. In most situations classes have non-pointer data members as well. These members must be copied too, which can simply be realized in copy(). The exception is formed by reference data members, which must be initialized in the copy constructor using the member initializer method, introduced in section 6.4.2. In that case, however, the overloaded assignment operator can't be fully implemented: once initialized, a reference member cannot be given another value, so an existing object having reference data members is inseparably attached to its referenced object(s).

7.5.2: Preventing the use of certain member functions

As we've seen in the previous section, situations may be encountered in which a member function can't do its job in a completely satisfactory way. In particular: an overloaded assignment operator cannot do its job completely if its class contains reference data members. In this and comparable situations the programmer might want to prevent the (accidental) use of certain member functions. This can be realized in the following ways:

Move all member functions that should not be callable to the private section of the class interface. This effectively prevents users of the class from using these members. By moving the assignment operator to the private section, objects of the class can no longer be assigned to each other: the compiler detects the use of a private member outside of its class and flags a compilation error.
The above solution still allows the members of the class itself to use the unwanted member functions. If that is deemed undesirable as well, such functions should still be moved to the private section of the class interface, but they should not be implemented. The compiler won't be able to prevent the (accidental) use of these forbidden members within the class, but the linker won't be able to resolve the associated external reference.
It is not always a good idea to simply omit member functions that should not be called from the class interface. In particular, the overloaded assignment operator has a default implementation that will be used if no overloaded version is mentioned in the class interface. So, with the overloaded assignment operator in particular, the previously mentioned approach should be followed. Moving certain constructors to the private section of the class interface is also a good technique to prevent their use by `the general public'.
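The approach of declaring a member in the private section without implementing it might be sketched as follows. The class Session, its member d_id and the function useSession() are hypothetical names, chosen for this example only.

```cpp
// A minimal sketch: Session and useSession() are hypothetical names.
class Session
{
    public:
        Session(int id)
        :
            d_id(id)
        {}

        int id() const
        {
            return d_id;
        }

    private:
                                // declared only: no implementation exists,
                                // so the default version is not generated
        Session &operator=(Session const &other);

        int d_id;
};

int useSession()
{
    Session first(1);
    Session second(2);

    // second = first;          // compilation error: operator=() is private

    return first.id() + second.id();
}
```

If a member function of Session itself were to assign one Session to another, the compiler would accept it, but the linker would report an unresolved reference to Session::operator=().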

7.6: Conclusion

Two important extensions to classes have been discussed in this chapter: the overloaded assignment operator and the copy constructor. As we have seen, classes with pointer data which address allocated memory are potential sources of memory leaks. The two introduced extensions represent the standard way to prevent these memory leaks.

The conclusion is therefore: as soon as a class is defined in which pointer data members are used, a destructor, an overloaded assignment operator and a copy constructor should be implemented.