Chapter 2: Introduction

We're always interested in getting feedback. E-mail us if you like this guide, if you think that important material is omitted, if you encounter errors in the code examples or in the documentation, if you find any typos, or generally just if you feel like e-mailing. Send your email to Frank Brokken.
Please state the document version you're referring to, as found in the title (in this document: 5.2.0a) and please state the paragraph you're referring to.
All mail received is seriously considered, and new (sub)releases of the Annotations will normally reflect your suggestions for improvements. Except for the incidental case I will not otherwise acknowledge the receipt of suggestions for improvements. Please don't misinterpret this for lack of appreciation.

This document presents an introduction to programming in C++. It is a guide for the C/C++ programming courses that Frank gives yearly at the University of Groningen. As such, this document is not a complete C/C++ handbook, but rather serves as an addition to other documentation sources (e.g., the Dutch book De programmeertaal C, Brokken and Kubat, University of Groningen, 1996).

The reader should realize that extensive knowledge of the C programming language is assumed and required. This document continues where topics of the C programming language end, such as pointers, memory allocation and compound types.

The version number of this document (currently 5.2.0a) is updated when the contents of the document change. The first number is the major number, and will probably not be changed for some time: it indicates a major rewriting. The middle number is increased when new information is added to the document. The last number only indicates small changes; it is increased when, e.g., a series of typos is corrected.

This document is published by the Computing Center, University of Groningen, the Netherlands. This document was typeset using the yodl formatting system.

All rights reserved. No part of this document may be published or changed without prior consent of the author. Direct all correspondence concerning suggestions, additions, improvements or changes to this document to the author:

Frank B. Brokken
Computing Center, University of Groningen
Nettelbosje 1,
P.O. Box 11044,
9700 CA Groningen
The Netherlands
(email: f.b.brokken@rc.rug.nl)

In this chapter a first impression of C++ is presented. A few extensions to C are reviewed and a tip of the mysterious veil surrounding object oriented programming (OOP) is lifted.

2.1: What's new in the C++ Annotations

This section is modified when the first or second part of the version number changes.

Version 5.2.0 was released after adding a section about the mutable keyword (section 6.6), and after thoroughly changing the discussion of the Fork() abstract base class (section 19.4). All examples should now be up-to-date with respect to the use of the std namespace.
Version 5.1.1 was released after modifying the sections related to the fork() system call in chapter 19. Under the ANSI/ISO standard many of the previously available extensions (like procbuf, and vform()) applied to streams were discontinued. Starting with version 5.1.1, ways of constructing these facilities under the ANSI/ISO standard are discussed in the C++ Annotations. I consider the involved subject sufficiently complex to warrant the upgrade to a new subversion.
With the advent of the Gnu g++ compiler version 3.00, a more strict implementation of the ANSI/ISO C++ standard became available. This resulted in version 5.1.0 of the Annotations, appearing shortly after version 5.0.0. In version 5.1.0 chapter 5 was modified and several cosmetic changes took place (e.g., removing class from template type parameter lists, see chapter 18). Intermediate versions (like 5.0.0a, 5.0.0b) were not further documented, but were mere intermediate releases while approaching version 5.1.0. Code examples will gradually be adapted to the new release of the compiler.
In the meantime the reader should be prepared to insert

using namespace std;
in many code examples, just beyond the #include preprocessor directives as a temporary measure to make the example accepted by the compiler.
New insights develop all the time, resulting in version 5.0.0 of the Annotations. In this version a lot of old code was cleaned up and typos were repaired. According to the current standard, namespaces are required in C++ programs, so they are now introduced very early (in section 2.5.1) in the Annotations. A new section about using external programs was added to the Annotations (and removed again in version 5.1.0), and the new stringstream class, replacing the strstream class, is now covered too (sections 5.4.3 and 5.5.3). Actually, the chapter on input and output was completely rewritten. Furthermore, the operators new and delete are now discussed in chapter 7, where they fit better than in a chapter on classes, where they previously were discussed. Chapters were moved, split and reordered, so that subjects could generally be introduced without forward references. Finally, the html, PostScript and pdf versions of the C++ Annotations now contain an index (sigh of relief?). All in all, considering the volume and nature of the modifications, it seemed right to upgrade to a full major version. So here it is.
Considering the volume of the annotations, I'm sure there will be typos found every now and then. Please do not hesitate to send me mail containing any mistakes you find or corrections you would like to suggest.
In release 4.4.1b the pagesize in the LaTeX file was defined to be din A4. In countries where other pagesizes are standard, the default pagesize might be a better choice. In that case, remove the dina4 option from cplusplus.tex (or cplusplus.yo if you have yodl installed), and reconstruct the annotations from the TeX-file or Yodl-files.
The Annotations mailing list was stopped at release 4.4.1d. From this point on only minor modifications were expected, which are no longer generally announced.
At some point, I considered version 4.4.1 to be the final version of the C++ annotations. However, a section on special I/O functions was added to cover unformatted I/O, and the section about the string datatype had its layout improved and was, due to its volume, given a chapter of its own (chapter 4). All this eventually resulted in version 4.4.2.
Version 4.4.1 again contains new material, and reflects the ANSI/ISO standard (well, I try to have it reflect the ANSI/ISO standard). In version 4.4.1 several new sections and chapters were added, among which a chapter about the Standard Template Library (STL) and generic algorithms.
Version 4.4.0 (and subletters) was a mere construction version and was never made available.
Version 4.3.1a is a precursor of 4.3.2. In 4.3.1a most of the typos I've received since the last update have been processed. In version 4.3.2 extra attention was paid to the syntax for function addresses and pointers to member functions.
The decision to upgrade from version 4.2.* to 4.3.* was made after realizing that the lexical scanner function yylex() can be defined in the scanner class that is derived from yyFlexLexer. Under this approach the yylex() function can access the members of the class derived from yyFlexLexer as well as the public and protected members of yyFlexLexer. The result of all this is a clean implementation of the rules defined in the flex++ specification file.
The upgrade from version 4.1.* to 4.2.* was the result of the inclusion of section 3.3.1 about the bool data type in chapter 3. The distinction between differences between C and C++ and extensions of the C programming language is (albeit a bit fuzzy) reflected in the introduction chapter and the chapter on first impressions of C++: the introduction chapter covers some differences between C and C++, whereas the chapter about first impressions of C++ covers some extensions of the C programming language as found in C++.
Major version 4 represents a major rewrite of the previous version 3.4.14: The document was rewritten from SGML to Yodl and many new sections were added. All sections got a tune-up. The distribution basis, however, hasn't changed: see the introduction.
Modifications in versions 1.*.*, 2.*.*, and 3.*.* were not logged.
Subreleases like 4.4.2a etc. contain bugfixes and typographical corrections.

2.2: The history of C++

The first implementation of C++ was developed in the nineteen-eighties at the AT&T Bell Labs, where the Unix operating system was created.

C++ was originally a `pre-compiler', similar to the preprocessor of C, which converted special constructions in its source code to plain C. This code was then compiled by a normal C compiler. The `pre-code', which was read by the C++ pre-compiler, was usually located in a file with the extension .cc, .C or .cpp. This file would then be converted to a C source file with the extension .c, which was compiled and linked.

The nomenclature of C++ source files remains: the extensions .cc and .cpp are usually still used. However, the preliminary work of a C++ pre-compiler is in modern compilers usually included in the actual compilation process. Often compilers will determine the type of a source file by the extension. This holds true for Borland's and Microsoft's C++ compilers, which assume a C++ source for an extension .cpp. The Gnu compiler g++, which is available on many Unix platforms, assumes for C++ the extension .cc.

The fact that C++ used to be compiled into C code is also visible from the fact that C++ is a superset of C: C++ offers all possibilities of C, and more. This makes the transition from C to C++ quite easy. Programmers who are familiar with C may start `programming in C++' by using source files with an extension .cc or .cpp instead of .c, and can then just comfortably slide into all the possibilities that C++ offers. No abrupt change of habits is required.

2.2.1: History of the C++ Annotations

The C++ Annotations were originally written by Frank and Karel Kubat in Dutch using LaTeX. After some time, Karel rewrote the text and converted the guide to a more suitable format and (of course) to English in September 1994.

The first version of the guide appeared on the net in October 1994. By then it was converted to SGML.

In time several chapters were added, and the contents were modified thanks to countless readers who sent us their comments, due to which we were able to correct some typos and improve unclear parts.

The transition from major version three to major version four was realized by Frank: again new chapters were added, and the source-document was converted from SGML to Yodl (http://www.xs4all.nl/~jantien/yodl/).

The C++ Annotations are not freely distributable. Be sure to read the legal notes.
Reading the annotations beyond this point implies that you are aware of the restrictions that we pose and that you agree with them.

If you like this document, tell your friends about it. Even better, let us know by sending email to Frank.

On the Internet, many useful hyperlinks to C++ resources exist. Without even suggesting completeness (and without being checked regularly for existence: they might have died by the time you read this), the following might be worthwhile visiting:

http://www.cplusplus.com/ref/: A reference site for C++.
http://www.cygnus.com/misc/wp/dec96pub/: Makes available a version of the C++ ANSI/ISO standard.

2.2.2: Compiling a C program by a C++ compiler

For the sake of completeness, it must be mentioned here that C++ is `almost' a superset of C. There are some small differences which you might encounter when you just rename a file to an extension .cc and run it through a C++ compiler:

In C, sizeof('c') equals sizeof(int), 'c' being any ASCII character. The underlying philosophy is probably that char's, when passed as arguments to functions, are passed as integers anyway. Furthermore, the C compiler handles a character constant like 'c' as an integer constant. Hence, in C, the function calls
```
    putchar(10);
```
and
```
    putchar('\n');
```
are synonyms.
In contrast, in C++, sizeof('c') is always 1 (but see also section 3.3.2), while an int is still an int. As we shall see later (see section 2.5.11), two function calls
```
    somefunc(10);
```
and
```
    somefunc('\n');
```
can invoke quite separate functions: C++ distinguishes functions by the types of their arguments, which differ in these two calls: one function requires an int while the other one requires a char.
C++ requires very strict prototyping of external functions. E.g., a prototype like
```
    extern void func();
```
means in C that a function func() exists, which returns no value. However, in C, the declaration doesn't specify which arguments (if any) the function takes.
In contrast, such a declaration in C++ means that the function func() takes no arguments at all.

2.2.3: Compiling a C++ program

In order to compile a C++ program, a C++ compiler is needed. Considering the free nature of this document, it won't come as a surprise that a free compiler is suggested here. The Free Software Foundation (FSF) provides at http://www.gnu.org a free C++ compiler which is, among other places, also part of the Debian (http://www.debian.org) distribution of Linux (http://www.linux.org).

2.2.3.1: C++ under MS-Windows

For MS-Windows Cygnus (http://sources.redhat.com/cygwin) provides the foundation for installing the Windows port of the Gnu g++ compiler.

When going to the above URL for a free g++ compiler, click on install now. This will download the file setup.exe, which can be run to install cygwin. The software to be installed can be downloaded by setup.exe from the internet. There are alternatives (e.g., using a CD-ROM), which are described on the Cygwin page. Installation proceeds interactively. The offered defaults are normally what you would want.

The most recent Gnu g++ compiler can be obtained from http://gcc.gnu.org. If the compiler that is made available in the Cygnus distribution lags behind the latest version, the sources of the latest version can be downloaded after which the compiler can be built using the available compiler. The compiler's webpage mentioned above contains detailed instructions on how to proceed. In our experience building a new compiler within the Cygnus environment works flawlessly.

2.2.3.2: Compiling a C++ source text

In general, compiling a C++ source source.cc is done as follows:

g++ source.cc

This produces a binary program (a.out or a.exe). If the default name is not wanted, the name of the executable can be specified using the -o flag:

g++ -o source source.cc

If only a compilation is required, the compiled module can be generated using the -c flag:

g++ -c source.cc

This produces the file source.o, which can be linked to other modules later on.

Using the icmake program a maintenance script can be used to assist in the construction and maintenance of C++ programs. This script has been tested on Linux platforms for several years now. Its description and components are found in a file named icmake-C1.61.tar.gz (or a comparable name), which is found in the same location as the icmake program. Alternatively, the standard make program can be used for maintenance of C++ programs. It is strongly advised to start using maintenance scripts or programs early in the study of the C++ programming language.

2.3: Advantages and pretensions of C++

Often it is said that programming in C++ leads to `better' programs. Some of the claimed advantages of C++ are:

New programs would be developed in less time because old code can be reused.
Creating and using new data types would be easier than in C.
The memory management under C++ would be easier and more transparent.
Programs would be less bug-prone, as C++ uses a stricter syntax and type checking.
`Data hiding', the usage of data by one program part while other program parts cannot access the data, would be easier to implement with C++.

Which of these allegations are true? In our opinion, C++ is a little overrated; in general this holds true for the entire object-oriented programming (OOP) approach. The enthusiasm around C++ resembles somewhat the former allegations about Artificial-Intelligence (AI) languages like Lisp and Prolog: these languages were supposed to solve the most difficult AI-problems `almost without effort'. Obviously, stories that sound too promising about any programming language must be exaggerated; in the end, each problem can be coded in any programming language (even BASIC or assembly language). The advantages or disadvantages of a given programming language aren't in `what you can do with them', but rather in `which tools the language offers to make the job easier'.

Concerning the above allegations of C++, we think that the following can be concluded. The development of new programs while existing code is reused can also be realized in C by, e.g., using function libraries: thus, handy functions can be collected in a library and need not be re-invented with each new program. Still, C++ offers its specific syntax possibilities for code reuse, apart from function libraries (see chapter 13).

Creating and using new data types is also very well possible in C; e.g., by using structs, typedefs etc. From these types other types can be derived, thus leading to structs containing structs and so on.

Memory management is in principle in C++ as easy or as difficult as in C, especially when dedicated C functions such as xmalloc() and xrealloc() are used (these functions are often present in our C programs; they allocate memory, or abort the program when the memory pool is exhausted). In short, memory management in C or in C++ can be coded `elegantly', `ugly' or anything in between: this depends on the developer rather than on the language.

Concerning `bug proneness' we can say that C++ indeed uses stricter type checking than C. However, most modern C compilers implement `warning levels'; it is then the programmer's choice to disregard or heed a generated warning. In C++ many of such warnings become fatal errors (the compilation stops).

As far as `data hiding' is concerned, C does offer some tools. E.g., where possible, local or static variables can be used and special data types such as structs can be manipulated by dedicated functions. Using such techniques, data hiding can be realized even in C; though it must be admitted that C++ offers special syntactical constructions. In contrast, programmers who prefer to use a global variable int i for each counter variable will quite likely not benefit from the concept of data hiding, be it in C or C++.

Concluding, C++ in particular and OOP in general are not solutions to all programming problems. C++, however, does offer some elegant syntactical possibilities which are worthwhile investigating. At the same time, the level of grammatical complexity of C++ has increased significantly compared to C. In time we got used to this increased level of complexity, but the transition didn't take place quickly or painlessly. With the Annotations we hope to help the reader to make the transition from C to C++ by providing, indeed, our annotations to what is found in some textbooks on C++. We hope you like this document and may benefit from it: Good luck!

2.4: What is Object-Oriented Programming?

Object-oriented programming propagates a slightly different approach to programming problems than the strategy which is usually used in C. The C-way is known as a `procedural approach': a problem is decomposed into subproblems and this process is repeated until the subtasks can be coded. Thus a conglomerate of functions is created, communicating through arguments and variables, global or local (or static).

In contrast, or maybe better: in addition to this, an object-oriented approach identifies the keywords in the problem. These keywords are then depicted in a diagram and arrows are drawn between these keywords to define an internal hierarchy. The keywords will be the objects in the implementation and the hierarchy defines the relationship between these objects. The term object is used here to describe a limited, well-defined structure, containing all information about some entity: data types and functions to manipulate the data. As an example of an object oriented approach, an illustration follows:

The employees and owner of a car dealer and auto garage company are paid as follows. First, mechanics who work in the garage are paid a certain sum each month. Second, the owner of the company receives a fixed amount each month. Third, there are car salesmen who work in the showroom and receive their salary each month plus a bonus per sold car. Finally, the company employs second-hand car purchasers who travel around; these employees receive their monthly salary, a bonus per bought car, and a restitution of their travel expenses.

When representing the above salary administration, the keywords could be mechanics, owner, salesmen and purchasers. The properties of such units are: a monthly salary, sometimes a bonus per purchase or sale, and sometimes restitution of travel expenses. When analyzing the problem in this manner we arrive at the following representation:

The owner and the mechanics can be represented as the same type, receiving a given salary per month. The relevant information for such a type would be the monthly amount. In addition this object could contain data such as the name, address and social security number.
Car salesmen who work in the showroom can be represented as the same type as above but with extra functionality: the number of transactions (sales) and the bonus per transaction.
In the hierarchy of objects we would define the dependency between the first two objects by letting the car salesmen be `derived' from the owner and mechanics.
Finally, there are the second-hand car purchasers. These share the functionality of the salesmen except for the travel expenses. The additional functionality would therefore consist of the expenses made and this type would be derived from the salesmen.

The hierarchy of the thus identified objects is further illustrated in figure 1.

figure 1: Hierarchy of objects in the salary administration.

The overall process in the definition of a hierarchy such as the above starts with the description of the most simple type. Subsequently more complex types are derived, while each derivation adds a little functionality. From these derived types, more complex types can be derived ad infinitum, until a representation of the entire problem can be made.

In C++ each of the objects can be represented in a class, containing the necessary functionality to do useful things with the variables (called objects) of these classes. Not all of the functionality and not all of the properties of a class are usually available to objects of other classes. As we will see, classes tend to encapsulate their properties in such a way that they are not immediately accessible from the outside world. Instead, dedicated functions are normally used to reach or modify the properties of objects.

2.5: Differences between C and C++

In this section some examples of C++ code are shown. Some differences between C and C++ are highlighted.

2.5.1: Namespaces

C++ introduces the notion of a namespace: all symbols are defined in a larger context, called a namespace. Namespaces are used to avoid name conflicts that could arise when a programmer would like to define a function like sin(), operating on degrees without losing the capability of using the standard sin() function, operating on radians.

Namespaces are covered extensively in section 3.6. For now it should be noted that most compilers require the explicit declaration of a standard namespace: std. So, unless otherwise indicated, it is stressed that all examples in the Annotations now implicitly use the

using namespace std;

declaration. So, if you intend to actually compile the examples given in the Annotations, make sure that the sources contain the above line, placed just beyond the #include preprocessor directives.

2.5.2: End-of-line comment

According to the ANSI definition, `end-of-line comment' is implemented in the syntax of C++. This comment starts with // and ends with the end-of-line marker. The standard C comment, delimited by /* and */ can still be used in C++:

    int main()
    {
        // this is end-of-line comment
        // one comment per line

        /*
            this is standard-C comment, over more
            than one line
        */
    }

2.5.3: NULL-pointers vs. 0-pointers

In C++ all zero values are coded as 0. In C, where pointers are concerned, NULL is often used. This difference is purely stylistic, though one that is widely adopted. In C++ there's no need anymore to use NULL. Indeed, according to the descriptions of the pointer-returning operator new, 0 rather than NULL is returned when memory allocation fails. (Note that under the ANSI/ISO standard the plain operator new reports failure by throwing a std::bad_alloc exception; it is the new(std::nothrow) form that returns 0.)

2.5.4: Strict type checking

C++ uses very strict type checking. A prototype must be known for each function which is called, and the call must match the prototype. The program

    int main()
    {
        printf("Hello World\n");
    }

does often compile under C, though with a warning that printf() is not a known function. Many C++ compilers will fail to produce code in such a situation (When Gnu's g++ compiler encounters an unknown function, it assumes that an `ordinary' C function is meant. It does complain however.). The error is of course the missing #include <stdio.h> directive.

While we're at it: in C++ the function main() always uses the int return value. It is possible to define int main() without an explicit return statement, but a return statement without an expression cannot be given inside the main() function: a return statement in main() must always be given an int-expression. For example:

    int main()
    {
        return;     // won't compile: expects int expression
                    // omitting the above statement is ok too
    }

2.5.5: A new syntax for casts

Traditionally, C offers the following cast construction: (typename)expression

in which typename is the name of a valid type, and expression an expression. Following that, C++ initially also supported the function call style cast notation:

typename(expression)

However, these casts are now called old-style casts, and their use is discouraged. Instead, four new-style casts were introduced:

The standard cast to convert one type to another is static_cast<type>(expression)
There is a special cast to do away with the const type-modification: const_cast<type>(expression)
A third cast is used to change the interpretation of information: reinterpret_cast<type>(expression)
And, finally, there is a cast form which is used in combination with polymorphism (see chapter 14): dynamic_cast<type>(expression). This cast is performed at run-time to convert, e.g., a pointer to an object of a certain class to a pointer to an object in its so-called class hierarchy. At this point in the Annotations it is a bit premature to discuss the dynamic_cast, but we will return to this topic in section 14.5.1.

2.5.5.1: The `static_cast'-operator

The static_cast<type>(expression) operator is used to convert one type to an acceptable other type, e.g., double to int. An example of such a cast is, assuming intVar is of type int:

intVar = static_cast<int>(12.45);

Another nice example of code in which it is a good idea to use the static_cast<>()-operator is in situations where the arithmetic assignment operators are used in mixed-type situations. E.g., consider the following expression (assume doubleVar is a variable of type double):

intVar += doubleVar;

Here, the evaluated expression actually is:

intVar = static_cast<int>(static_cast<double>(intVar) + doubleVar);

Here intVar is first promoted to a double, and is then added as a double to doubleVar. Next, the sum is cast back to an int. These two conversions are a bit overdone. An int-value for the right-hand side of the expression is obtained by explicitly casting doubleVar to an int. This normally yields the same result when intVar and doubleVar have the same sign; with opposite signs, truncating doubleVar first may produce a different value:

intVar += static_cast<int>(doubleVar);

2.5.5.2: The `const_cast'-operator

The const_cast<type>(expression) operator is used to do away with the const-ness of a (pointer) type. Assume that a function fun(char *s) is available, which performs some operation on its char *s parameter. Furthermore, assume that it's known that the function does not actually alter the string it receives as its argument. How can we use the function with a string like char const hello[] = "Hello world"?

Passing hello to fun() produces the warning

passing `const char *' as argument 1 of `fun(char *)' discards const

which can be prevented using the call

fun(const_cast<char *>(hello));

2.5.5.3: The `reinterpret_cast'-operator

The reinterpret_cast<type>(expression) operator is used to reinterpret byte patterns. For example, the individual bytes making up a double value can easily be reached using a reinterpret_cast<>(). Assume doubleVar is a variable of type double, then the individual bytes can be reached using

reinterpret_cast<char *>(&doubleVar)

This particular example also suggests the danger of the cast: it looks as though a standard C-string is produced, but there is not normally a trailing 0-byte. It's just a way to reach the individual bytes of the memory holding a double value.

More generally: using the cast-operators is a dangerous habit, as it suppresses the normal type-checking mechanism of the compiler. It is suggested to avoid casts if at all possible. If circumstances arise in which casts have to be used, document the reasons for their use well in your code, to make doubly sure that the cast is not the underlying cause for a program to misbehave.

2.5.5.4: The `dynamic_cast'-operator

The dynamic_cast<>() operator is used in the context of polymorphism. The discussion of this cast is postponed until section 14.5.1.

2.5.6: The `void' parameter list

A function prototype with an empty parameter list, such as

    extern void func();

means in C that the argument list of the declared function is not prototyped: the compiler will not be able to warn against improper argument usage. When declaring a function in C which has no arguments, the keyword void is used, as in:

    extern void func(void);

Because C++ enforces strict type checking, an empty parameter list is interpreted as the absence of any parameter. The keyword void can then be omitted: in C++ the above two declarations are equivalent.

2.5.7: The `#define __cplusplus'

Each C++ compiler which conforms to the ANSI/ISO standard defines the symbol __cplusplus: it is as if each source file were prefixed with the preprocessor directive #define __cplusplus.

We shall see examples of the usage of this symbol in the following sections.

2.5.8: The usage of standard C functions

Normal C functions, e.g., those compiled and collected in a run-time library, can also be used in C++ programs. Such functions, however, must be declared as C functions.

As an example, the following code fragment declares a function xmalloc() which is a C function:

    extern "C" void *xmalloc(unsigned size);

This declaration is analogous to a declaration in C, except that the prototype is prefixed with extern "C".

A slightly different way to declare C functions is the following:

    extern "C"
    {
        // C-declarations go in here
    }

It is also possible to place preprocessor directives at the location of the declarations. E.g., a C header file myheader.h which declares C functions can be included in a C++ source file as follows:

    extern "C"
    {
    #   include <myheader.h>
    }

The methods presented above can be used without problems, but are not commonly used. A more frequently used method to declare external C functions is presented next.

2.5.9: Header files for both C and C++

The combination of the predefined symbol __cplusplus and of the possibility to define extern "C" functions offers the ability to create header files for both C and C++. Such a header file might, e.g., declare a group of functions which are to be used in both C and C++ programs.

The setup of such a header file is as follows:

    #ifdef __cplusplus
    extern "C"
    {
    #endif
        // declarations of C data and functions are inserted here. E.g.,
        extern void *xmalloc(unsigned size);

    #ifdef __cplusplus
    }
    #endif

Using this setup, a normal C header file is enclosed by extern "C" { which occurs at the start of the file and by }, which occurs at the end of the file. The #ifdef directives test for the type of the compilation: C or C++. The `standard' C header files, such as stdio.h, are built in this manner and are therefore usable for both C and C++.

A further refinement is often seen. Usually it is desirable to avoid multiple inclusions of the same header file. This can easily be achieved by including an #ifndef directive in the header file. An example of a file myheader.h would then be:

    #ifndef MYHEADER_H
    #define MYHEADER_H
        // declarations of the header file are inserted here,
        // using #ifdef __cplusplus etc. directives

    #endif

When this file is scanned for the first time by the preprocessor, the symbol MYHEADER_H is not yet defined. The #ifndef condition succeeds and all declarations are scanned. In addition, the symbol MYHEADER_H is defined.

When this file is scanned for a second time during the same compilation, the symbol MYHEADER_H is defined. All information between the #ifndef and #endif directives is skipped.

The symbol name MYHEADER_H serves in this context only for recognition purposes. E.g., the name of the header file can be used for this purpose, in capitals, with an underscore character instead of a dot. Names starting with an underscore followed by a capital letter are best avoided, as they are reserved for the compiler and its libraries.

Apart from all this, the custom has evolved to give C header files the extension .h, and to give C++ header files no extension. For example, the standard iostreams cin, cout and cerr are available after including the preprocessor directive #include <iostream>, rather than #include <iostream.h> in a source. In the Annotations this convention is used with the standard C++ header files, but not everywhere else (Frankly, we tend not to follow this convention: our C++ header files still have the .h extension, and apparently nobody cares...).

There is more to be said about header files. In section 6.5 the preferred organization of header files with C++ classes is discussed.

2.5.10: The definition of local variables

In C (at least in the traditional C89 dialect) local variables can only be defined at the top of a function or at the beginning of a nested block. In C++ local variables can be created at any position in the code, even between statements.

Furthermore, local variables can be defined in some statements, just prior to their usage. A typical example is the for statement:

    #include <stdio.h>
    
    int main()
    {
        for (int i = 0; i < 20; i++)
            printf("%d\n", i);
        return (0);
    }

In this code fragment the variable i is created inside the for statement. According to the ANSI-standard, the variable exists neither before nor after the for-statement. With some compilers, the variable continues to exist after the execution of the for-statement, but a warning like

warning: name lookup of `i' changed for new ANSI `for' scoping using obsolete binding at `i'

will be issued when the variable is used outside of the for-loop. The implication seems clear: define a variable just before the for-statement if it's to be used after that statement, otherwise the variable can be defined at the for-statement itself.

Defining local variables when they're needed requires a little getting used to. However, eventually it tends to produce more readable code than defining variables at the beginning of compound statements. We suggest the following rules of thumb for defining local variables:

Local variables should be defined at the beginning of a function, following the first {,
or they should be created at `intuitively right' places, such as in the example above. This applies not only to the for-statement, but also to all situations where a variable is only needed, say, half-way through the function.

2.5.11: Function Overloading

In C++ it is possible to define functions having identical names but performing different actions. The functions must differ in their parameter lists. An example is given below:

    #include <stdio.h>

    void show(int val)
    {
        printf("Integer: %d\n", val);
    }

    void show(double val)
    {
        printf("Double: %lf\n", val);
    }

    void show(char const *val)
    {
        printf("String: %s\n", val);
    }

    int main()
    {
        show(12);
        show(3.1415);
        show("Hello World!");
    }

In the above fragment three functions show() are defined, which only differ in their parameter lists: int, double and char const *. The functions have the same name. The definition of several functions having identical names is called `function overloading'.

It is interesting that the way in which the C++ compiler implements function overloading is quite simple. Although the functions share the same name in the source text (in this example show()), the compiler (and hence the linker) uses quite different names. The conversion of a name in the source file to an internally used name is called `name mangling'. E.g., the C++ compiler might convert the name void show(int) to the internal name VshowI, while an analogous function with a char * argument might be called VshowCP. The actual names which are internally used depend on the compiler and are not relevant for the programmer, except where these names show up in, e.g., a listing of the contents of a library.

A few remarks concerning function overloading are:

The usage of more than one function with the same name but quite different actions should be avoided. In the example above, the functions show() are still somewhat related (they print information to the screen).
However, it is also quite possible to define two functions lookup(), one of which would find a name in a list while the other would determine the video mode. In this case the two functions have nothing in common except for their name. It would therefore be more practical to use names which suggest the action; say, findname() and getvidmode().
C++ does not allow several functions to differ only in their return type. The reason is that it is always the programmer's choice to inspect or ignore the return value of a function. E.g., the fragment
```
    printf("Hello World!\n");
```
holds no information concerning the return value of the function printf() (The return value is, by the way, an integer which states the number of printed characters. This return value is practically never inspected.). Two functions printf() which would only differ in their return type could therefore not be distinguished by the compiler.
Function overloading can produce surprises. E.g., imagine a statement like
```
    show(0);
```
given the three functions show() above. The zero could be interpreted here as a NULL pointer to a char, i.e., a (char *)0, or as an integer with the value zero. C++ will choose to call the function expecting an integer argument, which might not be what one expects.

2.5.12: Default function arguments

In C++ it is possible to provide `default arguments' when defining a function. These arguments are supplied by the compiler when they are not specified by the programmer. For example:

    #include <stdio.h>

    void showstring(char const *str = "Hello World!\n")
    {
        printf("%s", str);
    }
    }

    int main()
    {
        showstring("Here's an explicit argument.\n");

        showstring();           // in fact this says:
                                // showstring("Hello World!\n");
    }

The possibility to omit arguments in situations where default arguments are defined is just a nice touch: the compiler will supply the missing argument when not specified. The resulting program is neither shorter nor more efficient.

Functions may be defined with more than one default argument:

    void two_ints(int a = 1, int b = 4)
    {
        ...
    }

    int main()
    {
        two_ints();            // arguments:  1, 4
        two_ints(20);          // arguments: 20, 4
        two_ints(20, 5);       // arguments: 20, 5
    }

When the function two_ints() is called, the compiler supplies one or two arguments when necessary. A statement such as two_ints(,6) is however not allowed: when arguments are omitted they must be on the right-hand side.

Default arguments must be known to the compiler when the code is generated where the arguments may have to be supplied. Often this means that the default arguments are present in a header file:

    // sample header file
    extern void two_ints(int a = 1, int b = 4);

    // code of function in, say, two.cc
    void two_ints(int a, int b)
    {
        ...
    }

Note that supplying the default arguments in function definitions instead of in the header file is not the correct approach: the compiler will read the header file and not the function definition when the function is used in other sources. Consequently, in that case no default arguments can be inserted by the compiler.

2.5.13: The keyword `typedef'

The keyword typedef is still allowed in C++, but no longer necessary when used as a prefix in union, struct or enum definitions. This is illustrated in the following example:

    struct somestruct
    {
        int
            a;
        double
            d;
        char
            string[80];
    };

When a struct, union or other compound type is defined, the tag of this type can be used as a type name (this is somestruct in the above example):

    somestruct
        what;

    what.d = 3.1415;

2.5.14: Functions as part of a struct

In C++ it is allowed to define functions as part of a struct. This is the first concrete example of the definition of an object: as was described previously (see section 2.4), an object is a structure containing all involved code and data.

A definition of a struct point is given in the code fragment below. In this structure, two int data fields and one function draw() are declared.

    struct point            // definition of a screen
    {                       // dot:
        int
            x,              // coordinates
            y;              // x/y
        void
            draw(void);     // drawing function
    };

A similar structure could be part of a painting program and could, e.g., represent a pixel in the drawing. Concerning this struct it should be noted that:

The function draw() which occurs in the struct definition is only a declaration. The actual code of the function, or in other words the actions which the function should perform, is located elsewhere: in the code section of the program, where all code is collected. We will describe the actual definitions of functions inside structs later (see section 3.2).
The size of the struct point is just that of its two ints. Even though a function is declared in the structure, its size is not affected by this. The compiler implements this behavior by allowing the function draw() to be known only in the context of a point.

The point structure could be used as follows:

    point                   // two points on
        a,                  // screen
        b;

    a.x = 0;                // define first dot
    a.y = 10;               // and draw it
    a.draw();

    b = a;                  // copy a to b
    b.y = 20;               // redefine y-coord
    b.draw();               // and draw it

The function that is part of the structure is selected in the same way as data fields are selected; i.e., using the field selector operator (.). When pointers to structs are used, -> can be used.

The idea of this syntactical construction is that several types may contain functions having identical names. E.g., a structure representing a circle might contain three int values: two values for the coordinates of the center of the circle and one value for the radius. Analogously to the point structure, a function draw() could be declared which would draw the circle.