This chapter will introduce enough
C++ syntax and program construction concepts to allow you to write
and run some simple object-oriented
programs. In the subsequent chapter we will cover the basic syntax of C and C++
in detail.
By reading this chapter first,
you’ll get the basic flavor of what it is like to program with objects in
C++, and you’ll also discover some of the reasons for the enthusiasm
surrounding this language. This should be enough to carry you through Chapter 3,
which can be a bit exhausting since it contains most of the details of the C
language.
The user-defined data
type, or
class, is what distinguishes C++ from traditional
procedural languages. A class is a new data type that you or someone else
creates to solve a particular kind of problem. Once a class is created, anyone
can use it without knowing the specifics of how it works, or even how classes
are built. This chapter treats classes as if they are just another built-in data
type available for use in programs.
Classes that someone else has created are
typically packaged into a library. This chapter uses
several of the class libraries that come with all C++ implementations. An
especially important standard library is iostreams, which (among other things)
allows you to read from files and the keyboard, and to write to files and the
display. You’ll also see the very handy string class, and the
vector container from the Standard C++ Library. By the end of the
chapter, you’ll see how easy it is to use a pre-defined library of
classes.
All computer languages are translated
from something that tends to be easy for a human to understand (source
code) into something that is executed on a computer (machine
instructions). Traditionally, translators fall into
two classes: interpreters and
compilers.
An interpreter translates source code
into activities (which may comprise groups of machine instructions) and
immediately executes those activities. BASIC, for
example, has been a popular interpreted language. Traditional BASIC interpreters
translate and execute one line at a time, and then forget that the line has been
translated. This makes them slow, since they must re-translate any repeated
code. BASIC has also been compiled, for speed. More modern interpreters, such as
those for the Python language, translate the entire
program into an intermediate language that is then executed by a much faster
interpreter[25].
Interpreters have many advantages. The
transition from writing code to executing code is almost immediate, and the
source code is always available so the interpreter can be much more specific
when an error occurs. The benefits often cited for interpreters are ease of
interaction and rapid development (but not necessarily execution) of
programs.
Interpreted languages often have severe
limitations when building large projects (Python seems to be an exception to
this). The interpreter (or a reduced version) must always be in memory to
execute the code, and even the fastest interpreter may introduce unacceptable
speed restrictions. Most interpreters require that the complete source code be
brought into the interpreter all at once. Not only does this introduce a space
limitation, it can also cause more difficult bugs if the language doesn’t
provide facilities to localize the effect of different pieces of
code.
A compiler translates source code
directly into assembly language or machine instructions. The eventual end
product is a file or files containing machine code. This is an involved process,
and usually takes several steps. The transition from writing code to executing
code is significantly longer with a compiler.
Depending on the acumen of the compiler
writer, programs generated by a compiler tend to require much less space to run,
and they run much more quickly. Although size and speed are probably the most
often cited reasons for using a compiler, in many situations they aren’t
the most important reasons. Some languages (such as C) are designed to allow
pieces of a program to be compiled independently. These pieces are eventually
combined into a final executable program by a tool called the
linker. This process is called separate
compilation.
Separate compilation has many benefits. A
program that, taken all at once, would exceed the limits of the compiler or the
compiling environment can be compiled in pieces. Programs can be built and
tested one piece at a time. Once a piece is working, it can be saved and treated
as a building block. Collections of tested and working pieces can be combined
into libraries for use by other programmers. As
each piece is created, the complexity of the other pieces is hidden. All these
features support the creation of large
programs[26].
Compiler debugging
features have improved significantly over time. Early compilers only generated
machine code, and the programmer inserted print statements to see what was going
on. This is not always effective. Modern compilers can insert information about
the source code into the executable program. This information is used by
powerful source-level debuggers to show exactly
what is happening in a program by tracing its progress through the source
code.
Some compilers tackle the
compilation-speed problem by performing in-memory
compilation. Most compilers work with files, reading and writing them in
each step of the compilation process. In-memory compilers keep the compiler
program in RAM. For small programs, this can seem as responsive as an
interpreter.
To program in C and C++, you need to
understand the steps and tools in the compilation process. Some languages (C and
C++, in particular) start compilation by running a
preprocessor on the source code. The preprocessor
is a simple program that replaces patterns in the source code with other
patterns the programmer has defined (using preprocessor
directives). Preprocessor directives are used to save
typing and to increase the readability of the code. (Later in the book,
you’ll learn how the design of C++ is meant to discourage much of the use
of the preprocessor, since it can cause subtle bugs.) The pre-processed code is
often written to an intermediate file.
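As a minimal sketch of the pattern replacement described above, consider a single preprocessor directive. The names MAX_ITEMS and totalSlots are invented for this illustration and do not appear elsewhere in the book.

```cpp
// Hypothetical names, for illustration only.
#define MAX_ITEMS 10  // Directive: replace MAX_ITEMS with 10

// After preprocessing, the compiler proper sees: return 10 * rows;
int totalSlots(int rows) {
  return MAX_ITEMS * rows;
}
```

The compiler never sees the name MAX_ITEMS, only the text that replaced it, which is one reason such replacements can lead to subtle bugs.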
Compilers usually do their work in two
passes. The first pass parses the pre-processed
code. The compiler breaks the source code into small units and organizes it into
a structure called a tree. In the expression
“A + B” the elements ‘A’,
‘+,’ and ‘B’ are leaves on the parse
tree.
A global
optimizer is sometimes used between the first and
second passes to produce smaller, faster code.
In the second pass, the code
generator walks through the parse tree and generates
either assembly language code or machine code for the nodes of the tree. If the
code generator creates assembly code, the assembler must then be run. The end
result in both cases is an object module (a file that
typically has an extension of .o or .obj). A peephole
optimizer is sometimes used in the second pass to
look for pieces of code containing redundant assembly-language
statements.
The use of the word
“object” to describe chunks of machine code
is an unfortunate artifact. The word came into use before object-oriented
programming was in general use. “Object” is used in the same sense
as “goal” when discussing compilation, while in object-oriented
programming it means “a thing with boundaries.”
The linker
combines a list of object modules into an executable program that can be loaded
and run by the operating system. When a function in one object module makes a
reference to a function or variable in another object module, the linker
resolves these references; it makes sure that all the external functions and
data you claimed existed during compilation do exist. The
linker also adds a special object module to perform start-up
activities.
The linker can search through special
files called libraries in order to resolve all its references. A
library contains a collection of object modules in a
single file. A library is created and maintained by a program called a
librarian.
The compiler performs type
checking during the first pass. Type checking tests
for the proper use of arguments in functions and prevents many kinds of
programming errors. Since type checking occurs during compilation instead of
when the program is running, it is called static type checking.
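To make this concrete, here is a small sketch of static type checking at work. The functions square( ) and useSquare( ) are invented for this illustration; the commented-out line shows a call that the compiler would refuse before the program ever runs.

```cpp
// Hypothetical function used only for illustration.
int square(int n) {
  return n * n;
}

int useSquare() {
  int ok = square(4);          // Accepted: the argument is an int
  // int bad = square("four"); // Rejected during compilation: a character
  //                           // array cannot be converted to an int
  return ok;
}
```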
Some object-oriented languages (notably
Java) perform some type checking at runtime (dynamic
type checking). If combined with static type checking,
dynamic type checking is more powerful than static type
checking alone. However, it also adds overhead to program
execution.
C++ uses static type checking because the
language cannot assume any particular runtime support for bad operations. Static
type checking notifies the programmer about misuses of types during compilation,
and thus maximizes execution speed. As you learn C++, you will see that most of
the language design decisions favor the same kind of high-speed,
production-oriented programming the C language is famous for.
You can disable static type checking in
C++. You can also do your own dynamic type checking – you just need to
write the code.
Separate compilation is particularly
important when building large projects. In C and C++, a
program can be created in small, manageable, independently tested pieces. The
most fundamental tool for breaking a program up into pieces is the ability to
create named subroutines or subprograms. In C and C++, a subprogram is called a
function, and functions are the pieces of code
that can be placed in different files, enabling separate compilation. Put
another way, the function is the atomic unit of code, since you cannot have part
of a function in one file and another part in a different file; the entire
function must be placed in a single file (although files can and do contain more
than one function).
When you call a function, you typically
pass it some arguments, which are values you’d like the function to
work with during its execution. When the function is finished, you typically get
back a return value, a
value that the function hands back to you as a result. It’s also possible
to write functions that take no arguments and return no
values.
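A minimal sketch of both kinds of function follows. The names add, recordCall, and callCount are assumptions made up for this example.

```cpp
// Takes two arguments and hands back a return value.
int add(int a, int b) {
  return a + b;
}

// A variable the next function works on, since it has no arguments.
int callCount = 0;

// Takes no arguments and returns no value.
void recordCall() {
  ++callCount;
}
```

Calling add(2, 3) hands back a value you can store or test, while recordCall( ) is called only for its effect on callCount.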
To create a program with multiple files,
functions in one file must access functions and data in other files. When
compiling a file, the C or C++ compiler must know about the functions and data
in the other files, in particular their names and proper usage. The compiler
ensures that functions and data are used correctly. This process of
“telling the compiler” the names of external functions and data and
what they should look like is called declaration.
Once you declare a function or variable, the compiler knows how to check to make
sure it is used
properly.
It’s important to understand the
difference between declarations and
definitions because these terms will be used
precisely throughout the book. Essentially all C and C++ programs require
declarations. Before you can write your first program, you need to understand
the proper way to write a declaration.
A declaration introduces a name
– an identifier – to the compiler. It tells the compiler “This
function or this variable exists somewhere, and here is what it should look
like.” A definition, on the other hand, says: “Make this
variable here” or “Make this function here.” It allocates
storage for the name. This meaning works whether you’re talking about a
variable or a function; in either case, at the point of definition the compiler
allocates storage. For a variable, the compiler determines how big that variable
is and causes space to be generated in memory to hold the data for that
variable. For a function, the compiler generates code, which ends up occupying
storage in memory.
You can declare a variable or a function
in many different places, but there must be only one definition in C and C++
(this is sometimes called the ODR: one-definition
rule). When the linker is uniting all the object modules, it will usually
complain if it finds more than one definition for the same function or
variable.
A definition can also be a declaration.
If the compiler hasn’t seen the name x before and you define int
x;, the compiler sees the name as a declaration and allocates storage for it
all at once.
A function declaration in C and C++ gives
the function name, the argument types passed to the function, and the return
type of the function. For example, here is a declaration for a function called
func1( ) that takes two integer arguments (integers are denoted in
C/C++ with the keyword int) and returns an integer:
int func1(int,int);
The first keyword you see is the return
type all by itself: int. The arguments are enclosed in parentheses after
the function name in the order they are used. The semicolon indicates the end of
a statement; in this case, it tells the compiler “that’s all –
there is no function definition here!”
C and C++ declarations attempt to mimic
the form of the item’s use. For example, if a is another integer
the above function might be used this way:
a = func1(2,3);
Since func1( ) returns an
integer, the C or C++ compiler will check the use of func1( ) to
make sure that a can accept the return value and that the arguments are
appropriate.
Arguments in
function declarations may have names. The compiler ignores the names but they
can be helpful as mnemonic devices for the user. For example, we can declare
func1( ) in a different fashion that has the same
meaning:
int func1(int length, int width);
There is a significant difference between
C and C++ for functions with empty argument lists. In C, the
declaration:
int func2();
means “a function with any number
and type of argument.” This prevents type-checking,
so in C++ it means “a function with no arguments.”
Function definitions look like function
declarations except that they have bodies. A body is a
collection of statements enclosed in braces. Braces denote the beginning and
ending of a block of code. To give func1( ) a definition that is an
empty body (a body containing no code), write:
int func1(int length, int width) { }
Notice that in the function definition,
the braces replace the semicolon. Since braces surround a statement or group of
statements, you don’t need a semicolon. Notice also that the arguments in
the function definition must have names if you want to use the arguments in the
function body (since they are never used here, they are
optional).
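The empty body compiles, but because func1( ) promises to return an int, calling it and relying on the result would not be safe. Here is a sketch of a definition that actually uses its arguments. What func1( ) is supposed to compute is never stated, so the multiplication below is purely an assumption for illustration.

```cpp
// Assumption: func1 multiplies its arguments. The text leaves the body empty.
int func1(int length, int width) {
  return length * width;  // The names are required here because they are used
}
```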
The meaning attributed to the phrase
“variable declaration” has historically been confusing and
contradictory, and it’s important that you understand the correct
definition so you can read code properly. A variable declaration tells the
compiler what a variable looks like. It says, “I know you haven’t
seen this name before, but I promise it exists someplace, and it’s a
variable of X type.”
In a function declaration, you give a
type (the return value), the function name, the argument list, and a semicolon.
That’s enough for the compiler to figure out that it’s a declaration
and what the function should look like. By inference, a variable declaration
might be a type followed by a name. For example:
int a;
could declare the variable a as an
integer, using the logic above. Here’s the conflict: there is enough
information in the code above for the compiler to create space for an integer
called a, and that’s what happens. To resolve this dilemma, a
keyword was necessary for C and C++ to say “This is only a declaration;
it’s defined elsewhere.” The keyword is
extern. It can mean the
definition is external to the file, or that the definition occurs later
in the file.
Declaring a variable without defining it
means using the extern keyword before a description of the variable, like
this:
extern int a;
extern can also apply to function
declarations. For func1( ), it looks like this:
extern int func1(int length, int width);
This statement is equivalent to the
previous func1( ) declarations. Since there is no function body, the
compiler must treat it as a function declaration rather than a function
definition. The extern keyword is thus superfluous and optional for
function declarations. It is probably unfortunate that the designers of C did
not require the use of extern for function declarations; it would have
been more consistent and less confusing (but would have required more typing,
which probably explains the decision).
Here are some more examples of
declarations:
//: C02:Declare.cpp
// Declaration & definition examples
extern int i; // Declaration without definition
extern float f(float); // Function declaration

float b;  // Declaration & definition
float f(float a) {  // Definition
  return a + 1.0;
}

int i; // Definition
int h(int x) { // Declaration & definition
  return x + 1;
}

int main() {
  b = 1.0;
  i = 2;
  f(b);
  h(i);
} ///:~
In the function declarations, the
argument identifiers are optional. In the definitions, they are required (the
identifiers are required only in C, not C++).
Most libraries contain significant
numbers of functions and variables. To save work and ensure consistency when
making the external declarations for these items, C and C++ use a device called
the header file. A header file is a file
containing the external declarations for a library; it conventionally has a file
name extension of ‘h’, such as headerfile.h. (You may also
see some older code using different extensions, such as .hxx or
.hpp, but this is becoming rare.)
The programmer who creates the library
provides the header file. To declare the functions and external variables in the
library, the user simply includes the header file. To include a header file, use
the #include
preprocessor
directive. This tells the preprocessor to open the named header file and insert
its contents where the #include statement appears. A #include may
name a file in two ways: in angle brackets (< >) or in double
quotes.
File names in angle brackets, such
as:
#include <header>
cause the preprocessor to search for the
file in a way that is particular to your implementation, but typically
there’s some kind of “include search path” that you specify in
your environment or on the compiler command line. The mechanism for setting the
search path varies between machines, operating systems, and C++ implementations,
and may require some investigation on your part.
File names in double quotes, such
as:
#include "local.h"
tell the preprocessor to search for the
file in (according to the specification) an “implementation-defined
way.” What this typically means is to search for the file relative to the
current directory. If the file is not found, then the include directive is
reprocessed as if it had angle brackets instead of quotes.
To include the iostream header file, you
write:
#include <iostream>
The preprocessor will find the iostream
header file (often in a subdirectory called “include”) and insert
it.
As C++ evolved, different compiler
vendors chose different extensions for file names. In addition, various
operating systems have different restrictions on file names, in particular on
name length. These issues caused source code portability problems. To smooth
over these rough edges, the standard uses a format that allows file names longer
than the notorious eight characters and eliminates the extension. For example,
instead of the old style of including iostream.h, which looks like
this:
#include <iostream.h>
you can now write:
#include <iostream>
The translator can implement the include
statements in a way that suits the needs of that particular compiler and
operating system, if necessary truncating the name and adding an extension. Of
course, you can also copy the headers given you by your compiler vendor to ones
without extensions if you want to use this style before a vendor has provided
support for it.
The libraries that have been inherited
from C are still available with the traditional ‘.h’
extension. However, you can also use them with the more modern C++ include style
by prepending a “c” before the name. Thus:
#include <stdio.h>
#include <stdlib.h>
become:
#include <cstdio>
#include <cstdlib>
And so on, for all the Standard C
headers. This provides a nice distinction to the reader indicating when
you’re using C versus C++ libraries.
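As a brief sketch, here are two Standard C library functions reached through the C++ style headers. The helper names gap and textLength are invented for this example; abs( ) and strlen( ) are the actual C library functions, written with the std:: prefix that these headers provide.

```cpp
#include <cstdlib>  // C++ spelling of <stdlib.h>
#include <cstring>  // C++ spelling of <string.h>

// Hypothetical helper: distance between two integers, using abs().
int gap(int a, int b) {
  return std::abs(a - b);
}

// Hypothetical helper: number of characters in a character array, using strlen().
int textLength(const char* text) {
  return static_cast<int>(std::strlen(text));
}
```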
The effect of the new include format is
not identical to the old: using the .h gives you the older, non-template
version, and omitting the .h gives you the new templatized version.
You’ll usually have problems if you try to intermix the two forms in a
single
program.
The linker collects object modules (which
often use file name extensions like .o or .obj), generated by the
compiler, into an executable program the operating system can load and run. It
is the last phase of the compilation process.
Linker characteristics vary from system
to system. In general, you just tell the linker the names of the object modules
and libraries you want linked together, and the name of the executable, and it
goes to work. Some systems require you to invoke the linker yourself. With most
C++ packages you invoke the linker through the C++ compiler. In many situations,
the linker is invoked for you invisibly.
Some older linkers
won’t search object files
and libraries more than once, and they search through the list you give them
from left to right. This means that the order of object files and libraries can
be important. If you have a mysterious problem that doesn’t show up until
link time, one possibility is the order in which the files are given to the
linker.
Now that you know the basic terminology,
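you can understand how to use a library. To use a library:
1. Include the library’s header file.
2. Use the functions and variables in the library.
3. Link the library into the executable program.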
These steps also
apply when the object modules aren’t combined into a library. Including a
header file and linking the object modules are the basic steps for separate
compilation in both C and C++.
When you make an external reference to a
function or variable in C or C++, the linker, upon encountering this reference,
can do one of two things. If it has not already encountered the definition for
the function or variable, it adds the identifier to its list of
“unresolved
references.” If the linker
has already encountered the definition, the reference is
resolved.
If the linker cannot find the definition
in the list of object modules, it searches the libraries.
Libraries have some sort of indexing so the linker doesn’t need to look
through all the object modules in the library – it just looks in the
index. When the linker finds a definition in a library, the entire object
module, not just the function definition, is linked into the executable program.
Note that the whole library isn’t linked, just the object module in the
library that contains the definition you want (otherwise programs would be
unnecessarily large). If you want to minimize executable program size, you might
consider putting a single function in each source code file when you build your
own libraries. This requires more
editing[27],
but it can be helpful to the user.
Because the linker searches files in the
order you give them, you can pre-empt the use of a library function
by inserting a file with your own function, using the
same function name, into the list before the library name appears. Since the
linker will resolve any references to this function by using your function
before it searches the library, your function is used instead of the library
function. Note that this can also be a bug, and the kind of thing C++ namespaces
prevent.
When a C or C++ executable program is
created, certain items are secretly linked in. One of these is the startup
module, which contains initialization routines that must
be run any time a C or C++ program begins to execute. These routines set up the
stack and initialize certain variables in the program.
The linker always searches the standard
library for the compiled versions of any
“standard” functions called in the program. Because the standard
library is always searched, you can use anything in that library by simply
including the appropriate header file in your program; you don’t have to
tell it to search the standard library. The iostream functions, for example, are
in the Standard C++ library. To use them, you just include the
<iostream> header file.
If you are using an add-on library, you
must explicitly add the library name to the list of files handed to the
linker.
Just because you are writing code in C++,
you are not prevented from using C library functions. In fact, the entire C
library is included by default into Standard C++. There has been a tremendous
amount of work done for you in these functions, so they can save you a lot of
time.
This book will use Standard C++ (and thus
also Standard C) library functions when convenient, but only standard
library functions will be used, to ensure the portability of programs. In the
few cases in which library functions must be used that are not in the C++
standard, all attempts will be made to use POSIX-compliant functions. POSIX is a
standard based on a Unix standardization effort that includes functions that go
beyond the scope of the C++ library. You can generally expect to find POSIX
functions on Unix (in particular, Linux) platforms, and often under DOS/Windows.
For example, if you’re using multithreading you are better off using the
POSIX thread library because your code will then be easier to understand, port
and maintain (and the POSIX thread library will usually just use the underlying
thread facilities of the operating system, if these are
provided).
You now know almost enough of the basics
to create and compile a program. The program will use the Standard C++ iostream
classes. These read from and write to files and “standard” input and
output (which normally comes from and goes to the console, but may be redirected
to files or devices). In this simple program, a stream object will be used to
print a message on the
screen.
To declare the functions and external
data in the iostreams class, include the header file with the
statement:
#include <iostream>
The first program uses the concept of
standard output, which means
“a general-purpose place to send output.” You will see other
examples using standard output in different ways, but here it will just go to
the console. The iostream package automatically defines a variable (an object)
called cout that accepts all data bound for
standard output.
To send data to standard output, you use
the operator <<. C programmers know this operator as the
“bitwise left shift,” which will be described in the next chapter.
Suffice it to say that a bitwise left shift has nothing to do with output.
However, C++ allows operators to be overloaded. When you overload an
operator, you give it a new
meaning when that operator is used with an object of a particular type. With
iostream objects, the operator << means “send to.” For
example:
cout << "howdy!";
That’s enough operator overloading
to get you started. Chapter 12 covers operator overloading in
detail.
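To see the two meanings side by side, consider this sketch. The function names are invented, and an ostringstream (a Standard C++ stream that collects its output as text) stands in for cout so that the result can be inspected. That substitution is an assumption of this example and not something the chapter itself uses.

```cpp
#include <sstream>
#include <string>

// With plain integers, << is the bitwise left shift: 1 shifted left 3 places.
int shiftLeft() {
  return 1 << 3;
}

// With a stream object on the left, << means "send to."
std::string sendTo() {
  std::ostringstream out;
  out << 1 << 3;  // Sends 1, then sends 3
  return out.str();
}
```

The first function produces the number 8; the second produces the text "13", because each value is sent to the stream in turn.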
As mentioned in Chapter 1, one of the
problems encountered in the C language is that you “run out of
names” for functions and identifiers when your programs reach a certain
size. Of course, you don’t really run out of names; it does, however,
become harder to think of new ones after a while. More importantly, when a
program reaches a certain size it’s typically broken up into pieces, each
of which is built and maintained by a different person or group. Since C
effectively has a single arena where all the identifier and function names live,
this means that all the developers must be careful not to accidentally use the
same names in situations where they can conflict. This rapidly becomes tedious,
time-wasting, and, ultimately, expensive.
Standard C++ has a mechanism to prevent
this collision: the namespace keyword. Each set of C++ definitions in a
library or program is “wrapped” in a namespace, and if some other
definition has an identical name, but is in a different namespace, then there is
no collision.
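A minimal sketch of two definitions sharing one name without colliding follows. The namespaces graphics and audio, and the function version( ), are hypothetical and chosen only for illustration.

```cpp
// Two hypothetical libraries that both chose the name version().
namespace graphics {
  int version() { return 1; }
}

namespace audio {
  int version() { return 2; }
}

// The namespace name followed by :: selects which definition is meant.
int combined() {
  return graphics::version() * 10 + audio::version();
}
```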
Namespaces are a convenient and helpful
tool, but their presence means that you must be aware of them before you can
write any programs. If you simply include a header file and use some functions
or objects from that header, you’ll probably get strange-sounding errors
when you try to compile the program, to the effect that the compiler cannot find
any of the declarations for the items that you just included in the header file!
After you see this message a few times you’ll become familiar with its
meaning (which is “You included the header file but all the declarations
are within a namespace and you didn’t tell the compiler that you wanted to
use the declarations in that namespace”).
There’s a keyword that allows you
to say “I want to use the declarations and/or definitions in this
namespace.” This keyword, appropriately enough, is
using. All of the Standard
C++ libraries are wrapped in a single namespace, which is
std (for
“standard”). As this book uses the standard libraries almost
exclusively, you’ll see the following
using directive in almost
every program:
using namespace std;
This means that you want to expose all
the elements from the namespace called std. After this statement, you
don’t have to worry that your particular library component is inside a
namespace, since the using directive makes that namespace available
throughout the file where the using directive was
written.
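The following sketch contrasts a name used with the help of the using directive against the same name written out in full. The function names shortForm and longForm are assumptions made up for this example.

```cpp
#include <string>

using namespace std;  // using directive: names in std are visible from here on

// With the directive in effect, string needs no prefix.
string shortForm() {
  string s = "howdy";
  return s;
}

// The fully qualified name works with or without the directive.
std::string longForm() {
  std::string s = "howdy";
  return s;
}
```

Both functions produce the same string; the only difference is how the name is spelled in the source code.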
Exposing all the elements from a
namespace after someone has gone to the trouble to hide them may seem a bit
counterproductive, and in fact you should be careful about thoughtlessly doing
this (as you’ll learn later in the book). However, the using
directive exposes only those names for the current file, so it is not quite as
drastic as it first sounds. (But think twice about doing it in a header file
– that is reckless.)
There’s a relationship between
namespaces and the way header files are included. Before the modern header file
inclusion was standardized (without the trailing ‘.h’, as in
<iostream>), the typical way to include a header file was with the
‘.h’, such as <iostream.h>. At that time,
namespaces were not part of the language either. So to provide backward
compatibility with existing code, if you say
#include <iostream.h>
it means
#include <iostream>
using namespace std;
However, in this book the standard
include format will be used (without the ‘.h’) and so the
using directive must be explicit.
For now, that’s all you need to
know about namespaces, but in Chapter 10 the subject is covered much more
thoroughly.
A C or C++ program is a collection of
variables, function definitions, and function calls. When the program starts, it
executes initialization code and calls a special function,
“main( ).” You put the primary
code for the program here.
As mentioned earlier, a function
definition consists of a return type (which must be specified in C++), a
function name, an argument list in parentheses, and the function code contained
in braces. Here is a sample function definition:
int function() {
  // Function code here (this is a comment)
}
The function above has an empty argument
list and a body that contains only a comment.
There can be many sets of braces within a
function definition, but there must always be at least one set surrounding the
function body. Since main( ) is a function, it must follow these
rules. In C++, main( ) always has a return type of
int.
C and C++ are free form languages. With
few exceptions, the compiler ignores newlines and white space, so it must have
some way to determine the end of a statement. Statements are delimited by
semicolons.
C comments start with /* and end
with */. They can include newlines. C++ uses C-style comments and has an
additional type of comment: //. The // starts a comment that
terminates with a newline. It is more convenient than /* */ for one-line
comments, and is used extensively in this
book.
And now, finally, the first
program:
//: C02:Hello.cpp
// Saying Hello with C++
#include <iostream> // Stream declarations
using namespace std;

int main() {
  cout << "Hello, World! I am "
       << 8 << " Today!" << endl;
} ///:~
The cout object is handed a series
of arguments via the ‘<<’ operators. It prints out
these arguments in left-to-right order. The special iostream function
endl ends the line with a newline and flushes the stream. With iostreams, you
can string together a series of arguments like this, which makes the class easy
to use.
In C, text inside double quotes is
traditionally called a “string.” However, the
Standard C++ library has a powerful class called string for manipulating
text, and so I shall use the more precise term character array for text
inside double quotes.
The compiler creates storage for
character arrays and stores the ASCII equivalent for each character in this
storage. The compiler automatically terminates this array of characters with an
extra piece of storage containing the value 0 to indicate the end of the
character array.
Inside a character array, you can insert
special characters by using escape sequences.
These consist of a backslash (\) followed by a special code. For example
\n means newline. Your compiler manual or local C
guide gives a complete set of escape sequences; others include \t
(tab), \\ (backslash), and
\b (backspace).
Notice that the statement can continue
over multiple lines, and that the entire statement terminates with a
semicolon.
Character array arguments and constant
numbers are mixed together in the above cout statement. Because the
operator << is overloaded with a variety of
meanings when used with cout, you can send cout a variety of
different arguments and it will “figure out what to do with the
message.”
Throughout this book you’ll notice
that the first line of each file begins with the characters that start a
comment (typically //) followed by a colon, and that the last line of the
listing ends with a comment followed by
‘/:~’. This is a technique I use to allow easy extraction of
information from code files (the program to do this can be found in volume two
of this book, at www.BruceEckel.com). The first line also has the name
and location of the file, so it can be referred to in text and in other files,
and so you can easily locate it in the source code for this book (which is
downloadable from
www.BruceEckel.com).
After downloading and unpacking the
book’s source code, find the program in the subdirectory C02.
Invoke the compiler with Hello.cpp as the argument. For simple, one-file
programs like this one, most compilers will take you all the way through the
process. For example, to use the GNU C++ compiler (which is freely available on
the Internet), you write:
g++ Hello.cpp
So far you have seen only the most
rudimentary aspect of the iostreams class. The output formatting available with
iostreams also includes features such as number formatting in decimal, octal,
and hexadecimal. Here’s another example of the use of
iostreams:
//: C02:Stream2.cpp
// More streams features
#include <iostream>
using namespace std;

int main() {
  // Specifying formats with manipulators:
  cout << "a number in decimal: "
       << dec << 15 << endl;
  cout << "in octal: " << oct << 15 << endl;
  cout << "in hex: " << hex << 15 << endl;
  cout << "a floating-point number: "
       << 3.14159 << endl;
  cout << "non-printing char (escape): "
       << char(27) << endl;
} ///:~
This example shows the iostreams class
printing numbers in decimal, octal, and hexadecimal using iostream
manipulators (which don’t print anything,
but change the state of the output stream). The formatting of floating-point
numbers is determined automatically by the stream’s default settings. In
addition, any character
can be sent to a stream object using a cast to a
char (a char is a
data type that holds single characters). This cast looks like a function
call: char( ), along with the character’s ASCII value. In the
program above, the char(27) sends an “escape” to
cout.
An important feature of the C
preprocessor is character array
concatenation. This feature is used in some of the
examples in this book. If two quoted character arrays are adjacent, and no
punctuation is between them, the compiler will paste the character arrays
together into a single character array. This is particularly useful when code
listings have width restrictions:
//: C02:Concat.cpp
// Character array Concatenation
#include <iostream>
using namespace std;

int main() {
  cout << "This is far too long to put on a "
    "single line but it can be broken up with "
    "no ill effects\nas long as there is no "
    "punctuation separating adjacent character "
    "arrays.\n";
} ///:~
At first, the code above can look like an
error because there’s no familiar semicolon at the end of each line.
Remember that C and C++ are free-form languages, and although you’ll
usually see a semicolon at the end of each line, the actual requirement is for a
semicolon at the end of each statement, and it’s possible for a
statement to continue over several
lines.
The iostreams classes provide the ability
to read input. The object used for
standard input is
cin (for “console input”). cin
normally expects input from the console, but this input can be redirected from
other sources. An example of redirection is shown later in this
chapter.
The iostreams operator used with
cin is >>. This operator waits for the same kind of input as
its argument. For example, if you give it an integer argument, it waits for an
integer from the console. Here’s an example:
//: C02:Numconv.cpp
// Converts decimal to octal and hex
#include <iostream>
using namespace std;

int main() {
  int number;
  cout << "Enter a decimal number: ";
  cin >> number;
  cout << "value in octal = 0"
       << oct << number << endl;
  cout << "value in hex = 0x"
       << hex << number << endl;
} ///:~
While the typical way to use a program
that reads from standard input and writes to standard output is within a Unix
shell script or DOS batch file, any program can be called from inside a C or C++
program using the Standard C system( )
function, which is declared in the header file
<cstdlib>:
//: C02:CallHello.cpp
// Call another program
#include <cstdlib> // Declare "system()"
using namespace std;

int main() {
  system("Hello");
} ///:~
To use the system( )
function, you give it a character array that you would normally type at the
operating system command prompt. This can also include command-line arguments,
and the character array can be one that you fabricate at run time (instead of
just using a static character array as shown above). The command executes and
control returns to the program.
This program shows you how easy it is to
use plain C library functions in C++; just include the header file and call the
function. This upward compatibility from C to C++ is a
big advantage if you are learning the language starting from a background in
C.
While a character array can be fairly
useful, it is quite limited. It’s simply a group of characters in memory,
but if you want to do anything with it you must manage all the little details.
For example, the size of a quoted character array is fixed at compile time. If
you have a character array and you want to add some more characters to it,
you’ll need to understand quite a lot (including dynamic memory
management, character array copying, and concatenation) before you can get your
wish. This is exactly the kind of thing we’d like to have an object do for
us.
The Standard C++
string class is designed to take care of (and
hide) all the low-level manipulations of character arrays that were previously
required of the C programmer. These manipulations have been a constant source of
time-wasting and errors since the inception of the C language. So, although an
entire chapter is devoted to the string class in Volume 2 of this book,
the string is so important and it makes life so much easier that it will
be introduced here and used in much of the early part of the
book.
To use strings you include the C++
header file <string>. The string class is in the namespace
std so a using directive is necessary. Because of operator
overloading, the syntax for using strings is quite
intuitive:
//: C02:HelloStrings.cpp
// The basics of the Standard C++ string class
#include <string>
#include <iostream>
using namespace std;

int main() {
  string s1, s2; // Empty strings
  string s3 = "Hello, World."; // Initialized
  string s4("I am"); // Also initialized
  s2 = "Today"; // Assigning to a string
  s1 = s3 + " " + s4; // Combining strings
  s1 += " 8 "; // Appending to a string
  cout << s1 + s2 + "!" << endl;
} ///:~
The first two strings, s1
and s2, start out empty, while s3 and s4 show two
equivalent ways to initialize string objects from character arrays (you
can just as easily initialize string objects from other string
objects).
You can assign to any string
object using ‘=’. This replaces the previous contents of the
string with whatever is on the right-hand side, and you don’t have to
worry about what happens to the previous contents – that’s handled
automatically for you. To combine strings you simply use the
‘+’ operator, which also allows you to combine character
arrays with strings. If you want to append either a string or a
character array to another string, you can use the operator
‘+=’. Finally, note that iostreams
already know what to do with strings, so you can just send a
string (or an expression that produces a string, which happens
with s1 + s2 + "!") directly to cout in order to print
it.
In C, the process of opening and
manipulating files requires a lot of language background to prepare you for the
complexity of the operations. However, the C++ iostream library provides a
simple way to manipulate files, and so this functionality can be introduced much
earlier than it would be in C.
To open files for reading and writing,
you must include <fstream>. Although this
may also bring in <iostream> on some implementations, that isn’t guaranteed, so
explicitly include <iostream> if you’re planning to use
cin, cout, etc.
To open a file for reading, you create an
ifstream object, which then behaves like
cin. To open a file for writing, you create an
ofstream object, which then behaves like
cout. Once you’ve opened the file, you can read from it or write to
it just as you would with any other iostream object. It’s that simple
(which is, of course, the whole point).
One of the most useful functions in the
iostream library is
getline( ), which
allows you to read one line (terminated by a newline) into a string
object[28]. The
first argument is the ifstream object you’re reading from and the
second argument is the string object. When the function call is finished,
the string object will contain the line.
Here’s a simple example, which
copies the contents of one file into another:
//: C02:Scopy.cpp
// Copy one file to another, a line at a time
#include <string>
#include <fstream>
using namespace std;

int main() {
  ifstream in("Scopy.cpp"); // Open for reading
  ofstream out("Scopy2.cpp"); // Open for writing
  string s;
  while(getline(in, s)) // Discards newline char
    out << s << "\n"; // ... must add it back
} ///:~
To open the files, you just hand the
ifstream and ofstream objects the names of the files you want to read from and
write to, as seen above.
There is a new concept introduced here,
which is the
while
loop. Although this will be explained in detail in the next chapter, the basic
idea is that the expression in parentheses following the while controls
the execution of the subsequent statement (which can also be multiple
statements, wrapped inside curly braces). As long as the expression in
parentheses (in this case, getline(in, s)) produces a “true”
result, then the statement controlled by the while will continue to
execute. It turns out that getline( ) will return a value that can
be interpreted as “true” if another line has been read successfully,
and “false” upon reaching the end of the input. Thus, the above
while loop reads every line in the input file and sends each line to the
output file.
getline( ) reads in the
characters of each line until it discovers a newline (the termination character
can be changed, but that won’t be an issue until the iostreams chapter in
Volume 2). However, it discards the newline and doesn’t store it in the
resulting string object. Thus, if we want the copied file to look just
like the source file, we must add the newline back in, as
shown.
Another example reads an entire file into a single string object:
//: C02:FillString.cpp
// Read an entire file into a single string
#include <string>
#include <iostream>
#include <fstream>
using namespace std;

int main() {
  ifstream in("FillString.cpp");
  string s, line;
  while(getline(in, line))
    s += line + "\n";
  cout << s;
} ///:~
Because of the dynamic nature of
strings, you don’t have to worry about how much storage to allocate
for a string; you can just keep adding things and the string will
keep expanding to hold whatever you put into it.
One of the nice things about putting an
entire file into a string is that the string class has many
functions for searching and manipulation that would then allow you to modify the
file as a single string. However, this has its limitations. For one thing, it is
often convenient to treat a file as a collection of lines instead of just a big
blob of text. For example, if you want to add line numbering it’s much
easier if you have each line as a separate string object. To accomplish
this, we’ll need another
approach.
With strings, we can fill up a
string object without knowing how much storage we’re going to need.
The problem with reading lines from a file into individual string objects
is that you don’t know up front how many strings you’re going
to need – you only know after you’ve read the entire file. To solve
this problem, we need some sort of holder that will automatically expand to
contain as many string objects as we care to put into
it.
In fact, why limit ourselves to holding
string objects? It turns out that this kind of problem – not
knowing how many of something you have while you’re writing a program
– happens a lot. And this “container” object sounds like it
would be more useful if it would hold any kind of object at all!
Fortunately, the Standard C++ Library has a ready-made solution: the standard
container classes. The container classes are one of the real powerhouses of
Standard C++.
There is often a bit of confusion between
the containers and algorithms in the Standard C++ Library, and the entity known
as the STL. The
Standard Template Library was the
name Alex Stepanov (who was working at Hewlett-Packard at the time) used when he
presented his library to the C++ Standards Committee at the meeting in San
Diego, California in Spring 1994. The name stuck, especially after HP decided to
make it available for public downloads. Meanwhile, the committee integrated it
into the Standard C++ Library, making a large number of changes. STL's
development continues at
Silicon
Graphics (SGI; see http://www.sgi.com/Technology/STL). The SGI STL
diverges from the Standard C++ Library on many subtle points. So although it's a
popular misconception, the C++ Standard does not “include” the STL.
It can be a bit confusing since the containers and algorithms in the Standard
C++ Library have the same root (and usually the same names) as the SGI STL. In
this book, I will say “The Standard C++ Library” or “The
Standard Library containers,” or something similar and will avoid the term
“STL.”
Even though the implementation of the
Standard C++ Library containers and algorithms uses some advanced concepts and
the full coverage takes two large chapters in Volume 2 of this book, this
library can also be potent without knowing a lot about it. It’s so useful
that the most basic of the standard containers, the vector, is introduced
in this early chapter and used throughout the book. You’ll find that you
can do a tremendous amount just by using the basics of vector and not
worrying about the underlying implementation (again, an important goal of OOP).
Since you’ll learn much more about this and the other containers when you
reach the Standard Library chapters in Volume 2, it seems forgivable if the
programs that use vector in the early portion of the book aren’t
exactly what an experienced C++ programmer would do. You’ll find that in
most cases, the usage shown here is adequate.
The vector class is a
template, which means that it can be efficiently
applied to different types. That is, we can create a vector of
shapes, a vector of cats, a vector of
strings, etc. Basically, with a template you can create a “class of
anything.” To tell the compiler what it is that the class will work with
(in this case, what the vector will hold), you put the name of the
desired type in “angle brackets,” which means ‘<’ and
‘>’. So a vector of string would be denoted
vector<string>. When you do this, you end up with a customized
vector that will hold only string objects, and you’ll get an error
message from the compiler if you try to put anything else into
it.
Since vector expresses the concept
of a “container,” there must be a way to put things into the
container and get things back out of the container. To
add a brand-new element on the end of a vector, you use the member
function push_back( ).
(Remember that, since it’s a member function, you use a
‘.’ to call it for a particular object.) The reason the name
of this member function might seem a bit verbose –
push_back( ) instead of something simpler like “put”
– is because there are other containers and other member functions for
putting new elements into containers. For example, there is an
insert( ) member
function to put something in the middle of a container. vector supports
this but its use is more complicated and we won’t need to explore it until
Volume 2 of the book. There’s also a
push_front( ) (not part of vector) to
put things at the beginning. There are many more member functions in
vector and many more containers in the Standard C++ Library, but
you’ll be surprised at how much you can do just knowing about a few simple
features.
So you can put new elements into a
vector with push_back( ), but how do you get these elements
back out again? This solution is more clever and elegant – operator
overloading is used to make the vector look like an array. The
array (which will be described more fully in the next chapter) is a data type
that is available in virtually every programming language so you should already
be somewhat familiar with it. Arrays are
aggregates, which means they consist of a number of
elements clumped together. The distinguishing characteristic of an array is that
these elements are the same size and are arranged to be one right after the
other. Most importantly, these elements can be selected by
“indexing,” which means you can say “I want element number
n” and that element will be produced, usually quickly. Although there are
exceptions in programming languages, the indexing is normally achieved using
square brackets, so if you have an array a and you want to produce
element five, you say a[4] (note that
indexing always starts at
zero).
This very compact and powerful indexing
notation is incorporated into the vector using operator overloading, just
like ‘<<’ and ‘>>’ were
incorporated into iostreams. Again, you don’t need to know how the
overloading was implemented – that’s saved for a later chapter
– but it’s helpful if you’re aware that there’s some
magic going on under the covers in order to make the [ ] work with
vector.
With that in mind, you can now see a
program that uses vector. To use a vector, you include the header
file <vector>:
//: C02:Fillvector.cpp
// Copy an entire file into a vector of string
#include <string>
#include <iostream>
#include <fstream>
#include <vector>
using namespace std;

int main() {
  vector<string> v;
  ifstream in("Fillvector.cpp");
  string line;
  while(getline(in, line))
    v.push_back(line); // Add the line to the end
  // Add line numbers:
  for(int i = 0; i < v.size(); i++)
    cout << i << ": " << v[i] << endl;
} ///:~
Much of this program is similar to the
previous one; a file is opened and lines are read into string objects one
at a time. However, these string objects are pushed onto the back of the
vector v. Once the while loop completes, the entire file is
resident in memory, inside v.
The next statement in the program is
called a
for
loop. It is similar to a while loop except that it adds some extra
control. After the for, there is a “control
expression” inside of parentheses, just like the while loop.
However, this control expression is in three parts: a part which initializes,
one that tests to see if we should exit the loop, and one that changes
something, typically to step through a sequence of items. This program shows the
for loop in the way you’ll see it most commonly used: the
initialization part int i = 0 creates an integer
i to use as a loop counter and gives it an initial value of zero. The
testing portion says that to stay in the loop, i should be less than the
number of elements in the vector v. (This is produced using the member
function size( ), which I just sort of slipped in here, but you must
admit it has a fairly obvious meaning.) The final portion uses a shorthand for C
and C++, the
“auto-increment”
operator, to add one to the value of i. Effectively, i++ says
“get the value of i, add one to it, and put the result back into
i.” Thus, the total effect of the for loop is to take a variable
i and march it through the values from zero to one less than the size of
the vector. For each value of i, the cout statement is
executed and this builds a line that consists of the value of i
(magically converted to a character array by cout), a colon and a space,
the line from the file, and a newline provided by endl. When you compile
and run it you’ll see the effect is to add line numbers to the
file.
Because of the way that the
‘>>’
operator works with iostreams, you can easily modify the program above so that
it breaks up the input into
whitespace-separated words instead of lines:
//: C02:GetWords.cpp
// Break a file into whitespace-separated words
#include <string>
#include <iostream>
#include <fstream>
#include <vector>
using namespace std;

int main() {
  vector<string> words;
  ifstream in("GetWords.cpp");
  string word;
  while(in >> word)
    words.push_back(word);
  for(int i = 0; i < words.size(); i++)
    cout << words[i] << endl;
} ///:~
The expression
while(in >> word)
is what gets the input one
“word” at a time, and when this expression evaluates to
“false” it means the end of the file has been reached. Of course,
delimiting words by whitespace is quite crude, but it makes for a simple
example. Later in the book you’ll see more sophisticated examples that let
you break up input just about any way you’d like.
To demonstrate how easy it is to use a
vector with any type, here’s an example that creates a
vector<int>:
//: C02:Intvector.cpp
// Creating a vector that holds integers
#include <iostream>
#include <vector>
using namespace std;

int main() {
  vector<int> v;
  for(int i = 0; i < 10; i++)
    v.push_back(i);
  for(int i = 0; i < v.size(); i++)
    cout << v[i] << ", ";
  cout << endl;
  for(int i = 0; i < v.size(); i++)
    v[i] = v[i] * 10; // Assignment
  for(int i = 0; i < v.size(); i++)
    cout << v[i] << ", ";
  cout << endl;
} ///:~
To create a vector that holds a
different type, you just put that type in as the template argument (the argument
in angle brackets). Templates and well-designed template libraries are intended
to be exactly this easy to use.
This example goes on to demonstrate
another essential feature of vector. In the expression
v[i] = v[i] * 10;
you can see that the vector is not
limited to only putting things in and getting things out. You also have the
ability to assign to (and thus change) any element of a
vector, also through the use of the
square-brackets indexing operator. This means that vector is a
general-purpose, flexible “scratchpad” for working with collections
of objects, and we will definitely make use of it in coming
chapters.
The intent of this chapter is to show you
how easy object-oriented programming can be – if someone else has
gone to the work of defining the objects for you. In that case, you include a
header file, create the objects, and send messages to them. If the types you are
using are powerful and well-designed, then you won’t have to do much work
and your resulting program will also be powerful.
In the process of showing the ease of OOP
when using library classes, this chapter also introduced some of the most basic
and useful types in the Standard C++ library: the family of iostreams (in
particular, those that read from and write to the console and files), the
string class, and the vector template. You’ve seen how
straightforward it is to use these and can now probably imagine many things you
can accomplish with them, but there’s actually a lot more that
they’re capable
of[29]. Even though
we’ll only be using a limited subset of the functionality of these tools
in the early part of the book, they nonetheless provide a large step up from the
primitiveness of learning a low-level language like C. And while learning the
low-level aspects of C is educational, it’s also time consuming. In the
end, you’ll be much more productive if you’ve got objects to manage
the low-level issues. After all, the whole point of OOP is to hide the
details so you can “paint with a bigger brush.”
However, as high-level as OOP tries to
be, there are some fundamental aspects of C that you can’t avoid knowing,
and these will be covered in the next
chapter.
Solutions to selected exercises
can be found in the electronic document The Thinking in C++ Annotated
Solution Guide, available for a small fee from
http://www.BruceEckel.com
[25]
The boundary between compilers and interpreters can tend to become a bit fuzzy,
especially with Python, which has many of the features and power of a compiled
language but the quick turnaround of an interpreted language.
[26]
Python is again an exception, since it also provides separate
compilation.
[27]
I would recommend using Perl or Python to automate this task as part of your
library-packaging process (see www.Perl.org or www.Python.org).
[28]
There are actually a number of variants of getline( ), which will be
discussed thoroughly in the iostreams chapter in Volume 2.
[29]
If you’re particularly eager to see all the things that can be done with
these and other Standard library components, see Volume 2 of this book at
www.BruceEckel.com, and also www.dinkumware.com.