ANSI/ISO C++ Professional Programmer's Handbook

8 Namespaces

by Danny Kalev

The Rationale Behind Namespaces
A Brief Historical Background
- Large-Scale Projects Are More Susceptible to Name Clashes
Properties of Namespaces
Namespace Utilization Policy in Large-Scale Projects
Namespaces and Version Control
- Namespaces Do not Incur Additional Overhead
The Interaction of Namespaces with Other Language Features
Restrictions on Namespaces
- Namespace std Can Not Be Modified
- User-Defined new and delete Cannot Be Declared in a Namespace
Conclusions

Namespaces were introduced to the C++ Standard in 1995. This chapter explains what namespaces are and why they were added to the language. You will see how namespaces can avoid name conflicts and how they facilitate configuration management and version control in large-scale projects. Finally, you will learn how namespaces interact with other language features.

The Rationale Behind Namespaces

In order to understand why namespaces were added to the language in the first place, here's an analogy: Imagine that the file system on your computer did not have directories and subdirectories at all. All files would be stored in a flat repository, visible all the time to every user and application. Consequently, extreme difficulties would arise: Filenames would clash (with some systems limiting a filename to eight characters, plus three for the extension, this is even more likely to happen), and simple actions such as listing, copying, or searching files would be much more difficult. In addition, security and authorization restrictions would be severely compromised.

Namespaces in C++ are analogous to directories. They can be nested easily, they protect your code from name conflicts, they enable you to hide declarations, and they do not incur any runtime or memory overhead. Most of the components of the C++ Standard Library are grouped under namespace std. Namespace std is subdivided into additional namespaces such as std::rel_ops, which contains generic definitions of the relational operators !=, >, <=, and >=.
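The directory analogy can be sketched with a nested namespace. The names company, net, and db below are hypothetical and serve only to illustrate the idea; the two functions called name() coexist because each lives in its own "subdirectory":

```cpp
#include <string>

namespace company            // hypothetical outer namespace
{
  namespace net              // nested namespace, comparable to a subdirectory
  {
    inline std::string name() { return "company::net"; }
  }
  namespace db               // a sibling namespace may reuse the same name
  {
    inline std::string name() { return "company::db"; }
  }
}

// The fully qualified name selects the intended function.
inline std::string net_name() { return company::net::name(); }
inline std::string db_name()  { return company::db::name(); }
```

Neither function hides the other, so no affixes are needed to keep them apart.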

A Brief Historical Background

In the early 1990s, when C++ was gaining popularity as a general purpose programming language, many vendors were shipping proprietary implementations of various component classes. Class libraries for string manipulations, mathematical functions, and data containers were integral parts of frameworks such as MFC, STL, OWL, and others. The proliferation of reusable components caused a name-clashing problem. A class named vector, for instance, might appear in a mathematical library and in another container library that were both used at the same time; or a class named string might be found in almost every framework and class library. It was impossible for the compiler to distinguish between different classes that had identical names. Similarly, linkers could not cope with identical names of member functions of classes with indistinguishable names. For example, a member function

vector::operator==(const vector&);

might be defined in two different classes -- the first might be a class of a mathematical library, whereas the other might belong to some container library.

Large-Scale Projects Are More Susceptible to Name Clashes

Name-clashes are not confined to third party software libraries. In large-scale software projects, short and elegant names for classes, functions, and constants can also cause name conflicts because it is likely that the same name might be used more than once to indicate different entities by different developers. In the pre-namespace era, the only workaround was to use various affixes in identifiers' names. This practice, however, is tedious and error prone. Consider the following:

class string  // short but dangerous. someone else may have picked
              // this name already...
{
    //...
};
class excelSoftCompany_string   // a long name is safer but tedious.
                                // A nightmare if company changes its name...
{
    //...
};

Namespaces enable you to use convenient, short, and intelligible names safely. Instead of repeating the unwieldy affixes time after time, you can group your declarations in a namespace and factor out the recurring affix as follows:

//file excelSoftCompany.h
namespace excelSoftCompany { // a namespace definition
    class string {/*..*/};
    class vector {/*..*/};
}

Namespace members, like class members, can be defined separately from their declarations. For example


#include <iostream>
using namespace std;
namespace A
{
  void f(); //declaration
}
void A::f()    //definition outside the namespace body; it can reside in a separate file
{
  cout<<"in f"<<endl;
}
int main()
{
  A::f();
  return 0;
}

Properties of Namespaces

Namespaces are more than just name containers. They were designed to allow fast and simple migration of legacy code without inflicting any overhead. Namespaces have several properties that facilitate their usage. The following sections discuss these properties.

Fully Qualified Names

A namespace is a scope in which declarations and definitions are grouped together. In order to refer to any of these from another scope, a fully qualified name is required. A fully qualified name of an identifier consists of its namespaces, followed by a scope resolution operator (::), its class name, and, finally, the identifier itself. Because both namespaces and classes can be nested, the resulting name can be rather long -- but it ensures unique identification:

std::string::size_type maxPossibleLength =
  std::string::npos;  // a fully qualified name. npos is a member of string;
                      // string belongs to namespace std
int *p = ::new int;   // distinguish global new from overloaded new

However, repeating the fully qualified name is tedious and less readable. Instead, you can use a using declaration or a using directive.

A using Declaration and a using Directive

A using declaration consists of the keyword using, followed by a namespace::member. It instructs the compiler to locate every occurrence of a certain identifier (type, operator, function, constant, and so on) in the specified namespace, as if the fully qualified name were supplied. For example

#include <vector>  //STL vector;  defined in namespace std
int main()
{
  using std::vector;  //using declaration; every occurrence of vector
                      //is looked up in std
  vector <int> vi;
  return 0;
}

A using directive, on the other hand, renders all the names of a specified namespace accessible in the scope of the directive. It consists of the following sequence: using namespace, followed by a namespace name. For example

#include <vector>    // belongs to namespace std
#include <iostream> //iostream classes and operators are also in namespace std
int main()
{
  using namespace std; // a using-directive; all <iostream> and <vector>
                       // declarations are now accessible
  vector  <int> vi;
  vi.push_back(10);
  cout<<vi[0];
  return 0;
}

Look back at the string class example (the code is repeated here for convenience):

//file excelSoftCompany.h
namespace excelSoftCompany 
{   
  class string {/*..*/};
  class vector {/*..*/};
}

You can now access your own string class as well as the standard string class in the same program as follows:

#include <string> //  std::string
#include "excelSoftCompany.h"
int main()
{
  using namespace excelSoftCompany;
  string s; //referring to class excelSoftCompany::string
  std::string standardstr; //now instantiate an ANSI string
  return 0;
}

Namespaces Can Be Extended

The C++ standardization committee was well aware of the fact that related declarations can span across several translation units. Therefore, a namespace can be defined in parts. For example

  //file proj_const.h
namespace MyProj 
{
   enum NetProtocols
  {
      TCP_IP,
      HTTP,
      UDP
  };  // enum
}
  //file proj_classes.h
namespace MyProj
{ // extending MyProj namespace
   class RealTimeEncoder{ public: NetProtocols detect();  };
   class NetworkLink {}; // a member of MyProj, not a global class
   class UserInterface {};
}

In a separate file, the same namespace can be extended with additional declarations.

The complete namespace MyProj can be extracted from both files as follows:

  //file app.cpp
#include "proj_const.h"
#include "proj_classes.h"
int main() 
{
  using namespace MyProj;
  RealTimeEncoder encoder;
  NetProtocols protocol = encoder.detect();
  return 0;
}

Namespace Aliases

As you have observed, choosing a short name for a namespace can eventually lead to a name clash. However, very long namespaces are not easy to use. For this purpose, a namespace alias can be used. The following example defines the alias ESC for the unwieldy Excel_Software_Company namespace. Namespace aliases have other useful purposes, as you will see soon.

//file decl.h
namespace Excel_Software_Company 
{
  class Date {/*..*/};
  class Time {/*..*/};
}
//file calendar.cpp
#include "decl.h"
int main()
{
  namespace ESC = Excel_Software_Company; //ESC is an alias for 
                                          // Excel_Software_Company
  ESC::Date date;
  ESC::Time time;
  return 0;
}

Koenig Lookup

Andrew Koenig, one of the creators of C++, devised an algorithm for looking up namespace members. This algorithm, also called argument-dependent lookup, is used in all standard-compliant compilers to handle cases such as the following:

CAUTION: Please note that some existing compilers do not yet fully support Koenig lookup. Consequently, the following programs -- which rely on Koenig lookup -- might not compile under compilers that are not fully compliant to the ANSI/ISO standard in this respect.

namespace MINE
{
  class C {};
  void func(C);
}
MINE::C c; // global object of type MINE::C
int main()
{
  func( c ); // OK, MINE::func called
  return 0;
}

Neither a using declaration nor a using directive exists in the program. Still, the compiler did the right thing -- it correctly identified the unqualified name func as the function declared in namespace MINE by applying Koenig lookup.

Koenig lookup instructs the compiler to look not just at the usual places, such as the local scope, but also at the namespace that contains the argument's type. Therefore, in the following source line, the compiler detects that the object c, which is the argument of the function func(), belongs to namespace MINE. Consequently, the compiler looks at namespace MINE to locate the declaration of func(), "guessing" the programmer's intent:

func( c ); // OK, MINE::func called

Without Koenig lookup, namespaces impose an unacceptable tedium on the programmer, who has to either repeatedly specify the fully qualified names or use numerous using declarations. To push the argument in favor of Koenig lookup even further, consider the following example:

#include<iostream>
using std::cout;
int main()
{
  cout<<"hello";   //OK, operator << is brought into scope by Koenig lookup
  return 0;
}

The using declaration injects std::cout into the global scope, thereby enabling the programmer to use the nonqualified name cout. However, the overloaded << operator, as you might recall, is not a member function of cout's class. It is a nonmember function that is declared in namespace std, and which takes a std::ostream object as its first argument. Without Koenig lookup, the programmer has to write something similar to the following:

std::operator<<(cout, "hello");

Alternatively, the programmer can provide a using namespace std; directive. Neither option is desirable, however, because both clutter up code and can become a source of confusion and errors (using directives are the least favorable form for rendering names visible in the current scope because they make all the members of a namespace visible indiscriminately). Fortunately, Koenig lookup "does the right thing" and saves you from this tedium in an elegant way.
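The same mechanism works for user-defined types. The following is a minimal sketch, assuming a hypothetical namespace geo with a class Point; it shows the operator being found through Koenig lookup and, for comparison, the explicit call that would be needed without it:

```cpp
#include <ostream>
#include <sstream>
#include <string>

namespace geo   // hypothetical namespace, for illustration only
{
  struct Point { int x; int y; };

  // nonmember operator declared in the same namespace as its argument type
  std::ostream& operator<<(std::ostream& os, const Point& p)
  {
    return os << '(' << p.x << ", " << p.y << ')';
  }
}

std::string describe(const geo::Point& p)
{
  std::ostringstream out;
  out << p;                 // geo::operator<< is located by Koenig lookup
  geo::operator<<(out, p);  // the explicit, fully qualified equivalent
  return out.str();
}
```

Both statements in describe() call the same function; only the first reads naturally.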

Koenig lookup is applied automatically. No special directives or configuration switches are required to activate it, nor is there any way to turn it off. This fact has to be kept in mind because it can have surprising results in some circumstances. For example

namespace NS1
{
  class B{};
  void f(B);
}
void f(NS1::B);
int main()
{
  NS1::B b;
  f(b);  // ambiguous; NS1::f() or f(NS1::B)?
  return 0;
}

A Standard-compliant compiler should issue an error on ambiguity between NS1::f(NS1::B) and f(NS1::B). However, noncompliant compilers do not complain about the ambiguous call; they simply pick one of the versions of f(). This, however, might not be the version that the programmer intended. Furthermore, the problem might arise only at a later stage of the development, when additional versions of f() are added to the project -- which can stymie the compiler's lookup algorithm. This ambiguity is not confined to global names. It might also appear when two namespaces relate to one another -- for instance, if a namespace declares classes that are used as parameters of a class member function that is declared in a different namespace.
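When such an ambiguity arises, the call can be disambiguated explicitly. The sketch below reuses the names NS1, B, and f from the preceding example; the return values and the three wrapper functions are assumptions added here so that the selected overload can be observed:

```cpp
namespace NS1
{
  class B {};
  inline int f(B) { return 1; }     // candidate found through Koenig lookup
}
inline int f(NS1::B) { return 2; }  // candidate found through ordinary lookup

// A qualified name selects exactly one of the two functions.
inline int call_namespace_version() { NS1::B b; return NS1::f(b); }
inline int call_global_version()    { NS1::B b; return ::f(b); }

// Enclosing the function name in parentheses suppresses Koenig lookup,
// so only the global f() is considered.
inline int call_without_koenig_lookup() { NS1::B b; return (f)(b); }
```

A qualified name is usually the clearer of the two techniques, because the parenthesized form is easy to overlook when reading code.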

Namespaces in Practice

The conclusion that can be drawn from the previous examples is that namespaces, like other language features, must be used judiciously. For small programs that contain only a handful of classes and a few source files, namespaces are not necessary. In most cases, such programs are coded and maintained by a single programmer, and they use a limited number of components. The likelihood of name clashes in this case is rather small. If name clashes still occur, it is always possible to rename the existing classes and functions, or simply to add a namespace later.

On the other hand, large-scale projects -- as was stated previously -- are more susceptible to name clashes; therefore, they need to use namespaces systematically. It is not unusual to find projects on which hundreds of programmers on a dozen or so development teams are working together. The development of Microsoft Visual C++ 6.0, for example, lasted 18 months, and more than 1000 people were involved in the development process. Managing such a huge project requires well documented coding policies -- and namespaces are one of the tools in the arsenal.

Namespace Utilization Policy in Large-Scale Projects

To see how namespaces can be used in configuration management, imagine an online transaction processing system of an imaginary international credit card company, Unicard. The project comprises several development teams. One of them, the database administration team, is responsible for the creation and maintenance of the database tables, indexes, and access authorizations. The database team also has to provide the access routines and data objects that retrieve and manipulate the data in the database. A second team is responsible for the graphical user interface. A third team deals with the international online requests that are initiated by the cinemas, restaurants, shops, and so on where tourists pay with their international Unicard. Every purchase of a cinema ticket, piece of jewelry, or art book has to be confirmed by Unicard before the card owner is charged. The confirmation process involves checking for the validity of the card, its expiration date, and the card owner's balance. A similar confirmation procedure is required for domestic purchases. However, international confirmation requests are transmitted via satellite, whereas domestic confirmations are usually done on the telephone.

In software projects, code reuse is paramount. Because the same business logic is used for both domestic and international confirmations, the same database access objects need to be used to retrieve the relevant information and perform the necessary computations. Still, an international confirmation also involves a sophisticated communication stack that receives the request that is transmitted via satellite, decrypts it, and returns an encrypted response to the sender. A typical implementation of satellite-based confirmation application can be achieved by means of combining the database access objects with the necessary communication objects that encapsulate protocols, communication layers, priority management, message queuing, encryption, and decryption. It is not difficult to imagine a name conflict resulting from the simultaneous use of the communication components and the database access objects.

For example, two objects -- one encapsulating a database connection and the other referring to a satellite connection -- can have an identical name: Connection. If, however, communication software components and database access objects are declared in two distinct namespaces, the potential of name clashes is minimized. Therefore, com::Connection and dba::Connection can be used in the same application simultaneously. A systematic approach can be based on allocating a separate namespace for every team in a project in which all the components are declared. Such a policy can help you avoid name clashes among different teams and third party code used in the project.
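A per-team namespace policy of this kind might look as follows. This is only a sketch: the member function kind() and the function confirm_international() are hypothetical placeholders, not part of any real Unicard code:

```cpp
#include <string>

namespace dba  // owned by the database administration team
{
  class Connection
  {
  public:
    std::string kind() const { return "database"; }
  };
}

namespace com  // owned by the communication team
{
  class Connection
  {
  public:
    std::string kind() const { return "satellite"; }
  };
}

// Both Connection classes are used side by side without a clash.
std::string confirm_international()
{
  dba::Connection db;
  com::Connection link;
  return db.kind() + "+" + link.kind();
}
```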

Namespaces and Version Control

Successful software projects do not end with the product's rollout. In most projects, new versions that are based on their predecessors are periodically released. Moreover, previous versions have to be supported, patched, and adjusted to operate with new operating systems, locales, and hardware. Web browsers, commercial databases, word processors, and multimedia tools are examples of such products. It is often the case that the same development team has to support several versions of the same software product. A considerable amount of software can be shared among different versions of the same product, but each version also has its specific components. Namespace aliases can be used in these cases to switch swiftly from one version to another.

Continuous projects in general have a pool of infrastructure software components that are used ubiquitously. In addition, every version has its private pool of specialized components. Namespace aliases can provide dynamic namespaces; that is, a namespace alias can point at a given time to a namespace of version X and, at another time, it can refer to a different namespace. For example

namespace ver_3_11  //16 bit
{  
  class Winsock{/*..*/};
  class FileSystem{/*..*/};
}
namespace ver_95 //32 bit
{  
  class Winsock{/*..*/};
  class FileSystem{/*..*/};
}
int main()//implementing 16 bit release
{ 
  namespace current = ver_3_11; // current is an alias of ver_3_11
  using current::Winsock;
  using current::FileSystem;
  FileSystem  fs; // ver_3_11::FileSystem
  //...
  return 0;
}

In this example, the alias current is a symbol that can refer to either ver_3_11 or ver_95. To switch to a different version, the programmer only has to change the definition of the alias so that it names a different namespace, and then recompile.
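One possible way to centralize the switch is to define the alias once, under the control of a build flag. The macro BUILD_16_BIT and the function word_size() below are assumptions made for this sketch; the namespace names are taken from the example above:

```cpp
namespace ver_3_11 { inline int word_size() { return 16; } }  // 16 bit
namespace ver_95   { inline int word_size() { return 32; } }  // 32 bit

// BUILD_16_BIT is a hypothetical macro that the build system would define
// when the 16 bit release is being compiled.
#ifdef BUILD_16_BIT
namespace current = ver_3_11;
#else
namespace current = ver_95;
#endif

// Code written against the alias does not change between releases.
inline int target_word_size() { return current::word_size(); }
```

Because the alias is resolved at compile time, the selection costs nothing at runtime; the trade-off is that each release has to be built separately.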

Namespaces Do not Incur Additional Overhead

Namespace resolution, including Koenig lookup, is performed statically, at compile time. Namespaces are implemented by means of name mangling, whereby the compiler combines the function name with its list of arguments, its class name, and its namespace in order to create a unique name for it (see Chapter 13, "C Language Compatibility Issues," for a detailed account of name mangling). Therefore, namespaces do not incur any runtime or memory overhead.

The Interaction of Namespaces with Other Language Features

Namespaces interact with other features of the language and affect programming techniques. Namespaces made some features in C++ superfluous or undesirable.

Scope Resolution Operator Should Not Be Used To Designate Global Names

In some frameworks (MFC, for instance), it is customary to add the scope resolution operator, ::, before a global function's name to mark it explicitly as a function that is not a class member (as in the following example):

void String::operator = (const String& other)
{
  ::strcpy (this->buffer, other.getBuff()); 
}

This practice is not recommended. Many of the standard functions that were once global are now grouped inside namespaces. For example, strcpy now belongs to namespace std, as do most of the Standard Library's functions. Preceding these functions with the scope resolution operator might confuse the lookup algorithm of the compiler; furthermore, doing so undermines the very idea of partitioning the global namespace. Therefore, it is recommended that you leave the scope resolution operator off the function's name.
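A version of the assignment operator that follows this recommendation might look like the sketch below. The String class shown here is a hypothetical, minimal stand-in with a fixed-size buffer; it includes <cstring> and names the function as a member of namespace std instead of prefixing it with a bare scope resolution operator:

```cpp
#include <cstring>

class String   // hypothetical minimal class, for illustration only
{
public:
  String(const char* s)
  {
    std::strncpy(buffer, s, sizeof buffer - 1);
    buffer[sizeof buffer - 1] = '\0';   // guarantee termination
  }
  String& operator=(const String& other)
  {
    if (this != &other)
      std::strcpy(buffer, other.getBuff());  // std::strcpy, not ::strcpy
    return *this;
  }
  const char* getBuff() const { return buffer; }
private:
  char buffer[64];
};
```

A using declaration such as using std::strcpy; would work equally well and would allow the unqualified name strcpy to be used.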

Turning an External Function into A File-Local Function

In standard C, a nonlocal identifier that is declared to be static has internal linkage, which means that it is accessible only from within the translation unit (source file) in which it is declared (see also Chapter 2, "Standard Briefing: The Latest Addenda to ANSI/ISO C++"). This technique is used to support information hiding (as in the following example):

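    //File hidden.c
#include <stdio.h>
static void decipher(FILE *f); // accessible only from within this file
    // now use this function in the current source file
void load(void) // illustrative caller
{
  FILE *f = fopen("passwords.bin", "rb");
  if (f) { decipher(f); fclose(f); }
}
    //end of file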

Although it is still supported in C++, this convention is now considered a deprecated feature. Future releases of your compiler might issue a warning message when they find a static identifier that is not a member of a class. In order to make a function accessible only from within its translation unit, use an unnamed namespace instead. The following example demonstrates the process:

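//File hidden.cpp
#include <cstdio>
namespace        //unnamed
{
  void decipher(std::FILE *f);  // accessible only from within this file
}
  //now use the function in the current source file.
  //No using declarations or directives are needed
void load() // illustrative caller
{
  std::FILE *f = std::fopen("passwords.bin", "rb");
  if (f) { decipher(f); std::fclose(f); }
}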

Although names in an unnamed namespace might have external linkage, they can never be seen from any other translation unit; the net effect of this is that the names of an unnamed namespace behave as if they had internal linkage. If you declare another function with the same name in an unnamed namespace of another file, the two functions are hidden from one another, and their names do not clash.
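A complete, self-contained variant of this technique is sketched below. The body of decipher() is a toy transformation (it merely reverses its argument) and read_secret() is a hypothetical public function; both are assumptions made for illustration:

```cpp
#include <string>

namespace   // unnamed: its members are visible only in this translation unit
{
  // toy transformation for illustration only: reverse the text
  std::string decipher(const std::string& text)
  {
    return std::string(text.rbegin(), text.rend());
  }
}

// The only function that other translation units can call.
std::string read_secret()
{
  return decipher("terces");  // no qualification or using directive needed
}
```

Another source file can define its own decipher() in an unnamed namespace without conflicting with this one.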

Standard Header Names

All Standard C++ header files now have to be included as follows:

#include <iostream> //note: no ".h" extension

That is, the .h extension is omitted. Standard C header files also obey this convention, with the addition of the letter c to their name. Therefore, a C standard header that was formerly named <xxx.h> is now <cxxx>. For example

#include <cassert> //formerly: <assert.h>  note the prefix 'c' and the
                   //omission of ".h"

The older convention for C headers, <xxx.h>, is still supported; however, it is now considered deprecated and, therefore, should not be used in new C++ code. The reason for this is that C <xxx.h> headers inject their declarations into the global namespace. In C++, however, most standard declarations are grouped under namespace std, as are the declarations of the <cxxx> Standard C headers. No inference should be drawn from the naming convention about the physical location of a header file or its underlying name. In fact, most implementations share a single physical file for the <xxx.h> and its corresponding <cxxx> notation. This is feasible due to some under-the-hood preprocessor tricks. Recall that you need to have a using declaration, a using directive, or a fully qualified name in order to access the declarations in the new style standard headers. For example

#include <cstdio>
using namespace std;  
void f()
{
    printf ("Hello World\n");
}

Restrictions on Namespaces

The C++ Standard defines several restrictions on the use of namespaces. These restrictions are meant to avert anomalies or ambiguities that can create havoc in the language.

Namespace std Can Not Be Modified

Generally, namespaces are open, so it is perfectly legal to expand existing namespaces with additional declarations and definitions across several files. The only exception to the rule is namespace std. According to the Standard, the result of modifying namespace std with additional declarations -- let alone the removal of existing ones -- yields undefined behavior, and is to be avoided. This restriction might seem arbitrary, but it's just common sense -- any attempt to tamper with namespace std undermines the very concept of a namespace dedicated exclusively to standard declarations.

User-Defined new and delete Cannot Be Declared in a Namespace

The Standard prohibits declarations of new and delete operators in a namespace. To see why, consider the following example:

char *pc; //global
namespace A
{
  void* operator new ( std::size_t );
  void operator delete ( void * );
  void func ()
  {
    pc = new char ( 'a'); //using A::new
  }
} //A
void f() { delete pc; } // call A::delete or ::delete?

Some programmers might expect the operator A::delete to be selected because it matches the operator new that was used to allocate the storage; others might expect the standard operator delete to be called because A::delete is not visible in function f(). By prohibiting declarations of new and delete in a namespace altogether, C++ avoids any such ambiguities.
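The restriction applies to namespace scope only; user-defined versions of new and delete can still be declared as class members (or in the global scope). The following sketch uses a hypothetical class, Tracked, whose allocation functions count live objects:

```cpp
#include <cstddef>
#include <cstdlib>
#include <new>

class Tracked   // hypothetical class, for illustration only
{
public:
  static int live;  // number of objects currently allocated with new

  static void* operator new(std::size_t size)
  {
    void* p = std::malloc(size);
    if (!p) throw std::bad_alloc();
    ++live;
    return p;
  }
  static void operator delete(void* p)
  {
    if (p) { --live; std::free(p); }
  }
  int value;
};
int Tracked::live = 0;
```

Because the operators are class members, the matching pair is always selected through the static type of the object, so the question raised by the example above does not arise.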

Conclusions

Namespaces were the latest addition to the C++ Standard. Therefore, some compilers do not yet support this feature. However, all compiler vendors will incorporate namespace support in the near future. The importance of namespaces cannot be over-emphasized. As you have seen, any nontrivial C++ program utilizes components of the Standard Template Library, the iostream library, and other standard header files -- all of which are now namespace members.

Large-scale software projects can use namespaces cleverly to avoid common pitfalls and to facilitate version control, as you have seen.

C++ offers three methods for injecting a namespace constituent into the current scope. The first is a using directive, which renders all the members of a namespace visible in the current scope. The second is a using declaration, which is more selective and enables the injection of a single component from a namespace. Finally, a fully qualified name uniquely identifies a namespace member. In addition, the argument-dependent lookup, or Koenig lookup, captures the programmer's intention without forcing him or her to use wearying references to a namespace.
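The three methods can be compared side by side. The function names in this sketch are hypothetical; each function uses a different method to refer to the standard containers:

```cpp
#include <cstddef>
#include <string>
#include <vector>

// 1. a using directive: every name of std becomes visible in this scope
std::size_t with_using_directive()
{
  using namespace std;
  vector<string> v;
  v.push_back("a");
  return v.size();
}

// 2. a using declaration: only vector is injected
std::size_t with_using_declaration()
{
  using std::vector;
  vector<int> v(2);
  return v.size();
}

// 3. a fully qualified name: nothing is injected
std::size_t with_qualified_name()
{
  std::vector<int> v(3);
  return v.size();
}
```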