[Note] More Effective C++
This book shows how to design and implement C++ software that is more effective: more likely to behave correctly; more robust in the face of exceptions; more efficient; more portable; makes better use of language features; adapts to change more gracefully; works better in a mixed-language environment; is easier to use correctly; is harder to use incorrectly. In short, software that's just better.
Item 1: Distinguish between pointers and references.
- There is no such thing as a null reference. A reference must always refer to some object.
- It can be more efficient to use references than to use pointers because there's no need to test the validity of a reference before using it. Pointers, on the other hand, should generally be tested against null.
- Pointers may be reassigned to refer to different objects. A reference, however, always refers to the object with which it is initialized.
- When you're implementing certain operators that need to return something that can be used as the target of an assignment (e.g. operator[]), you should use a reference.
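A minimal sketch of the points above. `IntArray` and `lengthOf` are hypothetical names invented for illustration; they are not from the book.

```cpp
#include <cstddef>
#include <string>

// Hypothetical fixed-size array used only for illustration.
class IntArray {
public:
    // Returning a reference lets callers write a[i] = value.
    int& operator[](std::size_t i) { return data_[i]; }
    const int& operator[](std::size_t i) const { return data_[i]; }
private:
    int data_[4] = {0, 0, 0, 0};
};

// A reference parameter always refers to an object: no null check needed.
std::size_t lengthOf(const std::string& s) { return s.size(); }

// A pointer parameter may be null, so it has to be tested first.
std::size_t lengthOf(const std::string* s) { return s ? s->size() : 0; }
```

With these definitions, `a[2] = 7;` assigns through the returned reference, and only the pointer overload of `lengthOf` needs the null test.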
Item 2: Prefer C++-style casts.
- C++-style casts specify more precisely the purpose of each cast, and they are easy to find.
  - static_cast has basically the same power and meaning as the general-purpose C-style cast.
  - const_cast is used to cast away the constness or volatileness of an expression.
  - dynamic_cast is used to perform safe casts down or across an inheritance hierarchy (cannot be applied to types lacking virtual functions).
  - reinterpret_cast is used to perform type conversions whose result is nearly always implementation-defined (rarely portable).
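One use of each cast might look like the sketch below. The types and function names (`Shape`, `ratio`, `bump`, `isCircle`, `addressOf`) are made up for this example.

```cpp
#include <cstdint>

struct Shape { virtual ~Shape() {} };
struct Circle : Shape { double radius = 1.0; };
struct Square : Shape { double side = 2.0; };

// static_cast: ordinary value conversion (int -> double before dividing).
double ratio(int a, int b) { return static_cast<double>(a) / b; }

// const_cast: removes constness; writing through the result is only safe
// if the underlying object was not declared const.
void bump(const int& x) { ++const_cast<int&>(x); }

// dynamic_cast: checked downcast; yields a null pointer on failure.
bool isCircle(Shape* s) { return dynamic_cast<Circle*>(s) != nullptr; }

// reinterpret_cast: implementation-defined mapping of a pointer to an integer.
std::uintptr_t addressOf(const void* p) {
    return reinterpret_cast<std::uintptr_t>(p);
}
```

Searching a code base for `_cast` finds every one of these conversions, which is not possible with C-style casts.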
Item 3: Never treat arrays polymorphically.
- Manipulating arrays of derived class objects through base class pointers and references almost never works the way you want it to.
- The result of deleting an array of derived class objects through a base class pointer is undefined.
- Polymorphism and pointer arithmetic simply don't mix. Array operations almost always involve pointer arithmetic, so arrays and polymorphism don't mix.
- You're unlikely to make the mistake of treating an array polymorphically if you avoid having a concrete class inherit from another concrete class.
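The sketch below shows why the pointer arithmetic goes wrong: the compiler computes element addresses from the static type. The class names follow the book's BST example, but the members are assumptions made here, and the size relationship holds on mainstream implementations rather than being guaranteed by the standard.

```cpp
#include <cstddef>

struct BST { virtual ~BST() {} int key = 0; };
struct BalancedBST : BST { int balance = 0; int height = 0; };

// Pointer arithmetic on a BST* advances by sizeof(BST) bytes, but the
// elements of a BalancedBST array are sizeof(BalancedBST) bytes apart.
constexpr std::size_t baseStride    = sizeof(BST);
constexpr std::size_t derivedStride = sizeof(BalancedBST);

// Safe alternative: index through a pointer of the array's real element type.
int sumKeys(const BalancedBST* array, std::size_t n) {
    int total = 0;
    for (std::size_t i = 0; i < n; ++i) total += array[i].key;
    return total;
}
```

Passing a `BalancedBST` array to a function that takes `const BST*` and indexes it would step through memory with the wrong stride, which is undefined behavior.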
Item 4: Avoid gratuitous default constructors.
- If a class lacks a default constructor, there are restrictions on how you can use that class. Its use may be problematic in three contexts:
  - There is, in general, no way to specify constructor arguments for objects in arrays.
  - Such classes are ineligible for use with many template-based container classes.
  - A virtual base class lacking a default constructor requires that all classes derived from that class - no matter how far removed - must know about, understand the meaning of, and provide for the virtual base class's constructors' arguments.
- Inclusion of meaningless default constructors affects the efficiency of classes.
Item 5: Be wary of user-defined conversion functions.
- Two kinds of functions allow compilers to perform implicit conversions: single-argument constructors and implicit type conversion operators.
- Granting compilers license to perform implicit type conversions usually leads to more harm than good, so don't provide conversion functions unless you're sure you want them.
- For implicit type conversion operators, replace the operators with equivalent functions that don't have the syntactically magic names, e.g. asDouble(), which must be called explicitly.
- For single-argument constructors:
  - use the explicit keyword, or
  - use proxy classes, relying on the rule that no sequence of conversions is allowed to contain more than one user-defined conversion.
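A short sketch of the two simplest remedies, a named conversion function and an explicit constructor. `Rational` and `asDouble()` come from the notes; `Buffer` and `capacityOf` are hypothetical names used only here.

```cpp
#include <cstddef>

class Rational {
public:
    Rational(int numerator, int denominator)
        : n_(numerator), d_(denominator) {}
    // Named function instead of operator double(): must be called explicitly.
    double asDouble() const { return static_cast<double>(n_) / d_; }
private:
    int n_, d_;
};

class Buffer {
public:
    // explicit: "Buffer b = 10;" and passing 10 where a Buffer is expected
    // no longer compile.
    explicit Buffer(std::size_t size) : size_(size) {}
    std::size_t size() const { return size_; }
private:
    std::size_t size_;
};

std::size_t capacityOf(const Buffer& b) { return b.size(); }
```

`capacityOf(Buffer(10))` compiles because the conversion is spelled out, whereas `capacityOf(10)` is rejected.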
Item 6: Distinguish between prefix and postfix forms of increment and decrement operators.
- Postfix forms take an int argument, and compilers silently pass 0 as that int when those functions are called.
- Prefix forms return a reference.
- Postfix forms return a const object to prohibit double application (e.g. i++++), which is undesirable for two reasons:
  - It's inconsistent with the behavior of the built-in types.
  - It almost never does what clients expect it to, because i would be incremented only once.
- When dealing with user-defined types, prefix increment should be used whenever possible, because it's inherently more efficient.
- Postfix increment and decrement should be implemented in terms of their prefix counterparts.
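The conventions above can be sketched as follows. `UPInt` is the class name the book uses for its unlimited-precision integer; the `int` member here is a stand-in assumed for brevity.

```cpp
class UPInt {
public:
    explicit UPInt(int v = 0) : value_(v) {}

    // Prefix: increment, then return *this by reference.
    UPInt& operator++() { ++value_; return *this; }

    // Postfix: the unnamed int parameter only distinguishes the signature.
    // Returns the old value as a const object, so i++++ does not compile.
    const UPInt operator++(int) {
        UPInt old = *this;  // copy of the current state
        ++(*this);          // reuse the prefix form
        return old;
    }

    int value() const { return value_; }
private:
    int value_;
};
```

The postfix form has to create and return `old`, which is the extra cost that makes the prefix form preferable for user-defined types.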
Item 7: Never overload &&, ||, or ,.
- C++ allows you to customize the behavior of the && and || operators for user-defined types. But if you decide to take advantage of this opportunity, be aware that you are replacing short-circuit semantics with function call semantics: both operands are always evaluated, and (before C++17) in an unspecified order.
- An expression containing a comma is evaluated by first evaluating the part of the expression to the left of the comma, then evaluating the expression to the right of the comma; the result of the overall comma expression is the value of the expression on the right. Unfortunately, you can't mimic this behavior.
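To make the loss of short-circuiting concrete, here is a small sketch that counts how often the right-hand operand is evaluated. `Flag`, `calls`, and the helper functions are hypothetical names.

```cpp
// Hypothetical wrapper type with an overloaded operator&&.
struct Flag { bool on; };

bool operator&&(const Flag& a, const Flag& b) { return a.on && b.on; }

int calls = 0;  // counts how often a right-hand operand is evaluated
Flag rightOperand() { ++calls; return Flag{true}; }
bool rightBool()    { ++calls; return true; }

// Built-in &&: the right side is skipped when the left side is false.
bool builtIn() { return false && rightBool(); }

// Overloaded &&: an ordinary function call, so both operands are evaluated.
bool overloaded() { return Flag{false} && rightOperand(); }
```

Both functions return false, but only `overloaded()` increments `calls`.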
Item 8: Understand the different meanings of new and delete.
- The new operator always does two things; you can't change its behavior in any way.
  - First, it allocates enough memory to hold an object of the type requested.
  - Second, it calls a constructor to initialize an object in the memory that was allocated.
- What you can change is how the memory for an object is allocated. The new operator calls the operator new function to perform the requisite memory allocation, and you can overload that function to change its behavior by adding additional parameters, but the first parameter must always be of type size_t.
- If you want to create an object on the heap, use the new operator. It both allocates memory and calls a constructor for the object. If you only want to allocate memory, call operator new; no constructor will be called. If you want to customize the memory allocation that takes place when heap objects are created, write your own version of operator new and use the new operator; it will automatically invoke your custom version of operator new. If you want to construct an object in memory you've already got a pointer to, use placement new.
- If you want to deal only with raw, uninitialized memory, you should bypass the new and delete operators entirely. Instead, you should call operator new to get the memory and operator delete to return it to the system.
- If you use placement new to create an object in some memory, you should avoid using the delete operator on that memory. Instead, you should undo the effect of the constructor by explicitly calling the object's destructor.
- For arrays, the new operator behaves slightly differently from the case of single-object creation.
  - Memory is allocated by the array-allocation equivalent, a function called operator new[].
  - A constructor must be called for each object in the array.
- Similarly, when the delete operator is used on an array, it calls a destructor for each array element and then calls operator delete[] to deallocate the memory.
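The separation between allocation and construction can be sketched by performing each step by hand. `Widget`, `liveCount`, and `constructInRawMemory` are illustrative names, not part of the book's code.

```cpp
#include <new>  // placement new, ::operator new, ::operator delete

struct Widget {
    explicit Widget(int id) : id(id) { ++liveCount; }
    ~Widget() { --liveCount; }
    int id;
    static int liveCount;  // number of currently constructed Widgets
};
int Widget::liveCount = 0;

// Allocates raw memory, constructs a Widget in it with placement new,
// then undoes each step separately. Returns the id read from the object.
int constructInRawMemory(int id) {
    void* raw = ::operator new(sizeof(Widget)); // memory only, no constructor
    Widget* w = new (raw) Widget(id);           // constructor only, no allocation
    int seen = w->id;
    w->~Widget();                               // destructor only
    ::operator delete(raw);                     // release the raw memory
    return seen;
}
```

Writing `delete w;` instead of the last two steps would only be appropriate if the object had been created with the ordinary new operator.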
Item 9: Use destructors to prevent resource leaks.
- Use auto_ptr objects instead of raw pointers, and you won't have to worry about heap objects not being deleted, not even when exceptions are thrown.
- Because the auto_ptr destructor uses the single-object form of delete, auto_ptr is not suitable for use with pointers to arrays of objects. If you'd like an auto_ptr-like template for arrays, you'll have to write your own. In such cases, however, it's often a better design decision to use a vector instead of an array, anyway.
- The idea behind auto_ptr - using an object to store a resource that needs to be automatically released and relying on that object's destructor to release it - applies to more than just pointer-based resources.
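A sketch of the same idea in current C++. Note that auto_ptr was deprecated in C++11 and removed in C++17, so this example substitutes std::unique_ptr, its standard replacement (which also has an array form, `std::unique_ptr<T[]>`). `Puppy`, `adopt`, and `leakFreeOnThrow` are hypothetical names.

```cpp
#include <memory>
#include <stdexcept>

struct Puppy {
    Puppy()  { ++alive; }
    ~Puppy() { --alive; }
    void process(bool fail) {
        if (fail) throw std::runtime_error("processing failed");
    }
    static int alive;  // live Puppy objects, to observe leaks
};
int Puppy::alive = 0;

// The smart pointer's destructor deletes the Puppy on every exit path,
// including the one taken when process() throws.
void adopt(bool fail) {
    std::unique_ptr<Puppy> p(new Puppy);
    p->process(fail);
}

// Returns true if an exception was thrown and nothing leaked.
bool leakFreeOnThrow() {
    try { adopt(true); }
    catch (const std::runtime_error&) { return Puppy::alive == 0; }
    return false;
}
```

With a raw pointer, `adopt` would need its own try/catch block just to call delete before rethrowing.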
Item 10: Prevent resource leaks in constructors.
- C++ destroys only fully constructed objects, and an object isn't fully constructed until its constructor has run to completion.
- Because C++ won't clean up after objects that throw exceptions during construction, you must design your constructors so that they clean up after themselves. Often, this involves simply catching all possible exceptions, executing some cleanup code, then rethrowing the exception so it continues to propagate. To avoid code duplication, move the common cleanup code into a private helper function, and have both the constructor's catch block and the destructor call it. However, this won't work for member initialization lists.
- A better solution: if you replace pointer class members with their corresponding auto_ptr objects, you fortify your constructors against resource leaks in the presence of exceptions, you eliminate the need to manually deallocate resources in destructors, and you allow const member pointers to be handled in the same graceful fashion as non-const pointers.
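A sketch of the smart-pointer-member approach, using std::unique_ptr in place of auto_ptr (which was removed in C++17). The class names echo the book's address-book example, but the bodies and the file names are assumptions made for this illustration.

```cpp
#include <memory>
#include <stdexcept>
#include <string>

struct Image {
    explicit Image(const std::string&) { ++alive; }
    ~Image() { --alive; }
    static int alive;  // live Image objects, to observe leaks
};
int Image::alive = 0;

struct AudioClip {
    explicit AudioClip(const std::string& file) {
        if (file.empty()) throw std::invalid_argument("no audio file");
    }
};

class BookEntry {
public:
    BookEntry(const std::string& imageFile, const std::string& audioFile)
        : image_(new Image(imageFile)),
          audio_(new AudioClip(audioFile))  // may throw after image_ exists
    {}
private:
    // Smart-pointer members are fully constructed subobjects, so they are
    // destroyed automatically if a later initializer throws.
    const std::unique_ptr<Image> image_;
    const std::unique_ptr<AudioClip> audio_;
};

// Returns true if construction failed and the Image was still released.
bool constructionFailsWithoutLeak() {
    try { BookEntry entry("photo.png", ""); }
    catch (const std::invalid_argument&) { return Image::alive == 0; }
    return false;
}
```

No destructor and no catch block are needed in `BookEntry`, and the members can be const because they are set in the member initialization list.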
Item 11: Prevent exceptions from leaving destructors.
There are two situations in which a destructor is called:
- The first is when an object is destroyed under "normal" conditions, e.g., when it goes out of scope or is explicitly deleted.
- The second is when an object is destroyed by the exception-handling mechanism during the stack-unwinding part of exception propagation.
You must write your destructors under the conservative assumption that an exception is active, because if control leaves a destructor due to an exception while another exception is active, C++ calls the terminate function.
Two good reasons for keeping exceptions from propagating out of destructors:
- First, it prevents terminate from being called during the stack-unwinding part of exception propagation.
- Second, it helps ensure that destructors always accomplish everything they are supposed to accomplish.
The only way to do that is by using try and catch blocks, and doing nothing in catch blocks:

    Session::~Session()
    {
        try {
            logDestruction(this);
        }
        catch (...) { }   // swallow the exception so it cannot leave the destructor
    }

Item 12: Understand how throwing an exception differs from passing a parameter or calling a virtual function.
When you call a function, control eventually returns to the call site (unless the function fails to return), but when you throw an exception, control does not return to the throw site.
C++ specifies that an object thrown as an exception is always copied (based on its static type), even if the object being thrown is not in danger of being destroyed (e.g. static objects). This helps explain another difference between parameter passing and throwing an exception: the latter is typically much slower than the former.
You'll want to use the throw; syntax to rethrow the current exception, because there's no chance that that will change the type of the exception being propagated. Furthermore, it's more efficient, because there's no need to generate a new exception object.

    catch (Widget& w)       // catch Widget exceptions
    {
        ...                 // handle the exception
        throw;              // rethrow the exception so it continues to propagate
                            // no copy, so the type won't change
    }

    catch (Widget& w)       // catch Widget exceptions
    {
        ...                 // handle the exception
        throw w;            // propagate a copy of the caught exception
                            // if w is a SpecialWidget, the copy will always be of type Widget
    }

A thrown object (which is always a temporary) may be caught by simple reference; it need not be caught by reference-to-const. Passing a temporary object to a non-const reference parameter is not allowed for function calls, but it is for exceptions.
When we catch an exception by value, we expect to pay for the creation of two copies of the thrown object, one to create the temporary that all exceptions generate, the second to copy that temporary into w. Similarly, when we catch an exception by reference, we still expect to pay for the creation of a copy of the exception: the copy that is the temporary. In contrast, when we pass function parameters by reference, no copying takes place. When throwing an exception, then, we expect to construct (and later destruct) one more copy of the thrown object than if we passed the same object to a function.
Throw by pointer is equivalent to pass by pointer. Either way, a copy of the pointer is passed. About all you need to remember is not to throw a pointer to a local object, which is the behavior the mandatory copying rule is designed to avoid.
Two kinds of conversions are applied when matching exceptions to catch clauses.
- The first is inheritance-based conversions. A catch clause for base class exceptions is allowed to handle exceptions of derived class types, too.
- The second type of allowed conversion is from a typed to an untyped pointer, so a catch clause taking a const void* pointer will catch an exception of any pointer type.
Catch clauses are always tried in the order of their appearance. In contrast, when you call a virtual function, the function invoked is the one in the class closest to the dynamic type of the object invoking the function.
Never put a catch clause for a base class before a catch clause for a derived class.
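The three behaviors described in this item (copying by static type, rethrowing with throw;, and handler ordering) can be observed with a small sketch. `Widget` and `SpecialWidget` are the names used above; the `name()` member and the three functions are assumptions added for the demonstration.

```cpp
#include <string>

struct Widget {
    virtual ~Widget() {}
    virtual std::string name() const { return "Widget"; }
};
struct SpecialWidget : Widget {
    std::string name() const override { return "SpecialWidget"; }
};

// Throws through a base-class reference: the copy is made from the static
// type (Widget), so the SpecialWidget part is sliced away.
std::string thrownTypeViaBaseReference() {
    SpecialWidget sw;
    Widget& ref = sw;
    try { throw ref; }
    catch (Widget& w) { return w.name(); }
    return std::string();  // not reached
}

// Rethrowing with "throw;" keeps the original exception object and its type.
std::string rethrownType() {
    try {
        try { throw SpecialWidget(); }
        catch (Widget&) { throw; }
    }
    catch (Widget& w) { return w.name(); }
    return std::string();  // not reached
}

// Catch clauses are tried in order, so the derived handler must come first.
std::string firstMatchingHandler() {
    try { throw SpecialWidget(); }
    catch (SpecialWidget&) { return "derived handler"; }
    catch (Widget&)        { return "base handler"; }
    return std::string();  // not reached
}
```

Swapping the two handlers in `firstMatchingHandler` would make the derived handler unreachable, which most compilers diagnose with a warning.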
Item 13: Catch exceptions by reference.
- You have three choices, just as when specifying how parameters should be passed to functions: by pointer, by value, or by reference.
- By pointer: the only way of moving exception information without copying an object, but programmers must define exception objects in a way that guarantees the objects exist after control leaves the functions throwing pointers to them (e.g. global and static objects). Do not throw a pointer to a new heap object. Furthermore, it doesn't work with the standard exception types.
- By value: eliminates questions about exception deletion and works with the standard exception types, but it requires that exception objects be copied twice each time they're thrown, and it also gives rise to the specter of the slicing problem.
- By reference: suffers from none of the problems we have discussed.
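The slicing problem with catch-by-value can be sketched with a two-level hierarchy. `Error`, `ParseError`, and the two functions are hypothetical names chosen for this example.

```cpp
#include <string>

struct Error {
    virtual ~Error() {}
    virtual std::string what() const { return "Error"; }
};
struct ParseError : Error {
    std::string what() const override { return "ParseError"; }
};

// By value: the handler receives a sliced Error copy of the thrown object.
std::string caughtByValue() {
    try { throw ParseError(); }
    catch (Error e) { return e.what(); }         // calls Error::what
    return std::string();  // not reached
}

// By reference: no slicing and no second copy.
std::string caughtByReference() {
    try { throw ParseError(); }
    catch (const Error& e) { return e.what(); }  // calls ParseError::what
    return std::string();  // not reached
}
```

Only the by-reference handler sees the derived class's behavior.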
Item 14: Use exception specifications judiciously.
- The default behavior for unexpected is to call terminate, and the default behavior for terminate is to call abort, so the default behavior for a program with a violated exception specification is to halt.
- Compilers only partially check exception usage for consistency with exception specifications. The language standard prohibits them from rejecting (though they may issue a warning) a call to a function that might violate the exception specification of the function making the call.
- There is no way to know anything about the exceptions thrown by a template's type parameters. Templates and exception specifications don't mix.
- Omit exception specifications on functions making calls to functions that themselves lack exception specifications.
- When allowing users to register callback functions, tighten the exception specification in the function typedef.
- Handle exceptions "the system" may throw, such as bad_alloc.
- C++ allows you to replace unexpected exceptions with exceptions of a different type via set_unexpected. If the unexpected function's replacement rethrows the current exception, that exception will be replaced by a new exception of the standard type bad_exception.
- Exception specifications result in unexpected being invoked even when a higher-level caller is prepared to cope with the exception that's arisen.
Item 15: Understand the costs of exception handling.
- Exception handling has costs, and you pay at least some of them even if you never use the keywords try, throw, or catch.
- Programs compiled without support for exceptions are typically both faster and smaller than their counterparts compiled with support for exceptions.
- As a rough estimate, expect your overall code size to increase by 5-10% and your runtime to go up by a similar amount if you use try blocks (assuming no exceptions are thrown, just the cost of having try blocks in your programs). To minimize this cost, you should avoid unnecessary try blocks.
- An exception specification generally incurs about the same cost as a try block.
- Compared to a normal function return, returning from a function by throwing an exception may be as much as three orders of magnitude slower. But you'll take that hit only if you throw an exception, and that should be almost never.
Item 16: Remember the 80-20 rule.
- The overall performance of your software is almost always determined by a small part of its constituent code.
- The way to identify the 20 percent of your program that is causing you heartache is to use a program profiler, which directly measures the resources you are interested in.
- Profile your software using as many data sets as possible, and ensure that each data set is representative of how the software is used by its clients (or at least its most important clients).
Item 17: Consider using lazy evaluation.
- When you employ lazy evaluation, you write your classes in such a way that they defer computations until the results of those computations are required.
- Reference Counting: don't bother to make a copy of something until you really need one. Instead, be lazy - use someone else's copy as long as you can get away with it.
- Distinguishing Reads from Writes: by using lazy evaluation and proxy classes as described in Item 30, however, we can defer the decision on whether to take read actions or write actions until we can determine which is correct.
- Lazy Fetching: read no data from disk when a large object is created. Instead, only the "shell" of an object is created, and data is retrieved from the database only when that particular data is needed inside the object.
- Lazy Expression Evaluation: avoid unnecessary numerical computations. For example, APL employed lazy evaluation to defer its computations until it knew exactly what part of a result matrix was needed, then it computed only that part.
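Lazy fetching, for instance, might be sketched as below. `LargeObject` is the book's name for the class; `loadFromDatabase`, `fetchCount`, and the single cached field are stand-ins assumed here in place of a real database.

```cpp
#include <string>

// Hypothetical stand-in for an expensive database read.
int fetchCount = 0;
std::string loadFromDatabase(int id) {
    ++fetchCount;
    return "record-" + std::to_string(id);
}

class LargeObject {
public:
    // Creates only the cheap "shell"; nothing is read yet.
    explicit LargeObject(int id) : id_(id), loaded_(false) {}

    // Fetches the field on first use only; mutable lets a const
    // accessor fill in the cached value.
    const std::string& field1() const {
        if (!loaded_) {
            field1_ = loadFromDatabase(id_);
            loaded_ = true;
        }
        return field1_;
    }
private:
    int id_;
    mutable bool loaded_;
    mutable std::string field1_;
};
```

An object whose `field1()` is never called costs no fetch at all, and repeated calls trigger only one.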
Item 18: Amortize the cost of expected computations.
- The philosophy of this item might be called over-eager evaluation: doing things before you're asked to do them.
- If you expect a computation to be requested frequently, you can lower the average cost per request by designing your data structures to handle the requests especially efficiently:
  - Caching values that have already been computed and are likely to be needed again.
  - Prefetching: the computational equivalent of a discount for buying in bulk.
- The locality of reference phenomenon: if data in one place is requested, it's quite common to want nearby data, too.
- You can often trade space for time.
- This Item is not contradictory to Item 17:
- Lazy evaluation is a technique for improving the efficiency of programs when you must support operations whose results are not always needed.
- Over-eager evaluation is a technique for improving the efficiency of programs when you must support operations whose results are almost always needed or whose results are often needed more than once.
- Both are more difficult to implement than run-of-the-mill eager evaluation, but both can yield significant performance improvements in programs whose behavioral characteristics justify the extra programming effort.
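Caching can be sketched in a few lines. `findCubicleNumber` is the function name from the book's example; the lookup function, its return value, and the counter are invented here to stand in for an expensive query.

```cpp
#include <map>
#include <string>

int lookups = 0;  // counts simulated expensive lookups

// Hypothetical expensive query, e.g. a database round trip.
int expensiveCubicleLookup(const std::string& name) {
    ++lookups;
    return static_cast<int>(name.size()) * 100;
}

// Caches every result so repeated requests cost only a map lookup.
int findCubicleNumber(const std::string& employee) {
    static std::map<std::string, int> cache;
    auto it = cache.find(employee);
    if (it != cache.end()) return it->second;  // cache hit
    int cubicle = expensiveCubicleLookup(employee);
    cache[employee] = cubicle;                 // remember for next time
    return cubicle;
}
```

The map trades memory for time, which is the space-for-time exchange mentioned above.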
Item 19: Understand the origin of temporary objects.
- True temporary objects in C++ are invisible - they don't appear in your source code. They arise whenever a non-heap object is created but not named. Such unnamed objects usually arise in one of two situations:
  - when implicit type conversions are applied to make function calls succeed, and
  - when functions return objects.
- The attendant costs of their construction and destruction can have a noticeable impact on the performance of your programs.
- Two general ways to eliminate implicit type conversions:
  - Redesign your code so conversions like these can't take place. (See Item 5)
  - Modify your software so that the conversions are unnecessary. (See Item 21)
- These conversions occur only when passing objects by value or when passing to a reference-to-const parameter. They do not occur when passing an object to a reference-to-non-const parameter.
- For most functions that return objects (except for operator+=, see Item 22), there is no way to avoid the construction and destruction of the return value. However, sometimes you can write your object-returning functions in a way that allows your compilers to optimize temporary objects out of existence (e.g. return value optimization, see Item 20).
Item 20: Facilitate the return value optimization.
- A function either has to return an object in order to offer correct behavior or it doesn't. If it does, there's no way to get rid of the object being returned.
- It is frequently possible to write functions that return objects in such a way that compilers can eliminate the cost of the temporaries. The trick is to return constructor arguments instead of objects.
- This particular optimization - eliminating a local temporary by using a function's return location (and possibly replacing that with an object at the function's call site) - is both well-known and commonly implemented. It even has a name: the return value optimization.
- Besides, you can eliminate the overhead of the call to the function by declaring that function inline.
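Returning constructor arguments looks like the sketch below. `Rational` follows the book's example; the accessor names are assumptions. Since C++17, eliding the copy in this form is guaranteed by the language rather than left to the optimizer.

```cpp
class Rational {
public:
    Rational(int n = 0, int d = 1) : n_(n), d_(d) {}
    int numerator() const { return n_; }
    int denominator() const { return d_; }
private:
    int n_, d_;
};

// Returns constructor arguments: the result is built directly from them,
// giving compilers the chance to construct it in the caller's storage.
inline const Rational operator*(const Rational& lhs, const Rational& rhs) {
    return Rational(lhs.numerator() * rhs.numerator(),
                    lhs.denominator() * rhs.denominator());
}
```

In `Rational c = a * b;` the product can be constructed straight into `c`, with no intermediate temporary.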
Item 21: Overload to avoid implicit type conversions.
- Besides implicit type conversion, there is another way to make mixed-type calls to operator+ succeed: overloading to eliminate type conversions.
- Every overloaded operator must take at least one argument of a user-defined type.
- Overloading to avoid temporaries isn't limited to operator functions. Any function taking arguments of type string, char*, complex, etc., is a reasonable candidate for overloading to eliminate type conversions.
- Still, it's important to keep the 80-20 rule (see Item 16) in mind. There is no point in implementing a slew of overloaded functions unless you have good reason to believe that it will make a noticeable improvement in the overall efficiency of the programs that use them.
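A sketch of overloading to avoid the conversion temporary. `UPInt` is the book's class name; the `int` member and the `constructed` counter are assumptions added so the effect can be observed.

```cpp
class UPInt {
public:
    UPInt(int v = 0) : value_(v) { ++constructed; }
    int value() const { return value_; }
    static int constructed;  // counts calls of the int constructor
private:
    int value_;
};
int UPInt::constructed = 0;

// One overload per mixed-type combination; each has a UPInt parameter,
// as the language requires for overloaded operators.
const UPInt operator+(const UPInt& lhs, const UPInt& rhs) {
    return UPInt(lhs.value() + rhs.value());
}
const UPInt operator+(const UPInt& lhs, int rhs) {
    return UPInt(lhs.value() + rhs);
}
const UPInt operator+(int lhs, const UPInt& rhs) {
    return UPInt(lhs + rhs.value());
}
```

With only the first overload, `a + 5` would first convert 5 into a temporary `UPInt`; with the mixed-type overloads, the only object constructed is the result.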
Item 22: Consider using op= instead of stand-alone op.
A good way to ensure that the natural relationship between the assignment version of an operator (e.g., operator+=) and the stand-alone version (e.g., operator+) exists is to implement the latter in terms of the former (see also Item 6).
In general, assignment versions of operators are more efficient than stand-alone versions, because stand-alone versions must typically return a new object, and that costs us the construction and destruction of a temporary. Assignment versions of operators write to their left-hand argument, so there is no need to generate a temporary to hold the operator's return value.
By offering assignment versions of operators as well as stand-alone versions, you allow clients of your classes to make the difficult trade-off between efficiency and convenience.
When faced with a choice between a named object and a temporary object, you may be better off using the temporary, which has always been eligible for the return value optimization (see Item 20).

    // Alternative 1: an unnamed temporary, eligible for the return value optimization
    template<class T>
    const T operator+(const T& lhs, const T& rhs)
    {
        return T(lhs) += rhs;
    }

    // Alternative 2: a named local object
    template<class T>
    const T operator+(const T& lhs, const T& rhs)
    {
        T result(lhs);
        return result += rhs;
    }

Item 23: Consider alternative libraries.
- Different libraries offering similar functionality often feature different performance trade-offs, so once you've identified the bottlenecks in your software (via profiling, see Item 16), you should see if it's possible to remove those bottlenecks by replacing one library with another.
Item 24: Understand the costs of virtual functions, multiple inheritance, virtual base classes, and RTTI.
When a virtual function is called, the code executed must correspond to the dynamic type of the object on which the function is invoked; the type of the pointer or reference to the object is immaterial. How can compilers provide this behavior efficiently? Most implementations use virtual tables (vtbls) and virtual table pointers (vptrs).
A vtbl is usually an array of pointers to functions. Each class in a program that declares or inherits virtual functions has its own vtbl, and the entries in a class's vtbl are pointers to the implementations of the virtual functions for that class. Therefore, you have to set aside space for a virtual table for each class that contains virtual functions. The size of a class's vtbl is proportional to the number of virtual functions declared for that class (including those it inherits from its base classes).
You need only one copy of a class's vtbl in your programs. Where to put it? Compiler vendors tend to fall into two camps:
- Generate a copy of the vtbl in each object file that might need it, then the linker strips out duplicate copies, leaving only a single instance of each vtbl in the final executable or library.
- Employ a heuristic to determine which object file should contain the vtbl for a class: a class's vtbl is generated in the object file containing the definition (i.e., the body) of the first non-inline non-pure virtual function in that class.
Each object whose class declares virtual functions carries with it a hidden data member that points to the virtual table for that class. This hidden data member - the vptr - is added by compilers at a location in the object known only to the compilers. Therefore, you have to pay for an extra pointer inside each object that is of a class containing virtual functions.
How do compilers determine which virtual function to call?
1. Follow the object's vptr to its vtbl. This is a simple operation, because the compilers know where to look inside the object for the vptr. As a result, this costs only an offset adjustment (to get to the vptr) and a pointer indirection (to get to the vtbl).
2. Find the pointer in the vtbl that corresponds to the function being called. This, too, is simple, because compilers assign each virtual function a unique index within the table. The cost of this step is just an offset into the vtbl array.
3. Invoke the function pointed to by the pointer located in step 2.
The cost of calling a virtual function is thus basically the same as that of calling a function through a function pointer. Virtual functions per se are not usually a performance bottleneck.
For all practical purposes, virtual functions aren't inlined. (Virtual functions can be inlined when invoked through objects, but most virtual function calls are made through pointers or references to objects, and such calls are not inlined.)
With multiple inheritance, offset calculations to find vptrs within objects become more complicated; there are multiple vptrs within a single object (one per base class); and special vtbls must be generated for base classes in addition to the stand-alone vtbls we have discussed. As a result, both the per-class and the per-object space overhead for virtual functions increases, and the runtime invocation cost grows slightly, too.
Virtual base classes may incur a cost of their own, because implementations of virtual base classes often use pointers to virtual base class parts as the means for avoiding replication, and one or more of those pointers may be stored inside your objects.
The language specification states that we're guaranteed accurate information on an object's dynamic type only if that type has at least one virtual function.
RTTI was designed to be implementable in terms of a class's vtbl. The space cost of RTTI is an additional entry in each class vtbl plus the cost of the storage for the type_info object for each class.
The following table summarizes the primary costs of virtual functions, multiple inheritance, virtual base classes, and RTTI:
| Feature | Increases Size of Objects | Increases Per-Class Data | Reduces Inlining |
| --- | --- | --- | --- |
| Virtual Functions | Yes | Yes | Yes |
| Multiple Inheritance | Yes | Yes | No |
| Virtual Base Classes | Often | Sometimes | No |
| RTTI | No | Yes | No |
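The per-object cost of virtual functions can be observed with sizeof. The sketch below uses hypothetical types; the exact sizes are implementation-specific, so only the relationship between them is meaningful, and even that is typical behavior rather than a guarantee of the standard.

```cpp
#include <cstddef>

struct Plain       { int x; void f() {} };
struct WithVirtual { int x; virtual void f() {} virtual ~WithVirtual() {} };

// On typical implementations the hidden vptr makes the polymorphic class
// larger. The number of virtual functions does not affect object size,
// because the function pointers live in the shared per-class vtbl.
constexpr std::size_t plainSize   = sizeof(Plain);
constexpr std::size_t virtualSize = sizeof(WithVirtual);

// A virtual call through a base reference dispatches on the dynamic type.
struct Base    { virtual ~Base() {} virtual int id() const { return 1; } };
struct Derived : Base { int id() const override { return 2; } };
int idOf(const Base& b) { return b.id(); }
```

Because `idOf` calls through a reference, the call generally cannot be inlined, matching the "Reduces Inlining" column.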
Item 25: Virtualizing constructors and non-member functions.
- A virtual constructor is a function that creates different types of objects depending on the input it is given.
- A virtual copy constructor returns a pointer to a new copy of the object invoking the function. It just calls its real copy constructor.
- Virtual copy constructors can take advantage of a relaxation in the rules for virtual function return types: no longer must a derived class's redefinition of a base class's virtual function declare the same return type. Instead, if the function's return type is a pointer (or a reference) to a base class, the derived class's function may return a pointer (or reference) to a class derived from that base class.
- Making non-member functions act virtual: you write virtual functions to do the work, then write a non-virtual function that does nothing but call the virtual function. To avoid incurring the cost of a function call for this syntactic sleight-of-hand, of course, you inline the non-virtual function.
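Both techniques fit in one short sketch. `NLComponent` and `TextBlock` are the book's class names; the members, the output format, and `renderCopy` are assumptions made for illustration.

```cpp
#include <memory>
#include <ostream>
#include <sstream>
#include <string>

class NLComponent {
public:
    virtual ~NLComponent() {}
    virtual NLComponent* clone() const = 0;  // virtual copy constructor
    virtual std::ostream& print(std::ostream& s) const = 0;
};

class TextBlock : public NLComponent {
public:
    explicit TextBlock(const std::string& t) : text_(t) {}
    // Covariant return type: TextBlock* instead of NLComponent*.
    TextBlock* clone() const override { return new TextBlock(*this); }
    std::ostream& print(std::ostream& s) const override {
        return s << "Text:" << text_;
    }
private:
    std::string text_;
};

// Non-member, non-virtual, inline: forwards to the virtual print().
inline std::ostream& operator<<(std::ostream& s, const NLComponent& c) {
    return c.print(s);
}

// Copies through the base interface and renders the copy.
std::string renderCopy(const NLComponent& original) {
    std::unique_ptr<NLComponent> copy(original.clone());
    std::ostringstream out;
    out << *copy;
    return out.str();
}
```

`renderCopy` never names `TextBlock`, yet both the copy and the output use the dynamic type.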
Item 26: Limiting the number of objects of a class.
Zero objects: declare the constructors of that class private.
One object:
- private constructors
- a global function declared a friend of the class (or static member function)
- a static object inside that function

    class Printer {
    public:
        static Printer& thePrinter();
        ...
    private:
        Printer();
        Printer(const Printer& rhs);
        ...
    };

    Printer& Printer::thePrinter()
    {
        static Printer p;   // created the first time the function is called
        return p;
    }

An object that's static in a class is, for all intents and purposes, always constructed (and destructed), even if it's never used. In contrast, an object that's static in a function is created the first time through the function, so if the function is never called, the object is never created. (You do, however, pay for a check each time the function is called to see whether the object needs to be created.)
C++ offers certain guarantees regarding the order of initialization of statics within a particular translation unit (i.e., a body of source code that yields a single object file), but it says nothing about the initialization order of static objects in different translation units.
If you create an inline non-member function containing a local static object, you may end up with more than one copy of the static object in your program! So don't create inline non-member functions that contain local static data.
Classes with private constructors can't be used as base classes, nor can they be embedded inside other objects (in the absence of friend declarations).
Allowing objects to come and go (i.e. only limiting the number of objects at a time):
- private constructors (to only allow objects to exist on their own)
- a static member function returning a pointer to a unique object
- object-counting

    class Printer {
    public:
        class TooManyObjects{};
        // pseudo-constructors
        static Printer* makePrinter();
        static Printer* makePrinter(const Printer& rhs);
        ~Printer();
        void submitJob(const PrintJob& job);
        void reset();
        void performSelfTest();
        ...
    private:
        static size_t numObjects;
        static const size_t maxObjects = 10;
        Printer();
        Printer(const Printer& rhs);
    };

    // Obligatory definitions of class statics
    size_t Printer::numObjects = 0;
    const size_t Printer::maxObjects;

    Printer::Printer()
    {
        if (numObjects >= maxObjects) {
            throw TooManyObjects();
        }
        // proceed with normal object construction here
        ++numObjects;
    }

    Printer::Printer(const Printer& rhs)
    {
        if (numObjects >= maxObjects) {
            throw TooManyObjects();
        }
        ...
    }

    Printer* Printer::makePrinter()
    { return new Printer; }

    Printer* Printer::makePrinter(const Printer& rhs)
    { return new Printer(rhs); }

An object-counting base class:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33template<class BeingCounted>
class Counted {
public:
class TooManyObjects{};
static int objectCount() { return numObjects; }
protected:
Counted();
Counted(const Counted& rhs);
~Counted() { --numObjects; }
private:
static int numObjects;
static const size_t maxObjects;
void init();
};
template<class BeingCounted>
int Counted<BeingCounted>::numObjects = 0;
template<class BeingCounted>
Counted<BeingCounted>::Counted()
{ init(); }
template<class BeingCounted>
Counted<BeingCounted>::Counted(const Counted<BeingCounted>&)
{ init(); }
template<class BeingCounted>
void Counted<BeingCounted>::init()
{
if (numObjects >= maxObjects) throw TooManyObjects();
++numObjects;
}
class Printer: private Counted<Printer> {
public:
// pseudo-constructors
static Printer * makePrinter();
static Printer * makePrinter(const Printer& rhs);
~Printer();
void submitJob(const PrintJob& job);
void reset();
void performSelfTest();
...
using Counted<Printer>::objectCount;
using Counted<Printer>::TooManyObjects;
private:
Printer();
Printer(const Printer& rhs);
};
// explicit specialization of the static member; standard C++ requires "template<>"
template<> const size_t Counted<Printer>::maxObjects = 10;
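To see the counting base class reject an object, here is a condensed, self-contained sketch. `Widget` and its limit of 2 are hypothetical stand-ins for `Printer`, and the constructors are left public for brevity, which the design above deliberately avoids.

```cpp
#include <cstddef>

template<class BeingCounted>
class Counted {
public:
    class TooManyObjects {};
    static std::size_t objectCount() { return numObjects; }
protected:
    Counted() { init(); }
    Counted(const Counted&) { init(); }
    ~Counted() { --numObjects; }
private:
    static std::size_t numObjects;
    static const std::size_t maxObjects;   // defined per client class
    void init()
    {
        if (numObjects >= maxObjects) throw TooManyObjects();
        ++numObjects;
    }
};

template<class BeingCounted>
std::size_t Counted<BeingCounted>::numObjects = 0;

// Hypothetical client class, limited to two live objects.
class Widget : private Counted<Widget> {
public:
    using Counted<Widget>::objectCount;
    using Counted<Widget>::TooManyObjects;
};

template<>
const std::size_t Counted<Widget>::maxObjects = 2;

// Returns true if creating a third Widget is rejected.
bool thirdWidgetIsRejected()
{
    Widget a, b;
    try {
        Widget c;                      // exceeds the limit
    } catch (const Widget::TooManyObjects&) {
        return true;
    }
    return false;
}
```

Because the exception leaves the `Counted` constructor before the count is incremented, the failed third object does not disturb the count, and the count drops back to zero once `a` and `b` go out of scope.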
Item 27: Requiring or prohibiting heap-based objects.
To limit object creation to the heap, make the destructor private and the constructors public, then introduce a privileged pseudo-destructor function that has access to the real destructor.
class UPNumber {
public:
UPNumber();
UPNumber(int initValue);
UPNumber(double initValue);
UPNumber(const UPNumber& rhs);
// pseudo-destructor (a const member function, because even const objects may be destroyed)
void destroy() const { delete this; }
...
private:
~UPNumber();
};
UPNumber *p = new UPNumber;
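p->destroy();
A private destructor also prevents inheritance and containment, because derived classes and containing classes cannot call it. The inheritance problem can be solved by making UPNumber's destructor protected (while keeping its constructors public), and classes that need to contain objects of type UPNumber can be modified to contain pointers to UPNumber objects instead.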
There is no easy way to enforce the restriction that all UPNumber objects - even base class parts of more derived objects - must be on the heap. It is not possible for a UPNumber constructor to determine whether it's being invoked as the base class part of a heap-based object.
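The protected-destructor variant and the pointer-based containment described above might look like the sketch below. The `live` counter, `NonNegativeUPNumber`, and `Asset` are assumptions added for illustration, and copying is simply disabled to keep the example short.

```cpp
class UPNumber {
public:
    explicit UPNumber(int v = 0) : value(v) { ++live; }
    void destroy() const { delete this; }   // pseudo-destructor
    int get() const { return value; }
    static int liveCount() { return live; }
protected:
    virtual ~UPNumber() { --live; }         // protected: derivation is allowed
private:
    UPNumber(const UPNumber&);              // copying omitted in this sketch
    UPNumber& operator=(const UPNumber&);
    int value;
    static int live;                        // hypothetical bookkeeping
};
int UPNumber::live = 0;

// Inheritance works because the derived destructor can reach ~UPNumber.
class NonNegativeUPNumber : public UPNumber {
public:
    explicit NonNegativeUPNumber(int v) : UPNumber(v < 0 ? 0 : v) {}
};

// Containment is replaced by a pointer to a heap-based UPNumber.
class Asset {
public:
    explicit Asset(int initValue) : value(new UPNumber(initValue)) {}
    ~Asset() { value->destroy(); }
    int worth() const { return value->get(); }
private:
    Asset(const Asset&);
    Asset& operator=(const Asset&);
    UPNumber* value;
};
```

Note that a `NonNegativeUPNumber` can still be declared on the stack, since its implicit destructor is public. That matches the point made above: the base class has no easy way to force its derived-class parts onto the heap.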
Not all pointers to things on the heap can be safely deleted (e.g. a pointer to a member of a heap-based object). It's easier to determine whether it's safe to delete a pointer than to determine whether a pointer points to something on the heap: operator new adds entries to a collection of allocated addresses, operator delete removes entries, and isSafeToDelete does a lookup in the collection to see if a particular address is there.
An abstract mixin base class that offers derived classes the ability to determine whether a pointer was allocated from operator new:
class HeapTracked { // mixin class; keeps track of ptrs returned from op. new
public:
class MissingAddress{};
virtual ~HeapTracked() = 0;
static void *operator new(size_t size);
static void operator delete(void *ptr);
bool isOnHeap() const;
private:
typedef const void* RawAddress;
static list<RawAddress> addresses;
};
// mandatory definition of static class member
list<HeapTracked::RawAddress> HeapTracked::addresses;
HeapTracked::~HeapTracked() {}
void * HeapTracked::operator new(size_t size)
{
void *memPtr = ::operator new(size);
addresses.push_front(memPtr);
return memPtr;
}
void HeapTracked::operator delete(void *ptr)
{
list<RawAddress>::iterator it =
find(addresses.begin(), addresses.end(), ptr);
if (it != addresses.end()) {
addresses.erase(it);
::operator delete(ptr);
} else {
throw MissingAddress();
}
}
bool HeapTracked::isOnHeap() const
{
// dynamic_casting a pointer to void* yields a pointer to the
// beginning of the memory for the object pointed to by the pointer
const void *rawAddress = dynamic_cast<const void*>(this);
list<RawAddress>::iterator it =
find(addresses.begin(), addresses.end(), rawAddress);
return it != addresses.end();
}
Prohibiting heap-based objects:
- For objects that are directly instantiated: declare operator new (operator new[]) and operator delete (operator delete[]) private.
- For objects instantiated as base class parts of derived class objects: if operator new/delete aren't declared public in a derived class, that class inherits the private versions declared in its base(s).
- For objects embedded inside other objects: there is no portable way.
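For the directly instantiated case, the private-operator approach can be sketched as follows. The class body and the `stackUse` helper are hypothetical; only the idea of declaring the four functions private comes from the list above.

```cpp
#include <cstddef>

class UPNumber {
public:
    explicit UPNumber(int v = 0) : value(v) {}
    int get() const { return value; }
private:
    // Declared private and never defined: "new UPNumber" and
    // "delete p" are rejected at compile time outside the class.
    static void* operator new(std::size_t size);
    static void operator delete(void* ptr);
    static void* operator new[](std::size_t size);
    static void operator delete[](void* ptr);
    int value;
};

int stackUse()
{
    UPNumber n(42);                  // fine: automatic storage
    static UPNumber s(1);            // fine: static storage
    // UPNumber* p = new UPNumber;   // error: operator new is private
    return n.get() + s.get();
}
```

Uncommenting the `new` expression should produce an access error at compile time, which is the whole point of the technique.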
Item 28: Smart pointers.
- Because object ownership is transferred when auto_ptr's copy constructor is called, passing auto_ptrs by value is often a very bad idea.
- auto_ptr objects are modified if they are copied or are the source of an assignment.
- The return type of the operator* function is a reference. It would be disastrous to return an object instead, though compilers will let you do it.
- There are only two things operator-> can return: a dumb pointer to an object or another smart pointer object. Most of the time, you'll want to return an ordinary dumb pointer.
- There is a middle ground that allows you to offer a reasonable syntactic form for testing for nullness while minimizing the chances of accidentally comparing smart pointers of different types. It is to overload operator! for your smart pointer classes so that operator! returns true if and only if the smart pointer on which it's invoked is null.
- Don't provide implicit conversion operators to dumb pointers unless there is a compelling reason to do so.
- Use member function templates to generate conversion functions, then use casts in those cases where ambiguity results.
- Have each smart pointer-to-T class publicly inherit from a corresponding smart pointer-to-const-T class.
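Several of the points above (operator* returning a reference, operator-> returning a dumb pointer, and operator! as the nullness test) fit in a small sketch. `SmartPtr` and `Point` are hypothetical names, and the choice to own and delete the pointee, with copying disabled, is an assumption made to keep the example short.

```cpp
template<class T>
class SmartPtr {
public:
    explicit SmartPtr(T* p = 0) : pointee(p) {}
    ~SmartPtr() { delete pointee; }
    T& operator*() const { return *pointee; }        // a reference, not a copy
    T* operator->() const { return pointee; }        // an ordinary dumb pointer
    bool operator!() const { return pointee == 0; }  // true if and only if null
private:
    SmartPtr(const SmartPtr&);                       // copying disabled in this sketch
    SmartPtr& operator=(const SmartPtr&);
    T* pointee;
};

struct Point { int x; int y; };
```

With this in place, clients write `if (!p) ...` to test for null, while comparisons between smart pointers of unrelated types do not compile because there is no implicit conversion to a dumb pointer or to bool.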
Item 29: Reference counting.
Two common motivations for the technique:
- Reference counting eliminates the burden of tracking object ownership, which constitutes a simple form of garbage collection.
- It's better to let all the objects with that value share its representation. Doing so not only saves memory, it also leads to faster-running programs, because there's no need to construct and destruct redundant copies of the same object value.
A reference-counted String:
Create a class to store reference counts and the values they track, and nest it inside String's private section.
Nesting a struct in the private part of a class is a convenient way to give access to the struct to all the members of the class, but to deny access to everybody else (except, of course, friends of the class).
Ensure that the reference count for a String's StringValue object is exactly one any time we return a reference to a character inside that StringValue object.
There is no way for C++ compilers to tell us whether a particular use of operator[] is for a read or a write, so we must be pessimistic and assume that all calls to the non-const operator[] are for writes. (Proxy classes can help us differentiate reads from writes - see Item 30.)
Add a flag to each StringValue object indicating whether that object is shareable. Turn the flag on initially, but turn it off whenever the non-const operator[] is invoked on the value represented by that object. Once the flag is set to false, it stays that way forever.
A reference-counting base class:
// template class for smart pointers-to-T objects;
// T must inherit from RCObject
template<class T>
class RCPtr {
public:
RCPtr(T* realPtr = 0);
RCPtr(const RCPtr& rhs);
~RCPtr();
RCPtr& operator=(const RCPtr& rhs);
T* operator->() const;
T& operator*() const;
private:
T *pointee;
void init();
};
// base class for reference-counted objects
class RCObject {
public:
void addReference();
void removeReference();
void markUnshareable();
bool isShareable() const;
bool isShared() const;
protected:
RCObject();
RCObject(const RCObject& rhs);
RCObject& operator=(const RCObject& rhs);
virtual ~RCObject() = 0;
private:
int refCount;
bool shareable;
};
// class to be used by application developers
class String {
public:
String(const char *value = "");
const char& operator[](int index) const;
char& operator[](int index);
private:
// class representing string values
struct StringValue: public RCObject {
char *data;
StringValue(const char *initValue);
StringValue(const StringValue& rhs);
void init(const char *initValue);
~StringValue();
};
RCPtr<StringValue> value;
};
// implementation of RCObject
RCObject::RCObject() : refCount(0), shareable(true) {}
RCObject::RCObject(const RCObject&) : refCount(0), shareable(true) {}
RCObject& RCObject::operator=(const RCObject&) { return *this; }
RCObject::~RCObject() {}
void RCObject::addReference() { ++refCount; }
void RCObject::removeReference() { if (--refCount == 0) delete this; }
void RCObject::markUnshareable() { shareable = false; }
bool RCObject::isShareable() const { return shareable; }
bool RCObject::isShared() const { return refCount > 1; }
// implementation of RCPtr
template<class T>
void RCPtr<T>::init()
{
if (pointee == 0) return;
if (pointee->isShareable() == false) {
pointee = new T(*pointee);
}
pointee->addReference();
}
template<class T>
RCPtr<T>::RCPtr(T* realPtr) : pointee(realPtr) { init(); }
template<class T>
RCPtr<T>::RCPtr(const RCPtr& rhs) : pointee(rhs.pointee) { init(); }
template<class T>
RCPtr<T>::~RCPtr() { if (pointee) pointee->removeReference(); }
template<class T>
RCPtr<T>& RCPtr<T>::operator=(const RCPtr& rhs)
{
if (pointee != rhs.pointee) {
if (pointee) pointee->removeReference();
pointee = rhs.pointee;
init();
}
return *this;
}
template<class T>
T* RCPtr<T>::operator->() const { return pointee; }
template<class T>
T& RCPtr<T>::operator*() const { return *pointee; }
// implementation of String::StringValue
void String::StringValue::init(const char *initValue)
{
data = new char[strlen(initValue) + 1];
strcpy(data, initValue);
}
String::StringValue::StringValue(const char *initValue) { init(initValue); }
String::StringValue::StringValue(const StringValue& rhs) { init(rhs.data); }
String::StringValue::~StringValue() { delete [] data; }
// implementation of String
String::String(const char *initValue) : value(new StringValue(initValue)) {}
const char& String::operator[](int index) const { return value->data[index]; }
char& String::operator[](int index)
{
if (value->isShared()) {
value = new StringValue(value->data);
}
value->markUnshareable();
return value->data[index];
}
Reference counting can also be added to existing classes (e.g. some class Widget that's in a library we can't modify).
Most problems in Computer Science can be solved with an additional level of indirection.
Reference counting is most useful for improving efficiency under the following conditions:
- Relatively few values are shared by relatively many objects.
- Object values are expensive to create or destroy, or they use lots of memory.
If you find yourself weighed down with uncertainty over who's allowed to delete what, reference counting could be just the technique you need to ease your burden.
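In current C++, the ownership side of reference counting is available off the shelf as `std::shared_ptr`. The sketch below swaps in that library class rather than the RCPtr/RCObject design above, and it shows only shared ownership, not the copy-on-write behavior of the String class. The function name `sharedOwners` is hypothetical.

```cpp
#include <memory>
#include <string>

// Returns the reference count observed while two handles share one value.
long sharedOwners()
{
    std::shared_ptr<std::string> a = std::make_shared<std::string>("Hello");
    std::shared_ptr<std::string> b = a;   // shares the value; the string is not copied
    return a.use_count();                 // counts both a and b
}
```

The string is destroyed automatically when the last of `a` and `b` goes away, so neither handle has to know which of them is responsible for deleting it.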
Item 30: Proxy classes.
Objects that stand for other objects are often called proxy objects, and the classes that give rise to proxy objects are often called proxy classes.
Proxies can be employed to:
- implement classes whose instances act like multidimensional arrays. Array1D is a proxy class. Each Array1D object stands for a one-dimensional array that is absent from the conceptual model used by clients of Array2D.
- prevent single-argument constructors from being used to perform unwanted type conversions. (See Item 5.)
- help distinguish reads from writes through operator[]. We can treat reads differently from writes if we delay our lvalue-versus-rvalue actions until we see how the result of operator[] is used. A proxy class allows us to buy the time we need, because we can modify operator[] to return a proxy for a string character instead of a string character itself. We can then wait to see how the proxy is used. If it's read, we can belatedly treat the call to operator[] as a read. If it's written, we must treat the call to operator[] as a write.
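The read-versus-write use of proxies can be sketched as follows. `TrackedString`, its counters, and the accessor names are hypothetical; the sketch only counts reads and writes, where a real reference-counted string would decide at those points whether to copy its value.

```cpp
#include <cstddef>
#include <string>

class TrackedString {
public:
    explicit TrackedString(const std::string& s) : data(s), reads(0), writes(0) {}

    class CharProxy {
    public:
        CharProxy(TrackedString& str, std::size_t i) : s(str), index(i) {}
        // rvalue use: the proxy is converted to a char
        operator char() const { ++s.reads; return s.data[index]; }
        // lvalue use: the proxy is assigned to
        CharProxy& operator=(char c)
        {
            ++s.writes;
            s.data[index] = c;
            return *this;
        }
        // assignment from another proxy is a read of rhs and a write to *this
        CharProxy& operator=(const CharProxy& rhs)
        {
            return *this = static_cast<char>(rhs);
        }
    private:
        TrackedString& s;
        std::size_t index;
    };
    friend class CharProxy;

    // operator[] decides nothing; it just hands back a proxy
    CharProxy operator[](std::size_t i) { return CharProxy(*this, i); }

    int readCount() const { return reads; }
    int writeCount() const { return writes; }
    const std::string& str() const { return data; }
private:
    std::string data;
    int reads;
    int writes;
};
```

With this class, `char c = s[1];` goes through the conversion operator and is counted as a read, while `s[0] = 'x';` goes through `operator=` and is counted as a write.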
Proxy classes have disadvantages:
- As function return values, proxy objects are temporaries, so they must be created and destroyed.
- The very existence of proxy classes increases the complexity of software systems that employ them.
- Shifting from a class that works with real objects to a class that works with proxies often changes the semantics of the class.
Item 31: Making functions virtual with respect to more than one object.
- Using virtual functions and RTTI: you need RTTI to determine the type of only one of the two objects involved (e.g., in a collision). The other object is *this, and its type is determined by the virtual function mechanism.
- Using virtual functions only: implement double-dispatching as two single dispatches, i.e., as two separate virtual function calls. The first determines the dynamic type of the first object, the second determines that of the second object.
- Emulating virtual function tables: create an associative array that, given a class name, yields the appropriate member function pointer.
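The second approach, double dispatch built from two virtual calls, can be sketched as below. The class names `GameObject`, `SpaceShip`, and `Asteroid` and the member name `hitBy` are assumptions chosen for illustration, and the returned strings simply name the two dynamic types in call order.

```cpp
#include <string>

class SpaceShip;
class Asteroid;

class GameObject {
public:
    virtual ~GameObject() {}
    // First dispatch: resolves the dynamic type of *this.
    virtual std::string collide(GameObject& other) = 0;
    // Second dispatch: resolves the dynamic type of the other object,
    // with the first object's type already known statically.
    virtual std::string hitBy(SpaceShip& first) = 0;
    virtual std::string hitBy(Asteroid& first) = 0;
};

class SpaceShip : public GameObject {
public:
    std::string collide(GameObject& other) { return other.hitBy(*this); }
    std::string hitBy(SpaceShip&) { return "ship-ship"; }
    std::string hitBy(Asteroid&) { return "asteroid-ship"; }
};

class Asteroid : public GameObject {
public:
    std::string collide(GameObject& other) { return other.hitBy(*this); }
    std::string hitBy(SpaceShip&) { return "ship-asteroid"; }
    std::string hitBy(Asteroid&) { return "asteroid-asteroid"; }
};
```

The cost of this design is that every class must know about all of its siblings: adding a new kind of game object means adding a `hitBy` overload to the base class and to every derived class.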
Item 32: Program in the future tense.
- Express design constraints in C++ instead of (or in addition to) comments or other documentation.
- Determine the meaning of a function and whether it makes sense to let it be redefined in derived classes. If it does, declare it virtual, even if nobody redefines it right away. If it doesn't, declare it nonvirtual, and don't change it later just because it would be convenient for someone; make sure the change makes sense in the context of the entire class and the abstraction it represents.
- Handle assignment and copy construction in every class, even if "nobody ever does those things." If these functions are difficult to implement, declare them private.
- Strive to provide classes whose operators and functions have a natural syntax and an intuitive semantics. Preserve consistency with the behavior of the built-in types: when in doubt, do as the ints do.
- Recognize that anything somebody can do, they will do. Make your classes easy to use correctly and hard to use incorrectly. Accept that clients will make mistakes, and design your classes so you can prevent, detect, or correct such errors.
- Strive for portable code.
- Design your code so that when changes are necessary, the impact is localized. Encapsulate as much as you can; make implementation details private. Where applicable, use unnamed namespaces or file-static objects and functions. Try to avoid designs that lead to virtual base classes. Avoid RTTI-based designs that make use of cascading if-then-else statements.
- Future-tense thinking asks not how a class is used now, but how the class is designed to be used.
- Provide complete classes, even if some parts aren't currently used.
- If there is no great penalty for generalizing your code, generalize it.
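As one small illustration of expressing a constraint in C++ rather than in a comment, the sketch below declares the copy operations private and then checks the constraint at compile time. `Connection` is a hypothetical class, and the `static_assert` checks assume a C++11 or later compiler.

```cpp
#include <type_traits>

class Connection {
public:
    explicit Connection(int id) : id_(id) {}
    int id() const { return id_; }
private:
    Connection(const Connection&);              // declared private, never defined
    Connection& operator=(const Connection&);
    int id_;
};

// The constraint is enforced by the compiler, not by documentation.
static_assert(!std::is_copy_constructible<Connection>::value,
              "Connection must not be copy-constructible");
static_assert(!std::is_copy_assignable<Connection>::value,
              "Connection must not be copy-assignable");
```

A client that tries `Connection b = a;` gets a compile-time error instead of a subtle runtime bug.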
Item 33: Make non-leaf classes abstract.
- Non-leaf classes should be abstract.
- Declaring a function pure virtual doesn't mean it has no implementation, it means:
  - the current class is abstract, and
  - any concrete class inheriting from the current class must declare the function as a "normal" virtual function (i.e., without the "=0").
- Implementing pure virtual functions may be uncommon in general, but for pure virtual destructors, it's not just common, it's mandatory. Pure virtual destructors must be implemented, because they are called whenever a derived class destructor is invoked. Furthermore, they often perform useful tasks, such as releasing resources or logging messages.
- Replacement of a concrete base class with an abstract base class yields benefits as follows:
  - make the behavior of operator= easier to understand
  - reduce the chances that you'll try to treat arrays polymorphically (see Item 3)
  - make you create new abstract classes for useful concepts, even if you aren't aware of the fact that the useful concepts exist.
- Useful abstractions are those that are needed in more than one context. That is, they correspond to classes that are useful in their own right (i.e., it is useful to have objects of that type) and that are also useful for purposes of one or more derived classes. This is precisely why the transformation from concrete base class to abstract base class is useful: it forces the introduction of a new abstract class only when an existing concrete class is about to be used as a base class, i.e., when the class is about to be (re)used in a new context.
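The point about pure virtual destructors can be checked with a short sketch. `Animal`, `Lizard`, and the `trace` string that records destructor order are hypothetical names used only for illustration.

```cpp
#include <string>

static std::string trace;   // hypothetical log of destructor calls

class Animal {                       // abstract: cannot be instantiated
public:
    virtual ~Animal() = 0;           // pure virtual, yet it must be defined
};

// Without this definition the program fails to link, because every
// derived class destructor calls it.
Animal::~Animal() { trace += "~Animal;"; }

class Lizard : public Animal {       // concrete leaf class
public:
    ~Lizard() { trace += "~Lizard;"; }
};
```

Deleting a `Lizard` through an `Animal*` runs `~Lizard` first and then the implemented pure virtual `~Animal`.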
Item 34: Understand how to combine C++ and C in the same program.
If you want to mix C++ and C in the same program, remember the following simple guidelines:
- Make sure the C++ and C compilers produce compatible object files.
- Declare functions to be used by both languages extern "C".
- If at all possible, write main in C++.
- Always use delete with memory from new; always use free with memory from malloc.
- Limit what you pass between the two languages to data structures that compile under C; the C++ version of structs may contain non-virtual member functions.
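A few of these guidelines can be shown together in one sketch. The names `CPoint`, `point_sum`, and `make_label` are hypothetical; the header-style guard is the usual way to let one set of declarations be read by both C and C++ compilers.

```cpp
#include <cstdlib>
#include <cstring>

// Declarations as they might appear in a header shared by C and C++.
#ifdef __cplusplus
extern "C" {
#endif

struct CPoint { int x; int y; };        /* compiles under C: data only */
int point_sum(const struct CPoint* p);
char* make_label(const char* text);     /* caller releases with free() */

#ifdef __cplusplus
}
#endif

// C++ definitions with C linkage, so the names are not mangled.
extern "C" int point_sum(const struct CPoint* p)
{
    return p->x + p->y;
}

extern "C" char* make_label(const char* text)
{
    // Allocated with malloc, so it must be released with free, not delete.
    char* copy = static_cast<char*>(std::malloc(std::strlen(text) + 1));
    if (copy) std::strcpy(copy, text);
    return copy;
}
```

A caller on either side of the language boundary releases the result of `make_label` with `free`, which keeps the allocation and deallocation functions paired.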
Item 35: Familiarize yourself with the language standard.
- STL is very simple. It is just a collection of class and function templates that adhere to a set of conventions. The STL collection classes provide functions like begin and end that return iterator objects of types defined by the classes. The STL algorithm functions move through collections of objects by using iterator objects over STL collections. STL iterators act like pointers. That's really all there is to it. There's no big inheritance hierarchy, no virtual functions, none of that stuff. Just some class and function templates and a set of conventions to which they all subscribe.
- STL is extensible. You can add your own collections, algorithms, and iterators to the STL family. As long as you follow the STL conventions, the standard STL collections will work with your algorithms and your collections will work with the standard STL algorithms.
- Before you can use the library effectively, you must learn more about it than I've had room to summarize, and before you can write your own STL-compliant templates, you must learn more about the conventions of the STL.
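The conventions described above can be seen in a short sketch: a standard algorithm working on a built-in array through plain pointers, and a user-written algorithm working on a standard collection. `countEqual` and `indexInArray` are hypothetical helpers written for this illustration.

```cpp
#include <algorithm>
#include <cstddef>
#include <list>

// A user-written algorithm: it relies only on the iterator conventions
// (!=, ++, *), so it works with any conforming collection.
template<class Iter, class T>
std::size_t countEqual(Iter first, Iter last, const T& value)
{
    std::size_t n = 0;
    for (; first != last; ++first)
        if (*first == value) ++n;
    return n;
}

// Position of value within a built-in array, or len if it is absent.
std::size_t indexInArray(const int* arr, std::size_t len, int value)
{
    // Pointers satisfy the iterator conventions, so std::find accepts them.
    const int* it = std::find(arr, arr + len, value);
    return static_cast<std::size_t>(it - arr);
}
```

For example, `countEqual(l.begin(), l.end(), 1)` works on a `std::list<int>` exactly as it would on a `std::vector<int>` or a plain array, with no inheritance or virtual functions involved.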