As promised last week in my post about strange include techniques, I will go into reducing compile time dependencies. Reducing dependencies by shifting them from headers to source files can considerably improve compile times. The main mechanism to achieve this is forward declarations.
Definitions vs. declarations
C++ distinguishes definitions from declarations. Declarations more or less tell the compiler that something exists, but not the exact details. Definitions give all the details. Usually, something can be defined only once – at least in a translation unit – while it can be declared multiple times.
The best-known example is a function declaration vs. its definition. The declaration only tells us – and the compiler – what parameters the function takes and what it returns:
int foo(std::string const& str);
The definition is the whole function with its body.
int foo(std::string const& str) {
  if (str.empty()) {
    return 0;
  }
  return str.length() * (str[0]-'A');
}
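To make the "declare many times, define once" rule concrete, here is a minimal sketch built around the foo function from above. The helper names callFoo and checkFoo are made up for illustration, and the cast in the return statement is my addition to keep the arithmetic signed.

```cpp
#include <cassert>
#include <string>

// Declarations: repeating them is harmless.
int foo(std::string const& str);
int foo(std::string const& str);

// A caller only needs to have seen a declaration.
// callFoo is a hypothetical helper, not part of the original example.
int callFoo() {
    return foo("B");
}

// Definition: exactly one, and it may come after the first use.
int foo(std::string const& str) {
    if (str.empty()) {
        return 0;
    }
    // Cast added here (an assumption of this sketch) to avoid unsigned arithmetic.
    return static_cast<int>(str.length()) * (str[0] - 'A');
}

// Small self-check: "B" has length 1 and 'B' - 'A' is 1.
void checkFoo() {
    assert(callFoo() == 1);
    assert(foo("") == 0);
}
```

Adding a second definition of foo with the same signature to this translation unit would be rejected by the compiler as a redefinition.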
Variables can be declared as well, with the keyword extern, but we very rarely have to use that. Usually, we define them right where they are used. More interesting are class declarations:
class MyClass;
This is all that is needed to tell the compiler that there is a class named MyClass, but not what it contains. At first sight, this seems of very limited use, but it is an important tool to reduce dependencies in headers. It allows us to postpone the actual definition of MyClass until later, which is why class declarations are usually called forward declarations.
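As a rough sketch of what "postponing the definition" looks like in practice: between the forward declaration and the definition, MyClass is an incomplete type. We can pass pointers and references to it around, but we cannot create objects or access members. The functions passThrough, readValue and checkForwardDeclaration and the member value are invented for this illustration.

```cpp
#include <cassert>

class MyClass;                    // forward declaration: only the name is known

// Pointers (and references) to the incomplete type are fine,
// as long as we never look inside the object.
MyClass* passThrough(MyClass* p) {
    return p;
}

// MyClass obj;                   // would not compile here: size and layout unknown

// The definition arrives later ...
class MyClass {
public:
    int value = 42;               // hypothetical member, for illustration only
};

// ... and from here on the type is complete and can be used fully.
int readValue(MyClass& obj) {
    return passThrough(&obj)->value;
}

void checkForwardDeclaration() {
    MyClass m;
    assert(passThrough(&m) == &m);
    assert(readValue(m) == 42);
}
```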
Unless we write functional or procedural code, most of our headers contain class definitions. A class definition contains definitions of its member variables and either definitions or declarations of the member functions. The usual default is to only declare member functions in the header and define them in the .cpp file.
Reducing compile-time dependencies with forward declarations
To reduce the compile-time dependencies of our translation units, we should strive to reduce the number of #includes in our headers. The reason is simple: including a header X.h into another header Y.h means that every translation unit that includes Y.h also includes X.h transitively. Since #includes are plain text replacement done by the preprocessor, the contents of all included headers have to be parsed by the compiler. This can be millions of lines of code for a small .cpp file with just a handful of #includes.
Here forward declarations come in handy, because not every type we depend on in a class definition has to be defined itself. A declaration often suffices, which means that instead of #including MyDependency.h we can simply declare class MyDependency; in the header. We will usually need the class definition of our dependency when we implement (define) our class methods, but since we do that in the .cpp file, the #include can be postponed until then.
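The header/source split described above could look roughly like this. To keep the sketch compilable as a single file, the file boundaries are only marked by comments. The class names Widget and Engine and all their members are hypothetical.

```cpp
#include <cassert>

// ---- Widget.h (hypothetical header) ----
class Engine;                     // instead of #include "Engine.h"

class Widget {
    Engine* engine;               // pointer member: a declaration suffices
public:
    explicit Widget(Engine* e);
    int power() const;            // declared here, defined in Widget.cpp
};

// ---- Engine.h (hypothetical, included only by Widget.cpp) ----
class Engine {
public:
    int horsepower() const { return 150; }
};

// ---- Widget.cpp ----
// #include "Widget.h"
// #include "Engine.h"            // the full definition is needed only here
Widget::Widget(Engine* e) : engine(e) {}

int Widget::power() const {
    return engine->horsepower();  // member access requires the complete type
}

void checkWidget() {
    Engine e;
    Widget w(&e);
    assert(w.power() == 150);
}
```

In a real project, every translation unit that includes Widget.h but never touches Engine would no longer have to parse Engine.h.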
What dependencies does a class definition need?
So, what dependencies actually have to be defined for our class definition to compile? The answer is: everything the compiler needs to determine the size and memory layout of the objects it has to instantiate. For everything else, forward declarations are enough.
Broadly speaking, those are base classes and the types of member variables. Since every object that has a base class contains a subobject of that base class, it is clear that the base class definition is needed. For member variables we need to go into more detail: we only need class definitions of the actual types of our member variables. If our member variable is a pointer, we don’t need the class definition of the pointee, because, for the compiler, pointers are only addresses. The same goes for references, which compilers typically implement as pointers with a few restrictions.
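A small sketch of the size argument: the compiler can compute the size of a class that holds only a pointer or a reference to an incomplete type, but not of one that holds the object itself. The struct names and the checkLayout function are made up for this example.

```cpp
#include <cassert>
#include <cstddef>

class Pointee;                    // incomplete type, never defined in this sketch

struct HoldsPointer {
    Pointee* p;                   // fine: the size of a pointer is known
};

struct HoldsReference {
    Pointee& r;                   // fine as well
};

// struct HoldsValue {
//     Pointee v;                 // would not compile: the member's size is unknown
// };

// sizeof works although Pointee is still incomplete.
constexpr std::size_t pointerHolderSize = sizeof(HoldsPointer);
static_assert(pointerHolderSize >= sizeof(Pointee*),
              "the class must at least fit its pointer member");

void checkLayout() {
    HoldsPointer h{nullptr};
    assert(h.p == nullptr);
    assert(sizeof(HoldsReference) > 0);
}
```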
What about function parameter and return types? No definitions needed when we only declare the functions! Of course, if we define the functions, we actually use the parameter types and therefore also need their definitions. Here again, pointers and references are the exceptions, as long as we do not access the objects behind them. Passing around pointers to X is perfectly OK as long as we don’t do anything with them that requires knowing more about X.
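To illustrate, here is a hedged sketch in which a function with by-value parameter and return types is declared while both types are still incomplete. Only its definition, further down, needs the class definitions. The function names convert, forwardRef and checkConvert and the data members are assumptions of this example.

```cpp
#include <cassert>

class ReturnType;
class ArgumentType;

// Declaration only: by-value parameter and return types may be incomplete.
ReturnType convert(ArgumentType arg);

// A definition that merely hands a reference through also compiles,
// because it never accesses the object behind the reference.
ArgumentType const& forwardRef(ArgumentType const& ref) {
    return ref;
}

// The complete types are required once we define (or call) convert().
class ArgumentType {
public:
    int n = 3;                    // hypothetical member
};

class ReturnType {
public:
    int doubled = 0;              // hypothetical member
};

ReturnType convert(ArgumentType arg) {
    ReturnType r;
    r.doubled = 2 * arg.n;
    return r;
}

void checkConvert() {
    ArgumentType a;
    assert(convert(forwardRef(a)).doubled == 6);
}
```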
Here’s an example class with forward declarations and only those #includes that are really needed:
#include "BaseClass.h"
#include "Member.h"
#include "AnotherType.h"

class Pointee;
class ReturnType;
class ArgumentType;

class MyClass : public BaseClass {
  Member aMember;    //definition needed
  Pointee* aPointer; //declaration is enough
public:
  ReturnType funcDecl(ArgumentType arg);

  Pointee* ptrFuncDef(ArgumentType const& ref) {
    //function definition, ArgumentType
    //is only used by reference, no definition needed
    //same for Pointee
    return aPointer;
  }

  AnotherType anotherFunc(AnotherType other) {
    //AnotherType is copied, so the definition is needed
    return other;
  }
};
That last function adds a dependency we could get rid of: If we only declare the function in the class definition and move the function definition to MyClass.cpp, the #include of AnotherType.h can be moved there, too. We would then only need a forward declaration in the header.
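A sketch of that change, reduced to the relevant parts and squeezed into one file so it compiles on its own. BaseClass, Member and AnotherType are empty or near-empty stand-ins for the real headers, and the id member and checkAnotherFunc are invented for illustration.

```cpp
#include <cassert>

// ---- stand-ins for BaseClass.h and Member.h (still included by MyClass.h) ----
class BaseClass {};
class Member {};

// ---- MyClass.h after the change ----
class AnotherType;                // replaces #include "AnotherType.h"

class MyClass : public BaseClass {
    Member aMember;
public:
    AnotherType anotherFunc(AnotherType other);  // declaration only
};

// ---- stand-in for AnotherType.h, now included only by MyClass.cpp ----
class AnotherType {
public:
    int id = 7;                   // hypothetical member
};

// ---- MyClass.cpp ----
AnotherType MyClass::anotherFunc(AnotherType other) {
    return other;                 // copying requires the complete type
}

void checkAnotherFunc() {
    MyClass m;
    AnotherType a;
    a.id = 11;
    assert(m.anotherFunc(a).id == 11);
}
```

Note that code which actually calls anotherFunc still needs the definition of AnotherType, so those callers have to include AnotherType.h themselves.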
Forward declarations to break dependency cycles
Forward declarations are not only a useful help in reducing compile times. They are also crucial to break dependency cycles. Imagine that the class Member from the example contained a pointer to MyClass.
class Member {
  MyClass* myPointer;
  //...
};
To compile this, the compiler needs to know what MyClass is. Without forward declarations, we would have to #include MyClass.h here, which in turn #includes Member.h, which #includes MyClass.h… Sure, that’s what include guards are for. But with those, either MyClass or Member would be the first definition the compiler sees, without knowing about the other. There is no other way than to use a forward declaration of MyClass in Member.h.
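Put together, the cycle can be broken along these lines. Again this is a single-file sketch with the header boundaries marked by comments. The member functions attach, owner and member and the function checkCycle are hypothetical additions so that there is something to check.

```cpp
#include <cassert>

// ---- Member.h ----
class MyClass;                    // forward declaration breaks the cycle

class Member {
    MyClass* myPointer = nullptr;
public:
    void attach(MyClass* owner) { myPointer = owner; }
    MyClass* owner() const { return myPointer; }
};

// ---- MyClass.h ----
// #include "Member.h"            // needed: aMember is held by value
class MyClass {
    Member aMember;
public:
    MyClass() { aMember.attach(this); }
    Member const& member() const { return aMember; }
};

void checkCycle() {
    MyClass m;
    assert(m.member().owner() == &m);
}
```

Only one direction of the cycle needs the full definition here: MyClass contains a Member by value, while Member merely points back.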
Conclusion
Forward declarations are not only useful, they are a crucial tool to work with class dependencies. It gets a little more complicated when we think about class templates: Do we need a definition of class X if our class contains a std::shared_ptr<X>, or is a declaration enough? What about a std::vector<X>? I’ll answer those questions next week, stay tuned!
Permalink
I would like to add one consideration about using forward declarations to reduce compile time.
There might be situations where one has to use a forward declaration.
On the other hand, as your code base grows, maintenance becomes a bigger factor.
Today’s tooling is not perfect, but it can help a lot when you need to move things around or refactor, and it gives you more confidence that the changes did not introduce bugs.
Forward declarations are a nightmare in this regard.
I am experiencing this at the moment.
When changing or adding namespaces or moving code around, forward declarations always need manual editing.
You can try with grep, sed and similar, but chances are you will make mistakes.
If you are lucky, the compiler complains. But I found places where the compiler happily accepts forward declarations that are no longer correct.
Using forward declarations everywhere just to reduce compile time might make you pay the price in increased developer time for refactoring and for hunting bugs introduced by refactoring.
Permalink
Hi, a few points for the future:
1) Forward declaring enums should be covered.
(I have never understood why enums cannot be forward declared when structs can, since the size is not needed in either case, unless the compiler vendor mangles the size as part of the enum name.)
I know that the newer types of enums can be forward declared.
2) Forward declaring structs & unions
3) Forward declaring typedefs which can be done for structs, unions if you forward declare the struct, union and then typedef the forward declaration (it has to be the same as any previous forward declaration).
4) Forward declaring template declarations. And in general you cannot, which is why you have to include the header for std::string and also MFC’s CString. Having said that, there is #include which is forward declarations.
I am unsure if the picture with recent standards
Permalink
“For everything else, forward definitions are enough.”
Should read “forward declarations” 😉
Permalink
Fixed, thanks!
Permalink
I have found that forward declarations also reduce the size of the binary by about 5-8%. I am unsure as to the reason why this happens. It may be a positive side effect.
Permalink
Unfortunately using typedefs muddies the forward declarations, as you need to forward declare both the typedef and the types used in the typedef.
Permalink
Yes, I’ll cover that as well in a future post 🙂