Since my last post, you know how I think C++ code should be split into header and source files. But where should we put those files? What should the directory structure of our project look like?
This is again just what I think to be good practice. It’s derived from what I have seen work well and not so well over the years.
What to put into a directory?
Java has the convention to organize sources in directories according to the package structure. Other languages have that convention as well, but in Java it is even mandatory. Package name and directory always have to correspond to each other.
We usually do not have packages in C++, unless we use an IDE that imposes the concept upon us. What we have are translation units and libraries. The distinction between translation units is already covered by naming the source files, so what remains are libraries.
Often enough, what we link together into a library consists of just a handful of translation units. In those cases, using a single directory for the code is just natural. The directory should simply be named after the library.
Sometimes, our libraries get large enough to make it infeasible to put all sources into a single directory. At that scale, we are usually able to identify subcomponents. Giving those subcomponents a name helps in conceptualizing the program structure, so we may as well use that name for a directory. And while we are at it, consider putting those subcomponents into their own smaller libraries.
That way, we have divided the library not only in our source file organization, but also mentally. This is a good thing. It prevents us from forming huge blobs of interdependent code and gives our code the structure it needs to stay maintainable. When our directory structure reflects the architecture of our program, large directories visibly identify possible problem areas in our architecture.
Directories = namespaces?
The closest thing we have to packages at a syntactical level in C++ are namespaces. Should we use different namespaces for different libraries and subcomponents, i.e. for sources in different directories? Should we also use different directories for different namespaces?
While the general concept is a good idea in my eyes, there are limits. First of all, namespaces do not directly correspond to packages in other languages. For example, we use them to avoid namespace pollution, by using detail namespaces. We also have anonymous namespaces that affect linkage.
With these helper namespaces we can have several namespaces in a single translation unit. Therefore it would be impossible to put the source files into a corresponding directory, unless we split them into multiple sources. Not a good idea.
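To illustrate, here is a minimal sketch of a single translation unit that already contains three namespaces. The library name geometry, the file name and the functions are made up for this example; they are not taken from any real project.

```cpp
// geometry/area.cpp (hypothetical file in a hypothetical "geometry" library)
#include <cassert>

namespace geometry {

namespace detail {
// Helper that is not meant to be part of the library's interface.
// The nested namespace keeps it out of `geometry` itself.
double square(double x) { return x * x; }
}  // namespace detail

namespace {
// Anonymous namespace: internal linkage, so this constant is
// invisible to every other translation unit.
constexpr double kPi = 3.14159265358979323846;
}  // namespace

// The only function in this file that is meant for users of the library.
double circleArea(double radius) {
    assert(radius >= 0.0 && "radius must not be negative");
    return kPi * detail::square(radius);
}

}  // namespace geometry
```

A strict "one directory per namespace" rule would force us to tear this small file apart into geometry/, geometry/detail/ and some place for the anonymous namespace, which clearly makes no sense.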
On the other hand, having a different namespace for every directory is not a feasible way to organize things, at least not with most current IDEs. Architecture changes and evolves, so we usually have to move sources into different directories more than once in their lifetime. Reflecting those moves by also changing the namespaces in which the classes and functions reside would usually have to be done manually, which is extremely cumbersome.
However, for the large scale organization, i.e. top level directory structure, different namespaces can of course very well help to organize our classes and functions. To avoid confusion, name the namespaces after those directories and libraries.
Directories for tests
This may be a controversial topic. I have seen test sources put into the same directories as the sources of the tested classes. Other projects have a test_xy directory for each directory xy. The third option, which is also the usual convention in Java projects, is to have a test directory at the topmost level. The subdirectory structure below it is then identical to that of the main source directory.
The latter is my favorite. Having test source files in the same directories as the tested source files basically doubles the size of those directories, making them harder to manage. Test directories at each level are also harder to maintain: would a test for src/adir/bdir/x.cpp go below src/adir/bdir_test/? Instead, putting that test in test/adir/bdir/ seems just natural.
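The mirrored layout has the nice property that the location of a test can be computed mechanically from the location of the tested source. The following is only a sketch of that mapping, assuming C++17 and its std::filesystem library. The function name testDirFor and the default root names are my own invention for this example.

```cpp
#include <cassert>
#include <filesystem>

namespace fs = std::filesystem;

// Returns the directory below `testRoot` that mirrors the directory
// of `source` below `srcRoot`. Purely lexical: nothing is looked up
// on disk, so the paths do not have to exist.
fs::path testDirFor(const fs::path& source,
                    const fs::path& srcRoot = "src",
                    const fs::path& testRoot = "test") {
    // Directory of the source relative to the source root, e.g. "adir/bdir".
    const fs::path relative = source.parent_path().lexically_relative(srcRoot);
    // A leading ".." would mean that `source` is not below `srcRoot`.
    assert(!relative.empty() && *relative.begin() != "..");
    return testRoot / relative;
}
```

With the defaults, testDirFor("src/adir/bdir/x.cpp") yields test/adir/bdir. For per-directory test folders there is no equally obvious rule, which is exactly the maintenance problem described above.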
Having the test directory separated at the topmost level also allows us to run tools like searches and documentation generators differently for test and main code. For example, if we want to find the use count of a class or function in our code, we may find tens or hundreds of uses in the test code. If we look closer, we may see that it is not used at all in the project itself.
C++ developers sometimes like to have a good laugh at languages like Java. I have also written that C++ is not Java and that there are differences. But in C++ we have the luxury of being able to adopt from other languages what we like and what makes sense in our context.
You may or may not totally agree with the conventions I described above. Use them, modify them, or come up with your own – but have some conventions. A firm convention for what code goes where is certainly better than having the halfheartedly organized chaos of sources, directories and namespaces we sometimes encounter elsewhere.
Some good points – as every week 🙂 But I’m missing some words about how to separate header files that describe the interface of a library from header files for the internal classes that shall not be exposed (is there an official term for both types? I just call them internal and external header files). I have seen many projects where this is not differentiated (meaning all headers are exposed), but that’s a bad idea because you can’t prevent internal classes from being used by others, so changing the internal interface may have unforeseen side effects.
My favourite way is to put the internal header files mixed together with my source files in a directory called “src” (perhaps with further subdirectories reflecting the namespace structure) and the external header files in a separate directory called “inc” (with the same subdirectory structure) which I add to the compiler’s list of include directories.
I’m interested in how you handle internal and external header files.
Good question! That depends largely on how detached the library is from the rest of your application. If it is simply another part of your application that is only used in that application, then the library and its API probably evolve alongside the rest. That means that separating it too much from the rest of your code may hinder the evolution of the API, so I’d just use the “all headers exposed” approach. Yes, there is the danger of using other headers outside the library as well, i.e. watering down the borders between modules. But that danger exists on other levels as well (e.g. using libraries that are meant to be used in other layers only), and has to be countered by architecture reviews etc.
The other extreme is if the library in question is detached from your application. It might even be provided by another team, used in other applications as well etc. In that case I’d build it separately, and that build should include a step where an include directory is created that contains the entire public interface of the library. Then use that interface (i.e. the headers in that directory) and link against the library file that was built before your application’s build started.
Similar to what you said, I think it’s common to have an “include” folder for the public api, and a “src” folder for the private internals.
For something different, the Unreal Engine has a “source” folder with its modules inside of it. You know a folder is a module if it contains a build script and the following folders: each module has a “Private” and a “Public” folder. The “Public” folder has the interface classes like IBlutilityModule, and the “Private” folder has a whole bunch of classes related to that module’s interface(s). I think that works well and makes it explicit.
Epic may even use a Classes folder. But according to them those header files could easily be put in Private, it’s up to you. I think for this discussion we can ignore that. But here’s a link to their docs: https://docs.unrealengine.com/latest/INT/Engine/Basics/DirectoryStructure/#commondirectories