Have you ever seen one of those nasty errors that appear on one machine and not on another? To track those down, we need a reproducible build environment.
Same tools and environment
The first step toward a reproducible build is to use the same versions of the tools involved in the build. For C++ development, that is most obviously the compiler and linker. But that is not all: the environment the compiler runs in and the build tools we use, like CMake, matter as well.
The environment can mean the operating system, but also other software that can influence the behavior of our build tools. For example, I recently discovered that under MSYS/MinGW, GCC’s __attribute__((packed)) may work differently depending on the version. It also means that GCC itself behaves differently depending on whether we use MinGW or not.
Especially in larger projects, we should define and document at least one reference environment that is to be used for our builds. Individual developers may use their own environments, but in that case, it’s on them if the code works in the reference environment but not in theirs.
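A documented reference environment is easier to enforce when a script can check it. The following is a minimal sketch, not a definitive implementation: the tool list and version numbers in REFERENCE_TOOLS are made-up placeholders, and it assumes that each tool prints its version in response to `--version` and that the first dotted number in that output is the version we care about.

```python
import re
import subprocess

# Hypothetical reference environment: tool name -> required version prefix.
# These numbers are placeholders, not a recommendation.
REFERENCE_TOOLS = {
    "gcc": "13.2",
    "cmake": "3.28",
}

# Heuristic: the first dotted number in the output is taken as the version.
VERSION_PATTERN = re.compile(r"\d+(?:\.\d+)+")


def query_version(tool):
    """Run `<tool> --version` and return everything it printed to stdout."""
    result = subprocess.run(
        [tool, "--version"], capture_output=True, text=True, check=True
    )
    return result.stdout


def matches(required, found):
    """Compare component-wise so that '13.2' accepts '13.2.0' but not '13.20.0'."""
    required_parts = required.split(".")
    return found.split(".")[: len(required_parts)] == required_parts


def check_environment(reference=REFERENCE_TOOLS, query=query_version):
    """Return a list of human-readable mismatches; an empty list means OK."""
    problems = []
    for tool, required in reference.items():
        try:
            output = query(tool)
        except (OSError, subprocess.CalledProcessError):
            problems.append(f"{tool}: not found or failed to run")
            continue
        match = VERSION_PATTERN.search(output)
        found = match.group(0) if match else "unknown"
        if not matches(required, found):
            problems.append(f"{tool}: expected {required}.x, found {found}")
    return problems
```

Calling check_environment() at the start of a build and printing the returned list makes a mismatch visible before it turns into a puzzling compiler error. The query parameter exists only so the check can be exercised without the real tools installed.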
Of course, we want our projects to be portable and work on as many environments as possible. However, having a reference environment and a reproducible build on that environment allows us to determine whether a problem is a genuine bug or a portability issue.
If we want to be portable, we can very well define multiple reference environments. We then need to ensure that we have a reproducible build on all those environments, which brings us to build servers.
Build server environment
These days, collaborative projects are usually built using some kind of build server, e.g. Travis CI or Jenkins. Those are the machines that tell us whether our projects build or not: if the build server is red, it does not matter whether the project builds on my local machine. This is especially true if we use the build server to prepare our releases.
In projects that build on multiple environments on the build server, a reproducible build means that we should be able to reproduce all those environments on a machine where we can debug and analyze the build and the software. We can achieve this most easily if the build server uses Docker containers or virtual machines that can be reproduced on our local machine.
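If the build server runs its steps in Docker containers, the same container can be started locally with the project directory mounted into it. Here is a small sketch of how that might look. The image name and script path in the usage note are hypothetical, and /work is an arbitrary mount point chosen for this example.

```python
import os
import subprocess


def docker_command(image, script, source_dir="."):
    """Build the argument list for running one build script inside a container."""
    source = os.path.abspath(source_dir)
    return [
        "docker", "run",
        "--rm",                            # remove the container afterwards
        "--volume", f"{source}:/work",     # mount the sources into the container
        "--workdir", "/work",              # run the script from the mounted sources
        image,
        script,
    ]


def run_in_container(image, script, source_dir="."):
    """Run the script in the container and return its exit code."""
    return subprocess.run(docker_command(image, script, source_dir)).returncode
```

With an image such as example/build-env:1.0 (a made-up name), run_in_container("example/build-env:1.0", "./ci/compile.sh") would execute the same script in the same environment the build server uses, provided the server is configured with that image.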
Script the build
Many build servers have their own configuration and script files that determine what has to be done to build the software. Usually, it is not easy to run those files on a local machine to reproduce the build there. Therefore it is important that the build server files do not include too many detailed steps, because copying all those lines into our terminal is cumbersome and error-prone.
The solution is to have script files that provide a single, easily-called command for each build step. Each build step on our build server can then consist of four substeps:
- Check out sources from version control
- Restore saved artifacts from previous build steps
- Run the script for this step
- Save artifacts that are needed later
Earlier artifacts should be restored to the same location they were saved from. That makes it easiest to reproduce the build on our local machines: unlike the build server, we usually do not switch contexts between build steps. And if we do, it is easiest to just copy the current state of our directories to that other context. Ideally, we can just run the scripts for all the build steps in sequence.
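Running the step scripts in sequence can itself be a single command. The sketch below assumes three hypothetical steps with made-up script paths under ./ci/; the real step names and scripts depend on the project. Each entry is the same one-line command the build server would call for that step.

```python
import subprocess

# Hypothetical step names and script paths, in build order.
BUILD_STEPS = [
    ("configure", ["./ci/configure.sh"]),
    ("compile", ["./ci/compile.sh"]),
    ("test", ["./ci/test.sh"]),
]


def run_steps(steps, selected=None, run=subprocess.run):
    """Run the selected steps in order, stopping at the first failure.

    `selected` is an optional collection of step names; when it is empty
    or None, all steps run. Returns the names of the steps that succeeded.
    """
    done = []
    for name, command in steps:
        if selected and name not in selected:
            continue
        print(f"==> {name}: {' '.join(command)}")
        result = run(command)
        if result.returncode != 0:
            print(f"step '{name}' failed with exit code {result.returncode}")
            break
        done.append(name)
    return done
```

Locally, run_steps(BUILD_STEPS) runs everything in one go, while run_steps(BUILD_STEPS, selected={"test"}) repeats a single step after a fix. Because the working directory never changes between steps, no artifacts need to be saved or restored.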
If we really need to have the artifacts from previous steps in a different location, we can do so by having a script that copies or moves them from the original location to where they belong. We’ll need that script for our local builds anyway, so we should use it on the build server as well to have a truly reproducible build.
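Such a relocation script can be very small. The following sketch assumes that artifacts are plain files in a directory tree and that the directory names are supplied by whoever calls it; the function name place_artifacts is made up for this example.

```python
import shutil
from pathlib import Path


def place_artifacts(saved_dir, target_dir):
    """Copy everything below saved_dir into target_dir, keeping the layout.

    Returns the copied files as sorted paths relative to saved_dir.
    """
    saved = Path(saved_dir)
    target = Path(target_dir)
    if not saved.is_dir():
        raise FileNotFoundError(f"no saved artifacts at {saved}")
    # dirs_exist_ok (Python 3.8+) allows copying into an existing build tree;
    # files with the same name in the target are overwritten.
    shutil.copytree(saved, target, dirs_exist_ok=True)
    return sorted(
        path.relative_to(saved).as_posix()
        for path in saved.rglob("*")
        if path.is_file()
    )
```

Since the build server and the local build both call the same function with the same arguments, the artifacts end up in the same place in both cases.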