Every now and then we have to change something in our build procedure, and more often than not those changes are a real pain.
Build scripts are the stepchildren of many software projects. Someone wrote them, nobody cares much about them, they just work and build the project. Until they don’t.
Be it because you have a new artifact that needs to be included in the build somehow, or because your build decides to stop working for curious reasons, the day will come when you have to open that build script in an editor and stare in awe at the gibberish that lies before you.
Know your build language
Whether it is a simple shell script, a makefile, or a bunch of XML files for MSBuild, Maven or other build systems that control how your precious code is transformed into something executable, you have to know the syntax and semantics.
I am not saying you have to know it by heart and as well as your main programming language, but you should be able to read what it does and to google for the right keywords if you need to add something.
Not knowing the build language means not being in control of the build. In extreme situations that means you are not able to build – and guess how much your customers value a shipment of a bunch of source files instead of the real thing.
Make sure your team knows the language as well. There should not be just one or two people who know the build. Everyone should, because the build is crucial for everyone. Like any code, the build files should be owned by the team, not by individuals.
That means, if you are to set up a new build system, make it a team decision which language to use. Don’t just whip up a Rake build because you heard it’s in these days. You don’t want to be the one stuck with maintaining the build when you could be working on the cool stuff.
If you already have a build system in place, consider reusing it for new builds as well.
Build scripts are code
You may hope that you don’t have to touch your build files as often as your usual source files. Nevertheless you will have to maintain or debug them some day, and you’d surely like that maintenance to go as smoothly and quickly as possible. Therefore you should treat them like any other code.
Use clean code principles
As far as possible, you should use meaningful names and a good structure in your build script. Nobody wants to sift through an unformatted 300-line DOS shell script with variable names AA to BM.
Use variables if the language allows it. Having a single block of configuration variables in the script allows you to easily change that configuration without having to skim the whole file in search of occurrences of some paths.
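To illustrate the idea of a single configuration block, here is a minimal sketch of a build script written in Python. All names and values (the directories, the compiler, the flags, the `compile_command` helper) are assumptions made up for this example, not taken from a real project.

```python
from pathlib import Path

# --- Configuration: every path and option the build uses lives here. ---
# All names and values below are hypothetical placeholders.
PROJECT_ROOT = Path(".")
SOURCE_DIR = PROJECT_ROOT / "src"
OUTPUT_DIR = PROJECT_ROOT / "build"
COMPILER = "g++"
COMPILER_FLAGS = ["-std=c++17", "-Wall", "-O2"]


def compile_command(source: Path) -> list[str]:
    """Build the compiler invocation for one source file from the configuration."""
    object_file = OUTPUT_DIR / source.with_suffix(".o").name
    return [COMPILER, *COMPILER_FLAGS, "-c", str(source), "-o", str(object_file)]


if __name__ == "__main__":
    # Print the command instead of running it, so the sketch has no side effects.
    print(" ".join(compile_command(SOURCE_DIR / "main.cpp")))
```

Switching the compiler or moving the output directory then means touching one line at the top of the file, not hunting for paths scattered through the script.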
Break up the script and modularize it if that makes sense. Parts of your build scripts can often be reused in other locations, e.g. in different modules of the same project. The DRY principle applies to build scripts as well as to other code.
Build system languages often are not as expressive as general purpose languages and therefore lack some of the facilities that allow us to clearly cast our intent in code. In addition, readers usually are not too fluent in the language, so the intent of a given piece of build code may not be clear to everyone. Therefore make use of comments where needed to clarify, but take special care to not let the comments lie to the reader.
Check your scripts into source control
There is no reason why you would not want to check in your build scripts. There are tons of reasons why you would. People might not be as fluent in the build script language as they are in other languages, so it’s good to be able to roll back if changes don’t turn out the way you thought.
Checking in the build script alongside the application code also allows new programmers to get productive faster, and it allows everyone in the team to adapt easily to changed package structures, added dependencies and the like. It basically allows everyone to immediately start a build and the accompanying tests right after checking out the project.
If you have a build server like Jenkins that allows you to define shell commands, don’t be tempted to write a lengthy script right on that server. Instead, make it one or two lines that call a script that does the work and has been checked into version control.
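As a rough sketch of what such a checked-in script might look like, the following Python file acts as a single entry point for the build. The file name `build.py`, the step names and the placeholder commands are all assumptions for illustration. A real script would call your compiler and test runner.

```python
import argparse
import subprocess
import sys

# Hypothetical build steps; replace the commands with whatever your project needs.
STEPS = {
    "compile": [sys.executable, "-c", "print('compiling...')"],
    "test": [sys.executable, "-c", "print('running tests...')"],
}


def run_steps(names):
    """Run the named steps in order and stop at the first failure."""
    for name in names:
        result = subprocess.run(STEPS[name])
        if result.returncode != 0:
            return result.returncode
    return 0


def main(argv=None):
    parser = argparse.ArgumentParser(description="Single entry point for the build")
    parser.add_argument("steps", nargs="*", help="steps to run (default: all)")
    args = parser.parse_args(argv)
    names = args.steps or list(STEPS)
    unknown = [name for name in names if name not in STEPS]
    if unknown:
        parser.error("unknown step(s): " + ", ".join(unknown))
    return run_steps(names)


if __name__ == "__main__":
    sys.exit(main())
```

The command field on the build server would then hold a single line such as `python build.py compile test`, and a developer can run exactly the same line locally.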
Have one build only
It can be tempting to have two different build scripts: one for local builds in the development environment and another for the nightly build/CI server. Typically in such a situation people use a plain shell script or makefile on the server, while they rely on the built in facilities of their IDE for local builds.
This leads to double maintenance, and bugs in the build may only be reproducible on one of the systems. Therefore it is better to have the same script controlling both local and server side builds.
To achieve that, ideally you either get the IDE to execute the script used on the build server, or use a build system on the server that understands the IDE’s project files. Calling the build script from the command line instead of from the IDE for local builds is an inferior option, since it is less comfortable and may even result in an executable that the IDE cannot debug or is not aware of.
If those options are not available, the next best thing can be to either generate one build script from the other or have a single source and generate both of the build scripts from it.
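The single-source variant could be sketched like this: one project description from which both build files are rendered. The `PROJECT` dictionary, the function names and especially the XML layout are assumptions. The XML is a made-up stand-in, not the format of any real IDE.

```python
from xml.sax.saxutils import quoteattr

# Single source of truth (hypothetical project description).
PROJECT = {
    "name": "demo",
    "sources": ["src/main.cpp", "src/util.cpp"],
}


def render_makefile(project):
    """Render a minimal makefile for the server-side build."""
    sources = " ".join(project["sources"])
    return (
        f"SOURCES = {sources}\n"
        f"{project['name']}: $(SOURCES)\n"
        "\t$(CXX) -o $@ $(SOURCES)\n"
    )


def render_ide_project(project):
    """Render a made-up XML project file standing in for an IDE format."""
    items = "\n".join(
        f"  <File path={quoteattr(source)}/>" for source in sorted(project["sources"])
    )
    return f"<Project name={quoteattr(project['name'])}>\n{items}\n</Project>\n"


if __name__ == "__main__":
    print(render_makefile(PROJECT))
    print(render_ide_project(PROJECT))
```

Because both files are derived from the same list, adding a source file in one place keeps the local and the server-side build in step.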
“Build scripts are code” is the reason I don’t use IDEs for building code at all. I’m also a bit shocked to hear that there are people not checking their build scripts into version control. It’s like there are little pockets of software development the internet still hasn’t reached.
The most common pitfall I see is people setting up their build with a whitelist of source files, so that every time a source file gets added or deleted the build script must also be changed. The biggest problem there is that people are lazy – adding even a slight overhead toward creating a new file is going to have them adding that new function to an existing file and voila you have 1000-line files instead of 100-line files. The second problem with that is removing files from the build system but not version control, so dead code can sit there inert for (not exaggerating) several years repeatedly misleading people who are looking for this or that somewhere in the codebase. The least important of these problems, but the one that is most frustrating for me because it bites me so very frequently in these setups, is the reverse – adding a new file and forgetting to add it to the build script. Depending on the setup you might get a linker error or you might actually get a final build that just doesn’t behave properly.
And it’s a shame because pretty much every build system worth a spit has globbing or filesets in one form or another, but as you pointed out people tend not to learn their build language – they copy paste as much as needed to get by.
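For illustration, a glob-based source list in a Python build script could look roughly like the sketch below. The function name `collect_sources` and the `*.cpp` pattern are assumptions. Most build systems offer an equivalent of their own.

```python
from pathlib import Path


def collect_sources(source_dir, pattern="*.cpp"):
    """Return every file under source_dir matching pattern, in a stable order."""
    return sorted(Path(source_dir).rglob(pattern))


if __name__ == "__main__":
    # Lists whatever matches under ./src; new files are picked up automatically.
    for source in collect_sources("src"):
        print(source)
```

With this in place, a newly added file takes part in the build without any edit to the script, and a deleted file drops out just as quietly.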
I have experienced all of the problems you describe myself. A variant is adding a new file in the IDE which maintains a whitelist, but failing to check in the changed IDE project file. This was a common problem in the last project I worked on, because on the one hand the VCS was too rigid, applying a strict lock on all files, and on the other hand the IDE wanted to touch, randomly reorder (!) and save the project files whenever you just thought of opening one of the files.
As a result we had a script that removed the senseless locks to allow the IDE to apply its useless reorderings, and by default we just ignored any changed project files, which led to build failures if there actually was a change.
God dammit, my legacy C++ Builder project suffers from this useless reordering. I do use version control but I just ignore whatever the hell the IDE does and commit changes. When something happens I can at least diff things.
Guess what – it was the C++ Builder 2010 IDE where we had the problem. I quit that job a while ago, otherwise I would have tried to write a script that sorts those elements in order to have a meaningful diff.