This is the start of a small series in which I write up a talk I gave earlier this year at several conferences and user groups.
The title of the original talk is “Bringing Clean Code to Large Scale Legacy C++ Applications”. It was never recorded, at least not in a satisfactory format, and no recording is publicly available. So I am writing down the essence to share it with a broader audience.
What this is about
I am writing about experience I gathered in past projects where I had to deal with this kind of application. Actually, the whole blog emerged from these experiences. While it has grown beyond that initial scope, it all started at a time when I had to fight my way through this kind of application.
“Legacy code” is a rather vague term and can mean all sorts of things. There’s a quote by Bjarne Stroustrup:
“Legacy code” is a term often used derogatorily to characterize code that is written in a language or style that the speaker/writer consider outdated and/or is competing with something sold/promoted by the speaker/writer.
“Legacy code” often differs from its suggested alternative by actually working and scaling.
I concur with the outdated style. By this I do not mean C++98. It is perfectly OK to use the old standard, e.g. if you are stuck with a compiler that does not support the new standards. I mean last-century C++ style, like “C with classes” and Java written for the C++ compiler.
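To make the difference between standard and style a bit more tangible, here is a minimal sketch. The two functions and their names (join_names_old, join_names) are made up for illustration and are not taken from any real project. Both versions avoid newer language features, so they should compile as C++98. Only the style differs: the first manages a raw buffer by hand and leaves the cleanup to the caller, the second lets the standard library own the memory.

```cpp
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

// "C with classes" style (hypothetical example): raw pointers, manual
// size bookkeeping, and the caller has to remember to delete[] the result.
char* join_names_old(const char* const* names, int count) {
    std::size_t total = 1;  // room for the terminating '\0'
    for (int i = 0; i < count; ++i) {
        total += std::strlen(names[i]) + 1;  // name plus one separator
    }
    char* result = new char[total];
    result[0] = '\0';
    for (int i = 0; i < count; ++i) {
        if (i > 0) {
            std::strcat(result, ",");
        }
        std::strcat(result, names[i]);
    }
    return result;
}

// Idiomatic C++ style (hypothetical example): std::string and std::vector
// own their memory, so there is nothing for the caller to clean up.
std::string join_names(const std::vector<std::string>& names) {
    std::string result;
    for (std::size_t i = 0; i < names.size(); ++i) {
        if (i > 0) {
            result += ',';
        }
        result += names[i];
    }
    return result;
}
```

Both functions produce the same comma-separated text. The second one is shorter, cannot leak, and does not force every call site to know who owns the buffer.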
And of course I mean messy code. I am guilty: I promote writing clean and readable code. For me, legacy code means code that has grown historically over several years, if not decades. Code that looks as if it has never seen any serious refactoring.
As for “large scale”: a code base that has been maintained for a decade or more, and where copy&paste is a standard approach, can easily grow very large.
If you have not read the book “Clean Code” by “Uncle Bob” Martin, you should. It’s a classic and on every must-read top 10 list for programmers I have ever seen. While the code used in the book is Java, pretty much all of its principles are language independent and apply to C++ as well.
People sometimes say that C++ is too complicated and messy to write clean code in. There are cases where we have to write ugly code, even with the newer standards, but in general I disagree. More often than not, this is an excuse for being lazy about the code.
Other people (or often the same) say that they can’t write clean code because they have to write for performance. Don’t get me started on that topic. Use a profiler and understand the optimizer, and you’ll still be able to write clean code most of the time if you try.
The bottom line is:
Bringing Clean Code to Large Scale Legacy C++ Applications means fighting for maintainability and against code rot in a sea of old C++ code, usually while simultaneously fixing bugs and adding new features.
It’s a team game
Let’s face it: You can’t fight a dragon alone. Cleaning up a large code base like that is an extremely hard task at best. Doing it alone while the rest of the team is happily hacking out new code in the old style is impossible.
A team can write new legacy code faster than a single person can clean it up. This may sound harsh, but think about it for a second: If the code base is messy today, there is a reason for it. The reason is that the team let it happen in the past.
So to tackle a task like this we have to get the team on board. Getting everyone to agree that the code is a mess and has to be changed can be a challenge. Not everyone likes to admit that they have been writing bad code for the past years.
If the boss simply orders the cleanup, it is unlikely to yield lasting results. People fall back into old habits, so they have to really want the change.
Train the team
The reason for the existing bad code can be anything. People may not know how to write clean code. They might hold on to old myths about suboptimal code, or they may not know the language and libraries well enough to use them to their full potential.
It might be just a habit to write quick and dirty, taken up in a period where there was “no time” for writing clean and – in the end – equally quick code. The solution usually is training.
Training can come in many forms. Pair programming and code reviews can help to pass knowledge between team members. You can run training sessions and workshops on clean code and modern C++, lasting from one hour to several days. They can be held by external trainers or by colleagues.
Coding dojos are also a good opportunity to practice a bit of green field development. Setting up a clean new project can give new insights more easily than imagining a clean version of that messy class you have been hacking on for years.
In the end, one of the most important things is to build an awareness of what led to the legacy code in the past. Without this awareness it is too easy to fall back into old habits and further mess up the code base.
As I wrote above, not everyone likes to admit that they might have written bad code in the past. So there can be considerable resistance from developers against a cleanup project.
There can also be resistance from management and other stakeholders, but we can often convince them with numbers. Nobody has anything against saving some money, if you can show them that cleaning up the code base is worth the time in the long run.
Resistance by developers is often more critical. Sometimes it is harder to convince them of the necessary steps, and it is more crucial to have every single developer on your side.
This may sound a little dramatic, but trust me. If one in ten developers does not play along, that is a serious roadblock. Cleaning up a really messy class can take considerable time, but that single resisting developer can copy&paste the messy class in minutes, leaving you with no less work than before and a profound feeling of frustration.
So, now that we have the team on our side, trained and prepped for the big task, we are ready for the next step: The next post in this mini series will be about refactoring that large and ugly code base.