In the last two posts I have written about compiler warnings. There’s much more to getting hints about code smells and potentially problematic pieces of code than that. That’s the job for a static analyzer.
Have you refactored your code yet so you don’t get any compiler warnings any more? Good. Have you also tuned your compiler to give you a proper set of warnings that you care about? Very good. I hope you don’t think you’re done now. There’s more, much more.
What is a static analyzer?
A compiler’s job is to – erm – compile your code. It does that very well. It parses the code, translates it to some internal representation and analyzes that representation to further translate it into some intermediate, much simpler language. It then optimizes the hell out of that intermediate language if you told it to, and then as a last step translates the outcome to machine code.
In practice that process is much more involved; for example, I left linking completely out of the picture. However, I am not writing an essay about compilers (an extremely interesting topic, though), so these few sentences should do.
Why am I telling you this? Well, I have written about warnings emitted by the compiler in the last posts. The C++ standard does not require compilers to emit warnings for valid code, but they do so anyway. They do so because it’s helpful for developers and it’s not a big deal to implement. The warnings are a byproduct of simple checks during the analysis of the internal representation.
In other words, the warnings we get from compilers are mostly a byproduct of the work a compiler has to do anyway to get the job done. A much more thorough analysis of the internal representation is possible, but that is outside the scope of a compiler. That is why there is another class of tools, called static analyzers.
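To make that a bit more concrete, here is a minimal sketch of the kind of problem a compiler notices almost for free. The names `scale`, `leftover` and `check_scale` are made up for illustration. The code is perfectly valid C++, but GCC and Clang will typically point out the unused variable when you compile with `-Wall`.

```cpp
#include <cassert>

// Hypothetical function, for illustration only.
int scale(int value) {
    int factor = 2;
    int leftover = 7;  // never read again: with -Wall, GCC and Clang
                       // typically report this as an unused variable
    return value * factor;
}

// Tiny self-check for the sketch above.
void check_scale() {
    assert(scale(21) == 42);
}
```

Spotting this needs nothing more than a look at a single function body, which is why it falls out of the compiler's normal work.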
Like a compiler, a static analyzer parses the code and performs syntactic analysis to build an internal representation. That representation may look different, because the two tools have different goals, but it may also be the same. For example, the Clang static analyzer reuses parts of the Clang compiler to get there.
A static analyzer then does its main job on that representation: it analyzes it and looks for code smells and potential problems. Do you access elements past the end of an array? Do you check if a pointer is null after you assigned a non-null value to it? There can be hundreds, even thousands of different checks. The analysis can be limited to a small scope like a single function, but there are also tools that check the program as a whole.
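The two questions above can be sketched in a few lines of C++. This is only an illustration: the function names are hypothetical, and whether a particular analyzer reports these exact patterns depends on the tool and its configuration. The out-of-bounds variant is only described in a comment, so the snippet itself stays well-defined.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical helper: adds up all elements of a fixed-size array.
int sum_elements(const std::array<int, 4>& values) {
    int total = 0;
    // The off-by-one variant `i <= values.size()` would read values[4],
    // one element past the end. That is the kind of access an analyzer
    // is meant to catch. The loop below is the corrected form.
    for (std::size_t i = 0; i < values.size(); ++i) {
        total += values[i];
    }
    return total;
}

// Hypothetical helper with a redundant null check.
int read_fallback() {
    int fallback = 42;
    int* p = &fallback;   // p now holds a non-null address
    if (p == nullptr) {   // always false, so this branch is dead code
        return -1;
    }
    return *p;
}

// Tiny self-check for the two helpers.
void check_examples() {
    const std::array<int, 4> values = {1, 2, 3, 4};
    assert(sum_elements(values) == 10);
    assert(read_fallback() == 42);
}
```

Neither pattern stops the code from compiling, which is exactly why a separate tool that reasons about values and control flow earns its place.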
Why should we use a static analyzer?
You should use static analyzers pretty much for the same reasons as for the compiler warnings: They can point you at potential bugs, unnecessary code and more. Like the compiler, you can usually tune them to emit only the warnings you are interested in (the more the better).
If you still doubt the benefit of such a tool, have a look at the PVS-Studio blog. The people behind PVS-Studio regularly pick an open source project and run their static analyzer against its code. In any sufficiently large code base the tool finds enough warnings and nasty little bugs to fill long blog pages.
Compiler warnings only point at very basic problems. Use a static analyzer to find more complex issues.
… or two?
There are differences between the static analyzers available. They may focus on different categories of problems or just implement their checks differently. So, having one static analyzer is good. Having two is even better, since the second can find stuff the first does not check.
Of course it’s a tradeoff: there is yet another tool in your build chain to manage, and your builds may take longer, even though you can run static analysis and compilation in parallel build jobs. However, you will also have even more guards against awkward errors.