Static code analysis is a cheap way to improve code quality. One might object that tools like Klocwork and Coverity are actually fairly expensive. That objection rests on a common misconception: that static analysis is about tools. It’s not. Sure, good tools help, but there are more important issues than the choice of tool. Doing static code analysis involves several problems, and only some of them can be solved with proper tools.
Problems / Solutions
Problem: Too many problems reported
The number of reported problems is overwhelming, so developers give up and the problems don’t get fixed. Often a large share of the reports are not real problems at all but, for example, style issues or false positives. Good tools help with this, but the truth is that all of them require manual work to tune the results.
Solution: Define baseline and focus only on new problems
Measure the number of problems reported and define that as the baseline. Then make sure that no new problems are introduced. This ensures that the codebase at least doesn’t get worse. Later on it’s possible to tighten the thresholds to get rid of the older reported problems as well.
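To make the baseline idea concrete, here is a minimal sketch in Python. It assumes the analysis results have already been parsed into simple records; the Finding type, its fields, and the sample data are made up for illustration and don’t correspond to any particular tool’s output format.

```python
from collections import Counter
from typing import NamedTuple


class Finding(NamedTuple):
    """One reported problem. The fields are hypothetical, not a real tool format."""
    path: str
    rule: str
    message: str


def new_findings(baseline: list[Finding], current: list[Finding]) -> list[Finding]:
    """Return the findings in `current` that the baseline does not account for.

    Line numbers are deliberately left out of Finding so that unrelated edits
    which shift code up or down do not make old problems look new.
    """
    remaining = Counter(baseline)
    fresh = []
    for finding in current:
        if remaining[finding] > 0:
            remaining[finding] -= 1  # matches a problem we already knew about
        else:
            fresh.append(finding)
    return fresh


# Made-up sample data: two known problems, one newly introduced.
baseline = [
    Finding("net/socket.c", "null-deref", "pointer 'buf' may be NULL"),
    Finding("ui/menu.c", "unused-var", "variable 'tmp' is never read"),
]
current = baseline + [
    Finding("net/socket.c", "leak", "memory allocated to 'conn' is not freed"),
]

fresh = new_findings(baseline, current)
print(len(fresh), fresh[0].rule)
```

Comparing records instead of plain counts is a design choice: with counts alone, fixing one old problem while introducing a new one would go unnoticed.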
Problem: Tools are not used
It doesn’t really help to have the best tools money can buy if nobody uses them.
Solution: Run tools automatically
If you’re using a continuous integration system such as Jenkins, you can run static analysis for every commit. If the analysis takes a long time, run it at least once a day (or night). Make sure all committers are informed when the analysis reveals new problems. In my opinion, new problems found by static analysis should be treated as a stop-the-line situation: stop everything, fix the problems, and only then continue.
You can also take this to the extreme by introducing a pre-receive (or pre-commit) hook in your VCS, so that a commit is rejected if static analysis doesn’t pass. In this case the analysis needs to be really fast.
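As a rough sketch of such a gate, the Python below wraps an analyzer command and turns its result into an exit code that a hook script or CI job could return. It assumes the analyzer signals problems with a non-zero exit status, which varies between tools. The function names, the 30-second default, and the stand-in commands are all my own placeholders.

```python
import subprocess
import sys


def analysis_passes(command: list[str], timeout_seconds: float) -> bool:
    """Run an analyzer command; pass only if it exits with status 0 in time."""
    try:
        result = subprocess.run(
            command,
            capture_output=True,
            text=True,
            timeout=timeout_seconds,
        )
    except subprocess.TimeoutExpired:
        # Too slow for a hook: treat it as a failure so the slowness is noticed.
        print("static analysis timed out", file=sys.stderr)
        return False
    if result.returncode != 0:
        # Pass the analyzer's own report on to the committer.
        sys.stderr.write(result.stdout + result.stderr)
        return False
    return True


def hook_exit_code(command: list[str], timeout_seconds: float = 30.0) -> int:
    """Exit code for a hook script: 0 accepts the commit, 1 rejects it."""
    return 0 if analysis_passes(command, timeout_seconds) else 1


# Stand-in "analyzers" so the sketch runs anywhere; replace with a real command.
clean = [sys.executable, "-c", "raise SystemExit(0)"]
dirty = [sys.executable, "-c", "print('foo.c:1: problem'); raise SystemExit(1)"]

print(hook_exit_code(clean), hook_exit_code(dirty))
```

In a real hook the script would end with something like sys.exit(hook_exit_code(...)). Treating a timeout as a rejection is one possible policy; letting the commit through and deferring to the nightly run is another.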
Problem: Results are not understood
It may well be that the tools produce fine reports pointing out the problems, but if developers don’t understand why something should be fixed, it doesn’t get fixed. A commonly heard argument is “but the code works, why should we fix it?”
Solution: Educate developers, pre-process the results automatically
In the worst case you need to do what was suggested in the previous solution: declare a stop-the-line situation when analysis reveals problems, or even reject commits to the VCS. I’ve noticed that one of the most effective ways is simply to show and tell: show developers a recently found problem, point out the source code location, and explain what’s wrong with the code. For example, complex code usually doesn’t look “wrong”, but if you try to refactor it, it turns out to be difficult. Likewise, if code is not covered by unit tests, retrofitting them is not easy.
It makes sense to pre-process the results: don’t show raw results to developers, but first filter out non-problems and possibly also old problems. The simpler the report, the better.
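One way such pre-processing could look is sketched below in Python. The category labels, the ignore list, the false-positive list, and the sample findings are all invented for the example; a real setup would take them from the tool’s own configuration and from the team’s triage decisions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    path: str
    line: int
    category: str  # hypothetical labels such as "style", "bug", "security"
    message: str


# Both sets are illustrative; in practice the team decides what goes in them.
IGNORED_CATEGORIES = {"style"}
KNOWN_FALSE_POSITIVES = {("gen/parser.c", "bug")}  # (path, category) pairs


def digest(findings: list[Finding], limit: int = 5) -> list[str]:
    """Drop non-problems and return at most `limit` short, sorted report lines."""
    kept = [
        f for f in findings
        if f.category not in IGNORED_CATEGORIES
        and (f.path, f.category) not in KNOWN_FALSE_POSITIVES
    ]
    kept.sort(key=lambda f: (f.path, f.line))
    return [f"{f.path}:{f.line}: [{f.category}] {f.message}" for f in kept[:limit]]


raw = [
    Finding("src/b.c", 10, "bug", "possible null dereference"),
    Finding("src/a.c", 42, "style", "line too long"),
    Finding("gen/parser.c", 7, "bug", "unreachable code"),
    Finding("src/a.c", 3, "security", "unchecked buffer length"),
]

for line in digest(raw):
    print(line)
```

The limit is there on purpose: a short list that gets read beats a complete list that gets ignored.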
Problem: Tools are used by the wrong people
Some companies have solved some of the previous problems by dedicating a separate team to code analysis. Unfortunately, such a team may be capable only of finding the problems, not of fixing them. Sure, having a number of people point out the problems that should be fixed might be more effective than nothing, but it’s still likely the problems will not get fixed.
Solution: Automatic analysis, everyone responsible for fixing problems
A basic rule of thumb that works for several of these issues is “if you break it, you fix it”. Making problems visible through automatic analysis and rapid feedback makes it more likely that they get fixed. With the tools available it is fairly trivial to point out who caused the codebase to become worse. If nothing else helps, public humiliation is the last option to try. That was a joke, of course; being constructive is always really important.
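In that constructive spirit, new findings can be grouped by the person who last touched the offending line, so that everyone gets only their own short list. The Python sketch below assumes you already have a mapping from (path, line) to author, for example built from your VCS’s blame output. The function name, the mapping, and the sample data are hypothetical.

```python
from collections import defaultdict


def findings_by_author(
    new_findings: list[tuple[str, int, str]],
    last_author: dict[tuple[str, int], str],
) -> dict[str, list[str]]:
    """Group (path, line, message) findings under whoever last changed that line."""
    grouped: dict[str, list[str]] = defaultdict(list)
    for path, line, message in new_findings:
        # Lines missing from the blame data end up under "unknown".
        author = last_author.get((path, line), "unknown")
        grouped[author].append(f"{path}:{line}: {message}")
    return dict(grouped)


# Made-up findings and blame data for illustration.
found = [
    ("net/socket.c", 88, "memory leak"),
    ("ui/menu.c", 12, "unused variable"),
    ("net/socket.c", 120, "null dereference"),
]
blame = {
    ("net/socket.c", 88): "alice",
    ("ui/menu.c", 12): "bob",
}

for author, items in findings_by_author(found, blame).items():
    print(author, items)
```

Sending each person a private list rather than publishing a ranking keeps the feedback rapid without turning it into blame.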
So what kinds of static source code analysis tools and methods are out there? I’ll say something about the ones we are using in the next post. Stay tuned; meanwhile you can, for example, run some static analysis.