Compiler Warnings: Use Them, Don't Trust Them

Wed, 10/25/2017 - 22:13

Turning On All Warnings Is Definitely a Good Thing

Most compilers provide useful warning messages that flag constructs that may not correspond to the intentions of the programmer. In most environments where code quality and low defect rates are important, a rule requiring the code to compile without warnings when all the compiler warnings are enabled is increasingly being enforced. The rationale is that most compiler warnings flag code that is truly problematic; when this is not the case, the compiler is confused by the flagged section of code because it is extremely convoluted and, as such, there is a good chance that it will also confuse fellow programmers. In both cases, the code should be modified.

Turning On All Warnings Is Not Enough

The fact that more and more people compile with all warnings enabled is definitely a good thing. However, a significant number of developers seem to assume that, since the compiler can detect some instances of issue X, the compiler must be able to detect all instances of issue X. Unfortunately, this is false more often than it is true. There are several intertwined reasons that compound to that effect: some are of a technological nature, others are sociological.

Technological Reason: Compilers Issue Warnings Because They Can

C/C++ compilers, at least since the DEC C compiler, have traditionally provided warnings because they already have some of the information required to produce many of them: they build and visit abstract syntax trees and, for the purpose of optimization, they perform static analysis of the code. Thus, incorporating some diagnostic facilities into compilers as a byproduct of these other activities is a natural thing to do.

Sociological Reason: Compilers Have To Be Fast

Compilers have to be fast, because people using them want them to be fast. This is only partly justifiable, but it is a fact (too many developers recompile their code very often due to uncertainty about the language syntax and semantics; a deeper knowledge of the language would result in less recompilation and increased productivity). The compiler can be excused for longer compilation times only when optimizations are turned on. For this reason, optimizations are usually turned off during large parts of the development (and, in some industry sectors, they are always turned off).

Technological Reason: Optimizations and Analyses Are Optimized

When optimizations are turned off, the corresponding static analyses are also turned off for the sake of compilation speed. Hence, when optimizations are turned off, the information needed to provide high quality warnings is in scarce supply. When optimizations are turned on, more static analyses are active and more information is available. However, compilation time is an issue also when optimizing code, and this has two consequences:

  1. optimization is geared towards high-rewarding opportunities that tend to occur often in real code;
  2. static analyses that enable such optimizations are complexity-throttled in the same way: the analysis targets the very same opportunities and the algorithms employed are quick to give up when, based on some heuristics, such opportunities are unlikely to be present.

Summarizing: optimization turned off means little information; optimization turned on means more information according to some heuristics whose objective is achieving a good complexity/precision trade-off from the point of view of optimization, not from the point of view of the consistent generation of high-quality warnings.

When information is insufficient, any algorithm for the generation of diagnostics will have false positives (warnings are given for instances that are not problematic), false negatives (warnings are not given for instances that are problematic), or both. Moreover, when the information is insufficient (and excluding buggy algorithms), reduction of the number of false positives is demonstrably only possible by increasing the number of false negatives and the other way around.
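As a minimal sketch of this trade-off, consider the hypothetical function below (the name first_or_default and its parameters are illustrative, not taken from any real code base). The variable value is assigned only when have_input is nonzero and is read only under the same condition, so no uninitialized read can occur. An analysis that does not correlate the two tests has to choose: warn here (a false positive), or stay silent on code of this shape, including variants where the second test differs from the first (a false negative). Whether a particular compiler warns on this code depends on its version and options.

```c
#include <stddef.h>

/* Hypothetical example: returns the first element of buf if there is one,
   and fallback otherwise. */
int first_or_default(const int *buf, size_t len, int fallback) {
    int value;                                  /* deliberately not initialized */
    int have_input = (buf != NULL && len > 0);

    if (have_input) {
        value = buf[0];                         /* the only assignment */
    }

    /* Unrelated work could sit here, separating the two tests. */

    if (have_input) {
        return value;                           /* read guarded by the same test */
    }
    return fallback;
}
```

Calling first_or_default with a non-empty buffer returns its first element; calling it with a null pointer or a zero length returns fallback, and value is never read on that path.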

Sociological Reason: False Positives Make for Angry Developers

Developers do not tolerate false positives. This is only partly justifiable, but it is a fact (developers should pay more attention to the consequences of false negatives). This is what happens as a result: a false positive in some project generates complaints; compiler developers tweak the algorithm to suppress it; more often than not this results in false negatives. If you follow the development of GCC and CLANG (as the author has done since their inception) you will see this pattern at work (if you have a few hours to spare, you can see for yourself: "git clone ; cd clang ; git log -p | less", then search for keywords like "suppress"). Or you can try some simple experiments: you will see that warnings may or may not be given depending on factors that have nothing to do with the existence of a problem, such as whether part of the syntax involved comes from macro expansion or whether the problematic construct is within a loop. An example that caught my attention recently is the following:

$ cat r.c
#define FORCED_STOP (0)

static int state[2][2];

static void foo(int action) {
   const int stop = FORCED_STOP;
   if (action == 2) {
     for (;;) {
       if (stop || !state[0]) {
         break;
       }
     }
   }
}

int main() {
   return 0;
}
$ gcc -O3 -W -Wall -Wunreachable-code r.c
$ clang-4.0 -O3 -W -Wall -Weverything -Wunreachable-code r.c
$ eclair_env -config=MC3.R2.1,enabled -- gcc r.c
r.c:10.9-10.13: violation for rule MC3.R2.1 (A project shall not contain unreachable code.) Loc #1 [culprit: `break' statement is unreachable]

GCC and CLANG report many instances of unreachable code, but they fail to report many other instances (even though, in this specific case, their optimizers know what is going on: you will see it if you examine the produced assembly code). The same is true for many other warnings.
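Why the `break' is unreachable can be checked directly. The sketch below isolates the guarding condition from r.c in a function with a hypothetical name, guard_condition. The second function, element_condition, is a made-up variant added only for contrast: it tests an array element rather than the array itself, and no claim is made that this is what the original code intended.

```c
#define FORCED_STOP (0)

static int state[2][2];

/* Value of the condition that guards the break in r.c. */
int guard_condition(void) {
    const int stop = FORCED_STOP;
    /* state[0] has type int[2]; used as an operand of '!', it is converted
       to a pointer to its first element. That pointer is never null, so
       !state[0] is 0, and stop is 0 as well. */
    return stop || !state[0];
}

/* Hypothetical variant for contrast: tests an element, not the array. */
int element_condition(void) {
    const int stop = FORCED_STOP;
    /* state has static storage duration, so state[0][0] starts at 0 and
       !state[0][0] is 1 until something stores a nonzero value there. */
    return stop || !state[0][0];
}
```

guard_condition yields 0 regardless of the contents of state, which is why the body of the `if' can never execute; element_condition depends on the data. A compiler may or may not diagnose the constant condition in guard_condition, depending on its version and options.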


Quoting G. J. Holzmann, "The Power of 10: Rules for Developing Safety-Critical Code", IEEE Computer, 39(6):95-97, 2006:

All code must be compiled, from the first day of development, with all compiler warnings enabled at the compiler’s most pedantic setting. All code must compile with these setting without any warnings. [...] There simply is no excuse for any software development effort not to make use of this readily available technology.

However, assuming compilers can consistently cover well-defined classes of potential defects is a big mistake: they simply have not been designed for that. Even compilers that claim to (partially) cover the MISRA standards do so with many false positives and tons of false negatives. In contrast, a well-designed static analysis tool will unambiguously define what it is checking (e.g., MISRA C:2012), the precision guarantees it provides for each rule (e.g., no false negatives and/or no false positives) and will be consistent with such definitions.

In conclusion, on the theme of warning messages, my final advice to those who develop code with safety and/or security concerns is the following: enable all compiler warnings and write code that triggers none of them; in addition, select a well-defined and well-maintained coding standard (e.g., one of the MISRA ones) and a state-of-the-art static source code analyzer that properly supports it; base systematic code reviews on the findings of the tool, making sure that each report is carefully considered and dealt with, either by changing the code or by filing a deviation.

Update (August 24th, 2016)

I have been asked why I mention MISRA in the article. I do so because one frequently occurring misconception is that enabling warnings can substitute for proper language subsetting with respect to a well-defined coding standard. This is false for at least two reasons; in fact, the set of conditions covered by the warnings generated by a given version of a given compiler with a certain set of compilation options:

  1. is not well defined (this is the point of the present article);
  2. covers only a fraction of the conditions that are covered by the MISRA standards, which, in my not-so-humble opinion, are by far the best coding standards in existence and a must for whoever develops C/C++ code whose proper functioning matters.