<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
 "http://www.w3.org/TR/html4/strict.dtd">
<html>
<head>
 <meta http-equiv="content-type" content="text/html; charset=iso-8859-1">
 <title>Clang - Expressive Diagnostics</title>
 <link type="text/css" rel="stylesheet" href="menu.css">
 <link type="text/css" rel="stylesheet" href="content.css">
 <style type="text/css">
 .warn { color:magenta; }
 .err { color:red; }
 .snip { color:darkgreen; }
 .point { color:blue; }
 </style>
</head>
<body>

<!--#include virtual="menu.html.incl"-->

<div id="content">


<!--=======================================================================-->
<h1>Expressive Diagnostics</h1>
<!--=======================================================================-->

<p>In addition to being fast and functional, we aim to make Clang extremely user
friendly. As far as a command-line compiler goes, this basically boils down to
making the diagnostics (error and warning messages) generated by the compiler
be as useful as possible. There are several ways that we do this. This section
talks about the experience provided by the command line compiler, contrasting
Clang output to GCC 4.2's output in several examples.
<!--
Other clients
that embed Clang and extract equivalent information through internal APIs.-->
</p>

<h2>Column Numbers and Caret Diagnostics</h2>

<p>First, all diagnostics produced by clang include full column number
information. The clang command-line compiler driver uses this information
to print "point diagnostics".
(IDEs can use the information to display in-line error markup.)
Precise error location in the source is a feature provided by many commercial
compilers, but is generally missing from open source
compilers.
This is nice because it makes it very easy to understand exactly
what is wrong in a particular piece of code.</p>

<p>The point (the blue "^" character) exactly shows where the problem is, even
inside of a string. This makes it really easy to jump to the problem and
helps when multiple instances of the same character occur on a line. (We'll
revisit this more in the following examples.)</p>

<pre>
 $ <b>gcc-4.2 -fsyntax-only -Wformat format-strings.c</b>
 format-strings.c:91: warning: too few arguments for format
 $ <b>clang -fsyntax-only format-strings.c</b>
 format-strings.c:91:13: <span class="warn">warning:</span> '.*' specified field precision is missing a matching 'int' argument
 <span class="snip"> printf("%.*d");</span>
 <span class="point"> ^</span>
</pre>

<h2>Range Highlighting for Related Text</h2>

<p>Clang captures and accurately tracks range information for expressions,
statements, and other constructs in your program and uses this to make
diagnostics highlight related information. In the following somewhat
nonsensical example you can see that you don't even need to see the original source code to
understand what is wrong based on the Clang error. Because clang prints a
point, you know exactly <em>which</em> plus it is complaining about. The range
information highlights the left and right sides of the plus, which makes it
immediately obvious what the compiler is talking about.
Range information is very useful for
cases involving precedence issues and many other cases.</p>

<pre>
 $ <b>gcc-4.2 -fsyntax-only t.c</b>
 t.c:7: error: invalid operands to binary + (have 'int' and 'struct A')
 $ <b>clang -fsyntax-only t.c</b>
 t.c:7:39: <span class="err">error:</span> invalid operands to binary expression ('int' and 'struct A')
 <span class="snip"> return y + func(y ? 
((SomeA.X + 40) + SomeA) / 42 + SomeA.X : SomeA.X);</span>
 <span class="point"> ~~~~~~~~~~~~~~ ^ ~~~~~</span>
</pre>

<h2>Precision in Wording</h2>

<p>Another detail is that we have tried really hard to make the diagnostics that come
out of clang contain exactly the pertinent information about what is wrong and
why. In the example above, we tell you what the inferred types are for
the left and right hand sides, and we don't repeat what is obvious from the
point (e.g., that this is a "binary +").</p>

<p>Many other examples abound. In the following example, not only do we tell you that there is a problem with the *
and point to it, we say exactly why and tell you what the type is (in case it is
a complicated subexpression, such as a call to an overloaded function). This
sort of attention to detail makes it much easier to understand and fix problems
quickly.</p>

<pre>
 $ <b>gcc-4.2 -fsyntax-only t.c</b>
 t.c:5: error: invalid type argument of 'unary *'
 $ <b>clang -fsyntax-only t.c</b>
 t.c:5:11: <span class="err">error:</span> indirection requires pointer operand ('int' invalid)
 <span class="snip"> int y = *SomeA.X;</span>
 <span class="point"> ^~~~~~~~</span>
</pre>

<h2>No Pretty Printing of Expressions in Diagnostics</h2>

<p>Since Clang has range highlighting, it never needs to pretty print your code
back out to you. GCC can produce inscrutable error messages in some cases when
it tries to do this.
In this example P and Q have type "int*":</p>

<pre>
 $ <b>gcc-4.2 -fsyntax-only t.c</b>
 #'exact_div_expr' not supported by pp_c_expression#'t.c:12: error: called object is not a function
 $ <b>clang -fsyntax-only t.c</b>
 t.c:12:8: <span class="err">error:</span> called object type 'int' is not a function or function pointer
 <span class="snip"> (P-Q)();</span>
 <span class="point"> ~~~~~^</span>
</pre>

<p>This can be particularly bad in G++, which often emits errors
 containing lowered vtable references. For example:</p>

<pre>
 $ <b>cat t.cc</b>
 struct a {
   virtual int bar();
 };

 struct foo : public virtual a {
 };

 void test(foo *P) {
   return P->bar() + *P;
 }
 $ <b>gcc-4.2 t.cc</b>
 t.cc: In function 'void test(foo*)':
 t.cc:9: error: no match for 'operator+' in '(((a*)P) + (*(long int*)(P->foo::&lt;anonymous&gt;.a::_vptr$a + -0x00000000000000020)))->a::bar() + * P'
 t.cc:9: error: return-statement with a value, in function returning 'void'
 $ <b>clang t.cc</b>
 t.cc:9:18: <span class="err">error:</span> invalid operands to binary expression ('int' and 'foo')
 <span class="snip"> return P->bar() + *P;</span>
 <span class="point"> ~~~~~~~~ ^ ~~</span>
</pre>


<h2>Typedef Preservation and Selective Unwrapping</h2>

<p>Many programmers use high-level user-defined types, typedefs, and other
syntactic sugar to refer to types in their program. This is useful because they
can abbreviate otherwise very long types and it is useful to preserve the
typename in diagnostics. However, sometimes very simple typedefs can wrap
trivial types and it is important to strip off the typedef to understand what
is going on. Clang aims to handle both cases well.</p>

<p>The following example shows where it is important to preserve
a typedef in C.
Here the type printed by GCC isn't even valid, but if the error
were about a very long and complicated type (as often happens in C++) the error
message would be ugly just because it was long and hard to read.</p>

<pre>
 $ <b>gcc-4.2 -fsyntax-only t.c</b>
 t.c:15: error: invalid operands to binary / (have 'float __vector__' and 'const int *')
 $ <b>clang -fsyntax-only t.c</b>
 t.c:15:11: <span class="err">error:</span> can't convert between vector values of different size ('__m128' and 'int const *')
 <span class="snip"> myvec[1]/P;</span>
 <span class="point"> ~~~~~~~~^~</span>
</pre>

<p>The following example shows where it is useful for the compiler to expose
underlying details of a typedef. If the user was somehow confused about how the
system "pid_t" typedef is defined, Clang helpfully displays it with "aka".</p>

<pre>
 $ <b>gcc-4.2 -fsyntax-only t.c</b>
 t.c:13: error: request for member 'x' in something not a structure or union
 $ <b>clang -fsyntax-only t.c</b>
 t.c:13:9: <span class="err">error:</span> member reference base type 'pid_t' (aka 'int') is not a structure or union
 <span class="snip"> myvar = myvar.x;</span>
 <span class="point"> ~~~~~ ^</span>
</pre>

<p>In C++, type preservation includes retaining any qualification written into type names.
For example, if we take a small snippet of code such as:</p>

<blockquote>
<pre>
namespace services {
  struct WebService { };
}
namespace myapp {
  namespace servers {
    struct Server { };
  }
}

using namespace myapp;
void addHTTPService(servers::Server const &amp;server, ::services::WebService const *http) {
  server += http;
}
</pre>
</blockquote>

<p>and then compile it, we see that Clang is both providing more accurate information and is retaining the types as written by the user (e.g., "servers::Server", "::services::WebService"):</p>

<pre>
 $ <b>g++-4.2 -fsyntax-only t.cpp</b>
 t.cpp:9: error: no match for 'operator+=' in 'server += http'
 $ <b>clang -fsyntax-only t.cpp</b>
 t.cpp:9:10: <span class="err">error:</span> invalid operands to binary expression ('servers::Server const' and '::services::WebService const *')
 <span class="snip">server += http;</span>
 <span class="point">~~~~~~ ^ ~~~~</span>
</pre>

<p>Naturally, type preservation extends to uses of templates, and Clang retains information about how a particular template specialization (like <code>std::vector&lt;Real&gt;</code>) was spelled within the source code. For example:</p>

<pre>
 $ <b>g++-4.2 -fsyntax-only t.cpp</b>
 t.cpp:12: error: no match for 'operator=' in 'str = vec'
 $ <b>clang -fsyntax-only t.cpp</b>
 t.cpp:12:7: <span class="err">error:</span> incompatible type assigning 'vector&lt;Real&gt;', expected 'std::string' (aka 'class std::basic_string&lt;char&gt;')
 <span class="snip">str = vec</span>;
 <span class="point">^ ~~~</span>
</pre>

<h2>Fix-it Hints</h2>

<p>"Fix-it" hints provide advice for fixing small, localized problems
in source code.
When Clang produces a diagnostic about a particular
problem that it can work around (e.g., non-standard or redundant
syntax, missing keywords, common mistakes, etc.), it may also provide
specific guidance in the form of a code transformation to correct the
problem. In the following example, Clang warns about the use of a GCC
extension that has been considered obsolete since 1993. The underlined
code should be removed, then replaced with the code below the
point line (".x =" or ".y =", respectively).</p>

<pre>
 $ <b>clang t.c</b>
 t.c:5:28: <span class="warn">warning:</span> use of GNU old-style field designator extension
 <span class="snip">struct point origin = { x: 0.0, y: 0.0 };</span>
 <span class="err">~~</span> <span class="point">^</span>
 <span class="snip">.x = </span>
 t.c:5:36: <span class="warn">warning:</span> use of GNU old-style field designator extension
 <span class="snip">struct point origin = { x: 0.0, y: 0.0 };</span>
 <span class="err">~~</span> <span class="point">^</span>
 <span class="snip">.y = </span>
</pre>

<p>"Fix-it" hints are most useful for
working around common user errors and misconceptions. For example, C++ users
commonly forget the syntax for explicit specialization of class templates,
as in the error in the following example. Again, after describing the problem,
Clang provides the fix--add <code>template&lt;&gt;</code>--as part of the
diagnostic.</p>

<pre>
 $ <b>clang t.cpp</b>
 t.cpp:9:3: <span class="err">error:</span> template specialization requires 'template&lt;&gt;'
 struct iterator_traits&lt;file_iterator&gt; {
 <span class="point">^</span>
 <span class="snip">template&lt;&gt; </span>
</pre>

<h2>Template Type Diffing</h2>

<p>Template types can be long and difficult to read, more so when they are part of an
error message.
Instead of just printing out the type name, Clang has enough
information to remove the common elements and highlight the differences. To
show the template structure more clearly, the templated type can also be
printed as an indented text tree.</p>

Default: template diff with type elision
<pre>
t.cc:4:5: <span class="note">note:</span> candidate function not viable: no known conversion from 'vector&lt;map&lt;[...], <span class="template-highlight">float</span>&gt;&gt;' to 'vector&lt;map&lt;[...], <span class="template-highlight">double</span>&gt;&gt;' for 1st argument;
</pre>
-fno-elide-type: template diff without elision
<pre>
t.cc:4:5: <span class="note">note:</span> candidate function not viable: no known conversion from 'vector&lt;map&lt;int, <span class="template-highlight">float</span>&gt;&gt;' to 'vector&lt;map&lt;int, <span class="template-highlight">double</span>&gt;&gt;' for 1st argument;
</pre>
-fdiagnostics-show-template-tree: template tree printing with elision
<pre>
t.cc:4:5: <span class="note">note:</span> candidate function not viable: no known conversion for 1st argument;
  vector&lt;
    map&lt;
      [...],
      [<span class="template-highlight">float</span> != <span class="template-highlight">double</span>]&gt;&gt;
</pre>
-fdiagnostics-show-template-tree -fno-elide-type: template tree printing with no elision
<pre>
t.cc:4:5: <span class="note">note:</span> candidate function not viable: no known conversion for 1st argument;
  vector&lt;
    map&lt;
      int,
      [<span class="template-highlight">float</span> != <span class="template-highlight">double</span>]&gt;&gt;
</pre>

<h2>Automatic Macro Expansion</h2>

<p>Many errors happen in macros that are sometimes deeply nested. With
traditional compilers, you need to dig deep into the definition of the macro to
understand how you got into trouble.
The following simple example shows how
Clang helps you out by automatically printing instantiation information and
nested range information for diagnostics as they are instantiated through macros
and also shows how some of the other pieces work in a bigger example.</p>

<pre>
 $ <b>gcc-4.2 -fsyntax-only t.c</b>
 t.c: In function 'test':
 t.c:80: error: invalid operands to binary &lt; (have 'struct mystruct' and 'float')
 $ <b>clang -fsyntax-only t.c</b>
 t.c:80:3: <span class="err">error:</span> invalid operands to binary expression ('typeof(P)' (aka 'struct mystruct') and 'typeof(F)' (aka 'float'))
 <span class="snip"> X = MYMAX(P, F);</span>
 <span class="point"> ^~~~~~~~~~~</span>
 t.c:76:94: note: instantiated from:
 <span class="snip">#define MYMAX(A,B) __extension__ ({ __typeof__(A) __a = (A); __typeof__(B) __b = (B); __a &lt; __b ? __b : __a; })</span>
 <span class="point"> ~~~ ^ ~~~</span>
</pre>

<p>Here's another real world warning that occurs in the "window" Unix package (which
implements the "wwopen" class of APIs):</p>

<pre>
 $ <b>clang -fsyntax-only t.c</b>
 t.c:22:2: <span class="warn">warning:</span> type specifier missing, defaults to 'int'
 <span class="snip"> ILPAD();</span>
 <span class="point"> ^</span>
 t.c:17:17: note: instantiated from:
 <span class="snip">#define ILPAD() PAD((NROW - tt.tt_row) * 10) /* 1 ms per char */</span>
 <span class="point"> ^</span>
 t.c:14:2: note: instantiated from:
 <span class="snip"> register i; \</span>
 <span class="point"> ^</span>
</pre>

<p>In practice, we've found that Clang's treatment of macros is actually more useful in multiply nested
macros than in simple ones.</p>

<h2>Quality of Implementation and Attention to Detail</h2>

<p>Finally, we have put a lot of work into polishing the little things, because
little things add up over time and contribute to a great user experience.</p>

<p>The
following example shows a trivial little tweak, where we tell you to put the semicolon at
the end of the line that is missing it (line 4) instead of at the beginning of
the following line (line 5). This is particularly important with fix-it hints
and point diagnostics, because otherwise you don't get the important context.
</p>

<pre>
 $ <b>gcc-4.2 t.c</b>
 t.c: In function 'foo':
 t.c:5: error: expected ';' before '}' token
 $ <b>clang t.c</b>
 t.c:4:8: <span class="err">error:</span> expected ';' after expression
 <span class="snip"> bar()</span>
 <span class="point"> ^</span>
 <span class="point"> ;</span>
</pre>

<p>The following example shows much better error recovery than GCC. The message coming out
of GCC is completely useless for diagnosing the problem. Clang tries much harder
and produces a much more useful diagnosis of the problem.</p>

<pre>
 $ <b>gcc-4.2 t.c</b>
 t.c:3: error: expected '=', ',', ';', 'asm' or '__attribute__' before '*' token
 $ <b>clang t.c</b>
 t.c:3:1: <span class="err">error:</span> unknown type name 'foo_t'
 <span class="snip">foo_t *P = 0;</span>
 <span class="point">^</span>
</pre>

<p>The following example shows that we recover from the simple case of
forgetting a ; after a struct definition much better than GCC.</p>

<pre>
 $ <b>cat t.cc</b>
 template&lt;class T&gt;
 class a {}
 class temp {};
 a&lt;temp&gt; b;
 struct b {
 }
 $ <b>gcc-4.2 t.cc</b>
 t.cc:3: error: multiple types in one declaration
 t.cc:4: error: non-template type 'a' used as a template
 t.cc:4: error: invalid type in declaration before ';' token
 t.cc:6: error: expected unqualified-id at end of input
 $ <b>clang t.cc</b>
 t.cc:2:11: <span class="err">error:</span> expected ';' after class
 <span class="snip">class a {}</span>
 <span class="point"> ^</span>
 <span class="point"> ;</span>
 t.cc:6:2: <span 
class="err">error:</span> expected ';' after struct
 <span class="snip">}</span>
 <span class="point"> ^</span>
 <span class="point"> ;</span>
</pre>

<p>While each of these details is minor, we feel that they all add up to provide
a much more polished experience.</p>

</div>
</body>
</html>