<!-- shading.html revision 3bfedb7ed4a35cfcc7187bc22314833ef1d96ec9 -->
<HTML>

<head>
<TITLE>Shading Language Support</TITLE>

<link rel="stylesheet" type="text/css" href="mesa.css"></head>

<BODY>

<H1>Shading Language Support</H1>

<p>
This page describes the features and status of Mesa's support for the
<a href="http://opengl.org/documentation/glsl/" target="_parent">
OpenGL Shading Language</a>.
</p>

<p>
Last updated on 28 March 2007.
</p>

<p>
Contents
</p>
<ul>
<li><a href="#unsup">Unsupported Features</a>
<li><a href="#notes">Implementation Notes</a>
<li><a href="#hints">Programming Hints</a>
<li><a href="#standalone">Stand-alone GLSL Compiler</a>
<li><a href="#implementation">Compiler Implementation</a>
<li><a href="#validation">Compiler Validation</a>
</ul>


<a name="unsup"></a>
<h2>Unsupported Features</h2>

<p>
The following features of the shading language are not yet supported
in Mesa:
</p>

<ul>
<li>Dereferencing arrays with non-constant indexes
<li>Comparison of user-defined structs
<li>Linking of multiple shaders is not supported
<li>gl_ClipVertex
<li>The derivative functions such as dFdx() are not implemented
<li>The inverse trig functions asin(), acos(), and atan() are not implemented
<li>The gl_Color and gl_SecondaryColor varying vars are interpolated
    without perspective correction
<li>Floating point literal suffixes 'f' and 'F' aren't allowed.
</ul>

<p>
All other major features of the shading language should function.
</p>


<a name="notes"></a>
<h2>Implementation Notes</h2>

<ul>
<li>Shading language programs are compiled into low-level programs
    very similar to those of GL_ARB_vertex/fragment_program.
<li>All vector types (vec2, vec3, vec4, bvec2, etc.) currently occupy full
    float[4] registers.
<li>Float constants and variables are packed so that up to four floats
    can occupy one program parameter/register.
<li>All function calls are inlined.
<li>Shaders which use too many registers will not compile.
<li>The quality of the generated code is pretty good; register usage is fair.
<li>Shader error detection and reporting (InfoLog) are not
    very good yet.
<li>The ftransform() function doesn't necessarily match the results of
    fixed-function transformation.
</ul>

<p>
These issues will be addressed in the future.
</p>


<a name="hints"></a>
<h2>Programming Hints</h2>

<ul>
<li>Declare <em>in</em> function parameters as <em>const</em> whenever possible.
    This improves the efficiency of function inlining.
</li>
<br>
<li>To reduce register usage, declare variables within smaller scopes.
    For example, the following code:
<pre>
    void main()
    {
        vec4 a1, a2, b1, b2;
        gl_Position = expression using a1, a2;
        gl_FrontColor = expression using b1, b2;
    }
</pre>
    can be rewritten as follows to use half as many registers:
<pre>
    void main()
    {
        {
            vec4 a1, a2;
            gl_Position = expression using a1, a2;
        }
        {
            vec4 b1, b2;
            gl_FrontColor = expression using b1, b2;
        }
    }
</pre>
    Alternatively, rather than using several float variables, use
    a vec4 instead.  Use swizzling and writemasks to access the
    components of the vec4 as floats.
</li>
<br>
<li>Use the built-in library functions whenever possible.
    For example, instead of writing this:
<pre>
        float x = 1.0 / sqrt(y);
</pre>
    Write this:
<pre>
        float x = inversesqrt(y);
</pre>
</li>
<li>
   Use ++i when possible, as it's more efficient than i++.
</li>
</ul>


<a name="standalone"></a>
<h2>Stand-alone GLSL Compiler</h2>

<p>
A stand-alone GLSL compiler driver has been added to Mesa.
</p>

<p>
The stand-alone compiler (like a conventional command-line compiler)
is a tool that accepts Shading Language programs and emits low-level
GPU programs.
</p>

<p>
This tool is useful for:
</p>
<ul>
<li>Inspecting GPU code to gain insight into compilation
<li>Generating initial GPU code for subsequent hand-tuning
<li>Debugging the GLSL compiler itself
</ul>

<p>
After building Mesa, the glslcompiler program should be found in the
Mesa/bin/ directory.
If it's not there, it can be built manually:
</p>
<pre>
    cd src/mesa/drivers/glslcompiler
    make
</pre>


<p>
Here's an example of using the compiler to compile a fragment shader and
emit GL_ARB_fragment_program-style instructions:
</p>
<pre>
    bin/glslcompiler --debug --numbers --fs progs/glsl/CH06-brick.frag.txt
</pre>
<p>
results in:
</p>
<pre>
# Fragment Program/Shader
  0: RCP TEMP[4].x, UNIFORM[2].xxxx;
  1: RCP TEMP[4].y, UNIFORM[2].yyyy;
  2: MUL TEMP[3].xy, VARYING[0], TEMP[4];
  3: MOV TEMP[1], TEMP[3];
  4: MUL TEMP[0].w, TEMP[1].yyyy, CONST[4].xxxx;
  5: FRC TEMP[1].z, TEMP[0].wwww;
  6: SGT.C TEMP[0].w, TEMP[1].zzzz, CONST[4].xxxx;
  7: IF (NE.wwww); # (if false, goto 9);
  8:    ADD TEMP[1].x, TEMP[1].xxxx, CONST[4].xxxx;
  9: ENDIF;
 10: FRC TEMP[1].xy, TEMP[1];
 11: SGT TEMP[2].xy, UNIFORM[3], TEMP[1];
 12: MUL TEMP[1].z, TEMP[2].xxxx, TEMP[2].yyyy;
 13: LRP TEMP[0], TEMP[1].zzzz, UNIFORM[0], UNIFORM[1];
 14: MUL TEMP[0].xyz, TEMP[0], VARYING[1].xxxx;
 15: MOV OUTPUT[0].xyz, TEMP[0];
 16: MOV OUTPUT[0].w, CONST[4].yyyy;
 17: END
</pre>

<p>
Note that some shading language constructs (such as uniform and varying
variables) aren't expressible in ARB or NV-style programs.
Therefore, the resulting output is not always legal according to the
definitions of those program languages.
</p>
<p>
Also note that this compiler driver is still under development.
Over time, the correctness of the GPU programs, with respect to the ARB
and NV languages, should improve.
</p>



<a name="implementation"></a>
<h2>Compiler Implementation</h2>

<p>
The source code for Mesa's shading language compiler is in the
<code>src/mesa/shader/slang/</code> directory.
</p>

<p>
The compiler follows a fairly standard design and works as follows:
</p>
<ul>
<li>The input string is tokenized (see grammar.c) and parsed
(see slang_compiler_*.c) to produce an Abstract Syntax Tree (AST).
The nodes in this tree are slang_operation structures
(see slang_compile_operation.h).
The nodes are decorated with symbol table, scoping and datatype information.
<li>The AST is converted into an Intermediate Representation (IR) tree
(see the slang_codegen.c file).
The IR nodes represent basic GPU instructions, like add, dot product,
move, etc.
The IR tree is mostly a binary tree, but a few nodes have three or four
children.
In principle, the IR tree could be executed by doing an in-order traversal.
<li>The IR tree is traversed in-order to emit code (see slang_emit.c).
This is also when registers are allocated to store variables and temps.
<li>In the future, a pattern-matching code-generator generator may be
used for code generation.
Programs such as L-BURG (Bottom-Up Rewrite Generator) and Twig look for
patterns in IR trees, compute weights for subtrees and use the weights
to select the best instructions to represent each subtree.
<li>The emitted GPU instructions (see prog_instruction.h) are stored in a
gl_program object (see mtypes.h).
<li>When a fragment shader and vertex shader are linked (see slang_link.c),
the varying vars are matched up, uniforms are merged, and vertex
attributes are resolved (rewriting instructions as needed).
</ul>

<p>
The final vertex and fragment programs may be interpreted in software
(see prog_execute.c) or translated for a specific hardware architecture
(see drivers/dri/i915/i915_fragprog.c for example).
</p>

<h3>Code Generation Options</h3>

<p>
Internally, there are several options that control the compiler's code
generation and instruction selection.
These options are found in the gl_shader_state struct and may be set
by the device driver to indicate its preferences:
</p>

<pre>
struct gl_shader_state
{
   ...
   /** Driver-selectable options: */
   GLboolean EmitHighLevelInstructions;
   GLboolean EmitCondCodes;
   GLboolean EmitComments;
};
</pre>

<ul>
<li>EmitHighLevelInstructions
<br>
This option controls instruction selection for loops and conditionals.
If the option is set, high-level IF/ELSE/ENDIF, LOOP/ENDLOOP, CONT/BRK
instructions will be emitted.
Otherwise, those constructs will be implemented with BRA instructions.
</li>

<li>EmitCondCodes
<br>
If set, condition codes (as in GL_NV_fragment_program) will be used for
branching and looping.
Otherwise, ordinary registers will be used (the IF instruction will
examine the first operand's X component and do the if-part if non-zero).
This option is only relevant if EmitHighLevelInstructions is set.
</li>

<li>EmitComments
<br>
If set, instructions will be annotated with comments to help with debugging.
Extra NOP instructions will also be inserted.
</li>

</ul>


<a name="validation"></a>
<h2>Compiler Validation</h2>

<p>
A new <a href="http://glean.sf.net" target="_parent">Glean</a> test has
been created to exercise the GLSL compiler.
</p>
<p>
The <em>glsl1</em> test runs over 150 sub-tests to check that the language
features and built-in functions work properly.
This test should be run frequently while working on the compiler to catch
regressions.
</p>
<p>
The test coverage is reasonably broad and complete, but additional tests
should be added.
</p>


</BODY>
</HTML>