<!-- shading.html revision b67d93111da01c30317f4642e041e5c8c111d3d3 -->
<HTML>

<head>
<TITLE>Shading Language Support</TITLE>

<link rel="stylesheet" type="text/css" href="mesa.css"></head>

<BODY>

<H1>Shading Language Support</H1>

<p>
This page describes the features and status of Mesa's support for the
<a href="http://opengl.org/documentation/glsl/" target="_parent">
OpenGL Shading Language</a>.
</p>

<p>
Last updated on 26 March 2007.
</p>

<p>
Contents
</p>
<ul>
<li><a href="#unsup">Unsupported Features</a>
<li><a href="#notes">Implementation Notes</a>
<li><a href="#hints">Programming Hints</a>
<li><a href="#standalone">Stand-alone Compiler</a>
<li><a href="#implementation">Compiler Implementation</a>
</ul>


<a name="unsup"></a>
<h2>Unsupported Features</h2>

<p>
The following features of the shading language are not yet supported
in Mesa:
</p>

<ul>
<li>Dereferencing arrays with non-constant indexes
<li>Comparison of user-defined structs
<li>Linking of multiple shaders
<li>gl_ClipVertex
</ul>

<p>
All other major features of the shading language should function.
</p>


<a name="notes"></a>
<h2>Implementation Notes</h2>

<ul>
<li>Shading language programs are compiled into low-level programs
    very similar to those of GL_ARB_vertex/fragment_program.
<li>All vector types (vec2, vec3, vec4, bvec2, etc.) currently occupy full
    float[4] registers.
<li>Float constants and variables are packed so that up to four floats
    can occupy one program parameter/register.
<li>All function calls are inlined.
<li>Shaders which use too many registers will not compile.
<li>The quality of generated code is pretty good; register usage is fair.
<li>Shader error detection and error reporting (InfoLog) are not
    very good yet.
<li>There are known memory leaks in the compiler.
</ul>

<p>
These issues will be addressed in the future.
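</p>

<p>
The register notes in the list above can be illustrated with a minimal C
sketch. It is not Mesa's actual allocator: the names RegFile, Slot,
alloc_vector and alloc_float, and the limit of four registers, are all
hypothetical and chosen only for illustration. The sketch assumes that a
vector always claims a whole, empty float[4] register, while a float takes
the next free component of the first register that still has room.
</p>

```c
#include <assert.h>
#include <stddef.h>

#define MAX_REGS 4   /* hypothetical register limit, not a Mesa constant */

/* Hypothetical register file: components in use (0..4) per float[4] register. */
typedef struct {
    int used[MAX_REGS];
} RegFile;

/* Result of an allocation; reg == -1 means "out of registers". */
typedef struct {
    int reg;
    int comp;
} Slot;

/* A vector (vec2, vec3, vec4, bvec2, ...) claims a whole, empty register. */
static Slot alloc_vector(RegFile *rf)
{
    assert(rf != NULL);
    for (int r = 0; r < MAX_REGS; r++) {
        if (rf->used[r] == 0) {
            rf->used[r] = 4;
            return (Slot){ r, 0 };
        }
    }
    return (Slot){ -1, -1 };
}

/* A float takes the next free component of the first register with room,
 * so up to four floats share one register. */
static Slot alloc_float(RegFile *rf)
{
    assert(rf != NULL);
    for (int r = 0; r < MAX_REGS; r++) {
        if (rf->used[r] < 4) {
            int c = rf->used[r]++;
            return (Slot){ r, c };
        }
    }
    return (Slot){ -1, -1 };
}
```

<p>
Under these assumptions, one vec4 followed by two floats uses only two
registers, because both floats share the second register. Once every
register is full, both helpers report failure, which mirrors the note that
shaders using too many registers will not compile.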
</p>


<a name="hints"></a>
<h2>Programming Hints</h2>

<ul>
<li>Declare <em>in</em> function parameters as <em>const</em> whenever possible.
    This improves the efficiency of function inlining.
</li>
<br>
<li>To reduce register usage, declare variables within smaller scopes.
    For example, the following code:
<pre>
    void main()
    {
       vec4 a1, a2, b1, b2;
       gl_Position = expression using a1, a2;
       gl_FrontColor = expression using b1, b2;
    }
</pre>
    can be rewritten as follows to use half as many registers:
<pre>
    void main()
    {
       {
          vec4 a1, a2;
          gl_Position = expression using a1, a2;
       }
       {
          vec4 b1, b2;
          gl_FrontColor = expression using b1, b2;
       }
    }
</pre>
    Alternatively, rather than using several float variables, use
    a vec4 instead.  Use swizzling and writemasks to access the
    components of the vec4 as floats.
</li>
<br>
<li>Use the built-in library functions whenever possible.
    For example, instead of writing this:
<pre>
        float x = 1.0 / sqrt(y);
</pre>
    Write this:
<pre>
        float x = inversesqrt(y);
</pre>
</li>
</ul>


<a name="standalone"></a>
<h2>Stand-alone Compiler</h2>

<p>
A stand-alone GLSL compiler driver has been added to Mesa.
</p>

<p>
The stand-alone compiler (like a conventional command-line compiler)
is a tool that accepts Shading Language programs and emits low-level
GPU programs.
</p>

<p>
This tool is useful for:
</p>
<ul>
<li>Inspecting GPU code to gain insight into compilation
<li>Generating initial GPU code for subsequent hand-tuning
<li>Debugging the GLSL compiler itself
</ul>

<p>
To build the glslcompiler program (this process will be improved someday):
</p>
<pre>
    cd src/mesa
    make libmesa.a
    cd drivers/glslcompiler
    make
</pre>


<p>
Here's an example of using the compiler to compile a vertex shader and
emit GL_ARB_vertex_program-style instructions:
</p>
<pre>
    glslcompiler --arb --linenumbers --vs vertshader.txt
</pre>
<p>
The output may look similar to this:
</p>
<pre>
!!ARBvp1.0
  0: MOV result.texcoord[0], vertex.texcoord[0];
  1: DP4 temp0.x, state.matrix.mvp.row[0], vertex.position;
  2: DP4 temp0.y, state.matrix.mvp.row[1], vertex.position;
  3: DP4 temp0.z, state.matrix.mvp.row[2], vertex.position;
  4: DP4 temp0.w, state.matrix.mvp.row[3], vertex.position;
  5: MOV result.position, temp0;
  6: END
</pre>

<p>
Note that some shading language constructs (such as uniform and varying
variables) aren't expressible in ARB or NV-style programs.
Therefore, the resulting output is not always legal according to the
definitions of those program languages.
</p>
<p>
Also note that this compiler driver is still under development.
Over time, the correctness of the GPU programs, with respect to the ARB
and NV languages, should improve.
</p>



<a name="implementation"></a>
<h2>Compiler Implementation</h2>

<p>
The source code for Mesa's shading language compiler is in the
<code>src/mesa/shader/slang/</code> directory.
</p>

<p>
The compiler follows a fairly standard design and works as follows:
</p>
<ul>
<li>The input string is tokenized (see grammar.c) and parsed
(see slang_compiler_*.c) to produce an Abstract Syntax Tree (AST).
The nodes in this tree are slang_operation structures
(see slang_compile_operation.h).
The nodes are decorated with symbol table, scoping and datatype information.
<li>The AST is converted into an Intermediate Representation (IR) tree
(see the slang_codegen.c file).
The IR nodes represent basic GPU instructions, like add, dot product,
move, etc.
The IR tree is mostly a binary tree, but a few nodes have three or four
children.
In principle, the IR tree could be executed by doing an in-order traversal.
<li>The IR tree is traversed in-order to emit code (see slang_emit.c).
This is also when registers are allocated to store variables and temps.
<li>In the future, a pattern-matching code generator-generator may be
used for code generation.
Programs such as L-BURG (Bottom-Up Rewrite Generator) and Twig look for
patterns in IR trees, compute weights for subtrees and use the weights
to select the best instructions to represent each subtree.
<li>The emitted GPU instructions (see prog_instruction.h) are stored in a
gl_program object (see mtypes.h).
<li>When a fragment shader and vertex shader are linked (see slang_link.c),
the varying vars are matched up, uniforms are merged, and vertex
attributes are resolved (rewriting instructions as needed).
</ul>

<p>
The final vertex and fragment programs may be interpreted in software
(see prog_execute.c) or translated into instructions for a specific
hardware architecture (see drivers/dri/i915/i915_fragprog.c for example).
</p>

<h3>Code Generation Options</h3>

<p>
Internally, there are several options that control the compiler's code
generation and instruction selection.
These options are found in the gl_shader_state struct and may be set
by the device driver to indicate its preferences:
</p>

<pre>
struct gl_shader_state
{
   ...
   /** Driver-selectable options: */
   GLboolean EmitHighLevelInstructions;
   GLboolean EmitCondCodes;
   GLboolean EmitComments;
};
</pre>

<ul>
<li>EmitHighLevelInstructions
<br>
This option controls instruction selection for loops and conditionals.
If the option is set, high-level IF/ELSE/ENDIF, LOOP/ENDLOOP, and CONT/BRK
instructions will be emitted.
Otherwise, those constructs will be implemented with BRA instructions.
</li>

<li>EmitCondCodes
<br>
If set, condition codes (as in GL_NV_fragment_program) will be used for
branching and looping.
Otherwise, ordinary registers will be used (the IF instruction will
examine the first operand's X component and do the if-part if non-zero).
This option is only relevant if EmitHighLevelInstructions is set.
</li>

<li>EmitComments
<br>
If set, instructions will be annotated with comments to help with debugging.
Extra NOP instructions will also be inserted.
</li>

</ul>


</BODY>
</HTML>