shading.html revision 7ff72a76590d5abdbe0891da51a5fed37d6fe312
<HTML>

<head>
<TITLE>Shading Language Support</TITLE>

<link rel="stylesheet" type="text/css" href="mesa.css"></head>

<BODY>

<H1>Shading Language Support</H1>

<p>
This page describes the features and status of Mesa's support for the
<a href="http://opengl.org/documentation/glsl/" target="_parent">
OpenGL Shading Language</a>.
</p>

<p>
Last updated on 28 March 2007.
</p>

<p>
Contents
</p>
<ul>
<li><a href="#unsup">Unsupported Features</a>
<li><a href="#notes">Implementation Notes</a>
<li><a href="#hints">Programming Hints</a>
<li><a href="#standalone">Stand-alone Compiler</a>
<li><a href="#implementation">Compiler Implementation</a>
<li><a href="#validation">Compiler Validation</a>
</ul>


<a name="unsup"></a>
<h2>Unsupported Features</h2>

<p>
The following features of the shading language are not yet supported
in Mesa:
</p>

<ul>
<li>Dereferencing arrays with non-constant indexes
<li>Comparison of user-defined structs
<li>Linking of multiple shaders
<li>gl_ClipVertex
<li>The derivative functions, such as dFdx()
<li>The inverse trig functions asin(), acos(), and atan()
</ul>

<p>
All other major features of the shading language should function.
</p>


<a name="notes"></a>
<h2>Implementation Notes</h2>

<ul>
<li>Shading language programs are compiled into low-level programs
    very similar to those of GL_ARB_vertex_program and
    GL_ARB_fragment_program.
<li>All vector types (vec2, vec3, vec4, bvec2, etc.) currently occupy full
    float[4] registers.
<li>Float constants and variables are packed so that up to four floats
    can occupy one program parameter/register.
<li>All function calls are inlined.
<li>Shaders which use too many registers will not compile.
<li>The quality of generated code is pretty good; register usage is fair.
<li>Shader error detection and reporting of errors (InfoLog) is not
    very good yet.
<li>The ftransform() function doesn't necessarily match the results of
    fixed-function transformation.
</ul>

<p>
These issues will be addressed/resolved in the future.
</p>


<a name="hints"></a>
<h2>Programming Hints</h2>

<ul>
<li>Declare <em>in</em> function parameters as <em>const</em> whenever possible.
    This improves the efficiency of function inlining.
</li>
<br>
<li>To reduce register usage, declare variables within smaller scopes.
    For example, the following code:
<pre>
    void main()
    {
        vec4 a1, a2, b1, b2;
        gl_Position = expression using a1, a2;
        gl_FrontColor = expression using b1, b2;
    }
</pre>
    can be rewritten as follows to use half as many registers:
<pre>
    void main()
    {
        {
            vec4 a1, a2;
            gl_Position = expression using a1, a2;
        }
        {
            vec4 b1, b2;
            gl_FrontColor = expression using b1, b2;
        }
    }
</pre>
    Alternatively, rather than using several float variables, use
    a vec4 instead.  Use swizzling and writemasks to access the
    components of the vec4 as floats.
</li>
<br>
<li>Use the built-in library functions whenever possible.
    For example, instead of writing this:
<pre>
    float x = 1.0 / sqrt(y);
</pre>
    Write this:
<pre>
    float x = inversesqrt(y);
</pre>
</li>
<li>
    Use ++i when possible, as it's more efficient than i++.
</li>
</ul>


<a name="standalone"></a>
<h2>Stand-alone Compiler</h2>

<p>
A stand-alone GLSL compiler driver has been added to Mesa.
</p>

<p>
The stand-alone compiler (like a conventional command-line compiler)
is a tool that accepts Shading Language programs and emits low-level
GPU programs.
</p>

<p>
This tool is useful for:
</p>
<ul>
<li>Inspecting GPU code to gain insight into compilation
<li>Generating initial GPU code for subsequent hand-tuning
<li>Debugging the GLSL compiler itself
</ul>

<p>
To build the glslcompiler program (this will be improved someday):
</p>
<pre>
    cd src/mesa
    make libmesa.a
    cd drivers/glslcompiler
    make
</pre>


<p>
Here's an example of using the compiler to compile a vertex shader and
emit GL_ARB_vertex_program-style instructions:
</p>
<pre>
    glslcompiler --arb --linenumbers --vs vertshader.txt
</pre>
<p>
The output may look similar to this:
</p>
<pre>
!!ARBvp1.0
  0: MOV result.texcoord[0], vertex.texcoord[0];
  1: DP4 temp0.x, state.matrix.mvp.row[0], vertex.position;
  2: DP4 temp0.y, state.matrix.mvp.row[1], vertex.position;
  3: DP4 temp0.z, state.matrix.mvp.row[2], vertex.position;
  4: DP4 temp0.w, state.matrix.mvp.row[3], vertex.position;
  5: MOV result.position, temp0;
  6: END
</pre>

<p>
Note that some shading language constructs (such as uniform and varying
variables) aren't expressible in ARB or NV-style programs.
Therefore, the resulting output is not always legal according to the
definitions of those program languages.
</p>
<p>
Also note that this compiler driver is still under development.
Over time, the correctness of the GPU programs, with respect to the ARB
and NV languages, should improve.
</p>



<a name="implementation"></a>
<h2>Compiler Implementation</h2>

<p>
The source code for Mesa's shading language compiler is in the
<code>src/mesa/shader/slang/</code> directory.
</p>

<p>
The compiler follows a fairly standard design and basically works as follows:
</p>
<ul>
<li>The input string is tokenized (see grammar.c) and parsed
(see slang_compiler_*.c) to produce an Abstract Syntax Tree (AST).
The nodes in this tree are slang_operation structures
(see slang_compile_operation.h).
The nodes are decorated with symbol table, scoping and datatype information.
<li>The AST is converted into an Intermediate Representation (IR) tree
(see the slang_codegen.c file).
The IR nodes represent basic GPU instructions, like add, dot product,
move, etc.
The IR tree is mostly a binary tree, but a few nodes have three or four
children.
In principle, the IR tree could be executed by doing an in-order traversal.
<li>The IR tree is traversed in-order to emit code (see slang_emit.c).
This is also when registers are allocated to store variables and temps.
<li>In the future, a pattern-matching code generator-generator may be
used for code generation.
Programs such as L-BURG (Bottom-Up Rewrite Generator) and Twig look for
patterns in IR trees, compute weights for subtrees and use the weights
to select the best instructions to represent the sub-tree.
<li>The emitted GPU instructions (see prog_instruction.h) are stored in a
gl_program object (see mtypes.h).
<li>When a fragment shader and vertex shader are linked (see slang_link.c),
the varying vars are matched up, uniforms are merged, and vertex
attributes are resolved (rewriting instructions as needed).
</ul>

<p>
The final vertex and fragment programs may be interpreted in software
(see prog_execute.c) or translated into a specific hardware architecture
(see drivers/dri/i915/i915_fragprog.c for example).
</p>

<h3>Code Generation Options</h3>

<p>
Internally, there are several options that control the compiler's code
generation and instruction selection.
These options are found in the gl_shader_state struct and may be set
by the device driver to indicate its preferences:
</p>

<pre>
struct gl_shader_state
{
   ...
   /** Driver-selectable options: */
   GLboolean EmitHighLevelInstructions;
   GLboolean EmitCondCodes;
   GLboolean EmitComments;
};
</pre>

<ul>
<li>EmitHighLevelInstructions
<br>
This option controls instruction selection for loops and conditionals.
If the option is set, high-level IF/ELSE/ENDIF, LOOP/ENDLOOP, CONT/BRK
instructions will be emitted.
Otherwise, those constructs will be implemented with BRA instructions.
</li>

<li>EmitCondCodes
<br>
If set, condition codes (as in GL_NV_fragment_program) will be used for
branching and looping.
Otherwise, ordinary registers will be used (the IF instruction will
examine the first operand's X component and do the if-part if non-zero).
This option is only relevant if EmitHighLevelInstructions is set.
</li>

<li>EmitComments
<br>
If set, instructions will be annotated with comments to help with debugging.
Extra NOP instructions will also be inserted.
</li>

</ul>


<a name="validation"></a>
<h2>Compiler Validation</h2>

<p>
A new <a href="http://glean.sf.net" target="_parent">Glean</a> test has
been created to exercise the GLSL compiler.
</p>
<p>
The <em>glsl1</em> test runs over 150 sub-tests to check that the language
features and built-in functions work properly.
This test should be run frequently while working on the compiler to catch
regressions.
</p>
<p>
The test coverage is reasonably broad and complete, but additional tests
should be added.
</p>


</BODY>
</HTML>