<!-- shading.html revision c4341fe80acf75bf5417ffeb85fe6902218b752a -->
<HTML>

<head>
<TITLE>Shading Language Support</TITLE>

<link rel="stylesheet" type="text/css" href="mesa.css"></head>

<BODY>

<H1>Shading Language Support</H1>

<p>
This page describes the features and status of Mesa's support for the
<a href="http://opengl.org/documentation/glsl/" target="_parent">
OpenGL Shading Language</a>.
</p>

<p>
Last updated on 15 December 2008.
</p>

<p>
Contents
</p>
<ul>
<li><a href="#120">GLSL 1.20 support</a>
<li><a href="#unsup">Unsupported Features</a>
<li><a href="#notes">Implementation Notes</a>
<li><a href="#hints">Programming Hints</a>
<li><a href="#standalone">Stand-alone GLSL Compiler</a>
<li><a href="#implementation">Compiler Implementation</a>
<li><a href="#validation">Compiler Validation</a>
</ul>



<a name="120"></a>
<h2>GLSL 1.20 support</h2>

<p>
GLSL version 1.20 is supported in Mesa 7.3.
The features and differences of GLSL 1.20 include:
</p>
<ul>
<li><code>mat2x3, mat2x4</code>, etc. types and functions
<li><code>transpose(), outerProduct(), matrixCompMult()</code> functions
(but untested)
<li>precision qualifiers (<code>lowp, mediump, highp</code>)
<li><code>invariant</code> qualifier
<li><code>array.length()</code> method
<li><code>float[5] a;</code> array syntax
<li><code>centroid</code> qualifier
<li>unsized array constructors
<li>initializers for uniforms
<li>const initializers calling built-in functions
</ul>



<a name="unsup"></a>
<h2>Unsupported Features</h2>

<p>
The following features of the shading language are not yet supported
in Mesa:
</p>

<ul>
<li>Linking of multiple shaders
<li>gl_ClipVertex
<li>Perspective-correct interpolation of the gl_Color and
    gl_SecondaryColor varying variables (they are currently interpolated
    without perspective correction)
</ul>

<p>
All other major features of the shading language should function.
</p>


<a name="notes"></a>
<h2>Implementation Notes</h2>

<ul>
<li>Shading language programs are compiled into low-level programs
    very similar to those of GL_ARB_vertex_program and
    GL_ARB_fragment_program.
<li>All vector types (vec2, vec3, vec4, bvec2, etc.) currently occupy full
    float[4] registers.
<li>Float constants and variables are packed so that up to four floats
    can occupy one program parameter/register.
<li>All function calls are inlined.
<li>Shaders which use too many registers will not compile.
<li>The quality of generated code is pretty good; register usage is fair.
<li>Shader error detection and error reporting (InfoLog) are not
    very good yet.
<li>The ftransform() function doesn't necessarily match the results of
    fixed-function transformation.
</ul>

<p>
These issues will be addressed in the future.
</p>


<a name="hints"></a>
<h2>Programming Hints</h2>

<ul>
<li>Declare <em>in</em> function parameters as <em>const</em> whenever possible.
    This improves the efficiency of function inlining.
</li>
<br>
<li>To reduce register usage, declare variables within smaller scopes.
    For example, the following code:
<pre>
    void main()
    {
        vec4 a1, a2, b1, b2;
        gl_Position = expression using a1, a2;
        gl_FrontColor = expression using b1, b2;
    }
</pre>
    can be rewritten as follows to use half as many registers:
<pre>
    void main()
    {
        {
            vec4 a1, a2;
            gl_Position = expression using a1, a2;
        }
        {
            vec4 b1, b2;
            gl_FrontColor = expression using b1, b2;
        }
    }
</pre>
    Alternatively, rather than using several float variables, use
    a vec4 instead.  Use swizzling and writemasks to access the
    components of the vec4 as floats.
</li>
<br>
<li>Use the built-in library functions whenever possible.
    For example, instead of writing this:
<pre>
        float x = 1.0 / sqrt(y);
</pre>
    write this:
<pre>
        float x = inversesqrt(y);
</pre>
<li>
   Use ++i when possible, as it's more efficient than i++.
</li>
</ul>


<a name="standalone"></a>
<h2>Stand-alone GLSL Compiler</h2>

<p>
A stand-alone GLSL compiler driver has been added to Mesa.
</p>

<p>
The stand-alone compiler (like a conventional command-line compiler)
is a tool that accepts Shading Language programs and emits low-level
GPU programs.
</p>

<p>
This tool is useful for:
</p>
<ul>
<li>Inspecting GPU code to gain insight into compilation
<li>Generating initial GPU code for subsequent hand-tuning
<li>Debugging the GLSL compiler itself
</ul>

<p>
After building Mesa, the glslcompiler can be built by manually running:
</p>
<pre>
    cd src/mesa/drivers/glslcompiler
    make
</pre>


<p>
Here's an example of using the compiler to compile a fragment shader and
emit GL_ARB_fragment_program-style instructions:
</p>
<pre>
    bin/glslcompiler --debug --numbers --fs progs/glsl/CH06-brick.frag.txt
</pre>
<p>
results in:
</p>
<pre>
# Fragment Program/Shader
  0: RCP TEMP[4].x, UNIFORM[2].xxxx;
  1: RCP TEMP[4].y, UNIFORM[2].yyyy;
  2: MUL TEMP[3].xy, VARYING[0], TEMP[4];
  3: MOV TEMP[1], TEMP[3];
  4: MUL TEMP[0].w, TEMP[1].yyyy, CONST[4].xxxx;
  5: FRC TEMP[1].z, TEMP[0].wwww;
  6: SGT.C TEMP[0].w, TEMP[1].zzzz, CONST[4].xxxx;
  7: IF (NE.wwww); # (if false, goto 9);
  8:    ADD TEMP[1].x, TEMP[1].xxxx, CONST[4].xxxx;
  9: ENDIF;
 10: FRC TEMP[1].xy, TEMP[1];
 11: SGT TEMP[2].xy, UNIFORM[3], TEMP[1];
 12: MUL TEMP[1].z, TEMP[2].xxxx, TEMP[2].yyyy;
 13: LRP TEMP[0], TEMP[1].zzzz, UNIFORM[0], UNIFORM[1];
 14: MUL TEMP[0].xyz, TEMP[0], VARYING[1].xxxx;
 15: MOV OUTPUT[0].xyz, TEMP[0];
 16: MOV OUTPUT[0].w, CONST[4].yyyy;
 17: END
</pre>

<p>
Note that some shading language constructs (such as uniform and varying
variables) aren't expressible in ARB or NV-style programs.
Therefore, the resulting output is not always legal according to the
definitions of those program languages.
</p>
<p>
Also note that this compiler driver is still under development.
Over time, the correctness of the GPU programs, with respect to the ARB
and NV languages, should improve.
</p>



<a name="implementation"></a>
<h2>Compiler Implementation</h2>

<p>
The source code for Mesa's shading language compiler is in the
<code>src/mesa/shader/slang/</code> directory.
</p>

<p>
The compiler follows a fairly standard design and works as follows:
</p>
<ul>
<li>The input string is tokenized (see grammar.c) and parsed
(see slang_compile_*.c) to produce an Abstract Syntax Tree (AST).
The nodes in this tree are slang_operation structures
(see slang_compile_operation.h).
The nodes are decorated with symbol table, scoping and datatype information.
<li>The AST is converted into an intermediate representation (IR) tree
(see the slang_codegen.c file).
The IR nodes represent basic GPU instructions, like add, dot product,
move, etc.
The IR tree is mostly a binary tree, but a few nodes have three or four
children.
In principle, the IR tree could be executed by doing an in-order traversal.
<li>The IR tree is traversed in-order to emit code (see slang_emit.c).
This is also when registers are allocated to store variables and temps.
<li>In the future, a pattern-matching code generator-generator may be
used for code generation.
Programs such as L-BURG (Bottom-Up Rewrite Generator) and Twig look for
patterns in IR trees, compute weights for subtrees and use the weights
to select the best instructions to represent each subtree.
<li>The emitted GPU instructions (see prog_instruction.h) are stored in a
gl_program object (see mtypes.h).
<li>When a fragment shader and vertex shader are linked (see slang_link.c),
the varying variables are matched up, uniforms are merged, and vertex
attributes are resolved (rewriting instructions as needed).
</ul>

<p>
The final vertex and fragment programs may be interpreted in software
(see prog_execute.c) or translated for a specific hardware architecture
(see drivers/dri/i915/i915_fragprog.c for example).
</p>

<h3>Code Generation Options</h3>

<p>
Internally, there are several options that control the compiler's code
generation and instruction selection.
These options are found in the gl_shader_state struct and may be set
by the device driver to indicate its preferences:
</p>

<pre>
struct gl_shader_state
{
   ...
   /** Driver-selectable options: */
   GLboolean EmitHighLevelInstructions;
   GLboolean EmitCondCodes;
   GLboolean EmitComments;
};
</pre>

<ul>
<li>EmitHighLevelInstructions
<br>
This option controls instruction selection for loops and conditionals.
If the option is set, high-level IF/ELSE/ENDIF, LOOP/ENDLOOP and CONT/BRK
instructions will be emitted.
Otherwise, those constructs will be implemented with BRA instructions.
</li>

<li>EmitCondCodes
<br>
If set, condition codes (as in GL_NV_fragment_program) will be used for
branching and looping.
Otherwise, ordinary registers will be used (the IF instruction will
examine the first operand's X component and execute the if-part if it is
non-zero).
This option is only relevant if EmitHighLevelInstructions is set.
</li>

<li>EmitComments
<br>
If set, instructions will be annotated with comments to help with debugging.
Extra NOP instructions will also be inserted.
</li>

</ul>


<a name="validation"></a>
<h2>Compiler Validation</h2>

<p>
A <a href="http://glean.sf.net" target="_parent">Glean</a> test has
been created to exercise the GLSL compiler.
</p>
<p>
The <em>glsl1</em> test runs over 170 sub-tests to check that the language
features and built-in functions work properly.
This test should be run frequently while working on the compiler to catch
regressions.
</p>
<p>
The test coverage is reasonably broad, but additional tests
should be added.
</p>


</BODY>
</HTML>