f3d0daf7ea7e42ff9ce11e8bd6fba1059a2406e8 |
|
26-Aug-2012 |
Kenneth Graunke <kenneth@whitecape.org> |
i965: Index sampler program key data by linker-assigned index. Now that most things are based on the linker-assigned index, it makes sense to convert the arrays in the VS/WM program key as well. It seems silly to leave them indexed by texture unit. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs.h
|
bf0308d8d6fbc842d0120060e65a3fe445f5b2fb |
|
21-Aug-2012 |
Kenneth Graunke <kenneth@whitecape.org> |
i965/fs: Remove unused 'sampler' parameter in emit_texture_genX(). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Paul Berry <stereotype441@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs.h
|
454dc83f66643e66ea7ee9117368211f0cfe84d7 |
|
21-Jun-2012 |
Eric Anholt <eric@anholt.net> |
i965/fs: Communicate the pull constant block read parameters through fs_regs. I wanted to add the surface index as a variable value for UBO support, and a reg seemed like the obvious way to go. This exposes more of the information to CSE, which we'll probably want to apply to pull constant loads for UBOs eventually (you might access 4 floats in a row, each of which would produce an oword block read of the same block). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs.h
|
c0f60106df724188d6ffe7c9f21eeff22186ab25 |
|
05-Aug-2012 |
Kenneth Graunke <kenneth@whitecape.org> |
i965/fs: Don't clobber sampler message MRFs with subexpressions. Consider a texture call such as: textureLod(s, coordinate, log2(...)) First, we begin setting up the sampler message by loading the texture coordinates into MRFs, starting with m2. Then, we realize we need the LOD, and go to compute it with: ir->lod_info.lod->accept(this); On Gen4-5, this will generate a SEND instruction to compute log2(), loading the operand into m2, and clobbering our texcoord. Similar issues exist on Gen6+. For example, nested texture calls: textureLod(s1, c1, texture(s2, c2).x) Any texturing call where evaluating the subexpression trees for LOD or shadow comparator would generate SEND instructions could potentially break. In some cases (like register spilling), we get lucky and avoid the issue by using non-overlapping MRF regions. But we shouldn't count on that. Fixes four Piglit test regressions on Gen4-5: - glsl-fs-shadow2DGradARB-{01,04,07,cumulative} NOTE: This is a candidate for stable release branches. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=52129 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs.h
|
27bf9c1997b77f85c2099436e9ad5dfc0f1608c7 |
|
05-Aug-2012 |
Kenneth Graunke <kenneth@whitecape.org> |
i965/fs: Factor out texcoord setup into a helper function. With the textureRect support and GL_CLAMP workarounds, it's grown sufficiently that it deserves its own function. Separating it out makes the original function much more readable. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs.h
|
82bfb4b41af7d61aa45e41d62c1842b6a09e9585 |
|
05-Aug-2012 |
Kenneth Graunke <kenneth@whitecape.org> |
i965/fs: Move message header and texture offset setup to generate_tex(). Setting the texture offset bits in the message header involves very specific hardware register descriptions. As such, I feel it's better suited for the lower level "generate" layer that has direct access to the weird register layouts, rather than at the fs_inst abstraction layer. This also parallels the approach I took in the VS backend. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs.h
|
9544e44262651a51ffdb3a572f99f902807a6205 |
|
19-Jul-2012 |
Paul Berry <stereotype441@gmail.com> |
i965: Replace fs_visitor::kill_emitted with gl_fragment_program::UsesKill. The kill_emitted variable was duplicating the functionality of gl_fragment_program::UsesKill. There's no need for both. Reviewed-by: Eric Anholt <eric@anholt.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs.h
|
a454f8ec6df9334df42249be910cc2d57d913bff |
|
07-Jul-2012 |
Eric Anholt <eric@anholt.net> |
i965/fs.h: Refactor tests for instructions modifying a register. There's one instance of a potential behavior change: propagate_constants may now propagate into a part of a vgrf after a different part of it was overwritten by a send that returns multiple registers. I don't think we ever generate IR that meets that condition, but it's something to note if we bisect behavior change to this. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs.h
|
a6411520b40d59a8806289c7aaea4a6b26a54443 |
|
06-Jul-2012 |
Eric Anholt <eric@anholt.net> |
i965/fs: Rename virtual_grf_next to virtual_grf_count. "count" is a more useful name, since most of the time we're using it for looping over the variables. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs.h
|
fe27916ddf41b9fb60c334c47c1aa81b8dd9005e |
|
04-Jul-2012 |
Eric Anholt <eric@anholt.net> |
i965/fs: Move class functions from the header to .cpp files. Cuts compile time for brw_fs.h changes from 2.7s to .7s and reduces i965_dri.so size by 70k. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs.h
|
458f7f014139deb48a4cf0a9e6bdca3a57d24208 |
|
06-Jun-2012 |
Eric Anholt <eric@anholt.net> |
i965/fs: Move copy propagation test out to a separate function. It's going to get more complicated in a moment. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs.h
|
3f929efa2872aa5a4402520ec9fd551392e2413a |
|
18-Jun-2012 |
Paul Berry <stereotype441@gmail.com> |
i965/fs: Add FS_OPCODE_MOV_DISPATCH_TO_FLAGS to fragment shader backend. In order to compute centroid varyings correctly, the fragment shader needs to be able to load the current pixel/sample mask into a flag register. This patch adds an opcode to the fragment shader back-end to do this; the opcode gets translated into the instruction mov(1) f0<1>UW g1.14<0,1,0>UW { align1 WE_all } Since this instruction clobbers f0, instruction scheduling has to treat it the same as instructions that have a conditional modifier. Reviewed-by: Eric Anholt <eric@anholt.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs.h
|
d1056541e239dfcee0ad6af2fd2d9fab37dbf025 |
|
18-Jun-2012 |
Paul Berry <stereotype441@gmail.com> |
i965/msaa: Add backend support for centroid interpolation. This patch causes the fragment shader to be configured correctly (and the correct code to be generated) for centroid interpolation. This required two changes: brw_compute_barycentric_interp_modes() needs to determine when centroid barycentric coordinates need to be included in the pixel shader thread payload, and fs_visitor::emit_general_interpolation() needs to interpolate using the correct set of barycentric coordinates. Fixes piglit tests "EXT_framebuffer_multisample/interpolation {2,4} centroid-edges" on i965. Reviewed-by: Eric Anholt <eric@anholt.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs.h
|
cf0e7aa9f8bc9c175ebd9b2ab3a8bfec4afc5abf |
|
21-Jun-2012 |
Paul Berry <stereotype441@gmail.com> |
i965/fs: Refactor interpolation code to prepare for adding centroid support. Reviewed-by: Eric Anholt <eric@anholt.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs.h
|
82d25963a838cfebdeb9b080169979329ee850ea |
|
20-Jun-2012 |
Paul Berry <stereotype441@gmail.com> |
i965: Compute dFdy() correctly for FBOs. On i965, dFdx() and dFdy() are computed by taking advantage of the fact that each consecutive set of 4 pixels dispatched to the fragment shader always constitutes a contiguous 2x2 block of pixels in a fixed arrangement known as a "sub-span". So we calculate dFdx() by taking the difference between the values computed for the left and right halves of the sub-span, and we calculate dFdy() by taking the difference between the values computed for the top and bottom halves of the sub-span. However, there's a subtlety when FBOs are in use: since FBOs use a coordinate system where the origin is at the upper left, and window system framebuffers use a coordinate system where the origin is at the lower left, the computation of dFdy() needs to be negated for FBOs. This patch modifies the fragment shader back-ends to negate the value of dFdy() when an FBO is in use. It also modifies the code that populates the program key (brw_wm_populate_key() and brw_fs_precompile()) so that they always record in the program key whether we are rendering to an FBO or to a window system framebuffer; this ensures that the fragment shader will get recompiled when switching between FBO and non-FBO use. This will result in unnecessary recompiles of fragment shaders that don't use dFdy(). To fix that, we will need to adapt the GLSL and NV_fragment_program front-ends to record whether or not a given shader uses dFdy(). I plan to implement this in a future patch series; I've left FIXME comments in the code as a reminder. Fixes Piglit test "fbo-deriv". NOTE: This is a candidate for stable release branches. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs.h
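A minimal sketch of the sub-span arithmetic described in the dFdy() commit above, not the driver's actual code. The `sub_span` struct and both function names are hypothetical. The sketch also assumes that the differences are taken as right minus left and top minus bottom; the commit message only states that the dFdy() result is negated when an FBO is in use.

```cpp
// Hypothetical model of one 2x2 "sub-span": the four values a shader
// computed for the pixels of the block.
struct sub_span {
   float top_left, top_right;
   float bottom_left, bottom_right;
};

// dFdx(): difference between the right and left halves of the sub-span.
// The sign convention (right minus left) is an assumption of this sketch.
constexpr float sketch_dfdx(const sub_span &s)
{
   return s.top_right - s.top_left;
}

// dFdy(): difference between the top and bottom halves of the sub-span.
// FBOs put the origin at the upper left, window system framebuffers at the
// lower left, so the result is negated when rendering to an FBO.
constexpr float sketch_dfdy(const sub_span &s, bool render_to_fbo)
{
   return render_to_fbo ? -(s.top_left - s.bottom_left)
                        :  (s.top_left - s.bottom_left);
}
```

Because `render_to_fbo` changes the generated code, a flag like it has to be part of whatever key selects the compiled shader, which is why the commit records it in the program key.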
|
2f18698220d8b27991fab550c4721590d17278e0 |
|
01-Jun-2012 |
Kenneth Graunke <kenneth@whitecape.org> |
i965/fs: Fix user-defined FS outputs with less than four components. OpenGL allows you to declare user-defined fragment shader outputs with less than four components: out ivec2 color; This makes sense if you're rendering to an RG format render target. Previously, we assumed that all color outputs had four components (like the built-in gl_FragColor/gl_FragData variables). This caused us to call emit_color_write for invalid indices, incrementing the output virtual GRF's reg_offset beyond the size of the register. This caused cascading failures: split_virtual_grfs would allocate new size-1 registers based on the virtual GRF size, but then proceed to rewrite the out-of-bounds accesses assuming that it had allocated enough new (contiguously numbered) registers. This resulted in instructions that accessed size-1 GRFs with register numbers beyond virtual_grf_next (i.e. registers that were never allocated). Finally, this manifested as live variable analysis and instruction scheduling accessing their temporary array with an out of bounds index (as they're all sized based on virtual_grf_next), and the program would segfault. It looks like the hardware's Render Target Write message requires you to send four components, even for RT formats such as RG or RGB. This patch continues to use all four MRFs, but doesn't bother to fill any data for the last few, which should be unused. +2 oglconforms. NOTE: This is a candidate for stable release branches. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs.h
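The fix described in the commit above can be illustrated with a small sketch: keep the four-slot message layout, but copy only as many components as the output variable declares. Everything here (`color_payload_sketch`, `emit_color_write_sketch`, the `filled` flags) is hypothetical and only models the idea; it is not the driver's emit_color_write.

```cpp
#include <array>
#include <cassert>

// Hypothetical stand-in for the four message registers of a render target
// write; filled[i] records whether slot i received shader data.
struct color_payload_sketch {
   std::array<float, 4> mrf;
   std::array<bool, 4> filled;
};

// Copy only the components the output variable really has. The payload still
// spans four slots, but the trailing ones get no data instead of being read
// from beyond the end of the source register.
inline color_payload_sketch
emit_color_write_sketch(const float *value, int components)
{
   assert(components >= 1 && components <= 4);

   color_payload_sketch payload{};   // zero-initialized: nothing filled yet
   for (int i = 0; i < components; i++) {
      payload.mrf[i] = value[i];
      payload.filled[i] = true;
   }
   return payload;
}
```

For an `ivec2`-style output, a caller would pass `components = 2` and slots 2 and 3 would stay unfilled.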
|
29362875f2613ad87abe7725ce3c56c36d16cf9b |
|
25-Apr-2012 |
Eric Anholt <eric@anholt.net> |
i965/gen6+: Add support for GL_ARB_blend_func_extended. v2: Add support for gen6, and don't turn it on if blending is disabled. (fixes GPU hang), and note it in docs/GL3.txt Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs.h
|
f220f73b9c5aca16ca21ea8bbbbf8718703b12cf |
|
08-May-2012 |
Eric Anholt <eric@anholt.net> |
i965/fs: Do more register coalescing by using the interference graph. By using the live variables code for determining interference, we can handle coalescing in the presence of control flow, which the other register coalescing path couldn't. Total instructions: 207184 -> 206990 74/1246 programs affected (5.9%) 33993 -> 33799 instructions in affected programs (0.6% reduction) There is a newerth shader that loses out, because of some extra MOVs that now get their dead-code nature obscured by coalescing. This should be fixed by doing better at dead code elimination.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs.h
|
9e9ae280e215988287b0f875c81bc2e146b9f5dd |
|
04-May-2012 |
Eric Anholt <eric@anholt.net> |
Revert "i965/fs: Jump from discard statements to the end of the program when done." This reverts commit 31866308fcf989df992ace28b5b986c3d3770e90. Fixes piglit glsl-fs-discard-exit-3 and unigine tropics rendering. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs.h
|
d7787adda8006506545256547d8d590a282487af |
|
08-May-2012 |
Eric Anholt <eric@anholt.net> |
i965/fs: Add support for copy propagation. We could do more by handling abs/negate and non-GRF sources, but this is a good start. Improves tropics performance 0.30% +/- .17% (n=43). shader-db results: Total instructions: 208032 -> 207184 60/1246 programs affected (4.8%) 23286 -> 22438 instructions in affected programs (3.6% reduction) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs.h
|
a4e9b5a768d2d9e59b6054148afb6a6b94c0e4e6 |
|
11-May-2012 |
Kenneth Graunke <kenneth@whitecape.org> |
i965/fs: Add a local common subexpression elimination pass. Total instructions: 18210 -> 17836 49/163 programs affected (30.1%) 12888 -> 12514 instructions in affected programs (2.9% reduction) This reduces Lightsmark's "Scale down filter" shader from 395 instructions to 283, a whopping 28%. It also reduces register pressure significantly: the SIMD8 program now uses 29 registers instead of 101, giving us more than enough room for a SIMD16 program. v2: Add && !inst->conditional_mod to the "skip some instructions" check. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs.h
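For readers unfamiliar with the technique named in the commit above, here is a minimal sketch of local (single basic block) common subexpression elimination. The instruction model (`inst_sketch`, `op_sketch`) and the function name are hypothetical, and the sketch ignores everything the real pass has to care about, such as conditional modifiers, flags and SEND instructions.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

enum class op_sketch { MOV, ADD, MUL };

struct inst_sketch {
   op_sketch op;
   int dst, src0, src1;   // virtual register numbers; src1 is -1 for MOV
};

// Rewrites repeated expressions inside one basic block into MOVs from the
// register that already holds the value. Returns the number of rewrites.
inline int local_cse_sketch(std::vector<inst_sketch> &block)
{
   std::vector<inst_sketch> available;   // expressions whose result is still valid
   int rewritten = 0;

   for (inst_sketch &inst : block) {
      bool replaced = false;
      if (inst.op != op_sketch::MOV) {
         for (const inst_sketch &prev : available) {
            if (prev.op == inst.op && prev.src0 == inst.src0 &&
                prev.src1 == inst.src1) {
               inst = inst_sketch{op_sketch::MOV, inst.dst, prev.dst, -1};
               replaced = true;
               rewritten++;
               break;
            }
         }
      }

      // Writing inst.dst invalidates every entry that reads or holds it.
      const int written = inst.dst;
      available.erase(
         std::remove_if(available.begin(), available.end(),
                        [written](const inst_sketch &a) {
                           return a.dst == written || a.src0 == written ||
                                  a.src1 == written;
                        }),
         available.end());

      // Remember this expression unless it overwrote one of its own sources.
      if (!replaced && inst.op != op_sketch::MOV &&
          inst.src0 != written && inst.src1 != written)
         available.push_back(inst);
   }
   return rewritten;
}
```

The pass is "local" because the `available` list is never carried across block boundaries, so no data-flow analysis is needed.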
|
d1029f99884e2ba7f663765274cd6bdb4f82feed |
|
11-May-2012 |
Kenneth Graunke <kenneth@whitecape.org> |
i965/fs: Use a const reference in fs_reg::equals instead of a pointer. This lets you omit some ampersands and is more idiomatic C++. Using const also marks the function as not altering either register (which was obvious, but nice to enforce). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs.h
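The difference at the call site can be shown with a small sketch. `reg_sketch` and its fields are hypothetical stand-ins for fs_reg, and the pointer-taking variant only illustrates the assumed shape of the old signature.

```cpp
// Hypothetical stand-in for fs_reg with just enough state to compare.
struct reg_sketch {
   int file;
   int reg;
   int reg_offset;

   // Assumed old style: pointer argument, so callers write a.equals_ptr(&b).
   constexpr bool equals_ptr(const reg_sketch *r) const
   {
      return file == r->file && reg == r->reg && reg_offset == r->reg_offset;
   }

   // New style: const reference, so callers write a.equals(b), and neither
   // operand can be modified by the comparison.
   constexpr bool equals(const reg_sketch &r) const
   {
      return file == r.file && reg == r.reg && reg_offset == r.reg_offset;
   }
};
```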
|
dc42910e98dc00760255cc4579da458de09175b9 |
|
24-Apr-2012 |
Eric Anholt <eric@anholt.net> |
i965/fs: Fix regression in comparison handling from ANDs change. I had fixed up the logic ops for delayed ANDing, but not equality comparisons on bools. Fixes new piglit fs-bool-less-compare-true. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=48629
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs.h
|
080b125c64b48447a515b1a169f779e62b3de13d |
|
10-Apr-2012 |
Eric Anholt <eric@anholt.net> |
i965: Add basic block generator. This takes the fs_inst list generated by the visitor, and generates a list of basic blocks with edges between them. This is a building block for data-flow analysis.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs.h
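As an illustration of what a basic block generator produces, here is a sketch of the classic leader-based construction on a hypothetical jump-based instruction list. This is a swapped-in textbook technique, not the driver's algorithm: the i965 fs_inst list uses structured IF/ELSE/ENDIF and loop instructions, and all names below are invented for the example.

```cpp
#include <cassert>
#include <iterator>
#include <set>
#include <vector>

enum class kind_sketch { PLAIN, JUMP, BRANCH };   // BRANCH = conditional jump

struct inst_sketch {
   kind_sketch kind;
   int target;   // index of the jump target; ignored for PLAIN
};

struct block_sketch {
   int start, end;               // inclusive instruction range
   std::vector<int> successors;  // indices into the returned block list
};

inline std::vector<block_sketch>
build_cfg_sketch(const std::vector<inst_sketch> &insts)
{
   const int n = static_cast<int>(insts.size());
   std::vector<block_sketch> blocks;
   if (n == 0)
      return blocks;

   // A leader starts a block: the first instruction, every jump target,
   // and every instruction that follows a jump.
   std::set<int> leaders = {0};
   for (int i = 0; i < n; i++) {
      if (insts[i].kind == kind_sketch::PLAIN)
         continue;
      assert(insts[i].target >= 0 && insts[i].target < n);
      leaders.insert(insts[i].target);
      if (i + 1 < n)
         leaders.insert(i + 1);
   }

   // Each block runs from its leader up to the instruction before the next.
   std::vector<int> block_of(n, 0);
   for (auto it = leaders.begin(); it != leaders.end(); ++it) {
      auto next = std::next(it);
      const int end = (next == leaders.end() ? n : *next) - 1;
      for (int i = *it; i <= end; i++)
         block_of[i] = static_cast<int>(blocks.size());
      block_sketch b{*it, end, {}};
      blocks.push_back(b);
   }

   // Edges: the jump target (if any), then the fall-through block.
   for (block_sketch &b : blocks) {
      const inst_sketch &last = insts[b.end];
      if (last.kind != kind_sketch::PLAIN)
         b.successors.push_back(block_of[last.target]);
      if (last.kind != kind_sketch::JUMP && b.end + 1 < n)
         b.successors.push_back(block_of[b.end + 1]);
   }
   return blocks;
}
```

The resulting blocks and edges are the input that data-flow passes such as live variable analysis iterate over.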
|
32ae8d3b321185a85b73ff703d8fc26bd5f48fa7 |
|
10-Mar-2012 |
Eric Anholt <eric@anholt.net> |
i965/fs: Try to avoid generating extra MOVs to do saturates. This change (before the previous two) produced a .23% +/- .11% performance improvement in Unigine Tropics at 1024x768 on IVB. Total instructions: 269270 -> 262649 614/2148 programs affected (28.6%) 179386 -> 172765 instructions in affected programs (3.7% reduction) v2: Move some of the logic of finding the instruction that produced the result of an expression tree to a helper.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs.h
|
31866308fcf989df992ace28b5b986c3d3770e90 |
|
19-Dec-2011 |
Eric Anholt <eric@anholt.net> |
i965/fs: Jump from discard statements to the end of the program when done. From the GLSL 1.30 spec: The discard keyword is only allowed within fragment shaders. It can be used within a fragment shader to abandon the operation on the current fragment. This keyword causes the fragment to be discarded and no updates to any buffers will occur. Control flow exits the shader, and subsequent implicit or explicit derivatives are undefined when this control flow is non-uniform (meaning different fragments within the primitive take different control paths). v2: Don't emit the final HALT if no other HALTs were emitted. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (v1)
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs.h
|
8ab02b511882857a09fceed0e93bf4a0b25c17b2 |
|
14-Feb-2012 |
Kenneth Graunke <kenneth@whitecape.org> |
i965/fs: Add a new fs_inst::regs_written function. Certain instructions write more than one register. Texturing, for example, returns 4 registers. (We set rlen to 4 even for TXS and float shadow sampling.) Some math functions return 2. Most return 1. The next commit introduces a use of this function. NOTE: This is a candidate for the 8.0 branch (dependency of a fix). Reviewed-by: Eric Anholt <eric@anholt.net> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs.h
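A minimal sketch of the idea behind regs_written, using the counts given in the commit message above (4 for texturing, 2 for some math functions, 1 otherwise). The `opcode_sketch` enum and both function names are hypothetical; in particular, which math opcodes return two registers is not stated in the message, so a single placeholder enumerator stands in for them.

```cpp
// Hypothetical opcode classes; MATH_TWO_RESULTS is a placeholder for the
// math functions that return two registers.
enum class opcode_sketch { ALU, MATH_TWO_RESULTS, TEX };

// Number of consecutive registers an instruction writes, starting at dst.
constexpr int regs_written_sketch(opcode_sketch op)
{
   return op == opcode_sketch::TEX ? 4
        : op == opcode_sketch::MATH_TWO_RESULTS ? 2
        : 1;
}

// Example use: does an instruction writing at dst also overwrite reg?
constexpr bool overwrites_sketch(opcode_sketch op, int dst, int reg)
{
   return reg >= dst && reg < dst + regs_written_sketch(op);
}
```

A pass that only compared `dst` against `reg` would miss the case where a texture result starting at register 10 also clobbers registers 11 through 13.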
|
7d55f37b0e87db9b3806088797075161a1c9a8bb |
|
07-Feb-2012 |
Eric Anholt <eric@anholt.net> |
i965/fs: Add support for generating MADs. Improves nexuiz performance 0.65% +/- .10% (n=5) on my gen6, and .39% +/- .11% (n=10) on gen7. No statistically significant performance difference on warsow (n=5, but only one shader has MADs). v2: Add support for MADs in 16-wide by using compression control. v3: Don't generate MADs when it will force an immediate to be moved to a temp. (it's not clear whether this is a win or not, but it should result in less questionable change to codegen compared to v2). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (v2)
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs.h
|
e910241e9754b6e673ed0fc3133c8b1de56e76c7 |
|
27-Jan-2012 |
Eric Anholt <eric@anholt.net> |
i965/fs: Fix rendering corruption in unigine tropics. We were allocating registers into the MRF hack region, resulting in sparkly rendering in a few of the scenes. We could do better allocation by making an MRF class, having MRFs conflict with the corresponding GRFs, and tracking the live intervals of the "MRF"s and setting up the conflicts. But this is way easier for the moment. NOTE: This is a candidate for the 8.0 branch. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs.h
|
febad1779ae5cb5c85d66c2635baea62da52d2fa |
|
26-Oct-2011 |
Kenneth Graunke <kenneth@whitecape.org> |
i965: Rename texturing ops from FS_OPCODE to SHADER_OPCODE, except TXB. We'll be reusing most of these for the VS shortly. The one exception is TXB (texturing with LOD bias), which is explicitly forbidden in the VS. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs.h
|
a3b8c5ed5bd591d4ae7d215f71f039d3b19200bb |
|
23-Nov-2011 |
Eric Anholt <eric@anholt.net> |
i965/fs: Make register file enum 0 be the undefined register file. In 6d874d0ee18b3694c49e0206fa519bd8b746ec24, I checked whether a register that had been stored was BAD_FILE (as opposed to a legitimate GRF), but actually the unset register was ARF NULL because it had been memset to 0. Finding BAD_FILE for unset values in debugging was my intention with that file, so make it the case more often by rearranging the enum. There was only one place we relied on the magic enum register_file to hardware register file correspondence anyway.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs.h
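The effect of putting the undefined file at enum value 0 can be sketched as follows. The enumerator list and its order after BAD_FILE, the `reg_sketch` struct and the helper name are assumptions made for the example; only "BAD_FILE is 0" comes from the commit message above.

```cpp
#include <cstring>

// Hypothetical register file enum with the undefined file at value 0.
enum register_file_sketch { BAD_FILE = 0, ARF, GRF, MRF, IMM };

struct reg_sketch {
   register_file_sketch file;
   int reg;
};

// Storage that was cleared with memset() and never assigned now reads back
// as BAD_FILE instead of looking like a legitimate register file.
inline reg_sketch memset_reg_sketch()
{
   reg_sketch r;
   std::memset(&r, 0, sizeof(r));
   return r;
}
```

With this ordering, a check such as `r.file == BAD_FILE` catches both explicitly invalid registers and ones that were only zero-filled.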
|
6d874d0ee18b3694c49e0206fa519bd8b746ec24 |
|
09-Nov-2011 |
Eric Anholt <eric@anholt.net> |
i965/fs: Add support for user-defined out variables. Before, I was tracking the ir_variable * found for gl_FragColor or gl_FragData[]. Instead, when visiting those variables, set up an array of per-render-target fs_regs to copy the output data from. This cleans up the color emit path, while making handling of multiple user-defined out variables easier. v2: incorporate idr's feedback about ir->location (changes by Kenneth Graunke) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs.h
|
a73c65c5342bf41fa0dfefe7daa9197ce6a11db4 |
|
18-Oct-2011 |
Kenneth Graunke <kenneth@whitecape.org> |
i965: Enable faster workaround-free math on Ivybridge. According to the documentation, Ivybridge's math instruction works in SIMD16 mode for the fragment shader, and no longer forbids align16 mode for the vertex shader. The documentation claims that SIMD16 mode isn't supported for INT DIV, but empirical evidence shows that it works fine. Presumably the note is trying to warn us that the variant that returns both quotient and remainder in (dst, dst + 1) doesn't work in SIMD16 mode since dst + 1 would be sechalf(dst), trashing half your results. Since we don't use that variant, we don't care and can just enable SIMD16 everywhere. The documentation also still claims that source modifiers and conditional modifiers aren't supported, but empirical evidence and study of the simulator both show that they work just fine. Goodbye workarounds. Math just works now. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Paul Berry <stereotype441@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs.h
|
e04bdeae82797dbdcf6f544a997a4626fdfd4aee |
|
22-Oct-2011 |
Paul Berry <stereotype441@gmail.com> |
i965/gen6+: Parameterize barycentric interpolation modes. This patch modifies the fragment shader back-end so that instead of using a single delta_x/delta_y register pair to store barycentric coordinates, it uses an array of such register pairs, one for each possible interpolation mode. When setting up the WM, we instruct it to only provide the barycentric coordinates that are actually needed by the fragment shader--that is computed by brw_compute_barycentric_interp_modes(). Currently this function returns just BRW_WM_PERSPECTIVE_PIXEL_BARYCENTRIC, because this is the only interpolation mode we support. However, that will change in a later patch. Reviewed-by: Eric Anholt <eric@anholt.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs.h
|
73b0a28ba8b3e2ab917d4c729f34ddbde52c9e88 |
|
04-Oct-2011 |
Eric Anholt <eric@anholt.net> |
i965/fs: Fix comparisons with uint negation. The condmod instruction ends up generating garbage condition codes, because apparently the comparison happens on the accumulator value (33 bits for UD), not the truncated value that would be written. Fixes fs-op-neg-* Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs.h
|
2e5a1a254ed81b1d3efa6064f48183eefac784d0 |
|
07-Oct-2011 |
Kenneth Graunke <kenneth@whitecape.org> |
intel: Convert from GLboolean to 'bool' from stdbool.h. I initially produced the patch using this bash command: for file in {intel,i915,i965}/*.{c,cpp,h}; do [ ! -h $file ] && sed -i 's/GLboolean/bool/g' $file && sed -i 's/GL_TRUE/true/g' $file && sed -i 's/GL_FALSE/false/g' $file; done Then I manually added #include <stdbool.h> to fix compilation errors, and converted a few functions back to GLboolean that were used in core Mesa's function pointer table to avoid "incompatible pointer" warnings. Finally, I cleaned up some whitespace issues introduced by the change. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Chad Versace <chad@chad-versace.us> Acked-by: Paul Berry <stereotype441@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs.h
|
de772c402215b956ab3aa0875330fc1bf7cdf95b |
|
21-Aug-2011 |
Ian Romanick <ian.d.romanick@intel.com> |
mesa: Use gl_shader_program::_LinkedShaders instead of FragmentProgram Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs.h
|
ff8f272b0d02b41a0ce34ab6af7119b9e06f4961 |
|
29-Sep-2011 |
Kenneth Graunke <kenneth@whitecape.org> |
i965/fs: Implement integer quotient and remainder math operations. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Tested-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs.h
|
74e927bcafad0a994be5f88fbda4058bef08bc51 |
|
18-Aug-2011 |
Kenneth Graunke <kenneth@whitecape.org> |
i965/fs: Split generate_math into gen4/gen6 and 1/2 operand variants. This mirrors the structure Eric used in the new VS backend, and seems simpler. In particular, the math1/math2 split will avoid having to figure out how many operands there are, as this is already known by the caller. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs.h
|
30be2cc6c7c3378ee17885b5bf41d7ae53bf6fe0 |
|
26-Aug-2011 |
Kenneth Graunke <kenneth@whitecape.org> |
i965/fs: Implement texelFetch() on Ironlake and Sandybridge. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs.h
|
2f0edc60f4bd2ae5999a6afa656e3bb3f181bf0f |
|
26-Aug-2011 |
Chad Versace <chad@chad-versace.us> |
i965: Fix Android build by removing relative includes. Replace each occurrence of #include "../glsl/*.h" with #include "glsl/*.h" Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Signed-off-by: Chad Versace <chad@chad-versace.us>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs.h
|
dc7f449d1ac53a66e6efb56ccf2a5953418a26ca |
|
26-Aug-2011 |
Kenneth Graunke <kenneth@whitecape.org> |
i965: Avoid generating MOVs for most ir_assignment handling. This is a port of vec4_visitor::try_rewrite_rhs_to_dst to fs_visitor. Not only is this technique less invasive and more robust, it also generates better code. Over and above the previous technique, this reduced instruction count in shader-db by 0.28% on average and 1.4% in the best case. In no case did this technique result in more code than the prior method. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Signed-off-by: Eric Anholt <eric@anholt.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs.h
|
d28a3bd4bf25157aff5379a003bbf4a66157ed06 |
|
26-Aug-2011 |
Kenneth Graunke <kenneth@whitecape.org> |
i965/fs: Revert "Avoid generating MOVs for assignments for expressions." This reverts commit 53c89c67f33639afef951e178f93f4e29acc5d53, along with the subsequent this->result = reg_undef additions it required. Both Eric and I agree that the way he did this is really fragile; if you forget to add this->result = reg_undef before calling accept(), it may end up using the same register for two separate things, breaking things in strange and mysterious ways. The next commit will port over the new VS backend's method for solving this problem, which is simpler, less intrusive, and still manages to avoid MOVs in the common case.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs.h
|
ecf8963754489abfb5097c130a9bcd4cdb76b6bd |
|
19-Jun-2011 |
Kenneth Graunke <kenneth@whitecape.org> |
i965/fs: Implement textureSize (TXS) on Gen5+. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs.h
|
65b5cbbcf783f6c668ab5b31a0734680dd396794 |
|
05-Aug-2011 |
Eric Anholt <eric@anholt.net> |
i965: Rename math FS_OPCODE_* to SHADER_OPCODE_*. I want to just use the same enums in the VS.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs.h
|
6034b9a5124475d300d0678bd2fb6160865fa972 |
|
03-May-2011 |
Eric Anholt <eric@anholt.net> |
i965: Create a shared enum for hardware and compiler-internal opcodes. This should make gdbing more pleasant, and it might be used in sharing part of the codegen between the VS and FS backends.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs.h
|
c9e81fe14f36933617c862efb15ae09194485eab |
|
15-May-2011 |
Eric Anholt <eric@anholt.net> |
i965: Drop the reg/hw_reg distinction. "reg" was set in only one case, virtual GRFs pre register allocation, and would be unset and have hw_reg set after allocation. Since we never bothered with looking at virtual GRF number after allocation anyway, just use the same storage and avoid confusion.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs.h
|
b76378d46a211521582cfab56dc05031a57502a6 |
|
04-May-2011 |
Eric Anholt <eric@anholt.net> |
i965/fs: Eliminate the magic nature of virtual GRF 0. This was a debugging aid at one point -- virtual grf 0 should never be allocated, and it would be used if undefined register access occurred in codegen. However, it made the confusing register allocation code even more confusing by indexing things off of 1 all over.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs.h
|
ee0373b833155804bb8846c6f05f897b9ee5afa6 |
|
26-Jul-2011 |
Eric Anholt <eric@anholt.net> |
i965/fs: Don't upload unused uniform components. This saves both register space and upload bandwidth for unused values. Note that previously we were relying on the visitor not initially generating references to different sets of uniforms between the 8-wide and 16-wide code generation, and now we're relying on them dead-code eliminating the same stuff, too.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs.h
|
a8b86459a1bb74cfdf0d63572a9fe194b2b5b53f |
|
23-Jul-2011 |
Eric Anholt <eric@anholt.net> |
i965/fs: Optimize a * 1.0 -> a. This appears in our instruction stream as a result of the brw_vs_constval.c handling.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs.h
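A minimal sketch of the `a * 1.0 -> a` rewrite named in the commit above, on a hypothetical instruction model. `inst_sketch`, `op_sketch` and the function name are invented for the example, and the sketch assumes the constant is the second, immediate operand.

```cpp
enum class op_sketch { MOV, MUL };

// Hypothetical two-source instruction: src0 is a register number and src1
// is either a register or a float immediate.
struct inst_sketch {
   op_sketch op;
   int dst;
   int src0;
   bool src1_is_imm;
   float src1_imm;
};

// MUL dst, a, 1.0 computes the same value as MOV dst, a, so rewrite it.
// Any other instruction is returned unchanged.
constexpr inst_sketch simplify_mul_by_one(const inst_sketch &inst)
{
   return (inst.op == op_sketch::MUL && inst.src1_is_imm &&
           inst.src1_imm == 1.0f)
             ? inst_sketch{op_sketch::MOV, inst.dst, inst.src0, false, 0.0f}
             : inst;
}
```

Turning the multiply into a MOV also gives later passes, such as copy propagation, a chance to remove the instruction entirely.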
|
01fa9addf447120e994415ad8fc8246ac234ec27 |
|
10-Jun-2011 |
Kenneth Graunke <kenneth@whitecape.org> |
i965/fs: Refactor texture result swizzling into a helper function. The next patch will add a few additional uses. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs.h
|
8752764076e5b3f052a57e0134424a37bf2e9164 |
|
17-May-2011 |
Eric Anholt <eric@anholt.net> |
i965/fs: Do a FS compile up front at link time to produce link errors. At glLinkShaders time, a fail() call in FS compile in 8-wide (the one that's required to succeed, though we may relax that at some point for pre-Ironlake performance) will now report out as a link error.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs.h
|
d1f70a8a6c6ec7007bad22d3d6013415be2d243a |
|
25-May-2011 |
Eric Anholt <eric@anholt.net> |
i965/fs: Split the GLSL IR -> FS LIR visitor to brw_fs_visitor.cpp. We now have: brw_fs.cpp handles calling out to everything and optimization. brw_fs_visitor.cpp handles translating to our LIR. brw_fs_emit.cpp handles emitting from our LIR to native code. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs.h
|
53c89c67f33639afef951e178f93f4e29acc5d53 |
|
27-Apr-2011 |
Eric Anholt <eric@anholt.net> |
i965: Avoid generating MOVs for assignments of expressions. No statistically significant difference measured in 3dbenchmark egypt/pro. It does reduce fragment shader instructions across shader-db by 0.3%.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs.h
|
b126a0c0cb30b1e2f2df1953fe14d8596d1cf4f7 |
|
02-Nov-2010 |
Eric Anholt <eric@anholt.net> |
i965: Add support for correct GL_CLAMP behavior by clamping coordinates. This removes the stupid strict-conformance fallback code I broke when adding ARB_sampler_objects. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=36572 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (v1)
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs.h
|
51761a1aefd31b7df12edd9467ac630b9cbbbbc9 |
|
11-May-2011 |
Eric Anholt <eric@anholt.net> |
i965/fs: Cut an instruction and a temporary from gen6 discard statements. I thought I was thwarted initially when I couldn't do conditional mod on a MOV, and couldn't use two immediate constants in one instruction. But g0 != g0 is also a way to produce a failing comparison. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs.h
|
ff6e3c73f6553cd29b915497b5b00e3ef158a27d |
|
29-Apr-2011 |
Kenneth Graunke <kenneth@whitecape.org> |
i965: Add support for Ivybridge texturing messages. Ivybridge puts the shadow comparator first, then lod/bias, and finally the coordinate---unlike previous generations which always reserved four slots for the coordinate at the beginning. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs.h
|
3b20f999bb7e9056e83ca09a842a9747d4ac1674 |
|
23-Mar-2011 |
Eric Anholt <eric@anholt.net> |
i965/fs: Add support for 16-wide dispatch with uniforms in use. This is glued in in a bit of an ugly way -- we rely on the uniforms having been set up by 8-wide dispatch, and we just reuse them without the ability to add new uniforms for any reason, since the 8-wide compile is already completed. Today, this all works out because our optimization passes are effectively the same for both and even if they weren't, we don't reduce the set of uniforms pushed after optimization. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs.h
|
42ad2f0b9b6a18f1613f6d915a46b4a4a89c5aa2 |
|
14-Mar-2011 |
Eric Anholt <eric@anholt.net> |
i965/fs: Add support for 16-wide dispatch on gen5. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs.h
|
662f1b48bd1a02907bb42ecda889a3aa52a5755d |
|
12-Mar-2011 |
Eric Anholt <eric@anholt.net> |
i965/fs: Add initial support for 16-wide dispatch on gen6. At this point it doesn't do uniforms, which have to be laid out the same between 8 and 16. Other than that, it supports everything but flow control, which was the thing that forced us to choose 8-wide for general GLSL support. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs.h
|
141b0bb2779c80d3cd3fd21d2e9d10efa0433f26 |
|
21-Mar-2011 |
Eric Anholt <eric@anholt.net> |
i965/fs: Add support for computing pixel_[xy] in 16-wide. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs.h
|
4847f802c28e595130bda14055cd52c9b1f51cd7 |
|
09-Apr-2011 |
Eric Anholt <eric@anholt.net> |
i965/fs: Constant-fold immediates in src0 of SEL instructions. This is like what we do for add/mul, but we have to invert the predicate to choose the other source instead. This removes 5 extra moves of constants in nexuiz shaders. No statistically significant performance difference on my Sandybridge laptop (n=5). Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs.h
|
2911fa0cca86f7acbc5423cab4dd328a412253cd |
|
13-Mar-2011 |
Eric Anholt <eric@anholt.net> |
i965/fs: Make compile failure more verbose with INTEL_DEBUG=wm.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs.h
|
53d78be3bde68bfb6416fb9c1abfbc24030f390e |
|
13-Mar-2011 |
Eric Anholt <eric@anholt.net> |
i965/fs: Clean up the emit calls by introducing emit() overload helpers. I think the code ends up a lot more legible this way, though we've still got the overloads in the fs_inst as well (even though there's only one caller left currently).
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs.h
|
58f7c9c72ee52527610b26ca8a137dd88c082c89 |
|
25-Feb-2011 |
Kenneth Graunke <kenneth@whitecape.org> |
i965/fs: Initial plumbing to support TXD. This adds the opcode and the code to convert ir_txd to OPCODE_TXD; it doesn't actually add support yet.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs.h
|
0f7325b89038937bd428f7c89ed9859189a0ab0b |
|
27-Dec-2010 |
Kenneth Graunke <kenneth@whitecape.org> |
i965: Emit texel offsets in sampler messages.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs.h
|
d3073f58c17d8675a2ecdd5dfa83e5520c78e1a8 |
|
21-Jan-2011 |
Kenneth Graunke <kenneth@whitecape.org> |
Convert everything from the talloc API to the ralloc API.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs.h
|
382c2d99da3f219a5b82f391a81b534b6b44ebce |
|
19-Jan-2011 |
Eric Anholt <eric@anholt.net> |
i965/fs: Add a helper function for detecting math opcodes.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs.h
|
63879d90ace519749fed228ca0e21b5b56c7e1c0 |
|
19-Jan-2011 |
Eric Anholt <eric@anholt.net> |
i965/fs: Add an instruction scheduler. Improves performance of my GLSL demo by 5.1% (+/- 1.4%, n=7). It also reschedules the giant multiply tree at the end of glsl-fs-convolution-1 so that we end up not spilling registers, producing the expected level of performance.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs.h
|
3f2fe31eee1667ef9cad99aaad69e52a09c9effa |
|
19-Jan-2011 |
Eric Anholt <eric@anholt.net> |
i965/fs: Add a helper for detecting texturing opcodes.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs.h
|
e4be665bbddcb6ddfd7b9b13f01152a97097b35c |
|
18-Jan-2011 |
Eric Anholt <eric@anholt.net> |
i965: Fix dead pointers to fp->Parameters->ParameterValues[] after realloc. Fixes texrect-many regression with ff_fragment_shader -- as we added refs to the subsequent texcoord scaling parameters, the array got realloced to a new address while our params[] still pointed at the old location.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs.h
|
c3f000b3926988124a44ce7e8cd6588e46063058 |
|
12-Jan-2011 |
Eric Anholt <eric@anholt.net> |
i965/fs: Do flat shading when appropriate. We were trying to interpolate, which would end up doing unnecessary math, and doing so on undefined values. Fixes glsl-fs-flat-color.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs.h
|
e880a57a71bbd5152ed26367dcc7051f21c20981 |
|
12-Jan-2011 |
Eric Anholt <eric@anholt.net> |
i965: Clarify when we need to (re-)calculate live intervals. The ad-hoc placement of recalculation somewhere between when they got invalidated and when they were next needed was confusing. This should clarify what's going on here.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs.h
|
54df8e48bcceacbfa468d5237f2981b26493df29 |
|
28-Dec-2010 |
Eric Anholt <eric@anholt.net> |
i965: Fix regression in FS comparisons on original gen4 due to gen6 changes. Fixes 26 piglit cases on my GM965.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs.h
|
b6b91fa02911f5dfc5d528d822674ee5557800d9 |
|
19-Nov-2010 |
Eric Anholt <eric@anholt.net> |
i965: Remove duplicate MRF writes in the FS backend. This is quite common for multitexture sampling, and not only cuts down on the second and later set of MOVs, but typically also allows compute-to-MRF on the first set. No statistically significant performance difference in nexuiz (n=3), but it reduces instruction count in one of its shaders and seems like a good idea.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs.h
|
19631fab35ca4d5ca64d606922f3f20774b27645 |
|
19-Nov-2010 |
Eric Anholt <eric@anholt.net> |
i965: Recognize saturates and turn them into a saturated mov. On pre-gen6, this turns 4 instructions into 1. We could still do better by folding the saturate into the instruction generating the value if nobody else uses it, but that should be a separate pass.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs.h
|
5944cda6ed1182f8dc45452708df5fde2474d437 |
|
19-Nov-2010 |
Eric Anholt <eric@anholt.net> |
i965: Just use memset() to clear most members in FS constructors. This should make it a lot harder to forget to zero things.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs.h
|
07cd8f46acc34b04308f81de2faf05ba33da264b |
|
22-Oct-2010 |
Eric Anholt <eric@anholt.net> |
i965: Add support for pull constants to the new FS backend. Fixes glsl-fs-uniform-array-5, but not 6 which fails in ir_to_mesa.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs.h
|
ff622d5528c8cca465e29081c0792ca210cdd092 |
|
22-Oct-2010 |
Eric Anholt <eric@anholt.net> |
i965: Move the FS disasm/annotation printout to codegen time. This makes it a lot easier to track down where we failed when some code emit triggers an assert. Plus, less memory allocation for codegen.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs.h
|
99b2c8570ea6f46c6564681631f0e0750a0641cc |
|
19-Oct-2010 |
Eric Anholt <eric@anholt.net> |
i965: Add support for register spilling. It can be tested with if (0) replaced with if (1) to force spilling for all virtual GRFs. Some simple tests work, but large texturing tests fail.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs.h
|
ae5698e60467db2a7e3f730788cdcdd3711da101 |
|
19-Oct-2010 |
Eric Anholt <eric@anholt.net> |
i965: Use the new style of IF statement with embedded comparison on gen6. "Everyone else" does it this way, so follow suit. It's fewer instructions, anyway.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs.h
|
81d0a1fb3f1e5b7bcf43145f8a096691e3a5fdfb |
|
15-Oct-2010 |
Eric Anholt <eric@anholt.net> |
i965: Set the type of the null register to fix gen6 FS comparisons. We often use reg_null as the destination when setting up the flag regs. However, on gen6 there aren't general implicit conversions to destination types from src types, so the comparison to produce the flag regs would be done on the integer result interpreted as a float. Hilarity ensued. Fixes 20 piglit cases.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs.h
|
d5599c0b6a22cd0bbc475ec715824660144d02a0 |
|
14-Oct-2010 |
Eric Anholt <eric@anholt.net> |
i965: Add a function for handling the move of boolean values to flag regs. This will be a place to peephole comparisons directly to the flag regs, and for now avoids using MOV with conditional mod on gen6, which is now illegal.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs.h
|
4f88550ba0e1ad07e39903f268975921c0101e85 |
|
14-Oct-2010 |
Eric Anholt <eric@anholt.net> |
i965: Add a pass to the FS to split virtual GRFs to float channels. Improves nexuiz performance 0.91% (+/- 0.54%, n=8)
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs.h
|
f9995b30756140724f41daf963fa06167912be7f |
|
12-Oct-2010 |
Kristian Høgsberg <krh@bitplanet.net> |
Drop GLcontext typedef and use struct gl_context instead
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs.h
|
c6dbf253d284f68b0d0e4a3c145583880855324b |
|
08-Oct-2010 |
Eric Anholt <eric@anholt.net> |
i965: Compute to MRF in the new FS backend. This didn't produce a statistically significant performance difference in my demo (n=4) or nexuiz (n=3), but it still seems like a good idea and is recommended by the HW team.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs.h
|
06fd639c519214b6ebcbf29127b6d9ed429f8641 |
|
09-Oct-2010 |
Eric Anholt <eric@anholt.net> |
i965: Give the FB write and texture opcodes the info on base MRF, like math.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs.h
|
0cd6cea8a3e9339fc69f9de0da6b40e4f9d5f4fe |
|
08-Oct-2010 |
Eric Anholt <eric@anholt.net> |
i965: Give the math opcodes information on base mrf/mrf len. This is progress towards enabling a compute-to-MRF pass.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs.h
|
37758fb1cbb1ddcd106553763c1b1f222f4cfb47 |
|
11-Oct-2010 |
Eric Anholt <eric@anholt.net> |
i965: Move FS backend structures to a header. It's time to start splitting some of this up.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs.h
|