0ba14013f66c03c6a3cc0a5e3ef74e92bfe5afb9 |
|
26-Nov-2012 |
Eric Anholt <eric@anholt.net> |
i965/fs: Don't generate saturates over existing variable values. Fixes a crash in http://workshop.chromeexperiments.com/stars/ on i965, and the new piglit test glsl-fs-clamp-5. We were trying to emit a saturating move into a uniform, which the code generator appropriately choked on. This was broken in the change in 32ae8d3b321185a85b73ff703d8fc26bd5f48fa7. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=57166 NOTE: This is a candidate for the 9.0 branch. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (cherry picked from commit b9b033d8e456228fb05c5e28f85323de40f3292f) Conflicts: 9.0 doesn't have the MOV() helper. Convert to old style.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
|
457eab5a9bf563b54d46fa57cba513837224b120 |
|
12-Nov-2012 |
Eric Anholt <eric@anholt.net> |
i965/fs: Fix the gen6-specific if handling for 80ecb8f15b9ad7d6edc. Fixes oglconform shad-compiler advanced.TestLessThani. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=48629 NOTE: This is a candidate for the 9.0 branch. Acked-by: Kenneth Graunke <kenneth@whitecape.org> (cherry picked from commit 0482998ccc205a9d29953c7a8b33f41ae3584935) Conflicts: fs_inst doesn't have a "predicate" field on the 9.0 branch, so convert it to "predicated = true". See 54679fcbcae7a2d41cb43.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
|
6928bea7ca1f2ed308d8255c6816f44467306255 |
|
29-Aug-2012 |
Kenneth Graunke <kenneth@whitecape.org> |
i965/fs: Initialize output_components[] by filling it with zeros. Prior to commit 2f1869822, emit_fb_writes() looped from 0 to 3, writing all four components of a vec4 color output. However, that broke for smaller output types (float, vec2, or vec3). To fix that, I introduced a new variable (output_components[]) containing the size of the output type for each render target. Unfortunately, I forgot to actually initialize it in the constructor, which meant that unless a shader wrote to gl_FragColor, or the specific output for each render target, output_components would contain a garbage value, and we'd loop for a completely non-deterministic amount of time. Not actually emitting any color writes seems like the right approach. We may still need to emit a render target write (to terminate the thread), but don't have to put in any sensible values (the shader didn't write anything, after all). Fixes a regression since 2f18698220d8b27991fab550c4721590d17278e0. NOTE: This is a candidate for stable release branches. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=54193 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Paul Berry <stereotype441@gmail.com> Tested-by: Ian Romanick <idr@freedesktop.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
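The fix described in 6928bea7 can be illustrated with a minimal sketch. This is not the driver's code: `Visitor`, `MAX_DRAW_BUFFERS`, and `count_color_writes()` are hypothetical stand-ins, and only the name `output_components` comes from the commit message. It shows why zero-filling the array in the constructor makes the per-component loop run zero times for a render target the shader never wrote.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical limit; the real driver uses its own constant.
const int MAX_DRAW_BUFFERS = 8;

struct Visitor {
    // Number of components written for each render target.
    unsigned output_components[MAX_DRAW_BUFFERS];

    // Zero-fill in the constructor, so that targets the shader never
    // wrote report zero components instead of a garbage value.
    Visitor() { std::memset(output_components, 0, sizeof(output_components)); }

    // Stand-in for the loop in emit_fb_writes(): one color write per component.
    int count_color_writes(int target) const {
        int writes = 0;
        for (unsigned i = 0; i < output_components[target]; i++)
            writes++;
        return writes;
    }
};
```

With the zero fill, an untouched target produces no color writes, while a target declared as a vec2 (two components) produces exactly two.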
|
f3d0daf7ea7e42ff9ce11e8bd6fba1059a2406e8 |
|
26-Aug-2012 |
Kenneth Graunke <kenneth@whitecape.org> |
i965: Index sampler program key data by linker-assigned index. Now that most things are based on the linker-assigned index, it makes sense to convert the arrays in the VS/WM program key as well. It seems silly to leave them indexed by texture unit. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
|
85e8e9e000732908b259a7e2cbc1724a1be2d447 |
|
24-Aug-2012 |
Kenneth Graunke <kenneth@whitecape.org> |
i965: Use linker-assigned sampler IDs in instruction encoding. When assigning uniform locations, the linker assigns each sampler uniform a sequential numerical ID. gl_shader_program::SamplerUnits maps these sampler variable IDs to the actual texture units they reference (specified via glUniform1i). Previously, we encoded this mapping in the SEND instruction encoding: the "sampler" was the texture unit number, and the binding table index was SURF_INDEX_TEXTURE(the texture unit number). This unfortunately meant that whenever the application changed the value of a sampler uniform, we had to recompile the shader to change the SEND instructions. This was horrible for the game Cogs, which repeatedly switches between using texture unit 0 and 1. It also made fragment shader precompiles useless: we'd do the precompile at glLinkShader() time, before the application called glUniform1i to set the sampler values. As soon as it did that, we'd have to recompile, wasting time and space in the program cache. This patch encodes the SamplerUnits indirection in the binding table, sampler state, and sampler default color tables. Instead of baking the texture unit number into the shader, we bake in the sampler variable ID assigned by the linker. Since those never change, we don't need to recompile programs on uniform changes. This does mean that the tables now depend on the linked shader program being used for rendering, rather than simply representing all available texture units. This could cause an increase in state emission. Another plus is that the sampler state and sampler default color tables are now compact: we only emit as many entries as there are sampler uniforms, with no holes in the table since the new sampler IDs are sequential. Previously we had to emit a full 16 entries every time, since the tables tracked the state of all active texture units. 
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Paul Berry <stereotype441@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
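The indirection described in 85e8e9e0 can be sketched roughly as follows. This is an illustration under assumptions, not the driver's implementation: `Program`, `build_binding_table()`, and the integer "surface handles" are hypothetical, and `sampler_units` merely models the role of gl_shader_program::SamplerUnits. The point is that the shader only ever refers to the linker-assigned sampler ID, so changing a sampler uniform rebuilds a small table rather than the shader.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Program {
    // Models SamplerUnits: index is the linker-assigned sampler ID,
    // value is the texture unit set through glUniform1i.
    std::vector<int> sampler_units;
};

// Build a compact table with one entry per sampler uniform, indexed by
// sampler ID. unit_surfaces[u] is a hypothetical handle for whatever is
// bound to texture unit u.
std::vector<int> build_binding_table(const Program& prog,
                                     const std::vector<int>& unit_surfaces) {
    std::vector<int> table;
    for (std::size_t id = 0; id < prog.sampler_units.size(); id++)
        table.push_back(unit_surfaces[prog.sampler_units[id]]);
    return table;
}
```

Because the table has exactly as many entries as there are sampler uniforms, it has no holes, which matches the compactness the commit message mentions.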
|
0ad2dce24aa0475e607e3c58b8aa50057412c6ef |
|
21-Aug-2012 |
Kenneth Graunke <kenneth@whitecape.org> |
i965/fs: Rename "sampler" to "texunit" in texturing code. The number we're passing around is actually the ID of the texture unit, as opposed to the numerical value of our sampler uniforms. Calling it "texunit" clarifies this slightly. Don't bother renaming fs_instruction::sampler. Although it's currently the texture unit, this series will change that. No need for the churn. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Paul Berry <stereotype441@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
|
bf0308d8d6fbc842d0120060e65a3fe445f5b2fb |
|
21-Aug-2012 |
Kenneth Graunke <kenneth@whitecape.org> |
i965/fs: Remove unused 'sampler' parameter in emit_texture_genX(). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Paul Berry <stereotype441@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
|
e592f7df0361eb8b5c75944f0151c4e6b3f839dd |
|
02-Aug-2012 |
Anuj Phogat <anuj.phogat@gmail.com> |
i965/msaa: Add sample-alpha-to-coverage support for multiple render targets. Render Target Write message should include source zero alpha value when sample-alpha-to-coverage is enabled for an FBO with multiple render targets. Source zero alpha value is used as fragment coverage for all the render targets. This patch makes piglit tests draw-buffers-alpha-to-coverage and alpha-to-coverage-no-draw-buffer-zero pass on Sandybridge. No regressions are observed with piglit all.tests. V2: Revert all the changes made in emit_color_write() function to include src0 alpha for targets > 0. Now handling this case in an if block. V3: Correctly calculate the instruction length for buffer zero. Properly handle the case of dual_src_blend when alpha-to-coverage is enabled. Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Paul Berry <stereotype441@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
|
90de96ff0d6d54ba0f9a337a6a107acf4134682d |
|
21-Jun-2012 |
Eric Anholt <eric@anholt.net> |
i965/fs: Add support for loading uniform buffer variables as pull constants. Variable array indexing isn't finished, because the lowering pass turns it all into conditional moves of constant index accesses so I can't test it. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
|
2ea3ab14f2182978f471674c9dfce029d37f70a7 |
|
10-Jul-2012 |
Eric Anholt <eric@anholt.net> |
glsl: Add a "ubo_load" expression type for fetches from UBOs. Drivers will probably want to be able to take UBO references in a shader like: uniform ubo1 { float a; float b; float c; float d; } void main() { gl_FragColor = vec4(a, b, c, d); } and generate a single aligned vec4 load out of the UBO. For intel, this involves recognizing the shared offset of the aligned loads and CSEing them out. Obviously that involves breaking things down to loads from an offset from a particular UBO first. Thus, the driver doesn't want to see variable_ref(ir_variable("a")), and even more so does it not want to see array_ref(record_ref(variable_ref(ir_variable("a")), "field1"), variable_ref(ir_variable("i"))). where a.field1[i] is a row_major matrix. Instead, we're going to make a lowering pass to break UBO references down to expressions that are obvious to codegen, and amenable to merging through CSE. v2: Fix some partial thoughts in the ir_binop comment (review by Kenneth) Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
|
c0f60106df724188d6ffe7c9f21eeff22186ab25 |
|
05-Aug-2012 |
Kenneth Graunke <kenneth@whitecape.org> |
i965/fs: Don't clobber sampler message MRFs with subexpressions. Consider a texture call such as: textureLod(s, coordinate, log2(...)) First, we begin setting up the sampler message by loading the texture coordinates into MRFs, starting with m2. Then, we realize we need the LOD, and go to compute it with: ir->lod_info.lod->accept(this); On Gen4-5, this will generate a SEND instruction to compute log2(), loading the operand into m2, and clobbering our texcoord. Similar issues exist on Gen6+. For example, nested texture calls: textureLod(s1, c1, texture(s2, c2).x) Any texturing call where evaluating the subexpression trees for LOD or shadow comparator would generate SEND instructions could potentially break. In some cases (like register spilling), we get lucky and avoid the issue by using non-overlapping MRF regions. But we shouldn't count on that. Fixes four Piglit test regressions on Gen4-5: - glsl-fs-shadow2DGradARB-{01,04,07,cumulative} NOTE: This is a candidate for stable release branches. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=52129 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
|
27bf9c1997b77f85c2099436e9ad5dfc0f1608c7 |
|
05-Aug-2012 |
Kenneth Graunke <kenneth@whitecape.org> |
i965/fs: Factor out texcoord setup into a helper function. With the textureRect support and GL_CLAMP workarounds, it's grown sufficiently that it deserves its own function. Separating it out makes the original function much more readable. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
|
82bfb4b41af7d61aa45e41d62c1842b6a09e9585 |
|
05-Aug-2012 |
Kenneth Graunke <kenneth@whitecape.org> |
i965/fs: Move message header and texture offset setup to generate_tex(). Setting the texture offset bits in the message header involves very specific hardware register descriptions. As such, I feel it's better suited for the lower level "generate" layer that has direct access to the weird register layouts, rather than at the fs_inst abstraction layer. This also parallels the approach I took in the VS backend. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
|
cc44aa77490e1360b099eb0b887266f434298b4f |
|
21-Jul-2012 |
Eric Anholt <eric@anholt.net> |
i965: Remove unused param conversion code. Ever since ctx->NativeIntegers was set, the conversion flag has been PARAM_NO_CONVERT. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
|
9544e44262651a51ffdb3a572f99f902807a6205 |
|
19-Jul-2012 |
Paul Berry <stereotype441@gmail.com> |
i965: Replace fs_visitor::kill_emitted with gl_fragment_program::UsesKill. The kill_emitted variable was duplicating the functionality of gl_fragment_program::UsesKill. There's no need for both. Reviewed-by: Eric Anholt <eric@anholt.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
|
a6411520b40d59a8806289c7aaea4a6b26a54443 |
|
06-Jul-2012 |
Eric Anholt <eric@anholt.net> |
i965/fs: Rename virtual_grf_next to virtual_grf_count. "count" is a more useful name, since most of the time we're using it for looping over the variables. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
|
86e401b771ce4a6f9a728f76c5061c339f012d0a |
|
12-Jul-2012 |
Kenneth Graunke <kenneth@whitecape.org> |
i965: Always emit alpha when nr_color_buffers == 0. If alpha-testing is enabled, we need to send alpha down the pipeline even if nr_color_buffers == 0. However, tracking whether alpha-testing is enabled in the WM program key is expensive: it causes us to compile multiple specializations of the same shader, using program cache space. This patch removes the check for alpha-testing, and simply emits alpha whenever nr_color_buffers == 0. We believe this will also be necessary for alpha-to-coverage, and it should add minimal overhead to an uncommon case. Saving the recompiles should more than make up the difference. Reviewed-by: Eric Anholt <eric@anholt.net> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
|
b546aebae922214dced54c75e6f64830aabd5d1c |
|
10-Jul-2012 |
Kenneth Graunke <kenneth@whitecape.org> |
i965: Delete previous workaround for textureGrad with shadow samplers. It had many problems: - The shadow comparison was done post-filtering. - It required state-dependent recompiles whenever the comparison function changed. - It didn't even work: many cases hit assertion failures. - I never implemented it for the VS. The new lowering pass which converts textureGrad to textureLod by computing the LOD value works much better. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
|
fe27916ddf41b9fb60c334c47c1aa81b8dd9005e |
|
04-Jul-2012 |
Eric Anholt <eric@anholt.net> |
i965/fs: Move class functions from the header to .cpp files. Cuts compile time for brw_fs.h changes from 2.7s to .7s and reduces i965_dri.so size by 70k. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
|
cf0bbb30f6bd9d3fa61b5207320e8f34c563a2c6 |
|
21-Jun-2012 |
Chad Versace <chad.versace@linux.intel.com> |
i965/fs: Fix conversions float->bool, int->bool. Fixes gles2conform GL.equal.equal_bvec2_frag. This fixes brw_fs_visitor's translation of ir_unop_f2b. It used CMP to convert the float to one of 0 or ~0. However, the convention in the compiler is that true is represented by 1, not ~0. This patch adds an AND to convert ~0 to 1. By inspection, a similar problem existed with ir_unop_i2b, with a similar fix. [v2 kayden]: eliminate extra temporary register. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=49621 Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
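The conversion fix in cf0bbb30 can be modeled in a few lines. This is a sketch of the arithmetic only, assuming (as the commit message states) that the comparison yields 0 or ~0 and that the compiler's convention for true is 1. The function names `cmp_nonzero`, `f2b`, and `i2b` are illustrative and do not exist in the driver under these signatures.

```cpp
#include <cassert>
#include <cstdint>

// Model of the CMP result described in the commit: all bits set for
// true, zero for false.
uint32_t cmp_nonzero(float f) { return f != 0.0f ? ~0u : 0u; }

// Model of the fix for ir_unop_f2b: AND the CMP result with 1 so that
// true is represented as 1 rather than ~0.
uint32_t f2b(float f) { return cmp_nonzero(f) & 1u; }

// The same idea applied to ir_unop_i2b.
uint32_t i2b(int32_t i) { return (i != 0 ? ~0u : 0u) & 1u; }
```

Without the AND, a later equality test such as `b == true` would compare ~0 against 1 and fail, which is consistent with the bvec2 equality test named in the commit.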
|
11a7b93592c22c8165f8fde6395f76778fca452e |
|
14-Jun-2012 |
Paul Berry <stereotype441@gmail.com> |
i965: Add support for ir_unop_f2u to i965 backend. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
|
05790746df077183d6c3caf87ca2d276a60302a8 |
|
07-Jun-2012 |
Kenneth Graunke <kenneth@whitecape.org> |
i965: Enable the GL_ARB_shader_bit_encode extension. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
|
2f18698220d8b27991fab550c4721590d17278e0 |
|
01-Jun-2012 |
Kenneth Graunke <kenneth@whitecape.org> |
i965/fs: Fix user-defined FS outputs with less than four components. OpenGL allows you to declare user-defined fragment shader outputs with less than four components: out ivec2 color; This makes sense if you're rendering to an RG format render target. Previously, we assumed that all color outputs had four components (like the built-in gl_FragColor/gl_FragData variables). This caused us to call emit_color_write for invalid indices, incrementing the output virtual GRF's reg_offset beyond the size of the register. This caused cascading failures: split_virtual_grfs would allocate new size-1 registers based on the virtual GRF size, but then proceed to rewrite the out-of-bounds accesses assuming that it had allocated enough new (contiguously numbered) registers. This resulted in instructions that accessed size-1 GRFs with register numbers beyond virtual_grf_next (i.e. registers that were never allocated). Finally, this manifested as live variable analysis and instruction scheduling accessing their temporary array with an out of bounds index (as they're all sized based on virtual_grf_next), and the program would segfault. It looks like the hardware's Render Target Write message requires you to send four components, even for RT formats such as RG or RGB. This patch continues to use all four MRFs, but doesn't bother to fill any data for the last few, which should be unused. +2 oglconforms. NOTE: This is a candidate for stable release branches. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
|
217b62bf001f6b1da31807b803bbe45d7cabe3ba |
|
04-Jun-2012 |
Kenneth Graunke <kenneth@whitecape.org> |
i965/fs: Fix texelFetchOffset() on pre-Gen7. Commit f41ecade7b458c02d504158b522acb2231585040 fixed texelFetchOffset() on Ivybridge, but didn't update the Ironlake/Sandybridge code. +15 piglits on Sandybridge. NOTE: This and f41ecade7b458 are both candidates for stable branches. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
|
29362875f2613ad87abe7725ce3c56c36d16cf9b |
|
25-Apr-2012 |
Eric Anholt <eric@anholt.net> |
i965/gen6+: Add support for GL_ARB_blend_func_extended. v2: Add support for gen6, and don't turn it on if blending is disabled. (fixes GPU hang), and note it in docs/GL3.txt Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
|
1e188f2daef1ae31224d2429bcc1fab75c81fb36 |
|
10-May-2012 |
Eric Anholt <eric@anholt.net> |
intel: Fix signed/unsigned comparison warnings.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
|
d1029f99884e2ba7f663765274cd6bdb4f82feed |
|
11-May-2012 |
Kenneth Graunke <kenneth@whitecape.org> |
i965/fs: Use a const reference in fs_reg::equals instead of a pointer. This lets you omit some ampersands and is more idiomatic C++. Using const also marks the function as not altering either register (which was obvious, but nice to enforce). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
|
dc42910e98dc00760255cc4579da458de09175b9 |
|
24-Apr-2012 |
Eric Anholt <eric@anholt.net> |
i965/fs: Fix regression in comparison handling from ANDs change. I had fixed up the logic ops for delayed ANDing, but not equality comparisons on bools. Fixes new piglit fs-bool-less-compare-true. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=48629
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
|
b443ca96a55a06ee215a3f9a9e7dba558deeb58c |
|
24-Apr-2012 |
Kenneth Graunke <kenneth@whitecape.org> |
i965/fs: Fix FB writes that tried to use the non-existent m16 register. A little analysis shows that the worst-case value for "nr" is 17: - base_mrf = 2 ... 2 - header present (say gen == 5) ... 4 - aa_dest_stencil_reg (stencil test) ... 5 - SIMD16 mode: += 4 * reg_width ... 13 - source_depth_to_render_target ... 15 - dest_depth_reg ... 17 This resulted in us setting base_mrf to 2 and mlen to 15. In other words, we'd try to use m2..m16. But m16 doesn't exist pre-Gen6. Also, the instruction scheduler data structures use arrays of size 16, so this would cause us to access them out of bounds. While the debugger system routine may need m0 and m1, we don't use it today, so the simplest solution is just to move base_mrf back to 1. That way, our worst case message fits in m1..m15, which is legal. An alternative would be to fail on SIMD16 in this case, but that seems a bit unfortunate if there's no real need to reserve m0 and m1. Fixes new piglit test shaders/depth-test-and-write on Ironlake. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=48218 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
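The worst-case tally in b443ca96 can be reproduced as plain arithmetic. The sketch below follows the running totals quoted in the commit message (2, 4, 5, 13, 15, 17); the `MrfRange` struct and the function name are hypothetical, and the per-step register counts are inferred from those totals rather than taken from the driver.

```cpp
#include <cassert>

struct MrfRange {
    int first;  // first MRF used by the message
    int last;   // last MRF used by the message
    int mlen;   // message length in registers
};

// Worst-case FB write layout for a given base_mrf, following the
// commit's tally for SIMD16 on a generation with a message header.
MrfRange worst_case_fb_write(int base_mrf) {
    const int reg_width = 2;   // SIMD16
    int nr = base_mrf;
    nr += 2;                   // header present
    nr += 1;                   // aa_dest_stencil_reg
    nr += 4 * reg_width;       // four color components
    nr += 2;                   // source_depth_to_render_target
    nr += 2;                   // dest_depth_reg
    return { base_mrf, nr - 1, nr - base_mrf };
}
```

With base_mrf set to 2 the message spans m2..m16, and with base_mrf set to 1 it spans m1..m15, which is the range the commit describes as legal before Gen6.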
|
f41ecade7b458c02d504158b522acb2231585040 |
|
17-Apr-2012 |
Eric Anholt <eric@anholt.net> |
i965/fs: Fix texelFetchOffset(). It appears that when using 'ld' with the offset bits, address bounds checking happens before the offset is applied, so parts of the drawing in piglit texelFetchOffset() with a negative texcoord go black.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
|
8890c759513597903997f519c69e9db30790b6f4 |
|
11-Apr-2012 |
Eric Anholt <eric@anholt.net> |
i965/fs: Suppress printing the whole loop in BRW_OPCODE_DO annotation.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
|
80ecb8f15b9ad7d6edcc85bd19f1867c368b09b6 |
|
13-Mar-2012 |
Eric Anholt <eric@anholt.net> |
i965/fs: Avoid generating extra AND instructions on bool logic ops. By making a bool fs_reg only have a defined low bit (matching CMP output), instead of being a full 0 or 1 value, we reduce the ANDs generated in logic chains like: if (v_texcoord.x < 0.0 || v_texcoord.x > texwidth || v_texcoord.y < 0.0 || v_texcoord.y > 1.0) discard; My concern originally when writing this code was that we would end up generating unnecessary ANDs on bool uniforms, so I put the ANDs right at the point of doing the CMPs that otherwise set only the low bit. However, in order to use a bool, we're generating some instruction anyway (e.g. moving it so as to produce a condition code update), and those instructions can often be turned into an AND at that point. It turns out in the shaders I have on hand, none of them regress in instruction count: Total instructions: 262649 -> 262545 39/2148 programs affected (1.8%) 14253 -> 14149 instructions in affected programs (0.7% reduction)
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
|
32ae8d3b321185a85b73ff703d8fc26bd5f48fa7 |
|
10-Mar-2012 |
Eric Anholt <eric@anholt.net> |
i965/fs: Try to avoid generating extra MOVs to do saturates. This change (before the previous two) produced a .23% +/- .11% performance improvement in Unigine Tropics at 1024x768 on IVB. Total instructions: 269270 -> 262649 614/2148 programs affected (28.6%) 179386 -> 172765 instructions in affected programs (3.7% reduction) v2: Move some of the logic of finding the instruction that produced the result of an expression tree to a helper.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
|
f75c2d53146ea14f8dfedcc5b7a4704278ba0792 |
|
21-Sep-2011 |
Kenneth Graunke <kenneth@whitecape.org> |
glsl: Demote 'type' from ir_instruction to ir_rvalue and ir_variable. Variables have types, expression trees have types, but statements don't. Rather than have a nonsensical field that stays NULL in the base class, just move it to where it makes sense. Fix up a few places that lazily used ir_instruction even though they actually knew the particular subclass. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
|
01044fce6b3de11635ea5078b76ffee1a33b3802 |
|
30-Mar-2012 |
Kenneth Graunke <kenneth@whitecape.org> |
i965: Avoid explicit accumulator operands in SIMD16 mode on Gen7. According to the BSpec ISA volume's "Accumulator Register" section: "[DevIVB] SIMD16 execution on dwords is not allowed when accumulator is explicit source or destination operand." Fixes piglit tests: - fs-multiply-const-ivec4 - fs-multiply-const-uvec4 - fs-multiply-ivec4-const - fs-multiply-uvec4-const Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
|
99cd475cc9d3bd54140f84c24b55b9e05d7310a1 |
|
09-Jan-2012 |
Kenneth Graunke <kenneth@whitecape.org> |
i965: Enable SIMD16 mode for shaders with loops on Gen6+. The hardware supports it; there's no reason not to. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
|
df5963c25641a7c3a4bbfcb81cc3dc771581590e |
|
18-Feb-2012 |
Kenneth Graunke <kenneth@whitecape.org> |
i965: Make the dummy fragment shader work in SIMD16 mode. If you're resorting to the dummy shader, you've probably already turned off SIMD16 mode. But if you didn't, it would die in a fire. We could either fail to compile in SIMD16 mode...or just fix it. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
|
393b42240f22dbbfb4f089036319031ad36173f3 |
|
18-Feb-2012 |
Kenneth Graunke <kenneth@whitecape.org> |
i965: Fix GPU hangs in the dummy fragment shader. The dummy FB write failed to specify EOT and a message length, causing the GPU to hang. Now we can enjoy "everyone's favorite color" again. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
|
4b274068204c7f0bacaa4639f24feb433353b861 |
|
14-Feb-2012 |
Kenneth Graunke <kenneth@whitecape.org> |
i965/fs: Take # of components into account in try_rewrite_rhs_to_dst. Commit dc7f449d1ac53a66e6efb56ccf2a5953418a26ca introduced a new method for avoiding MOVs: try to rewrite the destination of the instruction that produced the RHS so it writes into the LHS. Unfortunately, this is not safe for swizzled texturing operations, as they return a set of four contiguous registers. Consider the following: (assign (x) (var_ref vec_ctor_x) (swiz x (tex vec4 (var_ref m_sampY) (var_ref m_cordY) 0 1 ()))) In this case, the source and destination registers are equal, since reg_offset is 0 for both. Yet, this is only a partial move: the texture operation generates four registers, and the LHS only covers one. Fixes color distortion in XBMC when using GLSL shaders. NOTE: This is a candidate for the 8.0 branch (with the previous commit). Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=44333 Reviewed-by: Eric Anholt <eric@anholt.net> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
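The safety condition described in 4b274068 might be sketched as below. This is an assumption-laden illustration, not the code from the commit: `Inst`, its fields, and `can_rewrite_dst()` are hypothetical names. It captures only the stated idea that matching register numbers is not enough, and that the producing instruction must write exactly as many registers as the left-hand side covers.

```cpp
#include <cassert>

// Hypothetical model of an instruction that produced the RHS value.
struct Inst {
    int dst_reg;       // register (with offset) the instruction writes to
    int regs_written;  // number of contiguous registers it writes
};

// Rewriting the producer's destination to the LHS is treated as safe
// only if it wrote the RHS register and the write is not a partial
// move, i.e. it covers exactly the LHS component count.
bool can_rewrite_dst(const Inst& producer, int rhs_reg, int lhs_components) {
    return producer.dst_reg == rhs_reg &&
           producer.regs_written == lhs_components;
}
```

In the example from the commit message, a texture operation writes four registers while the assignment covers a single component, so the rewrite is rejected even though the source and destination registers compare equal.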
|
7d55f37b0e87db9b3806088797075161a1c9a8bb |
|
07-Feb-2012 |
Eric Anholt <eric@anholt.net> |
i965/fs: Add support for generating MADs. Improves nexuiz performance 0.65% +/- .10% (n=5) on my gen6, and .39% +/- .11% (n=10) on gen7. No statistically significant performance difference on warsow (n=5, but only one shader has MADs). v2: Add support for MADs in 16-wide by using compression control. v3: Don't generate MADs when it will force an immediate to be moved to a temp. (it's not clear whether this is a win or not, but it should result in less questionable change to codegen compared to v2). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (v2)
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
|
7c857a6b159debf76d4661f494fd2c97d205b5b1 |
|
04-Feb-2012 |
Eric Anholt <eric@anholt.net> |
i965/fs: Implement GL_CLAMP behavior on texture rectangles on gen6+. We were doing saturate-based clamping on the [0,width] or [0,height] coordinate, which meant only the first pixel was addressable. Fixes piglit ARB_texture_rectangle/texwrap-RECT-bordercolor NOTE: This is a candidate for the 8.0 release branch. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
|
07e621c52329cd17b97051a26493626228d043b9 |
|
03-Feb-2012 |
Eric Anholt <eric@anholt.net> |
i965/fs: Move GL_CLAMP handling to coordinate setup. We should be able to merge self-move instruction into the MRF move anyway, and this simplifies things for the next commit. NOTE: This is a candidate for the 8.0 release branch. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
|
83dc891b41c0224f5ba3624b3e3560129e644e28 |
|
09-Jan-2012 |
Eric Anholt <eric@anholt.net> |
i965/fs: Fix GPU hangs with 16-wide integer div/mod on gen7. Acked-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
|
1b05fc7cdd0e5d77b50bc8ee2f2c851da5884d72 |
|
07-Dec-2011 |
Kenneth Graunke <kenneth@whitecape.org> |
i965/fs: Factor out texturing related data from brw_wm_prog_key. The idea is to reuse this for the VS and (in the future) GS as well. v2: Include yuvtex data since we're not dropping GL_MESA_ycbycr. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net> [v1] Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
|
475d70d6ef5feb94efab3923e5607e625f2aee67 |
|
26-Oct-2011 |
Kenneth Graunke <kenneth@whitecape.org> |
i965/fs: Factor out texture offset bitfield computation. We'll want to reuse this for the VS, and it's complex enough that I'd rather not cut and paste it. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
|
febad1779ae5cb5c85d66c2635baea62da52d2fa |
|
26-Oct-2011 |
Kenneth Graunke <kenneth@whitecape.org> |
i965: Rename texturing ops from FS_OPCODE to SHADER_OPCODE, except TXB. We'll be reusing most of these for the VS shortly. The one exception is TXB (texturing with LOD bias), which is explicitly forbidden in the VS. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
|
c592ebc581e5ca0122660b3d76ec924b96581216 |
|
06-Dec-2011 |
Kenneth Graunke <kenneth@whitecape.org> |
i965/fs: Don't swizzle the results of textureSize(). Fixes a regression since d2235b0f4681f75d562131d655a6d7b7033d2d8b, in my new textureSize sampler(1DArrayShadow|2DShadow|2DArrayShadow) piglit tests, though I'm not honestly sure how this ever worked. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
|
51e5a266c1e1c12c4f0d82bee3caff008a41c9fd |
|
23-Nov-2011 |
Eric Anholt <eric@anholt.net> |
i965/fs: Fix regression in fbo-alphatest-nocolor. In the refactor for handling user-defined out params, we failed to set up the new color output tracking when there was no color drawbuffer in place but alpha testing was on. Just always set up at least one when handling gl_FragColor, since we won't make use of its value unless we need to. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=42806
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
|
6d874d0ee18b3694c49e0206fa519bd8b746ec24 |
|
09-Nov-2011 |
Eric Anholt <eric@anholt.net> |
i965/fs: Add support for user-defined out variables. Before, I was tracking the ir_variable * found for gl_FragColor or gl_FragData[]. Instead, when visiting those variables, set up an array of per-render-target fs_regs to copy the output data from. This cleans up the color emit path, while making handling of multiple user-defined out variables easier. v2: incorporate idr's feedback about ir->location (changes by Kenneth Graunke) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
|
e988d816e16f9c0844424472d689486a833931c3 |
|
09-Nov-2011 |
Eric Anholt <eric@anholt.net> |
i965/fs: Preserve the source register type when doing color writes. When rendering to integer color buffers, we need to be careful to use MRFs of the correct type when emitting color writes. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
|
512431b3575eb5f2c27d8795c5e2191047ebb5ed |
|
27-Oct-2011 |
Kenneth Graunke <kenneth@whitecape.org> |
i965/fs: Use the actual hardware g0 register for texel offset setup. The idea here is to set up the message header with the Sampler State pointer which the hardware provides as part of the PS Thread Payload in register g0. Unfortunately, the existing code fs_reg(GRF, 0, BRW_REGISTER_TYPE_UD) actually references "virtual GRF 0" rather than the hardware g0. This is just some arbitrary GRF temporary which will get register allocated. So, we ended up setting up the header with garbage. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
|
e04bdeae82797dbdcf6f544a997a4626fdfd4aee |
|
22-Oct-2011 |
Paul Berry <stereotype441@gmail.com> |
i965/gen6+: Parameterize barycentric interpolation modes. This patch modifies the fragment shader back-end so that instead of using a single delta_x/delta_y register pair to store barycentric coordinates, it uses an array of such register pairs, one for each possible interpolation mode. When setting up the WM, we instruct it to only provide the barycentric coordinates that are actually needed by the fragment shader; that set is computed by brw_compute_barycentric_interp_modes(). Currently this function returns just BRW_WM_PERSPECTIVE_PIXEL_BARYCENTRIC, because this is the only interpolation mode we support. However, that will change in a later patch. Reviewed-by: Eric Anholt <eric@anholt.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
|
73b0a28ba8b3e2ab917d4c729f34ddbde52c9e88 |
|
04-Oct-2011 |
Eric Anholt <eric@anholt.net> |
i965/fs: Fix comparisons with uint negation. The condmod instruction ends up generating garbage condition codes, because apparently the comparison happens on the accumulator value (33 bits for UD), not the truncated value that would be written. Fixes fs-op-neg-* Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
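The effect described in this entry can be sketched in plain C++. This is a simplified model under stated assumptions, not the hardware's exact behavior: `int64_t` stands in for the wider accumulator, `uint32_t` for the value that would actually be written, and all function names are hypothetical.

```cpp
#include <cstdint>

// Simplified model (assumption): the "wide" result keeps the sign of the
// negation, the "truncated" result is what a 32-bit unsigned write would hold.
constexpr int64_t negate_wide(uint32_t x) {
    return -static_cast<int64_t>(x);
}

constexpr uint32_t negate_truncated(uint32_t x) {
    return static_cast<uint32_t>(negate_wide(x));
}

// Comparison evaluated on the wide (accumulator-like) value.
constexpr bool less_than_wide(uint32_t x, uint32_t y) {
    return negate_wide(x) < static_cast<int64_t>(y);
}

// Comparison evaluated on the truncated unsigned value.
constexpr bool less_than_truncated(uint32_t x, uint32_t y) {
    return negate_truncated(x) < y;
}

// For x = 1, y = 5 the two views disagree: -1 < 5, but 0xffffffff is not < 5.
static_assert(less_than_wide(1u, 5u), "wide view treats -1 as less than 5");
static_assert(!less_than_truncated(1u, 5u), "truncated view sees 0xffffffff");
```

The two comparisons disagree whenever the negated value is nonzero and the other operand is small, which is consistent with the "garbage condition codes" the message reports.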
|
2e5a1a254ed81b1d3efa6064f48183eefac784d0 |
|
07-Oct-2011 |
Kenneth Graunke <kenneth@whitecape.org> |
intel: Convert from GLboolean to 'bool' from stdbool.h. I initially produced the patch using this bash command: for file in {intel,i915,i965}/*.{c,cpp,h}; do [ ! -h $file ] && sed -i 's/GLboolean/bool/g' $file && sed -i 's/GL_TRUE/true/g' $file && sed -i 's/GL_FALSE/false/g' $file; done Then I manually added #include <stdbool.h> to fix compilation errors, and converted a few functions back to GLboolean that were used in core Mesa's function pointer table to avoid "incompatible pointer" warnings. Finally, I cleaned up some whitespace issues introduced by the change. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Chad Versace <chad@chad-versace.us> Acked-by: Paul Berry <stereotype441@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
|
ff8f272b0d02b41a0ce34ab6af7119b9e06f4961 |
|
29-Sep-2011 |
Kenneth Graunke <kenneth@whitecape.org> |
i965/fs: Implement integer quotient and remainder math operations. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Tested-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
|
7de6e749df90a214d1547956dd66cfec6edcb446 |
|
27-Sep-2011 |
Eric Anholt <eric@anholt.net> |
i965/fs: Add support for bit-shift operations. Reviewed-by: Chad Versace <chad@chad-versace.us> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
|
83df7fbe62be2798d557142a47e01af86ec9e2e2 |
|
27-Sep-2011 |
Kenneth Graunke <kenneth@whitecape.org> |
i965: Allow SIMD16 color writes on Ivybridge. Again, the check was needlessly specific: this works fine on Gen7. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
|
79cba4c2b17456e2b25ac555c45e1c106b4e3f6b |
|
27-Sep-2011 |
Kenneth Graunke <kenneth@whitecape.org> |
i965/fs: Allow SIMD16 with control flow on Ivybridge. The check was designed to forbid it on old generations (Gen5/Ironlake), not on new ones. It just works on Gen7/Ivybridge. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
|
c310c35a754be835c9ceafe578c4974a667b9a77 |
|
20-Sep-2011 |
Eric Anholt <eric@anholt.net> |
i965: Fix compiler warnings.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
|
47b556fbcaea4660b21481e40d89167d883d47f5 |
|
07-Sep-2011 |
Kenneth Graunke <kenneth@whitecape.org> |
i965/fs: Implement texelFetch() on Gen4. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
|
0edf5d63d60100cc2b7467da78ce811c4824b760 |
|
26-Aug-2011 |
Kenneth Graunke <kenneth@whitecape.org> |
i965/fs: Implement texelFetch() on Ivybridge. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
|
30be2cc6c7c3378ee17885b5bf41d7ae53bf6fe0 |
|
26-Aug-2011 |
Kenneth Graunke <kenneth@whitecape.org> |
i965/fs: Implement texelFetch() on Ironlake and Sandybridge. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
|
4bc5bfb641bce931bf35f0e78ec2b44263d152ba |
|
02-Sep-2011 |
Kenneth Graunke <kenneth@whitecape.org> |
i965/fs: Implement ir_u2f opcode. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
|
2f0edc60f4bd2ae5999a6afa656e3bb3f181bf0f |
|
26-Aug-2011 |
Chad Versace <chad@chad-versace.us> |
i965: Fix Android build by removing relative includes. Replace each occurrence of #include "../glsl/*.h" with #include "glsl/*.h" Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Signed-off-by: Chad Versace <chad@chad-versace.us>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
|
dc7f449d1ac53a66e6efb56ccf2a5953418a26ca |
|
26-Aug-2011 |
Kenneth Graunke <kenneth@whitecape.org> |
i965: Avoid generating MOVs for most ir_assignment handling. This is a port of vec4_visitor::try_rewrite_rhs_to_dst to fs_visitor. Not only is this technique less invasive and more robust, it also generates better code. Over and above the previous technique, this reduced instruction count in shader-db by 0.28% on average and 1.4% in the best case. In no case did this technique result in more code than the prior method. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Signed-off-by: Eric Anholt <eric@anholt.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
|
d28a3bd4bf25157aff5379a003bbf4a66157ed06 |
|
26-Aug-2011 |
Kenneth Graunke <kenneth@whitecape.org> |
i965/fs: Revert "Avoid generating MOVs for assignments for expressions." This reverts commit 53c89c67f33639afef951e178f93f4e29acc5d53, along with the subsequent this->result = reg_undef additions it required. Both Eric and I agree that the way he did this is really fragile; if you forget to add this->result = reg_undef before calling accept(), it may end up using the same register for two separate things, breaking things in strange and mysterious ways. The next commit will port over the new VS backend's method for solving this problem, which is simpler, less intrusive, and still manages to avoid MOVs in the common case.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
|
9d4b98eb9eadecc17cd1cda0074b420a39e74647 |
|
17-Aug-2011 |
Eric Anholt <eric@anholt.net> |
i965/gen6+: Use non-normalized coordinates for GL_TEXTURE_RECTANGLE. Improves performance of a GL_TEXTURE_RECTANGLE microbenchmark by 1.84% +/- .15% (n=3)
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
|
4eeb4c150598605d1be3ce6674fa63076a720ae9 |
|
17-Aug-2011 |
Kenneth Graunke <kenneth@whitecape.org> |
i965: Implement textureSize (TXS) on Gen4. Also, remove the BRW_SAMPLER_MESSAGE_SIMD8_RESINFO #define because there totally isn't a SIMD8 variant. Unfortunately, resinfo returns FLOAT32 on Broadwater/Crestline, unlike G45 which returns a proper UINT32. This turns out to be simple, however: when we emit MOVs to select the desired half of the SIMD16 result, we can simply override the register type to be float so it's converted to an integer. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
|
ecf8963754489abfb5097c130a9bcd4cdb76b6bd |
|
19-Jun-2011 |
Kenneth Graunke <kenneth@whitecape.org> |
i965/fs: Implement textureSize (TXS) on Gen5+. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
|
b6bdcf2a908889532ef6d5eb643791176dffcb9d |
|
18-Aug-2011 |
Kenneth Graunke <kenneth@whitecape.org> |
i965/fs: Rudimentary support for non-floating point texture results. Not all texturing operations return floating point data. For example, the resinfo message (textureSize or TXS) returns integer data. In the future, we'll also add integer texture support. ir_texture's type field contains this information; use its base type to appropriately type the destination register. We want to keep it as a four component vector, however, since SIMD8 samplers always have a response length of 4. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
|
1e3bcbdf31f09666ba358f35ff9486faee3642ca |
|
25-Feb-2011 |
Kenneth Graunke <kenneth@whitecape.org> |
glsl: Add a new ir_txs (textureSize) opcode to ir_texture. One unique aspect of TXS is that it doesn't have a coordinate. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Dave Airlie <airlied@redhat.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
|
eb86bb55f5faef67c21604db19210c6788592679 |
|
18-Aug-2011 |
Kenneth Graunke <kenneth@whitecape.org> |
i965/fs: Change incorrect use of 'struct fs_reg' to simply 'fs_reg'. It's actually a class. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
|
3f78f719732b87e6707f94c187ad6e263c6c2ef0 |
|
16-Aug-2011 |
Eric Anholt <eric@anholt.net> |
i965/fs: Fix 32-bit integer multiplication. The MUL opcode does a 16bit * 32bit multiply, and we need to do the MACH to get the top 16bit * 32bit added in. Fixes fs-op-mult-int-*, fs-op-mult-ivec* Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
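The arithmetic behind this fix can be illustrated as follows. This is a sketch of the decomposition only, assuming a 32-bit product is assembled from two 16-bit by 32-bit partial products; it does not model the MUL or MACH instructions themselves, and the helper names are hypothetical.

```cpp
#include <cstdint>

// Hypothetical helper: one partial product of a 16-bit value and a 32-bit value.
constexpr uint64_t mul_16x32(uint16_t a16, uint32_t b) {
    return static_cast<uint64_t>(a16) * b;
}

// Build the low 32 bits of a * b from the low-half and high-half partial
// products. Dropping the high-half term gives a wrong answer for any a >= 65536.
constexpr uint32_t mul32_from_halves(uint32_t a, uint32_t b) {
    const uint16_t lo = static_cast<uint16_t>(a & 0xffffu);
    const uint16_t hi = static_cast<uint16_t>(a >> 16);
    const uint64_t low_part = mul_16x32(lo, b);
    const uint64_t high_part = mul_16x32(hi, b) << 16;
    return static_cast<uint32_t>(low_part + high_part);
}

static_assert(mul32_from_halves(0x00010001u, 0x00010001u) == 0x00020001u,
              "both partial products contribute to the low 32 bits");
static_assert(mul32_from_halves(70000u, 70000u) ==
                  static_cast<uint32_t>(uint64_t{70000} * 70000),
              "matches a full-width multiply truncated to 32 bits");
```

With only the low-half term, `0x00010001 * 0x00010001` would come out as `0x00010001`, which is the kind of error the fs-op-mult-int tests would catch.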
|
65b5cbbcf783f6c668ab5b31a0734680dd396794 |
|
05-Aug-2011 |
Eric Anholt <eric@anholt.net> |
i965: Rename math FS_OPCODE_* to SHADER_OPCODE_*. I want to just use the same enums in the VS.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
|
b76378d46a211521582cfab56dc05031a57502a6 |
|
04-May-2011 |
Eric Anholt <eric@anholt.net> |
i965/fs: Eliminate the magic nature of virtual GRF 0. This was a debugging aid at one point -- virtual grf 0 should never be allocated, and it would be used if undefined register access occurred in codegen. However, it made the confusing register allocation code even more confusing by indexing things off of 1 all over.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
|
44ffb4ae207e48f78fae55925601b8708ed09c1d |
|
29-Jul-2011 |
Eric Anholt <eric@anholt.net> |
i965/fs: Stop using the exec_list iterator. The old style has gone out of favor in the project, but I kept copy and pasting from existing iterator code.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
|
4fdd289805d14d4f7a234f88cd375be1b3b96764 |
|
26-Jul-2011 |
Eric Anholt <eric@anholt.net> |
i965/fs: Respect ARB_color_buffer_float clamping. This was done in the old codegen path, but not the new one. Caught by piglit fbo tests after the conversion to GLSL ff_fragment_shader. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
|
3e1fd13f605f16e8b48f3a9b71910a3c66eb84b5 |
|
25-Jul-2011 |
Kenneth Graunke <kenneth@whitecape.org> |
i965/gen4: Fix message parameter loading for 1D TXD sampling. We were neglecting to load dvdx and dvdy. v is not optional. Fixes glslparsertests tex-grad-0[12345].frag on Broadwater/Crestline. (We still need an execution test using sampler1D.) NOTE: This is a candidate for the 7.11 branch. Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
|
156cef0fbacf242e8fc67e39ab964e5f8f3739cb |
|
22-Jul-2011 |
Kenneth Graunke <kenneth@whitecape.org> |
i965/fs: Clear result before visiting shadow comparitor and LOD info. Commit 53c89c67f33639afef951e178f93f4e29acc5d53 ("i965: Avoid generating MOVs for assignments of expressions.") added the line "this->result = reg_undef" all over the code. Unfortunately, since Eric developed his patch before I landed Ivybridge support, he missed adding it to fs_visitor::emit_texture_gen7() after rebasing. Furthermore, since I developed TXD support before Eric's patch, I neglected to add it to the gradient handling when I rebased. Neglecting to set this causes the visitor to use this->result as storage rather than generating a new temporary. These missing statements resulted in the same register being used to store several different values. Fixes the following piglit tests on Ivybridge: - glsl-fs-shadow2dproj.shader_test - glsl-fs-shadow2dproj-bias.shader_test NOTE: This is a candidate for the 7.11 branch. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
|
eafc74d7d4982a835ac43c73963dda9982652464 |
|
30-Jun-2011 |
Kenneth Graunke <kenneth@whitecape.org> |
i965/fs: Fix message register allocation in FB writes. Commit 6750226e6d915742ebf96bae2cfcdd287b85db35 bumped the base MRF to m2 instead of m0, but failed to adjust inst->mlen, which was being set to the highest MRF. Subtracting the base MRF solves the issue. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
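A worked example of the length calculation, assuming a hypothetical payload of four message registers; the function and parameter names are illustrative and not taken from the driver.

```cpp
// Message length is the number of registers used, not the index of the last
// one. `next_mrf` is assumed to be the first register past the payload.
constexpr int fb_write_mlen(int base_mrf, int next_mrf) {
    return next_mrf - base_mrf;
}

// With the base at m0, the highest register index and the length coincide.
static_assert(fb_write_mlen(0, 4) == 4, "base m0: length equals next_mrf");

// With the base moved to m2, four registers end at m6; the length is still 4.
static_assert(fb_write_mlen(2, 6) == 4, "base m2: subtract the base");
```

Using `next_mrf` directly as the length would report 6 in the second case, two registers more than the message actually contains.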
|
b633ddeb9fd951ddc49e8a3fd25a946e5a16361f |
|
14-Jun-2011 |
Kenneth Graunke <kenneth@whitecape.org> |
i965/fs: Implement new ir_unop_u2i and ir_unop_i2u opcodes. No MOV is necessary since signed/unsigned integers share the same bit-representation; it's simply a question of interpretation. In particular, the fs_reg::imm union shouldn't need updating. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
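The claim that no MOV is needed rests on signed and unsigned 32-bit integers sharing one bit pattern. A minimal sketch in standard C++ (the function names are hypothetical and unrelated to the driver's code):

```cpp
#include <cstdint>
#include <cstring>

// Reinterpret the same 32 bits as unsigned; no bit changes.
inline uint32_t i2u_bits(int32_t v) {
    uint32_t out;
    std::memcpy(&out, &v, sizeof out);
    return out;
}

// Reinterpret the same 32 bits as signed; no bit changes.
inline int32_t u2i_bits(uint32_t v) {
    int32_t out;
    std::memcpy(&out, &v, sizeof out);
    return out;
}
```

Because `int32_t` is a two's-complement type, `i2u_bits(-1)` yields `0xffffffff` and a round trip through both functions returns the original value, so a conversion between the two only changes how the register is typed.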
|
6750226e6d915742ebf96bae2cfcdd287b85db35 |
|
17-Jun-2011 |
Ben Widawsky <ben@bwidawsk.net> |
i965: step message register allocation. The system routine requires that m0 be reserved for saving off architectural state. Moved the allocation to start at 2 instead of 0. Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Eric Anholt <eric@anholt.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
|
6430df37736d71dd2bd6f1fe447d39f0b68cb567 |
|
10-Jun-2011 |
Kenneth Graunke <kenneth@whitecape.org> |
i965/fs: Add support for TXD with shadow comparisons. Our hardware doesn't have a sample_d_c message, so we have to do a regular sample_d and emit instructions to manually perform the comparison. This requires a state dependent recompile whenever the sampler's compare mode or function change. This adds the per-sampler comparison functions to brw_wm_prog_key, but only sets them when the sampler's compare mode is GL_COMPARE_R_TO_TEXTURE (i.e. only for shadow sampling). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
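The manual comparison that follows a plain sample_d can be sketched like this. It is a scalar illustration under stated assumptions: the enum and function are hypothetical, and only the OpenGL depth-compare semantics (result 1.0 when the test passes with the reference value on the left, 0.0 otherwise) are taken as given.

```cpp
// Hypothetical stand-in for the per-sampler compare function in the program key.
enum class CompareFunc { Never, Less, Equal, LEqual, Greater, NotEqual, GEqual, Always };

// Compare the shader's reference value against the sampled depth, as a
// fixed-function shadow sampler would have done.
constexpr float shadow_compare(CompareFunc func, float ref, float texel) {
    bool pass = false;
    switch (func) {
    case CompareFunc::Never:    pass = false;        break;
    case CompareFunc::Less:     pass = ref <  texel; break;
    case CompareFunc::Equal:    pass = ref == texel; break;
    case CompareFunc::LEqual:   pass = ref <= texel; break;
    case CompareFunc::Greater:  pass = ref >  texel; break;
    case CompareFunc::NotEqual: pass = ref != texel; break;
    case CompareFunc::GEqual:   pass = ref >= texel; break;
    case CompareFunc::Always:   pass = true;         break;
    }
    return pass ? 1.0f : 0.0f;
}
```

Since the function is chosen when the shader is compiled, a change to the sampler's compare mode or function means different instructions must be emitted, which is why the message calls for a state-dependent recompile.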
|
01fa9addf447120e994415ad8fc8246ac234ec27 |
|
10-Jun-2011 |
Kenneth Graunke <kenneth@whitecape.org> |
i965/fs: Refactor texture result swizzling into a helper function. The next patch will add a few additional uses. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
|
f1622cfe9c0f37a9b452be1297f187cba8c46e6a |
|
10-Jun-2011 |
Kenneth Graunke <kenneth@whitecape.org> |
i965/fs: Move sampler fetch to the top of the ir_texture visit function. This makes it available earlier, which will soon be necessary. (Separating code motion from actual changes.) Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
|
6c947cfd1973c3791d54f1406c973357b4a9621a |
|
09-Jun-2011 |
Kenneth Graunke <kenneth@whitecape.org> |
i965/fs: Add support for non-shadow textureGrad (TXD) on gen4. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
|
2f4a4b943f1cad9bbbb8f66c34dca506503ba5bb |
|
09-Jun-2011 |
Kenneth Graunke <kenneth@whitecape.org> |
i965/fs: Add support for non-shadow textureGrad (TXD) on gen5/6. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
|
3fa910fff9f72d1adf33f0f4dea3d790a9ce04ab |
|
09-Jun-2011 |
Kenneth Graunke <kenneth@whitecape.org> |
i965/fs: Add support for non-shadow textureGrad (TXD) on Ivybridge. This is somewhat ugly, but I couldn't think of a nicer way to handle the interleaved coordinate/derivative parameter loading. Ironlake and Sandybridge will still hit an assertion in visit(). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
|
c331b3123ecda127919458e24848b7c1596525ac |
|
12-May-2011 |
Eric Anholt <eric@anholt.net> |
i965/fs: Use the embedded compare in SEL on gen6+. This avoids the extra CMP and the predication on SEL, so in addition to one less instruction, it makes scheduling less constrained. Improves glbenchmark Egypt performance 0.6% +/- 0.2% (n=3). Reduces FS instruction count across affected shaders in shader-db by 1.3% without regressing any. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
|
0653c450cc8da1212e1123a1cd6635c02f7d6919 |
|
27-May-2011 |
Eric Anholt <eric@anholt.net> |
i965/fs: Fix up for 8752764076e5b3f052a57e0134424a37bf2e9164. I failed to commit and squash before pushing.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
|
d1f70a8a6c6ec7007bad22d3d6013415be2d243a |
|
25-May-2011 |
Eric Anholt <eric@anholt.net> |
i965/fs: Split the GLSL IR -> FS LIR visitor to brw_fs_visitor.cpp. We now have: brw_fs.cpp handles calling out to everything and optimization. brw_fs_visitor.cpp handles translating to our LIR. brw_fs_emit.cpp handles emitting from our LIR to native code. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
|