History log of /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
Revision Date Author Comments (<<< Hide modified files) (Show modified files >>>)
0ba14013f66c03c6a3cc0a5e3ef74e92bfe5afb9 26-Nov-2012 Eric Anholt <eric@anholt.net> i965/fs: Don't generate saturates over existing variable values.

Fixes a crash in http://workshop.chromeexperiments.com/stars/ on i965,
and the new piglit test glsl-fs-clamp-5.
We were trying to emit a saturating move into a uniform, which the code
generator appropriately choked on. This was broken in the change in
32ae8d3b321185a85b73ff703d8fc26bd5f48fa7.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=57166
NOTE: This is a candidate for the 9.0 branch.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit b9b033d8e456228fb05c5e28f85323de40f3292f)

Conflicts: 9.0 doesn't have the MOV() helper. Convert to old style.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
457eab5a9bf563b54d46fa57cba513837224b120 12-Nov-2012 Eric Anholt <eric@anholt.net> i965/fs: Fix the gen6-specific if handling for 80ecb8f15b9ad7d6edc

Fixes oglconform shad-compiler advanced.TestLessThani.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=48629
NOTE: This is a candidate for the 9.0 branch.

Acked-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 0482998ccc205a9d29953c7a8b33f41ae3584935)

Conflicts: fs_inst doesn't have a "predicate" field on the 9.0 branch,
so convert it to "predicated = true". See 54679fcbcae7a2d41cb43.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
6928bea7ca1f2ed308d8255c6816f44467306255 29-Aug-2012 Kenneth Graunke <kenneth@whitecape.org> i965/fs: Initialize output_components[] by filling it with zeros.

Prior to commit 2f1869822, emit_fb_writes() looped from 0 to 3, writing
all four components of a vec4 color output. However, that broke for
smaller output types (float, vec2, or vec3). To fix that, I introduced
a new variable (output_components[]) containing the size of the output
type for each render target.

Unfortunately, I forgot to actually initialize it in the constructor,
which meant that unless a shader wrote to gl_FragColor, or the specific
output for each render target, output_components would contain a garbage
value, and we'd loop for a completely non-deterministic amount of time.

Not actually emitting any color writes seems like the right approach.
We may still need to emit a render target write (to terminate the
thread), but don't have to put in any sensible values (the shader didn't
write anything, after all).

Fixes a regression since 2f18698220d8b27991fab550c4721590d17278e0.
NOTE: This is a candidate for stable release branches.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=54193
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Tested-by: Ian Romanick <idr@freedesktop.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
f3d0daf7ea7e42ff9ce11e8bd6fba1059a2406e8 26-Aug-2012 Kenneth Graunke <kenneth@whitecape.org> i965: Index sampler program key data by linker-assigned index.

Now that most things are based on the linker-assigned index, it makes
sense to convert the arrays in the VS/WM program key as well. It seems
silly to leave them indexed by texture unit.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
85e8e9e000732908b259a7e2cbc1724a1be2d447 24-Aug-2012 Kenneth Graunke <kenneth@whitecape.org> i965: Use linker-assigned sampler IDs in instruction encoding.

When assigning uniform locations, the linker assigns each sampler
uniform a sequential numerical ID. gl_shader_program::SamplerUnits maps
these sampler variable IDs to the actual texture units they reference
(specified via glUniform1i).

Previously, we encoded this mapping in the SEND instruction encoding:
the "sampler" was the texture unit number, and the binding table index
was SURF_INDEX_TEXTURE(the texture unit number). This unfortunately
meant that whenever the application changed the value of a sampler
uniform, we had to recompile the shader to change the SEND instructions.

This was horrible for the game Cogs, which repeatedly switches between
using texture unit 0 and 1. It also made fragment shader precompiles
useless: we'd do the precompile at glLinkShader() time, before the
application called glUniform1i to set the sampler values. As soon as
it did that, we'd have to recompile, wasting time and space in the
program cache.

This patch encodes the SamplerUnits indirection in the binding table,
sampler state, and sampler default color tables. Instead of baking the
texture unit number into the shader, we bake in the sampler variable ID
assigned by the linker. Since those never change, we don't need to
recompile programs on uniform changes.

This does mean that the tables now depend on the linked shader program
being used for rendering, rather than simply representing all available
texture units. This could cause an increase in state emission.

Another plus is that the sampler state and sampler default color tables
are now compact: we only emit as many entries as there are sampler
uniforms, with no holes in the table since the new sampler IDs are
sequential. Previously we had to emit a full 16 entries every time,
since the tables tracked the state of all active texture units.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Paul Berry <stereotype441@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
0ad2dce24aa0475e607e3c58b8aa50057412c6ef 21-Aug-2012 Kenneth Graunke <kenneth@whitecape.org> i965/fs: Rename "sampler" to "texunit" in texturing code.

The number we're passing around is actually the ID of the texture unit,
as opposed to the numerical value our of sampler uniforms. Calling it
"texunit" clarifies this slightly.

Don't bother renaming fs_instruction::sampler. Although it's currently
the texture unit, this series will change that. No need for the churn.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
bf0308d8d6fbc842d0120060e65a3fe445f5b2fb 21-Aug-2012 Kenneth Graunke <kenneth@whitecape.org> i965/fs: Remove unused 'sampler' parameter in emit_texture_genX().

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
e592f7df0361eb8b5c75944f0151c4e6b3f839dd 02-Aug-2012 Anuj Phogat <anuj.phogat@gmail.com> i965/msaa: Add sample-alpha-to-coverage support for multiple render targets

Render Target Write message should include source zero alpha value when
sample-alpha-to-coverage is enabled for an FBO with multiple render targets.
Source zero alpha value is used as fragment coverage for all the render
targets.

This patch makes piglit tests draw-buffers-alpha-to-coverage and
alpha-to-coverage-no-draw-buffer-zero to pass on Sandybridge. No
regressions are observed with piglit all.tests.

V2: Revert all the changes made in emit_color_write() function to
include src0 alpha for targets > 0. Now handling this case in a if
block.

V3: Correctly calculate the instruction length for buffer zero.
Properly handle the case of dual_src_blend when alpha-to-coverage
is enabled.

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
90de96ff0d6d54ba0f9a337a6a107acf4134682d 21-Jun-2012 Eric Anholt <eric@anholt.net> i965/fs: Add support for loading uniform buffer variables as pull constants.

Variable array indexing isn't finished, because the lowering pass
turns it all into conditional moves of constant index accesses so I
can't test it.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
2ea3ab14f2182978f471674c9dfce029d37f70a7 10-Jul-2012 Eric Anholt <eric@anholt.net> glsl: Add a "ubo_load" expression type for fetches from UBOs.

Drivers will probably want to be able to take UBO references in a
shader like:

uniform ubo1 {
float a;
float b;
float c;
float d;
}

void main() {
gl_FragColor = vec4(a, b, c, d);
}

and generate a single aligned vec4 load out of the UBO. For intel,
this involves recognizing the shared offset of the aligned loads and
CSEing them out. Obviously that involves breaking things down to
loads from an offset from a particular UBO first. Thus, the driver
doesn't want to see

variable_ref(ir_variable("a")),

and even more so does it not want to see

array_ref(record_ref(variable_ref(ir_variable("a")),
"field1"), variable_ref(ir_variable("i"))).

where a.field1[i] is a row_major matrix.

Instead, we're going to make a lowering pass to break UBO references
down to expressions that are obvious to codegen, and amenable to
merging through CSE.

v2: Fix some partial thoughts in the ir_binop comment (review by Kenneth)

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
c0f60106df724188d6ffe7c9f21eeff22186ab25 05-Aug-2012 Kenneth Graunke <kenneth@whitecape.org> i965/fs: Don't clobber sampler message MRFs with subexpressions.

Consider a texture call such as:

textureLod(s, coordinate, log2(...))

First, we begin setting up the sampler message by loading the texture
coordinates into MRFs, starting with m2. Then, we realize we need the
LOD, and go to compute it with:

ir->lod_info.lod->accept(this);

On Gen4-5, this will generate a SEND instruction to compute log2(),
loading the operand into m2, and clobbering our texcoord.

Similar issues exist on Gen6+. For example, nested texture calls:

textureLod(s1, c1, texture(s2, c2).x)

Any texturing call where evaluating the subexpression trees for LOD or
shadow comparitor would generate SEND instructions could potentially
break. In some cases (like register spilling), we get lucky and avoid
the issue by using non-overlapping MRF regions. But we shouldn't count
on that.

Fixes four Piglit test regressions on Gen4-5:
- glsl-fs-shadow2DGradARB-{01,04,07,cumulative}

NOTE: This is a candidate for stable release branches.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=52129
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
27bf9c1997b77f85c2099436e9ad5dfc0f1608c7 05-Aug-2012 Kenneth Graunke <kenneth@whitecape.org> i965/fs: Factor out texcoord setup into a helper function.

With the textureRect support and GL_CLAMP workarounds, it's grown
sufficiently that it deserves its own function. Separating it out
makes the original function much more readable.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
82bfb4b41af7d61aa45e41d62c1842b6a09e9585 05-Aug-2012 Kenneth Graunke <kenneth@whitecape.org> i965/fs: Move message header and texture offset setup to generate_tex().

Setting the texture offset bits in the message header involves very
specific hardware register descriptions. As such, I feel it's better
suited for the lower level "generate" layer that has direct access to
the weird register layouts, rather than at the fs_inst abstraction layer.

This also parallels the approach I took in the VS backend.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
cc44aa77490e1360b099eb0b887266f434298b4f 21-Jul-2012 Eric Anholt <eric@anholt.net> i965: Remove unused param conversion code.

Ever since ctx->NativeIntegers was set, the conversion flag has been
PARAM_NO_CONVERT.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
9544e44262651a51ffdb3a572f99f902807a6205 19-Jul-2012 Paul Berry <stereotype441@gmail.com> i965: Replace fs_visitor::kill_emitted with gl_fragment_program::UsesKill.

The kill_emitted variable was duplicating the functionality of
gl_fragment_program::UsesKill. There's no need for both.

Reviewed-by: Eric Anholt <eric@anholt.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
a6411520b40d59a8806289c7aaea4a6b26a54443 06-Jul-2012 Eric Anholt <eric@anholt.net> i965/fs: Rename virtual_grf_next to virtual_grf_count.

"count" is a more useful name, since most of the time we're using it for
looping over the variables.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
86e401b771ce4a6f9a728f76c5061c339f012d0a 12-Jul-2012 Kenneth Graunke <kenneth@whitecape.org> i965: Always emit alpha when nr_color_buffers == 0.

If alpha-testing is enabled, we need to send alpha down the pipeline
even if nr_color_buffers == 0. However, tracking whether alpha-testing
is enabled in the WM program key is expensive: it causes us to compile
multiple specializations of the same shader, using program cache space.

This patch removes the check for alpha-testing, and simply emits alpha
whenever nr_color_buffers == 0. We believe this will also be necessary
for alpha-to-coverage, and it should add minimal overhead to an uncommon
case. Saving the recompiles should more than make up the difference.

Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
b546aebae922214dced54c75e6f64830aabd5d1c 10-Jul-2012 Kenneth Graunke <kenneth@whitecape.org> i965: Delete previous workaround for textureGrad with shadow samplers.

It had many problems:
- The shadow comparison was done post-filtering.
- It required state-dependent recompiles whenever the comparison
function changed.
- It didn't even work: many cases hit assertion failures.
- I never implemented it for the VS.

The new lowering pass which converts textureGrad to textureLod by
computing the LOD value works much better.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
fe27916ddf41b9fb60c334c47c1aa81b8dd9005e 04-Jul-2012 Eric Anholt <eric@anholt.net> i965/fs: Move class functions from the header to .cpp files.

Cuts compile time for brw_fs.h changes from 2.7s to .7s and reduces
i965_dri.so size by 70k.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
cf0bbb30f6bd9d3fa61b5207320e8f34c563a2c6 21-Jun-2012 Chad Versace <chad.versace@linux.intel.com> i965/fs: Fix conversions float->bool, int->bool

Fixes gles2conform GL.equal.equal_bvec2_frag.

This fixes brw_fs_visitor's translation of ir_unop_f2b. It used CMP to
convert the float to one of 0 or ~0. However, the convention in the
compiler is that true is represented by 1, not ~0. This patch adds an AND
to convert ~0 to 1.

By inspection, a similar problem existed with ir_unop_i2b, with a similar
fix.

[v2 kayden]: eliminate extra temporary register.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=49621
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
11a7b93592c22c8165f8fde6395f76778fca452e 14-Jun-2012 Paul Berry <stereotype441@gmail.com> i965: Add support for ir_unop_f2u to i965 backend.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
05790746df077183d6c3caf87ca2d276a60302a8 07-Jun-2012 Kenneth Graunke <kenneth@whitecape.org> i965: Enable the GL_ARB_shader_bit_encode extension.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
2f18698220d8b27991fab550c4721590d17278e0 01-Jun-2012 Kenneth Graunke <kenneth@whitecape.org> i965/fs: Fix user-defined FS outputs with less than four components.

OpenGL allows you to declare user-defined fragment shader outputs with
less than four components:

out ivec2 color;

This makes sense if you're rendering to an RG format render target.

Previously, we assumed that all color outputs had four components (like
the built-in gl_FragColor/gl_FragData variables). This caused us to
call emit_color_write for invalid indices, incrementing the output
virtual GRF's reg_offset beyond the size of the register.

This caused cascading failures: split_virtual_grfs would allocate new
size-1 registers based on the virtual GRF size, but then proceed to
rewrite the out-of-bounds accesses assuming that it had allocated enough
new (contiguously numbered) registers. This resulted in instructions
that accessed size-1 GRFs which register numbers beyond
virtual_grf_next (i.e. registers that were never allocated).

Finally, this manifested as live variable analysis and instruction
scheduling accessing their temporary array with an out of bounds index
(as they're all sized based on virtual_grf_next), and the program would
segfault.

It looks like the hardware's Render Target Write message requires you to
send four components, even for RT formats such as RG or RGB. This patch
continues to use all four MRFs, but doesn't bother to fill any data for
the last few, which should be unused.

+2 oglconforms.

NOTE: This is a candidate for stable release branches.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
217b62bf001f6b1da31807b803bbe45d7cabe3ba 04-Jun-2012 Kenneth Graunke <kenneth@whitecape.org> i965/fs: Fix texelFetchOffset() on pre-Gen7.

Commit f41ecade7b458c02d504158b522acb2231585040 fixed texelFetchOffset()
on Ivybridge, but didn't update the Ironlake/Sandybridge code.

+15 piglits on Sandybridge.

NOTE: This and f41ecade7b458 are both candidates for stable branches.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
29362875f2613ad87abe7725ce3c56c36d16cf9b 25-Apr-2012 Eric Anholt <eric@anholt.net> i965/gen6+: Add support for GL_ARB_blend_func_extended.

v2: Add support for gen6, and don't turn it on if blending is
disabled. (fixes GPU hang), and note it in docs/GL3.txt

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
1e188f2daef1ae31224d2429bcc1fab75c81fb36 10-May-2012 Eric Anholt <eric@anholt.net> intel: Fix signed/unsigned comparison warnings.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
d1029f99884e2ba7f663765274cd6bdb4f82feed 11-May-2012 Kenneth Graunke <kenneth@whitecape.org> i965/fs: Use a const reference in fs_reg::equals instead of a pointer.

This lets you omit some ampersands and is more idiomatic C++. Using
const also marks the function as not altering either register (which
was obvious, but nice to enforce).

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
dc42910e98dc00760255cc4579da458de09175b9 24-Apr-2012 Eric Anholt <eric@anholt.net> i965/fs: Fix regression in comparison handling from ANDs change.

I had fixed up the logic ops for delayed ANDing, but not equality
comparisons on bools. Fixes new piglit fs-bool-less-compare-true.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=48629
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
b443ca96a55a06ee215a3f9a9e7dba558deeb58c 24-Apr-2012 Kenneth Graunke <kenneth@whitecape.org> i965/fs: Fix FB writes that tried to use the non-existent m16 register.

A little analysis shows that the worst-case value for "nr" is 17:
- base_mrf = 2 ... 2
- header present (say gen == 5) ... 4
- aa_dest_stencil_reg (stencil test) ... 5
- SIMD16 mode: += 4 * reg_width ... 13
- source_depth_to_render_target ... 15
- dest_depth_reg ... 17

This resulted in us setting base_mrf to 2 and mlen to 15. In other
words, we'd try to use m2..m16. But m16 doesn't exist pre-Gen6. Also,
the instruction scheduler data structures use arrays of size 16, so this
would cause us to access them out of bounds.

While the debugger system routine may need m0 and m1, we don't use it
today, so the simplest solution is just to move base_mrf back to 1.
That way, our worst case message fits in m1..m15, which is legal.

An alternative would be to fail on SIMD16 in this case, but that seems
a bit unfortunate if there's no real need to reserve m0 and m1.

Fixes new piglit test shaders/depth-test-and-write on Ironlake.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=48218
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
f41ecade7b458c02d504158b522acb2231585040 17-Apr-2012 Eric Anholt <eric@anholt.net> i965/fs: Fix texelFetchOffset()

It appears that when using 'ld' with the offset bits, address bounds
checking happens before the offset is applied, so parts of the drawing
in piglit texelFetchOffset() with a negative texcoord go black.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
8890c759513597903997f519c69e9db30790b6f4 11-Apr-2012 Eric Anholt <eric@anholt.net> i965/fs: Suppress printing the whole loop in BRW_OPCODE_DO annotation.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
80ecb8f15b9ad7d6edcc85bd19f1867c368b09b6 13-Mar-2012 Eric Anholt <eric@anholt.net> i965/fs: Avoid generating extra AND instructions on bool logic ops.

By making a bool fs_reg only have a defined low bit (matching CMP
output), instead of being a full 0 or 1 value, we reduce the ANDs
generated in logic chains like:

if (v_texcoord.x < 0.0 || v_texcoord.x > texwidth ||
v_texcoord.y < 0.0 || v_texcoord.y > 1.0)
discard;

My concern originally when writing this code was that we would end up
generating unnecessary ANDs on bool uniforms, so I put the ANDs right
at the point of doing the CMPs that otherwise set only the low bit.
However, in order to use a bool, we're generating some instruction
anyway (e.g. moving it so as to produce a condition code update), and
those instructions can often be turned into an AND at that point. It
turns out in the shaders I have on hand, none of them regress in
instruction count:

Total instructions: 262649 -> 262545
39/2148 programs affected (1.8%)
14253 -> 14149 instructions in affected programs (0.7% reduction)
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
32ae8d3b321185a85b73ff703d8fc26bd5f48fa7 10-Mar-2012 Eric Anholt <eric@anholt.net> i965/fs: Try to avoid generating extra MOVs to do saturates.

This change (before the previous two) produced a .23% +/- .11%
performance improvement in Unigine Tropics at 1024x768 on IVB.

Total instructions: 269270 -> 262649
614/2148 programs affected (28.6%)
179386 -> 172765 instructions in affected programs (3.7% reduction)

v2: Move some of the logic of finding the instruction that produced
the result of an expression tree to a helper.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
f75c2d53146ea14f8dfedcc5b7a4704278ba0792 21-Sep-2011 Kenneth Graunke <kenneth@whitecape.org> glsl: Demote 'type' from ir_instruction to ir_rvalue and ir_variable.

Variables have types, expression trees have types, but statements don't.
Rather than have a nonsensical field that stays NULL in the base class,
just move it to where it makes sense.

Fix up a few places that lazily used ir_instruction even though they
actually knew the particular subclass.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
01044fce6b3de11635ea5078b76ffee1a33b3802 30-Mar-2012 Kenneth Graunke <kenneth@whitecape.org> i965: Avoid explicit accumulator operands in SIMD16 mode on Gen7.

According to the BSpec ISA volume's "Accumulator Register" section:

"[DevIVB] SIMD16 execution on dwords is not allowed when accumulator is
explicit source or destination operand."

Fixes piglit tests:
- fs-multiply-const-ivec4
- fs-multiply-const-uvec4
- fs-multiply-ivec4-const
- fs-multiply-uvec4-const

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
99cd475cc9d3bd54140f84c24b55b9e05d7310a1 09-Jan-2012 Kenneth Graunke <kenneth@whitecape.org> i965: Enable SIMD16 mode for shaders with loops on Gen6+.

The hardware supports it; there's no reason not to.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
df5963c25641a7c3a4bbfcb81cc3dc771581590e 18-Feb-2012 Kenneth Graunke <kenneth@whitecape.org> i965: Make the dummy fragment shader work in SIMD16 mode.

If you're resorting to the dummy shader, you've probably already turned
off SIMD16 mode. But if you didn't, it would die in a fire.

We could either fail to compile in SIMD16 mode...or just fix it.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
393b42240f22dbbfb4f089036319031ad36173f3 18-Feb-2012 Kenneth Graunke <kenneth@whitecape.org> i965: Fix GPU hangs in the dummy fragment shader.

The dummy FB write failed to specify EOT and a message length, causing
the GPU to hang. Now we can enjoy "everyone's favorite color" again.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
4b274068204c7f0bacaa4639f24feb433353b861 14-Feb-2012 Kenneth Graunke <kenneth@whitecape.org> i965/fs: Take # of components into account in try_rewrite_rhs_to_dst.

Commit dc7f449d1ac53a66e6efb56ccf2a5953418a26ca introduced a new method
for avoiding MOVs: try to rewrite the destination of the instruction
that produced the RHS so it writes into the LHS.

Unfortunately, this is not safe for swizzled texturing operations, as
they return a set of four contiguous registers. Consider the following:

(assign (x)
(var_ref vec_ctor_x)
(swiz x (tex vec4 (var_ref m_sampY) (var_ref m_cordY) 0 1 ())))

In this case, the source and destination registers are equal, since
reg_offset is 0 for both. Yet, this is only a partial move: the texture
operation generates four registers, and the LHS only covers one.

Fixes color distortion in XBMC when using GLSL shaders.

NOTE: This is a candidate for the 8.0 branch (with the previous commit).

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=44333
Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
7d55f37b0e87db9b3806088797075161a1c9a8bb 07-Feb-2012 Eric Anholt <eric@anholt.net> i965/fs: Add support for generating MADs.

Improves nexuiz performance 0.65% +/- .10% (n=5) on my gen6, and .39%
+/- .11% (n=10) on gen7. No statistically significant performance
difference on warsow (n=5, but only one shader has MADs).

v2: Add support for MADs in 16-wide by using compression control.
v3: Don't generate MADs when it will force an immediate to be moved to a temp.
(it's not clear whether this is a win or not, but it should result in less
questionable change to codegen compared to v2).

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (v2)
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
7c857a6b159debf76d4661f494fd2c97d205b5b1 04-Feb-2012 Eric Anholt <eric@anholt.net> i965/fs: Implement GL_CLAMP behavior on texture rectangles on gen6+.

We were doing saturate-based clamping on the [0,width] or [0,height]
coordinate, which meant only the first pixel was addressable.

Fixes piglit ARB_texture_rectangle/texwrap-RECT-bordercolor

NOTE: This is a candidate for the 8.0 release branch.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
07e621c52329cd17b97051a26493626228d043b9 03-Feb-2012 Eric Anholt <eric@anholt.net> i965/fs: Move GL_CLAMP handling to coordinate setup.

We should be able to merge self-move instruction into the MRF move
anyway, and this simplifies things for the next commit.

NOTE: This is a candidate for the 8.0 release branch.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
83dc891b41c0224f5ba3624b3e3560129e644e28 09-Jan-2012 Eric Anholt <eric@anholt.net> i965/fs: Fix GPU hangs with 16-wide integer div/mod on gen7.

Acked-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
1b05fc7cdd0e5d77b50bc8ee2f2c851da5884d72 07-Dec-2011 Kenneth Graunke <kenneth@whitecape.org> i965/fs: Factor out texturing related data from brw_wm_prog_key.

The idea is to reuse this for the VS and (in the future) GS as well.

v2: Include yuvtex data since we're not dropping GL_MESA_ycbycr.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net> [v1]
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
475d70d6ef5feb94efab3923e5607e625f2aee67 26-Oct-2011 Kenneth Graunke <kenneth@whitecape.org> i965/fs: Factor out texture offset bitfield computation.

We'll want to reuse this for the VS, and it's complex enough that I'd
rather not cut and paste it.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
febad1779ae5cb5c85d66c2635baea62da52d2fa 26-Oct-2011 Kenneth Graunke <kenneth@whitecape.org> i965: Rename texturing ops from FS_OPCODE to SHADER_OPCODE, except TXB.

We'll be reusing most of these for the VS shortly. The one exception is
TXB (texturing with LOD bias), which is explicitly forbidden in the VS.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
c592ebc581e5ca0122660b3d76ec924b96581216 06-Dec-2011 Kenneth Graunke <kenneth@whitecape.org> i965/fs: Don't swizzle the results of textureSize().

Fixes a regression since d2235b0f4681f75d562131d655a6d7b7033d2d8b,
in my new textureSize sampler(1DArrayShadow|2DShadow|2DArrayShadow)
piglit tests, though I'm not honestly sure how this ever worked.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
51e5a266c1e1c12c4f0d82bee3caff008a41c9fd 23-Nov-2011 Eric Anholt <eric@anholt.net> i965/fs: Fix regression in fbo-alphatest-nocolor.

In the refactor for handling user-defined out params, we failed to set
up the new color output tracking when there was no color drawbuffer in
place but alpha testing was on. Just always set up at least one when
handling gl_FragColor, since we won't make use of its value unless we
need to.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=42806
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
6d874d0ee18b3694c49e0206fa519bd8b746ec24 09-Nov-2011 Eric Anholt <eric@anholt.net> i965/fs: Add support for user-defined out variables.

Before, I was tracking the ir_variable * found for gl_FragColor or
gl_FragData[]. Instead, when visiting those variables, set up an
array of per-render-target fs_regs to copy the output data from. This
cleans up the color emit path, while making handling of multiple
user-defined out variables easier.

v2: incorporate idr's feedback about ir->location (changes by Kenneth Graunke)

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
e988d816e16f9c0844424472d689486a833931c3 09-Nov-2011 Eric Anholt <eric@anholt.net> i965/fs: Preserve the source register type when doing color writes.

When rendering to integer color buffers, we need to be careful to use
MRFs of the correct type when emitting color writes.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
512431b3575eb5f2c27d8795c5e2191047ebb5ed 27-Oct-2011 Kenneth Graunke <kenneth@whitecape.org> i965/fs: Use the actual hardware g0 register for texel offset setup.

The idea here is to set up the message header with the Sampler State
pointer which the hardware provides as part of the PS Thread Payload in
register g0.

Unfortunately, the existing code

fs_reg(GRF, 0, BRW_REGISTER_TYPE_UD))

actually references "virtual GRF 0" rather than the hardware g0. This
is just some arbitrary GRF temporary which will get register allocated.

So, we ended up setting up the header with garbage.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
e04bdeae82797dbdcf6f544a997a4626fdfd4aee 22-Oct-2011 Paul Berry <stereotype441@gmail.com> i965/gen6+: Parameterize barycentric interpolation modes.

This patch modifies the fragment shader back-end so that instead of
using a single delta_x/delta_y register pair to store barycentric
coordinates, it uses an array of such register pairs, one for each
possible intepolation mode.

When setting up the WM, we intstruct it to only provide the
barycentric coordinates that are actually needed by the fragment
shader--that is computed by brw_compute_barycentric_interp_modes().
Currently this function returns just
BRW_WM_PERSPECTIVE_PIXEL_BARYCENTRIC, because this is the only
interpolation mode we support. However, that will change in a later
patch.

Reviewed-by: Eric Anholt <eric@anholt.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
73b0a28ba8b3e2ab917d4c729f34ddbde52c9e88 04-Oct-2011 Eric Anholt <eric@anholt.net> i965/fs: Fix comparisions with uint negation.

The condmod instruction ends up generating garbage condition codes,
because apparently the comparison happens on the accumulator value (33
bits for UD), not the truncated value that would be written.

Fixes fs-op-neg-*

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
2e5a1a254ed81b1d3efa6064f48183eefac784d0 07-Oct-2011 Kenneth Graunke <kenneth@whitecape.org> intel: Convert from GLboolean to 'bool' from stdbool.h.

I initially produced the patch using this bash command:
for file in {intel,i915,i965}/*.{c,cpp,h}; do [ ! -h $file ] && sed -i
's/GLboolean/bool/g' $file && sed -i 's/GL_TRUE/true/g' $file && sed -i
's/GL_FALSE/false/g' $file; done

Then I manually added #include <stdbool.h> to fix compilation errors,
and converted a few functions back to GLboolean that were used in core
Mesa's function pointer table to avoid "incompatible pointer" warnings.

Finally, I cleaned up some whitespace issues introduced by the change.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Chad Versace <chad@chad-versace.us>
Acked-by: Paul Berry <stereotype441@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
ff8f272b0d02b41a0ce34ab6af7119b9e06f4961 29-Sep-2011 Kenneth Graunke <kenneth@whitecape.org> i965/fs: Implement integer quotient and remainder math operations.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Tested-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
7de6e749df90a214d1547956dd66cfec6edcb446 27-Sep-2011 Eric Anholt <eric@anholt.net> i965/fs: Add support for bit-shift operations.

Reviewed-by: Chad Versace <chad@chad-versace.us>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
83df7fbe62be2798d557142a47e01af86ec9e2e2 27-Sep-2011 Kenneth Graunke <kenneth@whitecape.org> i965: Allow SIMD16 color writes on Ivybridge.

Again, the check was needlessly specific: this works fine on Gen7.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
79cba4c2b17456e2b25ac555c45e1c106b4e3f6b 27-Sep-2011 Kenneth Graunke <kenneth@whitecape.org> i965/fs: Allow SIMD16 with control flow on Ivybridge.

The check was designed to forbid it on old generations (Gen5/Ironlake),
not on new ones. It just works on Gen7/Ivybridge.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
c310c35a754be835c9ceafe578c4974a667b9a77 20-Sep-2011 Eric Anholt <eric@anholt.net> i965: Fix compiler warnings.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
47b556fbcaea4660b21481e40d89167d883d47f5 07-Sep-2011 Kenneth Graunke <kenneth@whitecape.org> i965/fs: Implement texelFetch() on Gen4.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
0edf5d63d60100cc2b7467da78ce811c4824b760 26-Aug-2011 Kenneth Graunke <kenneth@whitecape.org> i965/fs: Implement texelFetch() on Ivybridge.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
30be2cc6c7c3378ee17885b5bf41d7ae53bf6fe0 26-Aug-2011 Kenneth Graunke <kenneth@whitecape.org> i965/fs: Implement texelFetch() on Ironlake and Sandybridge.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
4bc5bfb641bce931bf35f0e78ec2b44263d152ba 02-Sep-2011 Kenneth Graunke <kenneth@whitecape.org> i965/fs: Implement ir_u2f opcode.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
2f0edc60f4bd2ae5999a6afa656e3bb3f181bf0f 26-Aug-2011 Chad Versace <chad@chad-versace.us> i965: Fix Android build by removing relative includes

Replace each occurence of
#include "../glsl/*.h"
with
#include "glsl/*.h"

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Chad Versace <chad@chad-versace.us>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
dc7f449d1ac53a66e6efb56ccf2a5953418a26ca 26-Aug-2011 Kenneth Graunke <kenneth@whitecape.org> i965: Avoid generating MOVs for most ir_assignment handling.

This is a port of vec4_visitor::try_rewrite_rhs_to_dst to fs_visitor.

Not only is this technique less invasive and more robust, it also
generates better code. Over and above the previous technique, this
reduced instruction count in shader-db by 0.28% on average and 1.4% in
the best case.

In no case did this technique result in more code than the prior method.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Eric Anholt <eric@anholt.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
d28a3bd4bf25157aff5379a003bbf4a66157ed06 26-Aug-2011 Kenneth Graunke <kenneth@whitecape.org> i965/fs: Revert "Avoid generating MOVs for assignments for expressions."

This reverts commit 53c89c67f33639afef951e178f93f4e29acc5d53, along with
the subsequent this->result = reg_undef additions it required.

Both Eric and I agree that the way he did this is really fragile; if you
forget to add this->result = reg_undef before calling accept(), it may
end up using the same register for two separate things, breaking things
in strange and mysterious ways.

The next commit will port over the new VS backend's method for solving
this problem, which is simpler, less intrusive, and still manages to
avoid MOVs in the common case.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
9d4b98eb9eadecc17cd1cda0074b420a39e74647 17-Aug-2011 Eric Anholt <eric@anholt.net> i965/gen6+: Use non-normalized coordinates for GL_TEXTURE_RECTANGLE.

Improves performance of a GL_TEXTURE_RECTANGLE microbenchmark by 1.84%
+/- .15% (n=3)
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
4eeb4c150598605d1be3ce6674fa63076a720ae9 17-Aug-2011 Kenneth Graunke <kenneth@whitecape.org> i965: Implement textureSize (TXS) on Gen4.

Also, remove the BRW_SAMPLER_MESSAGE_SIMD8_RESINFO #define because
there totally isn't a SIMD8 variant.

Unfortunately, resinfo returns FLOAT32 on Broadwater/Crestline, unlike
G45 which returns a proper UINT32. This turns out to be simple,
however: when we emit MOVs to select the desired half of the SIMD16
result, we can simply override the register type to be float so it's
converted to an integer.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
ecf8963754489abfb5097c130a9bcd4cdb76b6bd 19-Jun-2011 Kenneth Graunke <kenneth@whitecape.org> i965/fs: Implement textureSize (TXS) on Gen5+.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
b6bdcf2a908889532ef6d5eb643791176dffcb9d 18-Aug-2011 Kenneth Graunke <kenneth@whitecape.org> i965/fs: Rudimentary support for non-floating point texture results.

Not all texturing operations return floating point data. For example,
the resinfo message (textureSize or TXS) returns integer data. In the
future, we'll also add integer texture support.

ir_texture's type field contains this information; use its base type to
appropriately type the destination register. We want to keep it as a
four component vector, however, since SIMD8 samplers always have a
response length of 4.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
1e3bcbdf31f09666ba358f35ff9486faee3642ca 25-Feb-2011 Kenneth Graunke <kenneth@whitecape.org> glsl: Add a new ir_txs (textureSize) opcode to ir_texture.

One unique aspect of TXS is that it doesn't have a coordinate.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
eb86bb55f5faef67c21604db19210c6788592679 18-Aug-2011 Kenneth Graunke <kenneth@whitecape.org> i965/fs: Change incorrect use of 'struct fs_reg' to simply 'fs_reg'.

It's actually a class.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
3f78f719732b87e6707f94c187ad6e263c6c2ef0 16-Aug-2011 Eric Anholt <eric@anholt.net> i965/fs: Fix 32-bit integer multiplication.

The MUL opcode does a 16bit * 32bit multiply, and we need to do the
MACH to get the top 16bit * 32bit added in.

Fixes fs-op-mult-int-*, fs-op-mult-ivec*

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
65b5cbbcf783f6c668ab5b31a0734680dd396794 05-Aug-2011 Eric Anholt <eric@anholt.net> i965: Rename math FS_OPCODE_* to SHADER_OPCODE_*.

I want to just use the same enums in the VS.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
b76378d46a211521582cfab56dc05031a57502a6 04-May-2011 Eric Anholt <eric@anholt.net> i965/fs: Eliminate the magic nature of virtual GRF 0.

This was a debugging aid at one point -- virtual grf 0 should never be
allocated, and it would be used if undefined register access occurred
in codegen. However, it made the confusing register allocation code
even more confusing by indexing things off of 1 all over.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
44ffb4ae207e48f78fae55925601b8708ed09c1d 29-Jul-2011 Eric Anholt <eric@anholt.net> i965/fs: Stop using the exec_list iterator.

The old style has gone out of favor in the project, but I kept copy
and pasting from existing iterator code.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
4fdd289805d14d4f7a234f88cd375be1b3b96764 26-Jul-2011 Eric Anholt <eric@anholt.net> i965/fs: Respect ARB_color_buffer_float clamping.

This was done in the old codegen path, but not the new one. Caught by
piglit fbo tests after the conversion to GLSL ff_fragment_shader.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
3e1fd13f605f16e8b48f3a9b71910a3c66eb84b5 25-Jul-2011 Kenneth Graunke <kenneth@whitecape.org> i965/gen4: Fix message parameter loading for 1D TXD sampling.

We were neglecting to load dvdx and dvdy. v is not optional.

Fixes glslparsertests tex-grad-0[12345].frag on Broadwater/Crestline.
(We still need an execution test using sampler1D.)

NOTE: This is a candidate for the 7.11 branch.

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
156cef0fbacf242e8fc67e39ab964e5f8f3739cb 22-Jul-2011 Kenneth Graunke <kenneth@whitecape.org> i965/fs: Clear result before visiting shadow comparitor and LOD info.

Commit 53c89c67f33639afef951e178f93f4e29acc5d53 ("i965: Avoid generating
MOVs for assignments of expressions.") added the line "this->result =
reg_undef" all over the code. Unfortunately, since Eric developed his
patch before I landed Ivybridge support, he missed adding it to
fs_visitor::emit_texture_gen7() after rebasing.

Furthermore, since I developed TXD support before Eric's patch, I
neglected to add it to the gradient handling when I rebased.

Neglecting to set this causes the visitor to use this->result as storage
rather than generating a new temporary. These missing statements
resulted in the same register being used to store several different
values.

Fixes the following piglit tests on Ivybridge:
- glsl-fs-shadow2dproj.shader_test
- glsl-fs-shadow2dproj-bias.shader_test

NOTE: This is a candidate for the 7.11 branch.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
eafc74d7d4982a835ac43c73963dda9982652464 30-Jun-2011 Kenneth Graunke <kenneth@whitecape.org> i965/fs: Fix message register allocation in FB writes.

Commit 6750226e6d915742ebf96bae2cfcdd287b85db35 bumped the base MRF to
m2 instead of m0, but failed to adjust inst->mlen, which was being set
to the highest MRF. Subtracting the base MRF solves the issue.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
b633ddeb9fd951ddc49e8a3fd25a946e5a16361f 14-Jun-2011 Kenneth Graunke <kenneth@whitecape.org> i965/fs: Implement new ir_unop_u2i and ir_unop_i2u opcodes.

No MOV is necessary since signed/unsigned integers share the same
bit-representation; it's simply a question of interpretation. In
particular, the fs_reg::imm union shouldn't need updating.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
6750226e6d915742ebf96bae2cfcdd287b85db35 17-Jun-2011 Ben Widawsky <ben@bwidawsk.net> i965: step message register allocation

The system routine requires m0 be reserved for saving off architectural
state. Moved the allocation to start at 2 instead of 0.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Eric Anholt <eric@anholt.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
6430df37736d71dd2bd6f1fe447d39f0b68cb567 10-Jun-2011 Kenneth Graunke <kenneth@whitecape.org> i965/fs: Add support for TXD with shadow comparisons.

Our hardware doesn't have a sample_d_c message, so we have to do a
regular sample_d and emit instructions to manually perform the
comparison.

This requires a state dependent recompile whenever the sampler's compare
mode or function change. This adds the per-sampler comparison functions
to brw_wm_prog_key, but only sets them when the sampler's compare mode
is GL_COMPARE_R_TO_TEXTURE (i.e. only for shadow sampling).

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
01fa9addf447120e994415ad8fc8246ac234ec27 10-Jun-2011 Kenneth Graunke <kenneth@whitecape.org> i965/fs: Refactor texture result swizzling into a helper function.

The next patch will add a few additional uses.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
f1622cfe9c0f37a9b452be1297f187cba8c46e6a 10-Jun-2011 Kenneth Graunke <kenneth@whitecape.org> i965/fs: Move sampler fetch to the top of the ir_texture visit function.

This makes it available earlier, which will soon be necessary.
(Separating code motion from actual changes.)

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
6c947cfd1973c3791d54f1406c973357b4a9621a 09-Jun-2011 Kenneth Graunke <kenneth@whitecape.org> i965/fs: Add support for non-shadow textureGrad (TXD) on gen4.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
2f4a4b943f1cad9bbbb8f66c34dca506503ba5bb 09-Jun-2011 Kenneth Graunke <kenneth@whitecape.org> i965/fs: Add support for non-shadow textureGrad (TXD) on gen5/6.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
3fa910fff9f72d1adf33f0f4dea3d790a9ce04ab 09-Jun-2011 Kenneth Graunke <kenneth@whitecape.org> i965/fs: Add support for non-shadow textureGrad (TXD) on Ivybridge.

This is somewhat ugly, but I couldn't think of a nicer way to handle the
interleaved coordinate/derivative parameter loading.

Ironlake and Sandybridge will still hit an assertion in visit().

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
c331b3123ecda127919458e24848b7c1596525ac 12-May-2011 Eric Anholt <eric@anholt.net> i965/fs: Use the embedded compare in SEL on gen6+.

This avoids the extra CMP and the predication on SEL, so in addition
to one less instruction, it makes scheduling less constrained.

Improves glbenchmark Egypt performance 0.6% +/- 0.2% (n=3). Reduces
FS instruction count across affected shaders in shader-db by 1.3%
without regressing any.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
0653c450cc8da1212e1123a1cd6635c02f7d6919 27-May-2011 Eric Anholt <eric@anholt.net> i965/fs: Fix up for 8752764076e5b3f052a57e0134424a37bf2e9164.

I failed to commit and squash before pushing.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
d1f70a8a6c6ec7007bad22d3d6013415be2d243a 25-May-2011 Eric Anholt <eric@anholt.net> i965/fs: Split the GLSL IR -> FS LIR visitor to brw_fs_visitor.cpp.

We now have:
brw_fs.cpp handles calling out to everything and optimization.
brw_fs_visitor.cpp handles translating to our LIR.
brw_fs_emit.cpp handles emitting from our LIR to native code.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp