History log of /external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
Revision Date Author Comments
97eed6da9704deb2c57fe47cd110c2b70191e2c2 25-Oct-2012 Kenneth Graunke <kenneth@whitecape.org> i965/vs: Don't lose the MRF writemask when doing compute-to-MRF.

Consider the following code sequence:

mul(8) g4<1>F g1<0,4,1>.wzwwF g3<4,4,1>.wzwwF
mov.sat(8) m1<1>.xyF g4<4,4,1>F
mul(8) g4<1>F g1<0,4,1>.xxyxF g3<4,4,1>.xxyxF
mov.sat(8) m1<1>.zwF g4<4,4,1>F

The compute-to-MRF pass will discover the first mov.sat and attempt to
replace it by rewriting earlier instructions. Everything works out,
so it replaces scan_inst's destination file, reg, and reg_offset,
resulting in:

mul(8) m1<1>F g1<0,4,1>.wzwwF g3<4,4,1>.wzwwF
mul(8) g4<1>F g1<0,4,1>.xxyxF g3<4,4,1>.xxyxF
mov.sat(8) m1<1>.zwF g4<4,4,1>F

Unfortunately, it loses the .xy writemask on the mov.sat's MRF
destination. While this doesn't pose an immediate problem, it then
proceeds to transform the second mov.sat, resulting in:

mul(8) m1<1>F g1<0,4,1>.wzwwF g3<4,4,1>.wzwwF
mul(8) m1<1>F g1<0,4,1>.xxyxF g3<4,4,1>.xxyxF

Instead of writing both halves of the vector (like the original code),
it overwrites the full vector both times, clobbering the desired .xy
values.

When encountering a MOV, the compute-to-MRF code scans for instructions
which generate channels of the MOV source. It ensures that all
necessary channels are available (possibly written by several
instructions). In this case, *more* channels are available than
necessary, so we want to take the subset that's actually used.
Taking the bitwise AND of both writemasks should accomplish that.
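
The masking step described above can be sketched as follows. This is an illustrative stand-in, not the actual i965 code: the `Dst` struct, the `MASK_*` constants, and `retarget_to_mrf()` are hypothetical names invented for the example. It only shows why intersecting the two writemasks keeps the two halves of m1 from clobbering each other.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical channel bits; illustrative only, not Mesa's definitions.
enum : uint8_t {
    MASK_X = 1 << 0,
    MASK_Y = 1 << 1,
    MASK_Z = 1 << 2,
    MASK_W = 1 << 3,
    MASK_XYZW = MASK_X | MASK_Y | MASK_Z | MASK_W,
};

// Simplified stand-in for a destination register (assumption).
struct Dst {
    bool is_mrf;        // true if the register lives in the MRF file
    int reg;            // register number
    uint8_t writemask;  // channels this destination writes
};

// Retarget the generating instruction's destination at the MOV's MRF
// destination, keeping only the channels that both instructions write.
Dst retarget_to_mrf(const Dst &scan_dst, const Dst &mov_dst)
{
    assert(mov_dst.is_mrf);
    Dst out = scan_dst;
    out.is_mrf = true;
    out.reg = mov_dst.reg;
    // The generating instruction may produce more channels than the MOV
    // consumes; the bitwise AND takes the subset that is actually used.
    out.writemask = scan_dst.writemask & mov_dst.writemask;
    return out;
}
```

With the sequence from this commit message, the first mul (writing all of g4) combined with the mov.sat to m1.xy would end up writing only m1.xy, and the second only m1.zw.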

This was discovered by analyzing an ARB_vertex_program test
(glean/vertProg1/MUL test (with swizzle and masking)) with my new
Mesa IR -> Vec4 IR translator code. However, it should be possible
with GLSL programs as well.

NOTE: This is a candidate for stable release branches.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
(cherry picked from commit 10ff6772c8054aea12ac0f08e2e3898fd4a7f76b)
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
25ca9cc8236845a4be32a6f39b4a6d1664d4b403 04-Jul-2012 Eric Anholt <eric@anholt.net> i965/vs: Move the other two src_reg/dst_reg constructors to brw_vec4.cpp.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
b2f5d4c3ec9ec2fec8b39c87eb00121a24107276 04-Jul-2012 Eric Anholt <eric@anholt.net> i965/vs: Move class functions to brw_vec4.cpp.

This has less impact than for the FS (4k savings), because it was partially
done already, but makes things more consistent.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
7e7c40ff98cc2b930bc3113609ace5430f2bdc95 26-Oct-2011 Kenneth Graunke <kenneth@whitecape.org> i965/vs: Add vec4_instruction::is_tex() query.

Copied and pasted from fs_inst::is_tex(), but without TXB.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
1d4f3ca8f0442821c914b758b323e6e5124149a3 29-Sep-2011 Kenneth Graunke <kenneth@whitecape.org> i965/vs: Implement integer quotient and remainder math operations.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Tested-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
c662764f4f9d9d0303fb2685dfdc93824fa15dca 06-Sep-2011 Eric Anholt <eric@anholt.net> i965/vs: Add support for compute-to-MRF.

Removes 1.8% of the instructions from 97% of the vertex shaders in
shader-db.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
160848d8ef96cf3a760c02cc576df7dbffc1f669 06-Sep-2011 Eric Anholt <eric@anholt.net> i965/vs: Add a function for how many MRFs get written as part of a SEND.

This will be used for compute-to-mrf, which needs to know when MRFs
get overwritten.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
f0c04e6c22babf2aee2ad1ee85dbd6f996be3712 03-Sep-2011 Eric Anholt <eric@anholt.net> i965/vs: Add support for simple algebraic optimizations.

We generate silly code for array access, and it's easier to generally
support the cleanup than to specifically avoid the bad code in each
place we might generate it.

Removes 4.6% of instructions from 41.6% of shaders in shader-db,
particularly savage2/hon and unigine.

v2: Fixes by Ken: Make is_zero/one member functions, and fix a
progress flag.
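
A minimal sketch of what such a cleanup pass might look like. The commit message does not say which identities are handled, so the three rewrites below (x + 0, x * 1, x * 0) are assumptions, as are the `Opcode`, `Src`, `Inst`, and `opt_algebraic()` names. Only the is_zero/is_one member functions and the progress flag come from the text above.

```cpp
#include <cassert>
#include <vector>

// Hypothetical, heavily simplified IR; not the real vec4 classes.
enum Opcode { OP_MOV, OP_ADD, OP_MUL };

struct Src {
    bool is_imm;  // true if this source is an immediate constant
    float imm;    // immediate value, meaningful only when is_imm
    int reg;      // register number, meaningful only when !is_imm

    bool is_zero() const { return is_imm && imm == 0.0f; }
    bool is_one() const { return is_imm && imm == 1.0f; }
};

struct Inst {
    Opcode op;
    int dst;
    Src src[2];
};

// Rewrites trivial arithmetic into MOVs and reports whether anything
// changed, so a caller can rerun other passes. IEEE corner cases
// (NaN, infinities, signed zero) are ignored in this sketch.
bool opt_algebraic(std::vector<Inst> &insts)
{
    bool progress = false;
    for (Inst &inst : insts) {
        switch (inst.op) {
        case OP_ADD:
            if (inst.src[1].is_zero()) {        // x + 0 -> x
                inst.op = OP_MOV;
                progress = true;
            }
            break;
        case OP_MUL:
            if (inst.src[1].is_zero()) {        // x * 0 -> 0
                inst.op = OP_MOV;
                inst.src[0] = inst.src[1];
                progress = true;
            } else if (inst.src[1].is_one()) {  // x * 1 -> x
                inst.op = OP_MOV;
                progress = true;
            }
            break;
        default:
            break;
        }
    }
    return progress;
}
```

Returning the flag only when an instruction was actually rewritten matters: a pass that always reports progress would keep an optimization loop running forever.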

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
cc9eb936c220267b6130b705fc696d05906a31df 02-Sep-2011 Eric Anholt <eric@anholt.net> i965/vs: Add support for copy propagation of the UNIFORM and ATTR files.

Removes 2.0% of the instructions from 35.7% of vertex shaders in shader-db.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
42ce13195b94d0d51ca8e7fa5eed07fde8f37988 30-Aug-2011 Eric Anholt <eric@anholt.net> i965/vs: Add constant propagation to a few opcodes.

This differs from the FS in that we track constants in each
destination channel, and we have to look at all the swizzled source
channels. Also, the instruction stream walk is done in an O(n) manner
instead of O(n^2).
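
The per-channel tracking can be illustrated with a small sketch (C++17). It is not the pass itself: `Channels`, `record_mov_imm()`, and `try_constant()` are hypothetical names, and the rule that every swizzled channel must hold the same known value is an assumption about how a scalar immediate could replace a swizzled source.

```cpp
#include <array>
#include <cassert>
#include <optional>

// Known constant value per channel (x, y, z, w) of one register;
// an empty optional means the channel's value is unknown.
using Channels = std::array<std::optional<float>, 4>;

// Record a "MOV reg.writemask, imm": only masked channels become known.
void record_mov_imm(Channels &reg, unsigned writemask, float value)
{
    for (int c = 0; c < 4; c++) {
        if (writemask & (1u << c))
            reg[c] = value;
    }
}

// A swizzled read of the register can be replaced by a single immediate
// only if every channel the swizzle selects is known and they all agree.
std::optional<float> try_constant(const Channels &reg,
                                  const std::array<int, 4> &swizzle)
{
    std::optional<float> value;
    for (int c : swizzle) {
        if (!reg[c])
            return std::nullopt;  // selected channel is not a known constant
        if (value && *value != *reg[c])
            return std::nullopt;  // selected channels hold different constants
        value = reg[c];
    }
    return value;
}
```

For example, after recording 1.0 into .xy only, a .xyxx read can be replaced by the immediate 1.0, while a .xyzw read cannot because z and w are still unknown.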

Across shader-db, this removes 8.0% of the instructions from 60.0% of
the vertex shaders, leaving us now behind the old backend by 11.1%
overall.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
df35d691807656d3627b6fa6f51a08674bdc043e 07-Sep-2011 Eric Anholt <eric@anholt.net> i965/vs: Add support for overflowing the number of available push constants.

Fixes glsl-vs-uniform-array-4.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=33742

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
72cfc6f3778d8297e52c254a5861a88eb62e4d67 23-Aug-2011 Eric Anholt <eric@anholt.net> i965/vs: Pack live uniform vectors together in the push constant upload.

At some point we need to also move uniform accesses out to pull
constants when there are just too many in use, but we lack tests for
that at the moment.

Fixes glsl-vs-large-uniform-array.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
7c84b9d303345fa5075dba8c4ea7af449d93b0f8 23-Aug-2011 Eric Anholt <eric@anholt.net> i965/vs: Track uniforms as separate vectors once we've done array access.

This will make it easier to figure out which elements are totally
unused and not upload them.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
8174945d3346dc049ae56dcb4bf1eab39f5c88aa 17-Aug-2011 Eric Anholt <eric@anholt.net> i965/vs: Add simple dead code elimination.

This is copied right from the fragment shader. It is needed for real
register allocation to work correctly.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
3dadc1e3cceac80a1b63cad2e10f0e0f8904531b 17-Aug-2011 Eric Anholt <eric@anholt.net> i965/vs: Copy the live intervals calculation over from the FS.

This is a rather pessimistic calculation, since it doesn't distinguish
individual channels of a vec4, or elements of an array, but should be
a minimum start for register allocation.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp