97eed6da9704deb2c57fe47cd110c2b70191e2c2 |
|
25-Oct-2012 |
Kenneth Graunke <kenneth@whitecape.org> |
i965/vs: Don't lose the MRF writemask when doing compute-to-MRF.

Consider the following code sequence:

    mul(8) g4<1>F g1<0,4,1>.wzwwF g3<4,4,1>.wzwwF
    mov.sat(8) m1<1>.xyF g4<4,4,1>F
    mul(8) g4<1>F g1<0,4,1>.xxyxF g3<4,4,1>.xxyxF
    mov.sat(8) m1<1>.zwF g4<4,4,1>F

The compute-to-MRF pass will discover the first mov.sat and attempt to replace it by rewriting earlier instructions. Everything works out, so it replaces scan_inst's destination file, reg, and reg_offset, resulting in:

    mul(8) m1<1>F g1<0,4,1>.wzwwF g3<4,4,1>.wzwwF
    mul(8) g4<1>F g1<0,4,1>.xxyxF g3<4,4,1>.xxyxF
    mov.sat(8) m1<1>.zwF g4<4,4,1>F

Unfortunately, it loses the .xy writemask on the mov.sat's MRF destination. While this doesn't pose an immediate problem, it then proceeds to transform the second mov.sat, resulting in:

    mul(8) m1<1>F g1<0,4,1>.wzwwF g3<4,4,1>.wzwwF
    mul(8) m1<1>F g1<0,4,1>.xxyxF g3<4,4,1>.xxyxF

Instead of writing both halves of the vector (like the original code), it overwrites the full vector both times, clobbering the desired .xy values.

When encountering a MOV, the compute-to-MRF code scans for instructions which generate channels of the MOV source. It ensures that all necessary channels are available (possibly written by several instructions). In this case, *more* channels are available than necessary, so we want to take the subset that's actually used. Taking the bitwise and of both writemasks should accomplish that.

This was discovered by analyzing an ARB_vertex_program test (glean/vertProg1/MUL test (with swizzle and masking)) with my new Mesa IR -> Vec4 IR translator code. However, it should be possible with GLSL programs as well.

NOTE: This is a candidate for stable release branches.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
(cherry picked from commit 10ff6772c8054aea12ac0f08e2e3898fd4a7f76b)
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
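The writemask fix described in the commit above can be illustrated with a small standalone sketch. This is not the driver code: the one-bit-per-channel encoding, the MASK_* names, and the rewritten_writemask() helper are all assumptions made for illustration. It only shows why intersecting the scanned instruction's writemask with the MOV's writemask keeps the two halves of the vector from clobbering each other.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical one-bit-per-channel writemask encoding (illustrative names,
// not the driver's own definitions).
constexpr std::uint8_t MASK_X = 1 << 0;
constexpr std::uint8_t MASK_Y = 1 << 1;
constexpr std::uint8_t MASK_Z = 1 << 2;
constexpr std::uint8_t MASK_W = 1 << 3;
constexpr std::uint8_t MASK_XYZW = MASK_X | MASK_Y | MASK_Z | MASK_W;

// Writemask for an instruction whose destination is being rewritten to the
// MRF: keep only the channels it computes that the MOV actually stored.
constexpr std::uint8_t rewritten_writemask(std::uint8_t scan_mask,
                                           std::uint8_t mov_mask)
{
    return static_cast<std::uint8_t>(scan_mask & mov_mask);
}

// The first mul writes all four channels, but its MOV only stored .xy.
static_assert(rewritten_writemask(MASK_XYZW, MASK_X | MASK_Y) ==
                  (MASK_X | MASK_Y),
              "the rewritten mul must keep only the .xy channels");
```

With this intersection, the first rewritten mul in the example would write only .xy and the second only .zw, matching what the original pair of mov.sat instructions stored.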
25ca9cc8236845a4be32a6f39b4a6d1664d4b403 |
|
04-Jul-2012 |
Eric Anholt <eric@anholt.net> |
i965/vs: Move the other two src_reg/dst_reg constructors to brw_vec4.cpp. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
b2f5d4c3ec9ec2fec8b39c87eb00121a24107276 |
|
04-Jul-2012 |
Eric Anholt <eric@anholt.net> |
i965/vs: Move class functions to brw_vec4.cpp. This has less impact than for the FS (4k savings), because it was partially done already, but makes things more consistent. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
7e7c40ff98cc2b930bc3113609ace5430f2bdc95 |
|
26-Oct-2011 |
Kenneth Graunke <kenneth@whitecape.org> |
i965/vs: Add vec4_instruction::is_tex() query. Copy and pasted from fs_inst::is_tex(), but without TXB. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
1d4f3ca8f0442821c914b758b323e6e5124149a3 |
|
29-Sep-2011 |
Kenneth Graunke <kenneth@whitecape.org> |
i965/vs: Implement integer quotient and remainder math operations. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Tested-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
c662764f4f9d9d0303fb2685dfdc93824fa15dca |
|
06-Sep-2011 |
Eric Anholt <eric@anholt.net> |
i965/vs: Add support for compute-to-MRF. Removes 1.8% of the instructions from 97% of the vertex shaders in shader-db.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
160848d8ef96cf3a760c02cc576df7dbffc1f669 |
|
06-Sep-2011 |
Eric Anholt <eric@anholt.net> |
i965/vs: Add a function for how many MRFs get written as part of a SEND. This will be used for compute-to-MRF, which needs to know when MRFs get overwritten.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
f0c04e6c22babf2aee2ad1ee85dbd6f996be3712 |
|
03-Sep-2011 |
Eric Anholt <eric@anholt.net> |
i965/vs: Add support for simple algebraic optimizations. We generate silly code for array access, and it's easier to generally support the cleanup than to specifically avoid the bad code in each place we might generate it. Removes 4.6% of instructions from 41.6% of shaders in shader-db, particularly savage2/hon and unigine. v2: Fixes by Ken: Make is_zero/one member functions, and fix a progress flag. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
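As a rough illustration of what a "simple algebraic optimization" pass with is_zero/is_one member functions and a progress flag might look like, here is a minimal standalone sketch. The Op, Src, and Inst types, the make_reg/make_imm helpers, and the specific rewrites chosen (x + 0, x * 1, x * 0) are assumptions for illustration; the commit message does not list which patterns the real pass handles.

```cpp
#include <cassert>
#include <vector>

enum class Op { MOV, ADD, MUL };

// Hypothetical source operand: either an immediate float or a register.
struct Src {
    bool is_imm = false;
    float imm = 0.0f;
    int reg = -1;

    bool is_zero() const { return is_imm && imm == 0.0f; }
    bool is_one() const { return is_imm && imm == 1.0f; }
};

struct Inst {
    Op op;
    int dst;
    Src src0;
    Src src1;
};

Src make_reg(int r) { Src s; s.reg = r; return s; }
Src make_imm(float v) { Src s; s.is_imm = true; s.imm = v; return s; }

// Rewrites trivially simplifiable instructions into MOVs.
// Returns true if anything changed, so a caller can re-run other passes.
bool opt_algebraic(std::vector<Inst> &insts)
{
    bool progress = false;
    for (Inst &inst : insts) {
        if (inst.op == Op::ADD && inst.src1.is_zero()) {
            // x + 0 -> x
            inst.op = Op::MOV;
            inst.src1 = Src();
            progress = true;
        } else if (inst.op == Op::MUL && inst.src1.is_zero()) {
            // x * 0 -> 0 (ignores NaN/Inf corner cases in this sketch)
            inst.op = Op::MOV;
            inst.src0 = inst.src1;
            inst.src1 = Src();
            progress = true;
        } else if (inst.op == Op::MUL && inst.src1.is_one()) {
            // x * 1 -> x
            inst.op = Op::MOV;
            inst.src1 = Src();
            progress = true;
        }
    }
    return progress;
}
```

The progress flag matters because a driver typically loops over its optimization passes until none of them reports a change; a pass that forgets to set it can cause later cleanups to be skipped.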
cc9eb936c220267b6130b705fc696d05906a31df |
|
02-Sep-2011 |
Eric Anholt <eric@anholt.net> |
i965/vs: Add support for copy propagation of the UNIFORM and ATTR files. Removes 2.0% of the instructions from 35.7% of vertex shaders in shader-db.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
42ce13195b94d0d51ca8e7fa5eed07fde8f37988 |
|
30-Aug-2011 |
Eric Anholt <eric@anholt.net> |
i965/vs: Add constant propagation to a few opcodes. This differs from the FS in that we track constants in each destination channel, and we have to look at all the swizzled source channels. Also, the instruction stream walk is done in an O(n) manner instead of O(n^2). Across shader-db, this reduces 8.0% of the instructions from 60.0% of the vertex shaders, leaving us now behind the old backend by 11.1% overall.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
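The per-channel tracking mentioned in the commit above can be sketched as follows. This is only an illustration under stated assumptions: ChannelConstants and try_constant_propagate() are hypothetical names, and the rule shown (a swizzled source can become a single immediate only if every channel it reads holds the same known constant) is one plausible reading of "look at all the swizzled source channels", not a copy of the driver logic.

```cpp
#include <cassert>

// Hypothetical record of which channels of a register hold a known constant.
struct ChannelConstants {
    bool known[4];
    float value[4];
};

// swizzle[i] names the register channel (0..3 for x..w) that source
// channel i reads. Returns true and stores the immediate in *imm_out if
// every channel read through the swizzle holds the same known constant.
bool try_constant_propagate(const ChannelConstants &reg,
                            const int swizzle[4], float *imm_out)
{
    float candidate = 0.0f;
    for (int i = 0; i < 4; i++) {
        int chan = swizzle[i];
        assert(chan >= 0 && chan < 4);
        if (!reg.known[chan])
            return false;               // channel value is not a constant
        if (i > 0 && reg.value[chan] != candidate)
            return false;               // channels disagree
        candidate = reg.value[chan];
    }
    *imm_out = candidate;
    return true;
}
```

For example, a register whose .x and .y are both known to be 2.0 can be replaced by the immediate 2.0 when read as .xxyy, but not when read as .xyzw if .z and .w are unknown.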
df35d691807656d3627b6fa6f51a08674bdc043e |
|
07-Sep-2011 |
Eric Anholt <eric@anholt.net> |
i965/vs: Add support for overflowing the number of available push constants. Fixes glsl-vs-uniform-array-4. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=33742 Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
72cfc6f3778d8297e52c254a5861a88eb62e4d67 |
|
23-Aug-2011 |
Eric Anholt <eric@anholt.net> |
i965/vs: Pack live uniform vectors together in the push constant upload. At some point we need to also move uniform accesses out to pull constants when there are just too many in use, but we lack tests for that at the moment. Fixes glsl-vs-large-uniform-array. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
7c84b9d303345fa5075dba8c4ea7af449d93b0f8 |
|
23-Aug-2011 |
Eric Anholt <eric@anholt.net> |
i965/vs: Track uniforms as separate vectors once we've done array access. This will make it easier to figure out which elements are totally unused and not upload them. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
8174945d3346dc049ae56dcb4bf1eab39f5c88aa |
|
17-Aug-2011 |
Eric Anholt <eric@anholt.net> |
i965/vs: Add simple dead code elimination. This is copied right from the fragment shader. It is needed for real register allocation to work correctly.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
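For readers unfamiliar with the technique, a simple dead code elimination pass can be sketched like this. The Inst layout and dead_code_eliminate() signature are assumptions for illustration, and this version decides deadness from a plain "is this register ever read" scan, which may differ from how the pass copied from the fragment shader backend actually decides it.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical instruction: writes virtual register `dst`, reads `srcs`.
// Instructions with side effects (e.g. writing shader outputs) are kept.
struct Inst {
    int dst;
    std::vector<int> srcs;
    bool has_side_effects;
};

// Removes instructions whose destination register is never read.
// Returns true if anything was removed; call repeatedly until it returns
// false, since removing one instruction can make its inputs dead too.
bool dead_code_eliminate(std::vector<Inst> &insts, int num_regs)
{
    std::vector<bool> is_read(num_regs, false);
    for (const Inst &inst : insts) {
        for (int src : inst.srcs) {
            assert(src >= 0 && src < num_regs);
            is_read[src] = true;
        }
    }

    auto is_dead = [&](const Inst &inst) {
        return !inst.has_side_effects && !is_read[inst.dst];
    };
    auto first_dead = std::remove_if(insts.begin(), insts.end(), is_dead);
    bool progress = first_dead != insts.end();
    insts.erase(first_dead, insts.end());
    return progress;
}
```

Removing unused results before register allocation means the allocator never has to find a home for a value nobody reads.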
3dadc1e3cceac80a1b63cad2e10f0e0f8904531b |
|
17-Aug-2011 |
Eric Anholt <eric@anholt.net> |
i965/vs: Copy the live intervals calculation over from the FS. This is a rather pessimistic calculation, since it doesn't distinguish individual channels of a vec4, or elements of an array, but should be a minimum start for register allocation.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4.cpp
|
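The pessimistic calculation described in the commit above, which treats each virtual register as a whole rather than per channel or per array element, can be sketched roughly as below. The Inst and Interval types and both function names are hypothetical, and the sketch ignores control flow (loops would need intervals extended across the loop body).

```cpp
#include <cassert>
#include <vector>

// Hypothetical instruction: writes `dst` (or -1 for none), reads `srcs`.
struct Inst {
    int dst;
    std::vector<int> srcs;
};

// Whole-register live range: first and last instruction index touching it.
// Both fields are -1 if the register is never referenced.
struct Interval {
    int def;
    int use;
};

std::vector<Interval> calculate_live_intervals(const std::vector<Inst> &insts,
                                               int num_regs)
{
    std::vector<Interval> live(num_regs, Interval{-1, -1});
    auto touch = [&](int reg, int ip) {
        if (reg < 0)
            return;
        assert(reg < num_regs);
        if (live[reg].def < 0)
            live[reg].def = ip;
        live[reg].use = ip;
    };
    for (int ip = 0; ip < static_cast<int>(insts.size()); ip++) {
        for (int src : insts[ip].srcs)
            touch(src, ip);
        touch(insts[ip].dst, ip);
    }
    return live;
}

// Two registers need different physical registers if their ranges overlap.
// A range ending at the same instruction where another begins does not count.
bool intervals_interfere(const Interval &a, const Interval &b)
{
    if (a.def < 0 || b.def < 0)
        return false;
    return a.def < b.use && b.def < a.use;
}
```

Because a write to any single channel extends the range of the whole register, this over-approximates liveness, which is safe for allocation but may use more registers than a per-channel analysis would.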