History log of /external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_dead_code_eliminate.cpp
Revision Date Author Comments
9919542f1cfff70524bc6117d19bf88e59159caa 15-Jan-2017 Kenneth Graunke <kenneth@whitecape.org> i965: Make DCE set null destinations on messages with side effects.

(Co-authored by Matt Turner.)

Image atomics, for example, return a value - but the shader may not
want to use it. We assigned a useless VGRF destination. This seemed
harmless, but it can actually be quite harmful. The register allocator
has to assign that VGRF to a real register. It may assign the same
actual GRF to the destination of an instruction that follows soon after.

This results in a write-after-write (WAW) dependency, and a stall.

A number of "Deus Ex: Mankind Divided" shaders use image atomics, but
don't use the return value. Several of these were hitting WAW stalls
for nearly 14,000 (poorly estimated) cycles a pop. Making dead code
elimination null out the destination avoids this issue.

This patch cuts one shader's estimated cycles by 98.39%! Removing the
message response should also help with data cluster bandwidth.
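
A minimal sketch of the idea, not the actual i965 code: when an
instruction has side effects but its result is never read, keep the
instruction and point its destination at the null register. The Inst
struct, the File enum and null_dead_destination() are hypothetical
names invented for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical, simplified IR; this is not the real fs_inst.
enum class File { Null, VGRF };

struct Inst {
    File dst_file;
    int  dst_vgrf;          // meaningful only when dst_file == File::VGRF
    bool has_side_effects;  // e.g. an image atomic
};

// live[v] is true when VGRF v is read later in the program.
// Returns true if the destination was replaced with the null register.
bool null_dead_destination(Inst &inst, const std::vector<bool> &live)
{
    if (inst.dst_file != File::VGRF)
        return false;

    assert(inst.dst_vgrf >= 0 &&
           static_cast<std::size_t>(inst.dst_vgrf) < live.size());

    // A live result must be kept. An instruction without side effects
    // is a candidate for full removal, which is handled elsewhere.
    if (live[inst.dst_vgrf] || !inst.has_side_effects)
        return false;

    // The message still executes, but no register needs to be
    // allocated for a value that nobody reads.
    inst.dst_file = File::Null;
    return true;
}
```

With no VGRF destination left, the register allocator has nothing to
assign for the unused return value, which is what avoids the WAW
dependency described above.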

On Skylake:

(instruction counts remain identical)

total cycles in shared programs: 255413890 -> 248081010 (-2.87%)
cycles in affected programs: 12019948 -> 4687068 (-61.01%)
helped: 24
HURT: 10

v2: Make can_omit_write independent of can_eliminate (Curro).

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_dead_code_eliminate.cpp
90bf39cd2b39874557a7c492d92b85945d45f3c6 15-Dec-2016 Kenneth Graunke <kenneth@whitecape.org> i965: Combine some dead code elimination NOP'ing code.

In theory we might have incorrectly NOP'd instructions that write the
flag, but where that flag value isn't used, and yet the instruction
either writes the accumulator or has side effects.

I don't believe any such instructions exist, so this is mostly a
code cleanup.

Curro pointed out that FS_OPCODE_FB_WRITE has a null destination and
actually writes the flag on Gen4-5 to dynamically decide whether to
write some payload data. The hunk removed in this patch might have
NOP'd it, except that we don't actually mark flags_written() in the
IR, so it doesn't think the flag is touched at all. That's sketchy,
but it means it wouldn't hit this today (though there are likely other
problems!).

v2: Properly replace the inst->regs_written() check in the second
hunk with the flag being live (mistake caught by Curro).

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_dead_code_eliminate.cpp
be5f53e769deb936509efd6f0576b15b7a5432b9 18-Jan-2017 Kenneth Graunke <kenneth@whitecape.org> i965: Make DCE explicitly not eliminate any control flow instructions.

According to Matt, the dead code pass explicitly avoided IF and WHILE
because on Sandybridge, these could have conditional modifiers and
null destination registers. Normally, those instructions use BAD_FILE
for the destination register. Nowadays, we don't do that anymore, so
we could technically drop these checks.

However, it's clearer to explicitly leave control flow instructions
alone, so change it to the more generic !inst->is_control_flow().

This should cause no functional change.

[This patch implements review feedback from Curro and Matt.]

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_dead_code_eliminate.cpp
4d4335c81a3f7d8434d9983881a63abcbc29dd5c 11-Oct-2016 Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com> i965/fs: fill allocated memory with zeros where needed

Signed-off-by: Marek Olšák <marek.olsak@amd.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_dead_code_eliminate.cpp
0bc46cc9619b8ae43e7a7c96bfe91f19371d301d 02-Sep-2016 Francisco Jerez <currojerez@riseup.net> i965/fs: Simplify result_live calculation in dead_code_eliminate().

No need to unroll the first iteration of the loop manually.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_dead_code_eliminate.cpp
c458eeb94620fbce0a37474fc292545002d67f76 08-Sep-2016 Francisco Jerez <currojerez@riseup.net> i965/fs: Add wrapper functions for fs_inst::regs_read and ::regs_written.

This is in preparation for dropping fs_inst::regs_read and
::regs_written in favor of more accurate alternatives expressed in
byte units. The main reason these wrappers are useful is that a
number of optimization passes implement dataflow analysis with
register granularity, so these helpers will come in handy once we've
switched register offsets and sizes to the byte representation. The
wrapper functions will also make sure that GRF misalignment (currently
neglected by most of the back-end) is taken into account correctly in
the calculation of regs_read and regs_written.
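
As a rough illustration of why misalignment matters once offsets and
sizes are expressed in bytes, the register count can be derived as
below. This is a sketch under stated assumptions: regs_touched() is a
hypothetical helper, and REG_SIZE is assumed to be 32 bytes per GRF.

```cpp
#include <cassert>

// Assumed register size in bytes (one GRF).
constexpr unsigned REG_SIZE = 32;

// Number of registers covered by an access of size_bytes bytes that
// starts offset_bytes bytes into the register file. An access that does
// not start on a register boundary can spill into one more register
// than its size alone would suggest.
inline unsigned regs_touched(unsigned offset_bytes, unsigned size_bytes)
{
    assert(size_bytes > 0);
    const unsigned misalignment = offset_bytes % REG_SIZE;
    return (misalignment + size_bytes + REG_SIZE - 1) / REG_SIZE;
}
```

For example, a 32-byte access at offset 0 covers one register, while
the same access at offset 16 straddles two.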

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_dead_code_eliminate.cpp
7d430fc05e8f0a6211fb587f1bc7b2a76ed7de10 19-May-2016 Francisco Jerez <currojerez@riseup.net> i965/fs: Clean up remaining uses of fs_inst::reads_flag and ::writes_flag.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_dead_code_eliminate.cpp
0fec265373f269d116f6d4de900b208fffabe2a1 19-May-2016 Francisco Jerez <currojerez@riseup.net> i965/fs: Track flag register liveness with byte granularity.

This is required for correctness in presence of multiple 8-wide flag
writes (e.g. 8-wide instructions with a conditional mod set) which
update a different portion of the same 16-bit flag subregister. Right
now we keep track of flag dataflow with 16-bit granularity and
consider flag writes to have killed any previous definition of the
same subregister even if the write was less than 16 channels wide,
which can cause live flag register updates to be dead code-eliminated
incorrectly.

Additionally, this makes sure that we handle 32-wide flag writes and
reads, which may span multiple flag subregisters. For those, the
current approach of just setting/testing a single bit from the live
set wouldn't have worked.
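
The byte-granular bookkeeping can be sketched as a bitmask with one bit
per byte of flag storage. This is an illustration only:
flag_byte_mask() and its parameters are hypothetical, and it assumes
16-bit flag subregisters with one flag bit per channel.

```cpp
#include <cassert>
#include <cstdint>

// Returns a mask with one bit set per flag byte touched.
//   subreg        - index of the 16-bit flag subregister
//   first_channel - first channel covered (a multiple of 8)
//   width         - number of channels covered (8, 16 or 32)
inline std::uint32_t flag_byte_mask(unsigned subreg,
                                    unsigned first_channel,
                                    unsigned width)
{
    assert(width > 0 && width <= 32 && width % 8 == 0);
    assert(first_channel % 8 == 0);

    const unsigned first_byte = subreg * 2 + first_channel / 8;
    const unsigned num_bytes = width / 8;
    assert(first_byte + num_bytes <= 32);

    return ((1u << num_bytes) - 1u) << first_byte;
}
```

With this layout, a 16-wide read of subregister 0 makes bytes 0 and 1
live (mask 0x3). An 8-wide write to channels 0..7 only kills byte 0
(mask 0x1), so byte 1 stays live and an earlier write to it is not
eliminated. A 32-wide access yields 0xF, covering two subregisters.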

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_dead_code_eliminate.cpp
7e6a6f3e619e7dfed244043a95082f2168a5c953 30-Nov-2015 Matt Turner <mattst88@gmail.com> i965: Do dead-code elimination in a single pass.

The first pass marked dead instructions as opcode = NOP, and a second
pass deleted those instructions so that the live ranges used in the
first pass wouldn't change.

But since we're walking the instructions in reverse order, we can just
do everything in one pass. The only thing we have to do is walk the
blocks in reverse as well.
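
A minimal single-block sketch of the one-pass approach, assuming
whole-register writes and no flag or accumulator tracking. The Inst
struct and dce_block() are hypothetical and much simpler than the real
pass.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical, simplified instruction.
struct Inst {
    int dst;                // VGRF written, or -1 for none
    std::vector<int> srcs;  // VGRFs read
    bool has_side_effects;
};

// Walks one basic block backwards and deletes dead instructions as it
// goes. `live` starts out as the live-out set of the block (one entry
// per VGRF). Returns the number of instructions removed.
std::size_t dce_block(std::vector<Inst> &insts, std::vector<bool> live)
{
    std::size_t removed = 0;

    for (std::size_t i = insts.size(); i-- > 0; ) {
        const Inst &inst = insts[i];
        const bool result_live = inst.dst >= 0 && live[inst.dst];

        if (!result_live && !inst.has_side_effects) {
            // Dead: delete it right away. Its sources are deliberately
            // not marked live, so their producers can die as well.
            insts.erase(insts.begin() + static_cast<std::ptrdiff_t>(i));
            ++removed;
            continue;
        }

        // Kept: the write ends the live range, the reads start one.
        if (inst.dst >= 0)
            live[inst.dst] = false;
        for (int src : inst.srcs)
            live[src] = true;
    }
    return removed;
}
```

Because the walk is backwards, every later use of a value has already
been seen by the time its producer is visited, so deletion never
invalidates a liveness fact that is still needed.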

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_dead_code_eliminate.cpp
48b4e88d3d2cfa2ccd912184cfdcbe559cd36ff0 26-Nov-2015 Matt Turner <mattst88@gmail.com> i965: Don't mark dead instructions' sources live.

Removes dead code from glsl-mat-from-int-ctor-03.shader_test.

Reported-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_dead_code_eliminate.cpp
b163aa01487ab5f9b22c48b7badc5d65999c4985 27-Oct-2015 Matt Turner <mattst88@gmail.com> i965: Rename GRF to VGRF.

The 2-bit hardware register file field is ARF, GRF, MRF, IMM.

Rename GRF to VGRF (virtual GRF) so that we can reuse the GRF name to
mean an assigned general purpose register.

Reviewed-by: Emil Velikov <emil.velikov@collabora.co.uk>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_dead_code_eliminate.cpp
c3d7caa1e006f00c3544a79a0be7d78904ce4177 22-Oct-2015 Alejandro Piñeiro <apinheiro@igalia.com> i965: check inst->predicate when clearing flag_live at dead code eliminate

Detected by Matt Turner while reviewing commit
a59359ecd22154cc2b3f88bb8c599f21af8a3934

Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_dead_code_eliminate.cpp
a3ee6c7d1991a90d22fae992c1cb94123e51ae54 06-Feb-2015 Francisco Jerez <currojerez@riseup.net> i965/fs: Remove dependency of fs_inst on the visitor class.

The fs_visitor argument of fs_inst::regs_read() wasn't used at all.

Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_dead_code_eliminate.cpp
3759a89ad358139ef981bd5d46261ee115762b94 22-Aug-2014 Matt Turner <mattst88@gmail.com> i965/fs: Eliminate null-dst instructions without side-effects.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_dead_code_eliminate.cpp
b37273b92431a2d986235774f04a9fba2aa1bf74 29-Oct-2014 Matt Turner <mattst88@gmail.com> i965/fs: Use const fs_reg & rather than a copy or pointer.

Also while we're touching var_from_reg, just make it an inline function.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_dead_code_eliminate.cpp
60d507c3c5c7caed57119df0ab4d824ad1ea85dc 29-Oct-2014 Matt Turner <mattst88@gmail.com> i965/fs: Dead code eliminate instructions writing the flag.

Most prominently helps Natural Selection 2, which has a surprising
number of shaders that do very complicated things before drawing black.

instructions in affected programs: 21052 -> 16978 (-19.35%)

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_dead_code_eliminate.cpp
13f660158573846d6b1bc30ed4c61d97405bea58 29-Oct-2014 Matt Turner <mattst88@gmail.com> i965: Use local pointer to block_data in live intervals.

This simplifies the next patch and makes the code a lot easier to
read.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_dead_code_eliminate.cpp
ab7234c8520499fcfeed153e0aefeb6b43758d1f 09-Sep-2014 Jason Ekstrand <jason.ekstrand@intel.com> i965/fs: Use the var_from_vgrf helper function instead of doing it manually

Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_dead_code_eliminate.cpp
c24dd54f973d1a42b0e2cc81aa219bb58f7523d9 24-Sep-2014 Jason Ekstrand <jason.ekstrand@intel.com> i965/fs: Fix a bug with dead_code_eliminate on large writes

Previously, if an instruction wrote to more than one register, we
implicitly assumed that it filled the entire register. We never hit this
before because the only time we did multi-register writes was things like
texturing which always wrote to all of the registers. However, with the
upcoming ability to do 16-wide instructions in SIMD8 and things of that
nature, we can have multi-register writes at offsets and we'll hit this.
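
The corrected check can be sketched as follows, assuming liveness is
tracked with one entry per register-sized slot. The slot layout and the
result_live() helper are assumptions made for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// live has one entry per register-sized slot. An instruction writes
// regs_written consecutive slots starting at first_slot, which need not
// be the first slot of its VGRF. The result is live if any written slot
// is live, so every slot is checked rather than only the first.
bool result_live(const std::vector<bool> &live,
                 std::size_t first_slot, std::size_t regs_written)
{
    assert(first_slot + regs_written <= live.size());

    for (std::size_t i = 0; i < regs_written; i++) {
        if (live[first_slot + i])
            return true;
    }
    return false;
}
```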

Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_dead_code_eliminate.cpp
072ea414d04f1b9a7bf06a00b9011e8ad521c878 01-Sep-2014 Matt Turner <mattst88@gmail.com> i965: Remove cfg-invalidating parameter from invalidate_live_intervals.

Everything has been converted to preserve the CFG.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_dead_code_eliminate.cpp
20a849b4aa63c7fce96b04de674a4c70f054ed9c 13-Jul-2014 Matt Turner <mattst88@gmail.com> i965: Use basic-block aware insertion/removal functions.

To avoid invalidating and recreating the control flow graph. Also stop
invalidating the CFG in places we didn't add or remove an instruction.

cfg calculations: 202951 -> 80307 (-60.43%)

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_dead_code_eliminate.cpp
596990d91e2a4c4a3a303c6c2da623bf1840771b 12-Jul-2014 Matt Turner <mattst88@gmail.com> i965: Add and use foreach_block macro.

Use this as an opportunity to rename 'block_num' to 'num'. block->num is
clear, and block->block_num has always been redundant.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_dead_code_eliminate.cpp
680fe0acb3e6569f7b9aab1913e9181d5a7eee2f 12-Jul-2014 Matt Turner <mattst88@gmail.com> i965: Add cfg to backend_visitor.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_dead_code_eliminate.cpp
2e90d1fb62a6ef53c15eff76e242c510145178a9 29-Jun-2014 Matt Turner <mattst88@gmail.com> i965/fs: Pass cfg to calculate_live_intervals().

We've often created the CFG immediately before, so use it when
available.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_dead_code_eliminate.cpp
266109736a9a69c3fdbe49fe1665a7a63c5cc122 25-Jun-2014 Matt Turner <mattst88@gmail.com> i965: Use typed foreach_in_list_safe instead of foreach_list_safe.

Acked-by: Ian Romanick <ian.d.romanick@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_dead_code_eliminate.cpp
bc2fbbafd216676ccc7c3abd794ecb7dd1fa631f 24-Jun-2014 Matt Turner <mattst88@gmail.com> i965: Add and use foreach_inst_in_block macros.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_dead_code_eliminate.cpp
b1dcdcde2e323f960833f5c7da65d5c2c20113c9 17-Mar-2014 Matt Turner <mattst88@gmail.com> i965/fs: Loop from 0 to inst->sources, not 0 to 3.

Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_dead_code_eliminate.cpp
306ed81b9363721058c568244f9860c5c8c819f4 04-Apr-2014 Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com> i965: Add writes_accumulator flag

Our hardware has an "accumulator" register, which can be used to store
intermediate results across multiple instructions. Many instructions
can implicitly write a value to the accumulator in addition to their
normal destination register. This is enabled by the "AccWrEn" flag.

This patch introduces a new flag, inst->writes_accumulator, which
allows us to express the AccWrEn notion in the IR. It also creates an
ALU2_ACC macro to easily define emitters for instructions that
implicitly write the accumulator.

Previously, we only supported implicit accumulator writes from the
ADDC, SUBB, and MACH instructions. We always enabled them on those
instructions, and left them disabled for other instructions.

To take advantage of the MAC (multiply-accumulate) instruction, we
need to be able to set AccWrEn on other types of instructions.
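
One consequence for dead code elimination, sketched under the
assumption that accumulator liveness is not tracked and an implicit
accumulator write is therefore always treated as needed. InstInfo and
this can_eliminate() are hypothetical and only illustrate the
predicate.

```cpp
// Hypothetical summary of an instruction as seen by DCE.
struct InstInfo {
    bool result_live;         // destination is read later
    bool flag_result_live;    // writes a flag value that is read later
    bool writes_accumulator;  // AccWrEn: implicit accumulator write
    bool has_side_effects;    // e.g. memory writes or messages
};

// An instruction may be removed only if nothing it produces is needed.
inline bool can_eliminate(const InstInfo &inst)
{
    return !inst.result_live &&
           !inst.flag_result_live &&
           !inst.writes_accumulator &&
           !inst.has_side_effects;
}
```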

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_dead_code_eliminate.cpp
18d12336b964cad54bbc0780380c3dcf625abb3d 14-Apr-2014 Matt Turner <mattst88@gmail.com> i965/fs: Clear variable from live-set if it's completely overwritten.

One program affected:

instructions in affected programs: 246 -> 244 (-0.81%)

Reviewed-by: Eric Anholt <eric@anholt.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_dead_code_eliminate.cpp
f34f39330bb41fb0a86930908de10353193a841d 13-Apr-2014 Matt Turner <mattst88@gmail.com> i965/fs: Reimplement dead_code_elimination().

total instructions in shared programs: 1653399 -> 1651790 (-0.10%)
instructions in affected programs: 92157 -> 90548 (-1.75%)
GAINED: 2
LOST: 2

Also significantly reduces the number of optimization loop iterations:

total loop iterations in shared programs: 39724 -> 31651 (-20.32%)
loop iterations in affected programs: 21617 -> 13544 (-37.35%)

Including some great pathological cases, like 29 -> 3 in Strike Suit
Zero and 24 -> 3 in Dota2.

Reviewed-by: Eric Anholt <eric@anholt.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_fs_dead_code_eliminate.cpp