History log of /external/mesa3d/src/gallium/drivers/r600/sb/sb_dce_cleanup.cpp
Revision Date Author Comments (<<< Hide modified files) (Show modified files >>>)
e933246013eef376804662f3fcf4646c143c6c88 20-Nov-2016 Heiko Przybyl <lil_tux@web.de> r600/sb: Fix loop optimization related hangs on eg

Make sure unused ops and their references are removed, prior to entering
the GCM (global code motion) pass, to stop GCM from breaking the loop
logic and thus hanging the GPU.

Turns out, that sb has problems with loops and node optimizations
regarding associative folding:

- the global code motion (gcm) pass moves ops up a loop level/basic block
until they've fulfilled their total usage count
- if there are ops folded into others, the usage count won't be
fulfilled and thus the op moved way up to the top
- within GCM the op would be visited and their deps would be moved
alongside it, to fulfill the src constaints
- in a loop, an unused op is moved out of the loop and GCM would move
the src value ops up as well
- now here arises the problem: if the loop counter is one of the src
values it would get moved up as well, the loop break condition would
never get hit and the shader turn into an endless loop, resulting in the
GPU hanging and being reset

A reduced (albeit nonsense) piglit example would be:

[require]
GLSL >= 1.20

[fragment shader]

uniform int SIZE;
uniform vec4 lights[512];

void main()
{
float x = 0;
for(int i = 0; i < SIZE; i++)
x += lights[2*i+1].x;
}

[test]
uniform int SIZE 1
draw rect -1 -1 2 2

Which gets optimized to:

===== SHADER #12 OPT ================================== PS/BARTS/EVERGREEN =====
===== 42 dw ===== 1 gprs ===== 2 stack =========================================
ALU 3 @24
1 y: MOV R0.y, 0
t: MULLO_UINT R0.w, [0x00000002 2.8026e-45].x, R0.z

LOOP_START_DX10 @22
PUSH @6
ALU 1 @30 KC0[CB0:0-15]
2 M x: PRED_SETGE_INT __.x, R0.z, KC0[0].x
JUMP @14 POP:1
LOOP_BREAK @20
POP @14 POP:1
ALU 2 @32
3 x: ADD_INT R0.x, R0.w, [0x00000002 2.8026e-45].x

TEX 1 @36
VFETCH R0.x___, R0.x, RID:0 MFC:16 UCF:0 FMT[..]
ALU 1 @40
4 y: ADD R0.y, R0.y, R0.x
LOOP_END @4
EXPORT_DONE PIXEL 0 R0.____ EOP
===== SHADER_END ===============================================================

Notice R0.z being the loop counter/break condition relevant register
and being never incremented at all. Also some of the loop content
has been moved out of it, to fulfill the requirements for the one unused
op.

With a debug build of mesa this would produce an error like
error at : PRED_SETGE_INT __, __, EM.2, R1.x.2||FP@R0.z, C0.x
: operand value R1.x.2||FP@R0.z was not previously written to its gpr
and the compilation would fail due to this. On a release build it gets
passed to the GPU.

When using this patch, the loop remains intact:

===== SHADER #12 OPT ================================== PS/BARTS/EVERGREEN =====
===== 48 dw ===== 1 gprs ===== 2 stack =========================================
ALU 2 @24
1 y: MOV R0.y, 0
z: MOV R0.z, 0
LOOP_START_DX10 @22
PUSH @6
ALU 1 @28 KC0[CB0:0-15]
2 M x: PRED_SETGE_INT __.x, R0.z, KC0[0].x
JUMP @14 POP:1
LOOP_BREAK @20
POP @14 POP:1
ALU 4 @30
3 t: MULLO_UINT T0.x, [0x00000002 2.8026e-45].x, R0.z

4 x: ADD_INT R0.x, T0.x, [0x00000002 2.8026e-45].x

TEX 1 @40
VFETCH R0.x___, R0.x, RID:0 MFC:16 UCF:0 FMT[..]
ALU 2 @44
5 y: ADD R0.y, R0.y, R0.x
z: ADD_INT R0.z, R0.z, 1
LOOP_END @4
EXPORT_DONE PIXEL 0 R0.____ EOP
===== SHADER_END ===============================================================

Piglit: ./piglit summary console -d results/*_gpu_noglx
name: unpatched_gpu_noglx patched_gpu_noglx
---- ------------------- -----------------
pass: 18016 18021
fail: 748 743
crash: 7 7
skip: 1124 1124
timeout: 0 0
warn: 13 13
incomplete: 0 0
dmesg-warn: 0 0
dmesg-fail: 0 0
changes: 0 5
fixes: 0 5
regressions: 0 0
total: 19908 19908

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94900
Tested-by: Heiko Przybyl <lil_tux@web.de>
Tested-on: Barts PRO HD6850
Signed-off-by: Heiko Przybyl <lil_tux@web.de>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
/external/mesa3d/src/gallium/drivers/r600/sb/sb_dce_cleanup.cpp
62c8149472903a2f3fc4d319c3799b9e729419d5 14-Oct-2013 Vadim Girlin <vadimgirlin@gmail.com> r600g/sb: fix issue with DCE between GVN and GCM (v2)

We can't perform DCE using the liveness pass between GVN and GCM
because it relies on the correct schedule, but GVN doesn't care about
preserving correctness - it's rescheduled later by GCM.

This patch makes dce_cleanup pass perform simple DCE
between GVN and GCM instead of relying on liveness pass.

Fixes https://bugs.freedesktop.org/show_bug.cgi?id=70088

Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
/external/mesa3d/src/gallium/drivers/r600/sb/sb_dce_cleanup.cpp
6ba7a162b6fb38e588784fa14426a61deca1143c 30-Apr-2013 Vadim Girlin <vadimgirlin@gmail.com> r600g/sb: don't propagate dead values in GVN pass

In some cases we use value::gvn_source field to link values that
are known to be equal before gvn pass (e.g. results of DOT4 in different
slots of the same alu group), but then source value may become dead later
and this confuses further passes.

This patch resets value::gvn_source to NULL in the dce_cleanup pass
if it points to dead value.

Fixes segfault during shader optimization with ETQW.

Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
/external/mesa3d/src/gallium/drivers/r600/sb/sb_dce_cleanup.cpp
2cd769179345799d383f92dd615991755ec24be1 30-Apr-2013 Vadim Girlin <vadimgirlin@gmail.com> r600g/sb: initial commit of the optimizing shader backend
/external/mesa3d/src/gallium/drivers/r600/sb/sb_dce_cleanup.cpp