3e135728268cf36a176dcd915108ad7dc0f4e457 |
|
04-Apr-2016 |
Rob Clark <robclark@freedesktop.org> |
freedreno/ir3: deal with duplicate phi sources

Otherwise we end up with funny things like:

  mov.f32f32 r0.x, r1.y
  mov.f32f32 r0.x, r1.y

(It doesn't happen as much after fixing the problem w/ CP into phi src, but it can still happen since we aren't too clever about generating phi sources in the first place.)

Signed-off-by: Rob Clark <robclark@freedesktop.org>
/external/mesa3d/src/gallium/drivers/freedreno/ir3/ir3_sched.c
|
19739e4fb9024f42a8fc332e6fa94c292bb6bc16 |
|
27-Mar-2016 |
Rob Clark <robclark@freedesktop.org> |
freedreno/ir3: remove ir3_instruction::category Signed-off-by: Rob Clark <robclark@freedesktop.org>
/external/mesa3d/src/gallium/drivers/freedreno/ir3/ir3_sched.c
|
fad158a0e01f4c28851477e6d1eb5c8fd67e226b |
|
10-Jan-2016 |
Rob Clark <robclark@freedesktop.org> |
freedreno/ir3: array rework Signed-off-by: Rob Clark <robclark@freedesktop.org>
/external/mesa3d/src/gallium/drivers/freedreno/ir3/ir3_sched.c
|
4b18d51756e9099710bfe421657b3b2034e1497f |
|
30-Nov-2015 |
Rob Clark <robclark@freedesktop.org> |
freedreno/ir3: convert scheduler back to recursive algo

I've played with a few different approaches to tweak instruction priority according to how much they increase/decrease register pressure, etc. But nothing seems to change the fact that compared to original (pre-multiple-block-support) scheduler, in some edge cases we are generating shaders w/ 5-6x higher register usage.

The problem is that the priority queue approach completely loses the dependency between instructions, and ends up scheduling all paths at the same time.

Original reason for switching was that recursive approach relied on starting from the shader outputs array. But we can achieve more or less the same thing by starting from the depth-sorted list.

shader-db results:

  total instructions in shared programs: 113350 -> 105183 (-7.21%)
  total dwords in shared programs: 219328 -> 211168 (-3.72%)
  total full registers used in shared programs: 7911 -> 7383 (-6.67%)
  total half registers used in shared programs: 109 -> 109 (0.00%)
  total const registers used in shared programs: 21294 -> 21294 (0.00%)

          half  full  const  instr  dwords
  helped     0   322      0    711     215
  hurt       0   163      0     38       4

The shaders hurt tend to gain a register or two. While there are also a lot of helped shaders that only lose a register or two, the more complex ones tend to lose significantly more registers. In some more extreme cases, like glsl-fs-convolution-1.shader_test, it is more like 7 vs 34 registers!

Signed-off-by: Rob Clark <robclark@freedesktop.org>
/external/mesa3d/src/gallium/drivers/freedreno/ir3/ir3_sched.c
|
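The contrast drawn in the message above can be illustrated with a minimal sketch of the recursive idea: before an instruction is emitted, everything it depends on is emitted first, so one dependency chain is finished before the next one starts. The `toy_node` type and the `toy_sched_*` functions are hypothetical names made up for this illustration and are not the code in ir3_sched.c; delay slots, address/predicate handling, and register pressure are all ignored.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_SRCS 4

/* Hypothetical stand-in for an IR instruction (not the real ir3_instruction). */
struct toy_node {
    int id;
    int scheduled;                        /* nonzero once emitted */
    size_t nsrcs;
    struct toy_node *srcs[TOY_MAX_SRCS];  /* instructions this one depends on */
};

/* Emit everything 'node' depends on (depth first), then 'node' itself.
 * Appends ids to 'out' and returns the updated count. */
size_t toy_sched_recursive(struct toy_node *node, int *out, size_t n)
{
    assert(node != NULL);
    if (node->scheduled)
        return n;
    for (size_t i = 0; i < node->nsrcs; i++)
        n = toy_sched_recursive(node->srcs[i], out, n);
    node->scheduled = 1;
    out[n++] = node->id;
    return n;
}

/* Walk a list of starting points (assumed here to be sorted deepest first)
 * and schedule each one in turn. 'out' must have room for every node. */
size_t toy_sched_all(struct toy_node **roots, size_t nroots, int *out)
{
    size_t n = 0;
    for (size_t i = 0; i < nroots; i++)
        n = toy_sched_recursive(roots[i], out, n);
    return n;
}
```

In this sketch a node shared by two chains is emitted only once, the first time a chain needs it.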
2181f2cd58f2af1e216618fc6889e23697cec325 |
|
26-Nov-2015 |
Rob Clark <robclark@freedesktop.org> |
freedreno/ir3: use instr flag to mark unused instructions Rather than magic depth value, which won't be available in later stages. Signed-off-by: Rob Clark <robclark@freedesktop.org>
/external/mesa3d/src/gallium/drivers/freedreno/ir3/ir3_sched.c
|
e44845472a4e04e7b6a82ab6c768f9648729d7e9 |
|
06-Jul-2015 |
Rob Clark <robclark@freedesktop.org> |
freedreno/ir3/sched: fixup new instr's block If we split addr/pred, the original instruction could have originated from a different block. If we don't fixup the block ptr we hit asserts later (in debug builds). NOTE: perhaps we don't want to try to preserve addr/pred reg's across block boundaries.. this at least needs some thought in case addr/pred writes end up inside a conditional block.. Signed-off-by: Rob Clark <robclark@freedesktop.org>
/external/mesa3d/src/gallium/drivers/freedreno/ir3/ir3_sched.c
|
a84505c71920f2c70bc8d83cee3e223cd2d976ad |
|
02-Jul-2015 |
Rob Clark <robclark@freedesktop.org> |
freedreno/ir3: don't be confused by eliminated indirects If an instruction using address register value gets eliminated, we need to remove it from the indirects list, otherwise it causes mayhem in sched for scheduling address register usage. Signed-off-by: Rob Clark <robclark@freedesktop.org>
/external/mesa3d/src/gallium/drivers/freedreno/ir3/ir3_sched.c
|
2215ff2a5d5f1df5791399e1ff78b56bf06e9102 |
|
02-Jul-2015 |
Rob Clark <robclark@freedesktop.org> |
freedreno/ir3: sched fixes for addr register usage

A handful of fixes and cleanups:

1) If we split addr/pred, we need the newly created instruction to end up in the unscheduled_list

2) Avoid scheduling a write to the address register if there is no instruction using the address register that is otherwise ready to schedule. Note that I currently don't bother with the same logic for predicate register, since the only instructions using predicate (br/kill) don't take any other src registers, so this situation should not arise.

3) few other cosmetic cleanups

Signed-off-by: Rob Clark <robclark@freedesktop.org>
/external/mesa3d/src/gallium/drivers/freedreno/ir3/ir3_sched.c
|
6b9f5cd5f7b25e9e03104fe279df74817f69fe87 |
|
02-Jul-2015 |
Rob Clark <robclark@freedesktop.org> |
freedreno/ir3: fix indirects tracking

cp would update instr->address but not update the indirects array, resulting in sched getting confused when it had to 'spill' the address register.

Add an ir3_instr_set_address() helper to set instr->address and also update ir->indirects, and update all places that were writing instr->address to use helper instead.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
/external/mesa3d/src/gallium/drivers/freedreno/ir3/ir3_sched.c
|
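The pattern described in the message above, one setter that keeps a per-instruction field and a shader-wide list in sync, might look roughly like the following. This is a simplified sketch with hypothetical `toy_*` types and a fixed-size list; it is not the actual ir3_instr_set_address() implementation.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_INDIRECTS 16

/* Hypothetical instruction: 'address' points at the instruction that
 * writes the address register value this instruction uses. */
struct toy_addr_instr {
    struct toy_addr_instr *address;
};

/* Hypothetical shader-level state: every instruction that uses the
 * address register is expected to appear in 'indirects'. */
struct toy_shader {
    struct toy_addr_instr *indirects[TOY_MAX_INDIRECTS];
    size_t indirects_count;
};

/* Single place that writes instr->address, so the indirects list
 * cannot fall out of date when a pass rewrites the address source. */
void toy_set_address(struct toy_shader *ir, struct toy_addr_instr *instr,
                     struct toy_addr_instr *addr)
{
    instr->address = addr;

    /* Already tracked: nothing more to record. */
    for (size_t i = 0; i < ir->indirects_count; i++)
        if (ir->indirects[i] == instr)
            return;

    assert(ir->indirects_count < TOY_MAX_INDIRECTS);
    ir->indirects[ir->indirects_count++] = instr;
}
```

Routing every write through one helper is what removes the class of bug the commit describes, where one pass updated the field directly and left the list stale.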
457f7c2a2a93b45396ac66e0d4b3896d2db8fdf3 |
|
09-Jun-2015 |
Rob Clark <robdclark@gmail.com> |
freedreno/ir3: block reshuffling and loops!

This shuffles things around to allow the shader to have multiple basic blocks. We drop the entire CFG structure from nir and just preserve the blocks. At scheduling we know whether to schedule conditional branches or unconditional jumps at the end of the block based on the # of block successors. (Dropping jumps to the following instruction, etc.)

One slight complication is that variables (load_var/store_var, ie. arrays) are not in SSA form, so we have to figure out where to put the phis ourselves. For this, we use the predecessor set information from nir_block. (We could perhaps use NIR's dominance frontier information to help with this?)

Signed-off-by: Rob Clark <robclark@freedesktop.org>
/external/mesa3d/src/gallium/drivers/freedreno/ir3/ir3_sched.c
|
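The successor-count rule mentioned in the message above can be sketched as follows. The `toy_block` type, the `toy_term` enum, and the exact policy (a block with two successors ends in a conditional branch, a block with one successor ends in a jump unless that successor is the next block in layout order) are assumptions made for illustration, not the real ir3 data structures.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical basic block: up to two successors, NULL meaning "none". */
struct toy_block {
    struct toy_block *successors[2];
};

enum toy_term { TOY_TERM_NONE, TOY_TERM_JUMP, TOY_TERM_BRANCH };

/* Decide what kind of terminator a block needs. 'next' is the block that
 * immediately follows 'blk' in the final layout (NULL if it is last). */
enum toy_term toy_pick_terminator(const struct toy_block *blk,
                                  const struct toy_block *next)
{
    assert(blk != NULL);

    /* Two successors: control flow depends on a condition. */
    if (blk->successors[0] && blk->successors[1])
        return TOY_TERM_BRANCH;

    /* One successor: jump, unless execution falls through to it anyway. */
    if (blk->successors[0])
        return blk->successors[0] == next ? TOY_TERM_NONE : TOY_TERM_JUMP;

    /* No successors: end of the shader. */
    return TOY_TERM_NONE;
}
```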
c8fb5f8a011e1db78af3ceaf91c5cb3b1acaee14 |
|
25-May-2015 |
Rob Clark <robclark@freedesktop.org> |
freedreno/ir3: move inputs/outputs to shader These belong in the shader, rather than the block. Mostly a lot of churn and nothing too interesting. But splitting this out from the rest of ir3_block reshuffling to cut down the noise in the later patch. Signed-off-by: Rob Clark <robclark@freedesktop.org>
/external/mesa3d/src/gallium/drivers/freedreno/ir3/ir3_sched.c
|
7273cb4e933f8be65fc73b9d8c69c76d1078cb14 |
|
30-Apr-2015 |
Rob Clark <robclark@freedesktop.org> |
freedreno/ir3/sched: convert to priority queue Use a more standard priority-queue based scheduling algo. It is simpler and will make things easier once we have multiple basic blocks and flow control. Signed-off-by: Rob Clark <robclark@freedesktop.org>
/external/mesa3d/src/gallium/drivers/freedreno/ir3/ir3_sched.c
|
adf1659ff5f07d907eca552be3b566e408c8601e |
|
30-Apr-2015 |
Rob Clark <robclark@freedesktop.org> |
freedreno/ir3: use standard list implementation Use standard list_head double-linked list and related iterators, helpers, etc, rather than weird combo of instruction array and next pointers depending on stage. Now block has an instrs_list. In certain stages where we want to remove and re-add to the blocks list we just use list_replace() to copy the list to a new list_head. Signed-off-by: Rob Clark <robclark@freedesktop.org>
/external/mesa3d/src/gallium/drivers/freedreno/ir3/ir3_sched.c
|
5c8c2e2f97394436effbdd3e0f61eec4590accb2 |
|
25-Apr-2015 |
Rob Clark <robclark@freedesktop.org> |
freedreno/ir3: more builder helpers Use ir3_MOV() builder in a couple of spots, rather than open-coding the instruction construction. Also add ir3_NOP() builder and use that instead of open coding. Signed-off-by: Rob Clark <robclark@freedesktop.org>
/external/mesa3d/src/gallium/drivers/freedreno/ir3/ir3_sched.c
|
49be76166b0b3c93bd2287fabc31d76d143d314c |
|
08-Apr-2015 |
Rob Clark <robclark@freedesktop.org> |
freedreno/ir3/sched: avoid getting stuck on addr conflicts When we get in a scenario where we cannot schedule any more instructions due to address register conflict, clone the instruction that writes the address register, and switch the remaining unscheduled users for the current address register over to the new clone. This is simpler and more robust than the previous attempt (which tried and sometimes failed to ensure all other dependencies of users of the address register were scheduled first).. hint it would try to schedule instructions that were not actually needed for any output value. We probably need to do the same with predicate register, although so far it isn't so heavily used so we aren't running into problems with it (yet). Signed-off-by: Rob Clark <robclark@freedesktop.org>
/external/mesa3d/src/gallium/drivers/freedreno/ir3/ir3_sched.c
|
c7811f56c205b113dd820034a99ff3aaa20af636 |
|
04-Apr-2015 |
Ilia Mirkin <imirkin@alum.mit.edu> |
freedreno/ir3: insert nop between sfu/mem operations Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
/external/mesa3d/src/gallium/drivers/freedreno/ir3/ir3_sched.c
|
aee26d292f165438577426f5e62a62ec2a1514c9 |
|
18-Mar-2015 |
Rob Clark <robclark@freedesktop.org> |
freedreno/ir3: fix infinite recursion in sched One more case we need to handle. One of the src instructions for the indirect could also end up being ourself. Signed-off-by: Rob Clark <robclark@freedesktop.org>
/external/mesa3d/src/gallium/drivers/freedreno/ir3/ir3_sched.c
|
feb858b788cf27b31d12ad8a00805f015d4063cc |
|
11-Mar-2015 |
Rob Clark <robclark@freedesktop.org> |
freedreno/ir3: avoid scheduler deadlock Deadlock can occur if we schedule an address register write, yet some instructions which depend on that address register value also depend on other unscheduled instructions that depend on a different address register value. To solve this, before scheduling an address register write, ensure that all the other dependencies of the instructions which consume this address register are already scheduled. Signed-off-by: Rob Clark <robclark@freedesktop.org>
/external/mesa3d/src/gallium/drivers/freedreno/ir3/ir3_sched.c
|
7208e96bb810a7a6c92fd11bb7f4df8c9b7f1a2d |
|
11-Mar-2015 |
Rob Clark <robclark@freedesktop.org> |
freedreno/ir3: bit of cleanup Add an array_insert() macro to simplify inserting into dynamically sized arrays, add a comment, and remove unused prototype inherited from the original freedreno.git/fdre-a3xx test code, etc. Signed-off-by: Rob Clark <robclark@freedesktop.org>
/external/mesa3d/src/gallium/drivers/freedreno/ir3/ir3_sched.c
|
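An insert macro for dynamically sized arrays, as mentioned in the message above, could be written along these lines. `TOY_ARRAY_INSERT` is a hypothetical name, and the convention that an array `foo` is accompanied by `foo_count` and `foo_sz` variables is an assumption of this sketch rather than a description of the macro the commit added.

```c
#include <assert.h>
#include <stdlib.h>

/* Append 'val' to the growable array 'arr'.
 *
 * Assumed convention: for an array named 'foo' there are also
 * 'foo_count' (elements in use) and 'foo_sz' (elements allocated).
 * Capacity doubles whenever the array is full. Allocation failure is
 * only caught by assert() here; real code would want a proper error path. */
#define TOY_ARRAY_INSERT(arr, val) do {                                   \
        if ((arr ## _count) == (arr ## _sz)) {                            \
            size_t toy_new_sz_ = (arr ## _sz) ? 2 * (arr ## _sz) : 16;    \
            void *toy_tmp_ = realloc((arr),                               \
                                     toy_new_sz_ * sizeof((arr)[0]));     \
            assert(toy_tmp_ != NULL);                                     \
            (arr) = toy_tmp_;                                             \
            (arr ## _sz) = toy_new_sz_;                                   \
        }                                                                 \
        (arr)[(arr ## _count)++] = (val);                                 \
    } while (0)
```

A caller would declare something like `int *vals = NULL; size_t vals_count = 0, vals_sz = 0;` and then write `TOY_ARRAY_INSERT(vals, 42);`, releasing the storage with `free(vals)` when done.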
17754b70d78649f29e25dfe938de91d64dbf5ebf |
|
04-Feb-2015 |
Rob Clark <robclark@freedesktop.org> |
freedreno/ir3: drop deref nodes

The meta-deref instruction doesn't really do what we need for relative destination. Instead, since each instruction can reference at most a single address value, track the dependency on the address register via instr->address. This lets us express the dependency regardless of whether it is used for dst and/or src.

The foreach_ssa_src{_n} iterator macros now also iterate the address register so, at least in SSA form, the address register behaves as an additional virtual src to the instruction. Which is pretty much what we want, as far as scheduling/etc.

TODO: For now, the foreach_src{_n} iterators are unchanged. We could wrap the address in an ir3_register and make the foreach_src{_n} iterators behave the same way. But that seems unnecessary at this point, since we mainly care about the address dependency when in SSA form.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
/external/mesa3d/src/gallium/drivers/freedreno/ir3/ir3_sched.c
|
f8f7548f466509bf881db1826ef6dd23ffe2acdf |
|
02-Feb-2015 |
Rob Clark <robclark@freedesktop.org> |
freedreno/ir3: helpful iterator macros I remembered that we are using c99.. which makes some sugary iterator macros easier. So introduce iterator macros to iterate all src registers and all SSA src instructions. The _n variants also return the src #, since there are a handful of places that need this. Signed-off-by: Rob Clark <robclark@freedesktop.org>
/external/mesa3d/src/gallium/drivers/freedreno/ir3/ir3_sched.c
|
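The reason C99 helps here is that it allows variables to be declared inside a `for` statement, so an iterator macro can introduce its own loop variables. The following is a minimal sketch of that idea with hypothetical names (`toy_op`, `toy_foreach_src_n`); the macros added by the commit are defined differently and operate on ir3 registers and instructions.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_SRCS 4

/* Hypothetical instruction with plain integer sources. */
struct toy_op {
    size_t nsrcs;
    int srcs[TOY_MAX_SRCS];
};

/* Iterate all sources of 'op', binding 'src' to the value and 'n' to the
 * source index. Both are declared by the macro itself (C99 feature).
 * The inner one-shot loop exists only to declare 'src'; note that a
 * 'break' in the body therefore leaves just the current iteration. */
#define toy_foreach_src_n(src, n, op)                                    \
    for (size_t n = 0; n < (op)->nsrcs; n++)                             \
        for (int src = (op)->srcs[n], toy_once_ = 1; toy_once_; toy_once_ = 0)

/* Example user: sum of each source multiplied by its 1-based position. */
int toy_weighted_sum(const struct toy_op *op)
{
    assert(op != NULL);
    int sum = 0;
    toy_foreach_src_n(src, n, op) {
        sum += src * (int)(n + 1);
    }
    return sum;
}
```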
63e5b72da8b1df4bbb0fcf46524d106f51264605 |
|
07-Jan-2015 |
Rob Clark <robclark@freedesktop.org> |
freedreno/ir3: make reg array dynamic To use fanin's to group registers in an array, we can potentially have a much larger array of registers. Rather than continuing to bump up the array size, just make it dynamically allocated when the instruction is created. Signed-off-by: Rob Clark <robclark@freedesktop.org>
/external/mesa3d/src/gallium/drivers/freedreno/ir3/ir3_sched.c
|
8a0ffedd8de51eaf980855283c4525dba6dc5847 |
|
18-Oct-2014 |
Rob Clark <robclark@freedesktop.org> |
freedreno/ir3: fix potential gpu lockup with kill It seems like the hardware is unhappy if we execute a kill instruction prior to last input (ei). Probably the shader thread stops executing and the end-input flag is never set. Signed-off-by: Rob Clark <robclark@freedesktop.org>
/external/mesa3d/src/gallium/drivers/freedreno/ir3/ir3_sched.c
|
ab33a240890a7ef147d4b8cf35c27ae1932a1dbe |
|
18-Oct-2014 |
Rob Clark <robclark@freedesktop.org> |
freedreno/ir3: comment + better fxn name Signed-off-by: Rob Clark <robclark@freedesktop.org>
/external/mesa3d/src/gallium/drivers/freedreno/ir3/ir3_sched.c
|
73ff4c5f70286ffe72ce6a60b68a8274d7425478 |
|
04-Sep-2014 |
Rob Clark <robclark@freedesktop.org> |
freedreno/ir3: fix error in bail logic

all_delayed will also be true if we didn't attempt to schedule anything due to no more instructions using current addr/pred. We rely on coming in to block_sched_undelayed() to detect and clean up when there are no more uses of the current addr/pred, which isn't necessarily an error.

This fixes a regression introduced in b823abed.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
/external/mesa3d/src/gallium/drivers/freedreno/ir3/ir3_sched.c
|
b823abedf8d1ddba9aeaa43c4a239cb90664dae4 |
|
29-Aug-2014 |
Rob Clark <robclark@freedesktop.org> |
freedreno/ir3: detect scheduler fail There are some cases where the scheduler can get itself into impossible situations, by scheduling the wrong write to pred or addr register first. (Ie. it could end up being unable to schedule any instruction if some instruction which depends on the current addr/reg value also depends on another addr/reg value.) To solve this we'd need to be able to insert extra mov instructions (which would also help when register assignment gets into impossible situations). To do that, we'd need to move the nop padding from sched into legalize. But to start with, just detect when we get into an impossible situation and bail, rather than sitting forever in an infinite loop. This way it will at least fall back to the old compiler, which might even work if you are lucky. Signed-off-by: Rob Clark <robclark@freedesktop.org>
/external/mesa3d/src/gallium/drivers/freedreno/ir3/ir3_sched.c
|
db193e5ad06e7a2fbcffb3bb5df85d212eb12291 |
|
25-Jul-2014 |
Rob Clark <robclark@freedesktop.org> |
freedreno/ir3: split out shader compiler from a3xx

Move the bits we want to share between generations from fd3_program to ir3_shader. So overall structure is:

  fdN_shader_stateobj -> ir3_shader -> ir3_shader_variant -> ir3
                                    |- ...
                                    \- ir3_shader_variant -> ir3

So the ir3_shader becomes the topmost generation neutral object, which manages the set of variants, each of which generates, compiles, and assembles its own ir.

There is a bit of additional renaming to s/fd3_compiler/ir3_compiler/, etc.

Keep the split between the gallium level stateobj and the shader helper object because it might be a good idea to pre-compute some generation specific register values (ie. anything that is independent of linking).

Signed-off-by: Rob Clark <robclark@freedesktop.org>
/external/mesa3d/src/gallium/drivers/freedreno/ir3/ir3_sched.c
|