fc10dc9fdea6ad7d04dfcdb8fd2e2d59ea67f68b |
|
22-Dec-2016 |
Rob Clark <robdclark@gmail.com> |
freedreno/ir3: rework location of driver constants Rework how we lay out driver constants (driver-params, UBO/TFBO buffer addresses, immediates) for more flexibility. For a5xx+ we need to deal with the fact that gpu ptrs are 64b instead of 32b, which makes the fixed offset scheme not work so well. While we are dealing with that we might also make the layout more dynamic to account for varying # of UBOs, etc. Signed-off-by: Rob Clark <robdclark@gmail.com>
/external/mesa3d/src/gallium/drivers/freedreno/ir3/ir3_cp.c
|
46ff17559b1369f0fe7dca000f51f077ffd1dae5 |
|
23-May-2016 |
Rob Clark <robclark@freedesktop.org> |
freedreno/ir3: disable cp for indirect src's The variable-indexing tests always had a few random fails, which I usually couldn't reproduce when running tests manually. Somehow recently this got a lot worse. I ported a couple of the shaders to GLES to see what blob does, and it also seems to be avoiding to cp indirect srcs. So I guess indirect w/ instructions other than cat1 (mov) are not totally reliable. Let's just switch that off until this is better understood. Signed-off-by: Rob Clark <robclark@freedesktop.org>
/external/mesa3d/src/gallium/drivers/freedreno/ir3/ir3_cp.c
|
f0a1f3de27831d1ccb8a51fc0c99d63f25fd6e2a |
|
01-May-2016 |
Rob Clark <robclark@freedesktop.org> |
freedreno/ir3: cp small negative integers too Signed-off-by: Rob Clark <robclark@freedesktop.org>
/external/mesa3d/src/gallium/drivers/freedreno/ir3/ir3_cp.c
|
173871dfb988c3e9fb74a8016d2b024619a5d918 |
|
11-Apr-2016 |
Rob Clark <robclark@freedesktop.org> |
freedreno/ir3: lower immeds to const Helps reduce register pressure and instruction counts for immediates that would otherwise require a mov into gpr. total instructions in shared programs: 4455332 -> 4369297 (-1.93%) total dwords in shared programs: 8807872 -> 8614432 (-2.20%) total full registers used in shared programs: 263062 -> 250846 (-4.64%) total half registers used in shader programs: 9845 -> 9845 (0.00%) total const registers used in shared programs: 1029735 -> 1466993 (42.46%) half full const instr dwords helped 0 10415 0 17861 5912 hurt 0 1157 21458 947 33 Signed-off-by: Rob Clark <robclark@freedesktop.org>
/external/mesa3d/src/gallium/drivers/freedreno/ir3/ir3_cp.c
|
b15c7fc268785cc8c960368d287ec799fe9dc502 |
|
11-Apr-2016 |
Rob Clark <robclark@freedesktop.org> |
freedreno/ir3: add ir3_cp_ctx Needed in next commit.. just split out to reduce noise. Signed-off-by: Rob Clark <robclark@freedesktop.org>
/external/mesa3d/src/gallium/drivers/freedreno/ir3/ir3_cp.c
|
dfd23abdcce7fb01da842d2fc69d54a04ecdfee2 |
|
25-Apr-2016 |
Rob Clark <robclark@freedesktop.org> |
freedreno: disallow cat4 immed src Normally this would never happen (constant-propagation in NIR would eliminate the instruction), except it does happen for 'undef' which we turn into immed 0.0 for bookkeeping purposes. Signed-off-by: Rob Clark <robclark@freedesktop.org>
/external/mesa3d/src/gallium/drivers/freedreno/ir3/ir3_cp.c
|
b35ad6e7016a877def4b3d2d22b3a2265a68e2eb |
|
07-Apr-2016 |
Rob Clark <robclark@freedesktop.org> |
freedreno/ir3: cleanup double cmps.s from frontend Since we cannot mov into a predicate register, the frontend uses a 'cmps.s p0.x, cond, 0' as a stand-in for mov to p0.x. It does this since it has no way to know that the source cond instruction (ie. for a kill, br, etc) will only be used to write the predicate reg. Detect this, and re-write the instruction writing p0.x to skip the original cmps.[sfu]. (It is done like this, rather than re-writing the dest of the first cmps.[sfu] in case the first cmps.[sfu] actually has other users.) Signed-off-by: Rob Clark <robclark@freedesktop.org>
/external/mesa3d/src/gallium/drivers/freedreno/ir3/ir3_cp.c
|
f9cdbf44054009122fcc16c887fb90ccc33b52c9 |
|
05-Apr-2016 |
Rob Clark <robclark@freedesktop.org> |
freedreno/ir3: eliminate unnecessary absneg's The frontend inserts (abs) and (neg)'s to convert between NIR boolean (~0/0) and native boolean (1/0). So we'd end up with things like: cmps.s.ge r1.x, ... absneg.s r1.x, (neg)r1.x absneg.s r1.x, (abs)r1.x sel.b32 r2.x, r0.x, r1.x, r0.y The (neg) already gets collapsed due to the following (abs). Now by realizing that r1.x comes from a cmps.s instruction, we can drop the (abs) as well. Signed-off-by: Rob Clark <robclark@freedesktop.org>
/external/mesa3d/src/gallium/drivers/freedreno/ir3/ir3_cp.c
|
8e451c2d06d18ee54dc3098b3987af6e0bc59f5e |
|
04-Apr-2016 |
Rob Clark <robclark@freedesktop.org> |
freedreno/ir3: don't cp into phi's The block defining a phi source might not have been executed. If we allow copy propagation, we could end up pointing to a src instruction in the wrong block. Signed-off-by: Rob Clark <robclark@freedesktop.org>
/external/mesa3d/src/gallium/drivers/freedreno/ir3/ir3_cp.c
|
383b6e87f90e0ac84a200e1677a44b370976c93b |
|
04-Apr-2016 |
Rob Clark <robclark@freedesktop.org> |
freedreno/ir3: we can't store immediate values Fixes some transform-feedback piglits, like: bin/ext_transform_feedback-nonflat-integral Signed-off-by: Rob Clark <robclark@freedesktop.org>
/external/mesa3d/src/gallium/drivers/freedreno/ir3/ir3_cp.c
|
19739e4fb9024f42a8fc332e6fa94c292bb6bc16 |
|
27-Mar-2016 |
Rob Clark <robclark@freedesktop.org> |
freedreno/ir3: remove ir3_instruction::category Signed-off-by: Rob Clark <robclark@freedesktop.org>
/external/mesa3d/src/gallium/drivers/freedreno/ir3/ir3_cp.c
|
ddede497b831fb98e3540247b570968532cdacc9 |
|
15-Jan-2016 |
Rob Clark <robclark@freedesktop.org> |
freedreno/ir3: workaround bug/feature Seems like in certain cases, we cannot use c<a0.x+0> as the third src to cat3 instructions. This may be slightly conservative, we may only have this restriction when the first src is also const. This fixes, for example, +24/-0 of the variable-indexing piglit tests. Signed-off-by: Rob Clark <robclark@freedesktop.org>
/external/mesa3d/src/gallium/drivers/freedreno/ir3/ir3_cp.c
|
fad158a0e01f4c28851477e6d1eb5c8fd67e226b |
|
10-Jan-2016 |
Rob Clark <robclark@freedesktop.org> |
freedreno/ir3: array rework Signed-off-by: Rob Clark <robclark@freedesktop.org>
/external/mesa3d/src/gallium/drivers/freedreno/ir3/ir3_cp.c
|
cc7ed34df94a389e60221228896332d0ac8ea6c6 |
|
10-Jan-2016 |
Rob Clark <robclark@freedesktop.org> |
freedreno/ir3: refactor/simplify cp If we handle separately the special case of eliminating output mov (which includes keeps and various other cases where we don't have a consuming instruction's src register to collapse things into), we can simplify the logic. Signed-off-by: Rob Clark <robclark@freedesktop.org>
/external/mesa3d/src/gallium/drivers/freedreno/ir3/ir3_cp.c
|
96d4db683f90f02e72d34ece544de7eedfa873ee |
|
25-Jul-2015 |
Rob Clark <robclark@freedesktop.org> |
freedreno/ir3: track "keeps" in ir Previously we had a fixed array to track kills, since they don't generate an SSA value, and then cheated by stuffing them in the outputs array before sending things through depth/sched/etc. But store instructions will need similar treatment. So convert this over to a more general array of instructions that must be kept and fix up the places that were previously relying on kills being in the output array. Signed-off-by: Rob Clark <robclark@freedesktop.org>
/external/mesa3d/src/gallium/drivers/freedreno/ir3/ir3_cp.c
|
6b9f5cd5f7b25e9e03104fe279df74817f69fe87 |
|
02-Jul-2015 |
Rob Clark <robclark@freedesktop.org> |
freedreno/ir3: fix indirects tracking cp would update instr->address but not update the indirects array resulting in sched getting confused when it had to 'spill' the address register. Add an ir3_instr_set_address() helper to set instr->address and also update ir->indirects, and update all places that were writing instr->address to use helper instead. Signed-off-by: Rob Clark <robclark@freedesktop.org>
/external/mesa3d/src/gallium/drivers/freedreno/ir3/ir3_cp.c
|
457f7c2a2a93b45396ac66e0d4b3896d2db8fdf3 |
|
09-Jun-2015 |
Rob Clark <robdclark@gmail.com> |
freedreno/ir3: block reshuffling and loops! This shuffles things around to allow the shader to have multiple basic blocks. We drop the entire CFG structure from nir and just preserve the blocks. At scheduling we know whether to schedule conditional branches or unconditional jumps at the end of the block based on the # of block successors. (Dropping jumps to the following instruction, etc.) One slight complication is that variables (load_var/store_var, ie. arrays) are not in SSA form, so we have to figure out where to put the phi's ourself. For this, we use the predecessor set information from nir_block. (We could perhaps use NIR's dominance frontier information to help with this?) Signed-off-by: Rob Clark <robclark@freedesktop.org>
/external/mesa3d/src/gallium/drivers/freedreno/ir3/ir3_cp.c
|
c8fb5f8a011e1db78af3ceaf91c5cb3b1acaee14 |
|
25-May-2015 |
Rob Clark <robclark@freedesktop.org> |
freedreno/ir3: move inputs/outputs to shader These belong in the shader, rather than the block. Mostly a lot of churn and nothing too interesting. But splitting this out from the rest of ir3_block reshuffling to cut down the noise in the later patch. Signed-off-by: Rob Clark <robclark@freedesktop.org>
/external/mesa3d/src/gallium/drivers/freedreno/ir3/ir3_cp.c
|
d52fb2f5ad828f879286b9068023b82b9897bc17 |
|
01-May-2015 |
Rob Clark <robclark@freedesktop.org> |
freedreno/ir3/ra: use register_allocate Signed-off-by: Rob Clark <robclark@freedesktop.org>
/external/mesa3d/src/gallium/drivers/freedreno/ir3/ir3_cp.c
|
adf1659ff5f07d907eca552be3b566e408c8601e |
|
30-Apr-2015 |
Rob Clark <robclark@freedesktop.org> |
freedreno/ir3: use standard list implementation Use standard list_head double-linked list and related iterators, helpers, etc, rather than weird combo of instruction array and next pointers depending on stage. Now block has an instrs_list. In certain stages where we want to remove and re-add to the blocks list we just use list_replace() to copy the list to a new list_head. Signed-off-by: Rob Clark <robclark@freedesktop.org>
/external/mesa3d/src/gallium/drivers/freedreno/ir3/ir3_cp.c
|
87807e5cc50f404a8e3ec8864bf8b7427ab6d687 |
|
12-Apr-2015 |
Rob Clark <robclark@freedesktop.org> |
freedreno/ir3: move out helper We'll also want it in NIR f/e for implementing UBO support. Signed-off-by: Rob Clark <robclark@freedesktop.org>
/external/mesa3d/src/gallium/drivers/freedreno/ir3/ir3_cp.c
|
d5357c16cc0ccd84c3475778fcc08a025b8c24f7 |
|
08-Apr-2015 |
Rob Clark <robclark@freedesktop.org> |
freedreno/ir3/cp: handle indirect properly I noticed some cases where we where trying to copy-propagate indirect src's into places they cannot go, like 2nd src for cat3 (mad, etc). Expand out valid_flags() to be aware of relativ flag, and fix up a few related spots. Signed-off-by: Rob Clark <robclark@freedesktop.org>
/external/mesa3d/src/gallium/drivers/freedreno/ir3/ir3_cp.c
|
f0e9a632a12798bd727799e396cde665bd960665 |
|
06-Apr-2015 |
Rob Clark <robclark@freedesktop.org> |
freedreno/ir3/cp: support to swap mad src's For a normal MAD (ie. not MADSH), if first source is gpr and second source is const, we can swap the first two sources to avoid needing a mov instruction. This gives back the biggest advantage TGSI f/e had over NIR f/e for common shaders, since TGSI f/e had this logic in the f/e. Note that doing this in copy-prop step has the advantage that it will also work for cases like: MOV TEMP[b], CONST[x] MAD TEMP[d], TEMP[a], TEMP[b], TEMP[c] Signed-off-by: Rob Clark <robclark@freedesktop.org>
/external/mesa3d/src/gallium/drivers/freedreno/ir3/ir3_cp.c
|
f370e95421f553ace931a02743c96be80fd62dc8 |
|
29-Mar-2015 |
Rob Clark <robclark@freedesktop.org> |
freedreno/ir3: handle const/immed/abs/neg in cp Be smarter about propagating copies from const or immed, or with abs/neg modifiers. Also, realize that absneg.s and absneg.f are really "fancy" mov instructions. This opens up the possibility to remove more copies. It helps the TGSI frontend a bit, but will be really needed for the NIR f/e which builds everything up in SSA form (ie. will *always* insert a mov from const or immediate). Signed-off-by: Rob Clark <robclark@freedesktop.org>
/external/mesa3d/src/gallium/drivers/freedreno/ir3/ir3_cp.c
|
104713d9f2dced94a427004a25c54b2c7feee166 |
|
29-Mar-2015 |
Rob Clark <robclark@freedesktop.org> |
freedreno/ir3: split float/int abs/neg Even though in the end, they map to the same bits, the backend will need to be able to differentiate float abs/neg vs integer abs/neg. Rather than making the backend figure it out based on instruction opcode (which when combined with mov/absneg instructions, can be awkward), just split out different flags for each so the frontend can signal it's intentions more clearly. Also, since (neg) for bitwise op's is actually a bitwise- not, split it out into bnot flag. Signed-off-by: Rob Clark <robclark@freedesktop.org>
/external/mesa3d/src/gallium/drivers/freedreno/ir3/ir3_cp.c
|
060d3499202c339a27fbc366335f2122ed4ff7bc |
|
23-Jan-2015 |
Rob Clark <robclark@freedesktop.org> |
freedreno/ir3: relative dst To simplify RA, assign arrays that are written to first. Since enough dependency information is in the graph to preserve order of reads and writes of array, so all SSA names for the array collapse into one, just assign the entire thing by array-id. Signed-off-by: Rob Clark <robclark@freedesktop.org>
/external/mesa3d/src/gallium/drivers/freedreno/ir3/ir3_cp.c
|
17754b70d78649f29e25dfe938de91d64dbf5ebf |
|
04-Feb-2015 |
Rob Clark <robclark@freedesktop.org> |
freedreno/ir3: drop deref nodes The meta-deref instruction doesn't really do what we need for relative destination. Instead, since each instruction can reference at most a single address value, track the dependency on the address register via instr->address. This lets us express the dependency regardless of whether it is used for dst and/or src. The foreach_ssa_src{_n} iterator macros now also iterates the address register so, at least in SSA form, the address register behaves as an additional virtual src to the instruction. Which is pretty much what we want, as far as scheduling/etc. TODO: For now, the foreach_src{_n} iterators are unchanged. We could wrap the address in an ir3_register and make the foreach_src_{_n} iterators behave the same way. But that seems unnecessary at this point, since we mainly care about the address dependency when in SSA form. Signed-off-by: Rob Clark <robclark@freedesktop.org>
/external/mesa3d/src/gallium/drivers/freedreno/ir3/ir3_cp.c
|
f8f7548f466509bf881db1826ef6dd23ffe2acdf |
|
02-Feb-2015 |
Rob Clark <robclark@freedesktop.org> |
freedreno/ir3: helpful iterator macros I remembered that we are using c99.. which makes some sugary iterator macros easier. So introduce iterator macros to iterate all src registers and all SSA src instructions. The _n variants also return the src #, since there are a handful of places that need this. Signed-off-by: Rob Clark <robclark@freedesktop.org>
/external/mesa3d/src/gallium/drivers/freedreno/ir3/ir3_cp.c
|
9a9f2a893b5e29a77d66671191653f0b4261f546 |
|
25-Oct-2014 |
Rob Clark <robclark@freedesktop.org> |
freedreno/ir3: simplify RA Group inputs/outputs, in addition to fanin/fanout, as they must also exist in sequential scalar registers. This lets us simplify RA by working in terms of neighbor groups. NOTE: has the slight problem that it can't optimize out mov's for things like: MOV OUT[n], IN[m] To avoid this, instead of trying to figure out what mov's we can eliminate, we first remove all mov's prior to grouping, and then re-insert mov's as needed while grouping inputs/outputs/fanins. Eventually we'd prefer the frontend to not insert extra mov's in the first place (so we don't have to bother removing them). This is the plan for an eventual NIR based frontend, so separate out the instr grouping (which will still be needed for NIR frontend) from the mov elimination (which won't). Signed-off-by: Rob Clark <robclark@freedesktop.org>
/external/mesa3d/src/gallium/drivers/freedreno/ir3/ir3_cp.c
|
13862812dc910a4ef57cb72cb9fe777ce3c14515 |
|
24-Oct-2014 |
Rob Clark <robclark@freedesktop.org> |
freedreno/ir3: consider instruction neighbors in cp Fanin (merge) nodes require it's srcs to be "adjacent" in consecutive scalar registers. Keep track of instruction neighbors in copy- propagation step and avoid eliminating mov's which would cause an instruction to need multiple distinct left and/or right neighbors. This lets us not fall on our face when we encounter things like: 1: MOV TEMP[2], IN[0].xyzw 2: TEX OUT[0].xy, TEMP[2], SAMP[0], SHADOW2D 3: MOV TEMP[2].xy, IN[0].yxzz 4: TEX OUT[0].zw, TEMP[2], SAMP[0], SHADOW2D 5: END Signed-off-by: Rob Clark <robclark@freedesktop.org>
/external/mesa3d/src/gallium/drivers/freedreno/ir3/ir3_cp.c
|
3dd9a0d6fdea1ff5b1fe903fce206bf1d1515400 |
|
01-Oct-2014 |
Ilia Mirkin <imirkin@alum.mit.edu> |
freedreno/ir3: avoid fan-in sources referring to same instruction Since the RA has to be done s.t. each one gets its own (adjacent) register, it would complicate matters if instructions were allowed to be repeated. This enables copy-propagation use in situations where previously that might have happened. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Signed-off-by: Rob Clark <robclark@freedesktop.org>
/external/mesa3d/src/gallium/drivers/freedreno/ir3/ir3_cp.c
|
db193e5ad06e7a2fbcffb3bb5df85d212eb12291 |
|
25-Jul-2014 |
Rob Clark <robclark@freedesktop.org> |
freedreno/ir3: split out shader compiler from a3xx Move the bits we want to share between generations from fd3_program to ir3_shader. So overall structure is: fdN_shader_stateobj -> ir3_shader -> ir3_shader_variant -> ir3 |- ... \- ir3_shader_variant -> ir3 So the ir3_shader becomes the topmost generation neutral object, which manages the set of variants each of which generates, compiles, and assembles it's own ir. There is a bit of additional renaming to s/fd3_compiler/ir3_compiler/, etc. Keep the split between the gallium level stateobj and the shader helper object because it might be a good idea to pre-compute some generation specific register values (ie. anything that is independent of linking). Signed-off-by: Rob Clark <robclark@freedesktop.org>
/external/mesa3d/src/gallium/drivers/freedreno/ir3/ir3_cp.c
|