b15c7fc268785cc8c960368d287ec799fe9dc502 |
|
11-Apr-2016 |
Rob Clark <robclark@freedesktop.org> |
freedreno/ir3: add ir3_cp_ctx Needed in next commit.. just split out to reduce noise. Signed-off-by: Rob Clark <robclark@freedesktop.org>
/external/mesa3d/src/gallium/drivers/freedreno/ir3/ir3.h
|
8fe20762433dafc8d6df3a14db7074c1ddf99120 |
|
24-Apr-2016 |
Rob Clark <robclark@freedesktop.org> |
freedreno/ir3: convert over to ralloc The home-grown heap scheme (which is ultra-simple but probably not good to always allocate and memset such a chunk of memory up front) was a remnant of fdre (where the ir originally came from). But since we have ralloc in mesa, lets just use that instead. Signed-off-by: Rob Clark <robclark@freedesktop.org>
/external/mesa3d/src/gallium/drivers/freedreno/ir3/ir3.h
|
adf795432f788b33822d3a94b704be4ca536c8f1 |
|
19-Apr-2016 |
Rob Clark <robclark@freedesktop.org> |
freedreno/a4xx: better workaround for astc+srgb This *seems* like a hw bug, and maybe only applies to certain a4xx variants/revisions. But setting the SRGB bit in sampler view state (texconst0) causes invalid alpha for ASTC textures. Work around this setting up a second texture state and using that to sample alpha separately. This way, srgb->linear conversion happens in hw *prior* to interpolation. This fixes 546 dEQP tests: dEQP-GLES3.functional.texture.*astc*srgb* Signed-off-by: Rob Clark <robclark@freedesktop.org>
/external/mesa3d/src/gallium/drivers/freedreno/ir3/ir3.h
|
f68f6c02466f89decb02e8373c7c3b46a72a621f |
|
11-Apr-2016 |
Rob Clark <robclark@freedesktop.org> |
freedreno/ir3: hack to avoid getting stuck in a loop There are still some edge cases which result in a neighbor-loop. Which needs to be fixed, but this hack at least makes deqp tests finish. Signed-off-by: Rob Clark <robclark@freedesktop.org>
/external/mesa3d/src/gallium/drivers/freedreno/ir3/ir3.h
|
f9cdbf44054009122fcc16c887fb90ccc33b52c9 |
|
05-Apr-2016 |
Rob Clark <robclark@freedesktop.org> |
freedreno/ir3: eliminate unnecessary absneg's The frontend inserts (abs) and (neg)'s to convert between NIR boolean (~0/0) and native boolean (1/0). So we'd end up with things like: cmps.s.ge r1.x, ... absneg.s r1.x, (neg)r1.x absneg.s r1.x, (abs)r1.x sel.b32 r2.x, r0.x, r1.x, r0.y The (neg) already gets collapsed due to the following (abs). Now by realizing that r1.x comes from a cmps.s instruction, we can drop the (abs) as well. Signed-off-by: Rob Clark <robclark@freedesktop.org>
/external/mesa3d/src/gallium/drivers/freedreno/ir3/ir3.h
|
d47fb856af4da5f56f80e072365b9286f0731a54 |
|
04-Apr-2016 |
Rob Clark <robclark@freedesktop.org> |
freedreno/ir3: add dumping for use/def/live-in/live-out Turned out to be useful to debug an issue in RA. Let's keep it. Signed-off-by: Rob Clark <robclark@freedesktop.org>
/external/mesa3d/src/gallium/drivers/freedreno/ir3/ir3.h
|
38ae05a340bdf526d5da62159223ad9938fea36a |
|
04-Apr-2016 |
Rob Clark <robclark@freedesktop.org> |
freedreno/ir3: drop unused instr category arg No longer used, so drop the extra arg to ir3_instr_create() Signed-off-by: Rob Clark <robclark@freedesktop.org>
/external/mesa3d/src/gallium/drivers/freedreno/ir3/ir3.h
|
19739e4fb9024f42a8fc332e6fa94c292bb6bc16 |
|
27-Mar-2016 |
Rob Clark <robclark@freedesktop.org> |
freedreno/ir3: remove ir3_instruction::category Signed-off-by: Rob Clark <robclark@freedesktop.org>
/external/mesa3d/src/gallium/drivers/freedreno/ir3/ir3.h
|
70735643f4cf660dc3022f40f853a138aea738c2 |
|
27-Mar-2016 |
Rob Clark <robclark@freedesktop.org> |
freedreno/ir3: encode instruction category in opc_t Been on my TODO list for a while. If nothing else this will make gdb properly grok the opc_t enum. This first step preserves ir3_instruction::category (with an added assert that category matches what is encoded in opc_t). Next step is to drop the category field (and arg to ir3_instr_create()), but that is split into next commit for bisectability and so that we can run piglit in the intermediate state to flush out any problems. Signed-off-by: Rob Clark <robclark@freedesktop.org>
/external/mesa3d/src/gallium/drivers/freedreno/ir3/ir3.h
|
2a6ec1e0615127d036acfbece59576e9ef2527bc |
|
16-Jan-2016 |
Rob Clark <robclark@freedesktop.org> |
freedreno/ir3: better array register allocation Detect arrays which don't conflict with each other and allow overlapping register allocation. Signed-off-by: Rob Clark <robclark@freedesktop.org>
/external/mesa3d/src/gallium/drivers/freedreno/ir3/ir3.h
|
6a33c5c0dffce136bdc95daa2db2d3e9d3c1741f |
|
16-Jan-2016 |
Rob Clark <robclark@freedesktop.org> |
freedreno/ir3: array offset can be negative It at least happens with some piglit tests, like $piglit/bin/vp-address-01 VERT DCL IN[0] DCL IN[1] DCL OUT[0], POSITION DCL OUT[1], COLOR DCL CONST[0..7] DCL ADDR[0] 0: ARL ADDR[0].x, IN[1].xxxx 1: MOV_SAT OUT[1], CONST[ADDR[0].x-1] 2: DP4 OUT[0].x, CONST[4], IN[0] 3: DP4 OUT[0].y, CONST[5], IN[0] 4: DP4 OUT[0].z, CONST[6], IN[0] 5: DP4 OUT[0].w, CONST[7], IN[0] 6: END Signed-off-by: Rob Clark <robclark@freedesktop.org>
/external/mesa3d/src/gallium/drivers/freedreno/ir3/ir3.h
|
fad158a0e01f4c28851477e6d1eb5c8fd67e226b |
|
10-Jan-2016 |
Rob Clark <robclark@freedesktop.org> |
freedreno/ir3: array rework Signed-off-by: Rob Clark <robclark@freedesktop.org>
/external/mesa3d/src/gallium/drivers/freedreno/ir3/ir3.h
|
4b18d51756e9099710bfe421657b3b2034e1497f |
|
30-Nov-2015 |
Rob Clark <robclark@freedesktop.org> |
freedreno/ir3: convert scheduler back to recursive algo I've played with a few different approaches to tweak instruction priority according to how much they increase/decrease register pressure, etc. But nothing seems to change the fact that compared to original (pre-multiple-block-support) scheduler, in some edge cases we are generating shaders w/ 5-6x higher register usage. The problem is that the priority queue approach completely looses the dependency between instructions, and ends up scheduling all paths at the same time. Original reason for switching was that recursive approach relied on starting from the shader outputs array. But we can achieve more or less the same thing by starting from the depth-sorted list. shader-db results: total instructions in shared programs: 113350 -> 105183 (-7.21%) total dwords in shared programs: 219328 -> 211168 (-3.72%) total full registers used in shared programs: 7911 -> 7383 (-6.67%) total half registers used in shader programs: 109 -> 109 (0.00%) total const registers used in shared programs: 21294 -> 21294 (0.00%) half full const instr dwords helped 0 322 0 711 215 hurt 0 163 0 38 4 The shaders hurt tend to gain a register or two. While there are also a lot of helped shaders that only loose a register or two, the more complex ones tend to loose significanly more registers used. In some more extreme cases, like glsl-fs-convolution-1.shader_test it is more like 7 vs 34 registers! Signed-off-by: Rob Clark <robclark@freedesktop.org>
/external/mesa3d/src/gallium/drivers/freedreno/ir3/ir3.h
|
8e52344dc1bd801a81ac773bb0010de5eca726f3 |
|
03-Dec-2015 |
Rob Clark <robclark@freedesktop.org> |
freedreno/ir3: rename ir3_block::bd We'll need to add similar for ir3_instruction, but following the pattern to use 'id' seems confusing. Let's just go w/ generic 'data' as the name. Signed-off-by: Rob Clark <robclark@freedesktop.org>
/external/mesa3d/src/gallium/drivers/freedreno/ir3/ir3.h
|
2181f2cd58f2af1e216618fc6889e23697cec325 |
|
26-Nov-2015 |
Rob Clark <robclark@freedesktop.org> |
freedreno/ir3: use instr flag to mark unused instructions Rather than magic depth value, which won't be available in later stages. Signed-off-by: Rob Clark <robclark@freedesktop.org>
/external/mesa3d/src/gallium/drivers/freedreno/ir3/ir3.h
|
96d4db683f90f02e72d34ece544de7eedfa873ee |
|
25-Jul-2015 |
Rob Clark <robclark@freedesktop.org> |
freedreno/ir3: track "keeps" in ir Previously we had a fixed array to track kills, since they don't generate an SSA value, and then cheated by stuffing them in the outputs array before sending things through depth/sched/etc. But store instructions will need similar treatment. So convert this over to a more general array of instructions that must be kept and fix up the places that were previously relying on kills being in the output array. Signed-off-by: Rob Clark <robclark@freedesktop.org>
/external/mesa3d/src/gallium/drivers/freedreno/ir3/ir3.h
|
020301baccc77e5753ead1e890c0cf24a9675517 |
|
25-Jul-2015 |
Rob Clark <robclark@freedesktop.org> |
freedreno/ir3: add support for store instructions For store instructions, the "dst" register is a read register, not a written register. (Ie. it is the address to store to.) Lets not confuse register allocation, scheduling, etc, with these details. Instead just leave a dummy instr->regs[0], and take "dst" from instr->regs[1] and srcs following. Signed-off-by: Rob Clark <robclark@freedesktop.org>
/external/mesa3d/src/gallium/drivers/freedreno/ir3/ir3.h
|
bc5e2bec303acd7fd962996bf369be5ce0e15cd2 |
|
23-Jul-2015 |
Rob Clark <robclark@freedesktop.org> |
freedreno/ir3: updated cat6 encoding Sync updated cat6 encoding from freedreno.git, needed to properly encode store instructions. Signed-off-by: Rob Clark <robclark@freedesktop.org>
/external/mesa3d/src/gallium/drivers/freedreno/ir3/ir3.h
|
6b9f5cd5f7b25e9e03104fe279df74817f69fe87 |
|
02-Jul-2015 |
Rob Clark <robclark@freedesktop.org> |
freedreno/ir3: fix indirects tracking cp would update instr->address but not update the indirects array resulting in sched getting confused when it had to 'spill' the address register. Add an ir3_instr_set_address() helper to set instr->address and also update ir->indirects, and update all places that were writing instr->address to use helper instead. Signed-off-by: Rob Clark <robclark@freedesktop.org>
/external/mesa3d/src/gallium/drivers/freedreno/ir3/ir3.h
|
00b6b41482985ba4a81fbb479a47c06ec83f3797 |
|
29-Jun-2015 |
Rob Clark <robclark@freedesktop.org> |
freedreno/ir3: cache defining instruction It is silly to traverse back to find first instruction that writes part of a larger "virtual" register many times per instruction (plus per use as a src to later instructions). Cache this information so we only figure it out once. Signed-off-by: Rob Clark <robclark@freedesktop.org>
/external/mesa3d/src/gallium/drivers/freedreno/ir3/ir3.h
|
457f7c2a2a93b45396ac66e0d4b3896d2db8fdf3 |
|
09-Jun-2015 |
Rob Clark <robdclark@gmail.com> |
freedreno/ir3: block reshuffling and loops! This shuffles things around to allow the shader to have multiple basic blocks. We drop the entire CFG structure from nir and just preserve the blocks. At scheduling we know whether to schedule conditional branches or unconditional jumps at the end of the block based on the # of block successors. (Dropping jumps to the following instruction, etc.) One slight complication is that variables (load_var/store_var, ie. arrays) are not in SSA form, so we have to figure out where to put the phi's ourself. For this, we use the predecessor set information from nir_block. (We could perhaps use NIR's dominance frontier information to help with this?) Signed-off-by: Rob Clark <robclark@freedesktop.org>
/external/mesa3d/src/gallium/drivers/freedreno/ir3/ir3.h
|
660d5c1646f5d63f9626b24beabc9cfc318849d4 |
|
01-Jun-2015 |
Rob Clark <robclark@freedesktop.org> |
freedreno/ir3: a4xx encodes larger immed offset Without this, negative branch/jump offsets look like very large positive offsets. Signed-off-by: Rob Clark <robclark@freedesktop.org>
/external/mesa3d/src/gallium/drivers/freedreno/ir3/ir3.h
|
c8fb5f8a011e1db78af3ceaf91c5cb3b1acaee14 |
|
25-May-2015 |
Rob Clark <robclark@freedesktop.org> |
freedreno/ir3: move inputs/outputs to shader These belong in the shader, rather than the block. Mostly a lot of churn and nothing too interesting. But splitting this out from the rest of ir3_block reshuffling to cut down the noise in the later patch. Signed-off-by: Rob Clark <robclark@freedesktop.org>
/external/mesa3d/src/gallium/drivers/freedreno/ir3/ir3.h
|
d52fb2f5ad828f879286b9068023b82b9897bc17 |
|
01-May-2015 |
Rob Clark <robclark@freedesktop.org> |
freedreno/ir3/ra: use register_allocate Signed-off-by: Rob Clark <robclark@freedesktop.org>
/external/mesa3d/src/gallium/drivers/freedreno/ir3/ir3.h
|
694beb8b830c993e9bfb744655be3dbd558ab3a8 |
|
23-May-2015 |
Rob Clark <robclark@freedesktop.org> |
freedreno/ir3: introduce ir3_compiler object Right now, just provides a cleaner way to get at the gpu-id, given the separation between compiler and context. But we will need this also to hold the reg-set for new register allocation. Signed-off-by: Rob Clark <robclark@freedesktop.org>
/external/mesa3d/src/gallium/drivers/freedreno/ir3/ir3.h
|
0f6faa8ff317634ffb75e6040f2de2019dd80d13 |
|
25-Apr-2015 |
Rob Clark <robclark@freedesktop.org> |
freedreno/ir3: remove tgsi f/e Also remove ir3_flatten which was only used by tgsi f/e. Signed-off-by: Rob Clark <robclark@freedesktop.org>
/external/mesa3d/src/gallium/drivers/freedreno/ir3/ir3.h
|
7273cb4e933f8be65fc73b9d8c69c76d1078cb14 |
|
30-Apr-2015 |
Rob Clark <robclark@freedesktop.org> |
freedreno/ir3/sched: convert to priority queue Use a more standard priority-queue based scheduling algo. It is simpler and will make things easier once we have multiple basic blocks and flow control. Signed-off-by: Rob Clark <robclark@freedesktop.org>
/external/mesa3d/src/gallium/drivers/freedreno/ir3/ir3.h
|
adf1659ff5f07d907eca552be3b566e408c8601e |
|
30-Apr-2015 |
Rob Clark <robclark@freedesktop.org> |
freedreno/ir3: use standard list implementation Use standard list_head double-linked list and related iterators, helpers, etc, rather than weird combo of instruction array and next pointers depending on stage. Now block has an instrs_list. In certain stages where we want to remove and re-add to the blocks list we just use list_replace() to copy the list to a new list_head. Signed-off-by: Rob Clark <robclark@freedesktop.org>
/external/mesa3d/src/gallium/drivers/freedreno/ir3/ir3.h
|
67d994c6761e09205dbc9a0515c510fc9dde02c7 |
|
30-Apr-2015 |
Rob Clark <robclark@freedesktop.org> |
freedreno/ir3: drop dot graph dumping At least for now.. right now the instruction and instruction list printing should suffice, and the re-working of ir3_block would require a lot of changes in that code. Signed-off-by: Rob Clark <robclark@freedesktop.org>
/external/mesa3d/src/gallium/drivers/freedreno/ir3/ir3.h
|
5c8c2e2f97394436effbdd3e0f61eec4590accb2 |
|
25-Apr-2015 |
Rob Clark <robclark@freedesktop.org> |
freedreno/ir3: more builder helpers Use ir3_MOV() builder in a couple of spots, rather than open-coding the instruction construction. Also add ir3_NOP() builder and use that instead of open coding. Signed-off-by: Rob Clark <robclark@freedesktop.org>
/external/mesa3d/src/gallium/drivers/freedreno/ir3/ir3.h
|
57f0d3b3c6ae3b9f79a03517410b8dbfab0382c6 |
|
12-Apr-2015 |
Rob Clark <robclark@freedesktop.org> |
freedreno/ir3/nir: UBO support Signed-off-by: Rob Clark <robclark@freedesktop.org>
/external/mesa3d/src/gallium/drivers/freedreno/ir3/ir3.h
|
87807e5cc50f404a8e3ec8864bf8b7427ab6d687 |
|
12-Apr-2015 |
Rob Clark <robclark@freedesktop.org> |
freedreno/ir3: move out helper We'll also want it in NIR f/e for implementing UBO support. Signed-off-by: Rob Clark <robclark@freedesktop.org>
/external/mesa3d/src/gallium/drivers/freedreno/ir3/ir3.h
|
6e8160d6e3ea7b000de112538dcbb0e29a6c3838 |
|
09-Apr-2015 |
Rob Clark <robclark@freedesktop.org> |
freedreno/ir3/nir: simplify emit_tex() Just build up arrays for src0/src1, and use create_collect().. Also add back missing .3d flag for 3d/cube textures. Signed-off-by: Rob Clark <robclark@freedesktop.org>
/external/mesa3d/src/gallium/drivers/freedreno/ir3/ir3.h
|
f0e9a632a12798bd727799e396cde665bd960665 |
|
06-Apr-2015 |
Rob Clark <robclark@freedesktop.org> |
freedreno/ir3/cp: support to swap mad src's For a normal MAD (ie. not MADSH), if first source is gpr and second source is const, we can swap the first two sources to avoid needing a mov instruction. This gives back the biggest advantage TGSI f/e had over NIR f/e for common shaders, since TGSI f/e had this logic in the f/e. Note that doing this in copy-prop step has the advantage that it will also work for cases like: MOV TEMP[b], CONST[x] MAD TEMP[d], TEMP[a], TEMP[b], TEMP[c] Signed-off-by: Rob Clark <robclark@freedesktop.org>
/external/mesa3d/src/gallium/drivers/freedreno/ir3/ir3.h
|
f370e95421f553ace931a02743c96be80fd62dc8 |
|
29-Mar-2015 |
Rob Clark <robclark@freedesktop.org> |
freedreno/ir3: handle const/immed/abs/neg in cp Be smarter about propagating copies from const or immed, or with abs/neg modifiers. Also, realize that absneg.s and absneg.f are really "fancy" mov instructions. This opens up the possibility to remove more copies. It helps the TGSI frontend a bit, but will be really needed for the NIR f/e which builds everything up in SSA form (ie. will *always* insert a mov from const or immediate). Signed-off-by: Rob Clark <robclark@freedesktop.org>
/external/mesa3d/src/gallium/drivers/freedreno/ir3/ir3.h
|
104713d9f2dced94a427004a25c54b2c7feee166 |
|
29-Mar-2015 |
Rob Clark <robclark@freedesktop.org> |
freedreno/ir3: split float/int abs/neg Even though in the end, they map to the same bits, the backend will need to be able to differentiate float abs/neg vs integer abs/neg. Rather than making the backend figure it out based on instruction opcode (which when combined with mov/absneg instructions, can be awkward), just split out different flags for each so the frontend can signal it's intentions more clearly. Also, since (neg) for bitwise op's is actually a bitwise- not, split it out into bnot flag. Signed-off-by: Rob Clark <robclark@freedesktop.org>
/external/mesa3d/src/gallium/drivers/freedreno/ir3/ir3.h
|
203f37540a698a812f0a66e2f3f1fff954af22ab |
|
19-Mar-2015 |
Rob Clark <robclark@freedesktop.org> |
freedreno/ir3: add ir3 builder helpers Add helpers for constructing SSA forms of instructions. Only partial cat5/cat6 coverage.. but we can add stuff as needed. Signed-off-by: Rob Clark <robclark@freedesktop.org>
/external/mesa3d/src/gallium/drivers/freedreno/ir3/ir3.h
|
feb858b788cf27b31d12ad8a00805f015d4063cc |
|
11-Mar-2015 |
Rob Clark <robclark@freedesktop.org> |
freedreno/ir3: avoid scheduler deadlock Deadlock can occur if we schedule an address register write, yet some instructions which depend on that address register value also depend on other unscheduled instructions that depend on a different address register value. To solve this, before scheduling an address register write, ensure that all the other dependencies of the instructions which consume this address register are already scheduled. Signed-off-by: Rob Clark <robclark@freedesktop.org>
/external/mesa3d/src/gallium/drivers/freedreno/ir3/ir3.h
|
7208e96bb810a7a6c92fd11bb7f4df8c9b7f1a2d |
|
11-Mar-2015 |
Rob Clark <robclark@freedesktop.org> |
freedreno/ir3: bit of cleanup Add an array_insert() macro to simplify inserting into dynamically sized arrays, add a comment, and remove unused prototype inherited from the original freedreno.git/fdre-a3xx test code, etc. Signed-off-by: Rob Clark <robclark@freedesktop.org>
/external/mesa3d/src/gallium/drivers/freedreno/ir3/ir3.h
|
060d3499202c339a27fbc366335f2122ed4ff7bc |
|
23-Jan-2015 |
Rob Clark <robclark@freedesktop.org> |
freedreno/ir3: relative dst To simplify RA, assign arrays that are written to first. Since enough dependency information is in the graph to preserve order of reads and writes of array, so all SSA names for the array collapse into one, just assign the entire thing by array-id. Signed-off-by: Rob Clark <robclark@freedesktop.org>
/external/mesa3d/src/gallium/drivers/freedreno/ir3/ir3.h
|
17754b70d78649f29e25dfe938de91d64dbf5ebf |
|
04-Feb-2015 |
Rob Clark <robclark@freedesktop.org> |
freedreno/ir3: drop deref nodes The meta-deref instruction doesn't really do what we need for relative destination. Instead, since each instruction can reference at most a single address value, track the dependency on the address register via instr->address. This lets us express the dependency regardless of whether it is used for dst and/or src. The foreach_ssa_src{_n} iterator macros now also iterates the address register so, at least in SSA form, the address register behaves as an additional virtual src to the instruction. Which is pretty much what we want, as far as scheduling/etc. TODO: For now, the foreach_src{_n} iterators are unchanged. We could wrap the address in an ir3_register and make the foreach_src_{_n} iterators behave the same way. But that seems unnecessary at this point, since we mainly care about the address dependency when in SSA form. Signed-off-by: Rob Clark <robclark@freedesktop.org>
/external/mesa3d/src/gallium/drivers/freedreno/ir3/ir3.h
|
f8f7548f466509bf881db1826ef6dd23ffe2acdf |
|
02-Feb-2015 |
Rob Clark <robclark@freedesktop.org> |
freedreno/ir3: helpful iterator macros I remembered that we are using c99.. which makes some sugary iterator macros easier. So introduce iterator macros to iterate all src registers and all SSA src instructions. The _n variants also return the src #, since there are a handful of places that need this. Signed-off-by: Rob Clark <robclark@freedesktop.org>
/external/mesa3d/src/gallium/drivers/freedreno/ir3/ir3.h
|
e9f2abe349886ae5423c7c31d201e7d587a3695a |
|
25-Feb-2015 |
Rob Clark <robclark@freedesktop.org> |
freedreno/ir3: handle flat bypass for a4xx We may not need this for later a4xx patchlevels, but we do at least need this for patchlevel 0. Bypass bary.f for fetching varyings when flat shading is needed (rather than configure via cmdstream). This requires a special dummy bary.f w/ (ei) flag to signal to scheduler when all varyings are consumed. And requires shader variants based on rasterizer flatshade state to handle TGSI_INTERPOLATE_COLOR. Signed-off-by: Rob Clark <robclark@freedesktop.org>
/external/mesa3d/src/gallium/drivers/freedreno/ir3/ir3.h
|
9d732d3125e1b39788a642a5723aeb54cb1983f3 |
|
26-Feb-2015 |
Rob Clark <robclark@freedesktop.org> |
freedreno/ir3: add support for memory (cat6) instructions Scheduled basically the same as texture (cat5) instructions, using (sy) flag for synchronization. Signed-off-by: Rob Clark <robclark@freedesktop.org>
/external/mesa3d/src/gallium/drivers/freedreno/ir3/ir3.h
|
1e5c207dba4dbd07919bff2efe57ad361a44ac84 |
|
31-Dec-2014 |
Rob Clark <robclark@freedesktop.org> |
freedreno/ir3: start on indirect gpr reads Handle TEMP[ADDR[]] src registers by generating a fanin to group array elements, similarly to how texture fetch instructions work. NOTE: For all the scalar instructions generated for a single tgsi vector operation which uses an array src (or possibly even uses the same array as multiple srcs), re-use the same fanin node. Since a vector operation operates on all components at the same time, it should never see more than one version of the same array. Signed-off-by: Rob Clark <robclark@freedesktop.org>
/external/mesa3d/src/gallium/drivers/freedreno/ir3/ir3.h
|
63e5b72da8b1df4bbb0fcf46524d106f51264605 |
|
07-Jan-2015 |
Rob Clark <robclark@freedesktop.org> |
freedreno/ir3: make reg array dynamic To use fanin's to group registers in an array, we can potentially have a much larger array of registers. Rather than continuing to bump up the array size, just make it dynamically allocated when the instruction is created. Signed-off-by: Rob Clark <robclark@freedesktop.org>
/external/mesa3d/src/gallium/drivers/freedreno/ir3/ir3.h
|
9a9f2a893b5e29a77d66671191653f0b4261f546 |
|
25-Oct-2014 |
Rob Clark <robclark@freedesktop.org> |
freedreno/ir3: simplify RA Group inputs/outputs, in addition to fanin/fanout, as they must also exist in sequential scalar registers. This lets us simplify RA by working in terms of neighbor groups. NOTE: has the slight problem that it can't optimize out mov's for things like: MOV OUT[n], IN[m] To avoid this, instead of trying to figure out what mov's we can eliminate, we first remove all mov's prior to grouping, and then re-insert mov's as needed while grouping inputs/outputs/fanins. Eventually we'd prefer the frontend to not insert extra mov's in the first place (so we don't have to bother removing them). This is the plan for an eventual NIR based frontend, so separate out the instr grouping (which will still be needed for NIR frontend) from the mov elimination (which won't). Signed-off-by: Rob Clark <robclark@freedesktop.org>
/external/mesa3d/src/gallium/drivers/freedreno/ir3/ir3.h
|
dddfe6c21ee92f015b78060545f08239c331ceba |
|
02-Jan-2015 |
Rob Clark <robclark@freedesktop.org> |
freedreno/ir3: regmask support for relative addr For temp arrays, a 32bit mask won't be sufficient.. but otoh we don't need to support an arbitrary mask. So for this case use a simple size field rather than a bitmask. Signed-off-by: Rob Clark <robclark@freedesktop.org>
/external/mesa3d/src/gallium/drivers/freedreno/ir3/ir3.h
|
d15db9e7c05fde924c3dbced83c0af9c97c3973b |
|
01-Jan-2015 |
Rob Clark <robclark@freedesktop.org> |
freedreno/ir3: drop instr_clone() stuff Unnecessary and overly complicated. And gets in the way for temp arrays (TEMP[ADDR[]]). Signed-off-by: Rob Clark <robclark@freedesktop.org>
/external/mesa3d/src/gallium/drivers/freedreno/ir3/ir3.h
|
f332cf92b69e52de3cb7c3088ad1efd2e291bb88 |
|
25-Oct-2014 |
Rob Clark <robclark@freedesktop.org> |
freedreno/ir3: split out legalize pass Signed-off-by: Rob Clark <robclark@freedesktop.org>
/external/mesa3d/src/gallium/drivers/freedreno/ir3/ir3.h
|
4097ef6ee88e65bc2cf08fc9c2561665824309f4 |
|
25-Oct-2014 |
Rob Clark <robclark@freedesktop.org> |
freedreno/ir3: ra debug Some compile time RA debug Signed-off-by: Rob Clark <robclark@freedesktop.org>
/external/mesa3d/src/gallium/drivers/freedreno/ir3/ir3.h
|
61c68b69d704b5faa5ff9d2b73b24bebf7e19412 |
|
31-Jul-2014 |
Rob Clark <robclark@freedesktop.org> |
freedreno: add adreno 420 support Very initial support. Basic stuff working (es2gears, es2tri, and maybe about half of glmark2). Expect broken stuff. Still missing: mem->gmem (restore), queries, mipmaps (blob segfaults!), hw binning, etc. Signed-off-by: Rob Clark <robclark@freedesktop.org>
/external/mesa3d/src/gallium/drivers/freedreno/ir3/ir3.h
|
13862812dc910a4ef57cb72cb9fe777ce3c14515 |
|
24-Oct-2014 |
Rob Clark <robclark@freedesktop.org> |
freedreno/ir3: consider instruction neighbors in cp Fanin (merge) nodes require it's srcs to be "adjacent" in consecutive scalar registers. Keep track of instruction neighbors in copy- propagation step and avoid eliminating mov's which would cause an instruction to need multiple distinct left and/or right neighbors. This lets us not fall on our face when we encounter things like: 1: MOV TEMP[2], IN[0].xyzw 2: TEX OUT[0].xy, TEMP[2], SAMP[0], SHADOW2D 3: MOV TEMP[2].xy, IN[0].yxzz 4: TEX OUT[0].zw, TEMP[2], SAMP[0], SHADOW2D 5: END Signed-off-by: Rob Clark <robclark@freedesktop.org>
/external/mesa3d/src/gallium/drivers/freedreno/ir3/ir3.h
|
d6252d0f633292b01c3964d0e3da12f759bec9c5 |
|
24-Oct-2014 |
Rob Clark <robclark@freedesktop.org> |
freedreno/ir3: standalone compiler updates for ir3test In order to test compiler changes more easily, spit out the assembled shader with some header information so that we can know about inputs/outputs more easily. See: git://people.freedesktop.org/~robclark/ir3test In ir3test we have a big collection of tgsi shaders and reference ir3_compiler outputs. When making compiler changes, regenerate the compiler outputs and feed to ir3test to compare the new vs reference shader. Signed-off-by: Rob Clark <robclark@freedesktop.org>
/external/mesa3d/src/gallium/drivers/freedreno/ir3/ir3.h
|
8a0ffedd8de51eaf980855283c4525dba6dc5847 |
|
18-Oct-2014 |
Rob Clark <robclark@freedesktop.org> |
freedreno/ir3: fix potential gpu lockup with kill It seems like the hardware is unhappy if we execute a kill instruction prior to last input (ei). Probably the shader thread stops executing and the end-input flag is never set. Signed-off-by: Rob Clark <robclark@freedesktop.org>
/external/mesa3d/src/gallium/drivers/freedreno/ir3/ir3.h
|
652b8fbbbb0132c634c90e4d1fdbca9497b7cd94 |
|
15-Oct-2014 |
Rob Clark <robclark@freedesktop.org> |
freedreno/ir3: large const support Signed-off-by: Rob Clark <robclark@freedesktop.org>
/external/mesa3d/src/gallium/drivers/freedreno/ir3/ir3.h
|
af4d08839581c2372f17f75f1ad0fd1284ea7d8b |
|
03-Oct-2014 |
Rob Clark <robclark@freedesktop.org> |
freedreno/ir3: fix lockups with lame FRAG shaders Shaders like: FRAG PROPERTY FS_COLOR0_WRITES_ALL_CBUFS 1 DCL IN[0], GENERIC[0], PERSPECTIVE DCL OUT[0], COLOR DCL SAMP[0] DCL TEMP[0], LOCAL IMM[0] FLT32 { 0.0000, 1.0000, 0.0000, 0.0000} 0: TEX TEMP[0], IN[0].xyyy, SAMP[0], 2D 1: MOV OUT[0], IMM[0].xyxx 2: END cause unhappyness. They have an IN[], but once this is compiled the useless TEX instruction goes away. Leaving a varying that is never fetched, which makes the hw unhappy. In the process fix a signed vs unsigned compare. If the vertex shader has max_reg=-1, MAX2() vs an unsigned would not give the desired result. Signed-off-by: Rob Clark <robclark@freedesktop.org>
/external/mesa3d/src/gallium/drivers/freedreno/ir3/ir3.h
|
e6acf3ac2445bbc15ab33001077343ac8b486b5b |
|
27-Sep-2014 |
Ilia Mirkin <imirkin@alum.mit.edu> |
freedreno/ir3: add TXD support and expose ARB_shader_texture_lod Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
/external/mesa3d/src/gallium/drivers/freedreno/ir3/ir3.h
|
b823abedf8d1ddba9aeaa43c4a239cb90664dae4 |
|
29-Aug-2014 |
Rob Clark <robclark@freedesktop.org> |
freedreno/ir3: detect scheduler fail There are some cases where the scheduler can get itself into impossible situations, by scheduling the wrong write to pred or addr register first. (Ie. it could end up being unable to schedule any instruction if some instruction which depends on the current addr/reg value also depends on another addr/reg value.) To solve this we'd need to be able to insert extra mov instructions (which would also help when register assignment gets into impossible situations). To do that, we'd need to move the nop padding from sched into legalize. But to start with, just detect when we get into an impossible situation and bail, rather than sitting forever in an infinite loop. This way it will at least fall back to the old compiler, which might even work if you are lucky. Signed-off-by: Rob Clark <robclark@freedesktop.org>
/external/mesa3d/src/gallium/drivers/freedreno/ir3/ir3.h
|
db193e5ad06e7a2fbcffb3bb5df85d212eb12291 |
|
25-Jul-2014 |
Rob Clark <robclark@freedesktop.org> |
freedreno/ir3: split out shader compiler from a3xx Move the bits we want to share between generations from fd3_program to ir3_shader. So overall structure is: fdN_shader_stateobj -> ir3_shader -> ir3_shader_variant -> ir3 |- ... \- ir3_shader_variant -> ir3 So the ir3_shader becomes the topmost generation neutral object, which manages the set of variants each of which generates, compiles, and assembles it's own ir. There is a bit of additional renaming to s/fd3_compiler/ir3_compiler/, etc. Keep the split between the gallium level stateobj and the shader helper object because it might be a good idea to pre-compute some generation specific register values (ie. anything that is independent of linking). Signed-off-by: Rob Clark <robclark@freedesktop.org>
/external/mesa3d/src/gallium/drivers/freedreno/ir3/ir3.h
|