4ea3bf8ebb56c8db6e885a77d81502a0b2adca4f |
|
10-Jun-2016 |
Juan A. Suarez Romero <jasuarez@igalia.com> |
i965/vec4: handle 32 and 64 bit channels in liveness analysis Our current data flow analysis does not take into account that channels on 64-bit operands are 64-bit. This is a problem when the same register is accessed using both 64-bit and 32-bit channels. This is very common in operations where we need to access 64-bit data in 32-bit chunks, such as the double packing and packing operations. This patch changes the analysis by checking the bits that each source or destination datatype needs. Actually, rather than bits, we use blocks of 32bits, which is the minimum channel size. Because a vgrf can contain a dvec4 (256 bits), we reserve 8 32-bit blocks to map the channels. v2 (Curro): - Simplify code by making the var_from_reg helpers take an extra argument with the register component we want. - Fix a couple of cases where we had to update the code to the new way of representing live variables. v3: - Fix indent in multiline expressions (Matt) - Fix comment's closing tag (Matt) - Use DIV_ROUND_UP(inst->size_written, 16) instead of 2 * regs_written(inst) to avoid rounding issues. The same for regs_read(i). (Curro). - Add asserts in var_from_reg() to avoid exceeding the allocated registers (Curro). Reviewed-by: Francisco Jerez <currojerez@riseup.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_live_variables.h
|
fba020e5af49d9d9a2c6e4d4b79115ed1e74a127 |
|
01-Sep-2016 |
Francisco Jerez <currojerez@riseup.net> |
i965/vec4: Replace dst/src_reg::reg_offset with dst/src_reg::offset expressed in bytes. The dst/src_reg::offset field in byte units introduced in the previous patch is a more straightforward alternative to an offset representation split between ::reg_offset and ::subreg_offset fields. The split representation makes it too easy to forget about one of the offsets while dealing with the other, which has led to multiple FS back-end bugs in the past. To make the matter worse the unit reg_offset was expressed in was rather inconsistent, for uniforms it would be expressed in either 4B or 16B units depending on the back-end, and for most other things it would be expressed in 32B units. This encodes reg_offset as a new offset field expressed consistently in byte units. Each rvalue reference of reg_offset in existing code like 'x = r.reg_offset' is rewritten to 'x = r.offset / reg_unit', and each lvalue reference like 'r.reg_offset = x' is rewritten to 'r.offset = r.offset % reg_unit + x * reg_unit'. Because the change affects a lot of places and is rather non-trivial to verify due to the inconsistent value of reg_unit, I've tried to avoid making any additional changes other than applying the rewrite rule above in order to keep the patch as simple as possible, sometimes at the cost of introducing obvious stupidity (e.g. algebraic expressions that could be simplified given some knowledge of the context) -- I'll clean those up later on in a second pass. v2: Fix division by the wrong reg_unit in the UNIFORM case of convert_to_hw_regs(). (Iago) Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_live_variables.h
|
b163aa01487ab5f9b22c48b7badc5d65999c4985 |
|
27-Oct-2015 |
Matt Turner <mattst88@gmail.com> |
i965: Rename GRF to VGRF. The 2-bit hardware register file field is ARF, GRF, MRF, IMM. Rename GRF to VGRF (virtual GRF) so that we can reuse the GRF name to mean an assigned general purpose register. Reviewed-by: Emil Velikov <emil.velikov@collabora.co.uk> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_live_variables.h
|
7638e75cf99263c1ee8e31c6cc5a319feec2c943 |
|
26-Oct-2015 |
Matt Turner <mattst88@gmail.com> |
i965: Use brw_reg's nr field to store register number. In addition to combining another field, we get replace silliness like "reg.reg" with something that actually makes sense, "reg.nr"; and no one will ever wonder again why dst.reg isn't a dst_reg. Moving the now 16-bit nr field to a 16-bit boundary decreases code size by about 3k. Reviewed-by: Emil Velikov <emil.velikov@collabora.co.uk> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_live_variables.h
|
2babde35b9a38a0561a87dc2d7cb431e9aabbd5a |
|
23-Mar-2015 |
Francisco Jerez <currojerez@riseup.net> |
i965/vec4: Calculate live intervals with subregister granularity. Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_live_variables.h
|
eddb87402ea7ce68357a3d93b0dbb41857be27f6 |
|
18-Mar-2015 |
Francisco Jerez <currojerez@riseup.net> |
i965/vec4: Define helper functions to convert a register to a variable index. Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_live_variables.h
|
bf6eb37e0b62fa61c01a32dc5ccb6a7ab00be5f4 |
|
18-Mar-2015 |
Francisco Jerez <currojerez@riseup.net> |
i965/vec4: Remove dependency of vec4_live_variables on the visitor. Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_live_variables.h
|
b53d035825ef3ad680470aa5c4f9dc51f8f5676b |
|
12-Feb-2015 |
Eric Anholt <eric@anholt.net> |
util: Move Mesa's bitset.h to util/. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_live_variables.h
|
7a5cc789def94af7e5c364cce7b0884eee2bcc6b |
|
03-Nov-2014 |
Matt Turner <mattst88@gmail.com> |
i965/vec4: Track liveness of the flag register. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_live_variables.h
|
13f660158573846d6b1bc30ed4c61d97405bea58 |
|
29-Oct-2014 |
Matt Turner <mattst88@gmail.com> |
i965: Use local pointer to block_data in live intervals. The next patch will be simplified because of this, and makes reading the code a lot easier. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_live_variables.h
|
ec1b2d6aa075c678f0eb0405be64253450f995a1 |
|
29-Jun-2014 |
Matt Turner <mattst88@gmail.com> |
i965: Mark fields in the live interval classes protected. cfg, for instance, is a pointer to a local variable in calculate_live_intervals, certainly not valid after that function has returned. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_live_variables.h
|
415d6dc5bd6915b0c17a1df0f9bd0ef4ca534a81 |
|
21-Oct-2013 |
Eric Anholt <eric@anholt.net> |
i965/vec4: Reduce working set size of live variables computation. Orbital Explorer was generating a 4000 instruction geometry shader, which was taking 275 trips through dead code elimination and register coalescing, each of which updated live variables to get its work done, and invalidated those live variables afterwards. By using bitfields instead of bools (reducing the working set size by a factor of 8) in live variables analysis, it drops from 88% of the profile to 57%, and reduces overall runtime from I-got-bored-and-killed-it (Paul says 3+ minutes) to 10.5 seconds. Compare to f179f419d1d0a03fad36c2b0a58e8b853bae6118 on the FS side. Reviewed-by: Paul Berry <stereotype441@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_live_variables.h
|
23e8673afb56f10f8301fa86e4c2cdfd864eaaf7 |
|
21-Sep-2013 |
Francisco Jerez <currojerez@riseup.net> |
i965: Switch vec4_live_variables to the non-zeroing allocator. All member variables of vec4_live_variables are already being initialized from its constructor, it's not necessary to use rzalloc to allocate its memory, and doing so makes it more likely that we will start relying on the allocator to zero out all memory if the class is ever extended with new member variables. That's bad because it ties objects to some specific allocation scheme, and gives unpredictable results when an object is created with a different allocator -- Stack allocation, array allocation, or aggregation inside a different object are some of the useful possibilities that come to my mind. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_live_variables.h
|
4f1ebb8ddd0294698601a8c4fc38f1e39bfd51f6 |
|
18-Sep-2013 |
Kenneth Graunke <kenneth@whitecape.org> |
i965, mesa: Use the new DECLARE_R[Z]ALLOC_CXX_OPERATORS macros. These classes declared a placement new operator, but didn't declare a delete operator. Switching to the macro gives them a delete operator, which probably is a good idea anyway. This also eliminates a lot of boilerplate. v2: Properly use RZALLOC in Mesa IR/TGSI translators. Caught by Eric and Chad. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Chad Versace <chad.versace@linux.intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_live_variables.h
|
20ebebac5153affcbd44350332678a2fb04d4c96 |
|
03-Oct-2012 |
Eric Anholt <eric@anholt.net> |
i965/vs: Improve live interval calculation. This is derived from the FS visitor code for the same, but tracks each channel separately (otherwise, some typical fill-a-channel-at-a-time patterns would produce excessive live intervals across loops and cause spilling). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=48375 (crash -> failure, can turn into pass by forcing unrolling still)
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_live_variables.h
|