f51a5b51ab92ada4b9f3b1d603f9de60b66e46ce |
|
06-Jul-2016 |
Juan A. Suarez Romero <jasuarez@igalia.com> |
i965/vec4: emit correctly load_inputs for 64bit data For dvec3 and dvec4 types, a single GRF does not have enough space to allocate two inputs from two different vertices (SIMD4x2). So the GRF only contains the first two components for the two vertices, and the next GRF has the remaining components. We want to put all the components for the same vertex in the same register. Thus, we do a shuffle to reorder the data. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
|
301fdfd8387856ea83c0ac0bff95915c0872c2f4 |
|
07-Dec-2016 |
Samuel Iglesias Gonsálvez <siglesias@igalia.com> |
vec4: use DIM instruction when loading DF immediates in HSW Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
|
b76f2206f550c37835d4e19eea1588caa0211b85 |
|
01-Jul-2016 |
Iago Toral Quiroga <itoral@igalia.com> |
i965/vec4: fix store output for 64-bit types We need to shuffle the data before it is written to the URB. Also, dvec3/4 need two vec4 slots. v2: use byte_offset() instead of offset(). Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
|
8eea41e75d86bfe9bef5f69b25ad797da236a008 |
|
12-Feb-2016 |
Iago Toral Quiroga <itoral@igalia.com> |
i965/vec4: Fix SSBO stores for 64-bit data In this case we need to shuffle the 64-bit data before we write it to memory, source from reg_offset + 1 to write components Z and W and consider that each DF channel is twice as big. v2: use byte_offset() instead of offset(). Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
|
9998d55afd179ad5019d3841e4c3255a02fd2d7b |
|
13-Jul-2016 |
Iago Toral Quiroga <itoral@igalia.com> |
i965/vec4: Fix SSBO loads for 64-bit data Same requirements as for UBO loads. v2: - use byte_offset() instead of offset() (Iago) - keep the const. offset as an immediate like the original code did (Juan) Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
|
4486c90aaeb08f424ce17f842f46d24d1ceaadcb |
|
13-Jul-2016 |
Iago Toral Quiroga <itoral@igalia.com> |
i965/vec4: Fix UBO loads for 64-bit data We need to emit 2 32-bit load messages to load a full dvec4. If only 1 or 2 double components are needed dead-code-elimination will remove the second one. We also need to shuffle the result of the 32-bit messages to form valid 64-bit SIMD4x2 data. v2: - use byte_offset() instead of offset() (Iago) - keep the const. offset as an immediate like the original code did (Juan) Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
|
d8e123cc5d66022069f3aee53318bfd1075bcc53 |
|
22-Jun-2016 |
Iago Toral Quiroga <itoral@igalia.com> |
i965/vec4: Add a shuffle_64bit_data helper SIMD4x2 64bit data is stored in register space like this: r0.0:DF x0 y0 z0 w0 r1.0:DF x1 y1 z1 w1 When we need to write data such as this to memory using 32-bit write messages we need to shuffle it in this fashion: r0.0:DF x0 y0 x1 y1 r0.1:DF z0 w0 z1 w1 and emit two 32-bit write messages, one for r0.0 at base_offset and another one for r0.1 at base_offset+16. We also need to do the inverse operation when we read using 32-bit messages to produce valid SIMD4x2 64bit data from the data read. We can achieve this by applying the exact same shuffling to the data read, although we need to apply different channel enables since the layout of the data is reversed. This helper implements the data shuffling logic and we will use it in various places where we read and write 64bit data from/to memory. v2 (Curro): - Use the writemask helper and don't assert on the original writemask being XYZW. - Use the Vec4 IR builder to simplify the implementation. v3 (Iago): - Use byte_offset() instead of offset(). v3: - Fix typo (Matt) - Clarify the example and fix indention (Matt). Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
|
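The shuffle described in the entry above can be modelled on the CPU as a plain permutation of eight 64-bit channels. This is only an illustrative sketch: `Regs` and `shuffle_64bit_model` are hypothetical names, not the Mesa helper, and channel enables are not modelled at all.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Eight 64-bit channels: two registers holding four doubles each.
// Index 0..3 is the first register, index 4..7 the second.
using Regs = std::array<double, 8>;

// Hypothetical model of the shuffle (not the Mesa implementation):
//   in : x0 y0 z0 w0 | x1 y1 z1 w1
//   out: x0 y0 x1 y1 | z0 w0 z1 w1
// It swaps the upper half of the first register with the lower half
// of the second one.
Regs shuffle_64bit_model(const Regs &in)
{
    Regs out = in;
    for (std::size_t i = 0; i < 2; ++i) {
        out[2 + i] = in[4 + i];
        out[4 + i] = in[2 + i];
    }
    return out;
}
```

Because the permutation only swaps two pairs of channels, applying it twice restores the original layout, which matches the remark that the same shuffling serves for both reads and writes.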
82e9dda8bf8875d232840585f48763c7a7092918 |
|
08-Jun-2016 |
Iago Toral Quiroga <itoral@igalia.com> |
i965/vec4/nir: do not emit 64-bit MAD RepCtrl=1 does not work with 64-bit operands so we need to use RepCtrl=0. In that situation, the regioning generated for the sources seems to be equivalent to <4,4,1>:DF, so it will only work for components XY, which means that we have to move any other swizzle to a temporary so that we can source from channel X (or Y) in MAD and we also need to split the instruction (we are already scalarizing DF instructions but there is room for improvement and with MAD we would be more restricted in that area). Also, it seems that MAD operations like this only write proper output for channels X and Y, so writes to Z and W also need to be done to a temporary using channels X/Y and then move that to channels Z or W of the actual dst. As a result the code we produce for native 64-bit MAD instructions is rather bad, and much worse than just emitting MUL+ADD. For reference, a simple case of a fully scalarized dvec4 MAD operation requires 15 instructions if we use native MAD and 8 instructions if we emit ADD+MUL instead. There are some improvements that we can do to the emission of MAD that might bring the instruction count down in some cases, but it comes at the expense of a more complex implementation so it does not seem worth it, at least initially. This patch makes translation of NIR's 64-bit ffma instructions produce MUL+ADD instead of MAD. Currently, there is nothing else in the vec4 backend that emits MAD instructions, so this is sufficient and it helps optimization passes see MUL+ADD from the get go. Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
|
b58026b31e7258a4bd2bb630a1d41a433fb01799 |
|
07-Jul-2016 |
Samuel Iglesias Gonsálvez <siglesias@igalia.com> |
i965/vec4: use the new helper function to create double immediates Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
|
98da3623d5dfd991362c4fd3571325fe0277a2f9 |
|
09-Mar-2016 |
Iago Toral Quiroga <itoral@igalia.com> |
i965/vec4: add a helper function to create double immediates Gen7 hardware does not support double immediates so these need to be moved in 32-bit chunks to a regular vgrf instead. Instead of doing this every time we need to create a DF immediate, create a helper function that does the right thing depending on the hardware generation. v2 (Curro): - Use swizzle() and writemask() helpers and make tmp const. v3 (Iago): - Adapt to changes in offset() Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
|
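Moving a double immediate "in 32-bit chunks", as the entry above puts it, amounts to splitting its 64-bit pattern into two 32-bit words. A minimal sketch of that split on the CPU follows; `split_double` and `join_double` are hypothetical names, and how the words are then written to a register is not shown.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <utility>

static_assert(sizeof(double) == sizeof(std::uint64_t),
              "this sketch assumes a 64-bit double");

// Hypothetical helper: return the (low, high) 32-bit words of a double.
std::pair<std::uint32_t, std::uint32_t> split_double(double value)
{
    std::uint64_t bits;
    std::memcpy(&bits, &value, sizeof bits);  // well-defined type punning
    return { static_cast<std::uint32_t>(bits),
             static_cast<std::uint32_t>(bits >> 32) };
}

// Hypothetical inverse: rebuild the double from its two 32-bit words.
double join_double(std::uint32_t low, std::uint32_t high)
{
    const std::uint64_t bits = (static_cast<std::uint64_t>(high) << 32) | low;
    double value;
    std::memcpy(&value, &bits, sizeof value);
    return value;
}
```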
8f9ce5fa22c04b5b34aa6dc67e4a9b2d151d293d |
|
18-Feb-2016 |
Iago Toral Quiroga <itoral@igalia.com> |
i965/vec4: fix optimize predicate for doubles Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
|
1816ae8f68e395da26dcfea2539bafd715c8dbc4 |
|
05-Feb-2016 |
Iago Toral Quiroga <itoral@igalia.com> |
i965/vec4: implement fsign() for doubles v2: use a MOV with a conditional_mod instead of a CMP, like we do in d2b, to skip loading a double immediate. v3: Fix comment (Matt) Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
|
6e570619e0755a50b2c8d57c6d1189fb9aca899d |
|
17-Feb-2016 |
Iago Toral Quiroga <itoral@igalia.com> |
i965/vec4: implement d2b v2 (Curro): - Generate the flag register with a MOV with conditional_mod instead of a CMP instruction, which has the benefit that we can skip loading a DF 0.0 constant. - Avoid the PICK_LOW_32BIT + MOV by using the flag result and a SEL to set the boolean result. v3: - Fix comment (Matt) Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
|
c1fb525016e41658d2dc5d581da4e83b8a075fd4 |
|
17-Feb-2016 |
Iago Toral Quiroga <itoral@igalia.com> |
i965/vec4: implement d2i, d2u, i2d and u2d Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
|
4b2257623494ea8e7a1c7b6fbb2f4f3e59522468 |
|
29-Jun-2016 |
Iago Toral Quiroga <itoral@igalia.com> |
i965/vec4: implement HW workaround for align16 double to float conversion From the BDW PRM, Workarounds chapter: "DF->f format conversion for Align16 has wrong emask calculation when source is immediate." Notice that Broadwell and later are strictly scalar at the moment though, so this is not really necessary. v2: Instead of moving the immediate to a vgrf and converting from there, just convert the double immediate to float in the compiler and move the result to the destination (Matt) Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
|
bfc1f0f017db6bd11a558237c9a4ebeacf73f5ba |
|
29-Jun-2016 |
Iago Toral Quiroga <itoral@igalia.com> |
i965/vec4: add helpers for conversions to/from doubles Use these helpers to implement d2f and f2d. We will reuse these helpers when we implement things like d2i or i2d as well. v2: - Rename the helpers (Matt) Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
|
c722a8e61ebc72d7d21c2bed0f623218d739fdb7 |
|
17-Feb-2016 |
Iago Toral Quiroga <itoral@igalia.com> |
i965/vec4: Rename DF to/from F generator opcodes The opcodes are not specific for conversions to/from float since we need the same for conversions to/from other 32-bit types. Rename the opcodes accordingly and change the asserts to check the size of the types involved instead. v2: - Rename to VEC4_OPCODE_TO_DOUBLE and VEC4_OPCODE_FROM_DOUBLE (Matt) Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
|
619271ec8785ab8b6021d0f49e98c51d457eab4d |
|
15-Feb-2016 |
Iago Toral Quiroga <itoral@igalia.com> |
i965/vec4: fix register allocation for 64-bit undef sources Reviewed-by: Francisco Jerez <currojerez@riseup.net> Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
|
a8318b120e53518ae4d933acd876b8dbd3871e0c |
|
12-Feb-2016 |
Iago Toral Quiroga <itoral@igalia.com> |
i965/vec4: fix get_nir_dest() to use DF type for 64-bit destinations v2: Make dst_reg_for_nir_reg() handle this for nir_register since we want to have the correct type set before we call offset(). Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
|
bb0e67d55dbd353e9c57b0709fa3e534f1aba05f |
|
05-Oct-2016 |
Iago Toral Quiroga <itoral@igalia.com> |
i965/vec4: fix indentation in get_nir_src() Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
|
8cdbbbd2cf9e0c42114c7090805fa2b4a93ca499 |
|
14-Aug-2015 |
Iago Toral Quiroga <itoral@igalia.com> |
i965/vec4/nir: implement double comparisons v2: - Added newline before if() (Matt) Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
|
8a3ba033397bc627e499fcd3a379984ba4d587d2 |
|
01-Jun-2016 |
Iago Toral Quiroga <itoral@igalia.com> |
i965/vec4: implement double packing Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
|
94cfdf586a6a95bd06b989bba27d85f9bf99b9df |
|
01-Jun-2016 |
Iago Toral Quiroga <itoral@igalia.com> |
i965/vec4: implement double unpacking Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
|
4c040332f56ca2e5a4bbd8c412fd32ab3ff821db |
|
10-Nov-2015 |
Iago Toral Quiroga <itoral@igalia.com> |
i965/vec4: We only support 32-bit integer ALU operations for now Add asserts so we remember to address this when we enable 64-bit integer support, as suggested by Connor and Jason. Reviewed-by: Francisco Jerez <currojerez@riseup.net> Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
|
e09a6be3b6806c582347f6faf93cc2d824d98ed2 |
|
14-Aug-2015 |
Iago Toral Quiroga <itoral@igalia.com> |
i965/vec4: translate d2f/f2d Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
|
9ce4b20bde4f4ca8e8907fcac13e8bb9d7e5f4b4 |
|
14-Aug-2015 |
Iago Toral Quiroga <itoral@igalia.com> |
i965/vec4/nir: fix emitting 64-bit immediates Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
|
3457252b74f5490cff8915cac1e5fe0bf1031f5b |
|
13-Aug-2015 |
Connor Abbott <connor.w.abbott@intel.com> |
i965/vec4/nir: set the right type for 64-bit registers Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
|
fef06f635610ddc730a213576e59afb638c6051d |
|
25-May-2016 |
Iago Toral Quiroga <itoral@igalia.com> |
i965/vec4/nir: support doubles in ALU operations Basically, this involves considering the bit-size information to set the appropriate type on both operands and destination. v2 (Curro) - Don't use two temporaries (and write one of them twice) to obtain the nir_alu_type. Reviewed-by: Francisco Jerez <currojerez@riseup.net> Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
|
0f096b1e5a5e31a5efba7279326ec8bc8478bb56 |
|
02-Nov-2015 |
Iago Toral Quiroga <itoral@igalia.com> |
i965/vec4/nir: Add bit-size information to types Reviewed-by: Francisco Jerez <currojerez@riseup.net> Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
|
2d81a292036445c440e56d07ce3d5294e0411d71 |
|
29-Feb-2016 |
Connor Abbott <connor.w.abbott@intel.com> |
i965/vec4/nir: allocate two registers for dvec3/dvec4 v2 (Curro): - Do not special-case for a bit-size of 64, divide the bit_size by 32 instead. - Use DIV_ROUND_UP so we can handle sub-32-bit types. v3 (Ian): - Make num_regs const. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
|
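The v2 note in the entry above (divide the bit size by 32, rounding up) can be sketched as follows. The function names are hypothetical and the real expression in the driver may include other factors, such as array length; this only shows why rounding up handles both 64-bit and sub-32-bit types.

```cpp
#include <cassert>

// Round-up integer division, in the spirit of a DIV_ROUND_UP macro.
constexpr unsigned div_round_up(unsigned a, unsigned b)
{
    return (a + b - 1) / b;
}

// Hypothetical: number of 32-bit-wide vec4 registers per value of the
// given bit size. 64-bit types need two, anything up to 32 bits needs one.
constexpr unsigned regs_for_bit_size(unsigned bit_size)
{
    return div_round_up(bit_size, 32);
}
```

Plain division would yield zero registers for a 16-bit type, which is the case the round-up variant avoids.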
54913850aa379f57fcbf7a2dec5ea236cf997646 |
|
10-Aug-2015 |
Connor Abbott <connor.w.abbott@intel.com> |
i965/vec4/nir: simplify glsl_type_for_nir_alu_type() Less duplication, one less case to handle for doubles and support for sized NIR types. v2: Fix call to get_instance by swapping rows and columns params (Iago) Signed-off-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net> Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
|
fd249c803e3ae2acb83f5e3b7152728e73228b7b |
|
12-Dec-2016 |
Ilia Mirkin <imirkin@alum.mit.edu> |
treewide: s/comparitor/comparator/ git grep -l comparitor | xargs sed -i 's/comparitor/comparator/g' Just happened to notice this in a patch that was sent and included one of the tokens in question. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
|
4f2d1d6ea713df8f8d816b48b9e99c7117cf36d7 |
|
28-Nov-2016 |
Ilia Mirkin <imirkin@alum.mit.edu> |
i965: support constant gather offsets larger than 4 bits Offsets that don't fit into 4 bits need to force gather_po to be selected. Adjust the logic so that this happens. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
|
f182e5eafc31ebc7c140e9a369d5f747948733ae |
|
17-Oct-2016 |
Kenneth Graunke <kenneth@whitecape.org> |
i965/vec4: Handle component qualifiers on non-generic varyings. ARB_enhanced_layouts only requires component qualifier support for generic varyings, so this is all the vec4 backend knew how to handle. This patch extends the backend to handle it for all varyings, so we can use store_output intrinsics with a component set for things like clip/cull distances. We may want to use that for other VUE header fields in the future as well. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
|
66fcfa6894ab61a8cb70955f4a4113729e4a8099 |
|
03-Oct-2016 |
Iago Toral Quiroga <itoral@igalia.com> |
i965/vec4: make offset() work in terms of a simd width and scalar components So that it has the same semantics as the scalar backend implementation. The helper will now take a simd width (which is always 8 in vec4 mode) and step as many scalar components as specified by that width, respecting the size of the scalar channels. v2 (Curro): - Remove the assertion in offset(), byte_offset() has the same checks. - Use byte_offset() directly instead of add_byte_offset(). - Make things more clear by explicitly including the vertical stride in the byte offset expression. Reviewed-by: Francisco Jerez <currojerez@riseup.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
|
e1af20f18a86f52a9640faf2d4ff8a71b0a4fa9b |
|
13-Oct-2016 |
Timothy Arceri <timothy.arceri@collabora.com> |
nir/i965/anv/radv/gallium: make shader info a pointer When restoring something from shader cache we won't have and don't want to create a nir_shader, so this change detaches the two. There are other advantages such as being able to reuse the shader info populated by GLSL IR. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
|
89e1436e2d4ff0c15202708979eb36761cae4167 |
|
11-Oct-2016 |
Ian Romanick <ian.d.romanick@intel.com> |
i965: Silence unused parameter warnings brw_link.cpp:76:44: warning: unused parameter ‘shader_type’ [-Wunused-parameter] gl_shader_stage shader_type, ^ brw_nir.c: In function ‘brw_nir_lower_vs_inputs’: brw_nir.c:194:55: warning: unused parameter ‘devinfo’ [-Wunused-parameter] const struct gen_device_info *devinfo, ^ brw_vec4_visitor.cpp:914:37: warning: unused parameter ‘sampler’ [-Wunused-parameter] uint32_t sampler, ^ brw_vec4_visitor.cpp:1146:34: warning: unused parameter ‘stream_id’ [-Wunused-parameter] vec4_visitor::gs_emit_vertex(int stream_id) ^ Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Engestrom <eric@engestrom.ch>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
|
40dd45d0c6aa4a9d727c09225967e9c3b1f45854 |
|
30-Jun-2016 |
Ian Romanick <ian.d.romanick@intel.com> |
i965: Enable ARB_shader_atomic_counter_ops Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
|
3d2011cb33317b0fe9b8fe989916efc1841c6ce0 |
|
30-Jun-2016 |
Ian Romanick <ian.d.romanick@intel.com> |
i965: Refactor emission of atomic counter operations This will make it easier to add more operations. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
|
69fdf13c215c2970feaca76f178a5c2c11ba8fec |
|
03-Sep-2016 |
Francisco Jerez <currojerez@riseup.net> |
i965/vec4: Replace vec4_instruction::regs_written with ::size_written field in bytes. The previous regs_written field can be recovered by rewriting each rvalue reference of regs_written like 'x = i.regs_written' to 'x = DIV_ROUND_UP(i.size_written, reg_unit)', and each lvalue reference like 'i.regs_written = x' to 'i.size_written = x * reg_unit'. For the same reason as in the previous patches, this doesn't attempt to be particularly clever about simplifying the result in the interest of keeping the rather lengthy patch as obvious as possible. I'll come back later to clean up any ugliness introduced here. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
|
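The rewrite rule quoted in the entry above can be written out as two small conversions. This is a sketch with hypothetical function names; the value passed as `reg_unit` (32 bytes in the checks one might try) is an assumption, not taken from the log.

```cpp
#include <cassert>

// Old-style register count recovered from a size in bytes:
//   x = DIV_ROUND_UP(i.size_written, reg_unit)
constexpr unsigned regs_from_size(unsigned size_written, unsigned reg_unit)
{
    return (size_written + reg_unit - 1) / reg_unit;
}

// Size in bytes corresponding to an old-style register count:
//   i.size_written = x * reg_unit
constexpr unsigned size_from_regs(unsigned regs_written, unsigned reg_unit)
{
    return regs_written * reg_unit;
}
```

A partially written register still counts as one whole register on the way back, which is why the read direction rounds up.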
fba020e5af49d9d9a2c6e4d4b79115ed1e74a127 |
|
01-Sep-2016 |
Francisco Jerez <currojerez@riseup.net> |
i965/vec4: Replace dst/src_reg::reg_offset with dst/src_reg::offset expressed in bytes. The dst/src_reg::offset field in byte units introduced in the previous patch is a more straightforward alternative to an offset representation split between ::reg_offset and ::subreg_offset fields. The split representation makes it too easy to forget about one of the offsets while dealing with the other, which has led to multiple FS back-end bugs in the past. To make the matter worse the unit reg_offset was expressed in was rather inconsistent, for uniforms it would be expressed in either 4B or 16B units depending on the back-end, and for most other things it would be expressed in 32B units. This encodes reg_offset as a new offset field expressed consistently in byte units. Each rvalue reference of reg_offset in existing code like 'x = r.reg_offset' is rewritten to 'x = r.offset / reg_unit', and each lvalue reference like 'r.reg_offset = x' is rewritten to 'r.offset = r.offset % reg_unit + x * reg_unit'. Because the change affects a lot of places and is rather non-trivial to verify due to the inconsistent value of reg_unit, I've tried to avoid making any additional changes other than applying the rewrite rule above in order to keep the patch as simple as possible, sometimes at the cost of introducing obvious stupidity (e.g. algebraic expressions that could be simplified given some knowledge of the context) -- I'll clean those up later on in a second pass. v2: Fix division by the wrong reg_unit in the UNIFORM case of convert_to_hw_regs(). (Iago) Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
|
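The two rewrite rules from the entry above, for reading and writing the old `reg_offset` through a single byte offset, can be sketched like this. `RegModel` and the accessor names are hypothetical, and `reg_unit` is passed in because the log notes that its value differed between register files.

```cpp
#include <cassert>

// Hypothetical stand-in for a register reference with a byte offset.
struct RegModel {
    unsigned offset;  // in bytes
};

// Read rule: x = r.offset / reg_unit
unsigned get_reg_offset(const RegModel &r, unsigned reg_unit)
{
    assert(reg_unit != 0);
    return r.offset / reg_unit;
}

// Write rule: r.offset = r.offset % reg_unit + x * reg_unit
// The remainder keeps the sub-register part of the offset intact.
void set_reg_offset(RegModel &r, unsigned x, unsigned reg_unit)
{
    assert(reg_unit != 0);
    r.offset = r.offset % reg_unit + x * reg_unit;
}
```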
d1b1fca0b7cccff718923f2344ea144dc3ebb869 |
|
22-Jun-2016 |
Timothy Arceri <timothy.arceri@collabora.com> |
i965/vec4: add support for packing vs/gs/tes outputs Here we create a new output_generic_reg array with the ability to store the dst_reg for each component of user defined varyings. This is needed as the previous code only stored the dst_reg based on the varying location which meant packed varyings would overwrite each other. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
|
b427abba0c04214ba6184092eee73fc6377fbff9 |
|
23-Jun-2016 |
Timothy Arceri <timothy.arceri@collabora.com> |
i965/vec4: add support for packing inputs Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
|
96dfed49e47eac7afc100e5b8d3b316dd6652fb6 |
|
19-Jul-2016 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965: Stop munging cube array lengths by 6 From the Sky Lake PRM: "For SURFTYPE_CUBE: For Sampling Engine Surfaces and Typed Data Port Surfaces, the range of this field is [0,340], indicating the number of cube array elements (equal to the number of underlying 2D array elements divided by 6). For other surfaces, this field must be zero." In other words, the depth field for cube maps is in number of cubes not number of 2-D slices so we need to divide by 6. ISL will do this correctly for us assuming that we provide it with the correct array bounds which it expects to be in 2-D slices. It appears as if we've been doing this wrong ever since we first added cube map arrays for Sandy Bridge and the change to ISL made things slightly worse. While we're at it, we now need to remove the shader hacks we've always done since they were only needed because we were setting the depth field six times too large. v2: Fix the vec4 backend as well (not sure how I missed this). Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Chris Forbes <chrisforbes@google.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
|
3e7cebc8da5c9f16fa1b9a25ea72b8d31c86a440 |
|
22-Jun-2016 |
Ian Romanick <ian.d.romanick@intel.com> |
i965: Use LZD to implement nir_op_find_lsb on Gen < 7 v2: Rebase on changes to previous two patches. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
|
c2019c6c261d5c46a4e5d3edc88836bcedf75f30 |
|
22-Jun-2016 |
Ian Romanick <ian.d.romanick@intel.com> |
i965: Use LZD to implement nir_op_ifind_msb on Gen < 7 v2: Retype LZD source as UD to avoid potential problems with 0x80000000. Suggested by Matt. Also update comment about problem values with LZD(abs(x)). Suggested by Curro. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
|
de20086eed47e6bfe7c25835d72383114f99c7a9 |
|
22-Jun-2016 |
Ian Romanick <ian.d.romanick@intel.com> |
i965: Use LZD to implement nir_op_ufind_msb This uses one less instruction. v2: Move emit_find_msb_using_lzd out of the visitor classes. Suggested by Curro. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
|
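The idea behind using a leading-zero count for the unsigned find-MSB operation can be sketched in software. Both functions below are hypothetical models, not driver code, and the sketch assumes that the leading-zero count of zero is 32.

```cpp
#include <cassert>
#include <cstdint>

// Software stand-in for an LZD-style leading-zero count.
// Assumption: the result is 32 when no bit is set.
int lzd_model(std::uint32_t x)
{
    int count = 0;
    for (std::uint32_t mask = 0x80000000u;
         mask != 0 && (x & mask) == 0; mask >>= 1)
        ++count;
    return count;
}

// Index of the most significant set bit, counted from the LSB side.
// With the assumption above, an input of zero gives 31 - 32 = -1.
int ufind_msb_model(std::uint32_t x)
{
    return 31 - lzd_model(x);
}
```

Under that assumption no separate zero check is needed, since the subtraction already produces -1 for an input with no bits set.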
fb5dcb81cc121e4355b7eef014474a5c42a2f6db |
|
19-May-2016 |
Matt Turner <mattst88@gmail.com> |
i965: Pass nir_src/nir_dest by reference. Cuts 6K of .text. text data bss dec hex filename 5772372 264648 29320 6066340 5c90a4 lib/i965_dri.so before 5766074 264648 29320 6060042 5c780a lib/i965_dri.so after Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
|
f687b8e1785df0825443f07778e5d0ddf6f9be09 |
|
13-May-2016 |
Ian Romanick <ian.d.romanick@intel.com> |
i965: Silence unused parameter warnings The only place that actually used the type parameter was the GS visitor, and it was always passed glsl_type::int. Just remove the parameter. brw_vec4_vs_visitor.cpp:38:61: warning: unused parameter ‘type’ [-Wunused-parameter] const glsl_type *type) ^ Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
|
9464d8c49813aba77285e7465b96e92a91ed327c |
|
27-Apr-2016 |
Jason Ekstrand <jason.ekstrand@intel.com> |
nir: Switch the arguments to nir_foreach_function This matches the "foreach x in container" pattern found in many other programming languages. Generated by the following regular expression: s/nir_foreach_function(\([^,]*\),\s*\([^,]*\))/nir_foreach_function(\2, \1)/ Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
|
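The argument swap performed by the sed expression in the entry above can be reproduced with `std::regex`. The pattern below is a translation of the quoted expression into ECMAScript syntax, which is an assumption of this sketch; `swap_foreach_args` is a hypothetical name.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Swap the two arguments of nir_foreach_function(a, b) in a line of text.
// Group 1 and group 2 each match one comma-free argument.
std::string swap_foreach_args(const std::string &line)
{
    static const std::regex pattern(
        R"re(nir_foreach_function\(([^,]*),\s*([^,]*)\))re");
    return std::regex_replace(line, pattern, "nir_foreach_function($2, $1)");
}
```

Lines without a matching call pass through unchanged, so the function can be applied to every line of a file.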
707e72f13bb78869ee95d3286980bf1709cba6cf |
|
27-Apr-2016 |
Jason Ekstrand <jason.ekstrand@intel.com> |
nir: Switch the arguments to nir_foreach_instr This matches the "foreach x in container" pattern found in many other programming languages. Generated by the following regular expression: s/nir_foreach_instr(\([^,]*\),\s*\([^,]*\))/nir_foreach_instr(\2, \1)/ and similar expressions for nir_foreach_instr_safe etc. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
|
7efff10585122d484dc3adab14af9380b9b8f309 |
|
13-Apr-2016 |
Connor Abbott <cwabbott0@gmail.com> |
i965/nir: fixup for new foreach_block() Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
|
b6dc940ec273252678d40707d300851fa1c85ea5 |
|
13-Apr-2016 |
Connor Abbott <cwabbott0@gmail.com> |
nir: rename nir_foreach_block*() to nir_foreach_block*_call() Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
|
b3f43822c72301c904fd2824ae3edcd20ea93a29 |
|
19-Apr-2016 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965/vec4: Use the correct offset for the swizzle shift in push constants This was actually caught by Ken in review the first time around but somehow didn't get fixed before the patches were pushed. :-( Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94998 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95001
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
|
9f16e170fed09821bb1b18a9dbe548f3d26b7977 |
|
19-Apr-2016 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965/vec4: Use nir_intrinsic_base in the load_uniform implementation We shouldn't be reading the const_index directly Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94998 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95001
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
|
0166ad6ced542cacfbbbe45e9d4b7f14af5040de |
|
06-Apr-2016 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965/vec4: Support full std140 layout for push constants Up until now, we have been able to assume that all push constants are vec4-aligned because this is what the GL driver gives us. In Vulkan, we need to be able to support full std140 because we get the layout from the client. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
|
8e76f664beb845f8dca30ca5635f9369618563b0 |
|
09-Dec-2015 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965/vec4: Get rid of the uniform_size array Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
|
056849772f66582fd7e8a181c3fb16955f84243b |
|
25-Nov-2015 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965/vec4: Use MOV_INDIRECT instead of reladdr for indirect push constants This commit moves us to an instruction based model rather than a register-based model for indirects. This is more accurate anyway as we have to emit instructions to resolve the reladdr. It's also a lot simpler because it gets rid of the recursive reladdr problem by design. One side-effect of this is that we need a whole new algorithm in move_uniform_array_access_to_pull_constants. This new algorithm is much more straightforward than the old one and is fairly similar to what we're already doing in the FS backend. Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Acked-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
|
cb372b39ea15729caf8491f4fd9f12c37a2840df |
|
08-Apr-2016 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965/vec4: Use UD rather than D for uniform indirects Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
|
765dd6534937e125b95c7998862b1a4ec76a22d8 |
|
25-Mar-2016 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965: Implement the new imod and irem opcodes Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
|
bfd17c76c1267756ea16051cbe174cb23ff49f44 |
|
08-Apr-2016 |
Kenneth Graunke <kenneth@whitecape.org> |
i965: Port INTEL_PRECISE_TRIG=1 to NIR. This makes the extra multiply visible to NIR's algebraic optimizations (for constant reassociation) as well as constant folding. This means that when the result of sin/cos are multiplied by a constant, we can eliminate the extra multiply altogether, reducing the cost of the workaround. It also means we only have to implement it in one place, rather than in both backends. This makes INTEL_PRECISE_TRIG=1 cost nothing on GPUTest/Volplosion, which has a ton of sin() calls, but always multiplies them by an immediate constant. The extra multiply gets folded away. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eduardo Lima Mitev <elima@igalia.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
|
80c72a8ea7b1018661da0e6509a7f88ca1f5086f |
|
25-Mar-2016 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965/nir: Provide a default LOD for buffer textures Our hardware requires an LOD for all texelFetch commands even if they are on buffer textures. GLSL IR gives us an LOD of 0 in that case, but the LOD is really rather meaningless. This commit allows other NIR producers to be more lazy and not provide one at all. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
|
65fbc43d54403905e3eaea02372b5a364dc1d773 |
|
27-Jan-2016 |
Kenneth Graunke <kenneth@whitecape.org> |
i965: Add an INTEL_PRECISE_TRIG=1 option to fix SIN/COS output range. The SIN and COS instructions on Intel hardware can produce values slightly outside of the [-1.0, 1.0] range for a small set of values. Obviously, this can break everyone's expectations about trig functions. According to an internal presentation, the COS instruction can produce a value up to 1.000027 for inputs in the range (0.08296, 0.09888). One suggested workaround is to multiply by 0.99997, scaling down the amplitude slightly. Apparently this also minimizes the error function, reducing the maximum error from 0.00006 to about 0.00003. When enabled, fixes 16 dEQP precision tests dEQP-GLES31.functional.shaders.builtin_functions.precision.{cos,sin}.{highp,mediump}_compute.{scalar,vec2,vec3,vec4} at the cost of making every sin and cos call more expensive (about twice the number of cycles on recent hardware). Enabling this option has been shown to reduce GPUTest Volplosion performance by about 10%. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
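The INTEL_PRECISE_TRIG workaround described above amounts to one extra multiply per SIN/COS result. The following is a minimal sketch of that arithmetic only, not the driver code: the names kPreciseTrigScale and apply_precise_trig are hypothetical, and the raw hardware result is modelled as a plain float argument.

```cpp
#include <cassert>
#include <cmath>

// Scale factor quoted in the commit message.
constexpr float kPreciseTrigScale = 0.99997f;

// Hypothetical helper: takes a raw SIN/COS result (which the commit says can
// reach about 1.000027) and scales the amplitude down slightly.
inline float apply_precise_trig(float hw_result) {
    assert(std::isfinite(hw_result));
    return hw_result * kPreciseTrigScale;
}
```

With the worst-case value quoted in the message, 1.000027 * 0.99997 is just under 1.0, which is the property the dEQP precision tests check.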
|
14c46954c910efb1db94a068a866c7259deaa9d9 |
|
25-Mar-2016 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965: Add an implementation of nir_op_fquantize2f16 Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
|
084b24f5582567ebf5aa94b7f40ae3bdcb71316b |
|
16-Mar-2016 |
Iago Toral Quiroga <itoral@igalia.com> |
nir: rename nir_const_value fields to include bitsize information Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
|
0548844e866e4fe326432116f84fdf7e885fba9f |
|
04-Mar-2016 |
Alejandro Piñeiro <apinheiro@igalia.com> |
i965/vec4/nir: no need to use surface_access:: to call emit_untyped_atomic Now that brw_vec4_visitor::emit_untyped_atomic was removed, there is no need to explicitly set it. Reviewed-by: Francisco Jerez <currojerez@riseup.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
|
d3a89a7c494d577fdf8f45c0d8735004a571e86b |
|
04-Mar-2016 |
Alejandro Piñeiro <apinheiro@igalia.com> |
i965/vec4/nir: remove emit_untyped_surface_read and emit_untyped_atomic at brw_vec4_visitor surface_access emit_untyped_read and emit_untyped_atomic provides the same functionality. v2: surface parameter of emit_untyped_atomic is a const, no need to specify default predicate on emit_untyped_atomic, use retype (Francisco Jerez). Reviewed-by: Francisco Jerez <currojerez@riseup.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
|
24994ae926629ac8521df3cab4a02eb81de15907 |
|
17-Feb-2016 |
Kenneth Graunke <kenneth@whitecape.org> |
i965: Push most TES inputs in vec4 mode. (This is commit 4a1c8a3037cd29938b2a6e2c680c341e9903cfbe for vec4 mode.) Using the push model for inputs is much more efficient than pulling inputs - the hardware can simply copy a large chunk into URB registers at thread creation time, rather than having the thread send messages to request data from the L3 cache. Unfortunately, it's possible to have more TES inputs than fit in registers, so we have to fall back to the pull model in some cases. However, it turns out that most tessellation evaluation shaders are fairly simple, and don't use many inputs. An arbitrary cut-off of 24 vec4 slots (12 registers) should suffice. (I chose this instead of the 32 vec4 slots used in the scalar backend to avoid regressing a few Piglit tests due to the vec4 register allocator being too stupid to figure out what to do. We probably ought to fix that, but it's a separate issue.) Improves performance in GPUTest's tessmark_x64 microbenchmark by 41.5394% +/- 0.288519% (n = 115) at 1024x768 on my Clevo W740SU (with Iris Pro 5200). Improves performance in Synmark's Gl40TerrainFlyTess microbenchmark by 38.3576% +/- 0.759748% (n = 42). v2: Simplify abs/negate handling, as requested by Matt. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
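The push/pull cut-off described above can be sketched as a simple threshold check. This is an illustration under assumptions, not the backend's code: it assumes the decision is made per vec4 slot, and the helper names are hypothetical. Only the numbers (24 vec4 slots, 12 registers, hence two slots per register) come from the commit message.

```cpp
#include <algorithm>
#include <cassert>

// Cut-off from the commit message: 24 vec4 slots, i.e. 12 registers.
constexpr unsigned kMaxPushedTesInputSlots = 24;
constexpr unsigned kVec4SlotsPerRegister = 2;

// Hypothetical: an input is pushed if its slot lies below the cut-off,
// otherwise the shader has to pull it with a message.
inline bool tes_input_is_pushed(unsigned vec4_slot) {
    return vec4_slot < kMaxPushedTesInputSlots;
}

// Hypothetical: registers occupied by pushed inputs when the shader reads
// `slots_read` consecutive vec4 slots starting at slot 0.
inline unsigned pushed_register_count(unsigned slots_read) {
    const unsigned pushed = std::min(slots_read, kMaxPushedTesInputSlots);
    const unsigned regs = (pushed + kVec4SlotsPerRegister - 1) / kVec4SlotsPerRegister;
    assert(regs <= kMaxPushedTesInputSlots / kVec4SlotsPerRegister);
    return regs;
}
```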
|
8750299a420af76cebd3067f6f603eacde06ae06 |
|
09-Feb-2016 |
Jason Ekstrand <jason.ekstrand@intel.com> |
nir: Remove the const_offset from nir_tex_instr When NIR was originally drafted, there was no easy way to determine if something was constant or not. The result was that we had lots of special-casing for constant values such as this. Now that load_const instructions are SSA-only, it's really easy to find constants and this isn't really needed anymore. Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Reviewed-by: Rob Clark <robclark@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
|
d03e5d52557ce6523eb65dfec9172d6000f5ff8d |
|
03-Nov-2015 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965/vec4: Plumb separate surfaces and samplers through from NIR Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
|
5ec456375e4fdd0b6c7d797f99191044e19ead74 |
|
03-Nov-2015 |
Jason Ekstrand <jason.ekstrand@intel.com> |
nir: Separate texture from sampler in nir_tex_instr This commit adds the capability to NIR to support separate textures and samplers. As it currently stands, glsl_to_nir only sets the texture deref and leaves the sampler deref alone as it did before and nir_lower_samplers assumes this. Backends can still assume that they are combined and only look at the texture index. Or, if they wish, they can assume that they are separate because nir_lower_samplers, tgsi_to_nir, and prog_to_nir all set both texture and sampler index whenever a sampler is required (the two indices are the same in this case). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
|
ee85014b90af1d94d637ec763a803479e9bac5dc |
|
06-Feb-2016 |
Jason Ekstrand <jason.ekstrand@intel.com> |
nir/tex_instr: Rename sampler to texture We're about to separate the two concepts. When we do, the sampler will become optional. Doing a rename first makes the separation a bit more safe because drivers that depend on GLSL or TGSI behaviour will be fine to just use the texture index all the time. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
|
f4952421cddaa79498da2b7658f48dc008e489e1 |
|
25-Jan-2016 |
Matt Turner <mattst88@gmail.com> |
i965/vec4: Implement nir_op_pack_uvec2_to_uint. And mark nir_op_pack_uvec4_to_uint unreachable, since it's only produced by lowering pack[SU]norm4x8 which the vec4 backend does not need. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
|
866a6bf9f70625517d6d2c17be9523b9f035f1db |
|
19-Jan-2016 |
Matt Turner <mattst88@gmail.com> |
i965/vec4: Spaces around operators.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
|
0a6811207fbe18d49c7ab95f93ed01f75ffcdda0 |
|
14-Jan-2016 |
Jason Ekstrand <jason@jlekstrand.net> |
i965/vec4: Use UW type for multiply into accumulator on GEN8+ BDW adds the following restriction: "When multiplying DW x DW, the dst cannot be accumulator." Cc: "11.1,11.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
|
b82e26a6a4d6baf121f44c61c862bfa79ba0d172 |
|
13-Jan-2016 |
Matt Turner <mattst88@gmail.com> |
nir: Lower bitfield_extract. The OpenGL specification for bitfieldExtract() says: The result will be undefined if <offset> or <bits> is negative, or if the sum of <offset> and <bits> is greater than the number of bits used to store the operand. Therefore passing bits=32, offset=0 is legal and defined in GLSL. But the earlier SM5 ubfe/ibfe opcodes are specified to accept a bitfield width ranging from 0-31. As such, Intel and AMD instructions read only the low 5 bits of the width operand, making them not able to implement the GLSL-specified behavior directly. This commit adds ubfe/ibfe operations from SM5 and a lowering pass for bitfield_extract to handle the trivial case of <bits> = 32 as bitfieldExtract: bits > 31 ? value : bfe(value, offset, bits) Fixes: ES31-CTS.shader_bitfield_operation.bitfieldExtract.uvec3_0 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92595 Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Tested-by: Marta Lofstedt <marta.lofstedt@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
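The lowering quoted above (bits > 31 ? value : bfe(value, offset, bits)) can be tried out with a small software model. This is a sketch, not the NIR pass: ubfe here is a hand-written emulation of an SM5-style unsigned extract that reads only the low 5 bits of its width, and masking the offset the same way is an assumption of the model.

```cpp
#include <cassert>
#include <cstdint>

// Software model of an SM5-style ubfe: only the low 5 bits of the width
// (and, as an assumption of this model, of the offset) are read.
inline uint32_t ubfe(uint32_t value, uint32_t offset, uint32_t bits) {
    const uint32_t width = bits & 31u;
    const uint32_t off = offset & 31u;
    if (width == 0)
        return 0;
    if (width + off < 32)
        return (value << (32 - (width + off))) >> (32 - width);
    return value >> off;
}

// GLSL-style bitfieldExtract() built on top of ubfe, using the select from
// the commit message to cover the bits == 32 case.
inline uint32_t bitfield_extract(uint32_t value, uint32_t offset, uint32_t bits) {
    assert(offset + bits <= 32);  // outside this range GLSL leaves the result undefined
    return bits > 31 ? value : ubfe(value, offset, bits);
}
```

Without the select, a width of 32 wraps to 0 in the 5-bit field and the extract returns 0 instead of the whole value.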
|
b85a229e1f542426b1c8000569d89cd4768b9339 |
|
08-Jan-2016 |
Kenneth Graunke <kenneth@whitecape.org> |
glsl: Delete the ir_binop_bfm and ir_triop_bfi opcodes. TGSI doesn't use these - it just translates ir_quadop_bitfield_insert directly. NIR can handle ir_quadop_bitfield_insert as well. These opcodes were only used for i965, and with Jason's recent patches, we can do this lowering in NIR (which also gains us SPIR-V handling). So there's not much point to retaining this GLSL IR lowering code. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
|
cddfc2cefa93b884c40329dcb193fe4fb22143ab |
|
10-Dec-2015 |
Kristian Høgsberg Kristensen <krh@bitplanet.net> |
i965: Add support for gl_DrawIDARB and enable extension We have to break open a new vec4 for gl_DrawIDARB. We've used up all space in the vec4 we use for SGVS and gl_DrawIDARB has to come from its own separate vertex buffer anyway. This is because we point the vb for base vertex and base instance into the draw parameter BO for indirect draw calls, but the draw id is generated by mesa in a different buffer. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
|
17ebb55a14b5a9aa639845fbda9330ef9421834a |
|
10-Dec-2015 |
Kristian Høgsberg Kristensen <krh@bitplanet.net> |
i965: Add support for gl_BaseVertexARB and gl_BaseInstanceARB We already have gl_BaseVertexARB in the .x component of the SGVS vec4 and plug gl_BaseInstanceARB into the last free component (.y). Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
|
237f2f2d8b45d9d956102eec6f9be63193e5269b |
|
26-Dec-2015 |
Jason Ekstrand <jason.ekstrand@intel.com> |
nir: Get rid of function overloads When Connor originally drafted NIR, he copied the same function+overload system that GLSL IR had with a few names changed. However, this double-indirection is not really needed and has only served to confuse people. Instead, let's just have functions which may not have unique names and may or may not have an implementation. If someone wants to do overload resolving, they can have a hash table based function+overload system in the overload resolving pass. There's no good reason to keep it in core NIR. Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org> ir3 bits are Reviewed-by: Rob Clark <robclark@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
|
24be658d13b13fdb8a1977208038b4ba43bce4ac |
|
17-Nov-2015 |
Kenneth Graunke <kenneth@whitecape.org> |
i965: Add tessellation control shaders. The TCS is the first tessellation shader stage, and the most complicated. It has access to each of the control points in the input patch, and computes a new output patch. There is one logical invocation per output control point; all invocations run in parallel, and can communicate by reading and writing output variables. One of the main responsibilities of the TCS is to write the special gl_TessLevelOuter[] and gl_TessLevelInner[] output variables which control how much new geometry the hardware tessellation engine will produce. Otherwise, it simply writes outputs that are passed along to the TES. We run in SIMD4x2 mode, handling two logical invocations per EU thread. The hardware doesn't properly manage the dispatch mask for us; it always initializes it to 0xFF. We wrap the whole program in an IF..ENDIF block to handle an odd number of invocations, essentially falling back to SIMD4x1 on the last thread. v2: Update comments (requested by Jordan Justen). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
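The odd-invocation handling described above can be illustrated with a little index arithmetic. This is a hedged sketch of the idea only: the function names are hypothetical, and the actual backend expresses the check as an IF..ENDIF around the generated program rather than as a C++ predicate.

```cpp
#include <cassert>

// SIMD4x2: each EU thread handles two logical TCS invocations.
constexpr unsigned kInvocationsPerThread = 2;

// Hypothetical: number of EU threads needed for a patch with
// `output_vertices` output control points.
inline unsigned tcs_thread_count(unsigned output_vertices) {
    return (output_vertices + kInvocationsPerThread - 1) / kInvocationsPerThread;
}

// Hypothetical: the condition the wrapping IF would test. `slot` is 0 or 1,
// selecting one of the two invocations in a thread.
inline bool invocation_is_live(unsigned thread, unsigned slot, unsigned output_vertices) {
    assert(slot < kInvocationsPerThread);
    return thread * kInvocationsPerThread + slot < output_vertices;
}
```

For three output vertices, two threads are dispatched and the second slot of the last thread is switched off, which is the SIMD4x1 fallback the message mentions.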
|
bb9eb599335ec4ac3a2a579359fb239f16de17e8 |
|
26-Nov-2015 |
Matt Turner <mattst88@gmail.com> |
i965/vec4: Optimize predicate handling for any/all. For a select whose condition is any(v), instead of emitting cmp.nz.f0(8) null<1>D g1<0,4,1>D 0D mov(8) g7<1>.xUD 0x00000000UD (+f0.any4h) mov(8) g7<1>.xUD 0xffffffffUD cmp.nz.f0(8) null<1>D g7<4,4,1>.xD 0D (+f0) sel(8) g8<1>UD g4<4,4,1>UD g3<4,4,1>UD we now emit cmp.nz.f0(8) null<1>D g1<0,4,1>D 0D (+f0.any4h) sel(8) g9<1>UD g4<4,4,1>UD g3<4,4,1>UD Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
|
c8a74e3a4ea6ac5dfa35adac06af14a8fa4ff773 |
|
30-Nov-2015 |
Matt Turner <mattst88@gmail.com> |
nir: Delete bany, ball, fany, fall. As in the previous patches, these can be implemented as any(v) -> any_nequal(v, false) all(v) -> all_equal(v, true) and their removal simplifies the code in the next patch. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
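The two rewrites named above, any(v) -> any_nequal(v, false) and all(v) -> all_equal(v, true), can be checked with a small model on four-component boolean vectors. The bvec4 alias and the lowered_any/lowered_all names are illustrative only and do not correspond to NIR opcodes.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

using bvec4 = std::array<bool, 4>;

// True if any component of a differs from the matching component of b.
inline bool any_nequal(const bvec4& a, const bvec4& b) {
    for (std::size_t i = 0; i < a.size(); ++i) {
        if (a[i] != b[i])
            return true;
    }
    return false;
}

// True if every component of a equals the matching component of b.
inline bool all_equal(const bvec4& a, const bvec4& b) {
    return !any_nequal(a, b);
}

// any(v) expressed as a comparison against an all-false vector.
inline bool lowered_any(const bvec4& v) {
    return any_nequal(v, bvec4{false, false, false, false});
}

// all(v) expressed as a comparison against an all-true vector.
inline bool lowered_all(const bvec4& v) {
    return all_equal(v, bvec4{true, true, true, true});
}
```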
|
78b81be627734ea7fa50ea246c07b0d4a3a1638a |
|
25-Nov-2015 |
Jason Ekstrand <jason.ekstrand@intel.com> |
nir: Get rid of *_indirect variants of input/output load/store intrinsics There is some special-casing needed in a competent back-end. However, they can do their special-casing easily enough based on whether or not the offset is a constant. In the mean time, having the *_indirect variants adds special cases in a number of places where they don't need to be and, in general, only complicates things. To complicate matters, NIR had no way to convert an indirect load/store to a direct one in the case that the indirect was a constant so we would still not really get what the back-ends wanted. The best solution seems to be to get rid of the *_indirect variants entirely. This commit is a bunch of different changes squashed together: - nir: Get rid of *_indirect variants of input/output load/store intrinsics - nir/glsl: Stop handling UBO/SSBO load/stores differently depending on indirect - nir/lower_io: Get rid of load/store_foo_indirect - i965/fs: Get rid of load/store_foo_indirect - i965/vec4: Get rid of load/store_foo_indirect - tgsi_to_nir: Get rid of load/store_foo_indirect - ir3/nir: Use the new unified io intrinsics - vc4: Do all uniform loads with byte offsets - vc4/nir: Use the new unified io intrinsics - vc4: Fix load_user_clip_plane crash - vc4: add missing src for store outputs - vc4: Fix state uniforms - nir/lower_clip: Update to the new load/store intrinsics - nir/lower_two_sided_color: Update to the new load intrinsic NIR and i965 changes are Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> NIR indirect declarations and vc4 changes are Reviewed-by: Eric Anholt <eric@anholt.net> ir3 changes are Reviewed-by: Rob Clark <robdclark@gmail.com> NIR changes are Acked-by: Rob Clark <robdclark@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
|
18069dce4a4c3d71e6afc6b10bfa7bee0560ba9c |
|
11-Nov-2015 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965: Make uniform offsets be in terms of bytes This commit makes uniform offsets be in terms of bytes starting with nir_lower_io. They get converted to be in terms of vec4s or floats when we cram them in the UNIFORM register file but reladdr remains in terms of bytes all the way down to the point where we lower it to a pull constant load. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
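The unit conversion mentioned above (byte offsets converted to vec4 or float units for the UNIFORM file) is plain division. A minimal sketch, assuming 32-bit floats and four-component vec4s; the helper names are hypothetical and are not functions from the driver.

```cpp
#include <cassert>

constexpr unsigned kBytesPerFloat = 4;                   // 32-bit float
constexpr unsigned kBytesPerVec4 = 4 * kBytesPerFloat;   // four floats

// Hypothetical: byte offset -> index in vec4 units.
inline unsigned uniform_vec4_index(unsigned byte_offset) {
    assert(byte_offset % kBytesPerVec4 == 0);
    return byte_offset / kBytesPerVec4;
}

// Hypothetical: byte offset -> index in float units.
inline unsigned uniform_float_index(unsigned byte_offset) {
    assert(byte_offset % kBytesPerFloat == 0);
    return byte_offset / kBytesPerFloat;
}
```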
|
05bdc21f84edc200a0b0a695b79d12f25cc00645 |
|
02-Nov-2015 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965/vec4: Use a stride of 1 and byte offsets for UBOs Cc: "11.0" <mesa-stable@lists.freedesktop.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92909 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
|
e3e70698c3cfa7e9acccd6eddfb37516c45d5ac2 |
|
24-Nov-2015 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965/vec4: Use byte offsets for UBO pulls on Sandy Bridge Previously, the VS_OPCODE_PULL_CONSTANT_LOAD opcode operated on vec4-aligned byte offsets on Iron Lake and below and worked in terms of vec4 offsets on Sandy Bridge. On Ivy Bridge, we add a new *LOAD_GEN7 variant which works in terms of vec4s. We're about to change the GEN7 version to work in terms of bytes, so this is a nice unification. Cc: "11.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
|
b715e6d52832a0761ccec5c1252e7458499bf752 |
|
26-Nov-2015 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965/vec4: Stop pretending to support indirect output stores Since we're using nir_lower_outputs_to_temporaries to shadow all our outputs, it's impossible to actually get an indirect store. The code we had to "handle" this was pretty bogus as it created a register with a reladdr and then stuffed it in a fixed varying slot without so much as a MOV. Not only does this not do the MOV, it also puts the indirect on the wrong side of the transaction. Let's just delete the broken dead code. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
|
aa35b0c2c71f054f72df5a85779d0862fa7d6e4a |
|
25-Nov-2015 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965/vec4: Get rid of the nir_inputs array It's not really buying us anything at this point. It's just a way of remapping one offset namespace onto another. We can just use the location namespace the whole way through. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
|
f36993b46962eab4446bc1964eb47149751aee26 |
|
23-Nov-2015 |
Matt Turner <mattst88@gmail.com> |
i965: Clean up #includes in the compiler. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
|
ecac1aab538d65f0867fd93e23d0d020c1a5d0f1 |
|
23-Nov-2015 |
Matt Turner <mattst88@gmail.com> |
i965: Push down inclusion of brw_program.h. We were including it in headers, which then caused it to be included in tons of places it wasn't needed. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
|
d9b8fde963a53d4e06570d8bece97f806714507a |
|
12-Nov-2015 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965: Use NIR for lowering texture swizzle Now that nir_lower_tex can do texture swizzle lowering, we can use that instead of repeating more-or-less the same code in both backends. This both allows us to share code and means that things like the tg4 work-arounds are somewhat simpler because they don't have to take the swizzle into account. Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
|
f58813842bcece3498f55ec5d582466ccff92a5e |
|
15-May-2015 |
Jason Ekstrand <jason.ekstrand@intel.com> |
nir: s/nir_type_unsigned/nir_type_uint v2: do the same in tgsi_to_nir (Samuel) v3: added missing cases after rebase (Iago) v4: Add a blank space after '#' in one of the comments (Matt) Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
|
0684aed8abc51308945ead050d2452b522937c0a |
|
20-Nov-2015 |
Matt Turner <mattst88@gmail.com> |
i965/vec4: Initialize nir_inputs with src_reg(). nir_locals, nir_ssa_values, and nir_system_values are all dst_reg (not that that makes a whole lot of sense to me), and only nir_inputs is a src_reg. Reviewed-by: Francisco Jerez <currojerez@riseup.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
|
99840eb983f74cd447546f7205c8c9f505ef82c8 |
|
18-Nov-2015 |
Ian Romanick <ian.d.romanick@intel.com> |
i965: Enable EXT_shader_samples_identical On the vec4 backend, textureSamplesIdentical() will always return false. There are currently no test cases for the vec4 backend, so we don't have much confidence in any implementation. We also don't think anyone is likely to miss it. v2: Handle immediate value for MCS smarter. Rebase on changes to nir_texop_samples_identical (missing second parameter). Suggested by Jason. v3: Add Neil's code to handle 16x MSAA in the FS. Also rebase on top of f9a9ba5e. Stub out the vec4 implementation. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Signed-off-by: Neil Roberts <neil@linux.intel.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> [v2] Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> [v2]
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
|
84b6c64efc52948da8db89b8d92d5e744e6cfc95 |
|
18-Nov-2015 |
Ian Romanick <ian.d.romanick@intel.com> |
i965/vec4: Handle nir_tex_src_ms_index more like the scalar v2: Rebase on top of f9a9ba5e. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
|
457bb290efc162ea3c7c51a820ab7cf88a4efb8d |
|
18-Nov-2015 |
Ian Romanick <ian.d.romanick@intel.com> |
nir: Add nir_texop_samples_identical opcode This is the NIR analog to GLSL IR ir_samples_identical. v2: Don't add the second nir_tex_src_ms_index parameter. Suggested by Ken and Jason. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
|
f9a9ba5eac2f1934bd7fecc92cd309f22411164b |
|
02-Nov-2015 |
Matt Turner <mattst88@gmail.com> |
i965/vec4: Replace src_reg(imm) constructors with brw_imm_*(). Cuts 1.5k of .text. Reviewed-by: Emil Velikov <emil.velikov@collabora.co.uk> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
|
b163aa01487ab5f9b22c48b7badc5d65999c4985 |
|
27-Oct-2015 |
Matt Turner <mattst88@gmail.com> |
i965: Rename GRF to VGRF. The 2-bit hardware register file field is ARF, GRF, MRF, IMM. Rename GRF to VGRF (virtual GRF) so that we can reuse the GRF name to mean an assigned general purpose register. Reviewed-by: Emil Velikov <emil.velikov@collabora.co.uk> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
|
dba309fc14d1ca99251c8f8115d2a26ac86f14f6 |
|
30-Oct-2015 |
Matt Turner <mattst88@gmail.com> |
i965: Initialize registers. The test (file == BAD_FILE) works on registers for which the constructor has not run because BAD_FILE is zero. The next commit will move BAD_FILE in the enum so that it's no longer zero. In the case of this->outputs, the constructor was being run implicitly, and we were unnecessarily memsetting it to zero. Reviewed-by: Emil Velikov <emil.velikov@collabora.co.uk> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
|
eca4c43a33c5c1bb63c8aa9d0506ed2ba3f9d8cb |
|
30-Oct-2015 |
Iago Toral Quiroga <itoral@igalia.com> |
i965/vec4: Do not mark used surfaces in VS_OPCODE_GET_BUFFER_SIZE Do it in the visitor, like we do for other opcodes. v2: use const, get rid of useless surf_index temporary (Curro) Reviewed-by: Francisco Jerez <currojerez@riseup.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
|
6105d1d0a02c7eea83b327965713be3bada306f7 |
|
30-Oct-2015 |
Iago Toral Quiroga <itoral@igalia.com> |
i965/vec4: Do not mark used direct surfaces in VS_OPCODE_PULL_CONSTANT_LOAD Right now the generator marks direct surfaces as used but leaves marking of indirect surfaces to the caller. Just make the callers handle marking in both cases for consistency. v2: Use const, do not add unnecessary temporary (Curro) Reviewed-by: Francisco Jerez <currojerez@riseup.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
|
56774e63028b2997a7d8c0abb5009a4c79f9a453 |
|
20-Oct-2015 |
Alejandro Piñeiro <apinheiro@igalia.com> |
i965/vec4: select predicate based on writemask for sel emissions Equivalent to commit 8ac3b525c but with sel operations. In this case we select the PredCtrl based on the writemask. This patch helps on cases like this: 1: cmp.l.f0.0 vgrf40.0.x:F, vgrf0.zzzz:F, vgrf7.xxxx:F 2: cmp.nz.f0.0 null:D, vgrf40.xxxx:D, 0D 3: (+f0.0) sel vgrf41.0.x:UD, vgrf6.xxxx:UD, vgrf5.xxxx:UD In this case, cmod propagation can't optimize instruction #2, because instructions #1 and #2 have different writemasks, and we can't update directly instruction #2 writemask because our code thinks that sel at instruction #3 reads all four channels of the flag, when it actually only reads .x. So, with this patch, the previous case becomes this: 1: cmp.l.f0.0 vgrf40.0.x:F, vgrf0.zzzz:F, vgrf7.xxxx:F 2: cmp.nz.f0.0 null:D, vgrf40.xxxx:D, 0D 3: (+f0.0.x) sel vgrf41.0.x:UD, vgrf6.xxxx:UD, vgrf5.xxxx:UD Now only the x channel of the flag is used, allowing dead code elimination to update the writemask at the second instruction: 1: cmp.l.f0.0 vgrf40.0.x:F, vgrf0.zzzz:F, vgrf7.xxxx:F 2: cmp.nz.f0.0 null.x:D, vgrf40.xxxx:D, 0D 3: (+f0.0.x) sel vgrf41.0.x:UD, vgrf6.xxxx:UD, vgrf5.xxxx:UD So now cmod propagation can simplify out #2: 1: cmp.l.f0.0 vgrf40.0.x:F, attr18.wwww:F, vgrf7.xxxx:F 2: (+f0.0.x) sel vgrf41.0.x:UD, vgrf6.xxxx:UD, vgrf5.xxxx:UD Shader-db numbers: total instructions in shared programs: 6235835 -> 6228008 (-0.13%) instructions in affected programs: 219850 -> 212023 (-3.56%) total loops in shared programs: 1979 -> 1979 (0.00%) helped: 1192 HURT: 0
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
|
c22d62f5991f1c26c58c9ae1891202ea437d2f7b |
|
26-Oct-2015 |
Matt Turner <mattst88@gmail.com> |
i965/vec4: Clean up FBH code. It did a bunch of unnecessary stuff, emitting an extra MOV included. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
|
d9b09f8a306dfd471e45b5294c3adcb119114387 |
|
26-Oct-2015 |
Matt Turner <mattst88@gmail.com> |
i965/vec4: Don't disable channels in any/all comparisons. We've made a mistake in calling the Channel Enable bits "writemask", because they do more than control which channels of the destination are written -- they actually control which channels are enabled (surprise! surprise!) So, if we emit cmp.z.f0(8) null.xy<1>D g10<4,4,1>.xyzzD g2<0,4,1>.xyzzD mov(8) g12<1>.xUD 0x00000000UD (+f0.all4h) mov(8) g12<1>.xUD 0xffffffffUD where the CMP instruction has only .xy channel enables, it won't write the .zw channels of the flag register, which are of course read by the +f0.all4 predicate. We need to always emit CMP instructions whose flag result might be read by such a predicate with all channels enabled. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
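The hazard described above (a CMP with only .xy enabled leaving stale .zw flag bits that an all4h predicate then reads) can be reproduced with a four-bit model of the flag register. This is a simplified sketch: one bit per channel, a single vertex, and hypothetical function names; it is not how the hardware or the backend represents flags.

```cpp
#include <cassert>
#include <cstdint>

// Bit i of every mask corresponds to channel i (x = bit 0 ... w = bit 3).
constexpr uint8_t kAllChannels = 0xF;

// Model of a CMP flag update: only enabled channels are written; disabled
// channels keep whatever value the flag held before.
inline uint8_t cmp_write_flag(uint8_t old_flag, uint8_t results, uint8_t channel_enables) {
    const uint8_t kept = static_cast<uint8_t>(old_flag & ~channel_enables & kAllChannels);
    const uint8_t written = static_cast<uint8_t>(results & channel_enables & kAllChannels);
    return static_cast<uint8_t>(kept | written);
}

// Model of the all4h predicate: true only if all four flag channels are set.
inline bool all4h(uint8_t flag) {
    assert(flag <= kAllChannels);
    return flag == kAllChannels;
}
```

In this model, a comparison that passes on every channel but is emitted with only .xy enabled yields a false all4h when the old flag was clear, and a stale flag of 0xC can make the predicate true regardless of what .zw would have compared to.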
|
4379ca22f18f5731248ee794ab651db721ba38b2 |
|
07-Oct-2015 |
Emil Velikov <emil.velikov@collabora.com> |
i965: Implement nir_intrinsic_shader_clock v2: - Add a few const qualifiers for good measure. - Drop unneeded retype()s (Matt) - Convert timestamp to SIMD8/16, as fs_visitor::get_timestamp() returns SIMD4 (Connor) v3: - Remove unneeded temporary + MOV (Connor) Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
|
8ac3b525c77cb5aae9e61bd984b78f6cbbffdc1c |
|
09-Oct-2015 |
Alejandro Piñeiro <apinheiro@igalia.com> |
i965/vec4: nir_emit_if doesn't need to predicate based on all the channels v2: changed comment, as suggested by Matt Turner Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
|
f534f331ca354bcb138e2b8f6d6d80147ee4a186 |
|
15-Oct-2015 |
Iago Toral Quiroga <itoral@igalia.com> |
i965/vec4: Use the right number of UBOs Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
|
d3f45888045c84b2bc382a34d169a0ede4774a24 |
|
09-Oct-2015 |
Iago Toral Quiroga <itoral@igalia.com> |
i965: Adapt SSBOs to work with their own separate index space Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
|
baee16bf02eedc6a32381d79da6c7ac942f782ae |
|
28-Sep-2015 |
Iago Toral Quiroga <itoral@igalia.com> |
nir: split SSBO min/max atomic intrinsics into signed/unsigned versions NIR is typeless so this is the only way to keep track of the type to select the proper atomic to use. v2: - Use imin,imax,umin,umax for the intrinsic names (Connor Abbott) - Change message for unreachable paths (Michael Schellenberger) Tested-by: Markus Wick <markus@selfnet.de> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
|
4642d53a03122e6d3214ed12cb327898917eb84e |
|
09-Oct-2015 |
Matt Turner <mattst88@gmail.com> |
i965/vec4: Implement b2f and b2i using negation. Curro added this in commit 3ee2daf23d (before the vec4/NIR backend was added) but it was missed in the new NIR backend. Add it there as well. instructions in affected programs: 1857 -> 1810 (-2.53%) helped: 15 Reviewed-by: Francisco Jerez <currojerez@riseup.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
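The negation trick named above can be sketched in scalar form. The sketch assumes the boolean representation used by this backend is 0 for false and ~0 (that is, -1 as a signed 32-bit integer) for true; that representation is an assumption of the example, not something the commit message states, and b2i/b2f here are plain C++ functions rather than the NIR opcodes.

```cpp
#include <cassert>
#include <cstdint>

// Assumed boolean encoding: false = 0, true = ~0 (-1 as int32_t).
constexpr int32_t kFalse = 0;
constexpr int32_t kTrue = -1;

// bool -> int: negating -1 gives 1, negating 0 gives 0.
inline int32_t b2i(int32_t b) {
    assert(b == kFalse || b == kTrue);
    return -b;
}

// bool -> float: negate, then convert the 0/1 integer to float.
inline float b2f(int32_t b) {
    assert(b == kFalse || b == kTrue);
    return static_cast<float>(-b);
}
```

Since negation is a source modifier on this hardware, folding it into the move avoids the separate AND with 1 that a mask-based conversion would need, which is one plausible reading of the instruction-count reduction reported in the message.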
|
1e3c1b107e075b210998998423901092b8fcd79b |
|
03-Oct-2015 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965: Use nir_foreach_variable Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
|
bf7b6fd3fd6d98305d64ee6224ca9f9e7ba48444 |
|
02-Oct-2015 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965/shader: Get rid of the shader, prog, and shader_prog fields Unfortunately, we can't get rid of them entirely. The FS backend still needs gl_program for handling TEXTURE_RECTANGLE. The GS vec4 backend still needs gl_shader_program for handling transform feedback. However, the VS needs neither and we can substantially reduce the amount they are used. One day we will be free from their tyranny. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
|
ca6a436f12cb55e9415049a217229c99b02ad3b8 |
|
02-Oct-2015 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965/vec4: Use nir info instead of pulling things out of [shader_]prog Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
|
7926c3ea7d8f455cbee390d20c78dadf5432b9bc |
|
01-Oct-2015 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965/backend_shader: Add a field to store the NIR shader Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
|
30c63571133ed50907ec14172c2f3ef82ee8a34e |
|
01-Oct-2015 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965: Move prog_data uniform setup to the codegen level As of now, uniform setup is more-or-less unified between vec4 and fs and no longer requires the fs_visitor. This makes uniform setup more of a language/API thing than a backend compiler thing. This commit moves setting up the stage_prog_data.params arrays to the same place as we set up the rest of stage_prog_data. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
|
cdf314cb21377ee7caca05bd1abab6a2b921d213 |
|
01-Oct-2015 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965/nir: Simplify uniform setup Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
|
7fee8b6f055831bc070bb36d02a8b1c4d601652a |
|
02-Oct-2015 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965/nir: Pull GLSL uniform handling into a common function The way we deal with GLSL uniforms and builtins is basically the same in both the vec4 and the fs backend. This commit takes the best parts of both implementations and pulls the common code into a shared helper function. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
|
03c4171b577b06b1d8dde50b6eb9507d8ef4c1ce |
|
29-Sep-2015 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965/nir: Pull common ARB program uniform handling into a common function The way we deal with ARB program uniforms is basically the same in both the vec4 and the fs backend. This commit takes the best parts of both implementations and pulls the common code into a shared helper function. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
|
390b48fc4a9836b563560581fbfb4833546de0c8 |
|
30-Sep-2015 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965/vec4: Use the uniform count from nir_assign_var_locations Previously, we were counting up uniforms as we set them up. However, this count should be exactly identical to shader->num_uniforms provided by nir_assign_var_locations. (If it's not, we're in trouble anyway because that means that locations don't match up.) This matches what the fs backend is already doing. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
|
5609e0d7b41e861a3359991e8d0f2053b255fc31 |
|
30-Sep-2015 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965/vec4: Get rid of the uniform_vector_size array The uniform_vector_size array was only ever used by pack_uniform_registers which no longer needs it. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
|
023165a734b3bae52a449ad01bc1ea5ba4384ec1 |
|
15-Sep-2015 |
Samuel Iglesias Gonsalvez <siglesias@igalia.com> |
i965/vec4/nir: add nir_intrinsic_memory_barrier support Fix OpenGL ES 3.1 conformance tests: advanced-readWrite-case1-vsfs and advanced-matrix-vsfs. v2: - Fix SHADER_OPCODE_MEMORY_FENCE emission and the allocation of 'tmp' (Francisco). Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com> Tested-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
|
6668eb5a451c43ac78a784711cf239fdf7ca75ef |
|
11-Sep-2015 |
Samuel Iglesias Gonsalvez <siglesias@igalia.com> |
mesa: rename gl_shader_program's NumUniformBlocks to NumBufferInterfaceBlocks Because it counts shader storage blocks too. v2: - Use NumBufferInterfaceBlocks instead (Jordan). Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
|
5ef169034c77ede86546d8dc42f7f22abcd6faa0 |
|
07-Aug-2015 |
Iago Toral Quiroga <itoral@igalia.com> |
i965/nir/vec4: Implement nir_intrinsic_ssbo_atomic_* Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
|
e3f9c7829c609e8a32da9f36c9829843f2204a37 |
|
10-Sep-2015 |
Iago Toral Quiroga <itoral@igalia.com> |
i965/nir/vec4: Implement nir_intrinsic_load_ssbo Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
|
922b3d1bb16b4b6b292cb159e5fe3d0615ca725c |
|
10-Sep-2015 |
Iago Toral Quiroga <itoral@igalia.com> |
i965/nir/vec4: Implement nir_intrinsic_store_ssbo Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
|
65d7f5fe9f4284f7de867b4c412f086c6dcca176 |
|
26-Aug-2015 |
Samuel Iglesias Gonsalvez <siglesias@igalia.com> |
i965/vec4/nir: implement nir_intrinsic_get_buffer_size Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
|
2e5423ad6345e027bb40c75ffc0e9e64843b9c05 |
|
23-Sep-2015 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965/vec4: Add support for fdph_replicated Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
|
6ba291db4ba4f03ac94560eaae861bc162ac838e |
|
18-Sep-2015 |
Eduardo Lima Mitev <elima@igalia.com> |
i965/vec4/nir: Remove all "this->" snippets For consistency, either we have all class members dereferenced, or none. In this case, very few are so let's get rid of them all. Reviewed-by: Timothy Arceri <t_arceri@yahoo.com.au> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
|
a88ce0c1c4c1f77209b71d5a6858f952642f385a |
|
10-Sep-2015 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965/vec4: Use the replicated fdot instruction in NIR Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
|
c951bb83056724df02ba7e6fe2dfa720c0f45c1f |
|
09-Sep-2015 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965/vec4_nir: Use partial SSA form rather than full non-SSA We made this switch in the FS backend some time ago and it seems to make a number of things a bit easier. In particular, supporting SSA values takes very little work in the backend and allows us to take advantage of the majority of the SSA information even after we've gotten rid of Phi nodes. Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
|
b8d2263c83d29f4626ac0fe0316978aa6262aefb |
|
14-Sep-2015 |
Antia Puentes <apuentes@igalia.com> |
i965/vec4_nir: Load constants as integers Loads constants using integer as their register type, like it is done in FS backend. No shader-db changes in HSW. Cc: "10.6 11.0" <mesa-stable@lists.freedesktop.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91716 Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
|
0b91bcea98c0fe201bba89abe1ca3aee4d04c56c |
|
12-Aug-2015 |
Ilia Mirkin <imirkin@alum.mit.edu> |
i965: add support for textureSamples function Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> [v2: kayden-supplied code in fs_nir replacing need for logical opcode] Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
|
0cc331dddd1a99c7af3619c92c48b5c32e17f6b3 |
|
04-Aug-2015 |
Kenneth Graunke <kenneth@whitecape.org> |
i965/nir: Use nir_system_value_from_intrinsic to reduce duplication. This code is all pretty much identical. We just needed the translation from one enum value to the other. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
|
27e83b62bb52de7a681ed82679a707555023f43d |
|
28-Aug-2015 |
Kenneth Graunke <kenneth@whitecape.org> |
i965: Store a key_tex pointer in vec4_visitor. I'm about to remove the base class for VS/GS/HS/DS program keys, at which point we won't be able to use key->tex anymore. Instead, we'll need to store a direct pointer (like we do in the FS backend). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
|
cfa056c6a5eadf87f92a71346c0dddd2a080e302 |
|
18-Aug-2015 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965/vec4_nir: Get rid of the uniform_driver_location tracking Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
|
0db8e87b4a16b123f7c0b44d54f23b535a136ee6 |
|
18-Aug-2015 |
Jason Ekstrand <jason.ekstrand@intel.com> |
nir/intrinsics: Add a second const index to load_uniform In the i965 backend, we want to be able to "pull apart" the uniforms and push some of them into the shader through a different path. In order to do this effectively, we need to know which variable is actually being referred to by a given uniform load. Previously, it was completely flattened by nir_lower_io which made things difficult. This adds more information to the intrinsic to make this easier for us. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
|
640c472fd075814972b1276c5b0ed3a769aacda5 |
|
12-Aug-2015 |
Kenneth Graunke <kenneth@whitecape.org> |
i965: Move type_size() methods out of visitor classes. I want to use C function pointers to these, and they don't use anything in the visitor classes anyway. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
|
2450cbfcbc3671056afad9e858acadbb6edea068 |
|
12-Aug-2015 |
Matt Turner <mattst88@gmail.com> |
i965/vec4/nir: Emit single MOV to generate a scalar constant. If an immediate is written to multiple channels, we can load it in a single writemasked MOV. total instructions in shared programs: 6285144 -> 6261991 (-0.37%) instructions in affected programs: 718991 -> 695838 (-3.22%) helped: 5762 Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
|
c1d9b3ae0bb0f1222719d7737dd9986e437bf5b9 |
|
04-Aug-2015 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965/vec4_nir: Properly handle integer multiplies on BDW+ Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
|
1d658cf8795383dbef127e46f3740b516bfe21b9 |
|
03-Aug-2015 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965/vec4_nir: Do boolean source modifier resolves on BDW+ On BDW+, the negation source modifier on NOT, AND, OR, and XOR is actually a boolean negate and not an integer negate. However, NIR's source modifiers are the integer version. We have to resolve it with a MOV prior to emitting the actual instruction. This is basically the same thing we do in the FS backend. Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
|
5e1c1c2fcbdfb96a973ae3fd196e341ab2d41833 |
|
03-Aug-2015 |
Jason Ekstrand <jason.ekstrand@intel.com> |
i965/vec4-nir: Handle boolean resolves on ILK- The analysis code was already there and running, we just weren't doing anything with the result of it yet. Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
|
82f2e706bfd646b91bc0b8beecdff4e54b1f7b04 |
|
29-Jun-2015 |
Antia Puentes <apuentes@igalia.com> |
i965/nir/vec4: Handle uniforms on vertex programs The implementation takes into account that on ARB_vertex_program only a single nir variable is generated to support all the uniform data. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
|
19cf934f7f18237e1a212b0a019026d5d36c6fac |
|
06-Jul-2015 |
Alejandro Piñeiro <apinheiro@igalia.com> |
i965/nir/vec4: Add implementation of nir_emit_texture() Uses the nir structure to get all the info needed (sources, dest reg, etc), and then it uses the common vec4_visitor::emit_texture to emit the final code. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
|
583c1c61703826002ba0f202e8ef7bc2c822ef1d |
|
17-Jun-2015 |
Eduardo Lima Mitev <elima@igalia.com> |
i965/nir/vec4: Implement nir_emit_jump This implementation is taken as-is from fs_nir. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
|
9b4a6fa4c09d36e0e5c00309e6ea37300ea38f78 |
|
17-Jun-2015 |
Antia Puentes <apuentes@igalia.com> |
i965/nir/vec4: Mark as unreachable ops that should be already lowered NIR ALU operations: * nir_op_fabs * nir_op_iabs * nir_op_fneg * nir_op_ineg * nir_op_fsat should be lowered by lower_source mods * nir_op_fdiv should be lowered in the compiler by DIV_TO_MUL_RCP. * nir_op_fmod should be lowered in the compiler by MOD_TO_FLOOR. * nir_op_fsub * nir_op_isub should be handled by ir_sub_to_add_neg. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
|
16072834babc487f78472f7e7b59d35249a3aac8 |
|
17-Jun-2015 |
Antia Puentes <apuentes@igalia.com> |
i965/nir/vec4: Implement vector "any" operation Adds NIR ALU operations: * nir_op_bany2 * nir_op_bany3 * nir_op_bany4 Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
|
fa4e3c3c9f6f3a72a032499fccaa6e222d6a7fa4 |
|
17-Jun-2015 |
Antia Puentes <apuentes@igalia.com> |
i965/nir/vec4: Implement the dot product operation Adds NIR ALU operations: * nir_op_fdot2 * nir_op_fdot3 * nir_op_fdot4 Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
|
96106e2a9f214d98fc2e99c65398f95d41a3b879 |
|
17-Jun-2015 |
Antia Puentes <apuentes@igalia.com> |
i965/nir/vec4: Implement conditional select Adds NIR ALU operations: * nir_op_bcsel Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
|
b38fcd0aea8d17919ecd9cc7afc518cfb2c01c27 |
|
17-Jun-2015 |
Antia Puentes <apuentes@igalia.com> |
i965/nir/vec4: Implement linear interpolation Adds NIR ALU operation: * nir_op_flrp Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
|
b64bd1fdc37eed1bb62d2b32ad22f0f77501f7f2 |
|
17-Jun-2015 |
Antia Puentes <apuentes@igalia.com> |
i965/nir/vec4: Implement floating-point fused multiply-add Adds NIR ALU operation: * nir_op_ffma Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
|
d12e165dbb403c3cf86ab7f1b8f28ab6188b479f |
|
17-Jun-2015 |
Antia Puentes <apuentes@igalia.com> |
i965/nir/vec4: Implement "shift" operations Adds NIR ALU operations: * nir_op_ishl * nir_op_ishr * nir_op_ushr Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
|
798cb33a256f703ecaf56d4443e12055484d4bcc |
|
17-Jun-2015 |
Antia Puentes <apuentes@igalia.com> |
i965/nir/vec4: Implement the "sign" operation Follows the vec4_visitor IR implementation but sets the saturate value in addition. Adds NIR ALU operations: * nir_op_fsign * nir_op_isign Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
|
8e1e6facbf828258a9a8ca09da846d1baa21d984 |
|
17-Jun-2015 |
Antia Puentes <apuentes@igalia.com> |
i965/nir/vec4: Implement bit operations Same implementation as the IR case. Adds NIR ALU operations: * nir_op_bitfield_reverse * nir_op_bit_count * nir_op_ufind_msb * nir_op_ifind_msb * nir_op_find_lsb * nir_op_ubitfield_extract * nir_op_ibitfield_extract * nir_op_bfm * nir_op_bfi * nir_op_bitfield_insert Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
|
0e874985ce50d902535e1eb766bd252c921b5d8f |
|
17-Jun-2015 |
Antia Puentes <apuentes@igalia.com> |
i965/nir/vec4: Implement pack/unpack operations * Lowered floating-point pack and unpack operations are not valid in VS. * Pack and unpack 2x16 operations should be handled by lower_packing_builtins. * Adds NIR ALU operations: * nir_op_pack_half_2x16 * nir_op_unpack_half_2x16 * nir_op_unpack_unorm_4x8 * nir_op_unpack_snorm_4x8 * nir_op_pack_unorm_4x8 * nir_op_pack_snorm_4x8 Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
|
3f10c2f3d73ae41ff83afcdbe225121b8336f499 |
|
17-Jun-2015 |
Antia Puentes <apuentes@igalia.com> |
i965/nir/vec4: "noise" ops should already be lowered Marked them as unreachable. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
|
fa4731f4a53aa21e53a62f42f3afdc19b0ce4c8e |
|
17-Jun-2015 |
Antia Puentes <apuentes@igalia.com> |
i965/nir/vec4: Implement "bool<->int,float" format conversion Used the same implementation as the vec4_visitor NIR. Adds NIR ALU operations: * nir_op_b2i * nir_op_b2f * nir_op_f2b * nir_op_i2b Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
|
f14199a8fb802f6672d559fa958a5ee84e3e13f1 |
|
17-Jun-2015 |
Antia Puentes <apuentes@igalia.com> |
i965/nir/vec4: Implement logical operators Adds NIR ALU operations: * nir_op_inot * nir_op_ixor * nir_op_ior * nir_op_iand Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
|
51aeafaf96b3b349e007ad05738bc1e05663fedf |
|
17-Jun-2015 |
Antia Puentes <apuentes@igalia.com> |
i965/nir/vec4: Implement non-equality ops on vectors Adds NIR ALU operations: * nir_op_bany_fnequal2 * nir_op_bany_inequal2 * nir_op_bany_fnequal3 * nir_op_bany_inequal3 * nir_op_bany_fnequal4 * nir_op_bany_inequal4 Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
|
8be4b876c90192c3a5e6fcc9b526f43a3f7bfc11 |
|
17-Jun-2015 |
Antia Puentes <apuentes@igalia.com> |
i965/nir/vec4: Implement equality ops on vectors Adds NIR ALU operations: * nir_op_ball_fequal2 * nir_op_ball_iequal2 * nir_op_ball_fequal3 * nir_op_ball_iequal3 * nir_op_ball_fequal4 * nir_op_ball_iequal4 Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
|
84d4a9dc2ca3d98f19cc9125a5ff1ac1225f360d |
|
17-Jun-2015 |
Antia Puentes <apuentes@igalia.com> |
i965/nir/vec4: Implement non-vector comparison ops Adds NIR ALU operations: * nir_op_flt * nir_op_ilt * nir_op_ult * nir_op_fge * nir_op_ige * nir_op_uge * nir_op_feq * nir_op_ieq * nir_op_fne * nir_op_ine Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
|
b9c41affcf67f30d7f6c74c17ea34bc42756d56d |
|
17-Apr-2015 |
Antia Puentes <apuentes@igalia.com> |
i965/nir: Add utility method for comparisons This method returns the brw_conditional_mod value used when emitting comparative ALU operations. It could be moved to brw_nir in the future to reuse it in fs_nir backend. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
|
dae6025e8efdfb759458a3243c8cd1588f485135 |
|
14-Apr-2015 |
Antia Puentes <apuentes@igalia.com> |
i965/nir/vec4: Derivatives are not allowed in VS Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
|
5e6f1c38a591fa39cff1c32a2cfdda927145756a |
|
17-Jun-2015 |
Antia Puentes <apuentes@igalia.com> |
i965/nir/vec4: Implement min/max operations Adds NIR ALU operations: * nir_op_fmin * nir_op_imin * nir_op_umin * nir_op_fmax * nir_op_imax * nir_op_umax Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
|
7553a51a68c0b2030265fe741f9c511b65047914 |
|
17-Jun-2015 |
Antia Puentes <apuentes@igalia.com> |
i965/nir/vec4: Implement various rounding functions Adds NIR ALU operations: * nir_op_ftrunc * nir_op_fceil * nir_op_ffloor * nir_op_ffrac * nir_op_fround_even Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
|
0ce159ec7fbcdf00c488b77f63e565e89ef6cab5 |
|
17-Jun-2015 |
Antia Puentes <apuentes@igalia.com> |
i965/nir/vec4: Implement carry/borrow for addition/subtraction Adds NIR ALU operations: * nir_op_uadd_carry * nir_op_usub_borrow Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
|
62cef7b0723ad6ca49ed06a6899a5852e41359e8 |
|
17-Jun-2015 |
Antia Puentes <apuentes@igalia.com> |
i965/nir/vec4: Implement more math operations Adds NIR ALU operations: * nir_op_frcp * nir_op_fexp2 * nir_op_flog2 * nir_op_fexp * nir_op_flog * nir_op_fsin * nir_op_fcos * nir_op_idiv * nir_op_udiv * nir_op_umod * nir_op_ldexp * nir_op_fsqrt * nir_op_frsq * nir_op_fpow Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
|
9acebf146184c35e6897b91fff414c5295d47996 |
|
16-Jun-2015 |
Antia Puentes <apuentes@igalia.com> |
i965/nir/vec4: Implement multiplication Implementation based on the vec4_visitor IR implementation for the operations ir_binop_mul and ir_binop_imul_high. Adds NIR ALU operations: * nir_op_fmul * nir_op_imul * nir_op_imul_high * nir_op_umul_high Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
|
0675842b56a956befbac4a3b912823e73a95a500 |
|
16-Jun-2015 |
Antia Puentes <apuentes@igalia.com> |
i965/nir/vec4: Implement the addition operation Adds NIR ALU operations: * nir_op_fadd * nir_op_iadd Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
|
4f39b547da4f9949d1b1f9f0df07d08951f0358d |
|
16-Jun-2015 |
Antia Puentes <apuentes@igalia.com> |
i965/nir/vec4: Implement int<->float format conversion ops Adds NIR ALU operations: * nir_op_f2i * nir_op_f2u * nir_op_i2f * nir_op_u2f Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
|
e4f02f47e70d384531ac68e6d33a62fdcdbd1f28 |
|
16-Jun-2015 |
Antia Puentes <apuentes@igalia.com> |
i965/nir/vec4: Lower "vecN" instructions and mark them unreachable This enables NIR pass "lower_vec_to_movs" on shaders that work on vec4. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
|
79154d99d6e760b1daf327b4594dded18f1d4191 |
|
16-Jun-2015 |
Antia Puentes <apuentes@igalia.com> |
i965/nir/vec4: Implement single-element "mov" operations Adds NIR ALU operations: * nir_op_imov * nir_op_fmov Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
|
ef1b30ae637e613b384541324c199d2dbe6b44bd |
|
16-Jun-2015 |
Antia Puentes <apuentes@igalia.com> |
i965/nir/vec4: Prepare source and destination registers for ALU operations This patch resolves and initializes the destination and the source registers that are common to most ALU operations. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
|
168bbfa6ff22a586ad6307c187cfa3b8fff5f227 |
|
16-Jun-2015 |
Antia Puentes <apuentes@igalia.com> |
i965/nir/vec4: Implement loading values from an UBO Based on the vec4_visitor IR implementation for the ir_binop_load_ubo operation. Notice that unlike the vec4_visitor IR, adding the !=0 comparison for UBO bools is not needed here because that comparison is already added by the nir_visitor when processing the ir_binop_load_ubo (in UBOs "true" is any value different from zero, but for us it is ~0). Adds NIR intrinsics: * nir_intrinsic_load_ubo_indirect * nir_intrinsic_load_ubo Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
|
98d07022f5312967bdfd54069869c8d6c65117a7 |
|
16-Jun-2015 |
Alejandro Piñeiro <apinheiro@igalia.com> |
i965/nir/vec4: Implement atomic counter intrinsics (read, inc and dec) The implementation is based on its fs_nir counterpart. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
|
e6cafb5dfdef8d8d25ee1e3375304cf35897d1f7 |
|
16-Jun-2015 |
Iago Toral Quiroga <itoral@igalia.com> |
i965/nir/vec4: Implement load_uniform intrinsic For the indirect case we need to take the index delivered by NIR and compute the parent uniform that we are accessing (the one that we uploaded to a surface) and the constant offset into that surface. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
|
e76e8caecd30799500357a45468329f033a93932 |
|
16-Jun-2015 |
Alejandro Piñeiro <apinheiro@igalia.com> |
i965/nir/vec4: Implement intrinsics that load system values These include nir_intrinsic_load_vertex_id_zero_base, nir_intrinsic_load_base_vertex, and nir_intrinsic_load_instance_id. The source register is fetched from the nir_system_values map initialized during nir_setup_system_values stage. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
|
662c4c99065381b8e265310d176cfdef6698ca57 |
|
16-Jun-2015 |
Eduardo Lima Mitev <elima@igalia.com> |
i965/nir/vec4: Implement store_output intrinsic This implementation is based on the current URB setup in vec4_visitor, which requires the output register to be stored in the output_reg array at the variable's original shader location index. But since the nir_lower_io() pass uses the value in var->data.driver_location, we need to put var->data.location there instead, prior to calling nir_lower_io(), so that we end up with the correct index in const_index[0]. The driver_location is not used at all, so this patch also disables the nir_assign_var_locations pass on non-scalar shaders. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
|
167cb9663adc8c7c61807e503f66e85f955e7d5f |
|
16-Jun-2015 |
Eduardo Lima Mitev <elima@igalia.com> |
i965/nir/vec4: Implement load_input intrinsic The source register is fetched from the nir_inputs map built during nir_setup_inputs stage. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
|
afe085a0ca01f659c69456018e5f5076c9dde47d |
|
16-Jun-2015 |
Eduardo Lima Mitev <elima@igalia.com> |
i965/nir/vec4: Implement loop statements (nir_cf_node_loop) This is taken as-is from fs_nir. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
|
5c0436dbf87fef76ba67456f215d9285c38f1816 |
|
16-Jun-2015 |
Iago Toral Quiroga <itoral@igalia.com> |
i965/nir/vec4: Implement conditional statements (nir_cf_node_if) This does the same as the FS NIR backend, except that here we need to consider the number of components in the condition and adjust the swizzle accordingly. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
|
f3187ea31ede6bc181ee561573d127aa2e485657 |
|
16-Jun-2015 |
Eduardo Lima Mitev <elima@igalia.com> |
i965/nir/vec4: Add get_nir_dst() and get_nir_src() methods These methods are essential for the implementation of the NIR->vec4 pass. They work similarly to their fs_nir counterparts. When processing instructions, these methods are invoked to resolve the brw registers (source or destination) corresponding to the NIR sources or destination. They use the map of NIR register index to brw register for all registers locally allocated in a block. Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
|
f7152525374015594e037fa11bb64e1c7174829b |
|
01-Jul-2015 |
Eduardo Lima Mitev <elima@igalia.com> |
i965/nir/vec4: Implement load_const intrinsic Similar to fs_nir backend, a nir_local_values map will be filled with newly allocated registers as the load_const intrinsic instructions are processed. Later, get_nir_src() will fetch the registers from this map for sources that are ssa. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
|
59006d3ad3ed5d29e84afa5931f425344e2ef658 |
|
22-Jul-2015 |
Eduardo Lima Mitev <elima@igalia.com> |
i965/nir/vec4: Add shader function implementation It basically allocates registers local to a function in a nir_locals map, then emits all its control-flow blocks. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
|
4023b55fdd7005a8a100637c229a1c40648cdd2b |
|
16-Jun-2015 |
Alejandro Piñeiro <apinheiro@igalia.com> |
i965/nir/vec4: Add setup for system values Similar to other variable setups, system values will initialize the corresponding register inside a 'nir_system_values' map, which will then be queried later when processing the different system value intrinsics for the appropriate register. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
|
195156e571e851273c135847f91ed73b3bfc1914 |
|
16-Jun-2015 |
Iago Toral Quiroga <itoral@igalia.com> |
i965/nir/vec4: Add setup of uniform variables Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
|
b929acb6a8659fdc06623b766bdf59904d8a3558 |
|
16-Jun-2015 |
Eduardo Lima Mitev <elima@igalia.com> |
i965/nir/vec4: Add setup of input variables in NIR->vec4 pass This implementation sets up a map of input variable offsets to source registers that are already initialized with the corresponding register offset. This map will then be queried when processing load_input intrinsic operations, to obtain the correct register source from which the input data will be loaded. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
|
abf4fa3c03ebe5716c90c8a310945c3621cf598f |
|
16-Jun-2015 |
Eduardo Lima Mitev <elima@igalia.com> |
i965/nir/vec4: Add implementation placeholders for a new NIR->vec4 pass This patch will add a brw_vec4_nir.cpp file filled with entry point methods to the main functionality, following a structure similar to brw_fs_nir.cpp. Subsequent patches in this series will be adding the implementations for these methods, incrementally. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
|