History log of /external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
Revision	Date	Author	Comments
f51a5b51ab92ada4b9f3b1d603f9de60b66e46ce	06-Jul-2016	Juan A. Suarez Romero <jasuarez@igalia.com>	i965/vec4: emit correctly load_inputs for 64bit data For dvec3 and dvec4 types, a single GRF does not have enough space to allocate two inputs from two different vertices (SIMD4x2). So the GRF only contains the first two components for the two vertices, and the next GRF has the remaining components. We want to put all the components for the same vertex in the same register. Thus, we do a shuffle to reorder the data. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
301fdfd8387856ea83c0ac0bff95915c0872c2f4	07-Dec-2016	Samuel Iglesias Gonsálvez <siglesias@igalia.com>	vec4: use DIM instruction when loading DF immediates in HSW Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Matt Turner <mattst88@gmail.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
b76f2206f550c37835d4e19eea1588caa0211b85	01-Jul-2016	Iago Toral Quiroga <itoral@igalia.com>	i965/vec4: fix store output for 64-bit types We need to shuffle the data before it is written to the URB. Also, dvec3/4 need two vec4 slots. v2: use byte_offset() instead of offset(). Reviewed-by: Matt Turner <mattst88@gmail.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
8eea41e75d86bfe9bef5f69b25ad797da236a008	12-Feb-2016	Iago Toral Quiroga <itoral@igalia.com>	i965/vec4: Fix SSBO stores for 64-bit data In this case we need to shuffle the 64-bit data before we write it to memory, source from reg_offset + 1 to write components Z and W and consider that each DF channel is twice as big. v2: use byte_offset() instead of offset(). Reviewed-by: Matt Turner <mattst88@gmail.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
9998d55afd179ad5019d3841e4c3255a02fd2d7b	13-Jul-2016	Iago Toral Quiroga <itoral@igalia.com>	i965/vec4: Fix SSBO loads for 64-bit data Same requirements as for UBO loads. v2: - use byte_offset() instead of offset() (Iago) - keep the const. offset as an immediate like the original code did (Juan) Reviewed-by: Matt Turner <mattst88@gmail.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
4486c90aaeb08f424ce17f842f46d24d1ceaadcb	13-Jul-2016	Iago Toral Quiroga <itoral@igalia.com>	i965/vec4: Fix UBO loads for 64-bit data We need to emit 2 32-bit load messages to load a full dvec4. If only 1 or 2 double components are needed dead-code-elimination will remove the second one. We also need to shuffle the result of the 32-bit messages to form valid 64-bit SIMD4x2 data. v2: - use byte_offset() instead of offset() (Iago) - keep the const. offset as an immediate like the original code did (Juan) Reviewed-by: Matt Turner <mattst88@gmail.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
d8e123cc5d66022069f3aee53318bfd1075bcc53	22-Jun-2016	Iago Toral Quiroga <itoral@igalia.com>	i965/vec4: Add a shuffle_64bit_data helper SIMD4x2 64bit data is stored in register space like this: r0.0:DF x0 y0 z0 w0 r1.0:DF x1 y1 z1 w1 When we need to write data such as this to memory using 32-bit write messages we need to shuffle it in this fashion: r0.0:DF x0 y0 x1 y1 r0.1:DF z0 w0 z1 w1 and emit two 32-bit write messages, one for r0.0 at base_offset and another one for r0.1 at base_offset+16. We also need to do the inverse operation when we read using 32-bit messages to produce valid SIMD4x2 64bit data from the data read. We can achieve this by applying the exact same shuffling to the data read, although we need to apply different channel enables since the layout of the data is reversed. This helper implements the data shuffling logic and we will use it in various places where we read and write 64bit data from/to memory. v2 (Curro): - Use the writemask helper and don't assert on the original writemask being XYZW. - Use the Vec4 IR builder to simplify the implementation. v3 (Iago): - Use byte_offset() instead of offset(). v3: - Fix typo (Matt) - Clarify the example and fix indentation (Matt). Reviewed-by: Matt Turner <mattst88@gmail.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
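The register layout described in the shuffle_64bit_data entry above can be modelled on the CPU. The following is only a sketch under stated assumptions: `Grf` and `shuffle_64bit_model` are hypothetical names that do not appear in the driver, each register is modelled as four 64-bit channels, and channel enables are ignored.

```cpp
#include <array>
#include <cassert>
#include <utility>

// Hypothetical model: one 32-byte register viewed as four DF channels.
using Grf = std::array<double, 4>;

// Takes {x0,y0,z0,w0} and {x1,y1,z1,w1} and returns
// {x0,y0,x1,y1} and {z0,w0,z1,w1}, the order described in the log entry.
std::pair<Grf, Grf> shuffle_64bit_model(const Grf &v0, const Grf &v1)
{
   const Grf first = {v0[0], v0[1], v1[0], v1[1]};
   const Grf second = {v0[2], v0[3], v1[2], v1[3]};
   return {first, second};
}
```

In this simplified model, applying the function a second time to its own output restores the original layout, which matches the entry's remark that reads can reuse the same shuffling.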
82e9dda8bf8875d232840585f48763c7a7092918	08-Jun-2016	Iago Toral Quiroga <itoral@igalia.com>	i965/vec4/nir: do not emit 64-bit MAD RepCtrl=1 does not work with 64-bit operands so we need to use RepCtrl=0. In that situation, the regioning generated for the sources seems to be equivalent to <4,4,1>:DF, so it will only work for components XY, which means that we have to move any other swizzle to a temporary so that we can source from channel X (or Y) in MAD and we also need to split the instruction (we are already scalarizing DF instructions but there is room for improvement and with MAD would be more restricted in that area). Also, it seems that MAD operations like this only write proper output for channels X and Y, so writes to Z and W also need to be done to a temporary using channels X/Y and then move that to channels Z or W of the actual dst. As a result the code we produce for native 64-bit MAD instructions is rather bad, and much worse than just emitting MUL+ADD. For reference, a simple case of a fully scalarized dvec4 MAD operation requires 15 instructions if we use native MAD and 8 instructions if we emit ADD+MUL instead. There are some improvements that we can do to the emission of MAD that might bring the instruction count down in some cases, but it comes at the expense of a more complex implementation so it does not seem worth it, at least initially. This patch makes translation of NIR's 64-bit ffma instructions produce MUL+ADD instead of MAD. Currently, there is nothing else in the vec4 backend that emits MAD instructions, so this is sufficient and it helps optimization passes see MUL+ADD from the get go. Reviewed-by: Matt Turner <mattst88@gmail.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
b58026b31e7258a4bd2bb630a1d41a433fb01799	07-Jul-2016	Samuel Iglesias Gonsálvez <siglesias@igalia.com>	i965/vec4: use the new helper function to create double immediates Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Matt Turner <mattst88@gmail.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
98da3623d5dfd991362c4fd3571325fe0277a2f9	09-Mar-2016	Iago Toral Quiroga <itoral@igalia.com>	i965/vec4: add a helper function to create double immediates Gen7 hardware does not support double immediates so these need to be moved in 32-bit chunks to a regular vgrf instead. Instead of doing this every time we need to create a DF immediate, create a helper function that does the right thing depending on the hardware generation. v2 (Curro): - Use swizzle() and writemask() helpers and make tmp const. v3 (Iago): - Adapt to changes in offset() Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Matt Turner <mattst88@gmail.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
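The "32-bit chunks" mentioned in the double-immediate helper entry above can be illustrated on the CPU. This is a sketch, not the driver's helper: `split_double` and `join_double` are hypothetical names, and the code only shows how a 64-bit value decomposes into two 32-bit halves that could each be moved separately.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <utility>

// Hypothetical helper: returns {low 32 bits, high 32 bits} of a double.
std::pair<uint32_t, uint32_t> split_double(double value)
{
   static_assert(sizeof(double) == sizeof(uint64_t), "expects a 64-bit double");
   uint64_t bits;
   std::memcpy(&bits, &value, sizeof(bits));
   return {static_cast<uint32_t>(bits & 0xffffffffu),
           static_cast<uint32_t>(bits >> 32)};
}

// Hypothetical inverse: rebuilds the double from its two halves.
double join_double(uint32_t lo, uint32_t hi)
{
   const uint64_t bits = (static_cast<uint64_t>(hi) << 32) | lo;
   double value;
   std::memcpy(&value, &bits, sizeof(value));
   return value;
}
```

With IEEE 754 doubles, `split_double(1.0)` yields a low half of 0 and a high half of 0x3ff00000.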
8f9ce5fa22c04b5b34aa6dc67e4a9b2d151d293d	18-Feb-2016	Iago Toral Quiroga <itoral@igalia.com>	i965/vec4: fix optimize predicate for doubles Reviewed-by: Matt Turner <mattst88@gmail.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
1816ae8f68e395da26dcfea2539bafd715c8dbc4	05-Feb-2016	Iago Toral Quiroga <itoral@igalia.com>	i965/vec4: implement fsign() for doubles v2: use a MOV with a conditional_mod instead of a CMP, like we do in d2b, to skip loading a double immediate. v3: Fix comment (Matt) Reviewed-by: Matt Turner <mattst88@gmail.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
6e570619e0755a50b2c8d57c6d1189fb9aca899d	17-Feb-2016	Iago Toral Quiroga <itoral@igalia.com>	i965/vec4: implement d2b v2 (Curro): - Generate the flag register with a MOV with conditional_mod instead of a CMP instruction, which has the benefit that we can skip loading a DF 0.0 constant. - Avoid the PICK_LOW_32BIT + MOV by using the flag result and a SEL to set the boolean result. v3: - Fix comment (Matt) Reviewed-by: Matt Turner <mattst88@gmail.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
c1fb525016e41658d2dc5d581da4e83b8a075fd4	17-Feb-2016	Iago Toral Quiroga <itoral@igalia.com>	i965/vec4: implement d2i, d2u, i2d and u2d Reviewed-by: Matt Turner <mattst88@gmail.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
4b2257623494ea8e7a1c7b6fbb2f4f3e59522468	29-Jun-2016	Iago Toral Quiroga <itoral@igalia.com>	i965/vec4: implement HW workaround for align16 double to float conversion From the BDW PRM, Workarounds chapter: "DF->f format conversion for Align16 has wrong emask calculation when source is immediate." Notice that Broadwell and later are strictly scalar at the moment though, so this is not really necessary. v2: Instead of moving the immediate to a vgrf and converting from there, just convert the double immediate to float in the compiler and move the result to the destination (Matt) Reviewed-by: Matt Turner <mattst88@gmail.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
bfc1f0f017db6bd11a558237c9a4ebeacf73f5ba	29-Jun-2016	Iago Toral Quiroga <itoral@igalia.com>	i965/vec4: add helpers for conversions to/from doubles Use these helpers to implement d2f and f2d. We will reuse these helpers when we implement things like d2i or i2d as well. v2: - Rename the helpers (Matt) Reviewed-by: Matt Turner <mattst88@gmail.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
c722a8e61ebc72d7d21c2bed0f623218d739fdb7	17-Feb-2016	Iago Toral Quiroga <itoral@igalia.com>	i965/vec4: Rename DF to/from F generator opcodes The opcodes are not specific for conversions to/from float since we need the same for conversions to/from other 32-bit types. Rename the opcodes accordingly and change the asserts to check the size of the types involved instead. v2: - Rename to VEC4_OPCODE_TO_DOUBLE and VEC4_OPCODE_FROM_DOUBLE (Matt) Reviewed-by: Matt Turner <mattst88@gmail.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
619271ec8785ab8b6021d0f49e98c51d457eab4d	15-Feb-2016	Iago Toral Quiroga <itoral@igalia.com>	i965/vec4: fix register allocation for 64-bit undef sources Reviewed-by: Francisco Jerez <currojerez@riseup.net> Reviewed-by: Matt Turner <mattst88@gmail.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
a8318b120e53518ae4d933acd876b8dbd3871e0c	12-Feb-2016	Iago Toral Quiroga <itoral@igalia.com>	i965/vec4: fix get_nir_dest() to use DF type for 64-bit destinations v2: Make dst_reg_for_nir_reg() handle this for nir_register since we want to have the correct type set before we call offset(). Reviewed-by: Matt Turner <mattst88@gmail.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
bb0e67d55dbd353e9c57b0709fa3e534f1aba05f	05-Oct-2016	Iago Toral Quiroga <itoral@igalia.com>	i965/vec4: fix indentation in get_nir_src() Reviewed-by: Matt Turner <mattst88@gmail.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
8cdbbbd2cf9e0c42114c7090805fa2b4a93ca499	14-Aug-2015	Iago Toral Quiroga <itoral@igalia.com>	i965/vec4/nir: implement double comparisons v2: - Added newline before if() (Matt) Reviewed-by: Matt Turner <mattst88@gmail.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
8a3ba033397bc627e499fcd3a379984ba4d587d2	01-Jun-2016	Iago Toral Quiroga <itoral@igalia.com>	i965/vec4: implement double packing Reviewed-by: Matt Turner <mattst88@gmail.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
94cfdf586a6a95bd06b989bba27d85f9bf99b9df	01-Jun-2016	Iago Toral Quiroga <itoral@igalia.com>	i965/vec4: implement double unpacking Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
4c040332f56ca2e5a4bbd8c412fd32ab3ff821db	10-Nov-2015	Iago Toral Quiroga <itoral@igalia.com>	i965/vec4: We only support 32-bit integer ALU operations for now Add asserts so we remember to address this when we enable 64-bit integer support, as suggested by Connor and Jason. Reviewed-by: Francisco Jerez <currojerez@riseup.net> Reviewed-by: Matt Turner <mattst88@gmail.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
e09a6be3b6806c582347f6faf93cc2d824d98ed2	14-Aug-2015	Iago Toral Quiroga <itoral@igalia.com>	i965/vec4: translate d2f/f2d Reviewed-by: Matt Turner <mattst88@gmail.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
9ce4b20bde4f4ca8e8907fcac13e8bb9d7e5f4b4	14-Aug-2015	Iago Toral Quiroga <itoral@igalia.com>	i965/vec4/nir: fix emitting 64-bit immediates Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
3457252b74f5490cff8915cac1e5fe0bf1031f5b	13-Aug-2015	Connor Abbott <connor.w.abbott@intel.com>	i965/vec4/nir: set the right type for 64-bit registers Reviewed-by: Matt Turner <mattst88@gmail.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
fef06f635610ddc730a213576e59afb638c6051d	25-May-2016	Iago Toral Quiroga <itoral@igalia.com>	i965/vec4/nir: support doubles in ALU operations Basically, this involves considering the bit-size information to set the appropriate type on both operands and destination. v2 (Curro) - Don't use two temporaries (and write one of them twice ) to obtain the nir_alu_type. Reviewed-by: Francisco Jerez <currojerez@riseup.net> Reviewed-by: Matt Turner <mattst88@gmail.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
0f096b1e5a5e31a5efba7279326ec8bc8478bb56	02-Nov-2015	Iago Toral Quiroga <itoral@igalia.com>	i965/vec4/nir: Add bit-size information to types Reviewed-by: Francisco Jerez <currojerez@riseup.net> Reviewed-by: Matt Turner <mattst88@gmail.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
2d81a292036445c440e56d07ce3d5294e0411d71	29-Feb-2016	Connor Abbott <connor.w.abbott@intel.com>	i965/vec4/nir: allocate two registers for dvec3/dvec4 v2 (Curro): - Do not special-case for a bit-size of 64, divide the bit_size by 32 instead. - Use DIV_ROUND_UP so we can handle sub-32-bit types. v3 (Ian): - Make num_regs const. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
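The v2 note in the register-allocation entry above (divide the bit size by 32, rounding up) can be sketched as follows. The function names are hypothetical, and the sketch assumes the register count depends only on the bit size, which is how the note reads.

```cpp
#include <cassert>

// Round-up integer division, in the spirit of the DIV_ROUND_UP macro
// named in the log entry.
inline unsigned div_round_up(unsigned n, unsigned d)
{
   assert(d != 0);
   return (n + d - 1) / d;
}

// Hypothetical helper: registers needed for a value of the given bit size.
inline unsigned num_regs_for_bit_size(unsigned bit_size)
{
   return div_round_up(bit_size, 32);
}
```

Under these assumptions a 64-bit type gets two registers, a 32-bit type gets one, and a sub-32-bit type such as a 16-bit one still gets one rather than zero.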
54913850aa379f57fcbf7a2dec5ea236cf997646	10-Aug-2015	Connor Abbott <connor.w.abbott@intel.com>	i965/vec4/nir: simplify glsl_type_for_nir_alu_type() Less duplication, one less case to handle for doubles and support for sized NIR types. v2: Fix call to get_instance by swapping rows and columns params (Iago) Signed-off-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net> Reviewed-by: Matt Turner <mattst88@gmail.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
fd249c803e3ae2acb83f5e3b7152728e73228b7b	12-Dec-2016	Ilia Mirkin <imirkin@alum.mit.edu>	treewide: s/comparitor/comparator/ git grep -l comparitor \| xargs sed -i 's/comparitor/comparator/g' Just happened to notice this in a patch that was sent and included one of the tokens in question. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
4f2d1d6ea713df8f8d816b48b9e99c7117cf36d7	28-Nov-2016	Ilia Mirkin <imirkin@alum.mit.edu>	i965: support constant gather offsets larger than 4 bits Offsets that don't fit into 4 bits need to force gather_po to be selected. Adjust the logic so that this happens. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> /external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
f182e5eafc31ebc7c140e9a369d5f747948733ae	17-Oct-2016	Kenneth Graunke <kenneth@whitecape.org>	i965/vec4: Handle component qualifiers on non-generic varyings. ARB_enhanced_layouts only requires component qualifier support for generic varyings, so this is all the vec4 backend knew how to handle. This patch extends the backend to handle it for all varyings, so we can use store_output intrinsics with a component set for things like clip/cull distances. We may want to use that for other VUE header fields in the future as well. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
66fcfa6894ab61a8cb70955f4a4113729e4a8099	03-Oct-2016	Iago Toral Quiroga <itoral@igalia.com>	i965/vec4: make offset() work in terms of a simd width and scalar components So that it has the same semantics as the scalar backend implementation. The helper will now take a simd width (which is always 8 in vec4 mode) and step as many scalar components as specified by that width, respecting the size of the scalar channels. v2 (Curro): - Remove the assertion in offset(), byte_offset() has the same checks. - Use byte_offset() directly instead of add_byte_offset(). - Make things more clear by explicitly including the vertical stride in the byte offset expression. Reviewed-by: Francisco Jerez <currojerez@riseup.net> /external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
e1af20f18a86f52a9640faf2d4ff8a71b0a4fa9b	13-Oct-2016	Timothy Arceri <timothy.arceri@collabora.com>	nir/i965/anv/radv/gallium: make shader info a pointer When restoring something from shader cache we won't have and don't want to create a nir_shader; this change detaches the two. There are other advantages such as being able to reuse the shader info populated by GLSL IR. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> /external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
89e1436e2d4ff0c15202708979eb36761cae4167	11-Oct-2016	Ian Romanick <ian.d.romanick@intel.com>	i965: Silence unused parameter warnings brw_link.cpp:76:44: warning: unused parameter ‘shader_type’ [-Wunused-parameter] gl_shader_stage shader_type, ^ brw_nir.c: In function ‘brw_nir_lower_vs_inputs’: brw_nir.c:194:55: warning: unused parameter ‘devinfo’ [-Wunused-parameter] const struct gen_device_info *devinfo, ^ brw_vec4_visitor.cpp:914:37: warning: unused parameter ‘sampler’ [-Wunused-parameter] uint32_t sampler, ^ brw_vec4_visitor.cpp:1146:34: warning: unused parameter ‘stream_id’ [-Wunused-parameter] vec4_visitor::gs_emit_vertex(int stream_id) ^ Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Engestrom <eric@engestrom.ch> /external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
40dd45d0c6aa4a9d727c09225967e9c3b1f45854	30-Jun-2016	Ian Romanick <ian.d.romanick@intel.com>	i965: Enable ARB_shader_atomic_counter_ops Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
3d2011cb33317b0fe9b8fe989916efc1841c6ce0	30-Jun-2016	Ian Romanick <ian.d.romanick@intel.com>	i965: Refactor emission of atomic counter operations This will make it easier to add more operations. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
69fdf13c215c2970feaca76f178a5c2c11ba8fec	03-Sep-2016	Francisco Jerez <currojerez@riseup.net>	i965/vec4: Replace vec4_instruction::regs_written with ::size_written field in bytes. The previous regs_written field can be recovered by rewriting each rvalue reference of regs_written like 'x = i.regs_written' to 'x = DIV_ROUND_UP(i.size_written, reg_unit)', and each lvalue reference like 'i.regs_written = x' to 'i.size_written = x * reg_unit'. For the same reason as in the previous patches, this doesn't attempt to be particularly clever about simplifying the result in the interest of keeping the rather lengthy patch as obvious as possible. I'll come back later to clean up any ugliness introduced here. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
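The two rewrite rules quoted in the size_written entry above translate directly into code. This is a minimal sketch: the function names are hypothetical, and `kRegUnit = 32` is an assumed register unit in bytes chosen only for illustration.

```cpp
#include <cassert>

// Assumed register unit in bytes (illustrative value, not from the log).
const unsigned kRegUnit = 32;

// Rvalue rule: 'x = i.regs_written' becomes
// 'x = DIV_ROUND_UP(i.size_written, reg_unit)'.
inline unsigned regs_written_from_size(unsigned size_written)
{
   return (size_written + kRegUnit - 1) / kRegUnit;
}

// Lvalue rule: 'i.regs_written = x' becomes 'i.size_written = x * reg_unit'.
inline unsigned size_written_from_regs(unsigned regs_written)
{
   return regs_written * kRegUnit;
}
```

Note that the round trip is lossy for sizes that are not a multiple of the unit: 40 bytes maps to two registers, which map back to 64 bytes.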
fba020e5af49d9d9a2c6e4d4b79115ed1e74a127	01-Sep-2016	Francisco Jerez <currojerez@riseup.net>	i965/vec4: Replace dst/src_reg::reg_offset with dst/src_reg::offset expressed in bytes. The dst/src_reg::offset field in byte units introduced in the previous patch is a more straightforward alternative to an offset representation split between ::reg_offset and ::subreg_offset fields. The split representation makes it too easy to forget about one of the offsets while dealing with the other, which has led to multiple FS back-end bugs in the past. To make the matter worse the unit reg_offset was expressed in was rather inconsistent, for uniforms it would be expressed in either 4B or 16B units depending on the back-end, and for most other things it would be expressed in 32B units. This encodes reg_offset as a new offset field expressed consistently in byte units. Each rvalue reference of reg_offset in existing code like 'x = r.reg_offset' is rewritten to 'x = r.offset / reg_unit', and each lvalue reference like 'r.reg_offset = x' is rewritten to 'r.offset = r.offset % reg_unit + x * reg_unit'. Because the change affects a lot of places and is rather non-trivial to verify due to the inconsistent value of reg_unit, I've tried to avoid making any additional changes other than applying the rewrite rule above in order to keep the patch as simple as possible, sometimes at the cost of introducing obvious stupidity (e.g. algebraic expressions that could be simplified given some knowledge of the context) -- I'll clean those up later on in a second pass. v2: Fix division by the wrong reg_unit in the UNIFORM case of convert_to_hw_regs(). (Iago) Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
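The reg_offset rewrite rule in the entry above keeps the sub-register remainder while replacing the whole-register part. A small sketch, assuming a hypothetical `RegModel` struct with a single byte offset and a caller-supplied `reg_unit` (the entry notes that the unit varied by register file):

```cpp
#include <cassert>

// Hypothetical stand-in for a register reference with a byte offset.
struct RegModel {
   unsigned offset; // in bytes
};

// Rvalue rule: 'x = r.reg_offset' becomes 'x = r.offset / reg_unit'.
inline unsigned get_reg_offset(const RegModel &r, unsigned reg_unit)
{
   assert(reg_unit != 0);
   return r.offset / reg_unit;
}

// Lvalue rule: 'r.reg_offset = x' becomes
// 'r.offset = r.offset % reg_unit + x * reg_unit'.
inline void set_reg_offset(RegModel &r, unsigned x, unsigned reg_unit)
{
   assert(reg_unit != 0);
   r.offset = r.offset % reg_unit + x * reg_unit;
}
```

For example, with a 32-byte unit an offset of 40 bytes reads back as register 1 with 8 bytes of sub-register offset; setting the register part to 3 gives 104 bytes and leaves the 8-byte remainder intact.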
d1b1fca0b7cccff718923f2344ea144dc3ebb869	22-Jun-2016	Timothy Arceri <timothy.arceri@collabora.com>	i965/vec4: add support for packing vs/gs/tes outputs Here we create a new output_generic_reg array with the ability to store the dst_reg for each component of user defined varyings. This is needed as the previous code only stored the dst_reg based on the varying location which meant packed varyings would overwrite each other. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
b427abba0c04214ba6184092eee73fc6377fbff9	23-Jun-2016	Timothy Arceri <timothy.arceri@collabora.com>	i965/vec4: add support for packing inputs Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> /external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
96dfed49e47eac7afc100e5b8d3b316dd6652fb6	19-Jul-2016	Jason Ekstrand <jason.ekstrand@intel.com>	i965: Stop munging cube array lengths by 6 From the Sky Lake PRM: "For SURFTYPE_CUBE: For Sampling Engine Surfaces and Typed Data Port Surfaces, the range of this field is [0,340], indicating the number of cube array elements (equal to the number of underlying 2D array elements divided by 6). For other surfaces, this field must be zero." In other words, the depth field for cube maps is in number of cubes not number of 2-D slices so we need to divide by 6. ISL will do this correctly for us assuming that we provide it with the correct array bounds which it expects to be in 2-D slices. It appears as if we've been doing this wrong ever since we first added cube map arrays for Sandy Bridge and the change to ISL made things slightly worse. While we're at it, we now need to remove the shader hacks we've always done since they were only needed because we were setting the depth field six times too large. v2: Fix the vec4 backend as well (not sure how I missed this). Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Chris Forbes <chrisforbes@google.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
3e7cebc8da5c9f16fa1b9a25ea72b8d31c86a440	22-Jun-2016	Ian Romanick <ian.d.romanick@intel.com>	i965: Use LZD to implement nir_op_find_lsb on Gen < 7 v2: Rebase on changes to previous two patches. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
c2019c6c261d5c46a4e5d3edc88836bcedf75f30	22-Jun-2016	Ian Romanick <ian.d.romanick@intel.com>	i965: Use LZD to implement nir_op_ifind_msb on Gen < 7 v2: Retype LZD source as UD to avoid potential problems with 0x80000000. Suggested by Matt. Also update comment about problem values with LZD(abs(x)). Suggested by Curro. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
de20086eed47e6bfe7c25835d72383114f99c7a9	22-Jun-2016	Ian Romanick <ian.d.romanick@intel.com>	i965: Use LZD to implement nir_op_ufind_msb This uses one less instruction. v2: Move emit_find_msb_using_lzd out of the visitor classes. Suggested by Curro. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
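The three LZD entries above rely on converting a leading-zero count into a most-significant-bit index. The following is a CPU sketch only: `lzd_model` and `ufind_msb_model` are hypothetical names, and the sketch assumes the hardware LZD returns 32 for a zero input.

```cpp
#include <cassert>
#include <cstdint>

// Software model of a leading-zero count; returns 32 when x is 0
// (assumed to match the hardware LZD behaviour).
inline unsigned lzd_model(uint32_t x)
{
   unsigned count = 0;
   for (uint32_t mask = 0x80000000u; mask != 0 && (x & mask) == 0; mask >>= 1)
      ++count;
   return count;
}

// LZD counts from the MSB side; an MSB index counts from the LSB side,
// so subtract from 31. A zero input gives 31 - 32 = -1.
inline int ufind_msb_model(uint32_t x)
{
   return 31 - static_cast<int>(lzd_model(x));
}
```

Under that assumption no special case is needed for zero, since the subtraction already produces -1.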
fb5dcb81cc121e4355b7eef014474a5c42a2f6db	19-May-2016	Matt Turner <mattst88@gmail.com>	i965: Pass nir_src/nir_dest by reference. Cuts 6K of .text. text data bss dec hex filename 5772372 264648 29320 6066340 5c90a4 lib/i965_dri.so before 5766074 264648 29320 6060042 5c780a lib/i965_dri.so after Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> /external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
f687b8e1785df0825443f07778e5d0ddf6f9be09	13-May-2016	Ian Romanick <ian.d.romanick@intel.com>	i965: Silence unused parameter warnings The only place that actually used the type parameter was the GS visitor, and it was always passed glsl_type::int. Just remove the parameter. brw_vec4_vs_visitor.cpp:38:61: warning: unused parameter ‘type’ [-Wunused-parameter] const glsl_type *type) ^ Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
9464d8c49813aba77285e7465b96e92a91ed327c	27-Apr-2016	Jason Ekstrand <jason.ekstrand@intel.com>	nir: Switch the arguments to nir_foreach_function This matches the "foreach x in container" pattern found in many other programming languages. Generated by the following regular expression: s/nir_foreach_function(\([^,]\),\s\([^,]*\))/nir_foreach_function(\2, \1)/ Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
707e72f13bb78869ee95d3286980bf1709cba6cf	27-Apr-2016	Jason Ekstrand <jason.ekstrand@intel.com>	nir: Switch the arguments to nir_foreach_instr This matches the "foreach x in container" pattern found in many other programming languages. Generated by the following regular expression: s/nir_foreach_instr(\([^,]\),\s\([^,]*\))/nir_foreach_instr(\2, \1)/ and similar expressions for nir_foreach_instr_safe etc. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
7efff10585122d484dc3adab14af9380b9b8f309	13-Apr-2016	Connor Abbott <cwabbott0@gmail.com>	i965/nir: fixup for new foreach_block() Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> /external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
b6dc940ec273252678d40707d300851fa1c85ea5	13-Apr-2016	Connor Abbott <cwabbott0@gmail.com>	nir: rename nir_foreach_block() to nir_foreach_block_call() Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> /external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
b3f43822c72301c904fd2824ae3edcd20ea93a29	19-Apr-2016	Jason Ekstrand <jason.ekstrand@intel.com>	i965/vec4: Use the correct offset for the swizzle shift in push constants This was actually caught by Ken in review the first time around but somehow didn't get fixed before the patches were pushed. :-( Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94998 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95001 /external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
9f16e170fed09821bb1b18a9dbe548f3d26b7977	19-Apr-2016	Jason Ekstrand <jason.ekstrand@intel.com>	i965/vec4: Use nir_intrinsic_base in the load_uniform implementation We shouldn't be reading the const_index directly Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94998 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95001 /external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
0166ad6ced542cacfbbbe45e9d4b7f14af5040de	06-Apr-2016	Jason Ekstrand <jason.ekstrand@intel.com>	i965/vec4: Support full std140 layout for push constants Up until now, we have been able to assume that all push constants are vec4-aligned because this is what the GL driver gives us. In Vulkan, we need to be able to support full std140 because we get the layout from the client. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> /external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
8e76f664beb845f8dca30ca5635f9369618563b0	09-Dec-2015	Jason Ekstrand <jason.ekstrand@intel.com>	i965/vec4: Get rid of the uniform_size array Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> /external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
056849772f66582fd7e8a181c3fb16955f84243b	25-Nov-2015	Jason Ekstrand <jason.ekstrand@intel.com>	i965/vec4: Use MOV_INDIRECT instead of reladdr for indirect push constants This commit moves us to an instruction based model rather than a register-based model for indirects. This is more accurate anyway as we have to emit instructions to resolve the reladdr. It's also a lot simpler because it gets rid of the recursive reladdr problem by design. One side-effect of this is that we need a whole new algorithm in move_uniform_array_access_to_pull_constants. This new algorithm is much more straightforward than the old one and is fairly similar to what we're already doing in the FS backend. Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Acked-by: Kenneth Graunke <kenneth@whitecape.org> /external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
cb372b39ea15729caf8491f4fd9f12c37a2840df	08-Apr-2016	Jason Ekstrand <jason.ekstrand@intel.com>	i965/vec4: Use UD rather than D for uniform indirects Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> /external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
765dd6534937e125b95c7998862b1a4ec76a22d8	25-Mar-2016	Jason Ekstrand <jason.ekstrand@intel.com>	i965: Implement the new imod and irem opcodes Reviewed-by: Matt Turner <mattst88@gmail.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
bfd17c76c1267756ea16051cbe174cb23ff49f44	08-Apr-2016	Kenneth Graunke <kenneth@whitecape.org>	i965: Port INTEL_PRECISE_TRIG=1 to NIR. This makes the extra multiply visible to NIR's algebraic optimizations (for constant reassociation) as well as constant folding. This means that when the result of sin/cos are multiplied by a constant, we can eliminate the extra multiply altogether, reducing the cost of the workaround. It also means we only have to implement it in one place, rather than in both backends. This makes INTEL_PRECISE_TRIG=1 cost nothing on GPUTest/Volplosion, which has a ton of sin() calls, but always multiplies them by an immediate constant. The extra multiply gets folded away. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eduardo Lima Mitev <elima@igalia.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Matt Turner <mattst88@gmail.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
80c72a8ea7b1018661da0e6509a7f88ca1f5086f	25-Mar-2016	Jason Ekstrand <jason.ekstrand@intel.com>	i965/nir: Provide a default LOD for buffer textures Our hardware requires an LOD for all texelFetch commands even if they are on buffer textures. GLSL IR gives us an LOD of 0 in that case, but the LOD is really rather meaningless. This commit allows other NIR producers to be more lazy and not provide one at all. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
65fbc43d54403905e3eaea02372b5a364dc1d773	27-Jan-2016	Kenneth Graunke <kenneth@whitecape.org>	i965: Add an INTEL_PRECISE_TRIG=1 option to fix SIN/COS output range. The SIN and COS instructions on Intel hardware can produce values slightly outside of the [-1.0, 1.0] range for a small set of values. Obviously, this can break everyone's expectations about trig functions. According to an internal presentation, the COS instruction can produce a value up to 1.000027 for inputs in the range (0.08296, 0.09888). One suggested workaround is to multiply by 0.99997, scaling down the amplitude slightly. Apparently this also minimizes the error function, reducing the maximum error from 0.00006 to about 0.00003. When enabled, fixes 16 dEQP precision tests dEQP-GLES31.functional.shaders.builtin_functions.precision. {cos,sin}.{highp,mediump}_compute.{scalar,vec2,vec4,vec4}. at the cost of making every sin and cos call more expensive (about twice the number of cycles on recent hardware). Enabling this option has been shown to reduce GPUTest Volplosion performance by about 10%. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> /external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
14c46954c910efb1db94a068a866c7259deaa9d9	25-Mar-2016	Jason Ekstrand <jason.ekstrand@intel.com>	i965: Add an implementation of nir_op_fquantize2f16 Reviewed-by: Matt Turner <mattst88@gmail.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
084b24f5582567ebf5aa94b7f40ae3bdcb71316b	16-Mar-2016	Iago Toral Quiroga <itoral@igalia.com>	nir: rename nir_const_value fields to include bitsize information Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
0548844e866e4fe326432116f84fdf7e885fba9f	04-Mar-2016	Alejandro Piñeiro <apinheiro@igalia.com>	i965/vec4/nir: no need to use surface_access:: to call emit_untyped_atomic Now that brw_vec4_visitor::emit_untyped_atomic was removed, there is no need to explicitly set it. Reviewed-by: Francisco Jerez <currojerez@riseup.net> /external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
d3a89a7c494d577fdf8f45c0d8735004a571e86b	04-Mar-2016	Alejandro Piñeiro <apinheiro@igalia.com>	i965/vec4/nir: remove emit_untyped_surface_read and emit_untyped_atomic at brw_vec4_visitor surface_access emit_untyped_read and emit_untyped_atomic provides the same functionality. v2: surface parameter of emit_untyped_atomic is a const, no need to specify default predicate on emit_untyped_atomic, use retype (Francisco Jerez). Reviewed-by: Francisco Jerez <currojerez@riseup.net> /external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
24994ae926629ac8521df3cab4a02eb81de15907	17-Feb-2016	Kenneth Graunke <kenneth@whitecape.org>	i965: Push most TES inputs in vec4 mode. (This is commit 4a1c8a3037cd29938b2a6e2c680c341e9903cfbe for vec4 mode.) Using the push model for inputs is much more efficient than pulling inputs - the hardware can simply copy a large chunk into URB registers at thread creation time, rather than having the thread send messages to request data from the L3 cache. Unfortunately, it's possible to have more TES inputs than fit in registers, so we have to fall back to the pull model in some cases. However, it turns out that most tessellation evaluation shaders are fairly simple, and don't use many inputs. An arbitrary cut-off of 24 vec4 slots (12 registers) should suffice. (I chose this instead of the 32 vec4 slots used in the scalar backend to avoid regressing a few Piglit tests due to the vec4 register allocator being too stupid to figure out what to do. We probably ought to fix that, but it's a separate issue.) Improves performance in GPUTest's tessmark_x64 microbenchmark by 41.5394% +/- 0.288519% (n = 115) at 1024x768 on my Clevo W740SU (with Iris Pro 5200). Improves performance in Synmark's Gl40TerrainFlyTess microbenchmark by 38.3576% +/- 0.759748% (n = 42). v2: Simplify abs/negate handling, as requested by Matt. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Matt Turner <mattst88@gmail.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
8750299a420af76cebd3067f6f603eacde06ae06	09-Feb-2016	Jason Ekstrand <jason.ekstrand@intel.com>	nir: Remove the const_offset from nir_tex_instr When NIR was originally drafted, there was no easy way to determine if something was constant or not. The result was that we had lots of special-casing for constant values such as this. Now that load_const instructions are SSA-only, it's really easy to find constants and this isn't really needed anymore. Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Reviewed-by: Rob Clark <robclark@gmail.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
d03e5d52557ce6523eb65dfec9172d6000f5ff8d	03-Nov-2015	Jason Ekstrand <jason.ekstrand@intel.com>	i965/vec4: Plumb separate surfaces and samplers through from NIR Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> /external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
5ec456375e4fdd0b6c7d797f99191044e19ead74	03-Nov-2015	Jason Ekstrand <jason.ekstrand@intel.com>	nir: Separate texture from sampler in nir_tex_instr This commit adds the capability to NIR to support separate textures and samplers. As it currently stands, glsl_to_nir only sets the texture deref and leaves the sampler deref alone as it did before and nir_lower_samplers assumes this. Backends can still assume that they are combined and only look at the texture index. Or, if they wish, they can assume that they are separate because nir_lower_samplers, tgsi_to_nir, and prog_to_nir all set both texture and sampler index whenever a sampler is required (the two indices are the same in this case). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> /external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
ee85014b90af1d94d637ec763a803479e9bac5dc	06-Feb-2016	Jason Ekstrand <jason.ekstrand@intel.com>	nir/tex_instr: Rename sampler to texture We're about to separate the two concepts. When we do, the sampler will become optional. Doing a rename first makes the separation a bit more safe because drivers that depend on GLSL or TGSI behaviour will be fine to just use the texture index all the time. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> /external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
f4952421cddaa79498da2b7658f48dc008e489e1	25-Jan-2016	Matt Turner <mattst88@gmail.com>	i965/vec4: Implement nir_op_pack_uvec2_to_uint. And mark nir_op_pack_uvec4_to_uint unreachable, since it's only produced by lowering pack[SU]norm4x8 which the vec4 backend does not need. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
866a6bf9f70625517d6d2c17be9523b9f035f1db	19-Jan-2016	Matt Turner <mattst88@gmail.com>	i965/vec4: Spaces around operators. /external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
0a6811207fbe18d49c7ab95f93ed01f75ffcdda0	14-Jan-2016	Jason Ekstrand <jason@jlekstrand.net>	i965/vec4: Use UW type for multiply into accumulator on GEN8+ BDW adds the following restriction: "When multiplying DW x DW, the dst cannot be accumulator." Cc: "11.1,11.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Matt Turner <mattst88@gmail.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
b82e26a6a4d6baf121f44c61c862bfa79ba0d172	13-Jan-2016	Matt Turner <mattst88@gmail.com>	nir: Lower bitfield_extract. The OpenGL specification for bitfieldExtract() says: The result will be undefined if <offset> or <bits> is negative, or if the sum of <offset> and <bits> is greater than the number of bits used to store the operand. Therefore passing bits=32, offset=0 is legal and defined in GLSL. But the earlier SM5 ubfe/ibfe opcodes are specified to accept a bitfield width ranging from 0-31. As such, Intel and AMD instructions read only the low 5 bits of the width operand, making them not able to implement the GLSL-specified behavior directly. This commit adds ubfe/ibfe operations from SM5 and a lowering pass for bitfield_extract to handle the trivial case of <bits> = 32 as bitfieldExtract: bits > 31 ? value : bfe(value, offset, bits) Fixes: ES31-CTS.shader_bitfield_operation.bitfieldExtract.uvec3_0 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92595 Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Tested-by: Marta Lofstedt <marta.lofstedt@intel.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
b85a229e1f542426b1c8000569d89cd4768b9339	08-Jan-2016	Kenneth Graunke <kenneth@whitecape.org>	glsl: Delete the ir_binop_bfm and ir_triop_bfi opcodes. TGSI doesn't use these - it just translates ir_quadop_bitfield_insert directly. NIR can handle ir_quadop_bitfield_insert as well. These opcodes were only used for i965, and with Jason's recent patches, we can do this lowering in NIR (which also gains us SPIR-V handling). So there's not much point to retaining this GLSL IR lowering code. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
cddfc2cefa93b884c40329dcb193fe4fb22143ab	10-Dec-2015	Kristian Høgsberg Kristensen <krh@bitplanet.net>	i965: Add support for gl_DrawIDARB and enable extension We have to break open a new vec4 for gl_DrawIDARB. We've used up all space in the vec4 we use for SGVS and gl_DrawIDARB has to come from its own separate vertex buffer anyway. This is because we point the vb for base vertex and base instance into the draw parameter BO for indirect draw calls, but the draw id is generated by mesa in a different buffer. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
17ebb55a14b5a9aa639845fbda9330ef9421834a	10-Dec-2015	Kristian Høgsberg Kristensen <krh@bitplanet.net>	i965: Add support for gl_BaseVertexARB and gl_BaseInstanceARB We already have gl_BaseVertexARB in the .x component of the SGVS vec4 and plug gl_BaseInstanceARB into the last free component (.y). Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
237f2f2d8b45d9d956102eec6f9be63193e5269b	26-Dec-2015	Jason Ekstrand <jason.ekstrand@intel.com>	nir: Get rid of function overloads When Connor originally drafted NIR, he copied the same function+overload system that GLSL IR had with a few names changed. However, this double-indirection is not really needed and has only served to confuse people. Instead, let's just have functions which may not have unique names and may or may not have an implementation. If someone wants to do overload resolving, they can have a hash table based function+overload system in the overload resolving pass. There's no good reason to keep it in core NIR. Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org> ir3 bits are Reviewed-by: Rob Clark <robclark@gmail.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
24be658d13b13fdb8a1977208038b4ba43bce4ac	17-Nov-2015	Kenneth Graunke <kenneth@whitecape.org>	i965: Add tessellation control shaders. The TCS is the first tessellation shader stage, and the most complicated. It has access to each of the control points in the input patch, and computes a new output patch. There is one logical invocation per output control point; all invocations run in parallel, and can communicate by reading and writing output variables. One of the main responsibilities of the TCS is to write the special gl_TessLevelOuter[] and gl_TessLevelInner[] output variables which control how much new geometry the hardware tessellation engine will produce. Otherwise, it simply writes outputs that are passed along to the TES. We run in SIMD4x2 mode, handling two logical invocations per EU thread. The hardware doesn't properly manage the dispatch mask for us; it always initializes it to 0xFF. We wrap the whole program in an IF..ENDIF block to handle an odd number of invocations, essentially falling back to SIMD4x1 on the last thread. v2: Update comments (requested by Jordan Justen). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
bb9eb599335ec4ac3a2a579359fb239f16de17e8	26-Nov-2015	Matt Turner <mattst88@gmail.com>	i965/vec4: Optimize predicate handling for any/all. For a select whose condition is any(v), instead of emitting cmp.nz.f0(8) null<1>D g1<0,4,1>D 0D mov(8) g7<1>.xUD 0x00000000UD (+f0.any4h) mov(8) g7<1>.xUD 0xffffffffUD cmp.nz.f0(8) null<1>D g7<4,4,1>.xD 0D (+f0) sel(8) g8<1>UD g4<4,4,1>UD g3<4,4,1>UD we now emit cmp.nz.f0(8) null<1>D g1<0,4,1>D 0D (+f0.any4h) sel(8) g9<1>UD g4<4,4,1>UD g3<4,4,1>UD Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> /external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
c8a74e3a4ea6ac5dfa35adac06af14a8fa4ff773	30-Nov-2015	Matt Turner <mattst88@gmail.com>	nir: Delete bany, ball, fany, fall. As in the previous patches, these can be implemented as any(v) -> any_nequal(v, false) all(v) -> all_equal(v, true) and their removal simplifies the code in the next patch. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
78b81be627734ea7fa50ea246c07b0d4a3a1638a	25-Nov-2015	Jason Ekstrand <jason.ekstrand@intel.com>	nir: Get rid of _indirect variants of input/output load/store intrinsics There is some special-casing needed in a competent back-end. However, they can do their special-casing easily enough based on whether or not the offset is a constant. In the meantime, having the _indirect variants adds special cases in a number of places where they don't need to be and, in general, only complicates things. To complicate matters, NIR had no way to convert an indirect load/store to a direct one in the case that the indirect was a constant so we would still not really get what the back-ends wanted. The best solution seems to be to get rid of the _indirect variants entirely. This commit is a bunch of different changes squashed together: - nir: Get rid of _indirect variants of input/output load/store intrinsics - nir/glsl: Stop handling UBO/SSBO load/stores differently depending on indirect - nir/lower_io: Get rid of load/store_foo_indirect - i965/fs: Get rid of load/store_foo_indirect - i965/vec4: Get rid of load/store_foo_indirect - tgsi_to_nir: Get rid of load/store_foo_indirect - ir3/nir: Use the new unified io intrinsics - vc4: Do all uniform loads with byte offsets - vc4/nir: Use the new unified io intrinsics - vc4: Fix load_user_clip_plane crash - vc4: add missing src for store outputs - vc4: Fix state uniforms - nir/lower_clip: Update to the new load/store intrinsics - nir/lower_two_sided_color: Update to the new load intrinsic NIR and i965 changes are Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> NIR indirect declarations and vc4 changes are Reviewed-by: Eric Anholt <eric@anholt.net> ir3 changes are Reviewed-by: Rob Clark <robdclark@gmail.com> NIR changes are Acked-by: Rob Clark <robdclark@gmail.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
18069dce4a4c3d71e6afc6b10bfa7bee0560ba9c	11-Nov-2015	Jason Ekstrand <jason.ekstrand@intel.com>	i965: Make uniform offsets be in terms of bytes This commit makes uniform offsets be in terms of bytes starting with nir_lower_io. They get converted to be in terms of vec4s or floats when we cram them in the UNIFORM register file but reladdr remains in terms of bytes all the way down to the point where we lower it to a pull constant load. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> /external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
05bdc21f84edc200a0b0a695b79d12f25cc00645	02-Nov-2015	Jason Ekstrand <jason.ekstrand@intel.com>	i965/vec4: Use a stride of 1 and byte offsets for UBOs Cc: "11.0" <mesa-stable@lists.freedesktop.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92909 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> /external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
e3e70698c3cfa7e9acccd6eddfb37516c45d5ac2	24-Nov-2015	Jason Ekstrand <jason.ekstrand@intel.com>	i965/vec4: Use byte offsets for UBO pulls on Sandy Bridge Previously, the VS_OPCODE_PULL_CONSTANT_LOAD opcode operated on vec4-aligned byte offsets on Iron Lake and below and worked in terms of vec4 offsets on Sandy Bridge. On Ivy Bridge, we add a new *LOAD_GEN7 variant which works in terms of vec4s. We're about to change the GEN7 version to work in terms of bytes, so this is a nice unification. Cc: "11.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> /external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
b715e6d52832a0761ccec5c1252e7458499bf752	26-Nov-2015	Jason Ekstrand <jason.ekstrand@intel.com>	i965/vec4: Stop pretending to support indirect output stores Since we're using nir_lower_outputs_to_temporaries to shadow all our outputs, it's impossible to actually get an indirect store. The code we had to "handle" this was pretty bogus as it created a register with a reladdr and then stuffed it in a fixed varying slot without so much as a MOV. Not only does this not do the MOV, it also puts the indirect on the wrong side of the transaction. Let's just delete the broken dead code. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> /external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
aa35b0c2c71f054f72df5a85779d0862fa7d6e4a	25-Nov-2015	Jason Ekstrand <jason.ekstrand@intel.com>	i965/vec4: Get rid of the nir_inputs array It's not really buying us anything at this point. It's just a way of remapping one offset namespace onto another. We can just use the location namespace the whole way through. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> /external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
f36993b46962eab4446bc1964eb47149751aee26	23-Nov-2015	Matt Turner <mattst88@gmail.com>	i965: Clean up #includes in the compiler. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
ecac1aab538d65f0867fd93e23d0d020c1a5d0f1	23-Nov-2015	Matt Turner <mattst88@gmail.com>	i965: Push down inclusion of brw_program.h. We were including it in headers, which then caused it to be included in tons of places it wasn't needed. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
d9b8fde963a53d4e06570d8bece97f806714507a	12-Nov-2015	Jason Ekstrand <jason.ekstrand@intel.com>	i965: Use NIR for lowering texture swizzle Now that nir_lower_tex can do texture swizzle lowering, we can use that instead of repeating more-or-less the same code in both backends. This both allows us to share code and means that things like the tg4 work-arounds are somewhat simpler because they don't have to take the swizzle into account. Reviewed-by: Connor Abbott <cwabbott0@gmail.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
f58813842bcece3498f55ec5d582466ccff92a5e	15-May-2015	Jason Ekstrand <jason.ekstrand@intel.com>	nir: s/nir_type_unsigned/nir_type_uint v2: do the same in tgsi_to_nir (Samuel) v3: added missing cases after rebase (Iago) v4: Add a blank space after '#' in one of the comments (Matt) Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Matt Turner <mattst88@gmail.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
0684aed8abc51308945ead050d2452b522937c0a	20-Nov-2015	Matt Turner <mattst88@gmail.com>	i965/vec4: Initialize nir_inputs with src_reg(). nir_locals, nir_ssa_values, and nir_system_values are all dst_reg (not that that makes a whole lot of sense to me), and only nir_inputs is a src_reg. Reviewed-by: Francisco Jerez <currojerez@riseup.net> /external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
99840eb983f74cd447546f7205c8c9f505ef82c8	18-Nov-2015	Ian Romanick <ian.d.romanick@intel.com>	i965: Enable EXT_shader_samples_identical On the vec4 backend, textureSamplesIdentical() will always return false. There are currently no test cases for the vec4 backend, so we don't have much confidence in any implementation. We also don't think anyone is likely to miss it. v2: Handle immediate value for MCS smarter. Rebase on changes to nir_texop_samples_identical (missing second parameter). Suggested by Jason. v3: Add Neil's code to handle 16x MSAA in the FS. Also rebase on top of f9a9ba5e. Stub out the vec4 implementation. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Signed-off-by: Neil Roberts <neil@linux.intel.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> [v2] Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> [v2] /external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
84b6c64efc52948da8db89b8d92d5e744e6cfc95	18-Nov-2015	Ian Romanick <ian.d.romanick@intel.com>	i965/vec4: Handle nir_tex_src_ms_index more like the scalar v2: Rebase on top of f9a9ba5e. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> /external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
457bb290efc162ea3c7c51a820ab7cf88a4efb8d	18-Nov-2015	Ian Romanick <ian.d.romanick@intel.com>	nir: Add nir_texop_samples_identical opcode This is the NIR analog to GLSL IR ir_samples_identical. v2: Don't add the second nir_tex_src_ms_index parameter. Suggested by Ken and Jason. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> /external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
f9a9ba5eac2f1934bd7fecc92cd309f22411164b	02-Nov-2015	Matt Turner <mattst88@gmail.com>	i965/vec4: Replace src_reg(imm) constructors with brw_imm_*(). Cuts 1.5k of .text. Reviewed-by: Emil Velikov <emil.velikov@collabora.co.uk> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> /external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
b163aa01487ab5f9b22c48b7badc5d65999c4985	27-Oct-2015	Matt Turner <mattst88@gmail.com>	i965: Rename GRF to VGRF. The 2-bit hardware register file field is ARF, GRF, MRF, IMM. Rename GRF to VGRF (virtual GRF) so that we can reuse the GRF name to mean an assigned general purpose register. Reviewed-by: Emil Velikov <emil.velikov@collabora.co.uk> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> /external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
dba309fc14d1ca99251c8f8115d2a26ac86f14f6	30-Oct-2015	Matt Turner <mattst88@gmail.com>	i965: Initialize registers. The test (file == BAD_FILE) works on registers for which the constructor has not run because BAD_FILE is zero. The next commit will move BAD_FILE in the enum so that it's no longer zero. In the case of this->outputs, the constructor was being run implicitly, and we were unnecessarily memsetting it to zero. Reviewed-by: Emil Velikov <emil.velikov@collabora.co.uk> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> /external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
eca4c43a33c5c1bb63c8aa9d0506ed2ba3f9d8cb	30-Oct-2015	Iago Toral Quiroga <itoral@igalia.com>	i965/vec4: Do not mark used surfaces in VS_OPCODE_GET_BUFFER_SIZE Do it in the visitor, like we do for other opcodes. v2: use const, get rid of useless surf_index temporary (Curro) Reviewed-by: Francisco Jerez <currojerez@riseup.net> /external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
6105d1d0a02c7eea83b327965713be3bada306f7	30-Oct-2015	Iago Toral Quiroga <itoral@igalia.com>	i965/vec4: Do not mark used direct surfaces in VS_OPCODE_PULL_CONSTANT_LOAD Right now the generator marks direct surfaces as used but leaves marking of indirect surfaces to the caller. Just make the callers handle marking in both cases for consistency. v2: Use const, do not add unnecessary temporary (Curro) Reviewed-by: Francisco Jerez <currojerez@riseup.net> /external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
56774e63028b2997a7d8c0abb5009a4c79f9a453	20-Oct-2015	Alejandro Piñeiro <apinheiro@igalia.com>	i965/vec4: select predicate based on writemask for sel emissions Equivalent to commit 8ac3b525c but with sel operations. In this case we select the PredCtrl based on the writemask. This patch helps on cases like this: 1: cmp.l.f0.0 vgrf40.0.x:F, vgrf0.zzzz:F, vgrf7.xxxx:F 2: cmp.nz.f0.0 null:D, vgrf40.xxxx:D, 0D 3: (+f0.0) sel vgrf41.0.x:UD, vgrf6.xxxx:UD, vgrf5.xxxx:UD In this case, cmod propagation can't optimize instruction #2, because instructions #1 and #2 have different writemasks, and we can't directly update the writemask of instruction #2 because our code thinks that sel at instruction #3 reads all four channels of the flag, when it actually only reads .x. So, with this patch, the previous case becomes this: 1: cmp.l.f0.0 vgrf40.0.x:F, vgrf0.zzzz:F, vgrf7.xxxx:F 2: cmp.nz.f0.0 null:D, vgrf40.xxxx:D, 0D 3: (+f0.0.x) sel vgrf41.0.x:UD, vgrf6.xxxx:UD, vgrf5.xxxx:UD Now only the x channel of the flag is used, allowing dead code elimination to update the writemask at the second instruction: 1: cmp.l.f0.0 vgrf40.0.x:F, vgrf0.zzzz:F, vgrf7.xxxx:F 2: cmp.nz.f0.0 null.x:D, vgrf40.xxxx:D, 0D 3: (+f0.0.x) sel vgrf41.0.x:UD, vgrf6.xxxx:UD, vgrf5.xxxx:UD So now cmod propagation can simplify out #2: 1: cmp.l.f0.0 vgrf40.0.x:F, attr18.wwww:F, vgrf7.xxxx:F 2: (+f0.0.x) sel vgrf41.0.x:UD, vgrf6.xxxx:UD, vgrf5.xxxx:UD Shader-db numbers: total instructions in shared programs: 6235835 -> 6228008 (-0.13%) instructions in affected programs: 219850 -> 212023 (-3.56%) total loops in shared programs: 1979 -> 1979 (0.00%) helped: 1192 HURT: 0 /external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
c22d62f5991f1c26c58c9ae1891202ea437d2f7b	26-Oct-2015	Matt Turner <mattst88@gmail.com>	i965/vec4: Clean up FBH code. It did a bunch of unnecessary stuff, emitting an extra MOV included. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
d9b09f8a306dfd471e45b5294c3adcb119114387	26-Oct-2015	Matt Turner <mattst88@gmail.com>	i965/vec4: Don't disable channels in any/all comparisons. We've made a mistake in calling the Channel Enable bits "writemask", because they do more than control which channels of the destination are written -- they actually control which channels are enabled (surprise! surprise!) So, if we emit cmp.z.f0(8) null.xy<1>D g10<4,4,1>.xyzzD g2<0,4,1>.xyzzD mov(8) g12<1>.xUD 0x00000000UD (+f0.all4h) mov(8) g12<1>.xUD 0xffffffffUD where the CMP instruction has only .xy channel enables, it won't write the .zw channels of the flag register, which are of course read by the +f0.all4 predicate. We need to always emit CMP instructions whose flag result might be read by such a predicate with all channels enabled. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
4379ca22f18f5731248ee794ab651db721ba38b2	07-Oct-2015	Emil Velikov <emil.velikov@collabora.com>	i965: Implement nir_intrinsic_shader_clock v2: - Add a few const qualifiers for good measure. - Drop unneeded retype()s (Matt) - Convert timestamp to SIMD8/16, as fs_visitor::get_timestamp() returns SIMD4 (Connor) v3: - Remove unneeded temporary + MOV (Connor) Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
8ac3b525c77cb5aae9e61bd984b78f6cbbffdc1c	09-Oct-2015	Alejandro Piñeiro <apinheiro@igalia.com>	i965/vec4: nir_emit_if doesn't need to predicate based on all the channels v2: changed comment, as suggested by Matt Turner Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
f534f331ca354bcb138e2b8f6d6d80147ee4a186	15-Oct-2015	Iago Toral Quiroga <itoral@igalia.com>	i965/vec4: Use the right number of UBOs Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
d3f45888045c84b2bc382a34d169a0ede4774a24	09-Oct-2015	Iago Toral Quiroga <itoral@igalia.com>	i965: Adapt SSBOs to work with their own separate index space Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> /external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
baee16bf02eedc6a32381d79da6c7ac942f782ae	28-Sep-2015	Iago Toral Quiroga <itoral@igalia.com>	nir: split SSBO min/max atomic intrinsics into signed/unsigned versions NIR is typeless so this is the only way to keep track of the type to select the proper atomic to use. v2: - Use imin,imax,umin,umax for the intrinsic names (Connor Abbott) - Change message for unreachable paths (Michael Schellenberger) Tested-by: Markus Wick <markus@selfnet.de> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> /external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
4642d53a03122e6d3214ed12cb327898917eb84e	09-Oct-2015	Matt Turner <mattst88@gmail.com>	i965/vec4: Implement b2f and b2i using negation. Curro added this in commit 3ee2daf23d (before the vec4/NIR backend was added) but it was missed in the new NIR backend. Add it there as well. instructions in affected programs: 1857 -> 1810 (-2.53%) helped: 15 Reviewed-by: Francisco Jerez <currojerez@riseup.net> /external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
1e3c1b107e075b210998998423901092b8fcd79b	03-Oct-2015	Jason Ekstrand <jason.ekstrand@intel.com>	i965: Use nir_foreach_variable Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> /external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
bf7b6fd3fd6d98305d64ee6224ca9f9e7ba48444	02-Oct-2015	Jason Ekstrand <jason.ekstrand@intel.com>	i965/shader: Get rid of the shader, prog, and shader_prog fields Unfortunately, we can't get rid of them entirely. The FS backend still needs gl_program for handling TEXTURE_RECTANGLE. The GS vec4 backend still needs gl_shader_program for handling transform feedback. However, the VS needs neither and we can substantially reduce the amount they are used. One day we will be free from their tyranny. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> /external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
ca6a436f12cb55e9415049a217229c99b02ad3b8	02-Oct-2015	Jason Ekstrand <jason.ekstrand@intel.com>	i965/vec4: Use nir info instead of pulling things out of [shader_]prog Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> /external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
7926c3ea7d8f455cbee390d20c78dadf5432b9bc	01-Oct-2015	Jason Ekstrand <jason.ekstrand@intel.com>	i965/backend_shader: Add a field to store the NIR shader Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> /external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
30c63571133ed50907ec14172c2f3ef82ee8a34e	01-Oct-2015	Jason Ekstrand <jason.ekstrand@intel.com>	i965: Move prog_data uniform setup to the codegen level As of now, uniform setup is more-or-less unified between vec4 and fs and no longer requires the fs_visitor. This makes uniform setup more of a language/API thing than a backend compiler thing. This commit moves setting up the stage_prog_data.params arrays to the same place as we set up the rest of stage_prog_data. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> /external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
cdf314cb21377ee7caca05bd1abab6a2b921d213	01-Oct-2015	Jason Ekstrand <jason.ekstrand@intel.com>	i965/nir: Simplify uniform setup Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
7fee8b6f055831bc070bb36d02a8b1c4d601652a	02-Oct-2015	Jason Ekstrand <jason.ekstrand@intel.com>	i965/nir: Pull GLSL uniform handling into a common function The way we deal with GLSL uniforms and builtins is basically the same in both the vec4 and the fs backend. This commit takes the best parts of both implementations and pulls the common code into a shared helper function. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
03c4171b577b06b1d8dde50b6eb9507d8ef4c1ce	29-Sep-2015	Jason Ekstrand <jason.ekstrand@intel.com>	i965/nir: Pull common ARB program uniform handling into a common function The way we deal with ARB program uniforms is basically the same in both the vec4 and the fs backend. This commit takes the best parts of both implementations and pulls the common code into a shared helper function. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
390b48fc4a9836b563560581fbfb4833546de0c8	30-Sep-2015	Jason Ekstrand <jason.ekstrand@intel.com>	i965/vec4: Use the uniform count from nir_assign_var_locations Previously, we were counting up uniforms as we set them up. However, this count should be exactly identical to shader->num_uniforms provided by nir_assign_var_locations. (If it's not, we're in trouble anyway because that means that locations don't match up.) This matches what the fs backend is already doing. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
5609e0d7b41e861a3359991e8d0f2053b255fc31	30-Sep-2015	Jason Ekstrand <jason.ekstrand@intel.com>	i965/vec4: Get rid of the uniform_vector_size array The uniform_vector_size array was only ever used by pack_uniform_registers which no longer needs it. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
023165a734b3bae52a449ad01bc1ea5ba4384ec1	15-Sep-2015	Samuel Iglesias Gonsalvez <siglesias@igalia.com>	i965/vec4/nir: add nir_intrinsic_memory_barrier support Fix OpenGL ES 3.1 conformance tests: advanced-readWrite-case1-vsfs and advanced-matrix-vsfs. v2: - Fix SHADER_OPCODE_MEMORY_FENCE emission and the allocation of 'tmp' (Francisco). Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com> Tested-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net> /external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
6668eb5a451c43ac78a784711cf239fdf7ca75ef	11-Sep-2015	Samuel Iglesias Gonsalvez <siglesias@igalia.com>	mesa: rename gl_shader_program's NumUniformBlocks to NumBufferInterfaceBlocks Because it counts shader storage blocks too. v2: - Use NumBufferInterfaceBlocks instead (Jordan). Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> /external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
5ef169034c77ede86546d8dc42f7f22abcd6faa0	07-Aug-2015	Iago Toral Quiroga <itoral@igalia.com>	i965/nir/vec4: Implement nir_intrinsic_ssbo_atomic_* Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> /external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
e3f9c7829c609e8a32da9f36c9829843f2204a37	10-Sep-2015	Iago Toral Quiroga <itoral@igalia.com>	i965/nir/vec4: Implement nir_intrinsic_load_ssbo Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> /external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
922b3d1bb16b4b6b292cb159e5fe3d0615ca725c	10-Sep-2015	Iago Toral Quiroga <itoral@igalia.com>	i965/nir/vec4: Implement nir_intrinsic_store_ssbo Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> /external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
65d7f5fe9f4284f7de867b4c412f086c6dcca176	26-Aug-2015	Samuel Iglesias Gonsalvez <siglesias@igalia.com>	i965/vec4/nir: implement nir_intrinsic_get_buffer_size Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> /external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
2e5423ad6345e027bb40c75ffc0e9e64843b9c05	23-Sep-2015	Jason Ekstrand <jason.ekstrand@intel.com>	i965/vec4: Add support for fdph_replicated Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> /external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
6ba291db4ba4f03ac94560eaae861bc162ac838e	18-Sep-2015	Eduardo Lima Mitev <elima@igalia.com>	i965/vec4/nir: Remove all "this->" snippets For consistency, either we have all class members dereferenced, or none. In this case, very few are, so let's get rid of them all. Reviewed-by: Timothy Arceri <t_arceri@yahoo.com.au> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> /external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
a88ce0c1c4c1f77209b71d5a6858f952642f385a	10-Sep-2015	Jason Ekstrand <jason.ekstrand@intel.com>	i965/vec4: Use the replicated fdot instruction in NIR Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Reviewed-by: Eduardo Lima Mitev <elima@igalia.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
c951bb83056724df02ba7e6fe2dfa720c0f45c1f	09-Sep-2015	Jason Ekstrand <jason.ekstrand@intel.com>	i965/vec4_nir: Use partial SSA form rather than full non-SSA We made this switch in the FS backend some time ago and it seems to make a number of things a bit easier. In particular, supporting SSA values takes very little work in the backend and allows us to take advantage of the majority of the SSA information even after we've gotten rid of Phi nodes. Reviewed-by: Eduardo Lima Mitev <elima@igalia.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
b8d2263c83d29f4626ac0fe0316978aa6262aefb	14-Sep-2015	Antia Puentes <apuentes@igalia.com>	i965/vec4_nir: Load constants as integers Loads constants using integer as their register type, like it is done in FS backend. No shader-db changes in HSW. Cc: "10.6 11.0" <mesa-stable@lists.freedesktop.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91716 Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
0b91bcea98c0fe201bba89abe1ca3aee4d04c56c	12-Aug-2015	Ilia Mirkin <imirkin@alum.mit.edu>	i965: add support for textureSamples function Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> [v2: kayden-supplied code in fs_nir replacing need for logical opcode] Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> /external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
0cc331dddd1a99c7af3619c92c48b5c32e17f6b3	04-Aug-2015	Kenneth Graunke <kenneth@whitecape.org>	i965/nir: Use nir_system_value_from_intrinsic to reduce duplication. This code is all pretty much identical. We just needed the translation from one enum value to the other. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
27e83b62bb52de7a681ed82679a707555023f43d	28-Aug-2015	Kenneth Graunke <kenneth@whitecape.org>	i965: Store a key_tex pointer in vec4_visitor. I'm about to remove the base class for VS/GS/HS/DS program keys, at which point we won't be able to use key->tex anymore. Instead, we'll need to store a direct pointer (like we do in the FS backend). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> /external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
cfa056c6a5eadf87f92a71346c0dddd2a080e302	18-Aug-2015	Jason Ekstrand <jason.ekstrand@intel.com>	i965/vec4_nir: Get rid of the uniform_driver_location tracking Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> /external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
0db8e87b4a16b123f7c0b44d54f23b535a136ee6	18-Aug-2015	Jason Ekstrand <jason.ekstrand@intel.com>	nir/intrinsics: Add a second const index to load_uniform In the i965 backend, we want to be able to "pull apart" the uniforms and push some of them into the shader through a different path. In order to do this effectively, we need to know which variable is actually being referred to by a given uniform load. Previously, it was completely flattened by nir_lower_io which made things difficult. This adds more information to the intrinsic to make this easier for us. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> /external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
640c472fd075814972b1276c5b0ed3a769aacda5	12-Aug-2015	Kenneth Graunke <kenneth@whitecape.org>	i965: Move type_size() methods out of visitor classes. I want to use C function pointers to these, and they don't use anything in the visitor classes anyway. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
2450cbfcbc3671056afad9e858acadbb6edea068	12-Aug-2015	Matt Turner <mattst88@gmail.com>	i965/vec4/nir: Emit single MOV to generate a scalar constant. If an immediate is written to multiple channels, we can load it in a single writemasked MOV. total instructions in shared programs: 6285144 -> 6261991 (-0.37%) instructions in affected programs: 718991 -> 695838 (-3.22%) helped: 5762 Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
c1d9b3ae0bb0f1222719d7737dd9986e437bf5b9	04-Aug-2015	Jason Ekstrand <jason.ekstrand@intel.com>	i965/vec4_nir: Properly handle integer multiplies on BDW+ Reviewed-by: Matt Turner <mattst88@gmail.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
1d658cf8795383dbef127e46f3740b516bfe21b9	03-Aug-2015	Jason Ekstrand <jason.ekstrand@intel.com>	i965/vec4_nir: Do boolean source modifier resolves on BDW+ On BDW+, the negation source modifier on NOT, AND, OR, and XOR is actually a boolean negate and not an integer negate. However, NIR's source modifiers are the integer version. We have to resolve it with a MOV prior to emitting the actual instruction. This is basically the same thing we do in the FS backend. Reviewed-by: Matt Turner <mattst88@gmail.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
5e1c1c2fcbdfb96a973ae3fd196e341ab2d41833	03-Aug-2015	Jason Ekstrand <jason.ekstrand@intel.com>	i965/vec4-nir: Handle boolean resolves on ILK- The analysis code was already there and running, we just weren't doing anything with the result of it yet. Reviewed-by: Matt Turner <mattst88@gmail.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
82f2e706bfd646b91bc0b8beecdff4e54b1f7b04	29-Jun-2015	Antia Puentes <apuentes@igalia.com>	i965/nir/vec4: Handle uniforms on vertex programs The implementation takes into account that on ARB_vertex_program only a single nir variable is generated to support all the uniform data. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
19cf934f7f18237e1a212b0a019026d5d36c6fac	06-Jul-2015	Alejandro Piñeiro <apinheiro@igalia.com>	i965/nir/vec4: Add implementation of nir_emit_texture() Uses the nir structure to get all the info needed (sources, dest reg, etc), and then it uses the common vec4_visitor::emit_texture to emit the final code. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
583c1c61703826002ba0f202e8ef7bc2c822ef1d	17-Jun-2015	Eduardo Lima Mitev <elima@igalia.com>	i965/nir/vec4: Implement nir_emit_jump This implementation is taken as-is from fs_nir. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
9b4a6fa4c09d36e0e5c00309e6ea37300ea38f78	17-Jun-2015	Antia Puentes <apuentes@igalia.com>	i965/nir/vec4: Mark as unreachable ops that should be already lowered NIR ALU operations: * nir_op_fabs * nir_op_iabs * nir_op_fneg * nir_op_ineg * nir_op_fsat should be lowered by lower_source mods * nir_op_fdiv should be lowered in the compiler by DIV_TO_MUL_RCP. * nir_op_fmod should be lowered in the compiler by MOD_TO_FLOOR. * nir_op_fsub * nir_op_isub should be handled by ir_sub_to_add_neg. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
16072834babc487f78472f7e7b59d35249a3aac8	17-Jun-2015	Antia Puentes <apuentes@igalia.com>	i965/nir/vec4: Implement vector "any" operation Adds NIR ALU operations: * nir_op_bany2 * nir_op_bany3 * nir_op_bany4 Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
fa4e3c3c9f6f3a72a032499fccaa6e222d6a7fa4	17-Jun-2015	Antia Puentes <apuentes@igalia.com>	i965/nir/vec4: Implement the dot product operation Adds NIR ALU operations: * nir_op_fdot2 * nir_op_fdot3 * nir_op_fdot4 Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
96106e2a9f214d98fc2e99c65398f95d41a3b879	17-Jun-2015	Antia Puentes <apuentes@igalia.com>	i965/nir/vec4: Implement conditional select Adds NIR ALU operations: * nir_op_bcsel Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
b38fcd0aea8d17919ecd9cc7afc518cfb2c01c27	17-Jun-2015	Antia Puentes <apuentes@igalia.com>	i965/nir/vec4: Implement linear interpolation Adds NIR ALU operation: * nir_op_flrp Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
b64bd1fdc37eed1bb62d2b32ad22f0f77501f7f2	17-Jun-2015	Antia Puentes <apuentes@igalia.com>	i965/nir/vec4: Implement floating-point fused multiply-add Adds NIR ALU operation: * nir_op_ffma Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
d12e165dbb403c3cf86ab7f1b8f28ab6188b479f	17-Jun-2015	Antia Puentes <apuentes@igalia.com>	i965/nir/vec4: Implement "shift" operations Adds NIR ALU operations: * nir_op_ishl * nir_op_ishr * nir_op_ushr Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
798cb33a256f703ecaf56d4443e12055484d4bcc	17-Jun-2015	Antia Puentes <apuentes@igalia.com>	i965/nir/vec4: Implement the "sign" operation Follows the vec4_visitor IR implementation but sets the saturate value in addition. Adds NIR ALU operations: * nir_op_fsign * nir_op_isign Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
8e1e6facbf828258a9a8ca09da846d1baa21d984	17-Jun-2015	Antia Puentes <apuentes@igalia.com>	i965/nir/vec4: Implement bit operations Same implementation as the IR case. Adds NIR ALU operations: * nir_op_bitfield_reverse * nir_op_bit_count * nir_op_ufind_msb * nir_op_ifind_msb * nir_op_find_lsb * nir_op_ubitfield_extract * nir_op_ibitfield_extract * nir_op_bfm * nir_op_bfi * nir_op_bitfield_insert Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
0e874985ce50d902535e1eb766bd252c921b5d8f	17-Jun-2015	Antia Puentes <apuentes@igalia.com>	i965/nir/vec4: Implement pack/unpack operations * Lowered floating-point pack and unpack operations are not valid in VS. * Pack and unpack 2x16 operations should be handled by lower_packing_builtins. * Adds NIR ALU operations: * nir_op_pack_half_2x16 * nir_op_unpack_half_2x16 * nir_op_unpack_unorm_4x8 * nir_op_unpack_snorm_4x8 * nir_op_pack_unorm_4x8 * nir_op_pack_snorm_4x8 Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
3f10c2f3d73ae41ff83afcdbe225121b8336f499	17-Jun-2015	Antia Puentes <apuentes@igalia.com>	i965/nir/vec4: "noise" ops should already be lowered Marked them as unreachable. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
fa4731f4a53aa21e53a62f42f3afdc19b0ce4c8e	17-Jun-2015	Antia Puentes <apuentes@igalia.com>	i965/nir/vec4: Implement "bool<->int,float" format conversion Used the same implementation as the vec4_visitor NIR. Adds NIR ALU operations: * nir_op_b2i * nir_op_b2f * nir_op_f2b * nir_op_i2b Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
f14199a8fb802f6672d559fa958a5ee84e3e13f1	17-Jun-2015	Antia Puentes <apuentes@igalia.com>	i965/nir/vec4: Implement logical operators Adds NIR ALU operations: * nir_op_inot * nir_op_ixor * nir_op_ior * nir_op_iand Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
51aeafaf96b3b349e007ad05738bc1e05663fedf	17-Jun-2015	Antia Puentes <apuentes@igalia.com>	i965/nir/vec4: Implement non-equality ops on vectors Adds NIR ALU operations: * nir_op_bany_fnequal2 * nir_op_bany_inequal2 * nir_op_bany_fnequal3 * nir_op_bany_inequal3 * nir_op_bany_fnequal4 * nir_op_bany_inequal4 Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
8be4b876c90192c3a5e6fcc9b526f43a3f7bfc11	17-Jun-2015	Antia Puentes <apuentes@igalia.com>	i965/nir/vec4: Implement equality ops on vectors Adds NIR ALU operations: * nir_op_ball_fequal2 * nir_op_ball_iequal2 * nir_op_ball_fequal3 * nir_op_ball_iequal3 * nir_op_ball_fequal4 * nir_op_ball_iequal4 Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
84d4a9dc2ca3d98f19cc9125a5ff1ac1225f360d	17-Jun-2015	Antia Puentes <apuentes@igalia.com>	i965/nir/vec4: Implement non-vector comparison ops Adds NIR ALU operations: * nir_op_flt * nir_op_ilt * nir_op_ult * nir_op_fge * nir_op_ige * nir_op_uge * nir_op_feq * nir_op_ieq * nir_op_fne * nir_op_ine Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
b9c41affcf67f30d7f6c74c17ea34bc42756d56d	17-Apr-2015	Antia Puentes <apuentes@igalia.com>	i965/nir: Add utility method for comparisons This method returns the brw_conditional_mod value used when emitting comparative ALU operations. It could be moved to brw_nir in the future to reuse it in fs_nir backend. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
dae6025e8efdfb759458a3243c8cd1588f485135	14-Apr-2015	Antia Puentes <apuentes@igalia.com>	i965/nir/vec4: Derivatives are not allowed in VS Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
5e6f1c38a591fa39cff1c32a2cfdda927145756a	17-Jun-2015	Antia Puentes <apuentes@igalia.com>	i965/nir/vec4: Implement min/max operations Adds NIR ALU operations: * nir_op_fmin * nir_op_imin * nir_op_umin * nir_op_fmax * nir_op_imax * nir_op_umax Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
7553a51a68c0b2030265fe741f9c511b65047914	17-Jun-2015	Antia Puentes <apuentes@igalia.com>	i965/nir/vec4: Implement various rounding functions Adds NIR ALU operations: * nir_op_ftrunc * nir_op_fceil * nir_op_ffloor * nir_op_ffrac * nir_op_fround_even Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
0ce159ec7fbcdf00c488b77f63e565e89ef6cab5	17-Jun-2015	Antia Puentes <apuentes@igalia.com>	i965/nir/vec4: Implement carry/borrow for addition/subtraction Adds NIR ALU operations: * nir_op_uadd_carry * nir_op_usub_borrow Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
62cef7b0723ad6ca49ed06a6899a5852e41359e8	17-Jun-2015	Antia Puentes <apuentes@igalia.com>	i965/nir/vec4: Implement more math operations Adds NIR ALU operations: * nir_op_frcp * nir_op_fexp2 * nir_op_flog2 * nir_op_fexp * nir_op_flog * nir_op_fsin * nir_op_fcos * nir_op_idiv * nir_op_udiv * nir_op_umod * nir_op_ldexp * nir_op_fsqrt * nir_op_frsq * nir_op_fpow Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
9acebf146184c35e6897b91fff414c5295d47996	16-Jun-2015	Antia Puentes <apuentes@igalia.com>	i965/nir/vec4: Implement multiplication Implementation based on the vec4_visitor IR implementation for the operations ir_binop_mul and ir_binop_imul_high. Adds NIR ALU operations: * nir_op_fmul * nir_op_imul * nir_op_imul_high * nir_op_umul_high Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
0675842b56a956befbac4a3b912823e73a95a500	16-Jun-2015	Antia Puentes <apuentes@igalia.com>	i965/nir/vec4: Implement the addition operation Adds NIR ALU operations: * nir_op_fadd * nir_op_iadd Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
4f39b547da4f9949d1b1f9f0df07d08951f0358d	16-Jun-2015	Antia Puentes <apuentes@igalia.com>	i965/nir/vec4: Implement int<->float format conversion ops Adds NIR ALU operations: * nir_op_f2i * nir_op_f2u * nir_op_i2f * nir_op_u2f Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
e4f02f47e70d384531ac68e6d33a62fdcdbd1f28	16-Jun-2015	Antia Puentes <apuentes@igalia.com>	i965/nir/vec4: Lower "vecN" instructions and mark them unreachable This enables NIR pass "lower_vec_to_movs" on shaders that work on vec4. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
79154d99d6e760b1daf327b4594dded18f1d4191	16-Jun-2015	Antia Puentes <apuentes@igalia.com>	i965/nir/vec4: Implement single-element "mov" operations Adds NIR ALU operations: * nir_op_imov * nir_op_fmov Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
ef1b30ae637e613b384541324c199d2dbe6b44bd	16-Jun-2015	Antia Puentes <apuentes@igalia.com>	i965/nir/vec4: Prepare source and destination registers for ALU operations This patch resolves and initializes the destination and the source registers that are common to most ALU operations. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
168bbfa6ff22a586ad6307c187cfa3b8fff5f227	16-Jun-2015	Antia Puentes <apuentes@igalia.com>	i965/nir/vec4: Implement loading values from an UBO Based on the vec4_visitor IR implementation for the ir_binop_load_ubo operation. Notice that unlike the vec4_visitor IR, adding the !=0 comparison for UBO bools is not needed here because that comparison is already added by the nir_visitor when processing the ir_binop_load_ubo (in UBOs "true" is any value different from zero, but for us it is ~0). Adds NIR intrinsics: * nir_intrinsic_load_ubo_indirect * nir_intrinsic_load_ubo Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
98d07022f5312967bdfd54069869c8d6c65117a7	16-Jun-2015	Alejandro Piñeiro <apinheiro@igalia.com>	i965/nir/vec4: Implement atomic counter intrinsics (read, inc and dec) The implementation is based on its fs_nir counterpart. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
e6cafb5dfdef8d8d25ee1e3375304cf35897d1f7	16-Jun-2015	Iago Toral Quiroga <itoral@igalia.com>	i965/nir/vec4: Implement load_uniform intrinsic For the indirect case we need to take the index delivered by NIR and compute the parent uniform that we are accessing (the one that we uploaded to a surface) and the constant offset into that surface. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
e76e8caecd30799500357a45468329f033a93932	16-Jun-2015	Alejandro Piñeiro <apinheiro@igalia.com>	i965/nir/vec4: Implement intrinsics that load system values These include: nir_intrinsic_load_vertex_id_zero_base nir_intrinsic_load_base_vertex nir_intrinsic_load_instance_id The source register is fetched from the nir_system_values map initialized during nir_setup_system_values stage. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
662c4c99065381b8e265310d176cfdef6698ca57	16-Jun-2015	Eduardo Lima Mitev <elima@igalia.com>	i965/nir/vec4: Implement store_output intrinsic This implementation is based on the current URB setup in vec4_visitor, which requires the output register to be stored in the output_reg array at variable's original shader location index. But since nir_lower_io() pass uses the value in var->data.driver_location, we need to put there var->data.location instead, prior to calling nir_lower_io(), so that we end up with the correct index in const_index[0]. The driver_location is not used at all, so this patch also disables the nir_assign_var_locations pass on non-scalar shaders. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
167cb9663adc8c7c61807e503f66e85f955e7d5f	16-Jun-2015	Eduardo Lima Mitev <elima@igalia.com>	i965/nir/vec4: Implement load_input intrinsic The source register is fetched from the nir_inputs map built during nir_setup_inputs stage. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
afe085a0ca01f659c69456018e5f5076c9dde47d	16-Jun-2015	Eduardo Lima Mitev <elima@igalia.com>	i965/nir/vec4: Implement loop statements (nir_cf_node_loop) This is taken as-is from fs_nir. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
5c0436dbf87fef76ba67456f215d9285c38f1816	16-Jun-2015	Iago Toral Quiroga <itoral@igalia.com>	i965/nir/vec4: Implement conditional statements (nir_cf_node_if) The same as we do in the FS NIR backend, except that here we need to consider the number of components in the condition and adjust the swizzle accordingly. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
f3187ea31ede6bc181ee561573d127aa2e485657	16-Jun-2015	Eduardo Lima Mitev <elima@igalia.com>	i965/nir/vec4: Add get_nir_dst() and get_nir_src() methods These methods are essential for the implementation of the NIR->vec4 pass. They work similarly to their fs_nir counterparts. When processing instructions, these methods are invoked to resolve the brw registers (source or destination) corresponding to the NIR sources or destination. They use the map of NIR register index to brw register for all registers locally allocated in a block. Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
f7152525374015594e037fa11bb64e1c7174829b	01-Jul-2015	Eduardo Lima Mitev <elima@igalia.com>	i965/nir/vec4: Implement load_const intrinsic Similar to fs_nir backend, a nir_local_values map will be filled with newly allocated registers as the load_const intrinsic instructions are processed. Later, get_nir_src() will fetch the registers from this map for sources that are ssa. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
59006d3ad3ed5d29e84afa5931f425344e2ef658	22-Jul-2015	Eduardo Lima Mitev <elima@igalia.com>	i965/nir/vec4: Add shader function implementation It basically allocates registers local to a function in a nir_locals map, then emits all its control-flow blocks. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
4023b55fdd7005a8a100637c229a1c40648cdd2b	16-Jun-2015	Alejandro Piñeiro <apinheiro@igalia.com>	i965/nir/vec4: Add setup for system values Similar to other variable setups, system values will initialize the corresponding register inside a 'nir_system_values' map, which will then be queried later when processing the different system value intrinsics for the appropriate register. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
195156e571e851273c135847f91ed73b3bfc1914	16-Jun-2015	Iago Toral Quiroga <itoral@igalia.com>	i965/nir/vec4: Add setup of uniform variables Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
b929acb6a8659fdc06623b766bdf59904d8a3558	16-Jun-2015	Eduardo Lima Mitev <elima@igalia.com>	i965/nir/vec4: Add setup of input variables in NIR->vec4 pass This implementation sets up a map of input variable offsets to source registers that are already initialized with the corresponding register offset. This map will then be queried when processing load_input intrinsic operations, to obtain the correct register source from which the input data will be loaded. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
abf4fa3c03ebe5716c90c8a310945c3621cf598f	16-Jun-2015	Eduardo Lima Mitev <elima@igalia.com>	i965/nir/vec4: Add implementation placeholders for a new NIR->vec4 pass This patch will add a brw_vec4_nir.cpp file filled with entry point methods to the main functionality, following a structure similar to brw_fs_nir.cpp. Subsequent patches in this series will be adding the implementations for these methods, incrementally. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> /external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp