History log of /external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
Revision Date Author Comments (<<< Hide modified files) (Show modified files >>>)
f51a5b51ab92ada4b9f3b1d603f9de60b66e46ce 06-Jul-2016 Juan A. Suarez Romero <jasuarez@igalia.com> i965/vec4: emit correctly load_inputs for 64bit data

For dvec3 and dvec4 types, a single GRF do not have enough space to
allocate two inputs from two different vertices (SIMD4x2).

So the GRF only contains first two components for the two vertices, and
the next GRF has the remaining components.

We want to put all the components for the same vertex in the same
register. Thus, we do a shuffle to reorder the data.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
301fdfd8387856ea83c0ac0bff95915c0872c2f4 07-Dec-2016 Samuel Iglesias Gonsálvez <siglesias@igalia.com> vec4: use DIM instruction when loading DF immediates in HSW

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
b76f2206f550c37835d4e19eea1588caa0211b85 01-Jul-2016 Iago Toral Quiroga <itoral@igalia.com> i965/vec4: fix store output for 64-bit types

We need to shuffle the data before it is written to the URB. Also,
dvec3/4 need two vec4 slots.

v2: use byte_offset() instead of offset().

Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
8eea41e75d86bfe9bef5f69b25ad797da236a008 12-Feb-2016 Iago Toral Quiroga <itoral@igalia.com> i965/vec4: Fix SSBO stores for 64-bit data

In this case we need to shuffle the 64-bit data before we write it
to memory, source from reg_offset + 1 to write components Z and W
and consider that each DF channel is twice as big.

v2: use byte_offset() instead of offset().

Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
9998d55afd179ad5019d3841e4c3255a02fd2d7b 13-Jul-2016 Iago Toral Quiroga <itoral@igalia.com> i965/vec4: Fix SSBO loads for 64-bit data

Same requirements as for UBO loads.

v2:
- use byte_offset() instead of offset() (Iago)
- keep the const. offset as an immediate like the original code did (Juan)

Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
4486c90aaeb08f424ce17f842f46d24d1ceaadcb 13-Jul-2016 Iago Toral Quiroga <itoral@igalia.com> i965/vec4: Fix UBO loads for 64-bit data

We need to emit 2 32-bit load messages to load a full dvec4. If only
1 or 2 double components are needed dead-code-elimination will remove
the second one.

We also need to shuffle the result of the 32-bit messages to form
valid 64-bit SIMD4x2 data.

v2:
- use byte_offset() instead of offset() (Iago)
- keep the const. offset as an immediate like the original code did (Juan)

Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
d8e123cc5d66022069f3aee53318bfd1075bcc53 22-Jun-2016 Iago Toral Quiroga <itoral@igalia.com> i965/vec4: Add a shuffle_64bit_data helper

SIMD4x2 64bit data is stored in register space like this:

r0.0:DF x0 y0 z0 w0
r1.0:DF x1 y1 z1 w1

When we need to write data such as this to memory using 32-bit write
messages we need to shuffle it in this fashion:

r0.0:DF x0 y0 x1 y1
r0.1:DF z0 w0 z1 w1

and emit two 32-bit write messages, one for r0.0 at base_offset
and another one for r0.1 at base_offset+16.

We also need to do the inverse operation when we read using 32-bit messages
to produce valid SIMD4x2 64bit data from the data read. We can achieve this
by aplying the exact same shuffling to the data read, although we need to
apply different channel enables since the layout of the data is reversed.

This helper implements the data shuffling logic and we will use it in
various places where we read and write 64bit data from/to memory.

v2 (Curro):
- Use the writemask helper and don't assert on the original writemask
being XYZW.
- Use the Vec4 IR builder to simplify the implementation.

v3 (Iago):
- Use byte_offset() instead of offset().

v3:
- Fix typo (Matt)
- Clarify the example and fix indention (Matt).

Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
82e9dda8bf8875d232840585f48763c7a7092918 08-Jun-2016 Iago Toral Quiroga <itoral@igalia.com> i965/vec4/nir: do not emit 64-bit MAD

RepCtrl=1 does not work with 64-bit operands so we need to use RepCtrl=0.

In that situation, the regioning generated for the sources seems to be
equivalent to <4,4,1>:DF, so it will only work for components XY, which
means that we have to move any other swizzle to a temporary so that we can
source from channel X (or Y) in MAD and we also need to split the instruction
(we are already scalarizing DF instructions but there is room for
improvement and with MAD would be more restricted in that area)

Also, it seems that MAD operations like this only write proper output for
channels X and Y, so writes to Z and W also need to be done to a temporary
using channels X/Y and then move that to channels Z or W of the actual dst.

As a result the code we produce for native 64-bit MAD instructions is rather
bad, and much worse than just emitting MUL+ADD. For reference, a simple case
of a fully scalarized dvec4 MAD operation requires 15 instructions if we use
native MAD and 8 instructions if we emit ADD+MUL instead. There are some
improvements that we can do to the emission of MAD that might bring the
instruction count down in some cases, but it comes at the expense of a more
complex implementation so it does not seem worth it, at least initially.

This patch makes translation of NIR's 64-bit FMMA instructions produce MUL+ADD
instead of MAD. Currently, there is nothing else in the vec4 backend that emits
MAD instructions, so this is sufficient and it helps optimization passes see
MUL+ADD from the get go.

Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
b58026b31e7258a4bd2bb630a1d41a433fb01799 07-Jul-2016 Samuel Iglesias Gonsálvez <siglesias@igalia.com> i965/vec4: use the new helper function to create double immediates

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
98da3623d5dfd991362c4fd3571325fe0277a2f9 09-Mar-2016 Iago Toral Quiroga <itoral@igalia.com> i965/vec4: add a helper function to create double immediates

Gen7 hardware does not support double immediates so these need
to be moved in 32-bit chunks to a regular vgrf instead. Instead
of doing this every time we need to create a DF immediate,
create a helper function that does the right thing depending
on the hardware generation.

v2 (Curro):
- Use swizzle() and writemask() helpers and make tmp const.

v3 (Iago):
- Adapt to changes in offset()

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
8f9ce5fa22c04b5b34aa6dc67e4a9b2d151d293d 18-Feb-2016 Iago Toral Quiroga <itoral@igalia.com> i965/vec4: fix optimize predicate for doubles

Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
1816ae8f68e395da26dcfea2539bafd715c8dbc4 05-Feb-2016 Iago Toral Quiroga <itoral@igalia.com> i965/vec4: implement fsign() for doubles

v2: use a MOV with a conditional_mod instead of a CMP, like we do in d2b, to skip
loading a double immediate.

v3: Fix comment (Matt)

Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
6e570619e0755a50b2c8d57c6d1189fb9aca899d 17-Feb-2016 Iago Toral Quiroga <itoral@igalia.com> i965/vec4: implement d2b

v2 (Curro):
- Generate the flag register with a MOV with conditional_mod instead of a CMP
instruction, which has the benefit that we can skip loading a DF
0.0 constant.
- Avoid the PICK_LOW_32BIT + MOV by using the flag result and a
SEL to set the boolean result.

v3:
- Fix comment (Matt)

Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
c1fb525016e41658d2dc5d581da4e83b8a075fd4 17-Feb-2016 Iago Toral Quiroga <itoral@igalia.com> i965/vec4: implement d2i, d2u, i2d and u2d

Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
4b2257623494ea8e7a1c7b6fbb2f4f3e59522468 29-Jun-2016 Iago Toral Quiroga <itoral@igalia.com> i965/vec4: implement HW workaround for align16 double to float conversion

From the BDW PRM, Workarounds chapter:

"DF->f format conversion for Align16 has wrong emask calculation when
source is immediate."

Notice that Broadwell and later are strictly scalar at the moment though, so
this is not really necessary.

v2: Instead of moving the immediate to a vgrf and converting from there, just
convert the double immediate to float in the compiler and move the result
to the destination (Matt)

Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
bfc1f0f017db6bd11a558237c9a4ebeacf73f5ba 29-Jun-2016 Iago Toral Quiroga <itoral@igalia.com> i965/vec4: add helpers for conversions to/from doubles

Use these helpers to implement d2f and f2d. We will reuse these helpers when
we implement things like d2i or i2d as well.

v2:
- Rename the helpers (Matt)

Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
c722a8e61ebc72d7d21c2bed0f623218d739fdb7 17-Feb-2016 Iago Toral Quiroga <itoral@igalia.com> i965/vec4: Rename DF to/from F generator opcodes

The opcodes are not specific for conversions to/from float since we need
the same for conversions to/from other 32-bit types. Rename the opcodes
accordingly and change the asserts to check the size of the types involved
instead.

v2:
- Rename to VEC4_OPCODE_TO_DOUBLE and VEC4_OPCODE_FROM_DOUBLE (Matt)

Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
619271ec8785ab8b6021d0f49e98c51d457eab4d 15-Feb-2016 Iago Toral Quiroga <itoral@igalia.com> i965/vec4: fix register allocation for 64-bit undef sources

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
a8318b120e53518ae4d933acd876b8dbd3871e0c 12-Feb-2016 Iago Toral Quiroga <itoral@igalia.com> i965/vec4: fix get_nir_dest() to use DF type for 64-bit destinations

v2: Make dst_reg_for_nir_reg() handle this for nir_register since we
want to have the correct type set before we call offset().

Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
bb0e67d55dbd353e9c57b0709fa3e534f1aba05f 05-Oct-2016 Iago Toral Quiroga <itoral@igalia.com> i965/vec4: fix indentation in get_nir_src()

Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
8cdbbbd2cf9e0c42114c7090805fa2b4a93ca499 14-Aug-2015 Iago Toral Quiroga <itoral@igalia.com> i965/vec4/nir: implement double comparisons

v2:
- Added newline before if() (Matt)

Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
8a3ba033397bc627e499fcd3a379984ba4d587d2 01-Jun-2016 Iago Toral Quiroga <itoral@igalia.com> i965/vec4: implement double packing

Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
94cfdf586a6a95bd06b989bba27d85f9bf99b9df 01-Jun-2016 Iago Toral Quiroga <itoral@igalia.com> i965/vec4: implement double unpacking

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
4c040332f56ca2e5a4bbd8c412fd32ab3ff821db 10-Nov-2015 Iago Toral Quiroga <itoral@igalia.com> i965/vec4: We only support 32-bit integer ALU operations for now

Add asserts so we remember to address this when we enable 64-bit
integer support, as suggested by Connor and Jason.

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
e09a6be3b6806c582347f6faf93cc2d824d98ed2 14-Aug-2015 Iago Toral Quiroga <itoral@igalia.com> i965/vec4: translate d2f/f2d

Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
9ce4b20bde4f4ca8e8907fcac13e8bb9d7e5f4b4 14-Aug-2015 Iago Toral Quiroga <itoral@igalia.com> i965/vec4/nir: fix emitting 64-bit immediates

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
3457252b74f5490cff8915cac1e5fe0bf1031f5b 13-Aug-2015 Connor Abbott <connor.w.abbott@intel.com> i965/vec4/nir: set the right type for 64-bit registers

Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
fef06f635610ddc730a213576e59afb638c6051d 25-May-2016 Iago Toral Quiroga <itoral@igalia.com> i965/vec4/nir: support doubles in ALU operations

Basically, this involves considering the bit-size information to set
the appropriate type on both operands and destination.

v2 (Curro)
- Don't use two temporaries (and write one of them twice ) to obtain
the nir_alu_type.

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
0f096b1e5a5e31a5efba7279326ec8bc8478bb56 02-Nov-2015 Iago Toral Quiroga <itoral@igalia.com> i965/vec4/nir: Add bit-size information to types

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
2d81a292036445c440e56d07ce3d5294e0411d71 29-Feb-2016 Connor Abbott <connor.w.abbott@intel.com> i965/vec4/nir: allocate two registers for dvec3/dvec4

v2 (Curro):
- Do not special-case for a bit-size of 64, divide the bit_size by 32
instead.
- Use DIV_ROUND_UP so we can handle sub-32-bit types.

v3 (Ian):
- Make num_regs const.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
54913850aa379f57fcbf7a2dec5ea236cf997646 10-Aug-2015 Connor Abbott <connor.w.abbott@intel.com> i965/vec4/nir: simplify glsl_type_for_nir_alu_type()

Less duplication, one one less case to handle for doubles and support
for sized NIR types.

v2: Fix call to get_instance by swapping rows and columns params (Iago)

Signed-off-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
fd249c803e3ae2acb83f5e3b7152728e73228b7b 12-Dec-2016 Ilia Mirkin <imirkin@alum.mit.edu> treewide: s/comparitor/comparator/

git grep -l comparitor | xargs sed -i 's/comparitor/comparator/g'

Just happened to notice this in a patch that was sent and included one
of the tokens in question.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
4f2d1d6ea713df8f8d816b48b9e99c7117cf36d7 28-Nov-2016 Ilia Mirkin <imirkin@alum.mit.edu> i965: support constant gather offsets larger than 4 bits

Offsets that don't fit into 4 bits need to force gather_po to be
selected. Adjust the logic so that this happens.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
f182e5eafc31ebc7c140e9a369d5f747948733ae 17-Oct-2016 Kenneth Graunke <kenneth@whitecape.org> i965/vec4: Handle component qualifiers on non-generic varyings.

ARB_enhanced_layouts only requires component qualifier support for
generic varyings, so this is all the vec4 backend knew how to handle.

This patch extends the backend to handle it for all varyings, so we
can use store_output intrinsics with a component set for things like
clip/cull distances. We may want to use that for other VUE header
fields in the future as well.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
66fcfa6894ab61a8cb70955f4a4113729e4a8099 03-Oct-2016 Iago Toral Quiroga <itoral@igalia.com> i965/vec4: make offset() work in terms of a simd width and scalar components

So that it has the same semantics as the scalar backend implementation. The
helper will now take a simd width (which is always 8 in vec4 mode) and step
as many scalar components as specified by that width, respecting the size of
the scalar channels.

v2 (Curro):
- Remove the assertion in offset(), byte_offset() has the same checks.
- Use byte_offset() directly instead of add_byte_offset().
- Make things more clear by explicitly including the vertical stride
in the byte offset expression.

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
e1af20f18a86f52a9640faf2d4ff8a71b0a4fa9b 13-Oct-2016 Timothy Arceri <timothy.arceri@collabora.com> nir/i965/anv/radv/gallium: make shader info a pointer

When restoring something from shader cache we won't have and don't
want to create a nir_shader this change detaches the two.

There are other advantages such as being able to reuse the
shader info populated by GLSL IR.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
89e1436e2d4ff0c15202708979eb36761cae4167 11-Oct-2016 Ian Romanick <ian.d.romanick@intel.com> i965: Silence unused parameter warnings

brw_link.cpp:76:44: warning: unused parameter ‘shader_type’ [-Wunused-parameter]
gl_shader_stage shader_type,
^
brw_nir.c: In function ‘brw_nir_lower_vs_inputs’:
brw_nir.c:194:55: warning: unused parameter ‘devinfo’ [-Wunused-parameter]
const struct gen_device_info *devinfo,
^
brw_vec4_visitor.cpp:914:37: warning: unused parameter ‘sampler’ [-Wunused-parameter]
uint32_t sampler,
^
brw_vec4_visitor.cpp:1146:34: warning: unused parameter ‘stream_id’ [-Wunused-parameter]
vec4_visitor::gs_emit_vertex(int stream_id)
^

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
40dd45d0c6aa4a9d727c09225967e9c3b1f45854 30-Jun-2016 Ian Romanick <ian.d.romanick@intel.com> i965: Enable ARB_shader_atomic_counter_ops

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
3d2011cb33317b0fe9b8fe989916efc1841c6ce0 30-Jun-2016 Ian Romanick <ian.d.romanick@intel.com> i965: Refactor emission of atomic counter operations

This will make it easier to add more operations.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
69fdf13c215c2970feaca76f178a5c2c11ba8fec 03-Sep-2016 Francisco Jerez <currojerez@riseup.net> i965/vec4: Replace vec4_instruction::regs_written with ::size_written field in bytes.

The previous regs_written field can be recovered by rewriting each
rvalue reference of regs_written like 'x = i.regs_written' to 'x =
DIV_ROUND_UP(i.size_written, reg_unit)', and each lvalue reference
like 'i.regs_written = x' to 'i.size_written = x * reg_unit'.

For the same reason as in the previous patches, this doesn't attempt
to be particularly clever about simplifying the result in the interest
of keeping the rather lengthy patch as obvious as possible. I'll come
back later to clean up any ugliness introduced here.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
fba020e5af49d9d9a2c6e4d4b79115ed1e74a127 01-Sep-2016 Francisco Jerez <currojerez@riseup.net> i965/vec4: Replace dst/src_reg::reg_offset with dst/src_reg::offset expressed in bytes.

The dst/src_reg::offset field in byte units introduced in the previous
patch is a more straightforward alternative to an offset
representation split between ::reg_offset and ::subreg_offset fields.
The split representation makes it too easy to forget about one of the
offsets while dealing with the other, which has led to multiple FS
back-end bugs in the past. To make the matter worse the unit
reg_offset was expressed in was rather inconsistent, for uniforms it
would be expressed in either 4B or 16B units depending on the
back-end, and for most other things it would be expressed in 32B
units.

This encodes reg_offset as a new offset field expressed consistently
in byte units. Each rvalue reference of reg_offset in existing code
like 'x = r.reg_offset' is rewritten to 'x = r.offset / reg_unit', and
each lvalue reference like 'r.reg_offset = x' is rewritten to
'r.offset = r.offset % reg_unit + x * reg_unit'.

Because the change affects a lot of places and is rather non-trivial
to verify due to the inconsistent value of reg_unit, I've tried to
avoid making any additional changes other than applying the rewrite
rule above in order to keep the patch as simple as possible, sometimes
at the cost of introducing obvious stupidity (e.g. algebraic
expressions that could be simplified given some knowledge of the
context) -- I'll clean those up later on in a second pass.

v2: Fix division by the wrong reg_unit in the UNIFORM case of
convert_to_hw_regs(). (Iago)

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
d1b1fca0b7cccff718923f2344ea144dc3ebb869 22-Jun-2016 Timothy Arceri <timothy.arceri@collabora.com> i965/vec4: add support for packing vs/gs/tes outputs

Here we create a new output_generic_reg array with the ability to
store the dst_reg for each component of user defined varyings.
This is needed as the previous code only stored the dst_reg based
on the varying location which meant packed varyings would overwrite
each other.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
b427abba0c04214ba6184092eee73fc6377fbff9 23-Jun-2016 Timothy Arceri <timothy.arceri@collabora.com> i965/vec4: add support for packing inputs

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
96dfed49e47eac7afc100e5b8d3b316dd6652fb6 19-Jul-2016 Jason Ekstrand <jason.ekstrand@intel.com> i965: Stop muging cube array lengths by 6

From the Sky Lake PRM:

"For SURFTYPE_CUBE: For Sampling Engine Surfaces and Typed Data Port
Surfaces, the range of this field is [0,340], indicating the number of
cube array elements (equal to the number of underlying 2D array elements
divided by 6). For other surfaces, this field must be zero."

In other words, the depth field for cube maps is in number of cubes not
number of 2-D slices so we need to divide by 6. ISL will do this correctly
for us assuming that we provide it with the correct array bounds which it
expects to be in 2-D slices. It appears as if we've been doing this wrong
ever since we first added cube map arrays for Sandy Bridge and the change
to ISL made things slightly worse. While we're at it, we now need to remoe
the shader hacks we've always done since they were only needed because we
were setting the depth field six times too large.

v2: Fix the vec4 backend as well (not sure how I missed this).

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Chris Forbes <chrisforbes@google.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
3e7cebc8da5c9f16fa1b9a25ea72b8d31c86a440 22-Jun-2016 Ian Romanick <ian.d.romanick@intel.com> i965: Use LZD to implement nir_op_find_lsb on Gen < 7

v2: Rebase on changes to previous two patches.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
c2019c6c261d5c46a4e5d3edc88836bcedf75f30 22-Jun-2016 Ian Romanick <ian.d.romanick@intel.com> i965: Use LZD to implement nir_op_ifind_msb on Gen < 7

v2: Retype LZD source as UD to avoid potential problems with 0x80000000.
Suggested by Matt. Also update comment about problem values with
LZD(abs(x)). Suggested by Curro.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
de20086eed47e6bfe7c25835d72383114f99c7a9 22-Jun-2016 Ian Romanick <ian.d.romanick@intel.com> i965: Use LZD to implement nir_op_ufind_msb

This uses one less instruction.

v2: Move emit_find_msb_using_lzd out of the visitor classes. Suggested
by Curro.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
fb5dcb81cc121e4355b7eef014474a5c42a2f6db 19-May-2016 Matt Turner <mattst88@gmail.com> i965: Pass nir_src/nir_dest by reference.

Cuts 6K of .text.

text data bss dec hex filename
5772372 264648 29320 6066340 5c90a4 lib/i965_dri.so before
5766074 264648 29320 6060042 5c780a lib/i965_dri.so after

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
f687b8e1785df0825443f07778e5d0ddf6f9be09 13-May-2016 Ian Romanick <ian.d.romanick@intel.com> i965: Silence unused parameter warnings

The only place that actually used the type parameter was the GS visitor,
and it was always passed glsl_type::int. Just remove the parameter.

brw_vec4_vs_visitor.cpp:38:61: warning: unused parameter ‘type’ [-Wunused-parameter]
const glsl_type *type)
^

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
9464d8c49813aba77285e7465b96e92a91ed327c 27-Apr-2016 Jason Ekstrand <jason.ekstrand@intel.com> nir: Switch the arguments to nir_foreach_function

This matches the "foreach x in container" pattern found in many other
programming languages. Generated by the following regular expression:

s/nir_foreach_function(\([^,]*\),\s*\([^,]*\))/nir_foreach_function(\2, \1)/

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
707e72f13bb78869ee95d3286980bf1709cba6cf 27-Apr-2016 Jason Ekstrand <jason.ekstrand@intel.com> nir: Switch the arguments to nir_foreach_instr

This matches the "foreach x in container" pattern found in many other
programming languages. Generated by the following regular expression:

s/nir_foreach_instr(\([^,]*\),\s*\([^,]*\))/nir_foreach_instr(\2, \1)/

and similar expressions for nir_foreach_instr_safe etc.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
7efff10585122d484dc3adab14af9380b9b8f309 13-Apr-2016 Connor Abbott <cwabbott0@gmail.com> i965/nir: fixup for new foreach_block()

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
b6dc940ec273252678d40707d300851fa1c85ea5 13-Apr-2016 Connor Abbott <cwabbott0@gmail.com> nir: rename nir_foreach_block*() to nir_foreach_block*_call()

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
b3f43822c72301c904fd2824ae3edcd20ea93a29 19-Apr-2016 Jason Ekstrand <jason.ekstrand@intel.com> i965/vec4: Use the correct offset for the swizzle shift in push constants

This was actually caught by Ken in review the first time around but somehow
didn't get fixed before the patches were pushed. :-(

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94998
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95001
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
9f16e170fed09821bb1b18a9dbe548f3d26b7977 19-Apr-2016 Jason Ekstrand <jason.ekstrand@intel.com> i965/vec4: Use nir_intrinsic_base in the load_uniform implementation

We shouldn't be reading the const_index directly

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94998
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95001
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
0166ad6ced542cacfbbbe45e9d4b7f14af5040de 06-Apr-2016 Jason Ekstrand <jason.ekstrand@intel.com> i965/vec4: Support full std140 layout for push constants

Up until now, we have been able to assume that all push constants are
vec4-aligned because this is what the GL driver gives us. In Vulkan, we
need to be able to support full std140 because we get the layout from the
client.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
8e76f664beb845f8dca30ca5635f9369618563b0 09-Dec-2015 Jason Ekstrand <jason.ekstrand@intel.com> i965/vec4: Get rid of the uniform_size array

Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
056849772f66582fd7e8a181c3fb16955f84243b 25-Nov-2015 Jason Ekstrand <jason.ekstrand@intel.com> i965/vec4: Use MOV_INDIRECT instead of reladdr for indirect push constants

This commit moves us to an instruction based model rather than a
register-based model for indirects. This is more accurate anyway as we
have to emit instructions to resolve the reladdr. It's also a lot simpler
because it gets rid of the recursive reladdr problem by design.

One side-effect of this is that we need a whole new algorithm in
move_uniform_array_access_to_pull_constants. This new algorithm is much
more straightforward than the old one and is fairly similar to what we're
already doing in the FS backend.

Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
cb372b39ea15729caf8491f4fd9f12c37a2840df 08-Apr-2016 Jason Ekstrand <jason.ekstrand@intel.com> i965/vec4: Use UD rather than D for uniform indirects

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
765dd6534937e125b95c7998862b1a4ec76a22d8 25-Mar-2016 Jason Ekstrand <jason.ekstrand@intel.com> i965: Implement the new imod and irem opcodes

Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
bfd17c76c1267756ea16051cbe174cb23ff49f44 08-Apr-2016 Kenneth Graunke <kenneth@whitecape.org> i965: Port INTEL_PRECISE_TRIG=1 to NIR.

This makes the extra multiply visible to NIR's algebraic optimizations
(for constant reassociation) as well as constant folding. This means
that when the result of sin/cos are multiplied by an constant, we can
eliminate the extra multiply altogether, reducing the cost of the
workaround.

It also means we only have to implement it one place, rather than in
both backends.

This makes INTEL_PRECISE_TRIG=1 cost nothing on GPUTest/Volplosion,
which has a ton of sin() calls, but always multiplies them by an
immediate constant. The extra multiply gets folded away.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
80c72a8ea7b1018661da0e6509a7f88ca1f5086f 25-Mar-2016 Jason Ekstrand <jason.ekstrand@intel.com> i965/nir: Provide a default LOD for buffer textures

Our hardware requires an LOD for all texelFetch commands even if they are
on buffer textures. GLSL IR gives us an LOD of 0 in that case, but the LOD
is really rather meaningless. This commit allows other NIR producers to be
more lazy and not provide one at all.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
65fbc43d54403905e3eaea02372b5a364dc1d773 27-Jan-2016 Kenneth Graunke <kenneth@whitecape.org> i965: Add an INTEL_PRECISE_TRIG=1 option to fix SIN/COS output range.

The SIN and COS instructions on Intel hardware can produce values
slightly outside of the [-1.0, 1.0] range for a small set of values.
Obviously, this can break everyone's expectations about trig functions.

According to an internal presentation, the COS instruction can produce
a value up to 1.000027 for inputs in the range (0.08296, 0.09888). One
suggested workaround is to multiply by 0.99997, scaling down the
amplitude slightly. Apparently this also minimizes the error function,
reducing the maximum error from 0.00006 to about 0.00003.

When enabled, fixes 16 dEQP precision tests

dEQP-GLES31.functional.shaders.builtin_functions.precision.
{cos,sin}.{highp,mediump}_compute.{scalar,vec2,vec4,vec4}.

at the cost of making every sin and cos call more expensive (about
twice the number of cycles on recent hardware). Enabling this
option has been shown to reduce GPUTest Volplosion performance by
about 10%.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
14c46954c910efb1db94a068a866c7259deaa9d9 25-Mar-2016 Jason Ekstrand <jason.ekstrand@intel.com> i965: Add an implemnetation of nir_op_fquantize2f16

Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
084b24f5582567ebf5aa94b7f40ae3bdcb71316b 16-Mar-2016 Iago Toral Quiroga <itoral@igalia.com> nir: rename nir_const_value fields to include bitsize information

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
0548844e866e4fe326432116f84fdf7e885fba9f 04-Mar-2016 Alejandro Piñeiro <apinheiro@igalia.com> i965/vec4/nir: no need to use surface_access:: to call emit_untyped_atomic

Now that brw_vec4_visitor::emit_untyped_atomic was removed, there is no need
to explicitly set it.

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
d3a89a7c494d577fdf8f45c0d8735004a571e86b 04-Mar-2016 Alejandro Piñeiro <apinheiro@igalia.com> i965/vec4/nir: remove emit_untyped_surface_read and emit_untyped_atomic at brw_vec4_visitor

surface_access emit_untyped_read and emit_untyped_atomic provides the same
functionality.

v2: surface parameter of emit_untyped_atomic is a const, no need to
specify default predicate on emit_untyped_atomic, use retype
(Francisco Jerez).

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
24994ae926629ac8521df3cab4a02eb81de15907 17-Feb-2016 Kenneth Graunke <kenneth@whitecape.org> i965: Push most TES inputs in vec4 mode.

(This is commit 4a1c8a3037cd29938b2a6e2c680c341e9903cfbe for vec4 mode.)

Using the push model for inputs is much more efficient than pulling
inputs - the hardware can simply copy a large chunk into URB registers
at thread creation time, rather than having the thread send messages to
request data from the L3 cache. Unfortunately, it's possible to have
more TES inputs than fit in registers, so we have to fall back to the
pull model in some cases.

However, it turns out that most tessellation evaluation shaders are
fairly simple, and don't use many inputs. An arbitrary cut-off of
24 vec4 slots (12 registers) should suffice. (I chose this instead of
the 32 vec4 slots used in the scalar backend to avoid regressing a few
Piglit tests due to the vec4 register allocator being too stupid to
figure out what to do. We probably ought to fix that, but it's a
separate issue.)

Improves performance in GPUTest's tessmark_x64 microbenchmark by
41.5394% +/- 0.288519% (n = 115) at 1024x768 on my Clevo W740SU
(with Iris Pro 5200).

Improves performance in Synmark's Gl40TerrainFlyTess microbenchmark by
38.3576% +/- 0.759748% (n = 42).

v2: Simplify abs/negate handling, as requested by Matt.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
8750299a420af76cebd3067f6f603eacde06ae06 09-Feb-2016 Jason Ekstrand <jason.ekstrand@intel.com> nir: Remove the const_offset from nir_tex_instr

When NIR was originally drafted, there was no easy way to determine if
something was constant or not. The result was that we had lots of
special-casing for constant values such as this. Now that load_const
instructions are SSA-only, it's really easy to find constants and this
isn't really needed anymore.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Rob Clark <robclark@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
d03e5d52557ce6523eb65dfec9172d6000f5ff8d 03-Nov-2015 Jason Ekstrand <jason.ekstrand@intel.com> i965/vec4: Plumb separate surfaces and samplers through from NIR

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
5ec456375e4fdd0b6c7d797f99191044e19ead74 03-Nov-2015 Jason Ekstrand <jason.ekstrand@intel.com> nir: Separate texture from sampler in nir_tex_instr

This commit adds the capability to NIR to support separate textures and
samplers. As it currently stands, glsl_to_nir only sets the texture deref
and leaves the sampler deref alone as it did before and nir_lower_samplers
assumes this. Backends can still assume that they are combined and only
look at only at the texture index. Or, if they wish, they can assume that
they are separate because nir_lower_samplers, tgsi_to_nir, and prog_to_nir
all set both texture and sampler index whenever a sampler is required (the
two indices are the same in this case).

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
ee85014b90af1d94d637ec763a803479e9bac5dc 06-Feb-2016 Jason Ekstrand <jason.ekstrand@intel.com> nir/tex_instr: Rename sampler to texture

We're about to separate the two concepts. When we do, the sampler will
become optional. Doing a rename first makes the separation a bit more
safe because drivers that depend on GLSL or TGSI behaviour will be fine to
just use the texture index all the time.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
f4952421cddaa79498da2b7658f48dc008e489e1 25-Jan-2016 Matt Turner <mattst88@gmail.com> i965/vec4: Implement nir_op_pack_uvec2_to_uint.

And mark nir_op_pack_uvec4_to_uint unreachable, since it's only produced
by lowering pack[SU]norm4x8 which the vec4 backend does not need.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
866a6bf9f70625517d6d2c17be9523b9f035f1db 19-Jan-2016 Matt Turner <mattst88@gmail.com> i965/vec4: Spaces around operators.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
0a6811207fbe18d49c7ab95f93ed01f75ffcdda0 14-Jan-2016 Jason Ekstrand <jason@jlekstrand.net> i965/vec4: Use UW type for multiply into accumulator on GEN8+

BDW adds the following restriction: "When multiplying DW x DW, the dst
cannot be accumulator."

Cc: "11.1,11.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
b82e26a6a4d6baf121f44c61c862bfa79ba0d172 13-Jan-2016 Matt Turner <mattst88@gmail.com> nir: Lower bitfield_extract.

The OpenGL specifications for bitfieldExtract() says:

The result will be undefined if <offset> or <bits> is negative, or if
the sum of <offset> and <bits> is greater than the number of bits
used to store the operand.

Therefore passing bits=32, offset=0 is legal and defined in GLSL.

But the earlier SM5 ubfe/ibfe opcodes are specified to accept a bitfield width
ranging from 0-31. As such, Intel and AMD instructions read only the low 5 bits
of the width operand, making them not able to implement the GLSL-specified
behavior directly.

This commit adds ubfe/ibfe operations from SM5 and a lowering pass for
bitfield_extract to to handle the trivial case of <bits> = 32 as

bitfieldExtract:
bits > 31 ? value : bfe(value, offset, bits)

Fixes:
ES31-CTS.shader_bitfield_operation.bitfieldExtract.uvec3_0
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92595
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Tested-by: Marta Lofstedt <marta.lofstedt@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
b85a229e1f542426b1c8000569d89cd4768b9339 08-Jan-2016 Kenneth Graunke <kenneth@whitecape.org> glsl: Delete the ir_binop_bfm and ir_triop_bfi opcodes.

TGSI doesn't use these - it just translates ir_quadop_bitfield_insert
directly. NIR can handle ir_quadop_bitfield_insert as well.

These opcodes were only used for i965, and with Jason's recent patches,
we can do this lowering in NIR (which also gains us SPIR-V handling).
So there's not much point to retaining this GLSL IR lowering code.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
cddfc2cefa93b884c40329dcb193fe4fb22143ab 10-Dec-2015 Kristian Høgsberg Kristensen <krh@bitplanet.net> i965: Add support for gl_DrawIDARB and enable extension

We have to break open a new vec4 for gl_DrawIDARB. We've used up all
space in the vec4 we use for SGVS and gl_DrawIDARB has to come from its
own separate vertex buffer anyway. This is because we point the vb for
base vertex and base instance into the draw parameter BO for indirect
draw calls, but the draw id is generated by mesa in a different buffer.

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
17ebb55a14b5a9aa639845fbda9330ef9421834a 10-Dec-2015 Kristian Høgsberg Kristensen <krh@bitplanet.net> i965: Add support for gl_BaseVertexARB and gl_BaseInstanceARB

We already have gl_BaseVertexARB in the .x component of the SGVS vec4
and plug gl_BaseInstanceARB into the last free component (.y).

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
237f2f2d8b45d9d956102eec6f9be63193e5269b 26-Dec-2015 Jason Ekstrand <jason.ekstrand@intel.com> nir: Get rid of function overloads

When Connor originally drafted NIR, he copied the same function+overload
system that GLSL IR had with a few names changed. However, this
double-indirection is not really needed and has only served to confuse
people. Instead, let's just have functions which may not have unique names
and may or may not have an implementation. If someone wants to do overload
resolving, they can hav a hash table based function+overload system in the
overload resolving pass. There's no good reason to keep it in core NIR.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>

ir3 bits are

Reviewed-by: Rob Clark <robclark@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
24be658d13b13fdb8a1977208038b4ba43bce4ac 17-Nov-2015 Kenneth Graunke <kenneth@whitecape.org> i965: Add tessellation control shaders.

The TCS is the first tessellation shader stage, and the most
complicated. It has access to each of the control points in the input
patch, and computes a new output patch. There is one logical invocation
per output control point; all invocations run in parallel, and can
communicate by reading and writing output variables.

One of the main responsibilities of the TCS is to write the special
gl_TessLevelOuter[] and gl_TessLevelInner[] output variables which
control how much new geometry the hardware tessellation engine will
produce. Otherwise, it simply writes outputs that are passed along
to the TES.

We run in SIMD4x2 mode, handling two logical invocations per EU thread.
The hardware doesn't properly manage the dispatch mask for us; it always
initializes it to 0xFF. We wrap the whole program in an IF..ENDIF block
to handle an odd number of invocations, essentially falling back to
SIMD4x1 on the last thread.

v2: Update comments (requested by Jordan Justen).

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
bb9eb599335ec4ac3a2a579359fb239f16de17e8 26-Nov-2015 Matt Turner <mattst88@gmail.com> i965/vec4: Optimize predicate handling for any/all.

For a select whose condition is any(v), instead of emitting

cmp.nz.f0(8) null<1>D g1<0,4,1>D 0D
mov(8) g7<1>.xUD 0x00000000UD
(+f0.any4h) mov(8) g7<1>.xUD 0xffffffffUD
cmp.nz.f0(8) null<1>D g7<4,4,1>.xD 0D
(+f0) sel(8) g8<1>UD g4<4,4,1>UD g3<4,4,1>UD

we now emit

cmp.nz.f0(8) null<1>D g1<0,4,1>D 0D
(+f0.any4h) sel(8) g9<1>UD g4<4,4,1>UD g3<4,4,1>UD

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
c8a74e3a4ea6ac5dfa35adac06af14a8fa4ff773 30-Nov-2015 Matt Turner <mattst88@gmail.com> nir: Delete bany, ball, fany, fall.

As in the previous patches, these can be implemented as

any(v) -> any_nequal(v, false)
all(v) -> all_equal(v, true)

and their removal simplifies the code in the next patch.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
78b81be627734ea7fa50ea246c07b0d4a3a1638a 25-Nov-2015 Jason Ekstrand <jason.ekstrand@intel.com> nir: Get rid of *_indirect variants of input/output load/store intrinsics

There is some special-casing needed in a competent back-end. However, they
can do their special-casing easily enough based on whether or not the
offset is a constant. In the mean time, having the *_indirect variants
adds special cases a number of places where they don't need to be and, in
general, only complicates things. To complicate matters, NIR had no way to
convdert an indirect load/store to a direct one in the case that the
indirect was a constant so we would still not really get what the back-ends
wanted. The best solution seems to be to get rid of the *_indirect
variants entirely.

This commit is a bunch of different changes squashed together:

- nir: Get rid of *_indirect variants of input/output load/store intrinsics
- nir/glsl: Stop handling UBO/SSBO load/stores differently depending on indirect
- nir/lower_io: Get rid of load/store_foo_indirect
- i965/fs: Get rid of load/store_foo_indirect
- i965/vec4: Get rid of load/store_foo_indirect
- tgsi_to_nir: Get rid of load/store_foo_indirect
- ir3/nir: Use the new unified io intrinsics
- vc4: Do all uniform loads with byte offsets
- vc4/nir: Use the new unified io intrinsics
- vc4: Fix load_user_clip_plane crash
- vc4: add missing src for store outputs
- vc4: Fix state uniforms
- nir/lower_clip: Update to the new load/store intrinsics
- nir/lower_two_sided_color: Update to the new load intrinsic

NIR and i965 changes are

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>

NIR indirect declarations and vc4 changes are

Reviewed-by: Eric Anholt <eric@anholt.net>

ir3 changes are

Reviewed-by: Rob Clark <robdclark@gmail.com>

NIR changes are

Acked-by: Rob Clark <robdclark@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
18069dce4a4c3d71e6afc6b10bfa7bee0560ba9c 11-Nov-2015 Jason Ekstrand <jason.ekstrand@intel.com> i965: Make uniform offsets be in terms of bytes

This commit pushes makes uniform offsets be terms of bytes starting with
nir_lower_io. They get converted to be in terms of vec4s or floats when we
cram them in the UNIFORM register file but reladdr remains in terms of
bytes all the way down to the point where we lower it to a pull constant
load.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
05bdc21f84edc200a0b0a695b79d12f25cc00645 02-Nov-2015 Jason Ekstrand <jason.ekstrand@intel.com> i965/vec4: Use a stride of 1 and byte offsets for UBOs

Cc: "11.0" <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92909
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
e3e70698c3cfa7e9acccd6eddfb37516c45d5ac2 24-Nov-2015 Jason Ekstrand <jason.ekstrand@intel.com> i965/vec4: Use byte offsets for UBO pulls on Sandy Bridge

Previously, the VS_OPCODE_PULL_CONSTANT_LOAD opcode operated on
vec4-aligned byte offsets on Iron Lake and below and worked in terms of
vec4 offsets on Sandy Bridge. On Ivy Bridge, we add a new *LOAD_GEN7
variant which works in terms of vec4s. We're about to change the GEN7
version to work in terms of bytes, so this is a nice unification.

Cc: "11.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
b715e6d52832a0761ccec5c1252e7458499bf752 26-Nov-2015 Jason Ekstrand <jason.ekstrand@intel.com> i965/vec4: Stop pretending to support indirect output stores

Since we're using nir_lower_outputs_to_temporaries to shadow all our
outputs, it's impossible to actually get an indirect store. The code we
had to "handle" this was pretty bogus as it created a register with a
reladdr and then stuffed it in a fixed varying slot without so much as a
MOV. Not only does this not do the MOV, it also puts the indirect on the
wrong side of the transaction. Let's just delete the broken dead code.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
aa35b0c2c71f054f72df5a85779d0862fa7d6e4a 25-Nov-2015 Jason Ekstrand <jason.ekstrand@intel.com> i965/vec4: Get rid of the nir_inputs array

It's not really buying us anything at this point. It's just a way of
remapping one offset namespace onto another. We can just use the location
namespace the whole way through.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
f36993b46962eab4446bc1964eb47149751aee26 23-Nov-2015 Matt Turner <mattst88@gmail.com> i965: Clean up #includes in the compiler.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
ecac1aab538d65f0867fd93e23d0d020c1a5d0f1 23-Nov-2015 Matt Turner <mattst88@gmail.com> i965: Push down inclusion of brw_program.h.

We were including it in headers, which then caused it to be included in
tons of places it wasn't needed.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
d9b8fde963a53d4e06570d8bece97f806714507a 12-Nov-2015 Jason Ekstrand <jason.ekstrand@intel.com> i965: Use NIR for lowering texture swizzle

Now that nir_lower_tex can do texture swizzle lowering, we can use that
instead of repeating more-or-less the same code in both backends. This
both allows us to share code and means that things like the tg4
work-arounds are somewhat simpler because they don't have to take the
swizzle into account.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
f58813842bcece3498f55ec5d582466ccff92a5e 15-May-2015 Jason Ekstrand <jason.ekstrand@intel.com> nir: s/nir_type_unsigned/nir_type_uint

v2: do the same in tgsi_to_nir (Samuel)

v3: added missing cases after rebase (Iago)

v4: Add a blank space after '#' in one of the comments (Matt)

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
0684aed8abc51308945ead050d2452b522937c0a 20-Nov-2015 Matt Turner <mattst88@gmail.com> i965/vec4: Initialize nir_inputs with src_reg().

nir_locals, nir_ssa_values, and nir_system_values are all dst_reg (not
that that makes a whole lot of sense to me), and only nir_inputs is a
src_reg.

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
99840eb983f74cd447546f7205c8c9f505ef82c8 18-Nov-2015 Ian Romanick <ian.d.romanick@intel.com> i965: Enable EXT_shader_samples_identical

On the vec4 backend, textureSamplesIdentical() will always return
false. There are currently no test cases for the vec4 backend, so we
don't have much confidence in any implementation. We also don't think
anyone is likely to miss it.

v2: Handle immediate value for MCS smarter. Rebase on changes to
nir_texop_sampels_identical (missing second parameter). Suggested by
Jason.

v3: Add Neil's code to handle 16x MSAA in the FS. Also rebase on top of
f9a9ba5e. Stub out the vec4 implementation.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Neil Roberts <neil@linux.intel.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> [v2]
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> [v2]
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
84b6c64efc52948da8db89b8d92d5e744e6cfc95 18-Nov-2015 Ian Romanick <ian.d.romanick@intel.com> i965/vec4: Handle nir_tex_src_ms_index more like the scalar

v2: Rebase on top of f9a9ba5e.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
457bb290efc162ea3c7c51a820ab7cf88a4efb8d 18-Nov-2015 Ian Romanick <ian.d.romanick@intel.com> nir: Add nir_texop_samples_identical opcode

This is the NIR analog to GLSL IR ir_samples_identical.

v2: Don't add the second nir_tex_src_ms_index parameter. Suggested by
Ken and Jason.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
f9a9ba5eac2f1934bd7fecc92cd309f22411164b 02-Nov-2015 Matt Turner <mattst88@gmail.com> i965/vec4: Replace src_reg(imm) constructors with brw_imm_*().

Cuts 1.5k of .text.

Reviewed-by: Emil Velikov <emil.velikov@collabora.co.uk>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
b163aa01487ab5f9b22c48b7badc5d65999c4985 27-Oct-2015 Matt Turner <mattst88@gmail.com> i965: Rename GRF to VGRF.

The 2-bit hardware register file field is ARF, GRF, MRF, IMM.

Rename GRF to VGRF (virtual GRF) so that we can reuse the GRF name to
mean an assigned general purpose register.

Reviewed-by: Emil Velikov <emil.velikov@collabora.co.uk>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
dba309fc14d1ca99251c8f8115d2a26ac86f14f6 30-Oct-2015 Matt Turner <mattst88@gmail.com> i965: Initialize registers.

The test (file == BAD_FILE) works on registers for which the constructor
has not run because BAD_FILE is zero. The next commit will move
BAD_FILE in the enum so that it's no longer zero.

In the case of this->outputs, the constructor was being run implicitly,
and we were unnecessarily memsetting is to zero.

Reviewed-by: Emil Velikov <emil.velikov@collabora.co.uk>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
eca4c43a33c5c1bb63c8aa9d0506ed2ba3f9d8cb 30-Oct-2015 Iago Toral Quiroga <itoral@igalia.com> i965/vec4: Do not mark used surfaces in VS_OPCODE_GET_BUFFER_SIZE

Do it in the visitor, like we do for other opcodes.

v2: use const, get rid of useless surf_index temporary (Curro)

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
6105d1d0a02c7eea83b327965713be3bada306f7 30-Oct-2015 Iago Toral Quiroga <itoral@igalia.com> i965/vec4: Do not mark used direct surfaces in VS_OPCODE_PULL_CONSTANT_LOAD

Right now the generator marks direct surfaces as used but leaves marking of
indirect surfaces to the caller. Just make the callers handle marking in both
cases for consistency.

v2: Use const, do not add unnecessary temporary (Curro)

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
56774e63028b2997a7d8c0abb5009a4c79f9a453 20-Oct-2015 Alejandro Piñeiro <apinheiro@igalia.com> i965/vec4: select predicate based on writemask for sel emissions

Equivalent to commit 8ac3b525c but with sel operations. In this case
we select the PredCtrl based on the writemask.

This patch helps on cases like this:
1: cmp.l.f0.0 vgrf40.0.x:F, vgrf0.zzzz:F, vgrf7.xxxx:F
2: cmp.nz.f0.0 null:D, vgrf40.xxxx:D, 0D
3: (+f0.0) sel vgrf41.0.x:UD, vgrf6.xxxx:UD, vgrf5.xxxx:UD

In this case, cmod propagation can't optimize instruction #2, because
instructions #1 and #2 have different writemasks, and we can't update
directly instruction #2 writemask because our code thinks that sel at
instruction #3 reads all four channels of the flag, when it actually
only reads .x.

So, with this patch, the previous case becames this:
1: cmp.l.f0.0 vgrf40.0.x:F, vgrf0.zzzz:F, vgrf7.xxxx:F
2: cmp.nz.f0.0 null:D, vgrf40.xxxx:D, 0D
3: (+f0.0.x) sel vgrf41.0.x:UD, vgrf6.xxxx:UD, vgrf5.xxxx:UD

Now only the x channel of the flag is used, allowing dead code
eliminate to update the writemask at the second instruction:
1: cmp.l.f0.0 vgrf40.0.x:F, vgrf0.zzzz:F, vgrf7.xxxx:F
2: cmp.nz.f0.0 null.x:D, vgrf40.xxxx:D, 0D
3: (+f0.0.x) sel vgrf41.0.x:UD, vgrf6.xxxx:UD, vgrf5.xxxx:UD

So now cmod propagation can simplify out #2:
1: cmp.l.f0.0 vgrf40.0.x:F, attr18.wwww:F, vgrf7.xxxx:F
2: (+f0.0.x) sel vgrf41.0.x:UD, vgrf6.xxxx:UD, vgrf5.xxxx:UD

Shader-db numbers:
total instructions in shared programs: 6235835 -> 6228008 (-0.13%)
instructions in affected programs: 219850 -> 212023 (-3.56%)
total loops in shared programs: 1979 -> 1979 (0.00%)
helped: 1192
HURT: 0
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
c22d62f5991f1c26c58c9ae1891202ea437d2f7b 26-Oct-2015 Matt Turner <mattst88@gmail.com> i965/vec4: Clean up FBH code.

It did a bunch of unnecessary stuff, emitting an extra MOV included.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
d9b09f8a306dfd471e45b5294c3adcb119114387 26-Oct-2015 Matt Turner <mattst88@gmail.com> i965/vec4: Don't disable channels in any/all comparisons.

We've made a mistake in calling the Channel Enable bits "writemask",
because they do more than control which channels of the destination are
written -- they actually control which channels are enabled (surprise!
surprise!)

So, if we emit

cmp.z.f0(8) null.xy<1>D g10<4,4,1>.xyzzD g2<0,4,1>.xyzzD
mov(8) g12<1>.xUD 0x00000000UD
(+f0.all4h) mov(8) g12<1>.xUD 0xffffffffUD

where the CMP instruction has only .xy channel enables, it won't write
the .zw channels of the flag register, which are of course read by the
+f0.all4 predicate.

We need to always emit CMP instructions whose flag result might be read
by such a predicate with all channels enabled.

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
4379ca22f18f5731248ee794ab651db721ba38b2 07-Oct-2015 Emil Velikov <emil.velikov@collabora.com> i965: Implement nir_intrinsic_shader_clock

v2:
- Add a few const qualifiers for good measure.
- Drop unneeded retype()s (Matt)
- Convert timestamp to SIMD8/16, as fs_visitor::get_timestamp() returns
SIMD4 (Connor)

v3:
- Remove unneeded temporary + MOV (Connor)

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
8ac3b525c77cb5aae9e61bd984b78f6cbbffdc1c 09-Oct-2015 Alejandro Piñeiro <apinheiro@igalia.com> i965/vec4: nir_emit_if doesn't need to predicate based on all the channels

v2: changed comment, as suggested by Matt Turner

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
f534f331ca354bcb138e2b8f6d6d80147ee4a186 15-Oct-2015 Iago Toral Quiroga <itoral@igalia.com> i965/vec4: Use the right number of UBOs

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
d3f45888045c84b2bc382a34d169a0ede4774a24 09-Oct-2015 Iago Toral Quiroga <itoral@igalia.com> i965: Adapt SSBOs to work with their own separate index space

Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
baee16bf02eedc6a32381d79da6c7ac942f782ae 28-Sep-2015 Iago Toral Quiroga <itoral@igalia.com> nir: split SSBO min/max atomic instrinsics into signed/unsigned versions

NIR is typeless so this is the only way to keep track of the
type to select the proper atomic to use.

v2:
- Use imin,imax,umin,umax for the intrinsic names (Connor Abbott)
- Change message for unreachable paths (Michael Schellenberger)

Tested-by: Markus Wick <markus@selfnet.de>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
4642d53a03122e6d3214ed12cb327898917eb84e 09-Oct-2015 Matt Turner <mattst88@gmail.com> i965/vec4: Implement b2f and b2i using negation.

Curro added this in commit 3ee2daf23d (before the vec4/NIR backend was
added) but it was missed in the new NIR backend. Add it there as well.

instructions in affected programs: 1857 -> 1810 (-2.53%)
helped: 15

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
1e3c1b107e075b210998998423901092b8fcd79b 03-Oct-2015 Jason Ekstrand <jason.ekstrand@intel.com> i965: Use nir_foreach_variable

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
bf7b6fd3fd6d98305d64ee6224ca9f9e7ba48444 02-Oct-2015 Jason Ekstrand <jason.ekstrand@intel.com> i965/shader: Get rid of the shader, prog, and shader_prog fields

Unfortunately, we can't get rid of them entirely. The FS backend still
needs gl_program for handling TEXTURE_RECTANGLE. The GS vec4 backend still
needs gl_shader_program for handling transfom feedback. However, the VS
needs neither and we can substantially reduce the amount they are used.
One day we will be free from their tyranny.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
ca6a436f12cb55e9415049a217229c99b02ad3b8 02-Oct-2015 Jason Ekstrand <jason.ekstrand@intel.com> i965/vec4: Use nir info instead of pulling things out of [shader_]prog

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
7926c3ea7d8f455cbee390d20c78dadf5432b9bc 01-Oct-2015 Jason Ekstrand <jason.ekstrand@intel.com> i965/backend_shader: Add a field to store the NIR shader

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
30c63571133ed50907ec14172c2f3ef82ee8a34e 01-Oct-2015 Jason Ekstrand <jason.ekstrand@intel.com> i965: Move prog_data uniform setup to the codegen level

As of now, uniform setup is more-or-less unified between vec4 and fs and no
longer requires the fs_visitor. This makes uniform setup more of a
language/API thing than a backend compiler thing. This commit moves
setting up the stage_prog_data.params arrays to the same place as we set up
the rest of stage_prog_data.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
cdf314cb21377ee7caca05bd1abab6a2b921d213 01-Oct-2015 Jason Ekstrand <jason.ekstrand@intel.com> i965/nir: Simplify uniform setup

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
7fee8b6f055831bc070bb36d02a8b1c4d601652a 02-Oct-2015 Jason Ekstrand <jason.ekstrand@intel.com> i965/nir: Pull GLSL uniform handling into a common function

The way we deal with GLSL uniforms and builtins is basically the same in
both the vec4 and the fs backend. This commit takes the best parts of both
implementations and pulls the common code into a shared helper function.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
03c4171b577b06b1d8dde50b6eb9507d8ef4c1ce 29-Sep-2015 Jason Ekstrand <jason.ekstrand@intel.com> i965/nir: Pull common ARB program uniform handling into a common function

The way we deal with ARB program uniforms is basically the same in both the
vec4 and the fs backend. This commit takes the best parts of both
implementations and pulls the common code into a shared helper function.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
390b48fc4a9836b563560581fbfb4833546de0c8 30-Sep-2015 Jason Ekstrand <jason.ekstrand@intel.com> i965/vec4: Use the uniform count from nir_assign_var_locations

Previously, we were counting up uniforms as we set them up. However, this
count should be exactly identical to shader->num_uniforms provided by
nir_assign_var_locations. (If it's not, we're in trouble anyway because
that means that locations don't match up.) This matches what the fs
backend is already doing.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
5609e0d7b41e861a3359991e8d0f2053b255fc31 30-Sep-2015 Jason Ekstrand <jason.ekstrand@intel.com> i965/vec4: Get rid of the uniform_vector_size array

The uniform_vector_size array was only ever used by pack_uniform_registers
which no longer needs it.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
023165a734b3bae52a449ad01bc1ea5ba4384ec1 15-Sep-2015 Samuel Iglesias Gonsalvez <siglesias@igalia.com> i965/vec4/nir: add nir_intrinsic_memory_barrier support

Fix OpenGL ES 3.1 conformance tests: advanced-readWrite-case1-vsfs
and advanced-matrix-vsfs.

v2:
- Fix SHADER_OPCODE_MEMORY_FENCE emission and the allocation of 'tmp'
(Francisco).

Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com>
Tested-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
6668eb5a451c43ac78a784711cf239fdf7ca75ef 11-Sep-2015 Samuel Iglesias Gonsalvez <siglesias@igalia.com> mesa: rename gl_shader_program's NumUniformBlocks to NumBufferInterfaceBlocks

Because it counts shader storage blocks too.

v2:
- Use NumBufferInterfaceBlocks instead (Jordan).

Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
5ef169034c77ede86546d8dc42f7f22abcd6faa0 07-Aug-2015 Iago Toral Quiroga <itoral@igalia.com> i965/nir/vec4: Implement nir_intrinsic_ssbo_atomic_*

Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
e3f9c7829c609e8a32da9f36c9829843f2204a37 10-Sep-2015 Iago Toral Quiroga <itoral@igalia.com> i965/nir/vec4: Implement nir_intrinsic_load_ssbo

Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
922b3d1bb16b4b6b292cb159e5fe3d0615ca725c 10-Sep-2015 Iago Toral Quiroga <itoral@igalia.com> i965/nir/vec4: Implement nir_intrinsic_store_ssbo

Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
65d7f5fe9f4284f7de867b4c412f086c6dcca176 26-Aug-2015 Samuel Iglesias Gonsalvez <siglesias@igalia.com> i965/vec4/nir: implement nir_intrinsic_get_buffer_size

Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
2e5423ad6345e027bb40c75ffc0e9e64843b9c05 23-Sep-2015 Jason Ekstrand <jason.ekstrand@intel.com> i965/vec4: Add support for fdph_replicated

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
6ba291db4ba4f03ac94560eaae861bc162ac838e 18-Sep-2015 Eduardo Lima Mitev <elima@igalia.com> i965/vec4/nir: Remove all "this->" snippets

For consistency, either we have all class members dereferenced, or none.
In this case, very few are so lets get rid of them all.

Reviewed-by: Timothy Arceri <t_arceri@yahoo.com.au>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
a88ce0c1c4c1f77209b71d5a6858f952642f385a 10-Sep-2015 Jason Ekstrand <jason.ekstrand@intel.com> i965/vec4: Use the replicated fdot instruction in NIR

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
c951bb83056724df02ba7e6fe2dfa720c0f45c1f 09-Sep-2015 Jason Ekstrand <jason.ekstrand@intel.com> i965/vec4_nir: Use partial SSA form rather than full non-SSA

We made this switch in the FS backend some time ago and it seems to make a
number of things a bit easier. In particular, supporting SSA values takes
very little work in the backend and allows us to take advantage of the
majority of the SSA information even after we've gotten rid of Phi nodes.

Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
b8d2263c83d29f4626ac0fe0316978aa6262aefb 14-Sep-2015 Antia Puentes <apuentes@igalia.com> i965/vec4_nir: Load constants as integers

Loads constants using integer as their register type, like it is
done in FS backend.

No shader-db changes in HSW.

Cc: "10.6 11.0" <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91716
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
0b91bcea98c0fe201bba89abe1ca3aee4d04c56c 12-Aug-2015 Ilia Mirkin <imirkin@alum.mit.edu> i965: add support for textureSamples function

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
[v2: kayden-supplied code in fs_nir replacing need for logical opcode]
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
0cc331dddd1a99c7af3619c92c48b5c32e17f6b3 04-Aug-2015 Kenneth Graunke <kenneth@whitecape.org> i965/nir: Use nir_system_value_from_intrinsic to reduce duplication.

This code is all pretty much identical. We just needed the translation
from one enum value to the other.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
27e83b62bb52de7a681ed82679a707555023f43d 28-Aug-2015 Kenneth Graunke <kenneth@whitecape.org> i965: Store a key_tex pointer in vec4_visitor.

I'm about to remove the base class for VS/GS/HS/DS program keys, at
which point we won't be able to use key->tex anymore. Instead, we'll
need to store a direct pointer (like we do in the FS backend).

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
cfa056c6a5eadf87f92a71346c0dddd2a080e302 18-Aug-2015 Jason Ekstrand <jason.ekstrand@intel.com> i965/vec4_nir: Get rid of the uniform_driver_location tracking

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
0db8e87b4a16b123f7c0b44d54f23b535a136ee6 18-Aug-2015 Jason Ekstrand <jason.ekstrand@intel.com> nir/intrinsics: Add a second const index to load_uniform

In the i965 backend, we want to be able to "pull apart" the uniforms and
push some of them into the shader through a different path. In order to do
this effectively, we need to know which variable is actually being referred
to by a given uniform load. Previously, it was completely flattened by
nir_lower_io which made things difficult. This adds more information to
the intrinsic to make this easier for us.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
640c472fd075814972b1276c5b0ed3a769aacda5 12-Aug-2015 Kenneth Graunke <kenneth@whitecape.org> i965: Move type_size() methods out of visitor classes.

I want to use C function pointers to these, and they don't use anything
in the visitor classes anyway.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
2450cbfcbc3671056afad9e858acadbb6edea068 12-Aug-2015 Matt Turner <mattst88@gmail.com> i965/vec4/nir: Emit single MOV to generate a scalar constant.

If an immediate is written to multiple channels, we can load it in a
single writemasked MOV.

total instructions in shared programs: 6285144 -> 6261991 (-0.37%)
instructions in affected programs: 718991 -> 695838 (-3.22%)
helped: 5762

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
c1d9b3ae0bb0f1222719d7737dd9986e437bf5b9 04-Aug-2015 Jason Ekstrand <jason.ekstrand@intel.com> i965/vec4_nir: Properly handle integer multiplies on BDW+

Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
1d658cf8795383dbef127e46f3740b516bfe21b9 03-Aug-2015 Jason Ekstrand <jason.ekstrand@intel.com> i965/vec4_nir: Do boolean source modifier resolves on BDW+

On BDW+, the negation source modifier on NOT, AND, OR, and XOR, is actually
a boolean negate and not an integer negate. However, NIR's soruce
modifiers are the integer version. We have to resolve it with a MOV prior
to emitting the actual instruction. This is basically the same thing we do
in the FS backend.

Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
5e1c1c2fcbdfb96a973ae3fd196e341ab2d41833 03-Aug-2015 Jason Ekstrand <jason.ekstrand@intel.com> i965/vec4-nir: Handle boolean resolvese on ILK-

The analysis code was already there and running, we just weren't doing
anything with the result of it yet.

Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
82f2e706bfd646b91bc0b8beecdff4e54b1f7b04 29-Jun-2015 Antia Puentes <apuentes@igalia.com> i965/nir/vec4: Handle uniforms on vertex programs

The implementation takes into account that on ARB_vertex_program
only a single nir variable is generated to support all the uniform data.

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
19cf934f7f18237e1a212b0a019026d5d36c6fac 06-Jul-2015 Alejandro Piñeiro <apinheiro@igalia.com> i965/nir/vec4: Add implementation of nir_emit_texture()

Uses the nir structure to get all the info needed (sources,
dest reg, etc), and then it uses the common
vec4_visitor::emit_texture to emit the final code.

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
583c1c61703826002ba0f202e8ef7bc2c822ef1d 17-Jun-2015 Eduardo Lima Mitev <elima@igalia.com> i965/nir/vec4: Implement nir_emit_jump

This implementation is taken as-is from fs_nir.

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
9b4a6fa4c09d36e0e5c00309e6ea37300ea38f78 17-Jun-2015 Antia Puentes <apuentes@igalia.com> i965/nir/vec4: Mark as unreachable ops that should be already lowered

NIR ALU operations:
* nir_op_fabs
* nir_op_iabs
* nir_op_fneg
* nir_op_ineg
* nir_op_fsat
should be lowered by lower_source mods

* nir_op_fdiv
should be lowered in the compiler by DIV_TO_MUL_RCP.

* nir_op_fmod
should be lowered in the compiler by MOD_TO_FLOOR.

* nir_op_fsub
* nir_op_isub
should be handled by ir_sub_to_add_neg.

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
16072834babc487f78472f7e7b59d35249a3aac8 17-Jun-2015 Antia Puentes <apuentes@igalia.com> i965/nir/vec4: Implement vector "any" operation

Adds NIR ALU operations:
* nir_op_bany2
* nir_op_bany3
* nir_op_bany4

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
fa4e3c3c9f6f3a72a032499fccaa6e222d6a7fa4 17-Jun-2015 Antia Puentes <apuentes@igalia.com> i965/nir/vec4: Implement the dot product operation

Adds NIR ALU operations:
* nir_op_fdot2
* nir_op_fdot3
* nir_op_fdot4

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
96106e2a9f214d98fc2e99c65398f95d41a3b879 17-Jun-2015 Antia Puentes <apuentes@igalia.com> i965/nir/vec4: Implement conditional select

Adds NIR ALU operations:
* nir_op_bcsel

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
b38fcd0aea8d17919ecd9cc7afc518cfb2c01c27 17-Jun-2015 Antia Puentes <apuentes@igalia.com> i965/nir/vec4: Implement linear interpolation

Adds NIR ALU operation:
* nir_op_flrp

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
b64bd1fdc37eed1bb62d2b32ad22f0f77501f7f2 17-Jun-2015 Antia Puentes <apuentes@igalia.com> i965/nir/vec4: Implement floating-point fused multiply-add

Adds NIR ALU operation:
* nir_op_ffma

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
d12e165dbb403c3cf86ab7f1b8f28ab6188b479f 17-Jun-2015 Antia Puentes <apuentes@igalia.com> i965/nir/vec4: Implement "shift" operations

Adds NIR ALU operations:
* nir_op_ishl
* nir_op_ishr
* nir_op_ushr

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
798cb33a256f703ecaf56d4443e12055484d4bcc 17-Jun-2015 Antia Puentes <apuentes@igalia.com> i965/nir/vec4: Implement the "sign" operation

Follows the vec4_visitor IR implementation but
sets the saturate value in addition.

Adds NIR ALU operations:
* nir_op_fsign
* nir_op_isign

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
8e1e6facbf828258a9a8ca09da846d1baa21d984 17-Jun-2015 Antia Puentes <apuentes@igalia.com> i965/nir/vec4: Implement bit operations

Same implementation than the IR case.

Adds NIR ALU operations:
* nir_op_bitfield_reverse
* nir_op_bit_count
* nir_op_ufind_msb
* nir_op_ifind_msb
* nir_op_find_lsb
* nir_op_ubitfield_extract
* nir_op_ibitfield_extract
* nir_op_bfm
* nir_op_bfi
* nir_op_bitfield_insert

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
0e874985ce50d902535e1eb766bd252c921b5d8f 17-Jun-2015 Antia Puentes <apuentes@igalia.com> i965/nir/vec4: Implement pack/unpack operations

* Lowered floating-point pack and unpack operations are not valid in VS.

* Pack and unpack 2x16 operations should be handled by lower_packing_builtins.

* Adds NIR ALU operations:
* nir_op_pack_half_2x16
* nir_op_unpack_half_2x16
* nir_op_unpack_unorm_4x8
* nir_op_unpack_snorm_4x8
* nir_op_pack_unorm_4x8
* nir_op_pack_snorm_4x8

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
3f10c2f3d73ae41ff83afcdbe225121b8336f499 17-Jun-2015 Antia Puentes <apuentes@igalia.com> i965/nir/vec4: "noise" ops should already be lowered

Marked them as unreachable.

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
fa4731f4a53aa21e53a62f42f3afdc19b0ce4c8e 17-Jun-2015 Antia Puentes <apuentes@igalia.com> i965/nir/vec4: Implement "bool<->int,float" format conversion

Used the same implementation than the vec4_visitor NIR.

Adds NIR ALU operations:
* nir_op_b2i
* nir_op_b2f
* nir_op_f2b
* nir_op_i2b

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
f14199a8fb802f6672d559fa958a5ee84e3e13f1 17-Jun-2015 Antia Puentes <apuentes@igalia.com> i965/nir/vec4: Implement logical operators

Adds NIR ALU operations:
* nir_op_inot
* nir_op_ixor
* nir_op_ior
* nir_op_iand

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
51aeafaf96b3b349e007ad05738bc1e05663fedf 17-Jun-2015 Antia Puentes <apuentes@igalia.com> i965/nir/vec4: Implement non-equality ops on vectors

Adds NIR ALU operations:
* nir_op_bany_fnequal2
* nir_op_bany_inequal2
* nir_op_bany_fnequal3
* nir_op_bany_inequal3
* nir_op_bany_fnequal4
* nir_op_bany_inequal4

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
8be4b876c90192c3a5e6fcc9b526f43a3f7bfc11 17-Jun-2015 Antia Puentes <apuentes@igalia.com> i965/nir/vec4: Implement equality ops on vectors

Adds NIR ALU operations:
* nir_op_ball_fequal2
* nir_op_ball_iequal2
* nir_op_ball_fequal3
* nir_op_ball_iequal3
* nir_op_ball_fequal4
* nir_op_ball_iequal4

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
84d4a9dc2ca3d98f19cc9125a5ff1ac1225f360d 17-Jun-2015 Antia Puentes <apuentes@igalia.com> i965/nir/vec4: Implement non-vector comparison ops

Adds NIR ALU operations:
* nir_op_flt
* nir_op_ilt
* nir_op_ult
* nir_op_fge
* nir_op_ige
* nir_op_uge
* nir_op_feq
* nir_op_ieq
* nir_op_fne
* nir_op_ine

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
b9c41affcf67f30d7f6c74c17ea34bc42756d56d 17-Apr-2015 Antia Puentes <apuentes@igalia.com> i965/nir: Add utility method for comparisons

This method returns the brw_conditional_mod value used when emitting
comparative ALU operations.

It could be moved to brw_nir in the future to reuse it in fs_nir backend.

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
dae6025e8efdfb759458a3243c8cd1588f485135 14-Apr-2015 Antia Puentes <apuentes@igalia.com> i965/nir/vec4: Derivatives are not allowed in VS

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
5e6f1c38a591fa39cff1c32a2cfdda927145756a 17-Jun-2015 Antia Puentes <apuentes@igalia.com> i965/nir/vec4: Implement min/max operations

Adds NIR ALU operations:
* nir_op_fmin
* nir_op_imin
* nir_op_umin
* nir_op_fmax
* nir_op_imax
* nir_op_umax

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
7553a51a68c0b2030265fe741f9c511b65047914 17-Jun-2015 Antia Puentes <apuentes@igalia.com> i965/nir/vec4: Implement various rounding functions

Adds NIR ALU operations:
* nir_op_ftrunc
* nir_op_fceil
* nir_op_ffloor
* nir_op_ffrac
* nir_op_fround_even

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
0ce159ec7fbcdf00c488b77f63e565e89ef6cab5 17-Jun-2015 Antia Puentes <apuentes@igalia.com> i965/nir/vec4: Implement carry/borrow for addition/subtraction

Adds NIR ALU operations:
* nir_op_uadd_carry
* nir_op_usub_borrow

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
62cef7b0723ad6ca49ed06a6899a5852e41359e8 17-Jun-2015 Antia Puentes <apuentes@igalia.com> i965/nir/vec4: Implement more math operations

Adds NIR ALU operations:
* nir_op_frcp
* nir_op_fexp2
* nir_op_flog2
* nir_op_fexp
* nir_op_flog
* nir_op_fsin
* nir_op_fcos
* nir_op_idiv
* nir_op_udiv
* nir_op_umod
* nir_op_ldexp
* nir_op_fsqrt
* nir_op_frsq
* nir_op_fpow

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
9acebf146184c35e6897b91fff414c5295d47996 16-Jun-2015 Antia Puentes <apuentes@igalia.com> i965/nir/vec4: Implement multiplication

Implementation based on the vec4_visitor IR implementation
for the operations ir_binop_mul and ir_binop_imul_high.

Adds NIR ALU operations:
* nir_op_fmul
* nir_op_imul
* nir_op_imul_high
* nir_op_umul_high

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
0675842b56a956befbac4a3b912823e73a95a500 16-Jun-2015 Antia Puentes <apuentes@igalia.com> i965/nir/vec4: Implement the addition operation

Adds NIR ALU operations:
* nir_op_fadd
* nir_op_iadd

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
4f39b547da4f9949d1b1f9f0df07d08951f0358d 16-Jun-2015 Antia Puentes <apuentes@igalia.com> i965/nir/vec4: Implement int<->float format conversion ops

Adds NIR ALU operations:
* nir_op_f2i
* nir_op_f2u
* nir_op_i2f
* nir_op_u2f

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
e4f02f47e70d384531ac68e6d33a62fdcdbd1f28 16-Jun-2015 Antia Puentes <apuentes@igalia.com> i965/nir/vec4: Lower "vecN" instructions and mark them unreachable

This enables NIR pass "lower_vec_to_movs" on shaders that work on vec4.

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
79154d99d6e760b1daf327b4594dded18f1d4191 16-Jun-2015 Antia Puentes <apuentes@igalia.com> i965/nir/vec4: Implement single-element "mov" operations

Adds NIR ALU operations:
* nir_op_imov
* nir_op_fmov

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
ef1b30ae637e613b384541324c199d2dbe6b44bd 16-Jun-2015 Antia Puentes <apuentes@igalia.com> i965/nir/vec4: Prepare source and destination registers for ALU operations

This patch resolves and initializes the destination and the source
registers that are common to most ALU operations.

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
168bbfa6ff22a586ad6307c187cfa3b8fff5f227 16-Jun-2015 Antia Puentes <apuentes@igalia.com> i965/nir/vec4: Implement loading values from an UBO

Based on the vec4_visitor IR implementation for the ir_binop_load_ubo
operation. Notice that unlike the vec4_visitor IR, adding the !=0
comparison for UBO bools is not needed here because that comparison is
already added by the nir_visitor when processing the ir_binop_load_ubo
(in UBOs "true" is any value different from zero, but for us is ~0).

Adds NIR instrinsics:

* nir_intrinsic_load_ubo_indirect
* nir_intrinsic_load_ubo

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
98d07022f5312967bdfd54069869c8d6c65117a7 16-Jun-2015 Alejandro Piñeiro <apinheiro@igalia.com> i965/nir/vec4: Implement atomic counter intrinsics (read, inc and dec)

The implementation is based on its fs_nir counterpart.

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
e6cafb5dfdef8d8d25ee1e3375304cf35897d1f7 16-Jun-2015 Iago Toral Quiroga <itoral@igalia.com> i965/nir/vec4: Implement load_uniform intrinsic

For the indirect case we need to take the index delivered by
NIR and compute the parent uniform that we are accessing (the one
that we uploaded to a surface) and the constant offset into that
surface.

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
e76e8caecd30799500357a45468329f033a93932 16-Jun-2015 Alejandro Piñeiro <apinheiro@igalia.com> i965/nir/vec4: Implement intrinsics that load system values

These include:

nir_intrinsic_load_vertex_id_zero_base
nir_intrinsic_load_base_vertex
nir_intrinsic_load_instance_id

The source register is fetched from the nir_system_values map initialized
during nir_setup_system_values stage.

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
662c4c99065381b8e265310d176cfdef6698ca57 16-Jun-2015 Eduardo Lima Mitev <elima@igalia.com> i965/nir/vec4: Implement store_output intrinsic

This implementation is based on the current URB setup in vec4_visitor, which
requires the output register to be stored in the output_reg array at variable's
original shader location index. But since nir_lower_io() pass uses the value
in var->data.driver_location, we need to put there var->data.location instead,
prior to calling nir_lower_io(), so that we end up with the correct index
in const_index[0].

The driver_location is not used at all, so this patch also disables the
nir_assign_var_locations pass on non-scalar shaders.

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
167cb9663adc8c7c61807e503f66e85f955e7d5f 16-Jun-2015 Eduardo Lima Mitev <elima@igalia.com> i965/nir/vec4: Implement load_input intrinsic

The source register is fetched from the nir_inputs map built during
nir_setup_inputs stage.

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
afe085a0ca01f659c69456018e5f5076c9dde47d 16-Jun-2015 Eduardo Lima Mitev <elima@igalia.com> i965/nir/vec4: Implement loop statements (nir_cf_node_loop)

This is taken as-is from fs_nir.

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
5c0436dbf87fef76ba67456f215d9285c38f1816 16-Jun-2015 Iago Toral Quiroga <itoral@igalia.com> i965/nir/vec4: Implement conditional statements (nir_cf_node_if)

The same we do in the FS NIR backend, only that here we need to consider
the number of components in the condition and adjust the swizzle
accordingly.

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
f3187ea31ede6bc181ee561573d127aa2e485657 16-Jun-2015 Eduardo Lima Mitev <elima@igalia.com> i965/nir/vec4: Add get_nir_dst() and get_nir_src() methods

These methods are essential for the implementation of the NIR->vec4 pass. They
work similar to their fs_nir counter-parts.

When processing instructions, these methods are invoked to resolve the
brw registers (source or destination) corresponding to the NIR sources
or destination. It uses the map of NIR register index to brw register for
all registers locally allocated in a block.

Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
f7152525374015594e037fa11bb64e1c7174829b 01-Jul-2015 Eduardo Lima Mitev <elima@igalia.com> i965/nir/vec4: Implement load_const intrinsic

Similar to fs_nir backend, a nir_local_values map will be filled with
newly allocated registers as the load_const instrinsic instructions are
processed. Later, get_nir_src() will fetch the registers from this map
for sources that are ssa.

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
59006d3ad3ed5d29e84afa5931f425344e2ef658 22-Jul-2015 Eduardo Lima Mitev <elima@igalia.com> i965/nir/vec4: Add shader function implementation

It basically allocates registers local to a function in a nir_locals map,
then emits all its control-flow blocks.

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
4023b55fdd7005a8a100637c229a1c40648cdd2b 16-Jun-2015 Alejandro Piñeiro <apinheiro@igalia.com> i965/nir/vec4: Add setup for system values

Similar to other variable setups, system values will initialize the
corresponding register inside a 'nir_system_values' map, which will then
be queried later when processing the different system value intrinsics
for the appropriate register.

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
195156e571e851273c135847f91ed73b3bfc1914 16-Jun-2015 Iago Toral Quiroga <itoral@igalia.com> i965/nir/vec4: Add setup of uniform variables

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
b929acb6a8659fdc06623b766bdf59904d8a3558 16-Jun-2015 Eduardo Lima Mitev <elima@igalia.com> i965/nir/vec4: Add setup of input variables in NIR->vec4 pass

This implementation sets up a map of input variable offsets to source registers
that are already initialized with the corresponding register offset.

This map will then be queried when processing load_input intrinsic operations,
to obtain the correct register source from which the input data will be loaded.

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
abf4fa3c03ebe5716c90c8a310945c3621cf598f 16-Jun-2015 Eduardo Lima Mitev <elima@igalia.com> i965/nir/vec4: Add implementation placeholders for a new NIR->vec4 pass

This patch will add a brw_vec4_nir.cpp file filled with entry point methods to
the main functionality, following a structure similar to brw_fs_nir.cpp.

Subsequent patches in this series will be adding the implementations for these
methods, incrementally.

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp