History log of /external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
Revision Date Author Comments (<<< Hide modified files) (Show modified files >>>)
5597b2b243d96e50b4c151db8200487eae0c4997 15-Jan-2017 Kenneth Graunke <kenneth@whitecape.org> i965: Use align1 mode for barrier messages.

In commit 7428e6f86ab5 we switched the barrier SEND message's
destination type to UW to avoid problems in SIMD16 compute shaders.

Tessellation control shaders also use barriers, and in vec4 mode, we
were emitting them in align16 mode. The simulator warns that only UD,
D, F, and DF are valid destination types - UW is technically illegal.

So, switch to align1 mode. Either mode should work fine.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
9b22a0d295316b7547667ebbfe1e1b6182439186 09-Dec-2016 Francisco Jerez <currojerez@riseup.net> i965/fs: Expose arbitrary pull constant load sizes to the IR.

Change the FS generator to ask the dataport for enough owords worth of
constants to fill the execution size of the instruction -- Which means
that the visitor now needs to set the execution size correctly for
uniform pull constant load instructions, which we were kind of
neglecting until now.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
7a6aadb76ff3f6ef73216b53b0dc5edda5bae978 09-Dec-2016 Francisco Jerez <currojerez@riseup.net> i965: Factor out oword block read and write message control calculation.

We'll need roughly the same logic in other places and it would be
annoying to duplicate it. Instead factor it out into a function-like
macro that takes the number of dwords per block (which will prove more
convenient than taking the same value in owords or some other unit).

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
ad38ba113491869ab0dffed937f7b3dd50e8a735 26-Oct-2016 Francisco Jerez <currojerez@riseup.net> i965/fs: Switch to the constant cache for uniform pull constants.

This reverts to using the oword block read messages for uniform pull
constant loads, as used to be the case until
4c1fdae0a01b3f92ec03b61aac1d3df5. There are two important differences
though: Now the L3 cacheability bits are set up correctly for UBOs
(since 11f5d8a5d4fbb861ec161f68593e429cbd65d1cd), and we target the
constant cache instead of the data cache. The latter used to get no
L3 way allocation on boot on all platforms that existed at the time,
so oword read messages wouldn't get cached on L3 regardless of the
MOCS bits, what probably explains the apparent slowness of oword
fetches.

Constant cache loads seem to perform better than SIMD4x2 sampler loads
in a number of cases, they alleviate some of the cache thrashing
caused by the competition with textures for the L1/L2 sampler caches,
and they allow fetching up to 128B worth of constants with a single
oword fetch message.

Note that IVB devices suffer from a hardware bug that leads to
serialization of L3 read requests overlapping the same cacheline as
result of a (on IVB buggy) mechanism of the L3 to preserve coherency.
Since read requests for matching cachelines from any L3 client are not
pipelined, throughput may decrease in cases where there are no
non-overlapping requests left in the queue that can be processed
between them.

This situation should be relatively uncommon as long as we make sure
that we don't use the 1/2 oword messages in cases where the shader
intends to read from any other location of the same cacheline at some
other point. This is generally a good idea anyway on all generations
because using the 1 and 2 oword messages is expected to waste
bandwidth since the minimum L3 request size for the DC is exactly 4
owords (i.e. one cacheline). A future commit will have this effect.
I haven't been able to find any real-world example where this would
still result in a regression on IVB, but if someone happens to find
one it shouldn't be too difficult to add an IVB-specific check to have
it fall back to the sampler cache for pull constant loads.

Note that on SKL+ this change has the additional benefit of reducing
the register footprint of pull constant loads. The following table
summarizes the effect of the whole series on several shader-db stats:

Total instructions Total cycles
BWR: 4571248 -> 4568342 (-0.06%) 123375740 -> 123373296 (-0.00%)
ELK: 3989020 -> 3985402 (-0.09%) 98757068 -> 98754058 (-0.00%)
ILK: 6383591 -> 6376787 (-0.11%) 143649910 -> 143648914 (-0.00%)
SNB: 7528395 -> 7501446 (-0.36%) 103503796 -> 102460370 (-1.01%)
IVB: 6949221 -> 6943317 (-0.08%) 60592262 -> 60584422 (-0.01%)
HSW: 6409753 -> 6403702 (-0.09%) 60609070 -> 60604414 (-0.01%)
BDW: 8043467 -> 7976364 (-0.83%) 68427730 -> 68483042 (0.08%)
CHV: 8045019 -> 7977916 (-0.83%) 68297426 -> 68352756 (0.08%)
SKL: 8204037 -> 7939086 (-3.23%) 66583900 -> 65624378 (-1.44%)

Lost->Gained Total spills Total fills
BWR: 5 -> 5 1488 -> 1488 (0.00%) 1957 -> 1957 (0.00%)
ELK: 5 -> 5 1489 -> 1489 (0.00%) 1958 -> 1958 (0.00%)
ILK: 1 -> 4 1449 -> 1449 (0.00%) 1921 -> 1921 (0.00%)
SNB: 0 -> 0 549 -> 549 (0.00%) 52 -> 52 (0.00%)
IVB: 13 -> 3 1271 -> 1271 (0.00%) 1162 -> 1162 (0.00%)
HSW: 11 -> 0 1271 -> 1271 (0.00%) 1162 -> 1162 (0.00%)
BDW: 12 -> 0 1340 -> 1340 (0.00%) 1452 -> 1452 (0.00%)
CHV: 12 -> 0 1340 -> 1340 (0.00%) 1452 -> 1452 (0.00%)
SKL: 0 -> 120 1269 -> 375 (-70.45%) 1563 -> 690 (-55.85%)

v3: Non-trivial rebase.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
3c78d31374422b028b19afa5799689c404a5b73e 23-Apr-2015 Francisco Jerez <currojerez@riseup.net> i965: Let the caller of brw_set_dp_write/read_message control the target cache.

brw_set_dp_read_message already had a target_cache argument, but its
interpretation was rather convoluted (on Gen6 the render cache was
used if the caller asked for it, otherwise it was ignored using the
sampler cache instead), and the constant cache wasn't representable at
all. brw_set_dp_write_message used the data cache on Gen7+ except for
RENDER_TARGET_WRITE messages, in which case it would use the render
cache. On Gen6 the render cache was always used.

Instead of the above, provide the shared unit SFID that the caller
expects will be used. Makes no functional changes.

v3: Non-trivial rebase.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
43cdbb3e6ab7224278a2c68586b8d1a9cb7429a9 04-Dec-2016 Matt Turner <mattst88@gmail.com> i965: Emit proper NOPs.

The PRMs for HSW and newer say that other than the opcode and DebugCtrl
bits of the instruction word, the rest must be zero.

By zeroing the instruction word manually, we avoid using any of the
state inherited through brw_codegen.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96959
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
c9f176790346558fa48cfbcf6e2d5e140eb78fd7 04-Oct-2016 Timothy Arceri <timothy.arceri@collabora.com> i965: fix unused variable warning in gen7_block_read_scratch()

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
8a468d186e6fc27c26dd12ba989192e7596f667a 15-Sep-2016 Jason Ekstrand <jason@jlekstrand.net> i965/fs: Take Dispatch/Vector mask into account in FIND_LIVE_CHANNEL

On at least Sky Lake, ce0 does not contain the full story as far as enabled
channels goes. It is possible to have completely disabled channels where
the corresponding bits in ce0 are 1. In order to get the correct execution
mask, you have to mask off those channels which were disabled from the
beginning by taking the AND of ce0 with either sr0.2 or sr0.3 depending on
the shader stage. Failure to do so can result in FIND_LIVE_CHANNEL
returning a completely dead channel.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: Francisco Jerez <currojerez@riseup.net>
[ Francisco Jerez: Fix a couple of typos, add mask register type
assertion, clarify reason why ce0 can have bits set for disabled
channels, clarify that this may only be a problem when thread
dispatch doesn't pack channels tightly in the SIMD thread. Apply
same treatment to Align16 path. ]
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
979d0aca6277975986f5f278cad0f37616c9d91f 26-Aug-2016 Jason Ekstrand <jason.ekstrand@intel.com> intel: Rename brw_get_device_name/info to gen_get_device_name/info

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
527f37199929932300acc1688d8160e1f3b1d753 23-Aug-2016 Jason Ekstrand <jason.ekstrand@intel.com> intel: s/brw_device_info/gen_device_info/

Generated by:

sed -i -e 's/brw_device_info/gen_device_info/g' src/intel/**/*.c
sed -i -e 's/brw_device_info/gen_device_info/g' src/intel/**/*.h
sed -i -e 's/brw_device_info/gen_device_info/g' **/i965/*.c
sed -i -e 's/brw_device_info/gen_device_info/g' **/i965/*.cpp
sed -i -e 's/brw_device_info/gen_device_info/g' **/i965/*.h

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
90eaf01616a8cf7a39dae63a3d5636874fa68fa5 30-Aug-2016 Matt Turner <mattst88@gmail.com> i965: Pass start_offset to brw_set_uip_jip().

Without this, we would pass over the instructions in the SIMD8 program
(which is located earlier in the buffer) when brw_set_uip_jip() is
called to handle the SIMD16 program.

The assertion about compacted control flow was bogus: halt, cont, break
cannot be compacted because they have both JIP and UIP. Instead, we
should never see a compacted instruction in this code at all.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
26ac16fe2f73507041062f63646286dea60053da 22-Jul-2016 Francisco Jerez <currojerez@riseup.net> i965/eu: Add codegen support for the Gen9+ render target read message.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
29eb8059fd7906d2595ea99bc65a27691b9fbe53 22-Jul-2016 Francisco Jerez <currojerez@riseup.net> i965/eu: Take into account the target cache argument in brw_set_dp_read_message.

brw_set_dp_read_message() was setting the data cache as send message
SFID on Gen7+ hardware, ignoring the target cache specified by the
caller. Some of the callers were passing a bogus target cache value
as argument relying on brw_set_dp_read_message not to take it into
account. Fix them too.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
0534863c477240e47f1d85616b59c31fad453ea2 07-Jul-2016 Samuel Iglesias Gonsálvez <siglesias@igalia.com> i965/eu: set DF imm value to the source of DIM

According to HSW's PRM, vol02b, the DIM instruction has the following
restriction:

"Restriction : src0 must be immediate. src0 must specify the :f (F, Float)
type encoding but is an immediate 64-bit DF (Double Float) value. dst
must have type DF."

This commit allows to upload the immediate 64-bit DF value to the source
of a DIM instruction even when it is of float type encoding.

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
6e28976d35cf0a15c62bed1fd2ceeb734a3fc81e 07-Jul-2016 Samuel Iglesias Gonsálvez <siglesias@igalia.com> i965: enable the emission of the DIM instruction

v2 (Matt):
- Take a DF source argument for the DIM instruction emission
in the visitors.
- Indentation.

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
7244dc1e0651958b62222cafb15e34487851a6cd 03-Jun-2016 Francisco Jerez <currojerez@riseup.net> Revert "i965/fs: Allow scalar source regions on SNB math instructions."

This reverts commit c1107cec44ab030c7fcc97c67baa12df1cc9d7b5.
Apparently the hardware spec text I quoted in the commit message was
outright lying about scalar source math being supported on SNB, the
hardware seems to load 32 contiguous bits of data for each channel
regardless of the regioning mode. Fixes regressions in the following
CTS tests (which we didn't catch early due to CTS being temporarily
disabled in our CI system):

es2-cts.gtf.gl.atan.atan_vec3_frag_xvary
es2-cts.gtf.gl.cos.cos_vec2_frag_xvary
es2-cts.gtf.gl.atan.atan_vec2_frag_xvary
es2-cts.gtf.gl.pow.pow_vec2_frag_xvary_yconsthalf
es2-cts.gtf.gl.cos.cos_float_frag_xvary
es2-cts.gtf.gl.pow.pow_float_frag_xvary_yconsthalf
es2-cts.gtf.gl.atan.atan_vec3_frag_xvaryyvary
es2-cts.gtf.gl.pow.pow_vec3_frag_xvary_yconsthalf
es2-cts.gtf.gl.cos.cos_vec3_frag_xvary
es2-cts.gtf.gl.atan.atan_vec2_frag_xvaryyvary

Cc: mesa-stable@lists.freedesktop.org
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96346
Reported-by: Mark Janes <mark.a.janes@intel.com>
Acked-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
58617bcebe1d0d5e8d360fec2c9dabf8771b0f7a 01-Jun-2016 Alejandro Piñeiro <apinheiro@igalia.com> i965/eu: use simd8 when exec_size != EXECUTE_16

Among other thigs, fix a gpu hang when using INTEL_DEBUG=shader_time
for any shader.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Alejandro Piñeiro <apinheiro@igalia.com>

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
c1107cec44ab030c7fcc97c67baa12df1cc9d7b5 28-May-2016 Francisco Jerez <currojerez@riseup.net> i965/fs: Allow scalar source regions on SNB math instructions.

I haven't found any evidence that this isn't supported by the
hardware, in fact according to the SNB hardware spec:

"The supported regioning modes for math instructions are align16,
align1 with the following restrictions:
- Scalar source is supported.
[...]
- Source and destination offset must be the same, except the case of
scalar source."

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
81bc6de8c0f7faafd0f3b0aee944a14ba3ef0b64 19-May-2016 Francisco Jerez <currojerez@riseup.net> i965/ir: Make BROADCAST emit an unmasked single-channel move.

Alternatively we could have extended the current semantics to 32-wide
mode by changing brw_broadcast() to emit multiple indexed MOV
instructions in the generator copying the selected value to all
destination registers, but it seemed rather silly to waste EU cycles
unnecessarily copying the exact same value 32 times in the GRF.

The vstride change in the Align16 path is required to avoid assertions
in validate_reg() since the change causes the execution size of the
MOV and SEL instructions to be equal to the source region width.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
41562eb8f33558f02ff8f53b3094a0e6d54e4c49 21-May-2016 Francisco Jerez <currojerez@riseup.net> i965/fs: Allow specifying arbitrary quarter control to FIND_LIVE_CHANNEL.

This makes FIND_LIVE_CHANNEL behave like a normal instruction for
non-zero quarter control. On Gen8+ we just leave the quarter control
field of the emitted FBL instruction set to the default value so the
hardware applies the expected shift to the execution mask signals. On
Gen7 we apply the offset manually by specifying a non-zero subregister
offset in the source region of the FBL instruction.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
a5a08109608406438109bfa5def5a2af788d2840 19-May-2016 Francisco Jerez <currojerez@riseup.net> i965/fs: Allow specifying arbitrary execution sizes up to 32 to FIND_LIVE_CHANNEL.

Due to a Gen7-specific hardware bug native 32-wide instructions get
the lower 16 bits of the execution mask applied incorrectly to both
halves of the instruction, so the MOV trick we currently use wouldn't
work. Instead emit multiple 16-wide MOV instructions in 32-wide mode
in order to cover the whole execution mask.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
a7d319c00be425be219a101b5b4d48f1cbe4ec01 17-May-2016 Francisco Jerez <currojerez@riseup.net> i965/fs: Implement scratch reads and writes of 4 GRFs at a time.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
fe5cdde2f9f84022b512de1fa42a036a371d31ba 17-May-2016 Francisco Jerez <currojerez@riseup.net> i965/eu: Fix Gen7+ DP scratch message size calculation on Gen7.

Gen7 hardware expects the block size field in the message descriptor
to be the number of registers minus one instead of the log2 of the
number of registers.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
fc7107de1d7cac6be817e8951e53f997c248c277 26-Apr-2016 Francisco Jerez <currojerez@riseup.net> i965/eu: Set execution size explicitly for memory fence send message.

We don't want to emit a 32-wide send message in 32-wide programs. The
memory fence message should have the same effect regardless of the
execution size (as long as it's valid) so just set it to one.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
5c887326c516e2de710ff2d90ed608d834920688 26-Apr-2016 Francisco Jerez <currojerez@riseup.net> i965/eu: Consider QtrCtrl 3Q-4Q in typed surface message descriptor setup.

In SIMD32 programs the compiler is responsible for providing the
appropriate half of the sample mask in the message header, so the
first and third quarters both map to the first slot group of the
provided 16-bit half, while the second and fourth quarters map to the
second slot group -- IOW they should be equivalent to 1Q and 2Q modulo
two.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
448340d31f4d4d60fbd1935d5a50fe9ee22efd41 20-May-2016 Francisco Jerez <currojerez@riseup.net> i965/fs: Clean up remaining uses of dispatch_width in the generator.

Most of these are bugs because the intended execution size of an
instruction and the dispatch width of the shader aren't necessarily
the same (especially in SIMD32 programs).

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
646213168ed1d2427f30cb92e783910a319cdbb4 28-May-2016 Francisco Jerez <currojerez@riseup.net> i965/eu: Use current exec size instead of p->compressed in surface message generation.

This was kind of an abuse of p->compressed, dataport send message
instructions are always uncompressed. Use the current execution size
instead since p->compressed is on its way out.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
fdae8b9f91089aea3d4b88ddb62a39ac687bb9be 19-May-2016 Francisco Jerez <currojerez@riseup.net> i965/eu: Stop using p->compressed to specify the exec size of control flow instructions.

p->compressed won't work for SIMD32, we should just be using the
execution size value specified via p->current instead.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
c19c3d3a5285af2936025568a91020f566ae768c 19-May-2016 Francisco Jerez <currojerez@riseup.net> i965/eu: Fix a bunch of compression control bugs in the generator.

Most of these were resetting quarter control to zero incorrectly even
though everything they needed to do was disable instruction
compression -- The brw_SAMPLE() case was doing the right thing but it
can be simplified slightly by using the new compression control
interface.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
61847d77084215ca23bf89517c282a78bc9726b9 24-May-2016 Matt Turner <mattst88@gmail.com> i965: Mark fallthrough in switch statement.

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
2fd79ebe8fe4f0f0397bba1624deed9fa4e7fc3b 15-May-2016 Kenneth Graunke <kenneth@whitecape.org> i965: Fix JIP to skip over sibling do...while loops.

We've apparently always been botching JIP for sequences such as:

do
cmp.f0.0 ...
(+f0.0) break
...
do
...
while
...
while

Because the "do" instruction doesn't actually exist, the inner "while"
is at the same depth as the "break". brw_find_next_block_end() thus
mistook the inner "while" as the end of the loop containing the "break",
and set the "break" to point to the wrong place.

Only "while" instructions that jump before our instruction are relevant.
We need to ignore the rest, as they're sibling control flow nodes (or
children, but this was already handled by the depth == 0 check).

See also commit 1ac1581f3889d5f7e6e231c05651f44fbd80f0b6.

This prevents channel masks from being screwed up, and fixes GPU
hangs(*) in dEQP-GLES31.functional.shaders.multisample_interpolation.
interpolate_at_sample.centroid_qualified.multisample_texture_16.

The test ended up executing code with no channels enabled, and that
code contained FIND_LIVE_CHANNEL, which returned 8 (out of range for
a SIMD8 program), which then was used in indirect GRF addressing,
which randomly got a boolean value (0xFFFFFFFF), interpreted it as
a sample ID, OR'd it into an indirect send message descriptor,
which corrupted the message length, sending a pixel interpolator
message with mlen 15, which is illegal. Whew :)

(*) Technically, the test doesn't GPU hang currently, but only
because another bug prevents it from issuing pixel interpolator
messages entirely...with that fixed, it hangs.

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
2f02fad6b3a0429798c3bd4feb4501dafa5e2fc0 15-May-2016 Kenneth Graunke <kenneth@whitecape.org> i965: Make a "does this while jump before our instruction?" helper.

I need to use this in an additional place.

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
3210870b34ac64ccdc399778ba306ce452ac7e88 07-Jan-2016 Iago Toral Quiroga <itoral@igalia.com> i965: two-argument instructions can only use 32-bit immediates

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
cbf7c7f09993e79633f43edc01ef95e70c1bffab 03-Aug-2015 Connor Abbott <connor.w.abbott@intel.com> i965/eu: add support for DF immediates

v2 (Sam):
- Remove 'however' from the comment (Topi)

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
9add73f641401775867824a5d799813474d34112 20-Oct-2014 Topi Pohjolainen <topi.pohjolainen@intel.com> i965/eu: Allow 3-src float ops with doubles

v2:
- set 3src_src_type for BRW_REGISTER_TYPE_DF (Connor)

Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
530593da65c0205539fe4bd7bcf7c01e3eba723d 18-Mar-2016 Marc-André Lureau <marcandre.lureau@redhat.com> i965: fix invalid memory write

I noticed some heap corruption running virgl tests, and valgrind
helped me to track it down to the following error:

==29272== Invalid write of size 4
==29272== at 0x90283D4: push_loop_stack (brw_eu_emit.c:1307)
==29272== by 0x9029A7D: brw_DO (brw_eu_emit.c:1750)
==29272== by 0x90554B0: fs_generator::generate_code(cfg_t const*, int) (brw_fs_generator.cpp:1999)
==29272== by 0x904491F: brw_compile_fs (brw_fs.cpp:5685)
==29272== by 0x8FC5DC5: brw_codegen_wm_prog (brw_wm.c:137)
==29272== by 0x8FC7663: brw_fs_precompile (brw_wm.c:638)
==29272== by 0x8FA4040: brw_shader_precompile(gl_context*, gl_shader_program*) (brw_link.cpp:51)
==29272== by 0x8FA4A9A: brw_link_shader (brw_link.cpp:260)
==29272== by 0x8DEF751: _mesa_glsl_link_shader (ir_to_mesa.cpp:3006)
==29272== by 0x8C84325: _mesa_link_program (shaderapi.c:1042)
==29272== by 0x8C851D7: _mesa_LinkProgram (shaderapi.c:1515)
==29272== by 0x4E4B8E8: add_shader_program (vrend_renderer.c:880)
==29272== Address 0xf2f3cb0 is 0 bytes after a block of size 112 alloc'd
==29272== at 0x4C2AA98: calloc (vg_replace_malloc.c:711)
==29272== by 0x8ED11F7: ralloc_size (ralloc.c:113)
==29272== by 0x8ED1282: rzalloc_size (ralloc.c:134)
==29272== by 0x8ED14C0: rzalloc_array_size (ralloc.c:196)
==29272== by 0x9019C7B: brw_init_codegen (brw_eu.c:291)
==29272== by 0x904F565: fs_generator::fs_generator(brw_compiler const*, void*, void*, void const*, brw_stage_prog_data*, unsigned int, bool, gl_shader_stage) (brw_fs_generator.cpp:124)
==29272== by 0x9044883: brw_compile_fs (brw_fs.cpp:5675)
==29272== by 0x8FC5DC5: brw_codegen_wm_prog (brw_wm.c:137)
==29272== by 0x8FC7663: brw_fs_precompile (brw_wm.c:638)
==29272== by 0x8FA4040: brw_shader_precompile(gl_context*, gl_shader_program*) (brw_link.cpp:51)
==29272== by 0x8FA4A9A: brw_link_shader (brw_link.cpp:260)
==29272== by 0x8DEF751: _mesa_glsl_link_shader (ir_to_mesa.cpp:3006)

if_depth_in_loop is an array of size p->loop_stack_array_size, and
push_loop_stack() will access if_depth_in_loop[p->loop_stack_depth+1],
thus the condition to grow the array should be
p->loop_stack_array_size <= (p->loop_stack_depth + 1) (it's currently
off by 2...)

This can be reproduced by running the following test with virgl test
server:
LIBGL_ALWAYS_SOFTWARE=y GALLIUM_DRIVER=virpipe bin/shader_runner
./tests/shaders/glsl-fs-unroll-explosion.shader_test -auto

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
5be11d22368c4fd520983ab78a9ac8fc10d79929 03-Dec-2015 Iago Toral Quiroga <itoral@igalia.com> i965: Skip execution size adjustment for instructions of width 4

This code in brw_set_dest adjusts the execution size of any instruction
with a dst.width < 8. However, we don't want to do this with instructions
operating on doubles, since these will have a width of 4, but still
need an execution size of 8 (for SIMD8). Unfortunately, we can't just check
the size of the operands involved to detect if we are doing an operation on
doubles, because we can have instructions that do operations on double
operands interpreted as UD, operating on any of its 2 32-bit components.

Previous commits have made it so we never emit instructions with a horizontal
width of 4 that don't have the correct execution size set for gen6+, so
we can skip it in this case, avoiding the conflicts with fp64 requirements.

Expanding the same fix to other hardware generations requires many more
changes but since we are not targetting fp64 support on them
wer don't really care for now.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
f6342b56456582340f622ec6e23627ee07ba711d 03-Dec-2015 Iago Toral Quiroga <itoral@igalia.com> i965: set correct execsize for MOVS with a width of 4 in brw_find_live_channel

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
31a86042522f4f836b503679be8a120e302fb68a 03-Dec-2015 Iago Toral Quiroga <itoral@igalia.com> i965/eu: set execution size for SEND message in brw_send_indirect_message

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
ea45b6e96d16b04b6a6cbebb5a8f77ba6a46bcf9 03-Dec-2015 Iago Toral Quiroga <itoral@igalia.com> i965/eu: set correct execution size in brw_NOP

v2: NOP should have an execsize of 1 (Matt)

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
537d3df97466835ad6438fe2c9121283e0da1bcd 27-Feb-2016 Francisco Jerez <currojerez@riseup.net> i965: Pass symbolic swizzle to brw_swizzle() as a single argument.

And replace brw_swizzle1() with brw_swizzle(). Seems slightly cleaner
and will allow reusing brw_swizzle() in the vec4 back-end more easily.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
7428e6f86ab5022ba07f562e124642245c63a72f 01-Feb-2016 Jordan Justen <jordan.l.justen@intel.com> i965: Set dest type to UW for several send messages

Without this, on SIMD 16 the send instruction destination will appear
to write more than one destination register, causing the simulator to
report an error.

Of course, the send instruction can actually write more than one
destination register regardless of the type set for the destination,
so this is a bit strange.

Suggested-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
53a9b6223f4ebf66e8892e04ffe47eb5586eda5c 31-Dec-2015 Kenneth Graunke <kenneth@whitecape.org> i965: Move 3-src subnr swizzle handling into the vec4 backend.

While most align16 instructions only support a SubRegNum of 0 or 4
(using swizzling to control the other channels), 3-src instructions
actually support arbitrary SubRegNums. When the RepCtrl bit is set,
we believe it ignores the swizzle and uses the equivalent of a <0,1,0>
region from the subnr.

In the past, we adopted a vec4-centric approach of specifying subnr of
0 or 4 and a swizzle, then having brw_eu_emit.c convert that to a proper
SubRegNum. This isn't a great fit for the scalar backend, where we
don't set swizzles at all, and happily set subnrs in the range [0, 7].

This patch changes brw_eu_emit.c to use subnr and swizzle directly,
relying on the higher levels to set them sensibly.

This should fix problems where scalar sources get copy propagated into
3-src instructions in the FS backend. I've only observed this with
TES push model inputs, but I suppose it could happen in other cases.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
77b338d63b61d72dafa7ecd420e36ee2bb0436ab 02-Dec-2015 Kenneth Graunke <kenneth@whitecape.org> i965: Make brw_set_message_descriptor() non-static.

I want to use this directly from brw_vec4_generator.cpp.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
1ac1581f3889d5f7e6e231c05651f44fbd80f0b6 18-Nov-2015 Kenneth Graunke <kenneth@whitecape.org> i965: Fix JIP to properly skip over unrelated control flow.

We've apparently always been botching JIP for sequences such as:

do
cmp.f0.0 ...
(+f0.0) break
...
if
...
else
...
endif
...
while

Normally, UIP is supposed to point to the final destination of the jump,
while in nested control flow, JIP is supposed to point to the end of the
current nesting level. It essentially bounces out of the current nested
control flow, to an instruction that has a JIP which bounces out another
level, and so on.

In the above example, when setting JIP for the BREAK, we call
brw_find_next_block_end(), which begins a search after the BREAK for the
next ENDIF, ELSE, WHILE, or HALT. It ignores the IF and finds the ELSE,
setting JIP there.

This makes no sense at all. The break is supposed to skip over the
whole if/else/endif block entirely. They have a sibling relationship,
not a nesting relationship.

This patch fixes brw_find_next_block_end() to track depth as it does
its search, and ignore anything not at depth 0. So when it sees the
IF, it ignores everything until after the ENDIF. That way, it finds
the end of the right block.

I noticed this while reading some assembly code. We believe jumping
earlier is harmless, but makes the EU walk through a bunch of disabled
instructions for no reason. I noticed that GLBenchmark Manhattan had
a shader that contained a BREAK with a bogus JIP, but didn't measure
any performance improvement (it's likely miniscule, if there is any).

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
55ffa64daf765b1229364518106a4124bd84b9a7 23-Nov-2015 Francisco Jerez <currojerez@riseup.net> i965/gen9+: Switch thread scratch space to non-coherent stateless access.

The thread scratch space is thread-local so using the full IA-coherent
stateless surface index (255 since Gen8) is unnecessary and
potentially expensive. On Gen8 and early steppings of Gen9 this is
not a functional change because the kernel already sets bit 4 of
HDC_CHICKEN0 which overrides all HDC memory access to be non-coherent
in order to workaround a hardware bug.

This happens to fix a full system hang when running any spilling code
on a pre-production SKL GT4e machine I have on my desk (forcing all
HDC access to non-coherent from the kernel up to stepping F0 might be
a good idea though regardless of this patch), and improves performance
of the OglPSBump2 SynMark benchmark run with INTEL_DEBUG=spill_fs by
33% (11 runs, 5% significance) on a production SKL GT2 (on which HDC
IA-coherency is apparently functional so it wouldn't make sense to
disable globally).

Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
0eb3db117b56b081ee2674cc8940c193ffc3c41b 02-Nov-2015 Matt Turner <mattst88@gmail.com> i965: Use BRW_MRF_COMPR4 macro in more places.

Reviewed-by: Emil Velikov <emil.velikov@collabora.co.uk>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
d74dd703f80ff40047ad8360e66ffd70b80f7230 23-Oct-2015 Matt Turner <mattst88@gmail.com> i965: Add and use enum brw_reg_file.

Reviewed-by: Emil Velikov <emil.velikov@collabora.co.uk>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
e42fb0c2a687cdcd6af2a590f6f5e24f64cfff3b 23-Oct-2015 Matt Turner <mattst88@gmail.com> i965: Make 'dw1' and 'bits' unnamed structures in brw_reg.

Generated by

sed -i -e 's/\.bits\././g' *.c *.h *.cpp
sed -i -e 's/dw1\.//g' *.c *.h *.cpp

and then reverting changes to comments in gen7_blorp.cpp and
brw_fs_generator.cpp.

There wasn't any utility offered by forcing the programmer to list these
to access their fields. Removing them will reduce churn in future
commits.

This is C11 (and gcc has apparently supported it for sometime
"compatibility with other compilers")

See https://gcc.gnu.org/onlinedocs/gcc/Unnamed-Fields.html

Reviewed-by: Emil Velikov <emil.velikov@collabora.co.uk>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
f7f1bc6cca251193105d59811d7313e69e867d78 28-Oct-2015 Kristian Høgsberg <krh@bitplanet.net> i965: Fix invalid memory accesses after resizing brw_codegen's store table

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
e09e5f992eb233a4e2afb505e150befd7a67deac 26-Oct-2015 Matt Turner <mattst88@gmail.com> i965: Set correct field for indirect align16 addrimm.

This has been wrong since the initial import of the i965 driver.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
de5a450bd360d24db65cbba5b6633f800fda0d2e 17-Oct-2015 Kristian Høgsberg Kristensen <krh@bitplanet.net> i965: Don't use message headers for untyped reads

We always set the mask to 0xffff, which is what it defaults to when no
header is present. Let's drop the header instead.

v2: Only remove header for untyped reads. Typed reads always need the
header.

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Signed-off-by: Kristian Høgsberg Kristensen <krh@bitplanet.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
da361acd1c899d533caec6cae5a336f6ab35e076 17-Jul-2015 Neil Roberts <neil@linux.intel.com> i965/fs: Handle non-const sample number in interpolateAtSample

If a non-const sample number is given to interpolateAtSample it will
now generate an indirect send message with the sample ID similar to
how non-const sampler array indexing works. Previously non-const
values were ignored and instead it ended up using a constant 0 value.

The generator will try to determine if the sample ID is dynamically
uniform via nir_src_is_dynamically_uniform. If not it will query the
pixel interpolator in a loop, once for each different live sample
number. The next live sample number is found using emit_uniformize. If
multiple live channels have the same sample number then they will be
handled in a single iteration of the loop. The loop is necessary
because the indirect send message doesn't seem to have a way to
specify a different value for each fragment.

This fixes the following two Piglit tests:

arb_gpu_shader5-interpolateAtSample-nonconst
arb_gpu_shader5-interpolateAtSample-dynamically-nonuniform

v2: Handle dynamically non-uniform sample ids.
v3: Remove the BREAK instruction and predicate the WHILE directly.
Make the tokens arrays const. (Matt Turner)
v4: Iterate over the live channels instead of each possible sample
number.
v5: Don't special case immediate values in
brw_pixel_interpolator_query. Make a better wrapper for the
function to set up the PI send instruction. Ensure that the SHL
instructions are scalar. (Francisco Jerez).

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
c1070550c289d48ef389aeb8c564d1abd1123ad1 21-Sep-2015 Kenneth Graunke <kenneth@whitecape.org> i965: Fix MRF register number assertions for compr4.

compr4 is represented by setting the high bit on the MRF number.
We need to mask it out before sanity checking the register number.

Fixes ~8000 assert fails on Ironlake and G45.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92066
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
f50645d05c6dffa6463856ded0b8461ac9d24535 15-Sep-2015 Iago Toral Quiroga <itoral@igalia.com> i965: Turn BRW_MAX_MRF into a macro that accepts a hardware generation

There are some bug reports about shaders failing to compile in gen6
because MRF 14 is used when we need to spill. For example:
https://bugs.freedesktop.org/show_bug.cgi?id=86469
https://bugs.freedesktop.org/show_bug.cgi?id=90631

Discussion in bugzilla pointed to the fact that gen6 might actually have
24 MRF registers available instead of 16, so we could use other MRF
registers and avoid these conflicts (we still need to investigate why
some shaders need up to MRF 14 anyway, since this is not expected).

Notice that the hardware docs are not clear about this fact:

SNB PRM Vol4 Part2's "Table 5-4. MRF Registers Available in Device
Hardware" says "Number per Thread" - "24 registers"

However, SNB PRM Vol4 Part1, 1.6.1 Message Register File (MRF) says:

"Normal threads should construct their messages in m1..m15. (...)
Regardless of actual hardware implementation, the thread should
not assume th at MRF addresses above m15 wrap to legal MRF registers."

Therefore experimentation was necessary to evaluate if we had these extra
MRF registers available or not. This was tested in gen6 using MRF
registers 21..23 for spilling and doing a full piglit run (all.py) forcing
spilling of everything on the FS backend. It was also tested by doing
spilling of everything on both the FS and the VS backends with a piglit run
of shader.py. In both cases no regressions were observed. In fact, many of
these tests where helped in the cases where we forced spilling, since that
triggered the same underlying problem described in the bug reports. Here are
some results using INTEL_DEBUG=spill_fs,spill_vec4 for a shader.py run on
gen6 hardware:

Using MRFs 13..15 for spilling:
crash: 2, fail: 113, pass: 6621, skip: 5461

Using MRFs 21..23 for spilling:
crash: 2, fail: 12, pass: 6722, skip: 5461

This patch sets the ground for later patches to implement spilling
using MRF registers 21..23 in gen6.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
085861083638ec782c17d3aa72ab46f1a0099935 16-Sep-2015 Iago Toral Quiroga <itoral@igalia.com> i965: Move MRF register asserts out of brw_reg.h

In a later patch we will make BRW_MAX_MRF return a different value depending
on the hardware generation, but it is inconvenient to add a gen parameter
to the brw_reg functions only for the assertions, so move these to places where
we have the hardware generation available.

Ken suggested to add the asserts to brw_set_src0 and brw_set_dest since that
would make sure that we catch all uses of MRF registers, even those coming
from modules that generate native code directly, like blorp. Unfortunately,
this is very late in the process which can make things harder to debug, so add
asserts to the generator as well.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
dd7290cf59206c49f1a322d53baa9957b13d2949 11-Sep-2015 Jason Ekstrand <jason.ekstrand@intel.com> i965/emit: Add assertions for accumulator restrictions

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
8fafb0a67faa548cb16e122e214912a17835e369 19-Aug-2015 Ian Romanick <ian.d.romanick@intel.com> mesa: Fix warning about static being in the wrong place

Because the compiler already has enough things to complain about.

grep -rl 'const static' src/ | while read f
do
sed --in-place -e 's/const static/static const/g' $f
done

brw_eu_emit.c: In function 'brw_reg_type_to_hw_type':
brw_eu_emit.c:98:7: warning: 'static' is not at beginning of declaration [-Wold-style-declaration]
const static int imm_hw_types[] = {
^
brw_eu_emit.c:120:7: warning: 'static' is not at beginning of declaration [-Wold-style-declaration]
const static int hw_types[] = {
^

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
40e2102e528498dd4c03c4567d3522241f4d1f22 06-Jul-2015 Francisco Jerez <currojerez@riseup.net> i965/gen4-5: Set ENDIF dst and src0 fields to the null register.

The hardware docs don't mention explicitly what these fields should
be, but I've verified experimentally on ILK that using a GRF as
destination causes the register to be corrupted when the execution
size of an ENDIF instruction is higher than 8 -- and because the
destination we were using was g0, eventually a hang.

Fixes some 150 piglit tests on Gen4-5 when forced to run shaders with
if conditionals 16-wide, e.g. shaders/glsl-fs-sampler-numbering-3.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
7953c000731ec1310fdbb5d8a13720fe0cdbf6f4 05-Nov-2014 Jordan Justen <jordan.l.justen@intel.com> i965: Add brw_barrier to emit a Gateway Barrier SEND

This will be used to implement the Gateway Barrier SEND needed to implement
the barrier function.

v2:
* notify => gateway_notify (Ken)
* combine short lines of brw_barrier proto/decl (mattst88)

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
0d250cc210f971f566bbe5b1e54cf3cd114537e9 05-Nov-2014 Jordan Justen <jordan.l.justen@intel.com> i965: Add brw_WAIT to emit wait instruction

This will be used to implement the barrier function.

v2:
* Rename to brw_WAIT (mattst88)

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
e354cc9b791cf025d26de7e19c58d499b83a3570 27-May-2015 Matt Turner <mattst88@gmail.com> i965: Silence warning in 3-src type-setting.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
715bc6d8b16a0bbdc17fe1e1e46b88a679bf312b 23-Apr-2015 Francisco Jerez <currojerez@riseup.net> i965: Introduce the FIND_LIVE_CHANNEL pseudo-opcode.

This instruction calculates the index of an arbitrary channel enabled
in the current execution mask. It's expected to be used as input for
the BROADCAST opcode, but it's implemented as a separate instruction
rather than being baked into BROADCAST because FIND_LIVE_CHANNEL has
no dependencies so it can always be CSE'ed with other instances of the
same instruction within a basic block.

v2: Whitespace fixes.

Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
c74511f5dc239eefb8604294c6c1e57b3a394111 20-Feb-2015 Francisco Jerez <currojerez@riseup.net> i965: Introduce the BROADCAST pseudo-opcode.

The BROADCAST instruction picks the channel from its first source
given by an index passed in as second source. This will be used in
situations where all channels from the same SIMD thread have to agree
on the value of something, e.g. a surface binding table index.

This is in particular the case for UBO, sampler and image arrays,
which can be indexed dynamically with the restriction that all active
SIMD channels access the same index, provided to the shared unit as
part of a single scalar field of the message descriptor. Simply
taking the index value from the first channel as we were doing until
now is incorrect, because it might contain an uninitialized value if
the channel had previously been disabled by non-uniform control flow.

v2: Minor style fixes. Improve commit message.

Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
f1d1d17db6bdeac0519652aa7432048507154a28 23-Apr-2015 Francisco Jerez <currojerez@riseup.net> i965: Add memory fence opcode.

Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
f118e5d15fd9b35cf27a975a702c5fb81d3157aa 23-Apr-2015 Francisco Jerez <currojerez@riseup.net> i965: Add typed surface access opcodes.

Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
0775d8835ac8d1f2ab75d04f0cddbad36b6787fe 23-Apr-2015 Francisco Jerez <currojerez@riseup.net> i965: Add untyped surface write opcode.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
2f1c16df3e997771bcedb60ae7f16a21c4c60144 23-Apr-2015 Francisco Jerez <currojerez@riseup.net> i965: Fix the untyped surface opcodes to deal with indirect surface access.

Change brw_untyped_atomic() and brw_untyped_surface_read() to take the
surface index as a register instead of a constant and to use
brw_send_indirect_message() to emit the indirect variant of send with
a dynamically calculated message descriptor. This will be required to
support variable indexing of image arrays for
ARB_shader_image_load_store.

Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
05e7f7f4388bde882b7ce74124000a4d435affff 22-Apr-2015 Zoë Blade <zoe@bytenoise.co.uk> Fix a few typos

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
a85c4c9b3f75cac9ab133caa91a40eec2e4816ae 16-Apr-2015 Jason Ekstrand <jason.ekstrand@intel.com> i965: Rename brw_compile to brw_codegen

This name better matches what it's actually used for. The patch was
generated with the following command:

for file in *; do
sed -i -e s/brw_compile/brw_codegen/g $file
done

Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
61c4702489fa1694892c5ce90ccf65a5094df3e7 15-Apr-2015 Jason Ekstrand <jason.ekstrand@intel.com> i965: Remove the context field from brw_compiler

Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
4e9c79c847c81701300b5b0d97d85dcfad32239a 15-Apr-2015 Jason Ekstrand <jason.ekstrand@intel.com> i965: Make the brw_inst helpers take a device_info instead of a context

Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
b14313e45295d91b5737775ec788c76d8f0c2f93 07-Apr-2015 Matt Turner <mattst88@gmail.com> i965/fs: Manually set source regioning on PLN instructions.

Like LINE (commit 92346db0), src0 must have a scalar region. Setting
src1's region to <8,8,1> lets us pass a properly sized combined delta_xy
argument in a few commits without getting a bogus <16,16,1> region.

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
b5a5b63548e9b27a3d0b8ad1b399006c71dcc3c4 04-Apr-2015 Matt Turner <mattst88@gmail.com> i965/fs: Allow an execution size of 32.

In a few commits, we'll start emitting an add(32) instruction on some
platforms.

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
dd5c8250537640f92dbc1ee63d516c6e3e2aaf77 14-Apr-2015 Matt Turner <mattst88@gmail.com> i965: Replace guess_execution_size with something simpler.

guess_execution_size() does two things:

1. Cope with small destination registers.
2. Cope with SIMD8 vs SIMD16 mode.

This patch replaces the first with a simple if block in brw_set_dest: if
the destination register width is less than 8, you probably want the
execution size to match. (I didn't put this in the 3src block because
it doesn't seem to matter.)

Since only the FS compiler cares about SIMD16 mode, it's easy to just
set the default execution size there.

This pattern was already been proven in the Gen8+ generator, but we
didn't port it back to the existing generator when we combined the two.

This is based on a patch from Ken from about a year ago. I've rebased it
and and fixed a few bugs.

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
1cc00f1875e7b830db27945090ad78be41157dc9 26-Feb-2015 Francisco Jerez <currojerez@riseup.net> i965: Mask out unused Align16 components in brw_untyped_atomic.

This is currently not a problem because the vec4 visitor happens to
mask out unused components from the destination, but it might become
an issue when we start using atomics without writeback message. In
any case it seems sensible to set it again here because the
consequences of setting the wrong writemask (random graphics memory
corruption) are difficult to debug and can easily go unnoticed.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
959d16e38e007b29349d7371fb390a5449c88341 25-Feb-2015 Francisco Jerez <currojerez@riseup.net> i965: Pass number of components explicitly to brw_untyped_atomic and _surface_read.

And calculate the message response size based on the number of
components rather than the other way around. This simplifies their
interface somewhat and allows the caller to request a writeback
message with more than one vector component in SIMD4x2 mode.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
a902a5d6ba921ab006496aeecab0f68bca7ffb09 19-Mar-2015 Francisco Jerez <currojerez@riseup.net> i965: Factor out logic to build a send message instruction with indirect descriptor.

This is going to be useful because the Gen7+ uniform and varying pull
constant, texturing, typed and untyped surface read, write, and atomic
generation code on the vec4 and fs back-end all require the same logic
to handle conditionally indirect surface indices. In pseudocode:

| if (surface.file == BRW_IMMEDIATE_VALUE) {
| inst = brw_SEND(p, dst, payload);
| set_descriptor_control_bits(inst, surface, ...);
| } else {
| inst = brw_OR(p, addr, surface, 0);
| set_descriptor_control_bits(inst, ...);
| inst = brw_SEND(p, dst, payload);
| set_indirect_send_descriptor(inst, addr);
| }

This patch abstracts out this frequently recurring pattern so we can
now write:

| inst = brw_send_indirect_message(p, sfid, dst, payload, surface)
| set_descriptor_control_bits(inst, ...);

without worrying about handling the immediate and indirect surface
index cases explicitly.

v2: Rebase. Improve documentatation and commit message. (Topi)
Preserve UW destination type cargo-cult. (Topi, Ken, Matt)

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
9735a62a2c6007e7ee7baa5a769575a0adb5fda3 12-Mar-2015 Antia Puentes <apuentes@igalia.com> i965: Emit IF/ELSE/ENDIF/WHILE JIP with type W on Gen7

IvyBridge and Haswell PRM say that the JIP should be emitted
with type W but we were using UD. The previous implementation
did not show adverse effects, but IMHO it is safer to follow
the specification thoroughly.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Antia Puentes <apuentes@igalia.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
a72fb69604711d4f0e0fe49241d2da0311503f6a 05-Mar-2015 Iago Toral Quiroga <itoral@igalia.com> i965/fs: Implement SIMD16 dual source blending.

From the SNB PRM, volume 4, part 1, page 193:

"The dual source render target messages only have SIMD8 forms due to
maximum message length limitations. SIMD16 pixel shaders must send two of
these messages to cover all of the pixels. Each message contains two colors
(4 channels each) for each pixel in the message payload."

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=82831
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
49a7f8c919d23fec977116f218780a35896cc1dd 28-Feb-2015 Brian Paul <brianp@vmware.com> i965: replace Elements() with ARRAY_SIZE()

Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
6c07279e5ad67d925b99ff9e0345dcaeffc37283 04-Feb-2015 Francisco Jerez <currojerez@riseup.net> i965: Handle F16TO32/F32TO16 with dword src/dst consistently on both back-ends.

Due to the way it's implemented in hardware, the F16TO32/F32TO16
instructions require the source/destination register to be of some
16-bit type in Align1 mode, while they require it to be some 32-bit
type in Align16 mode (and as an undocumented feature the high 16 bits
of the destination register are zeroed out in the case of the F32TO16
instruction on Gen7). Make their behaviour consistent so you can
specify a 32 bit register type as source or destination and get
predictable results in the most significant bits no matter what access
mode is being used.

Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
437d401e6398eebc2ecd061650d16d1ad2d947f1 04-Feb-2015 Francisco Jerez <currojerez@riseup.net> i965/gen8: Fix F32TO16 in vec4 mode if the source and destination registers alias.

We cannot zero out the destination register if it overlaps with the
source. Use an Align1 instruction instead to zero out the high 16
bits after the conversion to half float.

Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
2335153ff2fae01d6294876a86d3eab59c6c4236 07-Jan-2015 Matt Turner <mattst88@gmail.com> i965: Remove now unnecessary Gen8 CMP destination type override.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
6be2434031775f4b44f2ff3db99047c0baefa797 23-Jan-2015 Jason Ekstrand <jason.ekstrand@intel.com> i965/emit: Assert that src1 is not an MRF after doing the MRF->GRF conversion

When emitting texturing from indirect texture units, we need to be able to
scratch around in the header message. Since we only do this for >= HSW,
this is ok since there are no MRFs.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Anuj phogat <anuj.phogat@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
7de8a3e13efe1c3eede531737f6780d388152355 22-Jan-2015 Jason Ekstrand <jason.ekstrand@intel.com> i965/emit: Do the sampler index adjustment directly in header.0.3

Prior to this commit, the adjust_sampler_state_pointer function took an
extra register that it could use as scratch space. The usual candidate was
the destination of the sampler instruction. However, if that register ever
aliased anything important such as the sampler index, this would scratch
all over important data. Fortunately, the calculation is such that we can
just do it in place and we don't need the scratch space at all.

Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
169d7e5cb1cce73d39e40717d5f49ac30b626d1b 23-Dec-2014 Ben Widawsky <benjamin.widawsky@intel.com> i965: Extract scalar region checking logic

There are currently 2 users of this functionality. I have 2 more users coming
up, and having a simple function makes the results much cleaner. The existing
interface semantics was proposed by Matt.

v2 (Ken): Rename to region_matches()/has_scalar_region().

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
430fbd8ad8b6d62bbb80757c5c7fa4fb365a3794 15-Dec-2014 Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com> i965: Make validate_reg tables constant

Declare local tables constant.

Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
92346db0578ef4796ced402ff33117713da7b9ee 19-Aug-2014 Matt Turner <mattst88@gmail.com> i965: Set the region of LINE's src0 to <0,1,0>.

The PRMs say that

<src0> region must be a replicated scalar
(with HorzStride = VertStride = 0).

but apparently that doesn't actually apply to all generations. I did
notice when implementing the optimization later in this series that G45
and ILK needed this regioning.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
9557cf7d0d2e5a76a5277c2a4825e265609b2fca 28-Oct-2014 Matt Turner <mattst88@gmail.com> i965: Remove non-existent vertical strides from array.

These never existed, as far as I can tell.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
1ca88aa58217239da84a426c29f05b5b53855951 04-Nov-2014 Chris Forbes <chrisf@ijw.co.nz> i965: Fix sampler state pointer adjustment for nonconst samplers

This started hitting an assertion recently. Only affects Haswell
(Ivybridge doesn't support this meddling with the sampler state pointer,
and ARB_gpu_shader5 is not enabled yet on Broadwell)

14 Piglits crash->pass.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
4242eb14c1dab83b29e63a4833400ff600cc9f96 24-Oct-2014 Jason Ekstrand <jason.ekstrand@intel.com> i965: Use the spill destination for the message header on GEN >= 7

Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
0d5c9bf1e46b2d4d5274b6ad9d7c34da34d77cd7 03-Oct-2014 Matt Turner <mattst88@gmail.com> Revert "i965: Emit ELSE/ENDIF JIP with type D on Gen 7."

This reverts commit 54e30dbf4db437748509d1319c3f6e4185f76c69.

Will investigate after XDC.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=84557
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
8b0e4b387a2aeb28e32df5b680013338a841859b 17-Sep-2014 Jason Ekstrand <jason.ekstrand@intel.com> i965/fs: Add a an optional source to the FS_OPCODE_FB_WRITE instruction

Previously, we were use the base_mrf parameter of fs_inst to store the MRF
location. In preparation for doing FB writes from the GRF, we now also
allow you to set inst->base_mrf to -1 and provide a source register.

Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
d25aaf1cb1688b38b2a4025dbbff26d74291723c 12-Sep-2014 Jason Ekstrand <jason.ekstrand@intel.com> i965/fs: Use the GRF for UNTYPED_ATOMIC instructions

Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
66ab9c22fecc053f099376c7b20958e0ffdf05ca 30-Aug-2014 Matt Turner <mattst88@gmail.com> i965: Use BRW_MATH_DATA_SCALAR when source regioning is scalar.

Notice the mistaken (but harmless) argument swapping in brw_math_invert().

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
54e30dbf4db437748509d1319c3f6e4185f76c69 28-Aug-2014 Matt Turner <mattst88@gmail.com> i965: Emit ELSE/ENDIF JIP with type D on Gen 7.

The spec says the type must be W (JIP is 16-bits after all), but we've
been emitting it with a UD type all along and have experienced no
adverse effects. Changing the type to D allows ELSE and ENDIF
instructions to be compacted.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
b5466707d6632e7dd019b36ced8da2b4ec7d5297 28-Aug-2014 Matt Turner <mattst88@gmail.com> i965: Set JumpCount, not JIP, on ENDIF on Gen 6.

Despite what the Sandybridge PRM says, ENDIF has Jump Count in <dst>,
not JIP in <src1>. (The same mistake appears about WHILE as well).

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
86dc34a0b0d967e9c8611bc29178fdb1de22c724 10-Aug-2014 Chris Forbes <chrisf@ijw.co.nz> i965: Generalize sampler state pointer mangling for non-const

For now, assume that the addressed sampler can be in any of the
16-sampler banks. If we preserved range information this far, we
could avoid emitting these instructions if the sampler were known
to be contained within one bank.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
8ce3fa8e91e96adac9ba909876d3b3066bdcd723 10-Aug-2014 Chris Forbes <chrisf@ijw.co.nz> i965: Extract helper function for surface state pointer adjustment

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
17e0fa9a066967ee7765d857e3a91f3a6bd4e566 02-Aug-2014 Chris Forbes <chrisf@ijw.co.nz> i965: Adjust set_message_descriptor to handle non-sends

We're about to be using this infrastructure to build descriptors in
src1 of non-send instructions, when preparing to do an indirect send.

Don't accidentally clobber the conditionalmod field of those
instructions with SFID bits, which aren't part of the descriptor.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
3512c79789e3b924c4f639a157cac7b80fea16f2 03-Aug-2014 Chris Forbes <chrisf@ijw.co.nz> i965: Add low-level support for indirect sends

This provides a reasonable place to enforce the hardware restriction
that indirect descriptors must be in a0.0

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
e84e074248efca9f5445d353fae970c8f1240de5 14-Aug-2014 Kenneth Graunke <kenneth@whitecape.org> Revert "i965/vec4: Use MOV, not OR, to set URB write channel mask bits."

This reverts commit af13cf609f4257768ad8b80be8cec7f2e6ca8c81, which
appears to cause huge performance problems on Ivybridge. I'd missed
that the FFTID bits are in the low byte. The documentation doesn't
indicate that the URB write message header actually wants FFTID - it
just labels those bits as "Reserved." But it appears necessary.

This does slightly more than revert the original change: originally,
Broadwell had separate code generation, which used MOV, and this patch
only changed it for Gen4-7. Now that both are unified, reverting this
also makes Broadwell use OR. Which should be fine.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
ce90fd9676c3dfce6d692671909ee28d86a534ae 10-Aug-2014 Kenneth Graunke <kenneth@whitecape.org> i965/eu: Set src0 file to IMM on Gen8+ flow control instructions.

According to the documentation, we need to set the source 0 register
type to IMM for flow control instructinos that have both JIP and UIP.
Out of paranoia, just make all flow control instructions use IMM;
there's no benefit to using ARF anyway, and it could trouble that's
difficult to diagnose.

See commit 9584959123b0453cf5313722357e3abb9f736aa7, which did the
analogous change in the gen8_generator code.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
d8ef0eab5a133ad9d8945a6b7f077fea000a87a6 10-Aug-2014 Kenneth Graunke <kenneth@whitecape.org> i965/eu: Refactor brw_WHILE to share a bit more code on Gen6+.

We're going to add a Gen8+ case shortly, which would need to duplicate
this code again. Instead, share it.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
aafdf9eef481a77810258b828e2a0b4e3c0aa696 29-Jun-2014 Kenneth Graunke <kenneth@whitecape.org> i965/eu: Emulate F32TO16 and F16TO32 on Broadwell.

When we combine the Gen4-7 and Gen8+ generators, we'll need to handle
half float packing/unpacking functions somehow. The Gen8+ generator
code today just emulates the behavior of the Gen7 F32TO16/F16TO32
instructions, including the align16 mode bugs.

Rather than messing with fs_generator/vec4_generator, I decided to just
emulate the instructions at the brw_eu_emit.c layer.

v2: Change gen >= 7 asserts to gen == 7 (suggested by Chris Forbes).
Fix regressions on Haswell in VS tests due to type assertions.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
af13cf609f4257768ad8b80be8cec7f2e6ca8c81 11-Aug-2014 Kenneth Graunke <kenneth@whitecape.org> i965/vec4: Use MOV, not OR, to set URB write channel mask bits.

g0.5 has nothing of value to contribute to m0.5. In both the VS and GS
payload, g0.5 contains the scratch space pointer - which is definitely
not of any use. The GS payload also contains FFTID, but the URB write
message header doesn't want FFTID.

The only reason I used OR was because Eric originally requested it.
On Broadwell, I used MOV, and that's worked out fine.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
9276ef6f41626307c3da2ed94a77c0d51b6d8efd 12-Jul-2014 Kenneth Graunke <kenneth@whitecape.org> i965/eu: Allow math on immediates on Broadwell.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
82ddd517afad7b133624e8dd32e90addfff27d1e 30-Jun-2014 Kenneth Graunke <kenneth@whitecape.org> i965/eu: Refactor jump distance scaling to use a helper function.

Different generations of hardware measure jump distances in different
units. Previously, every function that needed to set a jump target open
coded this scaling, or made a hardcoded assumption (i.e. just used 2).

Most functions start with the number of instructions to jump, and scale
up to the hardware-specific value. So, I made the function match that.

Others start with a byte offset, and divide by a constant (8) to obtain
the jump distance. This is actually 16 / 2 (the jump scale for Gen5-7).

v2: Make the helper a static inline defined in brw_eu.h, instead of
an actual function in brw_eu_emit.c (as suggested by Matt).

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
a1c899c718758d68c112590d826e16c772ace195 30-Jun-2014 Kenneth Graunke <kenneth@whitecape.org> i965/eu: Set UIP on ELSE instructions on Broadwell.

Broadwell adds UIP on ELSE instructions.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
7d41170b62570eafa0d3041a87cff9ad57ff418e 30-Jun-2014 Kenneth Graunke <kenneth@whitecape.org> i965/eu: Make it clear that brw_patch_break_count only runs on Gen4-5.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
0457464c3343b3809048249fa5c1c0867ef499dc 30-Jun-2014 Kenneth Graunke <kenneth@whitecape.org> i965/eu: Make it clear that brw_find_loop_end only runs on Gen6+.

It has Gen6+ knowledge baked in, and indeed is only called for Gen6+,
but it wasn't immediately obvious that this was the case.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
0d6adce469b36190224cd13173441e98870c695a 30-Jun-2014 Kenneth Graunke <kenneth@whitecape.org> i965/eu: Port Broadwell CMP destination type hack to brw_eu_emit.c.

See gen8_generator::CMP().

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
8609df97a087d025ad40efad8d71e2a56450ef8f 12-Jul-2014 Kenneth Graunke <kenneth@whitecape.org> i965/eu: Use Haswell atomic messages on Broadwell.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
e1bd2ca28a125676f45c97e28339feec6d766795 30-Jun-2014 Kenneth Graunke <kenneth@whitecape.org> i965/eu: Change gen == 7 to gen >= 7 in a couple brw_eu_emit.c cases.

Broadwell is going to use the brw_eu_emit.c code soon. We want to get
the fake MRF handling and URB HWord channel mask handling.

We don't need the CMP thread switch workaround, though.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
e9a9d441f0b1a0293e0901d0f6b99a946e51f6f4 04-Aug-2014 Kenneth Graunke <kenneth@whitecape.org> i965: Set ExecSize to 16 for loop instructions in SIMD16 shaders.

Previously, we explicitly set the execution size to BRW_EXECUTE_8 and
disabled compression for loop instructions. I can't imagine how this
could be correct in SIMD16 mode.

Looking at the history, it appears that this code has used BRW_EXECUTE_8
since 2007, when we had a SIMD8 backend that supported control flow and
a separate SIMD16 backend that didn't. Presumably, when we added SIMD16
support for shaders with control flow, we simply neglected to update it.

Note that Gen4-5 don't support SIMD16 on shaders with control flow.

This might be a candidate for stable, but would need to be rewritten
completely due to the brw_inst API changes in master.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
e64dbd050d6d5b4ea502ee2fc727e12135833771 04-Aug-2014 Kenneth Graunke <kenneth@whitecape.org> i965/eu: Merge brw_CONT and gen6_CONT.

The only difference is setting PopCount on Gen4-5.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
e7a7b3317c5992d230cf55752ef0b6bc25928ff9 04-Aug-2014 Kenneth Graunke <kenneth@whitecape.org> i965/eu: Drop redundant brw_set_src0/brw_set_dest from gen6_CONT.

We shouldn't need to set them, then set them differently.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
1e0da6233be6e5fb0143615d5e3d3642ddb7964f 25-Feb-2014 Kenneth Graunke <kenneth@whitecape.org> util: Move ralloc to a new src/util directory.

For a long time, we've wanted a place to put utility code which isn't
directly tied to Mesa or Gallium internals. This patch creates a new
src/util directory for exactly that purpose, and builds the contents as
libmesautil.la.

ralloc seemed like a good first candidate. These days, it's directly
used by mesa/main, i965, i915, and r300g, so keeping it in src/glsl
didn't make much sense.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>

v2 (Jason Ekstrand): More realloc uses and some scons fixes

Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
d732598b63eb0cd103f06bccd99d13d732028d79 17-Nov-2013 Chris Forbes <chrisf@ijw.co.nz> i965: add low-level support for send to pixel interpolator

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
1499619fe6815510d4448f0e297d097f20a0acf3 06-Jul-2014 Chris Forbes <chrisf@ijw.co.nz> i965: Fix two broken asserts in brw_eu_emit

These were looking in the wrong field.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
46e5b2a497216133be656b38ebfcf96da64b7744 30-Jun-2014 Matt Turner <mattst88@gmail.com> i965: Make a brw_conditional_mod enum.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
3d826729dabab53896cdbb1f453c76fab1c7e696 29-Jun-2014 Matt Turner <mattst88@gmail.com> i965: Use unreachable() instead of unconditional assert().

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
7c79608b5b8a7eb4bed9fa9d594c9bda696dd49a 13-Jun-2014 Matt Turner <mattst88@gmail.com> i965: Replace 'struct brw_instruction' with 'brw_inst'.

Use this an an opportunity to clean up the formatting of some old code
(brw_ADD, for instance).

Signed-off-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
829aac4b6783a6e7667293a60d97947d277cfa39 05-Jun-2014 Kenneth Graunke <kenneth@whitecape.org> i965: Convert brw_eu_emit.c to the new brw_inst API.

v2:
- Fix IF -> ELSE patching on Sandybridge.
- Don't set base_mrf on Gen6+ in OWord Block Read functions. (Although
- the old code did this universally, it shouldn't have - the field
- doesn't exist on Gen6+ and just got overwritten by the SFID anyway.)

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
05040d6f8fcfdc4fb070c7ff24d3990ffede77f1 08-Jun-2014 Kenneth Graunke <kenneth@whitecape.org> i965: Pass brw into next_offset().

The new brw_inst API is going to require a brw pointer in order
to access fields (so it can do generation checks). Plumb it in now.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
f3aecefa9930ce7dbdabdeefee0bd183172b586f 15-Jun-2014 Matt Turner <mattst88@gmail.com> i965: Silence warning about unused brw in release builds.

Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
ebc75245032f58bbd8d24900c1471e74eb768077 14-Jun-2014 Matt Turner <mattst88@gmail.com> Revert "i965: Add 'wait' instruction support"

This reverts commit 20be3ff57670529a410b30a1008a71e768d08428.

No evidence of ever being used.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
2c8520c03de135228c37d67c9ff9756e3febb660 11-Jun-2014 Matt Turner <mattst88@gmail.com> i965: Use brw->gen in some generation checks.

Will simplify the automated conversion if we want to allow compiling the
driver for a single generation.

Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
b003fc265fc672b35d15ce7c2d05e8b9c81c4ee9 07-Jun-2014 Kenneth Graunke <kenneth@whitecape.org> i965: Rename brw_math to gen4_math.

Usually, I try to use "brw" for functions that apply to all generations,
and "gen4" for dead end/legacy code that is only used on Gen4-5.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
de65ec2fdeb3a22d408db24535d738b39cc3402c 07-Jun-2014 Kenneth Graunke <kenneth@whitecape.org> i965: Split Gen4-5 and Gen6+ MATH instruction emitters.

Our existing functions, brw_math and brw_math2, had unclear roles:

Gen4-5 used brw_math for both unary and binary math functions; it never
used brw_math2. Since operands are already in message registers, this
is reasonable.

Gen6+ used brw_math for unary math functions, and brw_math2 for binary
math functions, duplicating a lot of code. The only real difference was
that brw_math used brw_null_reg() for src1.

This patch improves brw_math2's assertions to allow both unary and
binary operations, renames it to gen6_math(), and drops the Gen6+ code
out of brw_math().

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
000f4a33c0359ed6b3c11aafa5f0cba1d6d91fea 13-Dec-2012 Kenneth Graunke <kenneth@whitecape.org> i965: Don't set the "switch" flag on control flow instructions on Gen6+.

Thread switching on control flow instructions is a documented workaround
for Gen4-5 errata. As far as I can tell, it hasn't been needed since
Sandybridge. Thread switching is not free, so in theory this may help
performance slightly.

Flow control instructions with the "switch" flag cannot be compacted, so
removing it will make these instructions compactable. (Of course, we
still have to implement compaction for flow control instructions...)

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
8873120f9fb0c82cfd46cd15c39e66c38076cb0d 05-Jun-2014 Iago Toral Quiroga <itoral@igalia.com> Revert "i965: Move brw_land_fwd_jump() to compilation unit of its use."

This reverts commit f3cb2e6ed7059b22752a6b7d7a98c07ba6b5552e.

brw_land_fwd_jump() is convenient wherever we produce JMPI instructions
and we will use JMPI to implement framebuffer writes that involve line
antialiasing in gen < 6.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
220e208329e923faf50524c0adf72e4dcc931e49 05-Jun-2014 Kenneth Graunke <kenneth@whitecape.org> i965: Fix else and brace placement in brw_eu_emit.c.

I'm making a lot of changes to this area, and I figured I may as well
not conflate these trivial changes.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
1f3735bff08bdcd23a7f1f6565f072f3103d780b 06-Jun-2014 Kenneth Graunke <kenneth@whitecape.org> i965: Drop the remaining default predication whacking.

With my earlier cleaning in place (see git log brw_eu_emit.c), nothing
relies on the instruction emitters for IF/WHILE/JMPI disabling
predication. Drop it in favor of making callers do the right thing
explicitly.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
e374809819d82f2e3e946fe809c4d46061ddc5b5 01-Jun-2014 Kenneth Graunke <kenneth@whitecape.org> i965: Put '_default_' in the name of functions that set default state.

Eventually we're going to use functions to set bits on an instruction.
Putting 'default' in the name of functions that alter default state will
help distinguins them.

This patch was generated entirely mechanically, by the following:

for file in brw*.{cpp,c,h}; do
sed -i \
-e 's/brw_set_mask_control/brw_set_default_mask_control/g' \
-e 's/brw_set_saturate/brw_set_default_saturate/g' \
-e 's/brw_set_access_mode/brw_set_default_access_mode/g' \
-e 's/brw_set_compression_control/brw_set_default_compression_control/g' \
-e 's/brw_set_predicate_control/brw_set_default_predicate_control/g' \
-e 's/brw_set_predicate_inverse/brw_set_default_predicate_inverse/g' \
-e 's/brw_set_flag_reg/brw_set_default_flag_reg/g' \
-e 's/brw_set_acc_write_control/brw_set_default_acc_write_control/g' \
$file;
done

No manual changes were done after running that command.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
76d7160c6c76cd5cbf61ccfa178ffba4ea9aa93b 31-May-2014 Kenneth Graunke <kenneth@whitecape.org> i965: Delete brw_set_conditionalmod.

This removes the ability to set the default conditional modifier on all
future instructions. Nothing uses it, and it's not really a sensible
thing to do anyway.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
ff340ce3c3326959027d7cb9a611c6fab1d89941 31-May-2014 Kenneth Graunke <kenneth@whitecape.org> i965: Stop setting predication from brw_set_conditionalmod.

brw_set_conditionalmod has traditionally been complex: it causes
conditionalmod to be set for the next instruction, and then predication
to be set on all future instructions after that.

We may want to generate a flag condition and not use it immediately,
due to instruction scheduling or the like. Even if not, it's easy
to set things explicitly, and that's clearer.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
8deb91b2e75a65b979bd9d70c8700d2c38443336 28-May-2014 Kenneth Graunke <kenneth@whitecape.org> i965: Make brw_JMPI set predicate_control based on a parameter.

We use both predicated and unconditional JMPI instructions. But in each
case, it's clear which we want. It's simpler to just specify it as a
parameter, rather than relying on default state.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
3769a2d51f593b94638743e4a174ee5b8a3d5406 28-May-2014 Kenneth Graunke <kenneth@whitecape.org> i965: Remove the dst and src0 parameters from brw_JMPI.

In all cases, we set both dst and src0 to brw_ip_reg(). This is no
accident: according to the ISA reference, both are required to be the IP
register. So, we may as well drop the parameters.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
09655bb81b2a3767e678280631c49851ba9c022d 25-May-2014 Kenneth Graunke <kenneth@whitecape.org> i965: Don't implicitly set predicate default state in brw_CMP.

Previously, brw_CMP with a null destination implicitly set the default
state to make future instructions predicated. This is messy and
confusing - emitting a CMP that populates the flag register and later
using it to predicate instructions are logically separate. With the
main compiler, we may even schedule instructions between the CMP and the
user of the flag value.

This patch simplifies brw_CMP to just emit a CMP instruction, and not
mess with predication. It also updates all necessary callers. These
mostly fell into two patterns:

1. brw_CMP followed by brw_IF.

We don't need to do anything special here; brw_IF already sets up
predication appropriately.

2. brw_CMP followed by a single predicated instruction.

The old model was to call brw_CMP, emit the next (predicated)
instruction, then disable predication for any instructions beyond
that. Instead, just explicitly set predicate_control on the single
instruction we want to predicate. It's no more code, and requires
less cross-module knowledge.

This drops setting flag_value to 0xff as well, which is a field only
used by the SF compile. There is only one brw_CMP call in the SF code,
which is in do_twoside_caller, and called at the start of
brw_emit_tri_setup, where flag_value is already 0xff.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
4c7bf8a704c7d9f05fde6c8653734532b24bddd7 18-May-2014 Matt Turner <mattst88@gmail.com> i965: Switch types D->UD when possible to allow compaction.

Number of compacted instructions: 827404 -> 833045 (0.68%)

Reviewed-by: Eric Anholt <eric@anholt.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
492af22fb469190a6466ecbba4d4e3ce6a5c1715 19-May-2014 Matt Turner <mattst88@gmail.com> i965: Remove useless typo'd debugging messages.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
f3cb2e6ed7059b22752a6b7d7a98c07ba6b5552e 19-May-2014 Matt Turner <mattst88@gmail.com> i965: Move brw_land_fwd_jump() to compilation unit of its use.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
99af02fb1778a425f1f3d211018d294b8d3b2245 02-May-2014 Matt Turner <mattst88@gmail.com> i965: Emit 0.0:F sources with type VF instead.

Number of compacted instructions: 817752 -> 827404 (1.18%)

Reviewed-by: Eric Anholt <eric@anholt.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
fb977c90d1ef29f47b686c27500005025543cf11 02-May-2014 Matt Turner <mattst88@gmail.com> i965: Emit ARF:UD for non-present src1 on Gen6+.

Enables the next commits to compact more instructions.

Reviewed-by: Eric Anholt <eric@anholt.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
392cbc2f930b2505520e85b97b407cb6d4e17548 17-May-2014 Matt Turner <mattst88@gmail.com> i965: Move next_offset() to brw_eu.h for use elsewhere.

Also perform arithmetic on char* rather than void* since the latter is a
GNU C extension not available in C++.

Reviewed-by: Eric Anholt <eric@anholt.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
e32e69cc27471f91093d5ceba0a18b22b4e4f8c0 17-May-2014 Matt Turner <mattst88@gmail.com> i965: Rename next_ip() -> next_offset().

That we were comparing its return value with offsets should have been a
clue. :)

Make it take a void *store in preparation for making the function useful
elsewhere.

Reviewed-by: Eric Anholt <eric@anholt.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
2012599abb6710e0dae632edfe51de191438e76b 02-May-2014 Matt Turner <mattst88@gmail.com> i965: Reformat brw_set_src1 so it can be easily found with grep.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
83daa88035af978c5158cfe5a196df45ce1555c1 23-Dec-2013 Eric Anholt <eric@anholt.net> i965: Move the remaining driver debug over to stderr.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
d92f593d8776ec157ad0e7fa2ee8c9a17fd744ce 14-Feb-2014 Eric Anholt <eric@anholt.net> i965/fs: Use conditional sends to do FB writes on HSW+.

This drops the MOVs for header setup, which are totally mis-scheduled.

total instructions in shared programs: 1590047 -> 1589331 (-0.05%)
instructions in affected programs: 43729 -> 43013 (-1.64%)
GAINED: 0
LOST: 0

glb27-trex:
x before
+ after
+-----------------------------------------------------------------------------+
| + x xx + + + |
| ++ + xxx ++x xx + ** *x+ + + + x * |
|+x xx x* x+++xx*x*xx+++*+*xx++** *x* x+***x*+xx+* + * + + *|
| |__|__________MA___A___________|___| |
+-----------------------------------------------------------------------------+
N Min Max Median Avg Stddev
x 49 62.33 65.41 63.49 63.53449 0.62757822
+ 50 62.28 65.4 63.7 63.6982 0.656564
No difference proven at 95.0% confidence

Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
877128505431adaf817dc8069172ebe4a1cdf5d8 17-Jan-2014 José Fonseca <jfonseca@vmware.com> s/Tungsten Graphics/VMware/

Tungsten Graphics Inc. was acquired by VMware Inc. in 2008. Leaving the
old copyright name is creating unnecessary confusion, hence this change.

This was the sed script I used:

$ cat tg2vmw.sed
# Run as:
#
# git reset --hard HEAD && find include scons src -type f -not -name 'sed*' -print0 | xargs -0 sed -i -f tg2vmw.sed
#

# Rename copyrights
s/Tungsten Gra\(ph\|hp\)ics,\? [iI]nc\.\?\(, Cedar Park\)\?\(, Austin\)\?\(, \(Texas\|TX\)\)\?\.\?/VMware, Inc./g
/Copyright/s/Tungsten Graphics\(,\? [iI]nc\.\)\?\(, Cedar Park\)\?\(, Austin\)\?\(, \(Texas\|TX\)\)\?\.\?/VMware, Inc./
s/TUNGSTEN GRAPHICS/VMWARE/g

# Rename emails
s/alanh@tungstengraphics.com/alanh@vmware.com/
s/jens@tungstengraphics.com/jowen@vmware.com/g
s/jrfonseca-at-tungstengraphics-dot-com/jfonseca-at-vmware-dot-com/
s/jrfonseca\?@tungstengraphics.com/jfonseca@vmware.com/g
s/keithw\?@tungstengraphics.com/keithw@vmware.com/g
s/michel@tungstengraphics.com/daenzer@vmware.com/g
s/thomas-at-tungstengraphics-dot-com/thellstom-at-vmware-dot-com/
s/zack@tungstengraphics.com/zackr@vmware.com/

# Remove dead links
s@Tungsten Graphics (http://www.tungstengraphics.com)@Tungsten Graphics@g

# C string src/gallium/state_trackers/vega/api_misc.c
s/"Tungsten Graphics, Inc"/"VMware, Inc"/

Reviewed-by: Brian Paul <brianp@vmware.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
7db56ddee07fb365832881e05205d98f281cea80 24-Dec-2013 Eric Anholt <eric@anholt.net> i965: Warning fix

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
1e8e17ccd7a64fdde9b78d239d8a3c256006c984 10-Dec-2013 Kenneth Graunke <kenneth@whitecape.org> i965: Add support for Broadwell's new register types.

Broadwell introduces support for Q, UQ, and HF types. It also extends
DF support to allow immediate values.

Irritatingly, although HF and DF both support immediates, they're
represented by a different value depending on the register file.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
15b9aa22d7d40456d59a9686be302ef0078e083f 10-Dec-2013 Kenneth Graunke <kenneth@whitecape.org> i965: Add BRW_REGISTER_TYPE_DF.

Ivybridge, Baytrail, and Haswell support double float register types,
but do not support them as immediate values.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
54e91e742010728cbf6c5b8c00b6ca5019a63eb9 10-Dec-2013 Kenneth Graunke <kenneth@whitecape.org> i965: Abstract BRW_REGISTER_TYPE_* into an enum with unique values.

On released hardware, values 4-6 are overloaded. For normal registers,
they mean UB/B/DF. But for immediates, they mean UV/VF/V.

Previously, we just created #defines for each name, reusing the same
value. This meant we could directly splat the brw_reg::type field into
the assembly encoding, which was fairly nice, and worked well.

Unfortunately, Broadwell makes this infeasible: the HF and DF types are
represented as different numeric values depending on whether the
source register is an immediate or not.

To preserve sanity, I decided to simply convert BRW_REGISTER_TYPE_* to
an abstract enum that has a unique value for each register type, and
write translation functions. One nice benefit is that we can add
assertions about register files and generations.

I've chosen not to convert brw_reg::type to the enum, since converting
it caused a lot of trouble due to C++ enum rules (even though it's
defined in an extern "C" block...).

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
11d9af7c0ab76c551e676c5ce0f0f369d7fc9f97 26-Nov-2013 Kenneth Graunke <kenneth@whitecape.org> i965: Don't use GL types in files shared with intel-gpu-tools.

sed -i -e 's/GLuint/unsigned/g' -e 's/GLint/int/g' \
-e 's/GLfloat/float/g' -e 's/GLubyte/uint8_t/g' \
-e 's/GLshort/int16_t/g' \
brw_eu* brw_disasm.c brw_structs.h

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
d542c45c75671e93a1cfae1f0eaf9c12f082f4f1 26-Nov-2013 Kenneth Graunke <kenneth@whitecape.org> i965: Drop trailing whitespace from files shared with intel-gpu-tools.

Performed via s/ *$//g.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
8dfc9f038ee3f6a57f0a3f3cc641b0866a6111b7 16-Oct-2013 Eric Anholt <eric@anholt.net> i965/fs: Use the gen7 scratch read opcode when possible.

This avoids a lot of message setup we had to do otherwise. Improves
GLB2.7 performance with register spilling force enabled by 1.6442% +/-
0.553218% (n=4).

v2: Use BRW_PREDICATE_NONE, improve a comment (by Paul).

Reviewed-by: Paul Berry <stereotype441@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
e515dcbf96887faae743acb4771cb7375be0d6b8 20-Oct-2013 Francisco Jerez <currojerez@riseup.net> i965: Simplify the shader time code by using atomic counter helpers.

Reviewed-by: Paul Berry <stereotype441@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
5e621cb9fef7eada5a3c131d27f5b0b142658758 11-Sep-2013 Francisco Jerez <currojerez@riseup.net> i965/gen7: Implement code generation for untyped surface read instructions.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
cfaaa9bbb7a6ab5819f4fa9e38352b72d6293cff 11-Sep-2013 Francisco Jerez <currojerez@riseup.net> i965/gen7: Implement code generation for untyped atomic instructions.

Reviewed-by: Paul Berry <stereotype441@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
36fbe66d3a71df76fcb6f915846da4471b3a8442 10-Oct-2013 Eric Anholt <eric@anholt.net> i965/fs: Convert gen7 to using GRFs for texture messages.

Looking at Lightsmark's shaders, the way we used MRFs (or in gen7's
case, GRFs) was bad in a couple of ways. One was that it prevented
compute-to-MRF for the common case of a texcoord that gets used
exactly once, but where the texcoord setup all gets emitted before the
texture calls (such as when it's a bare fragment shader input, which
gets interpolated before processing main()). Another was that it
introduced a bunch of dependencies that constrained scheduling, and
forced waits for texture operations to be done before they are
required. For example, we can now move the compute-to-MRF
interpolation for the second texture send down after the first send.

The downside is that this generally prevents
remove_duplicate_mrf_writes() from doing anything, whereas previously
it avoided work for the case of sampling from the same texcoord twice.
However, I suspect that most of the win that originally justified that
code was in avoiding the WAR stall on the first send, which this patch
also avoids, rather than the small cost of the extra instruction. We
see instruction count regressions in shaders in unigine, yofrankie,
savage2, hon, and gstreamer.

Improves GLB2.7 performance by 0.633628% +/- 0.491809% (n=121/125, avg of
~66fps, outliers below 61 dropped).

Improves openarena performance by 1.01092% +/- 0.66897% (n=425).

No significant difference on Lightsmark (n=44).

v2: Squash in the fix for register unspilling for send-from-GRF, fixing a
segfault in lightsmark.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
44f0777f17df13f504602e76d7f5ba0edc642081 27-Sep-2013 Chia-I Wu <olv@lunarg.com> i965: make BRW_COMPRESSION_2NDHALF valid for brw_SAMPLE

SIMD8 sampler messages are allowed in SIMD16 mode, and they could not work
without BRW_COMPRESSION_2NDHALF. Later PRMs (gen5 and later) do not
explicitly state whether BRW_COMPRESSION_2NDHALF is allowed, but they do have
examples using send with SecHalf. It should be safe to assume SecHalf is
valid.

Signed-off-by: Chia-I Wu <olv@lunarg.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
014cce3dc49f5b0bfd7fbb1940ed661c9fc7bbd7 19-Sep-2013 Matt Turner <mattst88@gmail.com> i965: Generate code for ir_binop_carry and ir_binop_borrow.

Using the ADDC and SUBB instructions on Gen7.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
34b11334d417fae65ebe9cf96980aea581e24893 17-Sep-2013 Kenneth Graunke <kenneth@whitecape.org> i965: Fix writemask != 0 assertions on Sandybridge.

This fixes myriads of regressions since commit 169f9c030c16d1247a3a7629
("i965: Add an assertion that writemask != NULL for non-ARFs.").

On Sandybridge, our control flow handling (such as brw_IF) does:

brw_set_dest(p, insn, brw_imm_w(0));
insn->bits1.branch_gen6.jump_count = 0;

This results in a IMM destination with zero for the writemask. IMM
destinations are rather bizarre, but the code has been working for ages,
so I'm loathe to change it.

Fixes glxgears on Sandybridge.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
a74af8148ded7417a46be5a9e300f2c6dfed4bed 12-Aug-2013 Paul Berry <stereotype441@gmail.com> i965/gen7: Add the ability to send URB_WRITE_OWORD messages.

Previously, brw_urb_WRITE() would always generate a URB_WRITE_HWORD
message, we always wanted to write data to the URB in pairs of varying
slots or larger (an HWORD is 32 bytes, which is 2 varying slots).

In order to support geometry shader EndPrimitive functionality, we'll
need the ability to write to just a single OWORD (16 byte) slot, since
we'll only be outputting 32 of the control data bits at a time. So
this patch adds a flag that will cause brw_urb_WRITE to generate a
URB_WRITE_OWORD message.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
bf5419e389a4a8339699e25ddb6cbe902cc22357 11-Aug-2013 Paul Berry <stereotype441@gmail.com> i965/gen7: Allow URB_WRITE channel masks to be used.

Previously, brw_urb_WRITE() would unconditionally override the channel
masks in the URB_WRITE message to 0xff (indicating that all channels
should be written to the URB).

In order to support geometry shader EndPrimitive functionality, we'll
need the ability to set the channel masks programatically, so that we
can output just 32 of the control data bits at a time. So this patch
adds a flag that will prevent brw_urb_WRITE() from overriding them.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
169f9c030c16d1247a3a762972d8687d89a16750 10-Sep-2013 Kenneth Graunke <kenneth@whitecape.org> i965: Add an assertion that writemask != NULL for non-ARFs.

We've observed GPU hangs on Ivybridge from the following instruction:

mov(8) g115<1>.F 0D { align16 WE_normal NoDDChk 1Q };

There should be no reason to ever set the writemask on a destination
register to zero, except for perhaps the ARF NULL register.

This patch adds an assertion to enforce this for non-ARF registers.
Excluding ARFs is conservative yet should still catch the majority
of mistakes.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
4a6100054c1702e08fea898d6a30050aadf36bcb 29-Aug-2013 Matt Turner <mattst88@gmail.com> i965: Remove never used RSR and RSL opcodes.

RSR and RSL are listed in the "Defeatured Instructions" section of the
965 PRM, Volume 4:

"The following instructions are removed from Gen4 implementation mainly
due to implementation cost/schedule reasons. They are candidates for
future generations."

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
cfe39ea14edc8db13c549b853b214e676f8276f1 23-Aug-2013 Paul Berry <stereotype441@gmail.com> i965: Allow C++ type safety in the use of enum brw_urb_write_flags.

(From a suggestion by Francisco Jerez)

If an enum represents a bitfield of flags, e.g.:

enum E {
A = 1,
B = 2,
C = 4,
D = 8,
};

then C++ normally prohibits statements like this:

enum E x = A | B;

because A and B are implicitly converted to ints before OR-ing them,
and an int can't be stored in an enum without a type cast. C, on the
other hand, allows an int to be implicitly converted to an enum
without casting.

In the past we've dealt with this situation by storing flag bitfields
as ints. This avoids ugly casting at the expense of some type safety
that C++ would normally have offered (e.g. we get no warning if we
accidentally use the wrong enum type).

However, we can get the best of both worlds if we override the |
operator. The ugly casting is confined to the operator overload, and
we still get the benefit of C++ making sure we don't use the wrong
enum type.

v2: Remove unnecessary comment and unnecessary use of "enum" keyword.
Use static_cast.

Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
eaa63cbbc2f5ae415fc89ef6fd74c5b26ad622fd 21-Mar-2013 Paul Berry <stereotype441@gmail.com> i965/gs: Add a flag allowing URB write messages to use a per-slot offset.

This will be used by geometry shaders to implement the EmitVertex()
function, since it requires writing data to a dynamically-determined
offset within the geometry shader's URB entry.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
a9e8c10bd76f9a94b878b76bb5ae696beeaae2e0 11-Aug-2013 Paul Berry <stereotype441@gmail.com> i965: Combine 4 boolean args of brw_urb_WRITE into a flags bitfield.

The arguments to brw_urb_WRITE() were getting pretty unwieldy, and we
have to add more flags to support geometry shaders anyhow.

Also plumb these flags through brw_clip_emit_vue(),
brw_set_urb_message(), and the vec4_instruction class.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
a0c8e762026acf2300951ac8a6b6bc293de4a4b1 10-Jul-2013 Kenneth Graunke <kenneth@whitecape.org> i965: Cite the Ivybridge PRM for why the fake MRF range is what it is.

The exact text is in the public docs, so we should cite those.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
53631be4ebaa4fb13a7f129727c1cdd32fcc6f3d 06-Jul-2013 Kenneth Graunke <kenneth@whitecape.org> i965: Move intel_context::gen and gt fields to brw_context.

Most functions no longer use intel_context, so this patch additionally
removes the local "intel" variables to avoid compiler warnings.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Chris Forbes <chrisf@ijw.co.nz>
Acked-by: Paul Berry <stereotype441@gmail.com>
Acked-by: Anuj Phogat <anuj.phogat@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
794de2f3873bcedc78300b3ba69656adc755894c 06-Jul-2013 Kenneth Graunke <kenneth@whitecape.org> i965: Move intel_context::is_<platform> flags to brw_context.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Chris Forbes <chrisf@ijw.co.nz>
Acked-by: Paul Berry <stereotype441@gmail.com>
Acked-by: Anuj Phogat <anuj.phogat@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
55272883acc8a5a6cf4d725bfd4713e7d347ce3b 13-Jun-2013 Kenneth Graunke <kenneth@whitecape.org> i965: Remove broken source type assertions from brw_alu3().

Commit 526ffdfc033ab01cf133cb7e8290c65d12ccc9be attempted to generalize
the source register type assertions to allow D and UD. However, the
src1 and src2 assertions actually checked src0.type against D and UD due
to a copy and paste bug.

It also began setting the source and destination register types based on
dest.type, ignoring src0/src1/src2.type completely. BFE and BFI2 may
actually pass mixed D/UD types and expect them to be ignored, which is
arguably a bit sloppy, but not too crazy either.

This patch simply removes the source register assertions as those values
aren't used anyway. It also clarifies the comment above the block that
sets the register types.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
9321f3257f0199c5988fd2e220874acd8b7f0a53 13-Jun-2013 Kenneth Graunke <kenneth@whitecape.org> i965: Add back strict type assertions for MAD and LRP.

Commit 526ffdfc033ab01cf133cb7e8290c65d12ccc9be relaxed the type
assertions in brw_alu3 to allow D/UD types (required by BFE and BFI2).
This lost us the strict type checking for MAD and LRP, which require
all four types to be float.

This patch adds a new ALU3F wrapper which checks these once again.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
fa958182b7e7a9a177ec45ffd39d42f15ca756b3 10-Apr-2013 Matt Turner <mattst88@gmail.com> i965: Add support for emitting and disassembling bit instructions.

Specifically
bfe - for bitfieldExtract()
bfi1 and bfi2 - for bitfieldInsert()
bfrev - for bitfieldReverse()
cbit - for bitCount()
fbh - for findMSB()
fbl - for findLSB()

Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
526ffdfc033ab01cf133cb7e8290c65d12ccc9be 17-Apr-2013 Matt Turner <mattst88@gmail.com> i965/gen7: Set src/dst types for 3-src instructions.

Also update asserts to allow BFE and BFI2, which take (unsigned)
doubleword arguments.

v2: Allow BRW_REGISTER_TYPE_UD for src1 and src2 as well.
Assert that src2.type (instead of src0.type) matches dest.type since
it's the primary argument and src0 and src1 might correctly have
different types.

Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> [v1]
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
ec646e465493ffc12caeccad01a9333f82e85517 21-Apr-2013 Matt Turner <mattst88@gmail.com> i965: Apply CMP NULL {Switch} work-around to other Gen7s.

Listed in the restrictions section of CMP, but not on the work-arounds
page.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
e87015f5089a2c4a99e0288e1adeeabb5a7ca7e3 19-Apr-2013 Matt Turner <mattst88@gmail.com> Revert "i965: Check reg.nr for BRW_ARF_NULL instead of reg.file."

This reverts commit ecdda414d361ab4430fd5747c9217687c1f3d63f.

Commit was supposed to be a simple typo fix. Clearly needs more
investigating.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=63688
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
ecdda414d361ab4430fd5747c9217687c1f3d63f 16-Apr-2013 Matt Turner <mattst88@gmail.com> i965: Check reg.nr for BRW_ARF_NULL instead of reg.file.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
60e4c994884ac10954f341db13a4c9c410c4dd0e 15-Apr-2013 Matt Turner <mattst88@gmail.com> i965: Implement work-around for CMP with null dest on Haswell.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
5c5218ea6163f694a256562df1d73a108396e40d 19-Mar-2013 Eric Anholt <eric@anholt.net> i965/fs: Switch shader_time writes to using GRFs.

This avoids conflicts between shader_time and FB writes, so we can include
more of the program under our profiling. This does mean hiding more of
the message setup from the optimizer, which doesn't have a way to handle
multi-reg sends from GRFs.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
76ba30800d08149386c0bc6a6c5efc50590d3048 23-Mar-2013 Paul Berry <stereotype441@gmail.com> i965/gen7: Use WE_all mode when enabling channel masks for URB write.

Gen7 adds mask bits to the message header for a URB write which allow
the write to apply only to certain channels. We don't use this
functionality, so to ensure that the entire write always occurs, we
emit an OR instruction to set the mask bits.

With the advent of geometry shaders, URB writes won't just happen at
the end of a thread; they will happen in mid-thread too. Thus, we can
no longer rely on channel 0 being enabled, so we need to emit the OR
instruction in WE_all mode to ensure that it is executed.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
f27a220cadd1326e6293a2c3fb945b7765a85da4 07-Feb-2013 Kenneth Graunke <kenneth@whitecape.org> i965: Fix INTEL_DEBUG=shader_time for Haswell.

Haswell's "Data Cache" data port is a single unit, but split into two
SFIDs to allow for more message types without adding more bits in the
message descriptor.

Untyped Atomic Operations are now message 0010 in the second data cache
data port, rather than 6 in the first.

v2: Use the #defines from the previous commit. (by anholt)

NOTE: This is a candidate for the 9.1 branch.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net> (v1)
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
015a48743dfcf138cce5752098e01a6cfd6efefe 02-Dec-2012 Kenneth Graunke <kenneth@whitecape.org> i965: Add support for emitting the LRP instruction.

Like MAD, this is another three-source instruction.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
d5c3aa89dc4a8de048920470300bb8a921e72899 11-Feb-2013 Matt Turner <mattst88@gmail.com> i965/gen7: Relax restrictions on fake MRFs

Gen6 has write-only MRF registers, and for ease of implementation we
paritition off 16 general purposes registers to act as MRFs on Gen7.

Knowing that our Gen7 MRFs are actually GRFs, we can do things we can't
do with real MRFs:
- read from them;
- return values directly to them from a send instruction; and
- compute directly to them with math instructions.

Reviewed-by: Eric Anholt <eric@anholt.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
516d8be502885f5aadcc43aafe764e617f2593f4 06-Feb-2013 Eric Anholt <eric@anholt.net> i965: Remove writemask support from brw_SAMPLE().

The code was rather broken for non-XYZW on 8-wide, but all of our
callers were using XYZW anyway. For my experiments with using writemask
on texturing, I've been using manual header setup in the compiler
backends, since we want to actually know what registers are written for
optimization and register allocation.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
7093558b311b223004845d0a422eb88bed15b418 09-Jan-2013 Chad Versace <chad.versace@linux.intel.com> i965: Quote the PRM on a HorzStride subtlety

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
7e21910f233a8ff6e2c4adaee6b4edd2f70b6c68 09-Jan-2013 Chad Versace <chad.versace@linux.intel.com> i965: Add opcodes for F32TO16 and F16TO32

The GLSL ES 3.00 operations packHalf2x16 and unpackHalf2x16 will emit
these opcodes.

- Define the opcodes BRW_OPCODE_{F32TO16,F16TO32}.
- Add the opcodes to the brw_disasm table.
- Define convenience functions brw_{F32TO16,F16TO32}.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Acked-by: Paul Berry <stereotype441@gmail.com>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
622d96aae499445f12861214354a5b9f63e3a738 02-Jan-2013 Vinson Lee <vlee@freedesktop.org> i965: Add break statement at end of BRW_OPCODE_CONTINUE case.

Fixes missing break in switch defect reported by Coverity.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
7baf9198b201666fc0f20fe407d7b46ee0ca7ef5 12-Dec-2012 Eric Anholt <eric@anholt.net> i965: Also consider HALTs a potential block end.

The final halt of the fragment shader turns off the remaining channels,
then jumps such that everything is turned back on. So, we can have our
last ENDIF of the shader point at that directly.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
2702202290b55a9c8b61f02f7ae0af8f4a53f0e2 12-Dec-2012 Kenneth Graunke <kenneth@whitecape.org> i965: Jump to the end of the next outer conditional block on ENDIFs.

From the Ivybridge PRM, Volume 4, Part 3, section 6.24 (page 172):

"The endif instruction is also used to hop out of nested conditionals by
jumping to the end of the next outer conditional block when all
channels are disabled."

Also:
"Pseudocode:
Evaluate(WrEn);
if ( WrEn == 0 ) { // all channels false
Jump(IP + JIP);
}"

First, ENDIF re-enables any channels that were disabled because they
didn't match the conditional. If any channels are active, it proceeds
to the next instruction (IP + 16). However, if they're all disabled,
there's no point in walking through all of the instructions that have no
effect---it can jump to the next instruction that might re-enable some
channels (an ELSE, ENDIF, or WHILE).

Previously, we always set JIP on ENDIF instructions to 2 (which is
measured in 8-byte units). This made it do Jump(IP + 16), which just
meant it would go to the next instruction even if all channels were off.

It turns out that walking over instructions while all the channels are
disabled like this is worse than just instruction dispatch overhead: if
there are texturing messages, it still costs a couple hundred cycles to
not-actually-read from the texture results.

This patch finds the next instruction that could re-enable channels and
sets JIP accordingly.

Reviewed-by: Eric Anholt <eric@anholt.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
beafced21c3c11315a8b94f20508562729453175 06-Dec-2012 Eric Anholt <eric@anholt.net> i965/fs: Improve performance of shaders that start out with a discard.

I had tried this in the past, but ran into trouble with applications
that sample from undiscarded pixels in the same subspan. To fix that
issue, only jump to the end for an entire subspan at a time.

Improves GLbenchmark 2.7 (1024x768) performance by 7.9 +/- 1.5% (n=8).

v2: Drop the br variable in the jump instruction -- if I ever do jumps
pre-gen6, it'll be a different code block anyway since we don't have
HALT until gen6.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
71f06344a0d72a6bd27750ceca571fc016b8de85 27-Nov-2012 Eric Anholt <eric@anholt.net> i965: Add a debug flag for counting cycles spent in each compiled shader.

This can be used for two purposes: Using hand-coded shaders to determine
per-instruction timings, or figuring out which shader to optimize in a
whole application.

Note that this doesn't cover the instructions that set up the message to
the URB/FB write -- we'd need to convert the MRF usage in these
instructions to GRFs so that our offsets/times don't overwrite our
shader outputs.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (v1)

v2: Check the timestamp reset flag in the VS, which is apparently
getting set fairly regularly in the range we watch, resulting in
negative numbers getting added to our 32-bit counter, and thus large
values added to our uint64_t.
v3: Rebase on reladdr changes, removing a new safety check that proved
impossible to satisfy. Add a comment to the AOP defs from Ken's
review, and put them in a slightly more sensible spot.
v4: Check timestamp reset in the FS as well.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
78e9c57a3ec6930e818b83af960dcb40d09daa5a 09-Nov-2012 Eric Anholt <eric@anholt.net> i965: Add a header_present flag for setting up dp read messages.

As of gen7, we can skip the header on some messages, and this can make
optimization on those messages much nicer when you've got GRFs instead of MRFs
as the source.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
8f05b2f2b022cb80c9e49d2ceb212d3b4f23905b 09-Nov-2012 Eric Anholt <eric@anholt.net> i965/gen7: Add some safety checks for send messages from GRFs.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
4cb8b946d9b404da4f72cdb70d07c2283df3922e 07-Nov-2012 Vinson Lee <vlee@freedesktop.org> i965: Fix assertion in brw_alu3.

Fixes side effect in assertion defect reported by Coverity.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
109a97dbd2748eaf4813d9f0c7d9ea396daeddbc 26-Oct-2012 Kenneth Graunke <kenneth@whitecape.org> i965: Remove VS constant buffer read support from brw_eu_emit.c.

brw_vec4_emit.cpp implements this directly; only the old backend used
the brw_eu_emit.c code.

Reviewed-by: Eric Anholt <eric@anholt.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
b6346749a839325e938fbb225af06006bc711ac5 08-Oct-2012 Kenneth Graunke <kenneth@whitecape.org> i965: Delete some dead code from brw_eu_emit.c.

Presumably some of this was used by the old fragment shader backend.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
f5e2706395904eb515a04c71966d7b96972f221a 03-Feb-2012 Eric Anholt <eric@anholt.net> i965: Prepare the break/cont uip/jip setting for compacted instructions.

The first cut at instruction compaction won't compact things that
would change control flow jump distances, but we do need to still be
able to walk the instruction stream, which involves jumping by 8 or 16
bytes between instructions.

Reviewed-by: Paul Berry <stereotype441@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
f2bd3e70b5e643f9c03a5003965861281f206fd3 03-Feb-2012 Eric Anholt <eric@anholt.net> i965: Move program dump to a helper function in brw_eu.c.

It's going to get more complicated when we do instruction compaction. This
also introduces putting the program offset in the output.

v2: Use next_insn_offset in brw_get_program(), too.

Reviewed-by: Paul Berry <stereotype441@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
9b4053cabd8bda180b352d2d2047209f6ca5f6e8 06-Aug-2012 Eric Anholt <eric@anholt.net> i965: Drop the confusing saturate argument to math instruction setup.

This was ridiculous. We were ignoring the inst->header.saturate flag in the
case of math and only math. On gen4, we would leave inst->header.saturate in
place if it happened to be set, which would end up being applied to the
implicit mov and thus trash the first argument. On gen6, we would overwrite
inst->header.saturate with the saturate flag from the argument, which was not
set appropriately in brw_vec4_emit.cpp, and was only not a bug due to our
incompetence at coalescing saturate moves.

By ripping the argument out and making saturate work just like all the other
brw_eu_emit.c code generation, we can avoid both these classes of bugs.

Fixes piglit fog-modes, and the new specific fs-saturate-exp2 case.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=48628
NOTE: This is a candidate for the 8.0 branch.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
148c8e639da7ee10fc9e002e3c6d60e17d218b21 19-Jul-2012 Paul Berry <stereotype441@gmail.com> i965: Use sendc for all render target writes on Gen6+.

The sendc instruction causes the fragment shader thread to wait for
any dependent threads (i.e. threads rendering to overlapping pixels)
to complete before sending the message. We need to use sendc on the
first render target write in order to guarantee that fragment shader
outputs are written to the render target in the correct order.

Previously, we only used the "sendc" instruction when writing to
binding table index 0. This did the right thing for fragment shaders,
because our fragment shader back-ends always issue their first render
target write to binding table index 0. However, it did the wrong
thing for blorp, which performs its render target writes to binding
table index 1.

A more robust solution is to use sendc for all render target writes.
This should not produce any performance penalty, since after the first
sendc, all of the dependent threads will have completed.

For more information about sendc, see the Ivy Bridge PRM, Vol4 Part3
p218 (sendc - Conditional Send Message), and p54 (TDR Registers).

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
6a27506181b29c8b7eda7bd6cf80689f849e181d 07-Jul-2012 Paul Berry <stereotype441@gmail.com> i965: Add support for AVG instruction.

From the Ivy Bridge PRM, Vol4 Part3 p152:

"The avg instruction performs component-wise integer average of
src0 and src1 and stores the results in dst. An integer average
uses integer upward rounding. It is equivalent to increment one to
the addition of src0 and src1 and then apply an arithmetic right
shift to this intermediate value."

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
29362875f2613ad87abe7725ce3c56c36d16cf9b 25-Apr-2012 Eric Anholt <eric@anholt.net> i965/gen6+: Add support for GL_ARB_blend_func_extended.

v2: Add support for gen6, and don't turn it on if blending is
disabled. (fixes GPU hang), and note it in docs/GL3.txt

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
9e9ae280e215988287b0f875c81bc2e146b9f5dd 04-May-2012 Eric Anholt <eric@anholt.net> Revert "i965/fs: Jump from discard statements to the end of the program when done."

This reverts commit 31866308fcf989df992ace28b5b986c3d3770e90.

Fixes piglit glsl-fs-discard-exit-3 and unigine tropics rendering.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
31866308fcf989df992ace28b5b986c3d3770e90 19-Dec-2011 Eric Anholt <eric@anholt.net> i965/fs: Jump from discard statements to the end of the program when done.

From the GLSL 1.30 spec:

The discard keyword is only allowed within fragment shaders. It
can be used within a fragment shader to abandon the operation on
the current fragment. This keyword causes the fragment to be
discarded and no updates to any buffers will occur. Control flow
exits the shader, and subsequent implicit or explicit derivatives
are undefined when this control flow is non-uniform (meaning
different fragments within the primitive take different control
paths).

v2: Don't emit the final HALT if no other HALTs were emitted.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (v1)
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
172bb92db1a3c317867d9cfec6f15c09c37a0f6c 19-Feb-2012 Kenneth Graunke <kenneth@whitecape.org> i965: Only set Last Render Target Select on the last FB write.

Fixes GPU hangs in OilRush, Trine, and Amnesia: The Dark Descent,
which all use MRT (multiple render targets).

NOTE: This is a candidate for release branches.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=38720
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=40059
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=45216
Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
2b28fd6ca603df40a5d02aac4035eced3a1d079a 22-Mar-2010 Eric Anholt <eric@anholt.net> i965: Add support for the MAD opcode on gen6+.

v2: Fix MRF handling on gen7.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (v1)
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
e910241e9754b6e673ed0fc3133c8b1de56e76c7 27-Jan-2012 Eric Anholt <eric@anholt.net> i965/fs: Fix rendering corruption in unigine tropics.

We were allocating registers into the MRF hack region, resulting in
sparkly renering in a few of the scenes. We could do better
allocation by making an MRF class, having MRFs conflict with the
corresponding GRFs, and tracking the live intervals of the "MRF"s and
setting up the conflicts. But this is way easier for the moment.

NOTE: This is a candidate for the 8.0 branch.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
5f4575d42fdaaf671d4b3cdcf2c733ad9d35d339 26-Jan-2012 Kenneth Graunke <kenneth@whitecape.org> i965: Expose brw_set_sampler_message for use outside brw_eu_emit.c.

brw_SAMPLE is full of complex workarounds for original Broadwater
hardware, and I'd rather avoid all that for my next Ivybridge patch.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
5acc7f38d42859db459567d4442c18764a4072e7 17-Jan-2012 Kenneth Graunke <kenneth@whitecape.org> i965: Bump Ivybridge's fake MRF range to g112-127 instead of g111-126.

When I originally implemented the hack to use GRFs 111+ as fake MRFs, I
did so purely to avoid rewriting all the code that dealt with MRFs.
However, it turns out that a similar hack is actually required.

Newly discovered language in the BSpec indicates that SEND instructions
with EOT set "should" use g112-g127 as their source registers. Based on
assertions in the simulator, this is actually a requirement on certain
platforms.

Since we're faking MRFs already, we may as well use the officially
sanctioned range. My guess is that we avoided this issue because we
seldom use m0: URB writes in the new VS backend start at m1, and RT
writes in the new FS backend start at m2.

NOTE: This is a candidate for stable release branches.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
bf2c7469fba256e8d5fb3b5c6c130204550ec253 30-Dec-2011 Eric Anholt <eric@anholt.net> i965: Silence gcc warning from resizing EU store changes.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
3aa3c3f75894ca0eb08087c0ec3dd114eeae4bb7 21-Dec-2011 Yuanhan Liu <yuanhan.liu@linux.intel.com> i965: increase the brw eu instruction store size dynamically

Here is the final patch to enable dynamic eu instruction store size:
increase the brw eu instruction store size dynamically instead of just
allocating it statically with a constant limit. This would fix something
that 'GL_MAX_PROGRAM_INSTRUCTIONS_ARB was 16384 while the driver would
limit it to 10000'.

v2: comments from ken, do not hardcode the eu limit to (1024 * 1024)

Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
8d1b378939768c4054b35b5da592af102345ebed 21-Dec-2011 Yuanhan Liu <yuanhan.liu@linux.intel.com> i965: call next_insn() before referencing a instruction by index

A single next_insn may change the base address of instruction store
memory(p->store), so call it first before referencing the instruction
store pointer from an index.

This the final prepare work to enable the dynamic store size.

v2: comments from Ken, define emit_endif as bool type

Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
328e6a5497e54b0e8aed803cf6d2ae9a2a00b2fe 21-Dec-2011 Yuanhan Liu <yuanhan.liu@linux.intel.com> i965: get the jmp distance by instruction index

If dynamic instruction store size is enabled, while after the brw_JMPI()
and before the brw_land_fwd_jump() function, the eu instruction store
base address(p->store) may change. Thus, the safe way to reference the
jmp instruction is by index instead of by the instruction address.

v2: comments from Eric, don't change the prototype of brw_JMPI

Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
0a17093eaf84696b05d04a45d6d51281f7b2786b 21-Dec-2011 Yuanhan Liu <yuanhan.liu@linux.intel.com> i965: let the if_stack just store the instruction index

If dynamic instruction store size is enabled, while after
the brw_IF/ELSE() and before the brw_ENDIF() function, the
eu instruction store base address(p->store) may change.

Thus let if_stack just store the instruction index. This is
somehow more flexible and safe than store the instruction
memory address.

Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
f1d89638c02afafbf82ef657cd6ba9965dad6738 06-Dec-2011 Eric Anholt <eric@anholt.net> i965: Don't make consumers of brw_CONT/brw_WHILE track if depth in loop.

The codegen backends all had this same tracking, so just do it at the
EU level.

Reviewed-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
ce6be334bbf7f44c71ad5d190f9fb075d2f9a38c 06-Dec-2011 Eric Anholt <eric@anholt.net> i965: Don't make consumers of brw_WHILE do pre-gen6 BREAK/CONT patching.

The EU code itself can just do this work, since all the consumers were
duplicating it.

Reviewed-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
32118cfe37495738ed5931c6b1a71b8ee2ad189c 06-Dec-2011 Eric Anholt <eric@anholt.net> i965: Don't make consumers of brw_DO()/brw_WHILE() track loop start.

This is a similar cleanup to what we did for brw_IF(), brw_ELSE(),
brw_ENDIF() handling.

Reviewed-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
9f8814752f306cb9a26d283f0b7cf876639e10f7 06-Dec-2011 Eric Anholt <eric@anholt.net> i965: Drop unused do_insn argument from gen6_CONT().

The branch distances get patched up later at the WHILE instruction.

Reviewed-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
9308f298300beaa757194a0db8ed50924754c011 28-Nov-2011 Paul Berry <stereotype441@gmail.com> i965 gen6: Initial implementation of transform feedback.

This patch adds basic transform feedback capability for Gen6 hardware.
This consists of several related pieces of functionality:

(1) In gen6_sol.c, we set up binding table entries for use by
transform feedback. We use one binding table entry per transform
feedback varying (this allows us to avoid doing pointer arithmetic in
the shader, since we can set up the binding table entries with the
appropriate offsets and surface pitches to place each varying at the
correct address).

(2) In brw_context.c, we advertise the hardware capabilities, which
are as follows:

MAX_TRANSFORM_FEEDBACK_INTERLEAVED_COMPONENTS 64
MAX_TRANSFORM_FEEDBACK_SEPARATE_ATTRIBS 4
MAX_TRANSFORM_FEEDBACK_SEPARATE_COMPONENTS 16

OpenGL 3.0 requires these values to be at least 64, 4, and 4,
respectively. The reason we advertise a larger value than required
for MAX_TRANSFORM_FEEDBACK_SEPARATE_COMPONENTS is that we have already
set aside 64 binding table entries, so we might as well make them all
available in both separate attribs and interleaved modes.

(3) We set aside a single SVBI ("streamed vertex buffer index") for
use by transform feedback. The hardware supports four independent
SVBI's, but we only need one, since vertices are added to all
transform feedback buffers at the same rate. Note: at the moment this
index is reset to 0 only when the driver is initialized. It needs to
be reset to 0 whenever BeginTransformFeedback() is called, and
otherwise preserved.

(4) In brw_gs_emit.c and brw_gs.c, we modify the geometry shader
program to output transform feedback data as a side effect.

(5) In gen6_gs_state.c, we configure the geometry shader stage to
handle the SVBI pointer correctly.

Note: ordering of vertices is not yet correct for triangle strips
(alternate triangles are improperly oriented). This will be addressed
in a future patch.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
fd5d0c8b12a0e144aa8e95540c0da2161d8e089a 05-Dec-2011 Paul Berry <stereotype441@gmail.com> i965 gen6+: Use 1-wide null operands for IF instructions

The Sandy Bridge PRM, volume 4, part 2, section 5.3.10 ("5.3.10
Register Region Restrictions") contains the following restriction on
the execution size and operand width of instructions:

"3. ExecSize must be equal to or greater than Width."

When emitting an IF instruction in single program flow mode on Gen6+,
we use an ExecSize of 1, therefore the Width of each operand must also
be 1.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
dabe15da4f81546b5c9fca8c208d31bfe98ada9f 25-Nov-2011 Paul Berry <stereotype441@gmail.com> i965: Only convert if/else to conditional adds prior to Gen6.

Normally when outputting instructions in SPF (single program flow)
mode, we convert IF and ELSE instructions to conditional ADD
instructions applied to the IP register. On platforms prior to Gen6,
flow control instructions cause an implied thread switch, so this is a
significant savings.

However, according to the SandyBridge PRM (Volume 4 part 2, p79):

[Errata DevSNB{WA}] - When SPF is ON, IP may not be updated by
non-flow control instructions.

So we have to disable this optimization on Gen6.

On later platforms, there is no significant benefit to converting flow
control instructions to ADDs, so for the sake of consistency, this
patch disables the optimization on later platforms too.

The reason we never noticed this problem before is that so far we
haven't needed to use SPF mode on Gen6. However, later patches in
this series will introduce a Gen6 GS program which uses SPF mode.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
e304aa3600f865db533d273e2c1a554cb6a54f05 15-Nov-2011 Kenneth Graunke <kenneth@whitecape.org> i965: Make gen6_resolve_implied_move a no-op for MRF sources.

Attempting to move an MRF to a MRF is not only pointless, it will fail
because MRFs are read-only, resulting in garbage in your register.

If we already set up a MRF source, there's nothing to resolve anyway.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
7e84a64dd02794a59586ba58ef0864118534d3c6 10-Nov-2011 Eric Anholt <eric@anholt.net> i965/gen4: Fix sampling from integer textures.

On original gen4, the surface format didn't determine the return data
type from sampling like it does on g45 and later.

Fixes GL_EXT_texture_integer/texture_integer_glsl130

Reviewed-by: Paul Berry <stereotype441@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
a73c65c5342bf41fa0dfefe7daa9197ce6a11db4 18-Oct-2011 Kenneth Graunke <kenneth@whitecape.org> i965: Enable faster workaround-free math on Ivybridge.

According to the documentation, Ivybridge's math instruction works in
SIMD16 mode for the fragment shader, and no longer forbids align16 mode
for the vertex shader.

The documentation claims that SIMD16 mode isn't supported for INT DIV,
but empirical evidence shows that it works fine. Presumably the note
is trying to warn us that the variant that returns both quotient and
remainder in (dst, dst + 1) doesn't work in SIMD16 mode since dst + 1
would be sechalf(dst), trashing half your results. Since we don't use
that variant, we don't care and can just enable SIMD16 everywhere.

The documentation also still claims that source modifiers and
conditional modifiers aren't supported, but empirical evidence and
study of the simulator both show that they work just fine.

Goodbye workarounds. Math just works now.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
973b4ddd0e2f25cfd72cb945fbd38aed629a6fed 19-Oct-2011 Brian Paul <brianp@vmware.com> i965: remove unused vars in brw_set_ff_sync_message()
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
53798f90e818e9bf213c3ae4298751362a5ecd50 08-Oct-2011 Kenneth Graunke <kenneth@whitecape.org> i965: Rename pixel_scoreboard_clear to last_render_target for clarity.

Finding this bit in the documentation proved challenging. It wasn't in
the SEND instruction's message descriptor section, nor the data port
message descriptor section. It turns out to be part of the Render
Target Write message's control bits, and in the documentation is named
"Last Render Target Select".

Shaders that use Multiple Render Targets should set this bit on the last
RT write, but not on any prior ones.

The GPU does update the Pixel Scoreboard appropriately, but doesn't
document this bit as directly causing a scoreboard clear.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
fa0aa3796d3483cf8924fa127085d075d34019e8 08-Oct-2011 Kenneth Graunke <kenneth@whitecape.org> i965: Factor out code for setting Message Descriptors.

Every brw_set_???_message function had duplicated code, per-generation,
to set the Message Descriptor and Extended Message Descriptor bits
(SFID, message length, response length, header present, end of thread).

However, these fields are actually specified as part of the SEND
instruction itself; individual types of messages don't even specify
them (except for header present, but that's in the same bit location).

Since these are exactly the same regardless of the message type, just
create a function to set them, using the generic message structs. This
not only shortens the code, but hides a lot of the per-generation
complexity (like the SFID being in destreg__conditionalmod) in one spot.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
43ccd3200c394dd4d89ed96f039ca7d6cfff972f 08-Oct-2011 Kenneth Graunke <kenneth@whitecape.org> i965: Remove EOT parameter from brw_SAMPLE and brw_set_sampler_message.

The existing code asserted that eot == 0, as it doesn't make sense for
a thread to sample a texture as the last thing it does.

It doesn't make much sense to pass around a dead parameter either.
Especially for a function which already has a long parameter list.

So, remove the parameter and just set EOT to 0.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
2e124388a4642d1e7f5154e7b83d38578c6b2789 08-Oct-2011 Kenneth Graunke <kenneth@whitecape.org> i965: Rename BRW_MESSAGE_TARGET_* to BRW_SFID_* and document them.

When reading the data port code, it was not clear to me what these
values meant, nor where I could find them in the documentation.
Especially since the latest BSpec and older PRMs document them in
radically different places...neither of which are near the descriptions
of individual messages.

Cite the documentation, and rename them to SFID to signify that these
are Shared Function IDs that one can read about in the GPU overview,
rather than arbitrary bitfields. While we're add it, make them an enum.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
bbea5c5a5a7fb327d4ef03f80fe19cfa8d8edccd 08-Oct-2011 Kenneth Graunke <kenneth@whitecape.org> i965: Clarify check for which cache to use on Gen6 data port reads.

Currently, we use the Render Cache for scratch access (read/write data)
and the Sampler Cache for all read only data (pull constants).

Reversing the condition here is clearer: if the caller requested the
Render Cache, use that. Otherwise, they requested the Data Cache
(which does not exist on Gen6) or Sampler Cache, so use the Sampler
Cache.

This should not change behavior in any way.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
0d4a9ba9b247664bc5662b3db774064778f9aa17 08-Oct-2011 Kenneth Graunke <kenneth@whitecape.org> i965: Use Ivybridge's "Legacy Data Port" for reads/writes.

Using the constant cache for reads isn't going to work for scratch
reads (variably-indexed arrays or register spills), as these aren't
constant at all.

Also, in the new VS backend, use the proper message number for OWord
Dual Block Write messages. It's now 10, instead of 9.

+205 piglits.

NOTE: This is a candidate for the 7.11 branch.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
2e5a1a254ed81b1d3efa6064f48183eefac784d0 07-Oct-2011 Kenneth Graunke <kenneth@whitecape.org> intel: Convert from GLboolean to 'bool' from stdbool.h.

I initially produced the patch using this bash command:
for file in {intel,i915,i965}/*.{c,cpp,h}; do [ ! -h $file ] && sed -i
's/GLboolean/bool/g' $file && sed -i 's/GL_TRUE/true/g' $file && sed -i
's/GL_FALSE/false/g' $file; done

Then I manually added #include <stdbool.h> to fix compilation errors,
and converted a few functions back to GLboolean that were used in core
Mesa's function pointer table to avoid "incompatible pointer" warnings.

Finally, I cleaned up some whitespace issues introduced by the change.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Chad Versace <chad@chad-versace.us>
Acked-by: Paul Berry <stereotype441@gmail.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
b861479f83ea140bfe24357d09f18a6d026d97b5 07-Oct-2011 Kenneth Graunke <kenneth@whitecape.org> i965: Fix inconsistent indentation in brw_eu_emit.c.

Most of these functions used three spaces for the first level of
indentation, but four spaces for the next level. One used tabs and then
three spaces. Some used 3/4 in a then block but 3/3 in the else block.

Normally I try to avoid field days like this, but since the functions
were so inconsistent, even internally, it was making it difficult to
edit without introducing spurious whitespace changes.

So, just get it over with. git diff -b shows 0 lines changed.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
6960f786c8e1bfbaa0d9eb5f43b3b6bfc7135fcf 29-Sep-2011 Kenneth Graunke <kenneth@whitecape.org> i965: Set the signed/unsigned type bit in Gen4/5 math messages.

It never mattered before since we only did floating point math.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Tested-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
6b10aab2bbe0e69e2e8efca5e754870c8a543064 29-Sep-2011 Kenneth Graunke <kenneth@whitecape.org> i965: Fix message and response length calculations for INT DIV.

Both POW and INT DIV need a message length of 2; previously, we only
checked for POW.

Also, BRW_MATH_FUNCTION_INT_DIV_QUOTIENT_AND_REMAINDER has a response
length of 2; previously, we only checked for SINCOS. We don't use this
message, but in case we ever decide to, we may as well fix it now.

While we're at it, just move these computations into
brw_set_math_message, since they're entirely based on the function.
This fixes it for both brw_math and the old backend's brw_math_16.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Tested-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
ee2bf3a4b6b37287e6d150d3dd6742b7fa4f8215 29-Sep-2011 Kenneth Graunke <kenneth@whitecape.org> i965: Fix assertions about register types for INT DIV in brw_math.

BRW_MATH_FUNCTION_REMAINDER was missing. Also, it seems worthwhile to
assert that INT DIV's arguments are signed/unsigned integers.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Tested-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
483f5b348b0f3c0ca7082fd2047c354e8af285e7 22-Aug-2011 Eric Anholt <eric@anholt.net> i965/vs: Add support for pull constant loads for uniform arrays.

v2: reworked the instruction emit and made use of gen6_resolve_implied_move,
from Ken's review
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
2f0edc60f4bd2ae5999a6afa656e3bb3f181bf0f 26-Aug-2011 Chad Versace <chad@chad-versace.us> i965: Fix Android build by removing relative includes

Replace each occurence of
#include "../glsl/*.h"
with
#include "glsl/*.h"

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Chad Versace <chad@chad-versace.us>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
f7d2dcae3b6bf39b14c1e71f0721d0e4a2833962 18-Aug-2011 Kenneth Graunke <kenneth@whitecape.org> i965/gen7: Use align1 mode to set URB_WRITE_HWORD channel enables.

Makes the new vertex shader backend work on Ivybridge.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
d0c595ac8032aa9aed402a513870b8dc92e42903 12-Aug-2011 Eric Anholt <eric@anholt.net> i965/gen6: Force WHILE exec size to 8.

We can't just look at the instruction that happens to appear at the
start of the loop, because it might be some other exec size and cause
us to only loop on the first N channels. We always want 8 in our
current code (since 16 doesn't work so we don't do 16-wide fragment in
that case).

Fixes loop-03.vert, which was triggering the assertions.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
d376fa8e84b044ead47586d1b56a10742bcbdac7 16-Aug-2011 Eric Anholt <eric@anholt.net> i965: Fix assertion failure on a loop consisting of while (true) { break }.

On enabling the precompile step in the VS, we tripped over this
assertion failure in glsl-link-bug-30552.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
0f22f98ccd69bb5e8df3c78203bce9bc630965c1 07-Aug-2011 Eric Anholt <eric@anholt.net> i965: Make some EU emit code for DP read/write messages non-static.

We keep building these strange interfaces for DP read/write where
there's a helper function with some partially-specific,
partially-general controls, which is used in exactly one place in code
generation. Making these public will let us set up those instructions
in the one place they're to be generated.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
b5846865deb20c54e88c7c1a7c732d29e9c47975 23-May-2011 Eric Anholt <eric@anholt.net> i965: Warnings cleanup.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
136eb2bde769713b100351ff96bceb970f068c0a 10-May-2011 Eric Anholt <eric@anholt.net> i965/fs: Add support for "if" statements in 16-wide mode on gen6+.

It turns out there's nothing in the hardware preventing this. It
appears that it ought to work on pre-gen6 as well, but just produces
GPU hangs.

Improves glbenchmark Egypt framerate 4.4% +/- 0.3% (n=3), and Pro by
2.6% +/- 0.6% (n=3).

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
344283de5d3f4e2bfa10455f6b974cf731184b55 11-May-2011 Kenneth Graunke <kenneth@whitecape.org> i965: Fix RNDZ and RNDE on Sandybridge and Ivybridge.

On gen4/5, the RNDZ and RNDE instructions return floor(x), but set special
"round increment bits" in the flag register; a predicated ADD (+1) fixes
the result.

The documentation still lists '.r' as existing, and says that the
predicated add is necessary, but it apparently lies. According to the
simulator, BRW_CONDITIONAL_R (7) is not a valid conditional modifier
and the RNDZ and RNDE instructions simply produce the correct value.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
199a2f90abdd5dd11f8e2b95e587401d3b46f3ff 11-May-2011 Kenneth Graunke <kenneth@whitecape.org> i965: Fix data port reads on Ivybridge.

These also need to use gen7_dp.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
6e918163dfbdc829f31a0aefc07248c49b890d1d 30-Apr-2011 Kenneth Graunke <kenneth@whitecape.org> i965: Make the CONT instruction point to the WHILE instruction.

This fixes piglit test glsl-fs-loop-continue.shader_test on Ivybridge.
According to the documentation, the CONT instruction's UIP field should
point to the WHILE instruction on both Sandybridge and Ivybridge.

The previous code made UIP point to the implicit DO instruction, which
seems incorrect. I'm not sure how it could have worked on Sandybridge.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
77397ef96edbc17a698ae2a02ec4807b1059c036 30-Apr-2011 Kenneth Graunke <kenneth@whitecape.org> i965: Add support for loops on Ivybridge.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
64ce592679a5b08d66e3cbbf964f9e695e14aee1 16-Mar-2011 Kenneth Graunke <kenneth@whitecape.org> i965: Add support for IF/ELSE/ENDIF control flow on Ivybridge.

Ivybridge's IF instruction doesn't support conditional modifiers.
It also introduces UIP, which must point to the ENDIF instruction.

ELSE and ENDIF remain the same except that JIP moves from dst to src1.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
36f8de02e71ee5c2ca55d86c486eb00d043ae1f5 29-Apr-2011 Kenneth Graunke <kenneth@whitecape.org> i965: Fix sampler message descriptor on Ivybridge.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
09d881bf7420c97a0f684283c24b8ec3e42404ff 27-Apr-2011 Kenneth Graunke <kenneth@whitecape.org> i965: Enable channel masks in Ivybridge's URB_WRITE_HWORD header.

This shouldn't be done using MRFs, but until I have a proper solution
for dealing with MRFs, this allows my hack to keep working.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
97d4d6f77e885d2c343697f26a5ecf821caaf13b 19-Apr-2011 Kenneth Graunke <kenneth@whitecape.org> i965: Fix the URB write message descriptor on Ivybridge.

The message header is still incorrect, but this is a start.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
ce526a7452abf552af38b86bd3546d6ff9a83194 19-Apr-2011 Kenneth Graunke <kenneth@whitecape.org> i965: Fix render target writes on Ivybridge.

Ivybridge shifts the data port messages by one bit.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
482e8a6cd59292c58b11a9282632aaa9b24f44ae 09-Apr-2011 Kenneth Graunke <kenneth@whitecape.org> i965: Mad hacks to avoid using MRFs on Ivybridge.

Ivybridge's SEND instruction uses GRFs instead of MRFs. Unfortunately,
a lot of our code explicitly uses MRFs, and rewriting it would take a
fair bit of effort. In the meantime, use a hack:

- Change brw_set_dest, brw_set_src0, and brw_set_src1 to implicitly
convert any MRFs into the top 16 GRFs.
- Enable gen6_resolve_implied_move on Ivybridge: Moving g0 to m0
actually moves it to g111 thanks to the previous hack.

It remains to officially reserve these registers so the allocator
doesn't try to reuse them.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
c638180fc715aff84422c1092926120af966d417 16-May-2011 Kenneth Graunke <kenneth@whitecape.org> i965: Rework IF/ELSE jump target back-patching.

The primary motivation for this is to better support Ivybridge control
flow. Ivybridge IF instructions need to point to the first instruction
of the ELSE block -and- the ENDIF instruction; the existing code only
supported back-patching one instruction ago.

A second goal is to simplify and centralize the back-patching, hopefully
clarifying the code somewhat.

Previously, brw_ELSE back-patched the IF instruction, and brw_ENDIF
back-patched the previous instruction (IF or ELSE). With this patch,
brw_ENDIF is responsible for patching both the IF and (optional) ELSE.

To support this, the control flow stack (if_stack) maintains pointers to
both the IF and ELSE instructions. Unfortunately, in single program
flow (SPF) mode, both were emitted as ADD instructions, and thus
indistinguishable.

To remedy this, this patch simply emits IF and ELSE, rather than ADDs;
brw_ENDIF will convert them to ADDs (the SPF version of back-patching).

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
5936d96d33e767aa99f6afa92f2a6582ff04df23 16-May-2011 Kenneth Graunke <kenneth@whitecape.org> i965: Move IF stack handling into the EU abstraction layer/brw_compile.

This hides the IF stack and back-patching of IF/ELSE instructions from
each of the code generators, greatly simplifying the interface.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
1309d2ea723613f1e755dd7785d22456dd39bb08 11-May-2011 Kenneth Graunke <kenneth@whitecape.org> i965: Pass brw_compile pointer to brw_set_src[01].

This makes it symmetric with brw_set_dest, which is convenient, and will
also allow for assertions to be made based off of intel->gen.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
2bba244329a6751d5ac07041874b2969b67fa8ee 13-May-2011 Kenneth Graunke <kenneth@whitecape.org> i965: Use BRW_DATAPORT_READ_TARGET_DATA_CACHE instead of 0.

Using the #define'd constant is better than 0 with a comment.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Chad Versace <chad.versace@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
c77855d64eae45786d2d637bd065c8a700b788e5 13-May-2011 Kenneth Graunke <kenneth@whitecape.org> i965: Rename dp_render_target struct to gen6_dp.

This is actually just the message descriptor for Gen6+ dataport access;
it has nothing to do with the render cache. Access to the sampler cache
and constant cache also would use this struct; rename for clarity.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
17eaff77b09d356aae46c5d89a8eaa67cfa4c1e7 13-May-2011 Kenneth Graunke <kenneth@whitecape.org> i965: Attempt to un-muddle Gen6 data port message target defines.

These are documented on page 245 of IHD_OS_Vol4_Part2.pdf (the public
Sandybridge documentation/SEND instruction description).

Somebody had the bright idea to reuse gen4/5 defines labelled READ/WRITE
which just happened to be the same values as Render Cache/Sampler Cache.
It turns out that this field has nothing to do with READ/WRITE on
Sandybridge, but rather represents which data port to direct it to.

This was especially confusing in brw_set_dp_read_message, which
used "BRW_MESSAGE_TARGET_DATAPORT_WRITE." In a read function.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
7c647a2fe98a645723fa5eace7f7f6c5c26f4f8e 14-Mar-2011 Eric Anholt <eric@anholt.net> i965: Move the destination reg setup for 8/16 wide to the emit code.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
4673f9433fb73febce796945e3883274636fbf62 15-Apr-2011 Eric Anholt <eric@anholt.net> i965: Quit spamming gen6 DP read/write send instructions with gen5 bits.

This was copy-and-paste from originally trying to get DP read/write
working reliably, and notably for other common messages (URB, sampler)
we weren't doing this.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
59c6b775a6aacfe03c84dae62c2fd45d4af9d70b 15-Apr-2011 Eric Anholt <eric@anholt.net> i965/fs: Add gen6 register spilling support.

Most of this is code movement to get the scratch space allocated in a
shared location. Other than that, the only real changes are that the
old oword block messages now operate on oword-aligned areas (with new
messages for unaligned access, which we don't do), and that the
caching control is in the SFID part of the descriptor instead of
message control.

Fixes glsl-fs-convolution-1.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
2432ca1c2e205100d48070305ba2d5f8978bce03 12-Apr-2011 Zou Nan hai <nanhai.zou@intel.com> Revert "i965: clear global offset to zero in m0.2 for VS DP read."

This reverts commit 66b66295d0bc856c69fdcccc22575580c7ecee16.
it was already fixed by commit 9d60a7ce08a67eb8b79c60f829d090ba4a37ed7e
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
66b66295d0bc856c69fdcccc22575580c7ecee16 07-Apr-2011 Zou Nan hai <nanhai.zou@intel.com> i965: clear global offset to zero in m0.2 for VS DP read.

Signed-off-by: Zou Nan hai <nanhai.zou@intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
a019dd0d6e5bba00e8ee7818e004ee42ca507102 03-Apr-2011 Kenneth Graunke <kenneth@whitecape.org> i965: Fix null register use in Sandybridge implied move resolution.

Fixes regressions caused by commit 9a21bc6401, namely GPU hangs when
running gnome-shell or compiz (Mesa bugs #35820 and #35853).

I incorrectly refactored the case that dealt with ARF_NULL; even in that
case, the source register needs to be changed to the MRF.

NOTE: This is a candidate for the 7.10 branch (if 9a21bc6401 is
cherry-picked, take this one too).
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
9d60a7ce08a67eb8b79c60f829d090ba4a37ed7e 29-Mar-2011 Kenneth Graunke <kenneth@whitecape.org> i965: Resolve implied moves in brw_dp_READ_4_vs_relative.

Fixes piglit test glsl-vs-arrays-3 on Sandybridge, as well as garbage
rendering in 3DMarkMobileES 2.0's Taiji demo and GLBenchmark 2.0's
Egypt and PRO demos.

NOTE: This a candidate for stable release branches. It depends on
commit 9a21bc640188e4078075b9f8e6701853a4f0bbe4.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
9a21bc640188e4078075b9f8e6701853a4f0bbe4 16-Mar-2011 Kenneth Graunke <kenneth@whitecape.org> i965: Refactor Sandybridge implied move handling.

This was open-coded in three different places, and more are necessary.
Extract this into a function so it can be reused.

Unfortunately, not all variations were the same: in particular, one set
compression control and checked that the source register was not
ARF_NULL. This seemed like a good idea, so all cases now do so.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
2279156fe7ac9718533b8b0de90ae96100486680 16-Mar-2011 Kenneth Graunke <kenneth@whitecape.org> i965: Rename brw_(IF|CONT)_gen6 functions to gen6_(IF|CONT).
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
2baac48f3900b6e7a6443c6c116899cf95275629 16-Mar-2011 Kenneth Graunke <kenneth@whitecape.org> i965: Rename BRW_DATAPORT_..._GEN6 messages to GEN6_... for consistency.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
dc987adc9f5f9f851be124985fa6bbcdbfa4a7a5 24-Dec-2010 Xiang, Haihao <haihao.xiang@intel.com> i965: use align1 access mode for instructions with execSize=1 in VS

All operands must be 16-bytes aligned in aligh16 mode. This fixes l_xxx.c
in oglconform.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
5dc53444c8323c1787dddbe6b67048828df9c684 23-Dec-2010 Eric Anholt <eric@anholt.net> i965: Correct the dp_read message descriptor setup on g4x.

It's mostly like gen4 message descriptor setup, except that the sizes
of type/control changed to be like gen5. Fixes 21 piglit cases on
gm45, including the regressions in bug #32311 from increased VS
constant buffer usage.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
4374703a9b2ce0be105ee544c8402a932e3e1f52 22-Dec-2010 Zhenyu Wang <zhenyuw@linux.intel.com> i965: explicit tell header present for fb write on sandybridge

Determine header present for fb write by msg length is not right
for SIMD16 dispatch, and if there're more output attributes, header
present is not easy to tell from msg length. This explicitly adds
new param for fb write to say header present or not.

Fixes many cases' hang and failure in GL conformance test.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
4fe78d3e12fa963273de4d83b1fd55a78a5d41bf 21-Dec-2010 Eric Anholt <eric@anholt.net> i965: Avoid using float type for raw moves, to work around SNB issue.

The SNB alt-mode math does the denorm and inf reduction even for a
"raw MOV" like we do for g0 message header setup, where we are moving
values that aren't actually floats. Just use UD type, where raw MOVs
really are raw MOVs.

Fixes glxgears since c52adfc2e1d130effea940e75690897eb5d3ceaa, but no
piglit tests had regressed(!)
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
3a3b1bd722786ab0b1386a3a505cadfa89798232 09-Dec-2010 Eric Anholt <eric@anholt.net> i965: Add support for gen6 reladdr VS constant loading.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
15566183a64ef3f9940962a3b08b1c3469c98566 09-Dec-2010 Eric Anholt <eric@anholt.net> i965: Add support for gen6 constant-index constant loading.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
14a9153a32255f186a30b500d6db412388f4de28 09-Dec-2010 Eric Anholt <eric@anholt.net> i965: Clean up VS constant buffer location setup.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
7ca7e9b626389dd6dac683c6664b8478e6d5c3b9 07-Dec-2010 Eric Anholt <eric@anholt.net> i965: Work around gen6 ignoring source modifiers on math instructions.

With the change of extended math from having the arguments moved into
mrfs and handed off through message passing to being directly hooked
up to the EU, it looks like the piece for doing source modifiers
(negate and abs) was left out.

Fixes:
fog-modes
glean/fp1-ARB_fog_exp test
glean/fp1-ARB_fog_exp2 test
glean/fp1-Computed fog exp test
glean/fp1-Computed fog exp2 test
ext_fog_coord-modes
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
72845d206e692581b6084c56b8d1f3bc689e8a03 07-Dec-2010 Eric Anholt <eric@anholt.net> i965: Handle saturates on gen6 math instructions.

We get saturate as an argument to brw_math() instead of as compile
state, since that's how the pre-gen6 send instructions work. Fixes
fp-ex2-sat.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
245662f3083795e272fe9ef5d4cbeb6d048cf0e5 03-Dec-2010 Eric Anholt <eric@anholt.net> i965: Add support for the instruction compression bits on gen6.

Since the 8-wide first-quarter and 16-wide first-half have the same
bit encoding, we now need to track "do you want instruction
compression" in the compile state.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
237aa33c67039f6660cac19f061057250b8b3697 04-Dec-2010 Eric Anholt <eric@anholt.net> i965: Make the sampler's implied move on gen6 be a raw move.

We were accidentally doing a float-to-uint conversion.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
5340dd8ccacda82d4e68a4357f90503c84831594 03-Dec-2010 Eric Anholt <eric@anholt.net> i965: Fix up gen6 samplers for their usage by brw_wm_emit.c

We were trying to do the implied move even when we'd already manually
moved the real header in place.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
ae0df25ab439508c8bca707b91bbf085ff16d47c 13-Nov-2010 Eric Anholt <eric@anholt.net> i965: Don't smash a group of coordinates doing gen6 16-wide sampler headers.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
843a6a308e05bd4bf2056e08ec65ac4770097b93 01-Dec-2010 Eric Anholt <eric@anholt.net> i965: Add support for gen6 CONTINUE instruction emit.

At this point, piglit tests for fragment shader loops are working.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
00e5a743e2ee3981a34b95067a97fa73c0f5d779 01-Dec-2010 Eric Anholt <eric@anholt.net> i965: Add support for gen6 BREAK ISA emit.

There are now two targets: the hop-to-end-of-block target, and the
target for where to resume execution for active channels.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
4890e0f09c934e3ffb692b417e5444e43685c876 01-Dec-2010 Eric Anholt <eric@anholt.net> i965: Add support for gen6 DO/WHILE ISA emit.

There's no more DO since there's no more mask stack, and WHILE has
been shuffled like IF was.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
9249af17b857e8d9a359b4cd04e9393aca517e9c 10-Nov-2010 Zhenyu Wang <zhenyuw@linux.intel.com> i965: fix dest type of 'endif' on sandybridge

That should also be immediate value for type W.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
88087ba1bf5566c8fe1c7d88028d2485126af286 26-Oct-2010 Eric Anholt <eric@anholt.net> i965: Drop the eot argument to read messages, which can never be set.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
3ee5d68075d1b198c92daf01826aba83a35fccf5 26-Oct-2010 Eric Anholt <eric@anholt.net> i965: Add support for constant buffer loads on gen6.

Fixes glsl-fs-uniform-array-5.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
1732a8bc72fe0a8eaf7449eda65eba1a017ae909 26-Oct-2010 Eric Anholt <eric@anholt.net> i965: Use SENDC on the first render target write on gen6.

This is apparently required, as the thread will be initiated while it
still has dependencies, and this is what waits for those to be
resolved before writing color.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
748f3744bebc37cc753a5ea1c321854c580a7317 26-Oct-2010 Eric Anholt <eric@anholt.net> i965: Clarify an XXX comment in FB writes with real info.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
3789d5025a3200c40a39119c94c3d38a13e4b65a 25-Oct-2010 Eric Anholt <eric@anholt.net> i965: Add EU code for dword scattered reads (constant buffer array indexing).
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
07cd8f46acc34b04308f81de2faf05ba33da264b 22-Oct-2010 Eric Anholt <eric@anholt.net> i965: Add support for pull constants to the new FS backend.

Fixes glsl-fs-uniform-array-5, but not 6 which fails in ir_to_mesa.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
0b77d57394a3712851ec271aa7ad353d56f302a1 21-Oct-2010 Eric Anholt <eric@anholt.net> i965: Don't emit register spill offsets directly into g0.

g0 is used by others, and is expected to be left exactly as it was
dispatched to us. So manually move g0 into our message reg when
spilling/unspilling and update the offset in the MRF. Fixes failures
in texture sampling after having spilled a register.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
99b2c8570ea6f46c6564681631f0e0750a0641cc 19-Oct-2010 Eric Anholt <eric@anholt.net> i965: Add support for register spilling.

It can be tested with if (0) replaced with if (1) to force spilling for all
virtual GRFs. Some simple tests work, but large texturing tests fail.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
6ea108e7db79cb7135a8a1ef216e25381f72c225 19-Oct-2010 Eric Anholt <eric@anholt.net> i965: Set the source operand types for gen6 if/else/endif to integer.

I don't think this should matter, but I'm not sure, and it's
recommended by a kernel checker in fulsim.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
d0c87b90a85af0bd9ca7f8cec411a458742190cc 19-Oct-2010 Eric Anholt <eric@anholt.net> i965: Add EU emit support for gen6's new IF instruction with comparison.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
f157812bbbcf9caac1f84988e738fc9d1e051056 14-Oct-2010 Kenneth Graunke <kenneth@whitecape.org> i965: Add support for ir_unop_round_even via the RNDE instruction.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
897f6d3c7d06316b0535971cc2de318157c23692 14-Oct-2010 Kenneth Graunke <kenneth@whitecape.org> i965: Correctly emit the RNDZ instruction.

Simply using RNDU, RNDZ, or RNDE does not produce the desired result.
Rather, the RND* instructions place a value in the destination register
that may be 1 less than the correct answer. They can also set per-channel
"increment bits" in a flag register, which, if set, mean dest needs to
be incremented by 1. A second instruction - a predicated add -
completes the job.

Notably, RNDD always produces the correct answer in a single
instruction.

Fixes piglit test glsl-fs-trunc.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
e8e79c1d7eed0f5ae8820611cb86bdbd6ce595e6 14-Oct-2010 Zhenyu Wang <zhenyuw@linux.intel.com> i965: Fix GS hang on Sandybridge

Don't use r0 for FF_SYNC dest reg on Sandybridge, which would
smash FFID field in GS payload, that cause later URB write fail.
Also not use r0 in any URB write requiring allocate.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
7b5bc38c44269fc51db2f8b5e4ba0222212c6d71 11-Oct-2010 Eric Anholt <eric@anholt.net> i965: Add a couple of checks for gen6 math instruction limits.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
978ffa1d61902f0d55e15fbc51af75d444f35124 09-Oct-2010 Vinson Lee <vlee@vmware.com> i965: Silence unused variable warning on non-debug builds.

Fixes this GCC warning.
brw_eu_emit.c: In function 'brw_math2':
brw_eu_emit.c:1189: warning: unused variable 'intel'
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
feca6609390d4642418cf7aab878e654964510c4 05-Oct-2010 Eric Anholt <eric@anholt.net> i965: Fix up IF/ELSE/ENDIF for gen6.

The jump delta is now in the part of the instruction where the
destination fields used to be, and the src args are ignored (or not,
for the new non-predicated IF that we don't use yet).
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
f7cb28fad9855020e9fbd1481df03bb09346d4be 05-Oct-2010 Eric Anholt <eric@anholt.net> i965: Gen6 no longer has the IFF instruction; always use IF.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
ea909be58dda7e916cb9ce434ecb78597881ad33 05-Oct-2010 Eric Anholt <eric@anholt.net> i965: Add support for gen6 FB writes to the new FS.

This uses message headers for now, since we'll need it for MRT. We
can cut out the header later.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
956f866030f7bea5fc4a2de28c72e60bdc3a5b3d 17-Sep-2010 Zhenyu Wang <zhenyuw@linux.intel.com> i965: Fix sampler on sandybridge

Sandybridge has not much change on texture sampler with Ironlake.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
c5a3b25bb954db49dcb5e7737018979782d2edba 28-Sep-2010 Zhenyu Wang <zhenyuw@linux.intel.com> i965: fix jump count on sandybridge

Jump count is for 64bit long each, so one instruction requires 2
like on Ironlake.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
67dafa4b56422b44ca26b093d8feb6e743eb89e6 17-Sep-2010 Zhenyu Wang <zhenyuw@linux.intel.com> i965: ff sync message change for sandybridge
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
fe2d4a5ea02df38c9940a726aa04bcf550fab1da 22-Aug-2010 Eric Anholt <eric@anholt.net> i965: Add support for POW in gen6 FS.

Fixes glsl-algebraic-pow-2 in brw_wm_glsl.c mode.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
a560a509fab467b0ed4be2bceaf1c5a60890ca0d 05-Sep-2010 Eric Anholt <eric@anholt.net> i965: Add some validation on BRW_OPCODE_MUL and ADD's arguments.

Now that we're playing with other types in brw_fs.cpp, it's easy to
trip over issues like these.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
0002069fd5117b52f0ae2be0b7e3d8e839a3a61c 05-Sep-2010 Eric Anholt <eric@anholt.net> i965: Add assertion for another requirement about types.

This catches a failure in the FS backend.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
5c777928591279886e015c10f640828f77b97559 04-Sep-2010 Eric Anholt <eric@anholt.net> i965: Add a bit of validation for some ISA restrictions in the docs.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
352dff62f8005add9e71e6b5ba3b3321cb953d73 29-Aug-2010 Eric Anholt <eric@anholt.net> i965: Make brw_CONT and brw_BREAK take the pop count.

We always need to set it, so pass it in.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
e6ec500e19f455237828f4f3955f888ad0b56382 21-Aug-2010 Eric Anholt <eric@anholt.net> i965: Also use the SIMD8 FB writes for SIMD8 mode on non-SNB.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
5266c0a0c82de625ccac57e7559f57399f761e9e 21-Aug-2010 Zhenyu Wang <zhenyuw@linux.intel.com> i965: Add support for FB writes on Sandybridge.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
3ce2eccbfb925a3af0b91a89a9f7a3603fa45d2d 21-Aug-2010 Zhenyu Wang <zhenyuw@linux.intel.com> i965: Set the destination horiz stride even for da16, as SNB seems to need it.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
21eaa62ba461854003e5f74e6fc32e559e9c8455 22-Jul-2010 Eric Anholt <eric@anholt.net> i965: Clean up brw_dp_READ_4_vs() now that it has fewer options to support.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
96b11f1e3ee12f06be1d33bf085bf1353f23e667 22-Jul-2010 Eric Anholt <eric@anholt.net> i965: Support relative addressed VS constant reads using the appropriate msg.

The previous support was overly complicated by trying to use the same
1-OWORD message for both offsets.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
04de6861c1a41859dd85ca066b964e5df3ad63b6 22-Jul-2010 Eric Anholt <eric@anholt.net> i956: Set the execution size correctly for scratch space writes.

Otherwise, the second half isn't written, and we end up reading back
black.

Fixes the remaining junk drawn in glsl-max-varyings, and will likely
help with a number of large real-world shaders.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
a3bfb2f755cb2255879600d12d8440fad7136a9a 21-Jul-2010 Eric Anholt <eric@anholt.net> i965: Use the pretty define for 4-oword DP reads.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
d2f3eac8ffba8db8b141f07c22f612362c63ffe9 21-Jul-2010 Eric Anholt <eric@anholt.net> i965: Set the send commit bit on register spills as required pre-gen6.

Otherwise, the subsequent read may not get the written value.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
20be3ff57670529a410b30a1008a71e768d08428 25-Jun-2010 Zhenyu Wang <zhenyuw@linux.intel.com> i965: Add 'wait' instruction support

When EU executes 'wait' instruction, it stalls and sets notification
register state. Host can issue MMIO write to clear notification
register state to allow EU continue on executing again.

Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
5dbbb48f46f99baeba3a24a8371029e216b931bb 13-Jun-2010 Zhenyu Wang <zhenyuw@linux.intel.com> i965: Use the new message header format for FF_SYNC on gen6.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
881ec3a81455f8449d06429811107e1f955f2c60 13-Jun-2010 Zhenyu Wang <zhenyuw@linux.intel.com> i965: Add support for math instructions in the gen6 WM.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
81951393e1e675d6ca3ea052875def70d5e7ab93 14-May-2010 Eric Anholt <eric@anholt.net> i965: Remove constant or ignored-by-hw args from FF sync message setup.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
cdcef6cbf4dd80047819e9098e34a3b98bd502a4 19-Apr-2010 Zhenyu Wang <zhenyuw@linux.intel.com> intel: Clean up chipset name and gen num for Ironlake

Rename old IGDNG to Ironlake, and set 'gen' number for
Ironlake as 5, so tracking the features with generation num
instead of special is_ironlake flag.

Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
a9acde6a723c8f343f65243d1ccac6836215ba0c 19-Mar-2010 Eric Anholt <eric@anholt.net> i965: Ignore execution mask for the mov(m0, g0) of VS URB write header on SNB.

Otherwise, we may not get the FFTID set up which would break freeing
of resources.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
56ff30a9f97a1a7094432333906544d6138d6bf2 10-Mar-2010 Eric Anholt <eric@anholt.net> i965: Use the PLN instruction when possible in interpolation.

Saves an instruction in PINTERP, LINTERP, and PIXEL_W from
brw_wm_glsl.c For non-GLSL it isn't used yet because the deltas have
to be laid out differently.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
dc8c0359448cdae7b367552ba58783c04b199778 10-Mar-2010 Eric Anholt <eric@anholt.net> i965: Set up the execution size before relying on it.

Fixes hangs with texturing in the non-GLSL path since
f6d210c284751ac50a8d6358de7e75a1ff1e4ac7
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
f6d210c284751ac50a8d6358de7e75a1ff1e4ac7 10-Mar-2010 Eric Anholt <eric@anholt.net> i965: Fix the response len of masked sampler messages for 8-wide dispatch.

The bad response length would hang the GPU with a masked sample in a
shader using control flow. For 8-wide, the response length is always
4, and masked slots are just not written to. brw_wm_glsl.c already
allocates registers in the right locations.

Fixes piglit glsl-fs-bug25902 (fd.o bug #25902).
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
839e6bd8b90a124e88020d54ded46460b2a3bc2d 26-Feb-2010 Eric Anholt <eric@anholt.net> i965: Try to hook up the Sandybridge URB_WRITE SEND message.

My units still hang when doing this if the VS is enabled.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
38c449409207c8948c1961a3132475bbd422f8f1 24-Feb-2010 Eric Anholt <eric@anholt.net> i965: Add SNB math opcode support.

This is untested at this point.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
298be2b028263b2c343a707662c6fbfa18293cb2 19-Feb-2010 Kristian Høgsberg <krh@bitplanet.net> Replace the _mesa_*printf() wrappers with the plain libc versions
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
25024d948298a9f3f3210a0b91486f79a3917b0f 31-Dec-2009 Brian Paul <brianp@vmware.com> Merge branch 'mesa_7_7_branch'

Conflicts:
configs/darwin
src/gallium/auxiliary/util/u_clear.h
src/gallium/state_trackers/xorg/xorg_exa_tgsi.c
src/mesa/drivers/dri/i965/brw_draw_upload.c
c67bb15d4e3da430d511444bd7d159ccb0c84b73 29-Dec-2009 Vinson Lee <vlee@vmware.com> intel: Silence compiler warnings.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
d0b7ff551ab25153e3023871af3daa65b394a828 27-Dec-2009 Brian Paul <brianp@vmware.com> Merge branch 'mesa_7_6_branch' into mesa_7_7_branch

Conflicts:
src/gallium/auxiliary/util/u_network.c
src/gallium/auxiliary/util/u_network.h
src/gallium/drivers/i915/i915_state.c
src/gallium/drivers/trace/tr_rbug.c
src/gallium/state_trackers/vega/bezier.c
src/gallium/state_trackers/vega/vg_context.c
src/gallium/state_trackers/xorg/xorg_crtc.c
src/gallium/state_trackers/xorg/xorg_driver.c
src/gallium/winsys/xlib/xlib_brw_context.c
src/mesa/main/mtypes.h
2447786ed00a19466c9cc9b9efbfa084e88114eb 25-Dec-2009 Vinson Lee <vlee@vmware.com> i965: Fix assert.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
0b87f143c466f7e5bd730895ee29f1cd20a68f9b 17-Dec-2009 Eric Anholt <eric@anholt.net> intel: Replace IS_G4X() across the driver with context structure usage.

Saves ~2KB of code.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
1c96e85c9d6b8c636b0636f3320d1057ab5357b3 16-Dec-2009 Eric Anholt <eric@anholt.net> intel: Replace IS_IGDNG checks with intel->is_ironlake or needs_ff_sync.

Saves ~480 bytes of code.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
8baee3d25beb616f6d5ba575684e889d60e38740 07-Nov-2009 Eric Anholt <eric@anholt.net> i965: Use Compr4 instruction compression mode on G4X and newer.

No statistically significant performance difference at n=3 with either
openarena or my GL demo, but cutting program size seems like a good
thing to be doing for the hypothetical app that has a working set near
icache size.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
011244853b538a1a5adf602c8ed2de5c0f047548 05-Aug-2009 Eric Anholt <eric@anholt.net> i965: Don't set pop_count in the reserved MBZ area of IF statements.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
b010814e9c7ed30cbdd60a49d81a6ea774c8c3a3 04-Aug-2009 Eric Anholt <eric@anholt.net> i965: Spell "conditional" correctly.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
868aa160745ed0b3f1a83353ef2f3a8fcb5d235e 15-Jul-2009 Xiang, Haihao <haihao.xiang@intel.com> i965: the offset of any branch/jump instruction is in unit of 64bits on IGDNG
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
2995bf0d68f1b28ba68b81e9dc79e3ab52bc2795 13-Jul-2009 Xiang, Haihao <haihao.xiang@intel.com> i965: add support for new chipsets

1. new PCI ids
2. fix some 3D commands on new chipset
3. fix send instruction on new chipset
4. new VUE vertex header
5. ff_sync message (added by Zou Nan Hai <nanhai.zou@intel.com>)
6. the offset in JMPI is in unit of 64bits on new chipset
7. new cube map layout
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
73137997e23ff6c1145d036315d1a9ad96651281 02-Jul-2009 Xiang, Haihao <haihao.xiang@intel.com> i965: fixes for JMPI

1. the data type of <src1> (JMPI offset) must be D
2. execution size must be 1
3. NoMask
4. instruction compression isn't allowed.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
1b6ae2e004b7a7a76508e0da3c45eb0d851ed10c 01-Jul-2009 Brian Paul <brianp@vmware.com> i965: use BRW_MAX_MRF
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
4fdc6ad41b843109febbe9596dde87f676a8b0e9 26-Jun-2009 Roland Scheidegger <sroland@vmware.com> i965: fix fetching constants from constant buffer in glsl path

the driver used to overwrite grf0 then use implicit move by send instruction
to move contents of grf0 to mrf1. However, we must not overwrite grf0 since
it's still used later for fb write.
Instead, do the move directly do mrf1 (we could use implicit move from another
grf reg to mrf1 but since we need a mov to encode the data anyway it doesn't
seem to make sense).
I think the dp_READ/WRITE_16 functions may suffer from the same issue.
While here also remove unnecessary msg_reg_nr parameter from the dataport
functions since always message register 1 is used.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
0bc214a834bbb12b9338837dd9fca9bc389b4bc2 18-Apr-2009 Brian Paul <brianp@vmware.com> i915: fix broken indirect constant buffer reads

The READ message's msg_control value can be 0 or 1 to indicate that the
Oword should be read into the lower or upper half of the target register.
It seems that the other half of the register gets clobbered though. So
we read into two dest registers then use a MOV to combine the upper/lower
halves.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
ee32e9b4753eca62e360f96ce61ef7ff683e6bb7 16-Apr-2009 Brian Paul <brianp@vmware.com> i965: implement relative addressing for VS constant buffer reads

A scatter-read should be possible, but we're just using two READs for
the time being.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
92cc9970039d9c9385dc472fbfac58b93799f5ae 15-Apr-2009 Brian Paul <brianp@vmware.com> i965: fix VS constant buffer reads

This mostly came down to finding the right MRF incantation in the
brw_dp_READ_4_vs() function.

Note: this feature is still disabled (but getting close to done).
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
cafea7528052624c8d3e4cd1c5b26a61bf04d1d0 14-Apr-2009 Brian Paul <brianp@vmware.com> i965: checkpoint commit: VS constant buffers

Hook up a constant buffer, binding table, etc for the VS unit.
This will allow using large constant buffers with vertex shaders.
The new code is disabled at this time (use_const_buffer=FALSE).
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
2078e6cf55e3068454df9d843618b412b6abb811 10-Apr-2009 Brian Paul <brianp@vmware.com> i965: new SURF_INDEX_ macros

Used to map drawables, textures and constant buffers to surface binding
table indexes.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
8131123effd2124b8ca2aad04bf543e2fe82c7b0 09-Apr-2009 Brian Paul <brianp@vmware.com> i965: set BRW_MASK_DISABLE flag in "send" instruction in brw_dp_READ_4()

This fixes the random results that were seen when fetching a constant
inside an IF/ELSE clause. Disabling the execution mask ensures that all
the components of the register are written.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
a330a6fcd0b018829194ffab260f50956bce4832 02-Apr-2009 Brian Paul <brianp@vmware.com> i965: s/GL_FALSE/BRW_COMPRESSION_NONE/
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
30adf0518168ded9c7f519a7c772cab728852b1f 01-Apr-2009 Brian Paul <brianp@vmware.com> i965: fix response length param in brw_dp_READ_4()

We were accidentally clobbering the next register.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
8127e49b93820d1768e2d298bbe238dd55c20732 31-Mar-2009 Brian Paul <brianp@vmware.com> i965: added new brw_dp_READ_4() function

Used to read float[4] vectors from the constant buffer/surface.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
6b18a8d3e7114a1af931a2fc20a1e23cb2d7789c 31-Mar-2009 Brian Paul <brianp@vmware.com> i965: new and updated comments
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
1146d40b9c35d80c0860c773de1eef27c76e8c01 25-Mar-2009 Brian Paul <brianp@vmware.com> i965: comments for brw_SAMPLE()
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
a10ec13143599344ecb4a486db1454b488cd9645 13-Mar-2009 Brian Paul <brianp@vmware.com> i965: add some register number assertions

Haven't seen failures yet, but if/when there are, more investigation will
be done.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
2f2082bf16ca86b8ebea9e04b77011f74d09c3db 12-Feb-2009 Brian Paul <brianp@vmware.com> i965: minor clean-ups
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
0d797365deb579cfeb2a32f21692515eb6904921 05-Jan-2009 Brian Paul <brianp@vmware.com> i965: implement OPCODE_TRUNC (round toward zero) on vertex path.

Also, fix some RNDD vs. RNDZ confusion elsewhere.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
34b1776e8d965605d12807884c9c447214d57281 02-Nov-2008 Eric Anholt <eric@anholt.net> i965: Merge GM45 into the G4X chipset define.

The mobile and desktop chipsets are the same, and having them separate is
more typing and more chances to screw up.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
72c914805b8b3b37bf8f44d94bc25ca3d146ac66 01-Nov-2008 Keith Packard <keithp@keithp.com> Fix for 58dc8b7: dest regions must not use HorzStride 0 in ExecSize 1

Quoting section 11.3.10, paragraph 10.2 of the 965PRM:

10.2. If ExecSize is 1, dst.HorzStride must not be 0. Note that this is
relaxed from rule 10.1.2. Also note that this rule for destination
horizontal stride is different from that for source as stated in
rule #7.

GM45 gets very angry when rule 10.2 is violated.

Patch 58dc8b7 (i965: support destination horiz strides in align1 access mode)
added support for additional horizontal strides in the ExecSize 1 case, but
failed to notice that mesa occasionally re-purposes a register as a
temporary destination, even though it was constructed as a repeating source
with HorzStride = 0.

While, ideally, we should probably fix the code using these register
specifications, this patch simply rewrites them to use HorzStride 1 as the
pre-58dc8b7 code did.

Signed-off-by: Keith Packard <keithp@keithp.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
58dc8b7db5829188dbb45c020ab44732d6053888 30-Oct-2008 Gary Wong <gtw@gnu.org> i965: support destination horiz strides in align1 access mode.

This is required for scatter writes in destination regions to work.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
7a2ab6d05573508389b38f8f1fa261ba56062865 29-Aug-2008 Xiang, Haihao <haihao.xiang@intel.com> i965: force thread switch after IF/ELSE/ENDIF. partial fix for #16882.

A thread switch is implicitly invoked after the issuance of an IF/ELSE/ENDIF
instruction if necessary. Unfortunately it seems sometimes a forced thread
switch is needed.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
6073b49c7915147c28e9887039a51b8e4e2e62c5 29-Aug-2008 Xiang, Haihao <haihao.xiang@intel.com> i965: mask control for BREAK/CONT/DO/WHILE. partial fix fox #16882
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
92c075eeb7c330ea420400d1c2bae57356b19f03 08-Jul-2008 Xiang, Haihao <haihao.xiang@intel.com> i965: official name for GM45 chipset
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
8e444fb9e2685e3eac42beb848b08e91dc20c88a 29-Jan-2008 Xiang, Haihao <haihao.xiang@intel.com> i965: new integrated graphics chipset support
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
46e03d584a18b89fef956fed3d52e15775846250 27-Nov-2007 Xiang, Haihao <haihao.xiang@intel.com> i965: The jump instruction count is added
to IP pre-increment, and should point to
the first instruction after the do instruction
of the do-while block of code
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
b0b48798c7e854d2e36e0317bf94b7385e815242 29-Sep-2007 Zou Nan hai <nanhai.zou@intel.com> support continue, fix conditional
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
d19d0596daf004b56d80f78fa1a329b43c2ebf94 21-Jun-2007 Zou Nan hai <nanhai.zou@intel.com> support branch and loop in pixel shader
most of the sample working with some small modification
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
35707dbe57873adb5a8088cd47c13bd216e143e4 12-Apr-2007 Zou Nan hai <nanhai.zou@intel.com> Initial 965 GLSL support
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
1b9f78195f62959601d440475a6cbba5e8046813 18-Oct-2006 Eric Anholt <eric@anholt.net> i965: Avoid branch instructions while in single program flow mode.

There is an errata for Broadwater that threads don't have the instruction/loop
mask stacks initialized on thread spawn. In single program flow mode, those
stacks are not writable, so we can't initialize them. However, they do get
read during ELSE and ENDIF instructions. So, instead, replace branch
instructions in single program flow mode with predicated jumps (ADD to the ip
register), avoiding use of the more complicated branch instructions that may
fail. This is also a minor optimization as no ENDIF equivalent is necessary.

Signed-off-by: Keith Packard <keithp@neko.keithp.com>
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
db0e53af74beafa0ba07b200396bfe12fa9f5c89 01-Sep-2006 Keith Whitwell <keith@tungstengraphics.com> fix a couple of cases where a message reg is used as an instruction source.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c
9f344b3e7d6e23674dd4747faec253f103563b36 09-Aug-2006 Eric Anholt <anholt@FreeBSD.org> Add Intel i965G/Q DRI driver.

This driver comes from Tungsten Graphics, with a few further modifications by
Intel.
/external/mesa3d/src/mesa/drivers/dri/i965/brw_eu_emit.c