History log of /external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
Revision Date Author Comments
5561a377107cee5e07348fce020de579463c298f 09-Jun-2016 Dave Airlie <airlied@redhat.com> gallivm/llvmpipe: prepare support for ARB_gpu_shader_int64.

This enables 64-bit integer support in gallivm and
llvmpipe.

v2: add conversion opcodes.
v3:
- PIPE_CAP_INT64 is not there yet
- restrict DIV/MOD defaults to the CPU, as for 32 bits
- TGSI_OPCODE_I2U64 becomes TGSI_OPCODE_U2I64

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
ace70aedcf8b29380a17f68a994b18f60976bca6 03-Jun-2016 Jan Vesely <jan.vesely@rutgers.edu> gallivm: Fix trivial sign warnings

v2: include whitespace fixes

Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
e5c57824ec4021a859f5cbd4feba148d068713ee 10-Jun-2016 Dave Airlie <airlied@redhat.com> gallivm: make non-float return code bitcast consistent.

This just uses the same form across the fetches.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
3b97e50b9a3e413aca24edbaa3fd4a86dd216faf 10-Jun-2016 Dave Airlie <airlied@redhat.com> gallium/gallivm: use 64-bit test instead of doubles.

This just makes some generic code that currently emits double
suitable for emitting 64-bit values.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
af249a7da9bf2621ab836d5074ef692677b11bbf 16-Apr-2016 Marek Olšák <marek.olsak@amd.com> gallium: use PIPE_SHADER_* everywhere, remove TGSI_PROCESSOR_*

Acked-by: Jose Fonseca <jfonseca@vmware.com>
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
fb523cb6ad3ffef22ab4b9cce9e53859c17c5739 16-Apr-2016 Marek Olšák <marek.olsak@amd.com> gallium: merge PIPE_SWIZZLE_* and UTIL_FORMAT_SWIZZLE_*

Use PIPE_SWIZZLE_* everywhere.
Use X/Y/Z/W/0/1 instead of RED, GREEN, BLUE, ALPHA, ZERO, ONE.
The new enum is called pipe_swizzle.

Acked-by: Jose Fonseca <jfonseca@vmware.com>
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
3a26ef23e78f811abdfe657b52b9bc057b9ce5b6 18-Apr-2016 Dave Airlie <airlied@redhat.com> gallivm: convert size query to using a set of parameters.

This isn't currently that easy to expand, so fix it up
before expanding it later to include dynamic samplers.

[airlied: use some local variables (Roland)]

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
d335b6abc0eaa7506203df7c99898645214b4c72 17-Feb-2016 Roland Scheidegger <sroland@vmware.com> gallivm, tgsi: provide fake sample_i_ms implementations

Just like the rest of the msaa "implementation" it's just fake for now...

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
5071c192ccbfac92c53d93aea049bf981ae9e442 05-Jan-2016 Edward O'Callaghan <eocallaghan@alterapraxis.com> gallium: Use unsigned for loop index

Found-by: Coccinelle
Signed-off-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
a13b14930d94b024160fe17814e091356d07f7fb 12-Oct-2015 Dave Airlie <airlied@redhat.com> llvmpipe: fix fp64 inputs to geom shader.

This fixes the fetching of fp64 inputs to the geometry shader.

This fixes the recently posted piglit tests:
arb_gpu_shader_fp64/execution/gs-fs-vs-double-array.shader_test
arb_vertex_attrib_64bit/execution/gs-fs-vs-attrib-double-array.shader_test

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
147fd00bb36917f8463aacd49a26e95ca0926255 04-Dec-2015 Edward O'Callaghan <eocallaghan@alterapraxis.com> gallium/auxiliary: Trivial code style cleanup

Signed-off-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
24dc0316b4d7b29e055f220b23cab7daf4698c0c 20-Nov-2015 Roland Scheidegger <sroland@vmware.com> gallivm: use sampler index 0 for texel fetches

texel fetches don't use any samplers. Previously we just set the same
number for both texture and sampler unit (as per "ordinary" gl style
sampling where the numbers are always the same) however this would trigger
some assertions checking that the sampler index isn't over PIPE_MAX_SAMPLERS
limit elsewhere with d3d10, so just set to 0.
(Fixing the assertion instead isn't really an option, the sampler isn't
really used but might still pass an out-of-bound pointer around and even
copy some things from it.)

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
9285ed98f7557722fbb94f47c5bc138ef5dd9c70 27-Oct-2015 Roland Scheidegger <sroland@vmware.com> llvmpipe: add cache for compressed textures

compressed textures are very slow because decoding is rather complex
(and because there's no jit code to decode them too for non-technical
reasons).
Thus, add some texture cache which holds a couple of decoded blocks.
Right now this handles only s3tc format albeit it could be extended to work
with other formats rather trivially as long as the result of decode fits into
32bit per texel (ideally, rgtc actually would decode to more than 8 bits
per channel, but even then making it work for it shouldn't be too difficult).
This can improve performance noticeably but don't expect wonders (uncompressed
is unsurprisingly still faster). It's also possible it might be slower in
some cases (using nearest filtering for example or if there's otherwise not
many cache hits, the cache is only direct mapped which isn't great).
Also, actual decode of a block relies on util code, thus even though always
full blocks are decoded it is done texel by texel - this could obviously
benefit greatly from simd-optimized code decoding full blocks at once...
Note the cache is per (raster) thread, and currently only used for fragment
shaders.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
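The direct-mapped cache of decoded blocks described above can be sketched roughly as follows. This is only an illustration of the idea, not the llvmpipe code: the names (block_cache, cache_fetch_texel, fake_decode), the cache size and the hit/miss counters are all assumptions made for the example, and the decoder is a stand-in for the real s3tc util code.

```c
#include <stdint.h>

#define CACHE_ENTRIES 128      /* hypothetical number of slots */
#define TEXELS_PER_BLOCK 16    /* one 4x4 block */

struct block_cache_entry {
   uint64_t tag;                        /* address of the cached block */
   int valid;
   uint32_t texels[TEXELS_PER_BLOCK];   /* decoded result, 32 bits per texel */
};

struct block_cache {
   struct block_cache_entry entries[CACHE_ENTRIES];
   unsigned hits, misses;
};

typedef void (*decode_block_fn)(uint64_t block_addr, uint32_t *out_texels);

/* Stand-in decoder for the example: every texel encodes its own origin. */
void
fake_decode(uint64_t block_addr, uint32_t *out_texels)
{
   for (unsigned i = 0; i < TEXELS_PER_BLOCK; i++)
      out_texels[i] = (uint32_t)(block_addr * TEXELS_PER_BLOCK + i);
}

uint32_t
cache_fetch_texel(struct block_cache *cache, uint64_t block_addr,
                  unsigned texel_in_block, decode_block_fn decode)
{
   /* Direct mapped: a block address can live in exactly one slot. */
   struct block_cache_entry *e = &cache->entries[block_addr % CACHE_ENTRIES];

   if (!e->valid || e->tag != block_addr) {
      /* Miss: decode the whole block and evict whatever was there. */
      decode(block_addr, e->texels);
      e->tag = block_addr;
      e->valid = 1;
      cache->misses++;
   } else {
      cache->hits++;
   }
   return e->texels[texel_in_block % TEXELS_PER_BLOCK];
}
```

Two blocks whose addresses collide modulo the slot count keep evicting each other, which is the "only direct mapped which isn't great" caveat from the message.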
8477dd7c2e4416838c54da75a769109b4c5cc48e 30-Jul-2015 Vinson Lee <vlee@freedesktop.org> gallivm: Fix GCC unused-variable warning.

lp_bld_tgsi_soa.c: In function 'lp_emit_immediate_soa':
lp_bld_tgsi_soa.c:3065:18: warning: unused variable 'size' [-Wunused-variable]
const uint size = imm->Immediate.NrTokens - 1;
^

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
a2a1a5805fd617e7f3cc8be44dd79b50da07ebb9 21-Jul-2015 Ilia Mirkin <imirkin@alum.mit.edu> gallium: replace INLINE with inline

Generated by running:
git grep -l INLINE src/gallium/ | xargs sed -i 's/\bINLINE\b/inline/g'
git grep -l INLINE src/mesa/state_tracker/ | xargs sed -i 's/\bINLINE\b/inline/g'
git checkout src/gallium/state_trackers/clover/Doxyfile

and manual edits to
src/gallium/include/pipe/p_compiler.h
src/gallium/README.portability

to remove mentions of the inline define.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Acked-by: Marek Olšák <marek.olsak@amd.com>
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
e35c5717837d9ac6d9722b011852bdf187f29776 27-Jun-2015 Dave Airlie <airlied@gmail.com> gallivm: add fp64 support. (v2.1)

This adds support for ARB_gpu_shader_fp64 and ARB_vertex_attrib_64bit to
llvmpipe.

Two things that don't mix well are SoA and doubles, see
emit_fetch_double, and emit_store_double_chan in this.

I've also had to split emit_data.chan, to add src_chan,
which can be different for doubles.

It handles indirect double fetches from temps, inputs, constants
and immediates. It doesn't handle double stores to indirects,
however it appears the mesa/st doesn't currently emit these,
it always does UARL/MOV combos, which will work fine.

tested with piglit, no regressions, all the fp64 tests seem to pass.

v2:
switch to using shuffles for fetch/store (Roland)
assert on indirect double stores - mesa/st never emits these (it uses MOV)
fix indirect temp/input/constant/immediates (Roland)
typos/formatting fixes (Roland)

v2.1:
cleanup some long lines, emit_store_double_chan cleanups.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
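To illustrate why SoA and doubles don't mix well: a double spans two 32-bit channels, so fetching it means pairing up two channel vectors lane by lane. The scalar sketch below assumes the low word sits in the first channel of the pair and the high word in the second, and assumes IEEE-754 doubles; the function name and LANES are made up for the example. The commit itself does this with vector shuffles in emit_fetch_double.

```c
#include <stdint.h>
#include <string.h>

#define LANES 4   /* hypothetical SoA vector length */

/* Combine two 32-bit channel vectors into one double per lane. */
void
fetch_double_soa(const uint32_t chan_lo[LANES],
                 const uint32_t chan_hi[LANES],
                 double out[LANES])
{
   for (int i = 0; i < LANES; i++) {
      uint64_t bits = ((uint64_t)chan_hi[i] << 32) | chan_lo[i];
      /* memcpy is the portable way to reinterpret the bit pattern. */
      memcpy(&out[i], &bits, sizeof bits);
   }
}
```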
1a71fbe28ca0525b618f6fb9d7354f3a6589af2f 22-Jun-2015 Dave Airlie <airlied@redhat.com> draw/gallivm: add invocation ID support for llvmpipe.

This extends the draw code to add support for invocations.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
e1c4e8aaaafddd0e04cf2a16e28ef8f1e09d8b44 17-May-2015 Marek Olšák <marek.olsak@amd.com> gallium: remove TGSI_SAT_MINUS_PLUS_ONE

It's a remnant of some old NV extension. Unused.

I also have a patch that removes predicates if anyone is interested.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
586536a4e1c34725b3b38c3425db569fac0c91e9 09-Apr-2015 Roland Scheidegger <sroland@vmware.com> gallivm: don't use control flow when doing indirect constant buffer lookups

llvm goes crazy when doing that, using way more memory and time, though there's
probably more to it - this points to a very much similar issue as fixed in
8a9f5ecdb116d0449d63f7b94efbfa8b205d826f. In any case I've seen a quite
plain looking vertex shader with just ~50 simple tgsi instructions (but with a
dozen or so such indirect constant buffer lookups) go from a terribly high
~440ms compile time (consuming 25MB of memory in the process) down to a still
awful ~230ms and 13MB with this fix (with llvm 3.3), so there's still obvious
improvements possible (but I have no clue why it's so slow...).
The resulting shader is most likely also faster (certainly seemed so though
I don't have any hard numbers as it may have been influenced by compile times)
since generally fetching constants outside the buffer range is most likely an
app error (that is we expect all indices to be valid).
It is possible this fixes some mysterious vertex shader slowdowns we've seen
ever since we are conforming to newer apis at least partially (the main draw
loop also has similar looking conditionals which we probably could do without -
if not for the fetch at least for the additional elts condition.)

v2: use static vars for the fake bufs, minor code cleanups

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
0753b135f6e83b171d8a1b08aea967374f3542bc 29-Mar-2015 Roland Scheidegger <sroland@vmware.com> gallivm: implement TG4 for ARB_texture_gather

This is quite trivial, essentially just follow all the same code you'd
use with linear min/mag (and no mip) filter, then just skip the filtering
after looking up the texels in favor of direct assignment of the right channel
to the result. (This is though not true for the multi-offset version if we'd
want to support it - for this would probably need to do something along the
lines of 4x nearest sampling due to the necessity of doing coord wrapping
individually per texel.)
Supports multi-channel formats.
From the SM5 gather cap bit, should support non-constant offsets, plus shadow
comparisons (the former untested), but not component selection (should be
easy to implement but all this stuff is not really exposable anyway for now).

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
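A minimal sketch of the "skip the filtering, assign the channel directly" step, for a single pixel. The result order follows the ARB_texture_gather definition, (i0,j1), (i1,j1), (i1,j0), (i0,j0). The struct, the function name and the chan parameter are illustrative only; as the message says, component selection is not part of this commit.

```c
struct texel {
   float rgba[4];
};

/*
 * footprint holds the 2x2 texels a linear filter would have read,
 * stored as footprint[j * 2 + i].
 */
void
gather4(const struct texel footprint[4], unsigned chan, float result[4])
{
   result[0] = footprint[1 * 2 + 0].rgba[chan];   /* (i0, j1) */
   result[1] = footprint[1 * 2 + 1].rgba[chan];   /* (i1, j1) */
   result[2] = footprint[0 * 2 + 1].rgba[chan];   /* (i1, j0) */
   result[3] = footprint[0 * 2 + 0].rgba[chan];   /* (i0, j0) */
}
```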
73c6914195bd3cc81f52192d4ec8e23fc6239c41 29-Mar-2015 Roland Scheidegger <sroland@vmware.com> gallivm: add gather support to sampler interface

Luckily thanks to the revamped interface this is a lot less work now...

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
1863ed21ffbb3ab7fd9875dc25e32ececea79d50 28-Mar-2015 Roland Scheidegger <sroland@vmware.com> gallivm: simplify sampler interface

This has got a bit out of control with more and more parameters added.
Worse, whenever something in there changes all callees have to be updated
for that, even though they don't really do much with any parameter in there
except pass it on to the actual sampling function.
Hence simply put almost everything into a struct. Also instead of relying
on some arguments being NULL, be explicit and set this in a key (which is
just reused for function generation for simplicity). (The code still relies
on them being NULL in the end for now.)
Technically there is a minimal functional change here for shadow sampling:
whether shadow sampling is done is now determined explicitly by the texture
function (either sample_c or the gl-style tex func inherit this from target)
instead of the static texture state. These two should always match, however.
Otherwise, it should generate all the same code.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
8dad9455ff748c543635b24908566c3b94cb93a9 25-Mar-2015 Roland Scheidegger <sroland@vmware.com> gallivm: pass jit_context pointer through to sampling

The callbacks used for getting the dynamic texture/sampler state were using
the jit_context from the generated jit function. This works just fine, however
that way it's impossible to generate separate functions for texture sampling,
as will be done in the next commit. Hence, pass this pointer through all
interfaces so it can be passed to a separate function (technically, it would
probably be possible to extract this pointer from the current function instead,
but this feels hacky and would probably require some more hacks if we'd use
real functions instead of inlining all shader functions at some point).
There should be no difference in the generated code for now.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
04e35cc4aac019fbf6ac5ea8f6d772bb6cacea8d 19-Dec-2014 Brian Paul <brianp@vmware.com> gallivm: silence a couple compiler warnings

Silence warnings about possibly uninitialized variables when making a
release build.

Reviewed-by: José Fonseca <jfonseca@vmware.com>
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
97dc3d826e0c04e747ff5dbecf3026b6a16737fd 12-Dec-2014 Roland Scheidegger <sroland@vmware.com> draw: implement support for the VERTEXID_NOBASE and BASEVERTEX semantics.

This fixes 4 vertexid related piglit tests with llvmpipe due to switching
behavior of vertexid to the one gl expects.
(Won't fix non-llvm draw path since we don't get the basevertex currently.)
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
d4864cdf15ccd30f0e82d07fd0e9db8a0c115cda 12-Nov-2014 Eric Anholt <eric@anholt.net> gallium: Drop the NRM and NRM4 opcodes.

They weren't generated in tree, and as far as I know all hardware had to
lower it to a DP, RSQ, MUL.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
0c4bc1e29223ffec4617999c0c03d722bdcc170a 02-Oct-2014 Marek Olšák <marek.olsak@amd.com> tgsi: change tgsi_shader_info::properties to a one-dimensional array

Reviewed-by: Roland Scheidegger <sroland@vmware.com>

v2: fix svga too
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
8908fae243cb4c15a675006a1cc472f6c59b0d43 30-Sep-2014 Marek Olšák <marek.olsak@amd.com> tgsi: simplify shader properties in tgsi_shader_info

Use an array of properties indexed by TGSI_PROPERTY_* definitions.
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
9da75f96bce738d5b2c58e6fe0ce8ad436667c58 30-Aug-2014 Roland Scheidegger <sroland@vmware.com> gallivm: handle cube map arrays for texture sampling

Pretty easy, just make sure that all paths testing for PIPE_TEXTURE_CUBE
also recognize PIPE_TEXTURE_CUBE_ARRAY, and add the layer * 6 calculation
to the calculated face.
Also handle it for texture size query, looks like OpenGL wants the number
of cubes, not layers (so need division by 6).

No piglit regressions.

v2: fix up adding cube layer to face for seamless filtering (needs to happen
after calculating per-sample face). Undetected by piglit unfortunately.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com> (v1)
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
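The two calculations mentioned above, written out. The function names are hypothetical; only the arithmetic (layer * 6 added to the face, and division by 6 for the size query) comes from the message.

```c
/* Slice to sample: six faces per cube, then the face within that cube. */
unsigned
cube_array_slice(unsigned cube_index, unsigned face)
{
   return cube_index * 6 + face;
}

/* Size query: report the number of cubes, not the number of 2D layers. */
unsigned
cube_array_size_query(unsigned num_layers)
{
   return num_layers / 6;
}
```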
549229631849c0caa6826f2eb1170909f5e21c33 07-Aug-2014 Darius Goad <alegend45@gmail.com> gallivm: Handle MSAA textures in emit_fetch_texels

This support is preliminary due to the fact that MSAA is not
actually implemented.

However, this patch does fix the piglit test:
spec/!OpenGL 3.2/glsl-resource-not-bound 2DMS (bug #79740).

(v2 RS: don't emit 4th coord as explicit lod)

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
394ea139c7cf577fe00d38634e697c3a740d4ccd 07-Aug-2014 Roland Scheidegger <sroland@vmware.com> draw: hack around weird primitive id input in gs

The distinction between system values and ordinary inputs is not very
obvious in gallium - further fueled by the fact that they use the same
semantic names.
Still, if there's any value which imho really is a system value, it's the
primitive id input into the gs (while earlier (tessellation) stages could read
it, it is _always_ generated by the system). For some odd reason though (which
I'd classify as a bug but seems too complicated to fix) the glsl compiler in
mesa treats this as an ordinary varying, and everything else after that
(including the state tracker and other drivers) just go along with that.
But input fetching in gs for llvm based draw was definitely limited to the
ordinary (2-dimensional) inputs so only worked with other state trackers,
the code was also additionally relying on tgsi_scan_shader filling
uses_primid correctly which did not happen either (would set it only for
all stages if it was a system value, but only set it for the fragment shader
if it was an input value).
This fixes piglit glsl-1.50-geometry-primitive-id-restart and primitive-id-in
in llvmpipe.

Reviewed-by: Brian Paul <brianp@vmware.com>
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
c3c33756ff828ba7ae065c224943aeb48b8ed2ba 02-Aug-2014 Roland Scheidegger <sroland@vmware.com> gallivm: fix cube map array (and cube map shadow with bias) handling

In particular need to handle TEX2/TXB2/TXL2 opcodes.
cube map shadow with bias already used TXB2 which didn't work before
at all, despite that there's by default no piglit change (but using
no_quad_lod and no_rho_opt indeed passes some more tex-miplevel-selection
tests).
The actual sampling code still won't handle cube map arrays.

Reviewed-by: Brian Paul <brianp@vmware.com>
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
1df7199fc933facf2e74304976c3798e474929a1 14-Jun-2014 Marek Olšák <marek.olsak@amd.com> gallium: implement ARB_texture_query_levels

The extension is always supported if GLSL 1.30 is supported.

Softpipe and llvmpipe support is also added (trivial).
Radeon and nouveau support is already done.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
8a9f5ecdb116d0449d63f7b94efbfa8b205d826f 14-May-2014 Roland Scheidegger <sroland@vmware.com> gallivm: only fetch pointers to constant buffers once

In 1d35f77228ad540a551a8e09e062b764a6e31f5e support for multiple constant
buffers was introduced. This meant we had another indirection, and we did
resolve the indirection for each constant buffer access. This looks very
reasonable since llvm can figure out if it's the same pointer, however it
turns out that this can cause llvm compilation time to go through the roof
and beyond (I've seen cases in excess of factor 100, e.g. from 50 ms to more
than 10 seconds (!)), with all the additional time spent in IR optimization
passes (and in the end all of it in DominatorTree::dominate()).
I've been unable to narrow it down a bit more (only some shaders seem affected,
seemingly without much correlation to overall shader complexity or constant
usage) but it is easily avoidable by doing the buffer lookups themselves just
once (at constant buffer declaration time).

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
1527a545a4ebb825db02bba9c9e42a90c15326f6 24-Apr-2014 José Fonseca <jfonseca@vmware.com> gallivm: Fix wrong operator in lp_exec_default.

Courtesy of MSVC static code analyser.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
b1909b260f6c3855c8214319c602fc7adea7faf9 26-Mar-2014 Zack Rusin <zackr@vmware.com> draw/llvm: improve debugging output a bit

It's useful to know what the LLVMBuildStore arguments are going to
be before executing it, because it can crash. Also make sure to
print out the inputs only if we're not generating a gs, because
it fetches inputs differently.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
69ee3f431f9f1bb782485ede992b95e01ad790a5 05-Feb-2014 Zack Rusin <zackr@vmware.com> gallivm: handle huge number of immediates

We only supported up to 256 immediates, which isn't enough. We had
code which was allocating immediates as an allocated array, but it
was always used along a statically backed array for performance
reasons. This commit adds code to skip that performance optimization
and always use just the dynamically allocated immediates if the
number of them is too great.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
8507afc97fa3323c89ee4cd1359d2fa61015bcd0 04-Feb-2014 Zack Rusin <zackr@vmware.com> gallivm: allow large numbers of temporaries

The number of allowed temporaries increases almost with every
iteration of an api. We used to support 128, then we started
increasing and the newer APIs support 4096+. So if we notice
that the number of temporaries is larger than our statically
allocated storage would allow we just treat them as indexable
temporaries and allocate them as an array from the start.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
9bace99d77642f8fbd46b1f0be025ad758f83f5e 28-Jan-2014 Zack Rusin <zackr@vmware.com> gallivm: fix opcode and function nesting

gallivm soa code supported only a single level of nesting for
control flow opcodes (if, switch, loops...) but the d3d10 spec
clearly states that those are nested within functions. To support
nesting of conditionals inside functions we need to store the
nesting data inside function contexts and keep a stack of those.
Furthermore we make sure that if nesting for subroutines is deeper
than 32 then we simply ignore all subsequent 'call' invocations.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
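A rough sketch of "nesting data inside function contexts, kept on a stack". All names here are hypothetical, and the exact boundary (whether the main function counts toward the 32) is an assumption of the example rather than something the message states.

```c
#define MAX_FUNCTION_DEPTH 32   /* limit quoted in the message */

struct function_ctx {
   int return_pc;    /* where to resume after the subroutine returns */
   int cond_depth;   /* nesting of if/switch inside this function */
   int loop_depth;   /* nesting of loops inside this function */
};

struct exec_state {
   struct function_ctx stack[MAX_FUNCTION_DEPTH];
   int size;         /* number of active contexts; slot 0 is main */
};

void
exec_init(struct exec_state *s)
{
   s->stack[0] = (struct function_ctx){ -1, 0, 0 };
   s->size = 1;
}

struct function_ctx *
exec_current(struct exec_state *s)
{
   return &s->stack[s->size - 1];
}

/* Returns 1 if the call was entered, 0 if it was ignored (too deep). */
int
exec_call(struct exec_state *s, int return_pc)
{
   if (s->size >= MAX_FUNCTION_DEPTH)
      return 0;
   /* Each function starts with fresh control-flow nesting counters. */
   s->stack[s->size] = (struct function_ctx){ return_pc, 0, 0 };
   s->size++;
   return 1;
}

/* Returns the pc to resume at, or -1 when returning from main. */
int
exec_ret(struct exec_state *s)
{
   if (s->size <= 1)
      return -1;
   s->size--;
   return s->stack[s->size].return_pc;
}
```

Because the nesting counters live in the per-function context, returning from a subroutine restores the caller's conditional and loop depth without any extra bookkeeping.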
877128505431adaf817dc8069172ebe4a1cdf5d8 17-Jan-2014 José Fonseca <jfonseca@vmware.com> s/Tungsten Graphics/VMware/

Tungsten Graphics Inc. was acquired by VMware Inc. in 2008. Leaving the
old copyright name is creating unnecessary confusion, hence this change.

This was the sed script I used:

$ cat tg2vmw.sed
# Run as:
#
# git reset --hard HEAD && find include scons src -type f -not -name 'sed*' -print0 | xargs -0 sed -i -f tg2vmw.sed
#

# Rename copyrights
s/Tungsten Gra\(ph\|hp\)ics,\? [iI]nc\.\?\(, Cedar Park\)\?\(, Austin\)\?\(, \(Texas\|TX\)\)\?\.\?/VMware, Inc./g
/Copyright/s/Tungsten Graphics\(,\? [iI]nc\.\)\?\(, Cedar Park\)\?\(, Austin\)\?\(, \(Texas\|TX\)\)\?\.\?/VMware, Inc./
s/TUNGSTEN GRAPHICS/VMWARE/g

# Rename emails
s/alanh@tungstengraphics.com/alanh@vmware.com/
s/jens@tungstengraphics.com/jowen@vmware.com/g
s/jrfonseca-at-tungstengraphics-dot-com/jfonseca-at-vmware-dot-com/
s/jrfonseca\?@tungstengraphics.com/jfonseca@vmware.com/g
s/keithw\?@tungstengraphics.com/keithw@vmware.com/g
s/michel@tungstengraphics.com/daenzer@vmware.com/g
s/thomas-at-tungstengraphics-dot-com/thellstom-at-vmware-dot-com/
s/zack@tungstengraphics.com/zackr@vmware.com/

# Remove dead links
s@Tungsten Graphics (http://www.tungstengraphics.com)@Tungsten Graphics@g

# C string src/gallium/state_trackers/vega/api_misc.c
s/"Tungsten Graphics, Inc"/"VMware, Inc"/

Reviewed-by: Brian Paul <brianp@vmware.com>
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
93b953d139112bea1c9c64a3de462cbb52c544fd 19-Dec-2013 Zack Rusin <zackr@vmware.com> llvmpipe: do constant buffer bounds checking in shaders

It's possible to bind a smaller buffer as a constant buffer than
what the shader actually uses/requires. This could cause nasty
crashes. This patch adds the architecture to pass the maximum
allowable constant buffer index to the jit to let it make
sure that the constant buffer indices are always within bounds.
The behavior follows the d3d10 spec, which says the overflow
should always return all zeros, and overflow is only defined
as access beyond the size of the currently bound buffer. Accesses
beyond the declared shader constant register size are not
considered an overflow and expected to return garbage but consistent
garbage (we follow the behavior which some wlk tests expect which
is to return the actual values from the bound buffer).

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
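The overflow rule described above, as a scalar sketch. The function and parameter names are invented for the example; the real check is emitted as jit code. The point is that only the size of the currently bound buffer takes part in the check, while the declared register count does not.

```c
float
fetch_constant(const float *bound_buf, unsigned bound_elems,
               unsigned declared_elems, unsigned index)
{
   /* Deliberately unused: reads past the declared size are not overflow. */
   (void)declared_elems;

   /* Overflow is defined against the bound buffer and returns zero. */
   if (index >= bound_elems)
      return 0.0f;

   /* In range of the bound buffer: return the actual value. */
   return bound_buf[index];
}
```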
7f56780915e352fda80b0e062591995021916859 20-Nov-2013 Vinson Lee <vlee@freedesktop.org> gallivm: Ignore unknown file type in non-debug builds.

Fixes "Uninitialized pointer read" defect reported by Coverity.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
e7a5905d8a3960b0981750f8131e3af9acbfcdb8 14-Nov-2013 Si Chen <sichen@vmware.com> gallivm: Fix mask calculation for emit_kill_if.

The exec_mask must be taken in consideration, just like emit_kill above.

The tgsi_exec module has the same bug and should be fixed in a future
change.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
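What taking the exec_mask into consideration means, sketched per lane with one source value per lane for brevity. The names and the LANES width are hypothetical; the actual code builds vector masks.

```c
#define LANES 4   /* hypothetical SIMD width */

void
kill_if(const float src[LANES], const int exec_mask[LANES],
        int live_mask[LANES])
{
   for (int i = 0; i < LANES; i++) {
      /* A lane may only be killed if it is currently executing. */
      int kill = (src[i] < 0.0f) && exec_mask[i];
      if (kill)
         live_mask[i] = 0;
   }
}
```

Without the exec_mask term, a lane sitting in the untaken side of a conditional would be killed by an instruction it never executed.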
a29e40a42382630c2d9d95d3a1e03a7b3db87add 14-Nov-2013 José Fonseca <jfonseca@vmware.com> gallivm: Compile flag to debug TGSI execution through printfs.

It is similar to tgsi_exec.c's DEBUG_EXECUTION compile flag.

I had prototyped this for a while while debugging an issue, but finally
cleaned this up and added a few more bells and whistles.

v2: Use '$' as marker; better output. Thanks to Brian, Zack and Roland
reviews.

Here is a sample output.

CONST[0].x = 0.00625000009 0.00625000009 0.00625000009 0.00625000009
CONST[0].y = -0.00714285718 -0.00714285718 -0.00714285718 -0.00714285718
CONST[0].z = -1 -1 -1 -1
CONST[0].w = 1 1 1 1
IN[0].x = 143.5 175.5 175.5 143.5
IN[0].y = 123.5 123.5 155.5 155.5
IN[0].z = 0 0 0 0
IN[0].w = 1 1 1 1
$ 1: RCP TEMP[0].w, IN[0].wwww
TEMP[0].w = 1 1 1 1
$ 2: MAD TEMP[0].xy, IN[0], CONST[0], CONST[0].zwzw
TEMP[0].x = -0.103124976 0.0968750715 0.0968750715 -0.103124976
TEMP[0].y = 0.117857158 0.117857158 -0.110714316 -0.110714316
$ 3: MUL OUT[0].xy, TEMP[0], TEMP[0].wwww
OUT[0].x = -0.103124976 0.0968750715 0.0968750715 -0.103124976
OUT[0].y = 0.117857158 0.117857158 -0.110714316 -0.110714316
$ 4: MUL OUT[0].z, IN[0].zzzz, TEMP[0].wwww
OUT[0].z = 0 0 0 0
$ 5: MOV OUT[0].w, TEMP[0]
OUT[0].w = 1 1 1 1
$ 6: END
OUT[0].x = -0.103124976 0.0968750715 0.0968750715 -0.103124976
OUT[0].y = 0.117857158 0.117857158 -0.110714316 -0.110714316
OUT[0].z = 0 0 0 0
OUT[0].w = 1 1 1 1
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
754319490f6946a9ad5ee619822d5fe4254e6759 11-Nov-2013 Roland Scheidegger <sroland@vmware.com> gallivm,llvmpipe: fix float->srgb conversion to handle NaNs

d3d10 requires us to convert NaNs to zero for any float->int conversion.
We don't really do that but mostly seems to work. In particular I suspect the
very common float->unorm8 path only really passes because it relies on sse2
pack intrinsics which just happen to work by luck for NaNs (float->int
conversion in hw gives integer indeterminate value, which just happens to be
-0x80000000 hence gets converted to zero in the end after pack intrinsics).
However, float->srgb didn't get so lucky, because we need to clamp before
blending and clamping resulted in NaN behavior being undefined (and actually
got converted to 1.0 by clamping with sse2). Fix this by using a zero/one clamp
with defined nan behavior as we can handle the NaN for free this way.
I suspect there's more bugs lurking in this area (e.g. converting floats to
snorm) as we don't really use defined NaN behavior everywhere but this seems
to be good enough.
While here respecify nan behavior modes a bit, in particular the return_second
mode didn't really do what we wanted. From the caller's perspective, we really
wanted to say we need the non-nan result, but we already know the second arg
isn't a NaN. So we use this now instead, which means that cpu architectures
which actually implement min/max by always returning non-nan (that is adhering
to ieee754-2008 rules) don't need to bend over backwards for nothing.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
ea1f7d289430ec4815bf0b3398d0815a310c2aa3 06-Nov-2013 Roland Scheidegger <sroland@vmware.com> gallivm: deduplicate some indirect register address code

There's only one minor functional change, for immediates the pixel offsets
are no longer added since the values are all the same for all elements in
any case (it might be better if those weren't stored as soa vectors in the
first place maybe).

Reviewed-by: Zack Rusin <zackr@vmware.com>
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
b35ea0934933885b45348fb861d5ad9b8c284910 06-Nov-2013 Roland Scheidegger <sroland@vmware.com> gallivm: fix indirect addressing of inputs

We weren't adding the soa offsets when constructing the indices
for the gather functions. That meant that we were always returning
the data in the first element.
(Copied straight from the same fix for temps.)
While here fix up a couple of broken comments in the fetch functions,
plus don't name a straight float type float4 which is just confusing.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Zack Rusin <zackr@vmware.com>
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
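The missing soa offset can be illustrated with a scalar gather. The layout assumed here, a flat array indexed as [register][channel][lane], and all names are assumptions of the sketch; the message only states that the per-element offsets were not added.

```c
#define LANES 4      /* hypothetical SoA vector length */
#define CHANNELS 4   /* x, y, z, w */

/*
 * Each lane may address a different register. The final "+ lane" term is
 * the soa offset: dropping it makes every lane read lane 0's data.
 */
void
gather_indirect(const float *file, const unsigned reg_index[LANES],
                unsigned chan, float out[LANES])
{
   for (unsigned lane = 0; lane < LANES; lane++) {
      unsigned idx = (reg_index[lane] * CHANNELS + chan) * LANES + lane;
      out[lane] = file[idx];
   }
}
```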
e9f1f6ab42a7c466b3b6cb5460fcf875822c1dbd 03-Sep-2013 Zack Rusin <zackr@vmware.com> gallivm: support indirect registers on both dimensions

We support indirect addressing only on the vertex index, but some
shaders also use indirect addressing on attributes. This patch
adds support for indirect addressing on both dimensions inside
gs arrays.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
ac1a2714c78ab8bc9853478780dc27075d025080 19-Aug-2013 Roland Scheidegger <sroland@vmware.com> gallivm: implement better control of per-quad/per-element/scalar lod

There's a new debug value used to disable per-quad lod optimizations
in fragment shader (ignored for vs/gs as the results are just too wrong
typically). Also trying to detect if a supplied lod value is really a
scalar (if it's coming from immediate or constant file) in which case
sampler code can use this to stay on per-quad-lod path (in fact for
explicit lod could simplify even further and use same lod for both
quads in the avx case but this is not implemented yet).
Still need to actually implement per-element lod bias (and derivatives),
and need to handle per-element lod in size queries.

v2: fix comments, prettify.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
e58c2310b845ca809bb99d5fcdab909ff8598c28 14-Aug-2013 Roland Scheidegger <sroland@vmware.com> gallivm: already pass coords in the right place in the sampler interface

This makes things a bit nicer, and more importantly it fixes an issue
where a "downgraded" array texture (due to view reduced to 1 layer and
addressed with (non-array) samplec instruction) would use the wrong
coord as shadow reference value. (This could also be fixed by passing
target through the sampler interface much the same way as is done for
size queries, might do this eventually anyway.)
And if we'd ever want to support (shadow) cube map arrays, we'd need
5 coords in any case.

v2: fix bugs (texel fetch using wrong layer coord for 1d, shadow tex
using wrong shadow coord for 2d...). Plus need to project the shadow
coord, and just for fun keep projecting the layer coord too.

Reviewed-by: Zack Rusin <zackr@vmware.com>
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
cd2f26090a3857703bc85bd58ac5922800b06bc0 12-Aug-2013 Roland Scheidegger <sroland@vmware.com> gallivm: fix exec_mask interaction with geometry shader after end of main

Because we must maintain an exec_mask even if there's currently nothing
on the mask stack, we can still have an exec_mask at the end of the program.
Effectively, this mask should be set back to default when returning from main.
Without relying on END/RET opcode (I think it's valid to have neither) it is
actually difficult to do this, as there doesn't seem any reasonable place to
do it, so instead let's just say the exec_mask is invalid outside main (which
it really is effectively).
The problem is that geometry shader called end_primitive outside the shader
(in the epilogue), and as a result used a bogus mask, leading to bugs if we
had to set the (somewhat misnamed) ret_in_main bit anywhere. So just avoid
the mask combining function when called from outside the shader.

Reviewed-by: Zack Rusin <zackr@vmware.com>
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
7147094ff235a1c0550e1bafbd12574feca7fdd8 12-Aug-2013 Roland Scheidegger <sroland@vmware.com> gallivm: simplify geometry shader mask handling a bit

Instead of reducing masks to 0/1 simply use the mask directly as -1.
Also use some signed comparison instead of unsigned (as far as I understand
these values have to be (very) small and signed means llvm doesn't have to
apply additional logic to do the unsigned comparisons the cpu can't do).
Saves a couple of instructions in some test geometry shader here.

v2: that was a bit too much optimization, don't skip combining the masks...

Reviewed-by: Zack Rusin <zackr@vmware.com>
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
b0f74250e1496d4872fd731b45049868b3efc883 08-Aug-2013 Roland Scheidegger <sroland@vmware.com> gallivm: use texture target from shader instead of static state for size query

d3d10 has no notion of distinct array resources, either at the resource or
sampler view level. However, shader dcl of resources certainly has, and
d3d10 expects resinfo to return the values according to that - in particular
a resource might have been a 1d texture with some array layers, then the
sampler view might have only used 1 layer so it can be accessed both as 1d
or 1d array texture (I think - the former definitely works). resinfo of a
resource declared as array needs to return number of array layers but
non-array resource needs to return 0 (and not 1). Hence fix this by passing
the target from the shader decl to emit_size_query and use that (in case of
OpenGL the target will come from the instruction itself).
Could probably do the same for actual sampling, though it may not matter there
(as the bogus components will essentially get clamped away), possibly could
wreak havoc though if it REALLY doesn't match (which is of course an error
but still).

Reviewed-by: Zack Rusin <zackr@vmware.com>
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
eac57bc223dd2bf9d988b9f1ee0e126a27c98bf8 07-Aug-2013 Roland Scheidegger <sroland@vmware.com> gallivm: propagate scalar_lod to emit_size_query too

Clearly the returned values need to be per-element if the lod is per element.
Does not actually change behavior yet.

Reviewed-by: Zack Rusin <zackr@vmware.com>
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
abcc40e7f05fcc1fd28f226f132f500a703d1e5d 27-Jul-2013 Roland Scheidegger <sroland@vmware.com> gallivm: handle texel swizzles correctly for d3d10-style sample opcodes

unlike OpenGL, the texel swizzle is embedded in the instruction, so honor
that.
(Technically we now execute both the sampler_view swizzle and the
per-instruction swizzle but this should be quite ok.)

v2: add documentation note as it's not obvious.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
ab47bbecd64d05d4fe03bed28291387dd08f5b84 16-Jul-2013 Zack Rusin <zackr@vmware.com> gallivm: handle nan's in min/max

Both D3D10 and OpenCL say that if one of the inputs is nan then
the other should be returned. To preserve that behavior
the patch fixes both the sse and the non-sse paths in both
functions and adds helper code for handling nans.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
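The NaN rule described in the commit above (if one input is NaN, return the other) can be illustrated with a scalar sketch. This is not the gallivm implementation, which emits vector LLVM IR; the helper names `min_other_if_nan` and `max_other_if_nan` are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <math.h>

/* Hypothetical scalar helper: smaller of a and b. If exactly one input
 * is NaN, the other input is returned. If both are NaN, NaN is returned. */
static float min_other_if_nan(float a, float b)
{
    if (isnan(a))
        return b;
    if (isnan(b))
        return a;
    return a < b ? a : b;
}

/* Hypothetical scalar helper: larger of a and b, with the same NaN rule. */
static float max_other_if_nan(float a, float b)
{
    if (isnan(a))
        return b;
    if (isnan(b))
        return a;
    return a > b ? a : b;
}
```

The standard C functions `fminf` and `fmaxf` follow the same rule for a single NaN argument, so they could stand in for these helpers in scalar code.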
46205ab8cc03cbda6bbc0c958e277f972973ebfe 12-Jul-2013 Brian Paul <brianp@vmware.com> tgsi: rename the TGSI fragment kill opcodes

TGSI_OPCODE_KIL and KILP had confusing names. The former was conditional
kill (if any src component < 0). The latter was unconditional kill.
At one time KILP was supposed to work with NV-style condition
codes/predicates but we never had that in TGSI.

This patch renames both opcodes:
TGSI_OPCODE_KIL -> KILL_IF (kill if src.xyzw < 0)
TGSI_OPCODE_KILP -> KILL (unconditional kill)

Note: I didn't just transpose the opcode names to help ensure that I
didn't miss updating any code anywhere.

I believe I've updated all the relevant code and comments but I'm
not 100% sure that some drivers had this right in the first place.
For example, the radeon driver might have llvm.AMDGPU.kill and
llvm.AMDGPU.kilp mixed up. Driver authors should review their code.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
f501baabdb5cd356faad0e419c64b2ac312c5756 12-Jul-2013 Brian Paul <brianp@vmware.com> tgsi: fix-up KILP comments

KILP is really unconditional fragment kill.

We've had KIL and KILP transposed forever. I'll fix that next.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
f3bbf65929e395360e5565d08d015977dd5b79fa 04-Jul-2013 Roland Scheidegger <sroland@vmware.com> gallivm: do per-pixel lod calculations for explicit lod

d3d10 requires per-pixel lod calculations for explicit lod, lod bias and
explicit derivatives, and we should probably do it for OpenGL too - at least
if they are used from vertex or geometry shaders (so doesn't apply to lod
bias) this doesn't just affect neighboring pixels.
Some code was already there to handle this so fix it up and enable it.
There will no doubt be a performance hit unfortunately; we could do better
if we knew we had a real vector shift instruction (with variable shift
count) but this requires AVX2 on x86 (or an AMD Bulldozer family cpu).
Don't do anything for lod bias and explicit derivatives yet, though
no special magic should be needed for them either.
Likewise, the size query is still broken just the same.

v2: Use information if lod is a (broadcast) scalar or not. The idea would be
to base this on the actual value, for now just pretend it's a scalar in fs
and not a scalar otherwise (so, per-pixel lod is only used in gs/vs but same
code is generated for fs as before).

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
e621ec816da65a56c3ba038a85075000bf5882d2 01-Jul-2013 José Fonseca <jfonseca@vmware.com> gallivm: Fix indirect immediate registers.

If reg->Register.Indirect is true then the immediate is not truly a
constant LLVM expression.

There is no performance regression in using LLVMBuildBitCast, as it will
fallback to LLVMConstBitCast internally when the argument is a constant.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Zack Rusin <zackr@vmware.com>
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
957c040eb86495da2a693c831e13342a81ac1a2e 13-Jun-2013 Roland Scheidegger <sroland@vmware.com> gallivm: (trivial) remove duplicated code block (including comment)
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
386327c48f88b052449afa4f41b1090d3fdb5ce9 07-May-2013 Zack Rusin <zackr@vmware.com> gallivm/soa: implement indirect addressing in immediates

The support is analogous to the way we handle indirect addressing
in temporaries, except that we don't have to worry about storing
(after declarations) and thus we'll be able to keep using the old
code when indirect addressing isn't used. In other words we're
still using constants directly, unless the instruction has
immediate register with indirect addressing.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
f9f57312de863dd058a595c69ef3f42a5d90bce5 27-Apr-2013 Zack Rusin <zackr@vmware.com> gallivm: fix indirect addressing of temps in soa mode

we weren't adding the soa offsets when constructing the indices
for the gather functions. That meant that we were always returning
the data in the first vertex/primitive/pixel in the SoA structure
and not correctly fetching from all structures.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
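The indexing problem fixed in the commit above can be sketched in plain C. The memory layout used here, `file[(reg * CHANNELS + chan) * LANES + lane]`, is an assumption made for illustration, as are the names `gather_indirect`, `LANES` and `CHANNELS`; the real code builds LLVM gather indices instead. The point is the trailing `+ lane` term: without it every lane reads the first element of the SoA vector.

```c
#include <assert.h>
#include <stddef.h>

enum { LANES = 4, CHANNELS = 4 };  /* hypothetical SoA width and channel count */

/* Gather one channel of an indirectly addressed register for every lane.
 * Assumed layout: file[(reg * CHANNELS + chan) * LANES + lane]. */
static void gather_indirect(const float *file, size_t num_regs,
                            const unsigned reg_index[LANES], unsigned chan,
                            float out[LANES])
{
    assert(chan < CHANNELS);
    for (unsigned lane = 0; lane < LANES; lane++) {
        assert(reg_index[lane] < num_regs);
        size_t base = ((size_t)reg_index[lane] * CHANNELS + chan) * LANES;
        /* "+ lane" is the SoA offset; omitting it always fetches lane 0. */
        out[lane] = file[base + lane];
    }
}
```

Each lane may select a different register, which is why the index has to be computed per lane rather than once per instruction.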
c0538860bf656a1796b4a5c9c136c7d3517dfba6 22-Apr-2013 José Fonseca <jfonseca@vmware.com> gallivm: Fix assignment of unsigned values to OUT register.

TEMP is not the only register file that accepts unsigned. OUT does too.

Actually, what determines the appropriate type of the destination value is
not the opcode, but rather the register.

Also clean up/simplify code. Add a few more asserts, but also make the
code more robust by handling it gracefully if an assert fails.

This fixes segfault / assertion in the included vert-uadd.sh graw shader.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
85974e5fee152c96239aa87040799a557cd789ab 20-Apr-2013 Roland Scheidegger <sroland@vmware.com> gallivm: implement switch opcode

Should be able to handle all things which make this tricky to implement.
Fallthroughs, including most notably into/out of default, should be handled
correctly but are quite a mess.
If we see largely unoptimized switches in the wild should probably think
about some "real" switch optimization pass, e.g. things like this:

switch
  case1
    someinst
    brk
  case2
  default
  case3
    someinst
    brk
  case4
    someinst
endswitch

are legal, but the pointless case2/case3 statements not only cause condition
evaluation but will turn this into a "fake" fallthrough case (because
mask and defaultmask are already updated for case2 when default is
encountered) requiring executing code twice.
If default is at the end though, there's never any code re-execution, and
if that's not the case if there's no fallthrough in (not even a fake one)
and out of default there's no code re-execution either.

v2: add comments, and use enum for break type instead of magic boolean.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
8f5d4283c0448ed2e5d2c12bb46ec70be7744a7b 19-Apr-2013 Roland Scheidegger <sroland@vmware.com> gallivm: use uint build context for mask instead of float

Unsurprisingly no one was using it except for grabbing the builder.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
107550e71a762bf1999220a81b61107384eca065 19-Apr-2013 Roland Scheidegger <sroland@vmware.com> gallivm/tgsi: fix up breakc

It seems there was a typo in gallivm breakc handling (I am actually still
not sure it is really needed but otherwise that statement really should go
away). Also fix the wrong src argument type, even though they weren't really
used.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
6e833d4d093038ccd0a44f0d7baa33ea37320abe 18-Apr-2013 José Fonseca <jfonseca@vmware.com> gallivm: Drop pos arg from lp_build_tgsi_soa.

Never used.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
cb58c79efb1402cd89504856033b6322d0096233 17-Apr-2013 Zack Rusin <zackr@vmware.com> gallivm/gs: fix indirect addressing in geometry shaders

We were always treating the vertex index as a scalar but when the
shader is using indirect addressing it will be a vector of indices
for each channel. This was causing some nasty crashes inside
LLVM.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
f01f754ca13373d62f5f4ba5ff76d83aa4eac62b 17-Apr-2013 Zack Rusin <zackr@vmware.com> draw/gs: make sure geometry shaders don't overflow

The specification says that the geometry shader should exit if the
number of emitted vertices is greater than or equal to max_output_vertices and
we can't do that because we're running in the SoA mode, which means that
our storing routines will keep getting called on channels that have
overflown (even though they will be masked out, but we just can't skip
them).
So we need some scratch area where we can keep writing the overflown
vertices without overwriting anything important or crashing.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
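One way to picture the scratch-area idea from the commit above is a buffer with one extra slot that absorbs every store past the limit. This is only a sketch under assumptions: the struct `gs_output`, the function `emit_vertex`, the limit of 4 and the choice to stop counting at the limit are all hypothetical and are not taken from the draw module.

```c
#include <assert.h>

enum { MAX_OUTPUT_VERTICES = 4 };  /* hypothetical limit for this sketch */

/* Output storage with one extra trailing slot used as scratch space. */
struct gs_output {
    float verts[MAX_OUTPUT_VERTICES + 1];
    unsigned emitted;
};

/* The store is never skipped. Once the limit is reached, stores are
 * redirected to the scratch slot so real output is not overwritten. */
static void emit_vertex(struct gs_output *out, float value)
{
    assert(out);
    unsigned slot = out->emitted < MAX_OUTPUT_VERTICES
                  ? out->emitted
                  : MAX_OUTPUT_VERTICES;   /* index of the scratch slot */
    out->verts[slot] = value;
    if (out->emitted < MAX_OUTPUT_VERTICES)
        out->emitted++;
}
```

The branch-free redirect matters in SoA execution because individual lanes cannot exit early; every lane runs the store, and only the destination differs.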
b739376cffec19870804b1ebd4bef3c2f654e943 11-Apr-2013 Zack Rusin <zackr@vmware.com> gallivm/gs: fix the end primitive calls

The issue with SOA execution and end_primitive opcode is that it
can be executed both when we haven't emitted any vertices, in
which case we don't want to emit an empty primitive, and when
the execution mask is zero and the execution should be skipped. We
handled only the latter of those conditions. Now we're combining the
execution mask with a mask created from emitted vertices to handle
both cases. As a result we don't need the pending_end_primitive
flag which was broken because it was static and could be affected
by both above mentioned conditions at run-time.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
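The mask combination described in the commit above can be sketched per lane as follows. The function name `end_primitive_mask`, the `LANES` width and the array-based interface are hypothetical; masks are assumed to be 0 or -1 (all bits set) per lane, and the real code operates on LLVM vector values.

```c
#include <assert.h>
#include <stdint.h>

enum { LANES = 4 };  /* hypothetical SoA width */

/* A lane ends a primitive only if it is currently executing AND it has
 * emitted at least one vertex since the last end_primitive. */
static void end_primitive_mask(const int32_t exec_mask[LANES],
                               const int32_t emitted_verts[LANES],
                               int32_t out_mask[LANES])
{
    for (int lane = 0; lane < LANES; lane++) {
        assert(exec_mask[lane] == 0 || exec_mask[lane] == -1);
        /* Mask created from the emitted vertex count. */
        int32_t has_verts = emitted_verts[lane] > 0 ? -1 : 0;
        out_mask[lane] = exec_mask[lane] & has_verts;
    }
}
```

Because both conditions are evaluated at run time per lane, no compile-time flag is needed to track whether a primitive is pending.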
50b3fc6204a28881f625605f988cb0866ae6a6a5 17-Apr-2013 José Fonseca <jfonseca@vmware.com> gallium: Disambiguate TGSI_OPCODE_IF.

TGSI_OPCODE_IF condition had two possible interpretations:

- src.x != 0.0f

  - Mesa statetracker when PIPE_SHADER_CAP_INTEGERS was false for either
    vertex or fragment shaders
  - gallivm/llvmpipe
  - postprocess
  - vl state tracker
  - vega state tracker
  - most old drivers
  - old internal state trackers
  - many graw examples

- src.x != 0U

  - Mesa statetracker when PIPE_SHADER_CAP_INTEGERS was true for both
    vertex and fragment shaders
  - tgsi_exec/softpipe
  - r600
  - radeonsi
  - nv50

And drivers that use draw module also were a mess (because Mesa would
emit float IFs, but draw module supports native integers so it would
interpret IF arg as integers...)

This sort of works if the source argument is limited to float +0.0f or
+1.0f, integer 0, but would fail if source is float -0.0f, or integer in
the float NaN range. It could also fail if source is integer 1, and
hardware flushes denormalized numbers to zero.

But with this change there are now two opcodes, IF and UIF, with clear
meaning.

Drivers that do not support native integers do not need to worry about
UIF. However, for backwards compatibility with old state trackers and
examples, it is advisable that native integer capable drivers also
support the float IF opcode.

I tried to implement this for r600 and radeonsi based on the surrounding
code. I couldn't do this for nouveau, so I just shunted IF/UIF
together, which matches the current behavior.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Marek Olšák <maraeo@gmail.com>

v2:
- Incorporate Roland's feedback.
- Fix r600_shader.c merge conflict.
- Fix typo in radeon, spotted by Michel Dänzer.
- Incorporate Christoph Bumiller's patch to handle TGSI_OPCODE_IF(float)
properly in nv50/ir.
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
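The ambiguity described in the commit above can be shown by evaluating both interpretations on the same 32-bit pattern. This is an illustrative sketch, not driver code: `cond_float` and `cond_uint` are hypothetical names for the float (IF) and unsigned (UIF) tests, and IEEE-754 single precision is assumed.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Float interpretation: the condition is src.x != 0.0f. */
static int cond_float(uint32_t bits)
{
    float f;
    assert(sizeof f == sizeof bits);
    memcpy(&f, &bits, sizeof f);   /* reinterpret the bits as a float */
    return f != 0.0f;
}

/* Unsigned interpretation: the condition is src.x != 0U. */
static int cond_uint(uint32_t bits)
{
    return bits != 0u;
}
```

For the pattern 0x80000000 (float -0.0f) the two tests disagree: the float comparison treats it as zero while the integer comparison does not. For 0x00000000 and 0x3f800000 (float 1.0f) they agree.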
f61b7da80e238892b0832ec12b11589fba946b47 13-Apr-2013 José Fonseca <jfonseca@vmware.com> gallium: Eliminate TGSI_OPCODE_IFC.

Never used or implemented.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
fe29f99293cb3bbc834f4d4d65e87ac7c734615d 09-Apr-2013 Zack Rusin <zackr@vmware.com> gallivm/tgsi: handle untyped moves

both mov and ucmp can be used to move variables of any type.
correctly note that about ucmp in the tgsi_info and make
sure gallivm can handle that by correctly casting the untyped
moves.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
d56f2d52675397610717875c4a2a5edb04e2c997 09-Apr-2013 Zack Rusin <zackr@vmware.com> gallivm: fix loops and conditionals within GS

We were using simple temporaries, without using alloca or phi
nodes which meant that on every iteration of the loop our
temporaries, which were holding the number of vertices and
primitives which were emitted, were being reset to zero. Now
we're using alloca to allocate those variables to preserve
them across conditionals.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
1ad4a4eeb3fe83ce3ce7336250d725bf0a28de7b 05-Apr-2013 Zack Rusin <zackr@vmware.com> gallivm: fix breakc

We break when the mask values are 0, not 1; plus it's a bit comparison,
not a floating point comparison. This fixes both.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
ce5096a0a959b97f70c3df46a35bfe694e8c349c 04-Apr-2013 Roland Scheidegger <sroland@vmware.com> gallivm: honor explicit derivatives values for cube maps.

This is trivial now, though need to make sure we pass all the necessary
derivative values (which is 3 each for ddx/ddy not 2).
Passes piglit arb_shader_texture_lod-texgradcube test.

v2: add the forgotten abs() for all incoming derivatives (discovered
by new piglit arb_shader_texture_lod-texgradcube test, though more by
luck as it was failing only for exactly one pixel...).

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
d8543bd7528de05e5ce3ac407838e7500428a93d 29-Mar-2013 Zack Rusin <zackr@vmware.com> draw: Implement support for primitive id

We were largely ignoring primitive id.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
f313b0c8502868dab2a87237af295a34ec0dea26 27-Mar-2013 Zack Rusin <zackr@vmware.com> gallivm: cleanup the gs interface

Instead of void pointers use a base interface.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
e26d5940ff48595a0f1c1d440ee0141df54e3f51 01-Apr-2013 Adam Jackson <ajax@redhat.com> gallivm: Minor comment cleanup

Signed-off-by: Adam Jackson <ajax@redhat.com>
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
f20f981553ede032b72988c10189be7bc2cc6bda 27-Mar-2013 Zack Rusin <zackr@vmware.com> gallivm: Implement the breakc instruction

Required by more modern examples. Like BRK but with a condition.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
b66ffcf2f8a0128497d1e0afed0416a4aa4a14be 27-Mar-2013 Zack Rusin <zackr@vmware.com> gallivm: implement implicit primitive flushing

TGSI semantics currently require an implicit endprim at the end
of GS if an ending primitive hasn't been emitted.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
e96f4e3b853ff5fe4d927c69695c0b5f1966d448 18-Feb-2013 Zack Rusin <zackr@vmware.com> gallium/llvm: implement geometry shaders in the llvm paths

This commit implements code generation of the geometry shaders in
the SOA paths. All the code is there but bugs are likely present.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
5af7b45986d1b56c568ebe9c3a40d48853e2e9ff 16-Mar-2013 Roland Scheidegger <sroland@vmware.com> gallivm: fix return opcode handling in main function of a shader

If we're in some conditional or loop we must not return, or the code
after the condition is never executed.
(v2): And, we also can't just continue as nothing happened, since the
mask update code would later check if we actually have a mask, so we
need to remember that there was a return in main where we didn't exit
(to illustrate this, a ret in an if clause would cause a mask update
which is still ok as we're in a conditional, but after the endif the
mask update code would drop the mask hence bringing execution back to
pixels which should have their execution mask set to zero by the ret).
Thanks to Christoph Bumiller for figuring this out.

This fixes https://bugs.freedesktop.org/show_bug.cgi?id=62357.

Note: This is a candidate for the stable branches.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
21190fbd56ec2f12dc5a1bf1d9fc32d507e8f0a3 07-Mar-2013 Christian König <christian.koenig@amd.com> tgsi: use separate structure for indirect address v2

To further improve the optimization of source and destination
indirect addressing we need the ability to store a reference
to the declaration of the addressed operands.

Since most of the fields in tgsi_src_register don't apply to
an indirect addressing operand, replace it with a separate
tgsi_ind_register structure and so make room for extra information.

v2: rename Declaration to ArrayID, put the ArrayID into () instead of []

Signed-off-by: Christian König <christian.koenig@amd.com>
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
5c41d1c22282fe2fd72a77339246de8e861b4b22 09-Mar-2013 Roland Scheidegger <sroland@vmware.com> gallivm: clean up passing derivatives around

Previously, the derivatives were calculated and passed in a packed form
to the sample code (for implicit derivatives, explicit derivatives were
packed to the same format).
There are several reasons why this wasn't such a good idea:
1) the derivatives may not even be needed (not as bad as it sounds since
llvm will just throw the calculations needed for them away but still)
2) the special packing format really shouldn't be part of the sampler
interface
3) depending what the sample code actually does the derivatives will
be processed differently, hence there is no "ideal" packing. For cube
maps with explicit derivatives (which we don't do yet) for instance the
packing looked downright useless, and for non-isotropic filtering we'd
need different calculations too.

So, instead just pass the derivatives as is (for explicit derivatives),
or let the rho calculating sample code calculate them itself. This still
does exactly the same packing stuff for implicit derivatives for now,
though explicit ones are handled in a more straightforward manner (quick
estimates show performance should be quite similar, though it is much
easier to follow and also does the rho calculation per-pixel until the
end, which we eventually need for spec compliance anyway).

No piglit changes.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
83f7cde1821d8e004d49fb966a323c037631b9a2 19-Feb-2013 Roland Scheidegger <sroland@vmware.com> gallivm: fix indirect src register fetches requiring bitcast

For constant and temporary register fetches, the bitcasts weren't done
correctly for the indirect case, leading to crashes due to type mismatches.
Simply do the bitcasts after fetching (much simpler than fixing up the load
pointer for the various cases).

This fixes https://bugs.freedesktop.org/show_bug.cgi?id=61036

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
f1ab67c13ab97f19c08d99c6ba101edc7d7b80e6 15-Feb-2013 Roland Scheidegger <sroland@vmware.com> gallivm/tgsi: fix issues with sample opcodes

We need to encode them as Texture instructions since the NumOffsets field
is encoded there. However, we don't encode the actual target in there, this
is derived from the sampler view src later.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
427d36a22741890a7ce55b6b5bcd40fd4bdd2d35 12-Feb-2013 Roland Scheidegger <sroland@vmware.com> gallium: fix tgsi SAMPLE_L opcode to use separate source for explicit lod

It looks like using coord.w as explicit lod value is a mistake, most likely
because some dx10 docs had it specified that way. Seems this was changed though:
http://msdn.microsoft.com/en-us/library/windows/desktop/hh447229%28v=vs.85%29.aspx
- let's just hope it doesn't depend on runtime build version or something.
Not only would this need translation (so go against the stated goal these
opcodes should be close to dx10 semantics) but it would prevent usage of this
opcode with cube arrays, which is apparently possible:
http://msdn.microsoft.com/en-us/library/windows/desktop/bb509699%28v=vs.85%29.aspx
(Note not only does this show cube arrays using explicit lod, but also the
confusion with this opcode: it lists an explicit lod parameter value, but then
states last component of location is used as lod).
(For "true" hw drivers, only nv50 had code to handle it, and it appears the
code was already right for the new semantics, though fix up the seemingly
wrong c/d arguments while there.)

v2: fix comment, separate out other changes.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
614982d320985c04e247293b54b66d7df5c19004 05-Feb-2013 Roland Scheidegger <sroland@vmware.com> gallivm: fix up size queries for dx10 sviewinfo opcode

Need to calculate the number of mip levels (if it would be worthwhile could
store it in dynamic state).
While here, the query code also used chan 2 for the lod value.
This worked with mesa state tracker but it seems safer to use chan 0.
Still passes piglit textureSize (with some handwaving), though the non-GL
parts are (largely) untested.

v2: clarify and expect the sviewinfo opcode to return ints, not floats,
just like the OpenGL textureSize (dx10 supports dst modifiers with resinfo).
Also simplify some code.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
0a8043bb766f9c38af559438b646c3659614b103 01-Feb-2013 Roland Scheidegger <sroland@vmware.com> gallivm: hook up dx10 sampling opcodes

They are similar to old-style tex opcodes but with separate sampler and
texture units (and other arguments in different places).
Also adjust the debug tgsi dump code.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
c789b981b244333cfc903bcd1e2fefc010500013 28-Jan-2013 Roland Scheidegger <sroland@vmware.com> gallivm: split sampler and texture state

Split the sampler interface to use separate sampler and texture (sampler_view)
state. This is needed to support dx10-style sampling instructions.
This is not quite complete since both draw/llvmpipe don't really track
textures/samplers independently yet, as well as the gallivm code not quite
using the right sampler or texture index respectively (but it should work
for the sampling codes used by opengl).
We are however losing some optimizations in the process, apply_max_lod will
no longer work, and we potentially could end up with more (unnecessary)
recompiles (if switching textures with/without mipmaps only so it shouldn't
be too bad).

v2: don't use different callback structs for sampler/sampler view functions
(which just complicates things), fix up sampling code to actually use the
right texture or sampler index, and similar for llvmpipe/draw actually
distinguish between samplers and sampler views.

v3: fix more of PIPE_MAX_SAMPLER / PIPE_MAX_SHADER_SAMPLER_VIEWS mismatches
(both in draw and llvmpipe), based on feedback from José get rid of unneeded
static sampler derived state (which also fixes the only 2 piglit regressions
due to a forgotten assignment), fix comments based on Brian's feedback.

v4: remove some accidental unrelated whitespace changes

Reviewed-by: José Fonseca <jfonseca@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
3b7ce726258de20be2d65f7b9f51b160dd99638a 04-Dec-2012 José Fonseca <jfonseca@vmware.com> gallivm: Allow indirection from TEMP registers too.

The ADDR file is cumbersome for native integer capable drivers. We
should consider deprecating it eventually, but this just adds support
for indirection from TEMP registers.

Reviewed-by: Brian Paul <brianp@vmware.com>
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
1d35f77228ad540a551a8e09e062b764a6e31f5e 04-Dec-2012 José Fonseca <jfonseca@vmware.com> gallivm,llvmpipe,draw: Support multiple constant buffers.

Support 16 (defined in LP_MAX_TGSI_CONST_BUFFERS) as opposed to 32 (as
defined by PIPE_MAX_CONSTANT_BUFFERS) because that would make the jit
context become unnecessarily large.

v2: Bump limit from 4 to 16 to cover ARB_uniform_buffer_object needs,
per Dave Airlie.

Reviewed-by: Brian Paul <brianp@vmware.com>
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
b5918d8f1d0d963e58ede1f82ecd836a815cfd89 29-Nov-2012 Roland Scheidegger <sroland@vmware.com> gallivm: fix a trivial txq issue for 2d shadow and cube shadow samplers

untested (couldn't get the piglit test to run even with version overrides)
but seemed blatantly wrong.
In any case it would only affect an error case which when it would happen
probably all hope is lost anyway.

Reviewed-by: José Fonseca <jfonseca@vmware.com>
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
95e03914d82f4a3722cda00cd6eda54a6f328a73 29-Nov-2012 Roland Scheidegger <sroland@vmware.com> gallivm: support array textures

Support 1d and 2d array textures (including shadow samplers),
and (as a side effect mostly) also shadow cube samplers.
Seems to pass the relevant piglit tests both for sampling and rendering
to (though some require version overrides).
Since we don't support render target indices rendering to array textures
is still restricted to a single layer at a time.
Also, the min/max layer in the sampler view (which is unnecessary for GL)
is ignored (always use all layers).

Reviewed-by: José Fonseca <jfonseca@vmware.com>
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
0b6554ba6f2aa8a771852566340c24205e406d02 27-Nov-2012 Roland Scheidegger <sroland@vmware.com> gallivm,llvmpipe: handle TXF (texelFetch) instruction, including offsets

This also adds some code to handle per-quad lods for more than 4-wide fetches,
because otherwise I'd have to integrate the texelFetch function into
the splitting stuff... (but it is not used yet outside texelFetch).
passes piglit fs-texelFetch-2D, fails fs-texelFetchOffset-2D due to I believe
a test error (results are undefined for out-of-bounds fetches, we return
whatever is at offset 0, whereas the test expects [0,0,0,1]).
Texel offsets are only handled by texelFetch for now, though the interface
can handle it for everything.

Reviewed-by: José Fonseca <jfonseca@vmware.com>
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
3469715a8a171512cf9b528702e70393f01c6041 13-Jul-2012 José Fonseca <jfonseca@vmware.com> gallivm,draw,llvmpipe: Support wider native registers.

Squashed commit of the following:

commit 7acb7b4f60dc505af3dd00dcff744f80315d5b0e
Author: José Fonseca <jfonseca@vmware.com>
Date: Mon Jul 9 17:46:31 2012 +0100

draw: Don't use dynamically sized arrays.

Not supported by MSVC.

commit 5810c28c83647612cb372d1e763fd9d7780df3cb
Author: José Fonseca <jfonseca@vmware.com>
Date: Mon Jul 9 17:44:16 2012 +0100

gallivm,llvmpipe: Don't use expressions with PIPE_ALIGN_VAR().

MSVC doesn't accept expressions in _declspec(align(...)). Use a
define instead.

commit 8aafd1457ba572a02b289b3f3411e99a3c056072
Author: José Fonseca <jfonseca@vmware.com>
Date: Mon Jul 9 17:41:56 2012 +0100

gallium/util: Make u_cpu_detect.h header C++ safe.

commit 5795248350771f899cfbfc1a3a58f1835eb2671d
Author: José Fonseca <jfonseca@vmware.com>
Date: Mon Jul 2 12:08:01 2012 +0100

gallium/util: Add ULL suffix to large constants.

As suggested by Andy Furniss: it looks like some old gcc versions
require it.

commit 4c66c22727eff92226544c7d43c4eb94de359e10
Author: José Fonseca <jfonseca@vmware.com>
Date: Fri Jun 29 13:39:07 2012 +0100

gallium/util: Truly disable INF/NAN tests on MSVC.

Thanks to Brian for spotting this.

commit 8bce274c7fad578d7eb656d9a1413f5c0844c94e
Author: José Fonseca <jfonseca@vmware.com>
Date: Fri Jun 29 13:39:07 2012 +0100

gallium/util: Disable INF/NAN tests on MSVC.

Somehow they are not recognized as constants.

commit 6868649cff8d7fd2e2579c28d0b74ef6dd4f9716
Author: Roland Scheidegger <sroland@vmware.com>
Date: Thu Jul 5 15:05:24 2012 +0200

gallivm: Cleanup the 2 x 8 float -> 16 ub special path in lp_build_conv.

No behaviour change intended, like 7b98455fb40c2df84cfd3cdb1eb7650f67c8a751.

Reviewed-by: José Fonseca <jfonseca@vmware.com>

commit 5147a0949c4407e8bce9e41d9859314b4a9ccf77
Author: Roland Scheidegger <sroland@vmware.com>
Date: Thu Jul 5 14:28:19 2012 +0200

gallivm: (trivial) fix issues with multiple-of-4 texture fetch

Some formats can't handle non-multiple of 4 fetches I believe, but
everything must support length 1 and multiples of 4.
So avoid going to scalar fetch (which is very costly) just because length
isn't 4.
Also extend the hack to not use shift with variable count for yuv formats to
arbitrary length (larger than 1) - doesn't matter how many elements we
have we always want to avoid it unless we have variable shift count
instruction (which we should get with avx2).

Reviewed-by: José Fonseca <jfonseca@vmware.com>

commit 87ebcb1bd71fa4c739451ec8ca89a7f29b168c08
Author: Roland Scheidegger <sroland@vmware.com>
Date: Wed Jul 4 02:09:55 2012 +0200

gallivm: (trivial) fix typo for wrap repeat mode in linear filtering aos code

This would lead to bogus coordinates at the edges.
(undetected by piglit because this path is only taken for block-based
formats).

Signed-off-by: José Fonseca <jfonseca@vmware.com>

commit 3a42717101b1619874c8932a580c0b9e6896b557
Author: José Fonseca <jfonseca@vmware.com>
Date: Tue Jul 3 19:42:49 2012 +0100

gallivm: Fix TGSI integer translation with AVX.

commit d71ff104085c196b16426081098fb0bde128ce4f
Author: José Fonseca <jfonseca@vmware.com>
Date: Fri Jun 29 15:17:41 2012 +0100

llvmpipe: Fix LLVM JIT linear path.

It was not working properly because it was looking at the JIT function
before it was actually compiled.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>

commit a94df0386213e1f5f9a6ed470c535f9688ec0a1b
Author: José Fonseca <jfonseca@vmware.com>
Date: Thu Jun 28 18:07:10 2012 +0100

gallivm: Refactor lp_build_broadcast(_scalar) to share code.

Doesn't really change the generated assembly, but produces more compact IR,
and of course, makes code more consistent.

Reviewed-by: Brian Paul <brianp@vmware.com>

commit 66712ba2731fc029fa246d4fc477d61ab785edb5
Author: José Fonseca <jfonseca@vmware.com>
Date: Wed Jun 27 17:30:13 2012 +0100

gallivm: Make LLVMContextRef a singleton.

There are many places inside LLVM that depend on it. Too many to attempt
to fix.

Reviewed-by: Brian Paul <brianp@vmware.com>

commit ff5fb7897495ac263f0b069370fab701b70dccef
Author: Roland Scheidegger <sroland@vmware.com>
Date: Thu Jun 28 18:15:27 2012 +0200

gallivm: don't use 8-wide texture fetch in aos path

This appears to be a slight loss usually.
There are probably several reasons for that:
- fetching itself is scalar
- filtering is pure int code hence needs splitting anyway, same
for the final texel offset calculations
- texture wrap related code, which can be done 8-wide, is slightly more
complex with floats (with clamp_to_edge) and float operations generally
more costly hence probably not much faster overall
- the code needed to split when encountering different mip levels for the
quads, adding complexity
So, just split always for aos path (but leave it 8-wide for soa, since we
do 8-wide filtering there when possible).
This should certainly be revisited if we'd have avx2 support.

Reviewed-by: José Fonseca <jfonseca@vmware.com>

commit ce8032b43dcd8e8d816cbab6428f54b0798f945d
Author: Roland Scheidegger <sroland@vmware.com>
Date: Wed Jun 27 18:41:19 2012 +0200

gallivm: (trivial) don't extract fparts variable if not needed

Did not have any consequences but unnecessary.

commit aaa9aaed8f80dc282492f62aa583a7ee23a4c6d5
Author: Roland Scheidegger <sroland@vmware.com>
Date: Wed Jun 27 18:09:06 2012 +0200

gallivm: fix precision issue in aos linear int wrap code

now not just passes at a quick glance but also with piglit...
If we do the wrapping with floats, we also need to set the
weights accordingly. We can potentially end up with different
(integer) coordinates than what the integer calculations would
have chosen, which means the integer weights calculated previously
in this case are completely wrong. Well at least that's what I think
happens, at least recalculating the weights helps.
(Some day really should refactor all the wrapping, so we do whatever is
fastest independent of 16bit int aos or 32bit float soa filtering.)

Reviewed-by: José Fonseca <jfonseca@vmware.com>

commit fd6f18588ced7ac8e081892f3bab2916623ad7a2
Author: José Fonseca <jfonseca@vmware.com>
Date: Wed Jun 27 11:15:53 2012 +0100

gallium/util: Fix parsing of options with underscore.

For example

GALLIVM_DEBUG=no_brilinear

which was being parsed as two options, "no" and "brilinear".
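
The behaviour described above can be sketched in plain C as follows. This is only an illustration, not the code from gallium/util: the helper name has_option and the rule that a token is a maximal run of alphanumeric characters or underscores are assumptions made for the example.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper (not Mesa's actual implementation): returns true if
 * `name` appears as a whole token in `str`. A token is a maximal run of
 * alphanumeric characters or underscores, so "no_brilinear" is one token
 * rather than "no" followed by "brilinear". */
static bool
has_option(const char *str, const char *name)
{
   assert(str && name);
   size_t name_len = strlen(name);
   const char *p = str;

   while (*p) {
      /* Skip separators such as ',' or ' '. */
      while (*p && !(isalnum((unsigned char)*p) || *p == '_'))
         p++;

      /* Consume one token; the underscore counts as part of the name. */
      const char *start = p;
      while (isalnum((unsigned char)*p) || *p == '_')
         p++;

      size_t len = (size_t)(p - start);
      if (len > 0 && len == name_len && memcmp(start, name, len) == 0)
         return true;
   }
   return false;
}
```

With this rule, "no_brilinear" matches only the full option name and never the fragments "no" or "brilinear".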

commit 09a8f809088178a03e49e409fa18f1ac89561837
Author: James Benton <jbenton@vmware.com>
Date: Tue Jun 26 15:00:14 2012 +0100

gallivm: Added a generic lp_build_print_value which prints a LLVMValueRef.

Updated lp_build_printf to share common code.
Removed specific lp_build_print_vecX.

Reviewed-by: José Fonseca <jfonseca@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>

commit e59bdcc2c075931bfba2a84967a5ecd1dedd6eb0
Author: José Fonseca <jfonseca@vmware.com>
Date: Wed May 16 15:00:23 2012 +0100

draw,llvmpipe: Avoid named struct types on LLVM 3.0 and later.

Starting with LLVM 3.0, named structures are meant not for debugging, but
for recursive data types, previously also known as opaque types.

The recursive nature of these types leads to several memory management
difficulties. Given that we don't actually need recursive types, avoid
them altogether.

This is an attempt to address fdo bugs 41791 and 44466. The issue is
somewhat random so there's no easy way to check how effective this is.

Cherry-picked from 9af1ba565dfd5cef9ee938bb7c04767d14878fbf

commit df6070f618a203c7a876d984c847cde4cbc26bdb
Author: Roland Scheidegger <sroland@vmware.com>
Date: Wed Jun 27 14:42:53 2012 +0200

gallivm: (trivial) fix typo in faster aos linear int wrap code

no longer crashes, now REALLY tested.

commit d8f98dce452c867214e6782e86dc08562643c862
Author: Roland Scheidegger <sroland@vmware.com>
Date: Tue Jun 26 18:20:58 2012 +0200

llvmpipe: (trivial) remove bogus optimization for float aos repeat wrap

This optimization for nearest filtering on the linear path generated
likely bogus results, and the int path didn't have any optimizations
there since the only shader using force_nearest apparently uses
clamp_to_edge not repeat wrap anyway.

Reviewed-by: José Fonseca <jfonseca@vmware.com>

commit c4e271a0631087c795e756a5bb6b046043b5099d
Author: Roland Scheidegger <sroland@vmware.com>
Date: Tue Jun 26 23:01:52 2012 +0200

gallivm: faster repeat wrap for linear aos path too

Even if we already have scaled integer coords, it's way faster to use
the original float coord (plus some conversions) rather than use URem.
The choice of what to do for texture wrapping is not really tied to int
aos or float soa filtering though for some modes there can be some gains
(because of easier weight calculations).

Reviewed-by: José Fonseca <jfonseca@vmware.com>

commit 1174a75b1806e92aee4264ffe0ffe7e70abbbfa3
Author: Roland Scheidegger <sroland@vmware.com>
Date: Tue Jun 26 14:39:22 2012 +0200

gallivm: improve npot tex wrap repeat in linear soa path

URem gets translated into series of scalar divisions so
just about anything else is faster.

Reviewed-by: José Fonseca <jfonseca@vmware.com>

commit f849ffaa499ed96fa0efd3594fce255c7f22891b
Author: Roland Scheidegger <sroland@vmware.com>
Date: Tue Jun 26 00:40:35 2012 +0100

gallivm: (trivial) fix near-invisible shift-space typo

I blame the keyboard.

commit 5298a0b19fe672aebeb70964c0797d5921b51cf0
Author: Roland Scheidegger <sroland@vmware.com>
Date: Mon Jun 25 16:24:28 2012 +0200

gallivm: add new intrinsic helper to deal with arbitrary vector length

This helper will split vectors which are too large for the hw, or expand
them if they are too small, so a caller of a function using intrinsics which
uses such sizes need not split (or expand) the vectors manually and the
function will still use the intrinsic instead of dropping back to generic
llvm code. It can also accept scalars for use with pseudo-vector intrinsics
(only useful for float arguments, all x86 scalar simd float intrinsics use
4vf32).
Only used for lp_build_min/max() for now (also added the scalar float case
for these while there). (Other basic binary functions could use it easily,
whereas functions with a different interface would need different helpers.)
Expanding vectors isn't widely used, because we always try to use
build contexts with native hw vector sizes. But it might (or not) be nicer
if this wouldn't need to be done, the generated code should in theory stay
the same (it does get hit by lp_build_rho though already since we
didn't have an intrinsic for the scalar lp_build_max case before).

v2: incorporated Brian's feedback, and also made the scalar min/max case work
instead of crash (all scalar simd float intrinsics take 4vf32 as argument,
probably the reason why it wasn't used before).
Moved to lp_bld_intr based on José's request, and passing intrinsic size
instead of length.
Ideally we'd derive the source type info from the passed in llvm value refs
and process some llvmtype return type so we could handle intrinsics where
the source and destination type isn't the same (like float/int conversions,
packing instructions) but that's a bit too complicated for now.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>

commit 01aa760b99ec0b2dc8ce57a43650e83f8c1becdf
Author: Roland Scheidegger <sroland@vmware.com>
Date: Mon Jun 25 16:19:18 2012 +0200

gallivm: (trivial) increase max code size for shader disassembly

64kB was just short of what I needed (which caused a crash) hence
increase to 96kB (should probably be smarter about that).

commit 74aa739138d981311ce13076388382b5e89c6562
Author: Roland Scheidegger <sroland@vmware.com>
Date: Mon Jun 25 11:53:29 2012 +0100

gallivm: simplify aos float tex wrap repeat nearest

just handle pot and npot the same. The previous pot handling
ended up with exactly the same instructions plus 2 more (leave it
in the soa path though since it is probably still cheaper there).
While here also fix an issue which would cause a crash after an assert.

commit 0e1e755645e9e49cfaa2025191e3245ccd723564
Author: Roland Scheidegger <sroland@vmware.com>
Date: Mon Jun 25 11:29:24 2012 +0100

gallivm: (trivial) skip floor rounding in ifloor when not signed

This was only done for the non-sse41 case before, but even with
sse41 this is obviously unnecessary (some callers already call
itrunc in this case anyway but some might not).

commit 7f01a62f27dcb1d52597b24825931e88bae76f33
Author: Roland Scheidegger <sroland@vmware.com>
Date: Mon Jun 25 11:23:12 2012 +0100

gallivm: (trivial) fix bogus comments

commit 5c85be25fd82e28490274c468ce7f3e6e8c1d416
Author: José Fonseca <jfonseca@vmware.com>
Date: Wed Jun 20 11:51:57 2012 +0100

translate: Free elt8_func/elt16_func too.

These were leaking.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>

commit 0ad498f36fb6f7458c7cffa73b6598adceee0a6c
Author: Roland Scheidegger <sroland@vmware.com>
Date: Tue Jun 19 15:55:34 2012 +0200

gallivm: fix bug for tex wrap repeat with linear sampling in aos float path

The comparison needs to be against length not length_minus_one, otherwise
the max texel is never chosen (for the second coordinate).

Fixes piglit texwrap-1D-npot-proj (and 2D/3D versions).

Reviewed-by: José Fonseca <jfonseca@vmware.com>
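
A scalar model of the comparison described in the commit above, written as a hedged sketch: the real code emits vectorised LLVM IR, and both function names here are hypothetical. It assumes the second texel index is the first index plus one, wrapped back to 0 when it runs past the texture.

```c
#include <assert.h>

/* Hypothetical scalar model: given the first texel index x0 in
 * [0, length - 1], return the second texel index used for linear
 * filtering under repeat wrap. */
static int
wrap_repeat_second_texel(int x0, int length)
{
   assert(length > 0 && x0 >= 0 && x0 < length);
   int x1 = x0 + 1;
   /* Compare against length: only an index past the last texel wraps. */
   return (x1 < length) ? x1 : 0;
}

/* The faulty form, kept for contrast: comparing against length - 1
 * wraps one texel too early, so index length - 1 is never returned. */
static int
wrap_repeat_second_texel_buggy(int x0, int length)
{
   assert(length > 0 && x0 >= 0 && x0 < length);
   int length_minus_one = length - 1;
   int x1 = x0 + 1;
   return (x1 < length_minus_one) ? x1 : 0;
}
```

For a texture of length 5 and x0 = 3, the first function returns 4 (the last texel), while the faulty variant returns 0.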

commit d1ad65937c5b76407dc2499b7b774ab59341209e
Author: Roland Scheidegger <sroland@vmware.com>
Date: Tue Jun 19 16:13:43 2012 +0200

gallivm: simplify soa tex wrap repeat with npot textures and no mip filtering

Similar to what is already done in aos sampling for the float path (but not
the int path since we don't get normalized float coordinates there).
URem is expensive and the calculation is done trivially with
normalized floats instead (at least with sse41-capable cpus).
(Some day should probably do the same for the mip filter path but it's much
more complicated there hence the gain is smaller.)

Reviewed-by: José Fonseca <jfonseca@vmware.com>
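
The trade-off in the commit above can be illustrated with a scalar sketch, assuming nearest filtering and a normalized coordinate s. Both helpers are hypothetical and exist only for this example; the actual gallivm code builds vector IR rather than scalar C.

```c
#include <assert.h>

/* Integer floor of a float without libm; assumes s fits in an int. */
static int
ifloor_scalar(float s)
{
   int i = (int)s;              /* truncates toward zero */
   return ((float)i > s) ? i - 1 : i;
}

/* Repeat wrap using the normalized float coordinate: take the fractional
 * part of s, then scale by the texture size. No integer remainder needed. */
static int
wrap_repeat_float(float s, int size)
{
   assert(size > 0);
   float frac = s - (float)ifloor_scalar(s);   /* nominally in [0, 1) */
   int texel = (int)(frac * (float)size);
   /* Guard against rounding pushing the result to size. */
   return (texel >= size) ? size - 1 : texel;
}

/* Reference form using an integer remainder on the scaled coordinate,
 * which is the operation the commit describes as expensive for vectors. */
static int
wrap_repeat_rem(float s, int size)
{
   assert(size > 0);
   int r = ifloor_scalar(s * (float)size) % size;
   return (r < 0) ? r + size : r;
}
```

For coordinates that are exactly representable, both forms pick the same texel; the float form simply avoids the remainder, which works for non-power-of-two sizes as well.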

commit e1e23f57ba9b910295c306d148f15643acc3fc83
Author: Roland Scheidegger <sroland@vmware.com>
Date: Mon Jun 18 20:38:56 2012 +0200

llvmpipe: (trivial) remove duplicated function declaration

Reviewed-by: José Fonseca <jfonseca@vmware.com>

commit 07ca57eb09e04c48a157733255427ef5de620861
Author: Roland Scheidegger <sroland@vmware.com>
Date: Mon Jun 18 20:37:34 2012 +0200

llvmpipe: destroy setup variants on context destruction

lp_delete_setup_variants() used to be called in garbage collection,
but this no longer exists hence the setup shaders never got freed.

Reviewed-by: José Fonseca <jfonseca@vmware.com>

commit ed0003c633859a45f9963a479f4c15ae0ef1dca3
Author: Roland Scheidegger <sroland@vmware.com>
Date: Mon Jun 18 16:25:29 2012 +0100

gallivm: handle different ilod parts for multiple quad sampling

This fixes filtering when the integer part of the lod is not the same
for all quads. I'm not fully convinced of that solution yet as it just
splits the vector if the levels to be sampled from are different.
But otherwise we'd need to do things like some minify steps, and getting
mip level base address separately anyway hence it wouldn't really look
like much of a win (and making the code even more complex).
This should now give identical results to single quad sampling.

commit 8580ac4cfc43a64df55e84ac71ce1a774d33c0d2
Author: Roland Scheidegger <sroland@vmware.com>
Date: Thu Jun 14 18:14:47 2012 +0200

gallivm: de-duplicate sample code common to soa and aos sampling

There doesn't seem to be any reason why this code dealing with cube face
selection, lod and mip level calculation is separate in aos and
soa sampling, and I am sick of having to change it in both places.

commit fb541e5f957408ce305b272100196f1e12e5b1e8
Author: Roland Scheidegger <sroland@vmware.com>
Date: Thu Jun 14 18:15:41 2012 +0200

gallivm: do mip filtering with per quad lod_fpart

This gives better results for mip filtering, though the generated code might
not be optimal. For now it also creates some artifacts if the lod_ipart isn't
the same for all quads, since instead of using the same mip weight for all
quads as previously (which just caused non-smooth gradients) this now will
use the right weights but with the wrong mip level in this case (can easily
be seen with things like texfilt, mipmap_tunnel).
v2: use logic helper suggested by José, and fix issue with negative lod_fpart
values

commit f1cc84eef7d826a20fab6cd8ccef9a275ff78967
Author: Roland Scheidegger <sroland@vmware.com>
Date: Wed Jun 13 18:35:25 2012 +0200

gallivm: (trivial) fix bogus assert in lp_build_unpack_broadcast_aos_scalars

commit 7c17dbae8ae290df9ce0f50781a09e8ed640c044
Author: James Benton <jbenton@vmware.com>
Date: Tue Jun 12 12:11:14 2012 +0100

util: Reimplement half <-> float conversions.

Removed u_half.py used to generate the table for previous method.

Previous implementation of float to half conversion was faulty for
denormalised and NaNs and would require extra logic to fix,
thus making the speedup of using tables irrelevant.

commit 7762f59274070e1dd4b546f5cb431c2eb71ae5c3
Author: James Benton <jbenton@vmware.com>
Date: Tue Jun 12 12:12:16 2012 +0100

tests: Updated tests to properly handle NaN for half floats.

commit fa94c135aea5911fd93d5dfb6e6f157fb40dce5e
Author: Roland Scheidegger <sroland@vmware.com>
Date: Mon Jun 11 18:33:10 2012 +0200

gallivm: do mip level calculations per quad

This is the final piece which shouldn't change the rendering output yet.

Reviewed-by: José Fonseca <jfonseca@vmware.com>

commit 23cbeaddfe03c09ca18c45d28955515317ffcf4c
Author: Roland Scheidegger <sroland@vmware.com>
Date: Sat Jun 9 00:54:21 2012 +0200

gallivm: do per-quad cube face selection

Doesn't quite fix the piglit cubemap test (not sure why actually)
but doing per-quad face selection is doing the right thing and
definitely an improvement.

Reviewed-by: José Fonseca <jfonseca@vmware.com>

commit abfb372b3702ac97ac8b5aa80ad1b94a2cc39d33
Author: Roland Scheidegger <sroland@vmware.com>
Date: Mon Jun 11 18:22:59 2012 +0200

gallivm: do all lod calculations per quad

Still no functional change but lod is now converted to scalar after
lod calculations.

Reviewed-by: José Fonseca <jfonseca@vmware.com>

commit 519368632747ae03feb5bca9c655eccbc5b751b4
Author: James Benton <jbenton@vmware.com>
Date: Tue May 22 16:46:10 2012 +0100

gallivm: Added support for half-float to float conversion in lp_build_conv.

Updated various utility functions to support this change.

commit 135b4d683a4c95f7577ba27b9bffa4a6fbd2c2e7
Author: James Benton <jbenton@vmware.com>
Date: Tue May 22 16:02:46 2012 +0100

gallivm: Added function for half-float to float conversion.

Updated lp_build_format_aos_array to support half-float source.

commit 37d648827406a20c5007abeb177698723ed86673
Author: James Benton <jbenton@vmware.com>
Date: Tue May 22 14:55:18 2012 +0100

util: Updated u_format_tests to rigidly test half-float boundary values.

commit 2ad18165d96e578aa9046df7c93cb1c3284d8c6b
Author: James Benton <jbenton@vmware.com>
Date: Tue May 22 14:54:16 2012 +0100

llvmpipe: Updated lp_test_format to properly handle Inf/NaN results.

commit 78740acf25aeba8a7d146493dd5c966e22c27b73
Author: James Benton <jbenton@vmware.com>
Date: Tue May 22 14:53:30 2012 +0100

util: Added functions for checking NaN / Inf for double and half-floats.

commit 35e9f640ae01241f9e0d67fe893bbbf564c05809
Author: Roland Scheidegger <sroland@vmware.com>
Date: Thu May 24 21:05:13 2012 +0200

gallivm: Fix calculating rho for 3d textures for the single-quad case

Discovered by accident, this looks like a very old typo bug.

commit fc1220c636326536fd0541913154e62afa7cd1d8
Author: Roland Scheidegger <sroland@vmware.com>
Date: Thu May 24 21:04:59 2012 +0200

gallivm: do calcs per-quad in lp_build_rho

Still convert to scalar at the end of the function.

commit 50a887ffc550bf310a6988fa2cea5c24d38c1a41
Author: Roland Scheidegger <sroland@vmware.com>
Date: Mon May 21 23:21:50 2012 +0200

gallivm: (trivial) return scalar in lp_build_extract_range for length 1 vectors

Our type system on top of llvm's one doesn't generally support vectors of
length 1, instead using scalars. So we should return a scalar from this
function instead of having to bitcast the vector with length 1 later elsewhere.

commit 80c71c621f9391f0f9230460198d861643324876
Author: James Benton <jbenton@vmware.com>
Date: Tue May 22 17:49:15 2012 +0100

draw: Fixed bad merge error

commit c47401cfad0c9167de20ff560654f533579f452c
Author: James Benton <jbenton@vmware.com>
Date: Tue May 22 15:29:30 2012 +0100

draw: Updated store_clip to store whole vectors instead of individual elements.

commit 2d9c1ad74b0b0b41861fffcecde39f09cc27f1cf
Author: James Benton <jbenton@vmware.com>
Date: Tue May 22 15:28:32 2012 +0100

gallivm: Added lp_build_fetch_rgba_aos_array.

A version of lp_build_fetch_rgba_aos which is targeted at simple array formats.

Reads the whole vector from memory in one, instead of reading each element
individually.

Tested with mesa tests and demos.

commit ff7805dc2b6ef6d8b11ec4e54aab1633aef29ac8
Author: James Benton <jbenton@vmware.com>
Date: Tue May 22 15:27:40 2012 +0100

gallivm: Added lp_build_pad_vector.

This function pads a vector with undef to a desired length.

commit 701f50acef24a2791dabf4730e5b5687d6eb875d
Author: James Benton <jbenton@vmware.com>
Date: Fri May 18 17:27:19 2012 +0100

util: Added util_format_is_array.

This function checks whether a format description is in a simple array format.

commit 5e0a7fa543dcd009de26f34a7926674190fa6246
Author: James Benton <jbenton@vmware.com>
Date: Fri May 18 19:13:47 2012 +0100

draw: Removed draw_llvm_translate_from and draw/draw_llvm_translate.c.

This is "replaced" by adding an optimised path in lp_build_fetch_rgba_aos
in an upcoming patch.

commit 8c886d6a7dd3fb464ecf031de6f747cb33e5361d
Author: James Benton <jbenton@vmware.com>
Date: Wed May 16 15:02:31 2012 +0100

draw: Modified store_aos to write the vector as one, not individual elements.

commit 37337f3d657e21dfd662c7b26d61cb0f8cfa6f17
Author: James Benton <jbenton@vmware.com>
Date: Wed May 16 14:16:23 2012 +0100

draw: Changed aos_to_soa to use lp_build_transpose_aos.

commit bd2b69ce5d5c94b067944d1dcd5df9f8e84548f1
Author: James Benton <jbenton@vmware.com>
Date: Fri May 18 19:14:27 2012 +0100

draw: Changed soa_to_aos to use lp_build_transpose_aos.

commit 0b98a950d29a116e82ce31dfe7b82cdadb632f2b
Author: James Benton <jbenton@vmware.com>
Date: Fri May 18 18:57:45 2012 +0100

gallivm: Added lp_build_transpose_aos which converts between aos and soa.

commit 69ea84531ad46fd145eb619ed1cedbe97dde7cb5
Author: James Benton <jbenton@vmware.com>
Date: Fri May 18 18:57:01 2012 +0100

gallivm: Added lp_build_interleave2_half aimed at AVX unpack instructions.

commit 7a4cb1349dd35c18144ad5934525cfb9436792f9
Author: José Fonseca <jfonseca@vmware.com>
Date: Tue May 22 11:54:14 2012 +0100

gallivm: Fix build on Windows.

MC-JIT not yet supported there.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>

commit afd105fc16bb75d874e418046b80d9cc578818a1
Author: James Benton <jbenton@vmware.com>
Date: Fri May 18 16:17:26 2012 +0100

llvmpipe: Added an error counter to lp_test_conv.

Useful for keeping track of progress when fixing errors!

Signed-off-by: José Fonseca <jfonseca@vmware.com>

commit b644907d08c10a805657841330fc23db3963d59c
Author: James Benton <jbenton@vmware.com>
Date: Fri May 18 16:16:46 2012 +0100

llvmpipe: Changed known failures in lp_test_conv.

To comply with the recent fixes to lp_bld_conv.

Signed-off-by: José Fonseca <jfonseca@vmware.com>

commit d7061507bd94f6468581e218e61261b79c760d4f
Author: James Benton <jbenton@vmware.com>
Date: Fri May 18 16:14:38 2012 +0100

llvmpipe: Added fixed point types tests to lp_test_conv.

Signed-off-by: José Fonseca <jfonseca@vmware.com>

commit 146b3ea39b4726dbe125ac666bd8902ea3d6ca8c
Author: James Benton <jbenton@vmware.com>
Date: Fri May 18 16:26:35 2012 +0100

llvmpipe: Changed lp_test_conv src/dst alignment to be correct.

Now based on the define rather than a fixed number.

Signed-off-by: José Fonseca <jfonseca@vmware.com>

commit f3b57441f834833a4b142a951eb98df0aa874536
Author: James Benton <jbenton@vmware.com>
Date: Fri May 18 16:06:44 2012 +0100

gallivm: Fixed erroneous optimisation in lp_build_min/max.

Previously assumed normalised was 0 to 1, but it can be -1 to 1
if type is signed.
Tested with lp_test_conv and lp_test_format, reduced errors.

Signed-off-by: José Fonseca <jfonseca@vmware.com>

commit a0613382e5a215cd146bb277646a6b394d376ae4
Author: James Benton <jbenton@vmware.com>
Date: Fri May 18 16:04:49 2012 +0100

gallivm: Compensate for lp_const_offset in lp_build_conv.

Fixing a /*FIXME*/ to remove errors in integer conversion in lp_build_conv.
Tested using lp_test_conv and lp_test_format, reduced errors.

Signed-off-by: José Fonseca <jfonseca@vmware.com>

commit a3d2bf15ea345bc8a0664f8f441276fd566566f3
Author: James Benton <jbenton@vmware.com>
Date: Fri May 18 16:01:25 2012 +0100

gallivm: Fixed overflow in lp_build_clamped_float_to_unsigned_norm.

Tested with lp_test_conv and lp_test_format, reduced errors.

Signed-off-by: José Fonseca <jfonseca@vmware.com>

commit e7b1e76fe237613731fa6003b5e1601a2e506207
Author: José Fonseca <jfonseca@vmware.com>
Date: Mon May 21 20:07:51 2012 +0100

gallivm: Fix build with LLVM 2.6

Trivial, and useful.

commit d3c6bbe5c7f5ba1976710831281ab1b6a631082d
Author: José Fonseca <jfonseca@vmware.com>
Date: Tue May 15 17:15:59 2012 +0100

gallivm: Enable MCJIT/AVX with vanilla LLVM 3.1.

Add the necessary C++ glue, so that we don't need any modifications
to the soon to be released LLVM 3.1.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>

commit 724a019a14d40fdbed21759a204a2bec8a315636
Author: José Fonseca <jfonseca@vmware.com>
Date: Mon May 14 22:04:06 2012 +0100

gallivm: Use HAVE_LLVM 0x0301 consistently.

commit af6991e2a3868e40ad599b46278551b794839748
Author: José Fonseca <jfonseca@vmware.com>
Date: Mon May 14 21:49:06 2012 +0100

gallivm: Add MCRegisterInfo.h to silence benign warnings about missing implementation.

Trivial.

commit 6f8a1d75458daae2503a86c6b030ecc4bb494e23
Author: Vinson Lee <vlee@freedesktop.org>
Date: Mon Apr 2 22:14:15 2012 -0700

gallivm: Pass in a MCInstrInfo to createMCInstPrinter on llvm-3.1.

llvm-3.1svn r153860 makes MCInstrInfo available to the MCInstPrinter.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>

commit 62555b6ed8760545794f83064e27cddcb3ce5284
Author: Vinson Lee <vlee@freedesktop.org>
Date: Tue Mar 27 21:51:17 2012 -0700

gallivm: Fix method overriding in raw_debug_ostream.

Use matching type qualifiers to avoid method hiding.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: José Fonseca <jfonseca@vmware.com>

commit 6a9bd784f4ac68ad0a731dcd39e5a3c39989f2be
Author: Vinson Lee <vlee@freedesktop.org>
Date: Tue Mar 13 22:40:52 2012 -0700

gallivm: Fix createOProfileJITEventListener namespace with llvm-3.1.

llvm-3.1svn r152620 refactored the OProfile profiling code.
createOProfileJITEventListener was moved from the llvm namespace to the
llvm::JITEventListener namespace.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>

commit b674955d39adae272a779be85aa1bd665de24e3e
Author: Vinson Lee <vlee@freedesktop.org>
Date: Mon Mar 5 22:00:40 2012 -0800

gallivm: Pass in a MCRegisterInfo to MCInstPrinter on llvm-3.1.

llvm-3.1svn r152043 changes createMCInstPrinter to take an additional
MCRegisterInfo argument.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>

commit 11ab69971a8a31c62f6de74905dbf8c02884599f
Author: Vinson Lee <vlee@freedesktop.org>
Date: Wed Feb 29 21:20:53 2012 -0800

Revert "gallivm: Change getExtent and readByte to non-const with llvm-3.1."

This reverts commit d5a6c172547d8964f4d4bb79637651decaf9deee.

llvm-3.1svn r151687 makes MemoryObject accessor members const again.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>

commit 339960c82d2a9f5c928ee9035ed31dadb7f45537
Author: Roland Scheidegger <sroland@vmware.com>
Date: Mon May 14 16:19:56 2012 +0200

gallivm: (trivial) fix assertion failure for mipmapped 1d textures

In lp_build_rho, we may end up with a 1-element vector (for mipmapped 1d
textures), but in this case we require the type to be a non-vector type,
so need a cast.

commit 9d73edb727bd6d196030dc3026b7bf0c574b3e19
Author: Roland Scheidegger <sroland@vmware.com>
Date: Thu May 10 18:12:07 2012 +0200

gallivm: prepare for per-quad lod calculations for large vectors

to be able to handle multiple quads at once in texture sampling and still
do lod calculations per quad, it is necessary to get the per-quad derivatives
into the lp_build_rho function.
Until now these derivative values were just scalars, which isn't going to work.
So we now use vectors, and since the interface needs to change we also do some
different (slightly more efficient) packing of the values.
For 8-wide vectors the packed derivative values for 3 coords would look like
this, this scales to an arbitrary (multiple of 4) vector size:
ds1dx ds1dy dt1dx dt1dy ds2dx ds2dy dt2dx dt2dy
dr1dx dr1dy _____ _____ dr2dx dr2dy _____ _____
The second vector will be unused for 1d and 2d textures.
To facilitate future changes the derivative values are put into a struct, since
quite some functions just pass these values through.
The generated code seems to be very slightly better for 2d textures (with
4-wide vectors) than before with sse2 (if you have a cpu with physical 128bit
simd units - otherwise it's probably not a win).
v2: suggestions from José, rename variables, add comments, use swizzle helper
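
A minimal C sketch of the packed layout described above. The quad_derivs
record and the pack_derivs() function are hypothetical names for
illustration, not gallivm types or helpers, and the unused slots are
zero-filled here only so the output is deterministic:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-quad derivative record (not a gallivm type). */
struct quad_derivs {
   float dsdx, dsdy;
   float dtdx, dtdy;
   float drdx, drdy;
};

/*
 * Pack the derivatives of num_quads quads into two arrays of
 * 4 * num_quads floats each:
 *   v0: dsdx dsdy dtdx dtdy   (repeated per quad)
 *   v1: drdx drdy  pad  pad   (repeated per quad)
 * The pad slots are written as 0.0f; the commit message only says
 * they are unused.
 */
void
pack_derivs(const struct quad_derivs *q, size_t num_quads,
            float *v0, float *v1)
{
   size_t i;

   assert(q && v0 && v1);

   for (i = 0; i < num_quads; i++) {
      v0[4 * i + 0] = q[i].dsdx;
      v0[4 * i + 1] = q[i].dsdy;
      v0[4 * i + 2] = q[i].dtdx;
      v0[4 * i + 3] = q[i].dtdy;

      v1[4 * i + 0] = q[i].drdx;
      v1[4 * i + 1] = q[i].drdy;
      v1[4 * i + 2] = 0.0f;
      v1[4 * i + 3] = 0.0f;
   }
}
```

With num_quads = 2 this yields the 8-wide case shown in the message; for 1d
and 2d textures only v0 would be consumed.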

commit 0aa21de0d31466dac77b05c97005722e902517b8
Author: Roland Scheidegger <sroland@vmware.com>
Date: Thu May 10 18:10:31 2012 +0200

gallivm: add undefined swizzle handling to lp_build_swizzle_aos

This is useful for vectors with "holes", it lets llvm choose the most
efficient shuffle instructions if some elements aren't needed without having to
worry what elements to manually pick otherwise.

commit 00faf3f370e7ce92f5ef51002b0ea42ef856e181
Author: José Fonseca <jfonseca@vmware.com>
Date: Fri May 4 17:25:16 2012 +0100

gallivm: Get the LLVM IR optimization passes before JIT compilation.

MC-JIT engine compiles the module immediately on creation, so the optimization
passes were being run too late.

So now we create a target data layout from a string, that matches the
ABI parameters reported by the compiler.

The backend optimization passes were always being run, so the performance
improvement is modest (3% on multiarb mesa demo).

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>

commit 40a43f4e2ce3074b5ce9027179d657ebba68800a
Author: Roland Scheidegger <sroland@vmware.com>
Date: Wed May 2 16:03:54 2012 +0200

gallivm: (trivial) fix wrong define used in lp_build_pack2

should fix stack-smashing crashes.

commit e6371d0f4dffad4eb3b7a9d906c23f1c88a2ab9e
Author: Roland Scheidegger <sroland@vmware.com>
Date: Mon Apr 30 21:25:29 2012 +0200

gallivm: add perf warnings when not using intrinsics with 256bit vectors

Helper functions using integer sse2 intrinsics could split the vectors with AVX
instead of using generic fallback (which should be faster).
We don't actually expect to hit these paths (hence don't fix them up to actually
do the vector splitting) so just emit warnings (for those functions where it's
obvious doing split/intrinsic is faster than using generic path).
Only emit warnings for 256bit vectors since we _really_ don't expect to hit
arbitrarily large vectors which would affect a lot more functions.
The warnings do not actually depend on avx since the same logic applies to
plain sse2 too (but of course again there's _really_ no reason we should hit
these functions with 256bit vectors without avx).

commit 8a9ea701ea7295181e846c6383bf66a5f5e47637
Author: Roland Scheidegger <sroland@vmware.com>
Date: Tue May 1 20:37:07 2012 +0200

gallivm: split vectors manually for avx in lp_build_pack2 (v2)

There's 2 reasons for this:
First, there's a llvm bug (fixed in 3.1) which generates tons of byte
inserts/extracts otherwise, and second, more importantly, we want to use
pack intrinsics instead of shuffles.
We do this in lp_build_pack2 and not the calling code (aos sample path)
because potentially other callers might find that useful too, even if
for larger sequences of code using non-native vector sizes it might be
better to manually split vectors.
This should boost texture performance in the aos path considerably.
v2: fix issues with intrinsics types with old llvm

commit 27ac5b48fa1f2ea3efeb5248e2ce32264aba466e
Author: Roland Scheidegger <sroland@vmware.com>
Date: Tue May 1 20:26:22 2012 +0200

llvmpipe: refactor lp_build_pack2 (v2)

prettify, and it's unnecessary to assert when there's no intrinsic due to
unsupported bit width - the shuffle path will work regardless.
In contrast, lp_build_packs2 should only rely on lp_build_pack2 doing the
clamping for element sizes for which there is a sse2 intrinsic.
v2: fix bug spotted by Jose regarding the intrinsic type for packusdw
on old llvm versions.

commit ddf279031f0111de4b18eaf783bdc0a1e47813c8
Author: Roland Scheidegger <sroland@vmware.com>
Date: Tue May 1 20:13:59 2012 +0200

gallivm: add src width check in lp_build_packs2()

not doing so would skip clamping even if no sse2 pack instruction is
available, which is incorrect (in theory only, such widths would also always
hit an (unnecessary) assertion in lp_build_pack2()).

commit e7f0ad7fe079975eae7712a6e0c54be4fae0114b
Author: Roland Scheidegger <sroland@vmware.com>
Date: Fri Apr 27 15:57:00 2012 +0200

gallivm: (trivial) fix crash-causing typo for npot textures with avx

commit 28a9d7f6f655b6ec508c8a3aa6ffefc1e79793a0
Author: Roland Scheidegger <sroland@vmware.com>
Date: Wed Apr 25 19:38:45 2012 +0200

gallivm: (trivial) remove code mistakenly added twice.

commit d5926537316f8ff67ad0a52e7242f7c5478d919b
Author: Roland Scheidegger <sroland@vmware.com>
Date: Tue Apr 24 21:16:15 2012 +0200

gallivm: add a new avx aos sample path (v2)

Try to avoid mixing float and int address calculations. This does texture wrap
modes with floats, and then the offset calculations still with ints (because
of lack of precision with floats, though we could do some effort to make it work
with not too large (16MB) textures).
This also handles wrap repeat mode with npot-sized textures differently than
either the old soa or aos int path (likely way faster but untested).
Otherwise the actual address wrap code is largely similar to the soa path (not
quite the same as this one also has some int code), it should get used by avx
soa sampling later as well but doesn't handle more complex address modes yet
(this will also have the benefit that we can use aos sampling path for all
texture address modes).
Generated code for that looks reasonable, but still does not split vectors
explicitly for fetch/filter which means we still get hit by the llvm bug (fixed
upstream) which generates hundreds of pinsrb/pextrb instead of two shuffles.
It is not obvious though if it's much of a win over just doing address calcs
4-wide but with ints, even if it is definitely far fewer instructions on avx.
piglit's texwrap seems to look exactly the same but tests
neither the non-normalized nor the npot cases.
v2: fix comments, prettify based on Brian's and Jose's feedback.
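
The precision remark above can be illustrated with a small C sketch. It
assumes the 16MB figure relates to the 24-bit significand of IEEE single
precision (2^24 = 16777216), which is an interpretation and not something the
message states. The function names are hypothetical:

```c
#include <assert.h>
#include <stdint.h>

/* Texel offset computed with integer arithmetic (always exact here). */
uint32_t
offset_via_int(uint32_t y, uint32_t stride, uint32_t x)
{
   return y * stride + x;
}

/*
 * The same offset computed in single precision floats. The volatile
 * store forces rounding to float even on targets that evaluate
 * intermediates in a wider format.
 */
uint32_t
offset_via_float(uint32_t y, uint32_t stride, uint32_t x)
{
   volatile float f;

   assert(stride != 0);

   f = (float)y * (float)stride + (float)x;
   return (uint32_t)f;
}
```

For small offsets both functions agree. Once the offset exceeds 2^24, odd
values are no longer representable as floats, so the float version can land
on a neighbouring offset.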

commit bffecd22dea66fb416ecff8cffd10dd4bdb73fce
Author: Roland Scheidegger <sroland@vmware.com>
Date: Thu Apr 19 01:58:29 2012 +0200

gallivm: refactor aos lp_build_sample_image_nearest/linear

split them up to separate address calculations and fetching/filtering.
Need this for being able to do 8-wide float address calcs and 4-wide
fetch/filter later (for avx). Plus the functions were very big scary monsters
anyway (in particular lp_build_sample_image_linear).

commit a80b325c57529adddcfa367f96f03557725c4773
Author: Roland Scheidegger <sroland@vmware.com>
Date: Mon Apr 16 17:17:18 2012 +0200

gallivm: fix lp_build_resize when truncating width but expanding vector size

Missed this case which I thought was impossible - the assertion for it was
right after the division by zero...
(AoS) texture sampling may ask us to do this, for things like 8 4x32int
vectors to 1 32x8int vector conversion (eventually, we probably don't want
this to happen).

commit f9c8337caa3eb185830d18bce8b95676a065b1d7
Author: Roland Scheidegger <sroland@vmware.com>
Date: Sat Apr 14 18:00:59 2012 +0200

gallivm: fix cube maps with larger vectors

This makes the branchless cube face selection code work with larger vectors.
Because the complexity is quite high (cannot really be improved it seems,
per-face selection would reduce complexity a lot but this leads to errors
unless the derivatives are calculated all from the same face which almost
doubles the work to be done) it is still slower than the branching version,
hence only enable this with large vectors.
It doesn't actually do per-quad face selection yet (only makes sense with
matching lod selection, in fact it will select the same face for all pixels
based on the average of the first four pixels for now) but only different
shuffles are required to make it work (the branching version actually should
work with larger vectors too now thanks to the improved horizontal add but of
course it cannot be extended to really select the face per-quad unless doing
branching per quad).

commit 7780c58869fc9a00af4f23209902db7e058e8a66
Author: Roland Scheidegger <sroland@vmware.com>
Date: Fri Mar 30 21:11:12 2012 +0100

llvmpipe: (trivial) fix compiler warning

and also clarify comment regarding availability of popcnt instruction.

commit a266dccf477df6d29a611154e988e8895892277e
Author: Roland Scheidegger <sroland@vmware.com>
Date: Fri Mar 30 14:21:07 2012 +0100

gallivm: remove unneeded members in lp_build_sample_context

Minor cleanup, the texture width, height, depth aren't accessed in their
scalar form anywhere. Makes it more obvious those values should probably be
fetched already vectorized (but this requires more invasive changes)...

commit b678c57fb474e14f05e25658c829fc04d2792fff
Author: Roland Scheidegger <sroland@vmware.com>
Date: Thu Mar 29 15:53:55 2012 +0100

gallivm: add a helper for concatenating vectors

Similar to the extract_range helper intended to get around slow code generated
by llvm for 128bit insertelements.
Concatenating two 128bit vectors this way will result in a single vinsertf128
operation rather than two 64bit stores plus one 128bit load, though it might be
mildly useful for other purposes as well.

commit 415ff228bcd0cf5e44a4c15350a661f0f5520029
Author: Roland Scheidegger <sroland@vmware.com>
Date: Wed Mar 28 19:41:15 2012 +0100

gallivm: add a custom 2x8f->1x16ub avx conversion path

Similar to the existing 4x4f->1x16ub sse2 path, shaves off a couple
instructions (min/max mostly) because it relies on pack intrinsics clamping.
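
A scalar C model of why saturating pack instructions can replace explicit
min/max clamps. It assumes a two-step chain of signed 32->16 bit saturation
followed by unsigned 16->8 bit saturation (the behaviour of the sse2
packssdw/packuswb instructions); whether the gallivm path uses exactly this
chain is an assumption. The float-to-int scaling step is left out and all
function names are hypothetical:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* int32 -> int16 with signed saturation (models one packssdw lane). */
int16_t
sat_s32_to_s16(int32_t v)
{
   if (v > INT16_MAX)
      return INT16_MAX;
   if (v < INT16_MIN)
      return INT16_MIN;
   return (int16_t)v;
}

/* int16 -> uint8 with unsigned saturation (models one packuswb lane). */
uint8_t
sat_s16_to_u8(int16_t v)
{
   if (v > 255)
      return 255;
   if (v < 0)
      return 0;
   return (uint8_t)v;
}

/* Convert n already-scaled integers to bytes; no separate clamp needed. */
void
pack_s32_to_ub(const int32_t *in, uint8_t *out, size_t n)
{
   size_t i;

   assert(in && out);

   for (i = 0; i < n; i++)
      out[i] = sat_s16_to_u8(sat_s32_to_s16(in[i]));
}
```

Out-of-range inputs on either side end up as 0 or 255 purely through the
saturation of the two packs.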

commit 78c08fc89f8fbcc6dba09779981b1e873e2a0299
Author: Roland Scheidegger <sroland@vmware.com>
Date: Wed Mar 28 18:44:07 2012 +0100

gallivm: add avx arithmetic intrinsics

Add all avx intrinsics for arithmetic functions (with the exception
of the horizontal add function which needs another look).
Seems to pass basic tests.

Reviewed-by: José Fonseca <jfonseca@vmware.com>

commit a586caa2800aa5ce54c173f7c0d4fc48153dbc4e
Author: Roland Scheidegger <sroland@vmware.com>
Date: Wed Mar 28 15:31:35 2012 +0100

gallivm: add avx logic intrinsics

Add the blend intrinsics for 8-wide float and 4-wide double vectors.
Since we lack 256bit int instructions these are used for int vectors as well,
though obviously not for byte or word element values.
The comparison intrinsics aren't extended for avx since these are only used
for pre-2.7 llvm versions.

commit 70275e4c13c89315fc2560a4c488c0e6935d5caf
Author: Roland Scheidegger <sroland@vmware.com>
Date: Wed Mar 28 00:40:53 2012 +0100

gallivm: new helper function for extract shuffles.

Based on José's idea as we can need that in a couple places.
Note that such shuffles should not be used lightly, since data layout
of <4 x i8> is different to <16 x i8> for instance, hence might cause
data rearrangement.

commit 4d586dbae1b0c55915dda1759d2faea631c0a1c2
Author: Roland Scheidegger <sroland@vmware.com>
Date: Tue Mar 27 18:27:25 2012 +0100

gallivm: (trivial) don't overallocate shuffle variable

using wrong define meant huge array...

commit 06b0ec1f6d665d98c135f9573ddf4ba04b2121ad
Author: Roland Scheidegger <sroland@vmware.com>
Date: Tue Mar 27 17:54:20 2012 +0100

gallivm: don't do per-element extract/insert for vector element resize

Instead of doing per-element extract/insert if the src vectors
and dst vector differ in total size (which generates atrocious code)
first change the src vectors size by using shuffles to destination
vector size.
We can still do better than that on AVX for packing to color buffer
(by exploiting pack intrinsics characteristics hence eliminating the
need for some clamps) but this already generates much better code.

v2: incorporate feedback from José, Keith and use shuffle instead of
bitcasts/extracts. Due to llvm deficiencies the latter cause all data
to get moved to GPRs and back in pieces (even though the data in the
regs actually stays the same...).

commit c9970d70e05f95d3f52fe7d2cd794176a52693aa
Author: Roland Scheidegger <sroland@vmware.com>
Date: Fri Mar 23 19:33:19 2012 +0000

gallivm: fix bug in simple position interpolation

Accidental use of position attribute instead of just pixel coordinates.
Caused failures in piglit glsl-fs-ceil and glsl-fs-floor.

commit d0b6fcdb008d04d7f73d3d725615321544da5a7e
Author: Roland Scheidegger <sroland@vmware.com>
Date: Fri Mar 23 15:31:14 2012 +0000

gallivm: fix emission of ceil opcode

lp_build_ceil seems more appropriate than lp_build_trunc.
This seems to never be hit though; someone performs some ceil
to floor magic.

commit d97fafed7e62ffa6bf76560a92ea246a1a26d256
Author: Roland Scheidegger <sroland@vmware.com>
Date: Thu Mar 22 11:46:52 2012 +0000

gallivm: new vectorized path for cubemap calculations

should be faster when adapted to multiple quads as only selection masks need to be different.
The code is more or less a per-pixel version adapted to only do it per quad.
A per pixel version would be much simpler (could drop 2 selects, 6 broadcasts and the messy
horizontal add of 3 vectors at the expense of only 2 more absolute value instructions -
would also just work for arbitrarily large vectors).
This version doesn't yet work with larger vectors because the horizontal add isn't adjusted
to be able to work with 2x4 vectors (and also because face selection wouldn't be done per
quad just per block though that would be only a correctness issue just as with lod selection).
The downside is this code is quite a bit slower. On a Core2 it can be sped up by disabling the
hw blend instructions for selection and using logicop fallbacks instead, but it is still slower
than the old code, hence leave that in for now. Probably will choose one or the other version
based on vector length in the end.
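
A scalar C sketch of per-quad face selection as described above: the
coordinates of the four pixels of a quad are summed (the horizontal add) and
the major axis of the sum picks the face. It uses ordinary conditionals where
the vectorized code would use selects. The enum, the function names and the
tie-breaking order are assumptions for illustration, not taken from gallivm:

```c
#include <assert.h>

/* Hypothetical face numbering for this sketch. */
enum cube_face {
   FACE_POS_X, FACE_NEG_X,
   FACE_POS_Y, FACE_NEG_Y,
   FACE_POS_Z, FACE_NEG_Z
};

static float
absf(float v)
{
   return v < 0.0f ? -v : v;
}

/* Horizontal add over the four pixels of one quad. */
static float
hadd4(const float v[4])
{
   return (v[0] + v[1]) + (v[2] + v[3]);
}

/* Pick one face for the whole quad from the summed coordinates. */
enum cube_face
select_quad_face(const float s[4], const float t[4], const float r[4])
{
   float sx, sy, sz, ax, ay, az;

   assert(s && t && r);

   sx = hadd4(s);
   sy = hadd4(t);
   sz = hadd4(r);
   ax = absf(sx);
   ay = absf(sy);
   az = absf(sz);

   /* Ties go to x, then y, then z (an arbitrary choice for this sketch). */
   if (ax >= ay && ax >= az)
      return sx >= 0.0f ? FACE_POS_X : FACE_NEG_X;
   if (ay >= az)
      return sy >= 0.0f ? FACE_POS_Y : FACE_NEG_Y;
   return sz >= 0.0f ? FACE_POS_Z : FACE_NEG_Z;
}
```

Because the decision uses the sum over the quad, all four pixels get the same
face even if an individual pixel would have picked a different one.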

commit b375fbb18a3fd46859b7fdd42f3e9908ea4ff9a3
Author: Roland Scheidegger <sroland@vmware.com>
Date: Wed Mar 21 14:42:29 2012 +0000

gallivm: fix optimized occlusion query intrinsic name

commit a9ba0a3b611e48efbb0e79eb09caa85033dbe9a2
Author: José Fonseca <jfonseca@vmware.com>
Date: Wed Mar 21 16:19:43 2012 +0000

draw,gallivm,llvmpipe: Call gallivm_verify_function everywhere.

commit f94c2238d2bc7383e088b8845b7410439a602071
Author: Roland Scheidegger <sroland@vmware.com>
Date: Tue Mar 20 18:54:10 2012 +0000

gallivm: optimize calculations for cube maps a bit

this does some more vectorized calculations and uses horizontal adds if possible.
A definite win with sse3 otherwise it doesn't seem to make much of a difference.
In any case this is arithmetically identical, cannot handle larger vectors.
Should be useful as a reference point against larger vector version later...

commit 21a2c1cf3c8e1ac648ff49e59fdc0e3be77e2ebb
Author: Roland Scheidegger <sroland@vmware.com>
Date: Tue Mar 20 15:16:27 2012 +0000

llvmpipe: slight optimization of occlusion queries

using movmskps when available.
While this is slightly better for cpus without popcnt we should
really sum the vectors ourselves (it is also possible to cast to i4 before
doing the popcnt but that doesn't help that much either since llvm
is using some optimized popcnt version for i32)
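
A portable C model of the movmskps plus popcnt idea mentioned above, for one
4-wide pixel mask. It assumes each live pixel has all bits set in its 32-bit
lane. The function names are hypothetical and the loops stand in for what
would be single instructions on the cpu:

```c
#include <assert.h>
#include <stdint.h>

/* Models movmskps: collect the sign bit of each of the four lanes. */
unsigned
movmsk4(const uint32_t mask[4])
{
   unsigned bits = 0;
   unsigned i;

   assert(mask);

   for (i = 0; i < 4; i++)
      bits |= (unsigned)(mask[i] >> 31) << i;

   return bits;
}

/* Models popcnt on the resulting small bitmask. */
unsigned
popcount_bits(unsigned bits)
{
   unsigned count = 0;

   while (bits) {
      count += bits & 1u;
      bits >>= 1;
   }
   return count;
}

/* Number of pixels in the quad that passed, to add to the query counter. */
unsigned
count_live_pixels(const uint32_t mask[4])
{
   return popcount_bits(movmsk4(mask));
}
```

The point of the intermediate bitmask is that the population count then runs
on a 4-bit scalar instead of on the full 128-bit vector.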

commit 5ab5a35f216619bcdf55eed52b0db275c4a06c1b
Author: Roland Scheidegger <sroland@vmware.com>
Date: Tue Mar 20 13:32:11 2012 +0000

llvmpipe: fix occlusion queries with larger vectors

need to adjust casts etc.

commit ff95e6fdf5f16d4ef999ffcf05ea6e8c7160b0d5
Author: José Fonseca <jfonseca@vmware.com>
Date: Mon Mar 19 20:15:25 2012 +0000

gallivm: Restore optimization passes.

commit 57b05b4b36451e351659e98946dae27be0959832
Author: Roland Scheidegger <sroland@vmware.com>
Date: Mon Mar 19 19:34:22 2012 +0000

llvmpipe: use existing min2 macro

commit bc9a20e19b4f600a439f45679451f2e87cd4b299
Author: Roland Scheidegger <sroland@vmware.com>
Date: Mon Mar 19 19:07:27 2012 +0000

llvmpipe: add some safeguards against really large vectors

As per José's suggestion, prevent things from blowing up if some cpu
would have 1024bit or larger vectors.

commit 0e2b525e5ca1c5bbaa63158bde52ad1c1564a3a9
Author: Roland Scheidegger <sroland@vmware.com>
Date: Mon Mar 19 18:31:08 2012 +0000

llvmpipe: fix mask generation for uberwide vectors

this was the only piece preventing 16-wide vectors from working
(apart from the LP_MAX_VECTOR_WIDTH define that is), which is the maximum
as we don't get more pixels in the fragment shader at once.
Hence adjust that so things could be tested properly with that size
even though there seems to be no practical value.

commit 3c8334162211c97f3a11c7f64e9e5a2a91ad9656
Author: Roland Scheidegger <sroland@vmware.com>
Date: Mon Mar 19 18:19:41 2012 +0000

llvmpipe: fix the simple interpolation method with larger vectors

so both methods actually _really_ work now. Makes textures look
nice with larger vectors...

commit 1cb0464ef8871be1778d43b0c56adf9c06843e2d
Author: Roland Scheidegger <sroland@vmware.com>
Date: Mon Mar 19 17:26:35 2012 +0000

llvmpipe: fix mask generation and position interpolation with 8-wide vectors

trivial bugs, with these things start to look somewhat reasonable.
Textures though have some swizzling issues it seems.

commit 168277a63ef5b72542cf063c337f2d701053ff4b
Author: Roland Scheidegger <sroland@vmware.com>
Date: Mon Mar 19 16:04:03 2012 +0000

llvmpipe: don't overallocate variables

we never have more than 16 (stamp size) / 4 (minimum possible vector size).
(With larger vectors those variables are still overallocated a bit.)

commit 409b54b30f81ed0aa9ed0b01affe15c72de9abd2
Author: Roland Scheidegger <sroland@vmware.com>
Date: Mon Mar 19 15:56:48 2012 +0000

llvmpipe: add some 32f8 formats to lp_test_conv

Also add the ability to handle different sized vectors.

commit 55dcd3af8366ebdac0af3cdb22c2588f24aa18ce
Author: Roland Scheidegger <sroland@vmware.com>
Date: Mon Mar 19 15:47:27 2012 +0000

gallivm: handle different sized vectors in conversion / pack

only fully generic path for now (extract/insert per element).

commit 9c040f78c54575fcd94a8808216cf415fe8868f6
Author: Roland Scheidegger <sroland@vmware.com>
Date: Sun Mar 18 00:58:28 2012 +0100

llvmpipe: fix harmless use of uninitialized values

commit 551e9d5468b92fc7d5aa2265db9a52bb1e368a36
Author: Roland Scheidegger <sroland@vmware.com>
Date: Fri Mar 16 23:31:21 2012 +0100

gallivm: drop special path in extract_broadcast with different sized vectors

Not needed, llvm can handle shuffles with different sized result vector just
fine. Should hopefully generate the same code in the end, but simpler IR.

commit 44da531119ffa07a421eaa041f63607cec88f6f8
Author: Roland Scheidegger <sroland@vmware.com>
Date: Fri Mar 16 23:28:49 2012 +0100

llvmpipe: adapt interpolation for handling multiple quads at once

this is still WIP there are actually two methods possible not quite
sure what makes the most sense, so there's code for both for now:
1) the iterative method as used before (compute attrib values at upper left
corner of stamp and upper left corner of each quad initially).
It is improved to handle more than one quad at once, and also do some more vectorized
calculations initially for slightly better code - newer cpus have full throughput with
4 wide float vectors, hence don't try to code up a path which might be faster if there's
just one channel active per attribute.
2) just do straight interpolation for each pixel.
Method 2) is more work per quad, but less initially - significantly more work
overall though if all quads are executed. But this might change with larger vector lengths.
This method would also be needed if we'd do some kind of active quad merging when
operating on multiple quads at once.
This path contains some hack to force llvm to generate better code, it is still far
from ideal though, still generates far too many unnecessary register spills/reloads.
Both methods should work with different sized vectors.
Not very well tested yet, still seems to work with four-wide vectors, need changes
elsewhere to be able to test with wider vectors.
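
A simplified scalar C sketch of the two approaches for a single attribute
channel, assuming a plane equation a(x, y) = a0 + dadx * x + dady * y. The
struct, the function names and the pixel order inside a quad (left to right,
then top to bottom) are assumptions. The real iterative method also steps
from the stamp corner to each quad corner, which is left out here:

```c
#include <assert.h>

/* Hypothetical plane-equation coefficients for one attribute channel. */
struct attrib_coef {
   float a0;
   float dadx;
   float dady;
};

/* Method 2: straight interpolation for one pixel. */
float
interp_direct(const struct attrib_coef *c, float x, float y)
{
   assert(c);
   return c->a0 + c->dadx * x + c->dady * y;
}

/*
 * Method 1 (simplified): evaluate once at the upper left corner of the
 * quad, then add precomputable per-pixel offsets for the 2x2 pixels.
 */
void
interp_quad_iterative(const struct attrib_coef *c, float qx, float qy,
                      float out[4])
{
   static const float xoffs[4] = {0.0f, 1.0f, 0.0f, 1.0f};
   static const float yoffs[4] = {0.0f, 0.0f, 1.0f, 1.0f};
   float base;
   int i;

   assert(c && out);

   base = c->a0 + c->dadx * qx + c->dady * qy;
   for (i = 0; i < 4; i++)
      out[i] = base + c->dadx * xoffs[i] + c->dady * yoffs[i];
}
```

In the iterative form the per-pixel offsets depend only on the coefficients,
so they can be computed once up front. The direct form redoes the full
multiply-add for every pixel but needs no such setup.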

commit be5d3e82e2fe14ad0a46529ab79f65bf2276cd28
Author: José Fonseca <jfonseca@vmware.com>
Date: Fri Mar 16 20:59:37 2012 +0000

draw: Cleanup.

commit f85bc12c7fbacb3de2a94e88c6cd2d5ee0ec0e8d
Author: José Fonseca <jfonseca@vmware.com>
Date: Fri Mar 16 20:43:30 2012 +0000

gallivm: More module compilation refactoring.

commit d76f093198f2a06a93b2204857e6fea5fd0b3ece
Author: José Fonseca <jfonseca@vmware.com>
Date: Thu Mar 15 21:29:11 2012 +0000

llvmpipe: Use gallivm_compile/free_function() in linear code.

Should have been done before.

commit 122e1adb613ce083ad739b153ced1cde61dfc8c0
Author: Roland Scheidegger <sroland@vmware.com>
Date: Tue Mar 13 14:47:10 2012 +0100

llvmpipe: generate partial pixel mask for multiple quads

still works with one quad, cannot be tested yet with more
At least for now always fixed order with multiple quads.

commit 4c4f15081d75ed585a01392cd2dcce0ad10e0ea8
Author: Roland Scheidegger <sroland@vmware.com>
Date: Thu Mar 8 22:09:24 2012 +0100

llvmpipe: refactor state setup a bit

Refactor to make it easier to emit (and potentially later fetch in fs)
coefficients for multiple attributes at once.
Need to think more about how to make this actually happen however, the
problem is different attributes can have different interpolation modes,
requiring different handling in both setup and fs (though linear and
perspective handling is close).

commit 9363e49722ff47094d688a4be6f015a03fba9c79
Author: Roland Scheidegger <sroland@vmware.com>
Date: Thu Mar 8 19:23:23 2012 +0100

llvmpipe: vectorize tri offset calc

cuts number of instructions in quad-offset-factor from 107 to 75.
This code actually duplicated the (scalar) code calculating the determinant
except it used different vertex order (leading to different sign but it doesn't
matter) hence llvm could not have figured out it's the same (of course with
determinant vectorized in the other place that wouldn't have worked any longer
either).
Note this particular piece doesn't actually vectorize well, not many arithmetic
instructions left but tons of shuffle instructions...
Probably would need to work on n tris at a time for better vectorization.

commit 63169dcb9dd445c94605625bf86d85306e2b4297
Author: Roland Scheidegger <sroland@vmware.com>
Date: Thu Mar 8 03:11:37 2012 +0100

llvmpipe: vectorize some scalar code in setup

reduces number of arithmetic instructions, and avoids loading
vector x,y values twice (once as scalars once as vectors).
Results in a reduction of instructions from 76 to 64 in fs setup for glxgears
(16%) on a cpu with sse41.
Since this code uses vec2 disguised as vec4, on old cpus which had physical
64bit sse units (pre-Core2) it probably is less of a win in practice (and if
you have no vectors you can only hope llvm eliminates the arithmetic for
unneeded elements).

commit 732ecb877f951ab89bf503ac5e35ab8d838b58a1
Author: Roland Scheidegger <sroland@vmware.com>
Date: Wed Mar 7 00:32:24 2012 +0100

draw: fix clipping

bug introduced by 4822fea3f0440b5205e957cd303838c3b128419c broke
clipping pretty badly (verified with lineclip test)

commit ef5d90b86d624c152d200c7c4056f47c3c6d2688
Author: Roland Scheidegger <sroland@vmware.com>
Date: Tue Mar 6 23:38:59 2012 +0100

draw: don't store vertex header per attribute

storing the vertex header once per attribute is totally unnecessary.
Some quick look at the generated assembly says llvm in fact cannot optimize
away the additional stores (maybe due to potentially aliasing pointers
somewhere).
Plus, this makes the code cleaner and also allows using a vector "or"
instead of scalar ones.

commit 6b3a5a57b0b9850854cfbd7b586e4e50102dda71
Author: Roland Scheidegger <sroland@vmware.com>
Date: Tue Mar 6 19:11:01 2012 +0100

draw: do the per-vertex "boolean" clipmask "or" with vectors

no point extracting the values and doing it per component.
Doesn't help that much since we still extract the values elsewhere anyway.

commit 36519caf1af40e4480251cc79a2d527350b7c61f
Author: Roland Scheidegger <sroland@vmware.com>
Date: Fri Mar 2 22:27:01 2012 +0100

gallivm: fix lp_build_extract_broadcast with different sized vectors

Fix the obviously wrong argument, so it doesn't blow up.

commit 76d0ac3ad85066d6058486638013afd02b069c58
Author: José Fonseca <jfonseca@vmware.com>
Date: Fri Mar 2 12:16:23 2012 +0000

draw: Compile per module and not per function (WIP).

Enough to get gears w/ LLVM draw + softpipe to work on AVX doing:

GALLIUM_DRIVER=softpipe SOFTPIPE_USE_LLVM=yes glxgears

But still hackish -- will need to rethink and refactor this.

commit 78e32b247d2a7a771be9a1a07eb000d1e54ea8bd
Author: José Fonseca <jfonseca@vmware.com>
Date: Wed Feb 29 12:01:05 2012 +0000

llvmpipe: Remove lp_state_setup_fallback.

Never used.

commit 6895d5e40d19b4972c361e8b83fdb7eecda3c225
Author: José Fonseca <jfonseca@vmware.com>
Date: Mon Feb 27 19:14:27 2012 +0000

llvmpipe: Don't emit EMMS on x86

We already take precautions to ensure that LLVM never emits MMX code.

commit 4822fea3f0440b5205e957cd303838c3b128419c
Author: Roland Scheidegger <sroland@vmware.com>
Date: Wed Feb 29 15:58:19 2012 +0100

draw: modifications for larger vector sizes

We want to be able to use larger vectors especially for running the vertex
shader. With this patch we build soa vectors which might have a different
length than 4.
Note that aos structures really remain the same, only when aos structures
are converted to soa potentially different sized vectors are used.
Samplers probably don't work yet, didn't look at them.
Testing done:
glxgears works with both 128bit and 256bit vectors.

commit f4950fc1ea784680ab767d3dd0dce589f4e70603
Author: José Fonseca <jfonseca@vmware.com>
Date: Wed Feb 29 15:51:57 2012 +0100

gallivm: override native vector width with LP_NATIVE_VECTOR_WIDTH env var for debug

commit 6ad6dbf0c92f3bf68ae54e5f2aca035d19b76e53
Author: José Fonseca <jfonseca@vmware.com>
Date: Wed Feb 29 15:51:24 2012 +0100

draw: allocate storage with alignment according to native vector width

commit 7bf0e3e7c9bd2469ae7279cabf4c5229ae9880c1
Author: José Fonseca <jfonseca@vmware.com>
Date: Fri Feb 24 19:06:08 2012 +0000

gallivm: Fix comment grammar.

Was missing several words. Spotted by Roland.

commit b20f1b28eb890b2fa2de44a0399b9b6a0d453c52
Author: José Fonseca <jfonseca@vmware.com>
Date: Thu Feb 23 19:22:09 2012 +0000

gallivm: Use MC-JIT on LLVM 3.1 + (i.e, SVN)

MC-JIT

Note: MC-JIT is still WIP. For this to work correctly it requires
LLVM changes which are not yet upstream.

commit b1af4dfcadfc241fd4023f4c3f823a1286d452c0
Author: Roland Scheidegger <sroland@vmware.com>
Date: Thu Feb 23 20:03:15 2012 +0100

llvmpipe: use new lp_type_width() helper in lp_test_blend

commit 04e0a37e888237d4db2298f31973af459ef9c95f
Author: Roland Scheidegger <sroland@vmware.com>
Date: Thu Feb 23 19:50:34 2012 +0100

llvmpipe: clean up lp_test_blend a little

Using variables just sized and aligned right makes it a bit more obvious
what's going on.
The test still only tests vector length 4.
For AoS anything else probably isn't going to work.
For SoA other lengths should work (at least with floats).

commit e61c393d3ec392ddee0a3da170e985fda885a823
Author: José Fonseca <jfonseca@vmware.com>
Date: Thu Feb 23 17:48:30 2012 +0000

gallivm: Ensure vector width consistency.

Instead of assuming that everything is the max native size.

commit 330081ac7bc41c5754a92825e51456d231bf84dd
Author: José Fonseca <jfonseca@vmware.com>
Date: Thu Feb 23 17:44:14 2012 +0000

draw: More simd vector width consistency fixes.

commit d90ca002753596269e37297e2e6c139b19f29f03
Author: José Fonseca <jfonseca@vmware.com>
Date: Thu Feb 23 17:43:00 2012 +0000

gallivm: Remove unused lp_build_int32_vec4_type() helper.

commit cae23417824d75869c202aaf897808d73a2c1db0
Author: Roland Scheidegger <sroland@vmware.com>
Date: Thu Feb 23 17:32:16 2012 +0100

gallivm: use global variable for native vector width instead of define

We do not know the simd extensions (and hence the simd width we should use)
available at compile time.
At least for now keep a define for maximum vector width, since a global
variable obviously can't be used to adjust alignment of automatic stack
variables.
Leave the runtime-determined value at 128 for now in all cases.
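
A minimal C sketch of the split between a compile-time maximum and a runtime
width, including a debug override of the kind the LP_NATIVE_VECTOR_WIDTH
commit above describes. The names MAX_VECTOR_WIDTH, native_vector_width and
choose_vector_width() as well as the value 256 are placeholders, not the
actual gallivm definitions:

```c
#include <assert.h>
#include <stdlib.h>

/* Compile-time upper bound; usable for sizing/aligning stack variables. */
#define MAX_VECTOR_WIDTH 256

/* Runtime value, decided once at initialization. */
unsigned native_vector_width = 128;

/*
 * Pick the vector width in bits: start from the detected value, let a
 * non-empty, non-zero override string replace it, and never exceed the
 * compile-time maximum.
 */
unsigned
choose_vector_width(const char *override, unsigned detected)
{
   unsigned long width = detected;

   if (override) {
      unsigned long v = strtoul(override, NULL, 10);
      if (v != 0)
         width = v;
   }

   if (width > MAX_VECTOR_WIDTH)
      width = MAX_VECTOR_WIDTH;

   assert(width <= MAX_VECTOR_WIDTH);
   return (unsigned)width;
}
```

At startup this could be used as
native_vector_width = choose_vector_width(getenv("LP_NATIVE_VECTOR_WIDTH"), 128);
while arrays on the stack keep using MAX_VECTOR_WIDTH for their size.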

commit 51270ace6349acc2c294fc6f34c025c707be538a
Author: José Fonseca <jfonseca@vmware.com>
Date: Thu Feb 23 15:41:02 2012 +0000

gallivm: Add a hunk inadvertently lost when rebasing.

commit bf256df9cfdd0236637a455cbaece949b1253e98
Author: José Fonseca <jfonseca@vmware.com>
Date: Thu Feb 23 14:24:23 2012 +0000

llvmpipe: Use consistent vector width in depth/stencil test.

commit 5543b0901677146662c44be2cfba655fd55da94b
Author: José Fonseca <jfonseca@vmware.com>
Date: Thu Feb 23 14:19:59 2012 +0000

draw: Use a consistent vector register width.

Instead of 4x32 sometimes, LP_NATIVE_VECTOR_WIDTH other times.

commit eada8bbd22a3a61f549f32fe2a7e408222e5c824
Author: José Fonseca <jfonseca@vmware.com>
Date: Thu Feb 23 12:08:04 2012 +0000

gallivm: Remove garbage collection.

MC-JIT will require one compilation per module (as opposed to one
compilation per function), therefore no state will be shared,
eliminating the need to do garbage collection.

commit 556697ea0ed72e0641851e4fbbbb862c470fd7eb
Author: José Fonseca <jfonseca@vmware.com>
Date: Thu Feb 23 10:33:41 2012 +0000

gallivm: Move all native target initialization to lp_set_target_options().

commit c518e8f3f2649d5dc265403511fab4bcbe2cc5c8
Author: José Fonseca <jfonseca@vmware.com>
Date: Thu Feb 23 09:52:32 2012 +0000

llvmpipe: Create one gallivm instance for each test.

commit 90f10af8920ec6be6f2b1e7365cfc477a0cb111d
Author: José Fonseca <jfonseca@vmware.com>
Date: Thu Feb 23 09:48:08 2012 +0000

gallivm: Avoid LLVMAddGlobalMapping() in lp_bld_assert().

Brittle, complex, and unnecessary. Just use function pointer constant.

commit 98fde550b33401e3fe006af59db4db628bcbf476
Author: José Fonseca <jfonseca@vmware.com>
Date: Thu Feb 23 09:21:26 2012 +0000

gallivm: Add a lp_build_const_func_pointer() helper.

To be reused in all places where we want to call C code.

commit 6cfedadb62c2ce5af8d75969bc95a607f3ece118
Author: José Fonseca <jfonseca@vmware.com>
Date: Thu Feb 23 09:44:41 2012 +0000

gallivm: Cleanup/simplify lp_build_const_string_variable.

- Move to lp_bld_const where it belongs
- Rename to lp_build_const_string
- take the length from the argument (and don't count the zero terminator twice)
- bitcast the constant to generic i8 *

commit db1d4018c0f1fa682a9da93c032977659adfb68c
Author: José Fonseca <jfonseca@vmware.com>
Date: Thu Feb 23 11:52:17 2012 +0000

gallivm: Set NoFramePointerElimNonLeaf to true where supported.

commit 088614164aa915baaa5044fede728aa898483183
Author: Roland Scheidegger <sroland@vmware.com>
Date: Wed Feb 22 19:38:47 2012 +0100

llvmpipe: pass in/out pointers rather than scalar floats in lp_bld_arit

we don't want llvm to potentially optimize away the vectors (though it doesn't
seem to currently), plus we want to be able to handle in/out vectors of arbitrary
length.

commit 3f5c4e04af8a7592fdffa54938a277c34ae76b51
Author: Roland Scheidegger <sroland@vmware.com>
Date: Tue Feb 21 23:22:55 2012 +0100

gallivm: fix lp_build_sqrt() for vector length 1

since we optimize away vectors with length 1 need to emit intrinsic
without vector type.

commit 79d94e5f93ed8ba6757b97e2026722ea31d32c06
Author: José Fonseca <jfonseca@vmware.com>
Date: Wed Feb 22 17:00:46 2012 +0000

llvmpipe: Remove lp_test_round.

commit 81f41b5aeb3f4126e06453cfc78990086b85b78d
Author: Roland Scheidegger <sroland@vmware.com>
Date: Tue Feb 21 23:56:24 2012 +0100

llvmpipe: subsume lp_test_round into lp_test_arit

Much simpler, and since the arguments aren't passed as 128bit values can run
on any arch.
This also uses the float instead of the double versions of the c functions
(which probably was the intention anyway).
In contrast to lp_test_round the output is much less verbose however.
Tested vector width of 32 to 512 bits - all pass except 32 (length 1) which
crashes in lp_build_sqrt() due to wrong type.

Signed-off-by: José Fonseca <jfonseca@vmware.com>

commit 945b338b421defbd274481d8c4f7e0910fd0e7eb
Author: José Fonseca <jfonseca@vmware.com>
Date: Wed Feb 22 09:55:03 2012 +0000

gallivm: Centralize the function compilation logic.

This simplifies a lot of code.

Also doing this in a central place will make it easier to carry out the
changes necessary to use MC-JIT in the future.

gallivm: Fix typo in explicit derivative shuffle.

Trivial.

draw: make DEBUG_STORE work again

adapt to lp_build_printf() interface changes

Reviewed-by: José Fonseca <jfonseca@vmware.com>

draw: get rid of vecnf_from_scalar()

just use lp_build_broadcast directly (cannot assign a name but don't really
need it, vecnf_from_scalar() was producing much uglier IR due to using
repeated insertelement instead of insertelement+shuffle).

Reviewed-by: José Fonseca <jfonseca@vmware.com>

llvmpipe: fix typo in complex interpolation code

Fixes position interpolation when using complex mode
(piglit fp-fragment-position and similar)

Reviewed-by: José Fonseca <jfonseca@vmware.com>

draw: fix clipvertex/position storing again

This appears to be the result of a bad merge.
Fixes piglit tests relying on clipping, like a lot of the interpolation tests.

Reviewed-by: José Fonseca <jfonseca@vmware.com>

gallivm: Fix explicit derivative manipulation.

Same counter variable was being used in two nested loops. Use more
meaningful variable names for the counter to fix and avoid this.

gallivm: Prevent buffer overflow in repeat wrap mode for NPOT.

Based on Roland's patch, discussion, and review.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>

gallivm: Fix dims for TGSI_TEXTURE_1D in emit_tex.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>

gallivm: Fix explicit volume texture derivatives.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>

gallivm: fix 1d shadow texture sampling

The r coordinate is always used, hence we need 3 coords, not two
(the second one is unused).

Reviewed-by: José Fonseca <jfonseca@vmware.com>

gallivm: Enable AVX support without MCJIT, where available.

For now, this just enables AVX on Windows for testing. If the code is
stable then we might consider preferring the old JIT wherever possible.

No change elsewhere.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
c790c2c7598dea2d5a5b0bfbe47732956e1e89a6 19-Jun-2012 Olivier Galibert <galibert@pobox.com> llvmpipe: Add vertex id support.

Signed-off-by: Olivier Galibert <galibert@pobox.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
46931ecf480e1d231bb6c2236d91b5390f2465ac 19-Jun-2012 Olivier Galibert <galibert@pobox.com> llvmpipe: Simplify and fix system variables fetch.

The system array values concept doesn't really work because it expects the
system values to be fixed per call, which is wrong for gl_VertexID and
iffy for gl_SampleID. So this patch does two things:

- kill the array, have emit_fetch_system_value directly pick the
values it needs (only gl_InstanceID for now, as the previous code)

- correctly handle the expected type in emit_fetch_system_value

Signed-off-by: Olivier Galibert <galibert@pobox.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
00eb74b275e21d567a0ab8a6731181e005208634 18-May-2012 José Fonseca <jose.r.fonseca@gmail.com> Fix fetching integer inputs.
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
5d10d757276a599a60a68b88b21087b5824a8df7 17-May-2012 Olivier Galibert <galibert@pobox.com> llvmpipe: Implement TXQ.

Piglit tests for fragment shaders pass, vertex shaders fail. The
actual failure seems to be in the interpolators, and not the
textureSize query.

Signed-off-by: Olivier Galibert <galibert@pobox.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: José Fonseca <jose.r.fonseca@gmail.com>
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
914244e59d6ad08ec2dd815129c6e75b32843d80 25-Apr-2012 José Fonseca <jfonseca@vmware.com> gallivm: Use lp_build_alloca instead of LLVMBuildAlloca on the loop limiter.

This ensures that the alloca is at the top of the function body; otherwise
LLVM will not eliminate it, causing stack misalignment on 32 bits.

Reviewed-by: James Benton <jbenton@vmware.com>
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
5db9d76a6a498c029133a8c2544c4c7c25eebf80 02-Apr-2012 James Benton <jbenton@vmware.com> gallivm: Maximum loop iterations

Limits the maximum number of loop iterations in a TGSI shader to prevent
infinite loops from occurring. Any iteration in any loop counts towards this
limit.

Signed-off-by: José Fonseca <jfonseca@vmware.com>
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
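
The shared iteration budget described in the "Maximum loop iterations" entry above can be modelled in plain C. This is only a sketch of the idea, not the gallivm code: `loop_limiter`, `loop_may_continue` and `run_nested_loops` are hypothetical names, and the real implementation emits LLVM IR rather than running a C loop.

```c
#include <assert.h>

/* Hypothetical model: one budget shared by every loop in the shader. */
struct loop_limiter {
   int remaining;
};

/* Returns 1 if another iteration may run, consuming one unit of budget. */
static int
loop_may_continue(struct loop_limiter *lim)
{
   assert(lim->remaining >= 0);
   if (lim->remaining == 0)
      return 0;
   lim->remaining--;
   return 1;
}

/*
 * Two nested loops with no exit condition of their own. Both draw from the
 * same limiter, so the total number of iterations is bounded by the budget.
 */
static int
run_nested_loops(int max_iterations)
{
   struct loop_limiter lim;
   int executed = 0;

   assert(max_iterations >= 0);
   lim.remaining = max_iterations;

   while (loop_may_continue(&lim)) {
      executed++;
      while (loop_may_continue(&lim))
         executed++;
   }
   return executed;
}
```

Because inner and outer iterations consume the same counter, the total work is capped regardless of how deeply the loops nest.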
a206c4cd69a881bf3f8d960607d604b6d53e3a26 20-Feb-2012 José Fonseca <jfonseca@vmware.com> gallivm: Fix TGSI_OPCODE_ARR's translation.

Like TGSI_OPCODE_ARL, destination should be an integer.

This fixes invalid LLVM IR on an internal state tracker (currently Mesa
never emits this opcode).

In the future consider making the ADDR register also an integer-as-float array,
like all other register kinds, or simply replace ADDR & ARR/ARL with
integer temp and instructions.

Reviewed-by: Dave Airlie <airlied@redhat.com>
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
7199b0b6811b3340cb5c531c8625220e964fa16c 06-Feb-2012 Dave Airlie <airlied@redhat.com> gallivm: fetch immediates to correct type (v2)

Fetch float/uint/int immediates.

v2: bitcast uint/int to floats as per Jose's suggestions.

Signed-off-by: Dave Airlie <airlied@redhat.com>
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
117a0e91afa4fae55df88de48a058c9b881c6b14 06-Feb-2012 Dave Airlie <airlied@redhat.com> gallivm: enable stores of integer types. (v2) + fix ARL

Infer from the operand the type of value to store.
MOV is untyped but we use the float store path.

v2: make MOV use float store path.

I've had to squash merge the ARL fix to be stored
as an integer in here to avoid regressions in a number
of piglit tests.

From now on ARL stores to an integer just like HW does.

Signed-off-by: Dave Airlie <airlied@redhat.com>
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
141f2c2fc9325a5d30629373bb962f42517967ae 06-Feb-2012 Dave Airlie <airlied@redhat.com> gallivm: enable fetch for integer opcodes. (v2)

This infers the type of data required using the opcode,
and casts the input to the appropriate type.

So far this only handles non-indirect constants and temporaries.

v2: as per Jose suggestion, fetch immediates via floats

Signed-off-by: Dave Airlie <airlied@redhat.com>
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
66461aa249a95053fd5887df75ab791558c3a486 06-Feb-2012 Dave Airlie <airlied@redhat.com> gallivm: add uint/int bld to the base builder. (v2)

These are used inside the action handlers for the integer opcodes.

v2: use uint_bld/int_bld, drop higher level uint_bld.

Signed-off-by: Dave Airlie <airlied@redhat.com>
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
f667a6f3cefbfb33478de87c166f7a52ed388fb4 07-Feb-2012 Dave Airlie <airlied@redhat.com> gallivm: fix build gather to take a bld context

Then pass the correct build context to it.

Signed-off-by: Dave Airlie <airlied@redhat.com>
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
639fbe2e75bb23a72262a7bc60d69d026b649609 06-Feb-2012 Dave Airlie <airlied@redhat.com> gallivm: pass build context to exec_mask_store.

For now just pass the current context, but when we want to
store int or unsigned we need to pass those later.

Signed-off-by: Dave Airlie <airlied@redhat.com>
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
bc2875aa483a0fef7f6e32c1886f6e2edaba7694 12-Aug-2011 Tom Stellard <thomas.stellard@amd.com> gallivm: Add a new interface for doing TGSI->LLVM conversions

lp_bld_tgsi_soa.c has been adapted to use this new interface, but
lp_bld_tgsi_aos.c has only been partially adapted, since nothing in
gallium currently uses it.

v2:
- Rename lp_bld_tgsi_action.[ch] => lp_bld_tgsi_action.[ch]
- Initialize tgsi_info in lp_bld_tgsi_aos.c
- Fix copyright dates
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
82b71db03ddaf0eed504412c9169db37cf9bdadc 14-Jan-2012 Tom Stellard <tstellar@gmail.com> gallium: Move duplicated helper macros to tgsi_exec.h
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
9ee1bcf7a5442ccb517a5cfbaf024755bd4d2738 14-Jan-2012 Tom Stellard <tstellar@gmail.com> gallium: Unify defines of CHAN_[XYZW] in tgsi_exec.h
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
e1e03ce4928edf4ea0ef43d853cb869f70b126aa 16-Oct-2011 José Fonseca <jose.r.fonseca@gmail.com> gallivm: Eliminate tgsi_util_get_full_src_register_sign_mode call.

It complicates more than it simplifies, now that there's only one negate
bit on TGSI registers.
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
d8452a0be810d7176b0cbfe6632fc0f8016b5733 05-Sep-2011 Marek Olšák <maraeo@gmail.com> gallium: add shadow 1D and 2D array samplers to TGSI

And filling in all the switch statements in auxiliary. Mostly untested.
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
31200d0688b67a0d764ad7fe4c2761d0f8d993d8 27-Apr-2011 Marek Olšák <maraeo@gmail.com> gallivm: fix warning: ‘value’ may be used uninitialized in this function

The path where it's uninitialized is guarded by an assert.
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
652901e95b4ed406293d0e1fabee857c054119b1 15-Jan-2011 Brian Paul <brianp@vmware.com> Merge branch 'draw-instanced'

Conflicts:
src/gallium/auxiliary/draw/draw_llvm.c
src/gallium/drivers/llvmpipe/lp_state_fs.c
src/glsl/ir_set_program_inouts.cpp
src/mesa/tnl/t_vb_program.c
1d6f3543a063ab9e740fd0c149dcce26c282d773 09-Dec-2010 Brian Paul <brianp@vmware.com> gallivm/llvmpipe: implement system values and instanceID
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
14746b1d4fc7ae30b557dacc819b81756df2f72f 03-Dec-2010 Brian Paul <brianp@vmware.com> gallivm: fix null builder pointers
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
6299f241e9fdd86e705d144a42d9b1979c13f9ad 03-Dec-2010 Brian Paul <brianp@vmware.com> gallivm/llvmpipe: remove lp_build_context::builder

The field was redundant. Use the gallivm->builder value instead.
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
efc82aef35a2aac5d2ed9774f6d28f2626796416 01-Dec-2010 Brian Paul <brianp@vmware.com> gallivm/llvmpipe: squash merge of the llvm-context branch

This branch defines a gallivm_state structure which contains the
LLVMBuilderRef, LLVMContextRef, etc. All data structures built with
this object can be periodically freed during a "garbage collection"
operation.

The gallivm_state object has to be passed to most of the builder
functions where LLVMBuilderRef used to be used.

Conflicts:
src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
src/gallium/drivers/llvmpipe/lp_state_setup.c
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
557280542399629ac64a48f5b618365e2b18fce1 30-Nov-2010 Zack Rusin <zackr@vmware.com> gallivm: fix storing of the addr register

we store into the index specified by the register index, not an
indirect register.
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
f623d0c1c217d990f207306eb968172af79fa969 09-Nov-2010 Zack Rusin <zackr@vmware.com> gallivm: implement indirect addressing over inputs

Instead of messing with the callers simply copy our inputs into a
an alloca array at the beginning of the function and then use it.

Reviewed-by: José Fonseca <jfonseca@vmware.com>
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
10740acf46e08960dde790005d65a98440f313bc 09-Nov-2010 José Fonseca <jfonseca@vmware.com> gallivm: Allocate TEMP/OUT arrays only once.
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
528c3cd24169c6b6c0da60cb2b8f765eb7f05cdc 08-Nov-2010 Zack Rusin <zackr@vmware.com> gallivm: implement indirect addressing of the output registers
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
55c5408ad049423597cd274e7abcd2d91a16ead3 05-Nov-2010 Brian Paul <brianp@vmware.com> gallivm: add const qualifiers, fix comment string
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
e8d6b2793ff3907d3646eeaceaa00e2a04728e67 05-Nov-2010 Brian Paul <brianp@vmware.com> gallivm: alloca() was called too often for temporary arrays

Need to increment the array index to point to the last value.
Before, we were calling lp_build_array_alloca() over and over for
no reason.
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
e7f5d19a1106351f2db8f62f59f51be86eaa93df 04-Nov-2010 Brian Paul <brianp@vmware.com> gallivm: implement execution mask for scatter stores
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
ede232e9898698258391a280a098a7ba951b0099 04-Nov-2010 Brian Paul <brianp@vmware.com> gallivm: add pixel offsets in scatter stores

We want to do the scatter store to sequential locations in memory
for the vector of pixels we're processing in SOA format.
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
5b294a5d17c818ecbb1295fdd20825da9b106792 04-Nov-2010 Brian Paul <brianp@vmware.com> gallivm: added debug code to dump temp registers
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
3ded3e98ffc36820c8ab318d736eab99bb16f26b 04-Nov-2010 Brian Paul <brianp@vmware.com> gallivm: add some LLVM var labels
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
2fefbc79ac8bb55197ff817feeca2626585d7a8c 04-Nov-2010 Brian Paul <brianp@vmware.com> gallivm: implement scatter stores into temp register file

Something is not quite right, however. The piglit tests mentioned in
fd.o bug 31226 still don't pass.
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
ec2824cd867d3b782588be1f3b1d5d802eb381ab 19-Oct-2010 Brian Paul <brianp@vmware.com> gallivm: fix incorrect type for zero vector in emit_kilp()

http://bugs.freedesktop.org/show_bug.cgi?id=30974
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
22ec25e2bf5c9309610b68e8e40472a8ea695ba9 10-Oct-2010 Keith Whitwell <keithw@vmware.com> gallivm: don't branch on KILLs near end of shader
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
17dbd41cf23e7e7de2f27e5e9252d7f792d932f3 10-Oct-2010 José Fonseca <jfonseca@vmware.com> gallivm: Pass texture coords derivatives as scalars.

We end up treating them as scalars in the end, and it saves some
instructions.
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
aa4cb5e2d8d48c7dcc9653c61a9e25494e3e7b2a 07-Oct-2010 Keith Whitwell <keithw@vmware.com> llvmpipe: try to be sensible about whether to branch after mask updates

Don't branch more than once in quick succession. Don't branch at the
end of the shader.
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
6d173da5c84142ee64f56f4c2e9e495dc1435e91 16-Sep-2010 José Fonseca <jfonseca@vmware.com> gallivm: Clamp indirect register indices to file_max.

Prevents crashes with bogus data, or bad shader translation.
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
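
The clamping described in the "Clamp indirect register indices to file_max" entry above could look roughly like this in scalar C. This is a sketch under assumptions: `clamp_indirect_index` is a hypothetical helper, and treating the index as unsigned (so that a negative value wraps to a large one and is clamped too) is a choice made for this illustration, not something the log states.

```c
#include <assert.h>

/*
 * Clamp a computed register index to the highest register declared in the
 * file. file_max is assumed to be non-negative here, i.e. the register file
 * has at least one declared register.
 */
static unsigned
clamp_indirect_index(unsigned index, int file_max)
{
   assert(file_max >= 0);
   return index < (unsigned)file_max ? index : (unsigned)file_max;
}
```

With this form, bogus address-register contents can only ever select a register that exists, which is what prevents the crash.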
3d5b9c1f2d3340259dd0d8765090a5a963074f29 16-Sep-2010 José Fonseca <jfonseca@vmware.com> gallivm: Fix address register swizzle.

We're actually doing a double swizzling:

indirect_reg->Swizzle[indirect_reg->SwizzleX]

instead of simply

indirect_reg->SwizzleX
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
58daea741fa21fe3f89fd7bf106df1545c5b21af 02-Sep-2010 José Fonseca <jfonseca@vmware.com> gallivm: Move the texture modifiers to the header.

Useful to pass these around.
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
079763f74648fef051ee5b8f7d730f7fc1ba27d5 02-Sep-2010 José Fonseca <jfonseca@vmware.com> gallivm: Cope with tgsi instruction reallocation failure.
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
3fa3c33844b8491a204cda6ae8d67cd6ada78b3b 01-Sep-2010 Brian Paul <brianp@vmware.com> gallivm: fix bug in nested conditionals

This, plus the previous commit fix fd.o bug 29806.
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
8690c6a6b4fb0b48e2ae75cd0f64de86b039081c 18-Aug-2010 michal <michal@capacitor.(none)> gallivm: Use proper index to lookup predicate register array.

Doesn't fix anything, as those indices were both always 0.
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
8a3a971743a90463e65b44f1769a5301a31ce4cd 09-Aug-2010 José Fonseca <jfonseca@vmware.com> gallivm: Don't call LLVMBuildFNeg on llvm-2.6.

It didn't exist yet.
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
fc9a49b638c26801951c33a570178bbb2b67ec60 08-Aug-2010 nobled <nobled@dreamwidth.org> gallivm: Always use floating-point operators for floating-point types

This fixes the assert added in LLVM 2.8:
assert(getType()->isIntOrIntVectorTy() &&
"Tried to create an integer operation on a non-integer type!")

But it also fixes some subtle bugs, since we should've been doing this
since LLVM 2.6 anyway.

Includes a modified patch from steckdenis@yahoo.fr for the
FNeg instructions in emit_fetch(); thanks for pointing those out.

http://bugs.freedesktop.org/29404
http://bugs.freedesktop.org/29407

Signed-off-by: José Fonseca <jfonseca@vmware.com>
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
12f5c0f9ce497e99854e0a3a7f5ff297a2a0a1e3 08-Aug-2010 José Fonseca <jfonseca@vmware.com> gallivm: Fix more integer operations.
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
cd5af8c703d84dd856528554fa615e9787ebe75f 08-Aug-2010 nobled <nobled@dreamwidth.org> gallivm: Use the correct context for integers

See:
http://bugs.freedesktop.org/29407
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
3662afd87d61e3f65843b210a7e8c9c8a6cb27f0 21-Jul-2010 Brian Paul <brianp@vmware.com> gallivm: replace has_indirect_addressing field with indirect_files field

Instead of one big boolean indicating indirect addressing, use a
bitfield indicating which register files are accessed with indirect
addressing.

Most shaders that use indirect addressing only use it to access the
constant buffer. So no need to use an array for temporary registers
in this case.
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
0115f07507fc661a0a19564c496a781c3dcbc7a0 21-Jul-2010 Brian Paul <brianp@vmware.com> gallivm: refactor code into get_indirect_offsets() function
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
890976e02d9b75c0814493901ffddb64092ea548 21-Jul-2010 Brian Paul <brianp@vmware.com> gallivm: added comment
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
be22e1e781094decfb408ad6d74e3d833b297c87 21-Jul-2010 Brian Paul <brianp@vmware.com> gallivm: remove extraneous braces
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
f674ed6b0662a15ab8298da0848a4c82694e0c95 21-Jul-2010 Brian Paul <brianp@vmware.com> gallivm: no longer do indirect addressing in get_temp_ptr()
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
105ed7dfd4abc94db1ce0cba2967ac0491158389 21-Jul-2010 Brian Paul <brianp@vmware.com> gallivm: implement correct indirect addressing of temp registers

As with indexing the const buffer, the ADDR reg may have totally
different values for each element. Need to use a gather operation.
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
695814a15b4d64e1fa829d51f18c4089837929c3 21-Jul-2010 Brian Paul <brianp@vmware.com> gallivm: re-org, comments for get_temp_ptr()
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
ec0e7b16bb6753bedbd611a97062934bfca03aa7 21-Jul-2010 Brian Paul <brianp@vmware.com> gallivm: rename a var to avoid compiler warnings
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
4363d4d0b945c4ca6c303fb337e1fac39e6e1ad6 21-Jul-2010 Brian Paul <brianp@vmware.com> gallivm: fix indirect addressing of constant buffer

The previous code assumed that all elements of the address register
were the same. But it can vary from pixel to pixel or vertex to
vertex so we must use a gather operation when dynamically indexing
the constant buffer.

Still need to fix this for the temporary register file...
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
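
The gather described in the "fix indirect addressing of constant buffer" entry above can be illustrated with a scalar loop over the lanes. This is a minimal sketch, assuming four lanes and a constant buffer laid out as consecutive 4-channel registers; `gather_const_chan` and `LANES` are hypothetical names, not gallivm functions.

```c
#include <assert.h>

#define LANES 4   /* assumed vector width for this illustration */

/*
 * Fetch one channel of CONST[reg_index + ADDR] for every lane. Each lane
 * has its own address-register value, so each lane may read a different
 * constant: that is the gather.
 */
static void
gather_const_chan(const float *consts, int reg_index, int chan,
                  const int addr[LANES], float out[LANES])
{
   int lane;

   assert(chan >= 0 && chan < 4);
   for (lane = 0; lane < LANES; lane++) {
      int index = (reg_index + addr[lane]) * 4 + chan;
      assert(index >= 0);   /* the caller must keep indices in range */
      out[lane] = consts[index];
   }
}
```

A single scalar load is only correct when every element of `addr` is equal, which is the assumption the earlier code made.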
5a916204179c6787157af3f3be758dc36162ab20 07-Jun-2010 Keith Whitwell <keithw@vmware.com> gallivm: eliminate tgsi_exec.h include
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
ef81779850d1343b3ae284eb9beabeaf11934d4a 03-Jun-2010 José Fonseca <jfonseca@vmware.com> gallivm: Factor out the quad derivative code into a single place. Fix ddy.

For ddy it should be (bottom - top).
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
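
The sign convention in the "Fix ddy" entry above can be written out for a single 2x2 quad. This is a sketch only: the lane ordering (top-left, top-right, bottom-left, bottom-right) and the helper names are assumptions made for this illustration.

```c
#include <assert.h>

/* Assumed lane order within a 2x2 quad. */
enum {
   QUAD_TOP_LEFT = 0,
   QUAD_TOP_RIGHT = 1,
   QUAD_BOTTOM_LEFT = 2,
   QUAD_BOTTOM_RIGHT = 3
};

/* Horizontal derivative: right minus left. */
static float
quad_ddx(const float q[4])
{
   assert(q != 0);
   return q[QUAD_TOP_RIGHT] - q[QUAD_TOP_LEFT];
}

/* Vertical derivative: bottom minus top, as the commit message states. */
static float
quad_ddy(const float q[4])
{
   assert(q != 0);
   return q[QUAD_BOTTOM_LEFT] - q[QUAD_TOP_LEFT];
}
```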
846b2fccc2a67b08acc6da51f4970fe66ed4559b 19-May-2010 Brian Paul <brianp@vmware.com> gallivm: rename a var: s/val/array_size/
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
32a7209c0a0d5ae63f12056ed969087d942c6298 17-May-2010 José Fonseca <jfonseca@vmware.com> gallivm: Tweak ret_mask handling.
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
0b7ca2f8fcb187fb3aa37e0b6dc4b0a84101478f 17-May-2010 Zack Rusin <zackr@vmware.com> gallivm: implement function calls by inlining

with this approach we inline the entire function body in the caller
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
263e038431f24f24aaec252e135ffc9f2f09640e 15-May-2010 José Fonseca <jfonseca@vmware.com> gallivm: Temporarily remove function call support

Commits moved to the gallivm-call feature branch for further
experimentation and stabilization.
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
af75bb4c9865a8162ae190d99d062047150b67b5 15-May-2010 Zack Rusin <zackr@vmware.com> gallivm: use our util_snprintf
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
74589283704c1e8723cfcdf14278c05039e2dbda 15-May-2010 Zack Rusin <zackr@vmware.com> gallivm: implement function calls
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
85c6799f6e2645e708eb03201e91f3285de7d9e1 14-May-2010 Brian Paul <brianp@vmware.com> tgsi: clean up in emit_fetch()
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
415c40735dfd110bf902ce43968c8d7bd23ff111 13-May-2010 Brian Paul <brianp@vmware.com> gallivm: silence uninitialized var warning
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
c91d9cb563baf292f04ad3eeb33837806a4fb20d 13-May-2010 Brian Paul <brianp@vmware.com> gallivm: silence uninitialized var warning
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
5b5ce16da533290e605005916a19c8339f283787 12-May-2010 Brian Paul <brianp@vmware.com> gallivm: rename a var
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
3f6dc8e79d918283a6dfcf9c8937a6d52f3bb4f5 12-May-2010 Brian Paul <brianp@vmware.com> gallivm/llvmpipe: add const qualifiers
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
54b94ee96a6d750d57d99ae9819fcf8206d4680d 09-May-2010 José Fonseca <jfonseca@vmware.com> gallivm: Add missing lvalue.
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
8ad3e0b55df50beac8ba3c5cafa0be79641a4977 09-May-2010 José Fonseca <jfonseca@vmware.com> gallivm: Add an alternative to LLVMDumpValue that works with Windows GUI apps.
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
2d91903882e399e8ea7306fd37d5d214907247e6 08-May-2010 José Fonseca <jfonseca@vmware.com> gallivm: Fix BREAK/CONT translation.

The cont_mask must be restored and exec mask recomputed in order to decide
whether to repeat the loop or not.

Unlike the continue mask, the break_mask must be preserved across loop
iterations.

Fixes several VShader DCT cases, and no regressions with glean.
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
ff6c78f44f2f741f4825b07dbc15b3a951fe9b2c 08-May-2010 José Fonseca <jfonseca@vmware.com> gallivm: Support predicates.
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
ec43b2eb45a1b2e33f328f76624c987484e329f3 04-May-2010 José Fonseca <jfonseca@vmware.com> gallivm: Proper implementation of TXL opcode.
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
4554cdc289f1d97855825127c0bf8c0e7f6a2eda 04-May-2010 José Fonseca <jfonseca@vmware.com> gallivm: Fix several glitches introduced in the prev commit.
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
962558daaed43b0111cd062e32821aad106869d7 04-May-2010 José Fonseca <jfonseca@vmware.com> gallivm: Implement TXD.
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
6c8c88f02f0dc9cf39ce51d068525a94fccd5dc7 03-May-2010 José Fonseca <jfonseca@vmware.com> gallivm: Increase the TGSI translation limits and centralize them in a header.
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
9db4a211e96356deb963223038eea074a5fe0eda 03-May-2010 José Fonseca <jfonseca@vmware.com> gallivm: Display message instead of crashing when sampler generator was not supplied for tgsi translation.
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
e27983bc08d4eff5effbbcffbf5c9f5862fca2cf 03-May-2010 José Fonseca <jfonseca@vmware.com> gallivm: Replace predicate assertion failure with warning message.
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
d2b6ed7c4daf094bfe3fa4e0318133d0a8ea3cf6 03-May-2010 Zack Rusin <zackr@vmware.com> gallivm: fix nested break and continue statements

we were resetting the mask on each new break/continue statement within
the same scope. we always need to AND the current execution mask
with the current break/continue mask to get the correct result (the
masks are always ~1 initially)
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
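
The mask accumulation described in the "fix nested break and continue statements" entry above can be modelled with one bit per lane. This is a simplified sketch, not the gallivm code: the struct, the function names and the 4-lane width are assumptions, and the real masks are LLVM vector values.

```c
#include <assert.h>
#include <stdint.h>

#define LANES 4
#define ALL_LANES ((uint32_t)((1u << LANES) - 1u))

/* Hypothetical per-lane bit masks; a set bit means the lane is active. */
struct exec_mask_model {
   uint32_t cond_mask;    /* lanes enabled by the enclosing IF/ELSE */
   uint32_t break_mask;   /* lanes that have not executed BRK */
   uint32_t cont_mask;    /* lanes that have not executed CONT */
   uint32_t exec_mask;    /* combination of the three above */
};

static void
model_update(struct exec_mask_model *m)
{
   m->exec_mask = m->cond_mask & m->break_mask & m->cont_mask;
}

static void
model_init(struct exec_mask_model *m)
{
   m->cond_mask = m->break_mask = m->cont_mask = ALL_LANES;
   model_update(m);
}

/*
 * BRK: remove the currently executing lanes from break_mask. The update is
 * an AND with the existing mask, so lanes disabled by an earlier BRK in the
 * same scope stay disabled instead of being reset.
 */
static void
model_break(struct exec_mask_model *m)
{
   assert((m->exec_mask & ~ALL_LANES) == 0);
   m->break_mask &= ~m->exec_mask & ALL_LANES;
   model_update(m);
}
```

Overwriting `break_mask` with `~exec_mask` on the second BRK would re-enable the lanes that broke first, which is the bug the entry describes.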
b3ba54a01ab7aa2ff1fd569c3ed50c6dbc00b5f4 27-Apr-2010 José Fonseca <jfonseca@vmware.com> gallivm: Drop BGNFOR, ENDFOR, REP, and ENDREP opcodes.
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
a18c210a95794c79c6f26dbf4c66d4a85e29169d 27-Apr-2010 José Fonseca <jfonseca@vmware.com> gallivm: Ensure all allocas are in the first block.

Refactor the code to make this easier.
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
06574e45b418dab1ec106773c92b7d9e5af45c81 26-Apr-2010 Alan Hourihane <alanh@vmware.com> gallivm: BGNFOR/ENDFOR fallthrough to BGNLOOP/ENDLOOP
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
5d37cebc1b2f41ef68a5f8bb5ad66973ec2c1dd8 25-Apr-2010 Vinson Lee <vlee@vmware.com> gallivm: Rename variable info to opcode_info.

Avoid hiding existing variable already named info in outer scope.
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
a1a7738223b754044213b969371823ec52b0a9e2 25-Apr-2010 Vinson Lee <vlee@vmware.com> gallivm: Remove NULL check of pointer that can't be NULL.

info cannot be NULL at the call to debug_printf. emit_instruction
dereferences info, so at debug_printf it is either not NULL or the
program has already crashed.
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
021e0dc78b15fab29e761012860276c2597c8d8f 23-Apr-2010 Zack Rusin <zackr@vmware.com> gallivm: implement indirect addressing over temporaries

a bit more involved than indirect addressing over consts, but still
fairly reasonable. we allocate an array instead of individual alloca's,
and we do it only if the shader does indirect addressing.
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
ded2374e67bdc2c24e868775d2ff77b39b339d56 22-Apr-2010 Zack Rusin <zackr@vmware.com> gallivm: implement indirect addressing over constants

implement indirect addressing (ARL and ARR instructions) when used
with CONST's. indirect addressing over other vars (temps, inputs, outputs)
is not supported yet.
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
18a4a83ddab7655253fdb71d37393a32adcda488 22-Apr-2010 Zack Rusin <zackr@vmware.com> gallivm: update comments
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
6d2e09fdc23e2573e9466f60db20ef4ac04b367d 22-Apr-2010 Zack Rusin <zackr@vmware.com> gallivm: fix nested cont statements
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
3a423dcf9dfa725a4e5dca60f0f2b02599d2ed9b 22-Apr-2010 Zack Rusin <zackr@vmware.com> gallivm: fix nested break statements
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
7fe93f831d74ce46a161c0b0c89f00b9c18caa2b 22-Apr-2010 Brian Paul <brianp@vmware.com> gallivm: added some assertions in loop-gen code

We're hitting these assertions with nested loops...
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
bba6a196bb69afc72a9ec56740a312987e77afc2 22-Apr-2010 Brian Paul <brianp@vmware.com> gallivm: fix copy&paste error: s/cont_stack_size/break_stack_size/
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
fc9b8cd9dda946d8415732aeeed1eff5541cd1ee 22-Apr-2010 Brian Paul <brianp@vmware.com> gallivm: emit_instruction() is boolean
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
feffd259da5f2655222a2f26e2e5665a9e28173f 22-Apr-2010 Brian Paul <brianp@vmware.com> gallivm: implement TGSI KILP

As in tgsi_exec.c we don't actually rely on condition codes; we do
an unconditional kill. The only predication comes from the execution
mask which applies inside loops/conditionals.
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
dc886ba1391d7d890bd1f5532bc14553e883a418 30-Mar-2010 Zack Rusin <zackr@vmware.com> gallivm: cleanup the code (found by coverity)

the condition can't be false, declarations are ok even if we don't
emit any.
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
012fabca722494162c244a367913562b8cfa4677 29-Mar-2010 Zack Rusin <zackr@vmware.com> gallivm: make sure that the alloca's are the very first thing in the function

otherwise mem2reg can't put them in registers
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
185be3a87a5b38e8821a560c073975c11dcbd3e9 15-Mar-2010 Brian Paul <brianp@vmware.com> gallivm/llvmpipe: rename some constant building functions
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
2410125d072faeb83c8373e676422f6c44c78feb 11-Mar-2010 Brian Paul <brianp@vmware.com> gallivm: include tgsi_dump.h to silence warning
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
d42229707ad4be9be5a8e122354be7102d6ec348 10-Mar-2010 Jose Fonseca <jfonseca@vmware.com> gallivm: simplify conditional branching

Instead of testing each component individually, we can test the entire
vector at once.
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
ac33e7752d22f03db84e6a4c822b3a3f41d05f77 10-Mar-2010 Zack Rusin <zackr@vmware.com> gallivm: properly test the if condition and branch to the proper label

makes loops work
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
18d406e8a8a838c82ee4ec5dbf244ab8bba0855e 09-Mar-2010 Zack Rusin <zackr@vmware.com> gallivm: implement loops
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
1d84808dc045d7fcf2fade8d1504bc25e7c5041a 08-Mar-2010 Zack Rusin <zackr@vmware.com> gallivm: fix a crash by making sure we set the has_mask flag correctly
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
89258652b6a1d282bed14549907892bdfda752f0 06-Mar-2010 José Fonseca <jfonseca@vmware.com> gallivm: Answer question/comment.

This reverts commit 71c05689528d7987bfb99c3afe04e456887bc7b7.
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
ccf57af93f7118a044fa21e874847fa3ed555bca 05-Mar-2010 José Fonseca <jfonseca@vmware.com> gallivm: Add a placeholder for TGSI_FILE_PREDICATE registers.
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
71c05689528d7987bfb99c3afe04e456887bc7b7 04-Mar-2010 Brian Paul <brianp@vmware.com> gallivm: added question/comment
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
faf8215bae70f020420242dc812ef141fdcf5417 02-Mar-2010 Zack Rusin <zackr@vmware.com> llvmpipe: improve based on review from Jose and fix else clauses

else was broken in the outermost else statements, plus the code
didn't need an inverted mask to compute the inverse of the current
condition.
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
80f3cc36c511f62666162bca1d88c7746b98a27d 02-Mar-2010 Zack Rusin <zackr@vmware.com> llvmpipe: implement some control-flow

implements if/else/endif constructs and lays down the code for looping
and others. we create a conditional execution mask which decides which
of the four inputs are enabled for any store. it's used only if an
execution mask is present, otherwise we go through a direct store.
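
The masked-store scheme described above can be sketched in scalar C as follows. This is an illustration under assumptions, not the llvmpipe implementation: the four lanes are modelled as a plain array, each mask lane is assumed to be all-ones or zero, and `store_lanes` and `if_else_demo` are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>

#define LANES 4

/* Store src into dst. With no execution mask (NULL) this is a direct
 * store; otherwise only lanes whose mask is all-ones are written. */
static void store_lanes(uint32_t dst[LANES], const uint32_t src[LANES],
                        const uint32_t *exec_mask)
{
    for (int i = 0; i < LANES; i++) {
        if (exec_mask == NULL)
            dst[i] = src[i];
        else
            dst[i] = (src[i] & exec_mask[i]) | (dst[i] & ~exec_mask[i]);
    }
}

/* if (cond) out = 1; else out = 2;  evaluated for all four lanes.
 * Both sides run; the mask decides which lanes each store affects. */
static void if_else_demo(const uint32_t cond[LANES], uint32_t out[LANES])
{
    const uint32_t then_val[LANES] = {1, 1, 1, 1};
    const uint32_t else_val[LANES] = {2, 2, 2, 2};
    uint32_t else_mask[LANES];

    store_lanes(out, then_val, cond);      /* IF side */
    for (int i = 0; i < LANES; i++)
        else_mask[i] = ~cond[i];           /* ELSE inverts the condition */
    store_lanes(out, else_val, else_mask); /* ELSE side */
}
```

Passing `NULL` as the mask corresponds to the case where no execution mask is present and the store goes through directly.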
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
9381dd590f2e45acb8fbb0aa5503c917b832204d 11-Feb-2010 José Fonseca <jfonseca@vmware.com> llvmpipe: Handle TGSI_TOKEN_TYPE_PROPERTY.

Avoids assertion failures with certain shaders.
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
128d93a009c493c65f0fa5d220fac3098a54fa14 11-Feb-2010 José Fonseca <jfonseca@vmware.com> gallivm: TGSI_OPCODE_CONT is not deprecated.

Note that with FIXME instead of an assertion failure.

Addresses fdo 25956.
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
85c7ec70ad41c8ada75a4cbace83d16815d3e2c5 09-Feb-2010 Zack Rusin <zackr@vmware.com> llvmpipe: switch to using dynamic stack allocation instead of registers

with mutable vars we don't need to follow the phi nodes. meaning that
control flow becomes trivial as we don't have to scan the rest of the tgsi
to figure out the variable usage anymore. furthermore the memory2register
pass promotes alloca/store/load to registers while inserting the right phi
nodes. so we get simplicity and performance.
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
c61bf363937f40624a5632745630d4f2b9907082 09-Feb-2010 Zack Rusin <zackr@vmware.com> llvmpipe: export the tgsi translation code to a common layer

the llvmpipe tgsi translation is a lot more complete than what was in
gallivm so replacing the latter with the former. this is needed since
the draw llvm paths will use the same code. effectively the proven
llvmpipe code becomes gallivm.
/external/mesa3d/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c