History log of /external/mesa3d/src/gallium/auxiliary/draw/draw_private.h
38d1191f4133dc427fccdbaec61bef33201c2dcc 14-Sep-2012 Marek Olšák <maraeo@gmail.com> draw: fix non-indexed draw calls if there's an index buffer

pipe_draw_info::indexed determines if it should be indexed and not
the presence of an index buffer.

This fixes crashes in r300g.

NOTE: This is a candidate for the stable branches.

Tested-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
(cherry picked from commit 2988fa940e1d8a4531fddff4d554eec1e6e04474)
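
The rule stated above (the draw info's indexed flag decides, not the presence of an index buffer) can be sketched in plain C. This is only an illustration: the struct and function names are hypothetical, simplified stand-ins and not the actual gallium or draw module definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the real gallium structures. */
struct sketch_draw_info {
    int indexed;        /* nonzero: this draw call reads the index buffer */
    unsigned start;     /* first vertex (or first index when indexed) */
    unsigned count;     /* number of vertices to draw */
};

struct sketch_index_buffer {
    const unsigned short *elts;  /* may stay bound across draw calls */
};

/* Return the vertex number used for the i-th vertex of the draw. */
static unsigned
sketch_fetch_vertex(const struct sketch_draw_info *info,
                    const struct sketch_index_buffer *ib,
                    unsigned i)
{
    assert(i < info->count);
    if (info->indexed) {
        /* Indexed draw: an index buffer is required. */
        assert(ib != NULL && ib->elts != NULL);
        return ib->elts[info->start + i];
    }
    /* Non-indexed draw: ignore any index buffer that happens to be bound. */
    return info->start + i;
}
```

The point of the sketch is that a bound index buffer is never consulted unless the draw call itself asks for indexed rendering.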
a2c1df4c9a7375bc5306e8cfd07a9f7087759a96 10-Aug-2012 Brian Paul <brianp@vmware.com> draw: index samplers and sampler_view state by shader type

So that we can handle GS state and other types of shaders in the future.
bef196c7929606bb8c7e9c06fe83a90fc0d95f09 10-Aug-2012 Brian Paul <brianp@vmware.com> draw: move tgsi-related state into a tgsi sub-struct

To better organize things a bit.
3469715a8a171512cf9b528702e70393f01c6041 13-Jul-2012 José Fonseca <jfonseca@vmware.com> gallivm,draw,llvmpipe: Support wider native registers.

Squashed commit of the following:

commit 7acb7b4f60dc505af3dd00dcff744f80315d5b0e
Author: José Fonseca <jfonseca@vmware.com>
Date: Mon Jul 9 17:46:31 2012 +0100

draw: Don't use dynamically sized arrays.

Not supported by MSVC.

commit 5810c28c83647612cb372d1e763fd9d7780df3cb
Author: José Fonseca <jfonseca@vmware.com>
Date: Mon Jul 9 17:44:16 2012 +0100

gallivm,llvmpipe: Don't use expressions with PIPE_ALIGN_VAR().

MSVC doesn't accept expressions in __declspec(align(...)). Use a
define instead.

commit 8aafd1457ba572a02b289b3f3411e99a3c056072
Author: José Fonseca <jfonseca@vmware.com>
Date: Mon Jul 9 17:41:56 2012 +0100

gallium/util: Make u_cpu_detect.h header C++ safe.

commit 5795248350771f899cfbfc1a3a58f1835eb2671d
Author: José Fonseca <jfonseca@vmware.com>
Date: Mon Jul 2 12:08:01 2012 +0100

gallium/util: Add ULL suffix to large constants.

As suggested by Andy Furniss: it looks like some old gcc versions
require it.

commit 4c66c22727eff92226544c7d43c4eb94de359e10
Author: José Fonseca <jfonseca@vmware.com>
Date: Fri Jun 29 13:39:07 2012 +0100

gallium/util: Truly disable INF/NAN tests on MSVC.

Thanks to Brian for spotting this.

commit 8bce274c7fad578d7eb656d9a1413f5c0844c94e
Author: José Fonseca <jfonseca@vmware.com>
Date: Fri Jun 29 13:39:07 2012 +0100

gallium/util: Disable INF/NAN tests on MSVC.

Somehow they are not recognized as constants.

commit 6868649cff8d7fd2e2579c28d0b74ef6dd4f9716
Author: Roland Scheidegger <sroland@vmware.com>
Date: Thu Jul 5 15:05:24 2012 +0200

gallivm: Cleanup the 2 x 8 float -> 16 ub special path in lp_build_conv.

No behaviour change intended, like 7b98455fb40c2df84cfd3cdb1eb7650f67c8a751.

Reviewed-by: José Fonseca <jfonseca@vmware.com>

commit 5147a0949c4407e8bce9e41d9859314b4a9ccf77
Author: Roland Scheidegger <sroland@vmware.com>
Date: Thu Jul 5 14:28:19 2012 +0200

gallivm: (trivial) fix issues with multiple-of-4 texture fetch

Some formats can't handle non-multiple of 4 fetches I believe, but
everything must support length 1 and multiples of 4.
So avoid going to scalar fetch (which is very costly) just because length
isn't 4.
Also extend the hack to not use shift with variable count for yuv formats to
arbitrary length (larger than 1) - doesn't matter how many elements we
have we always want to avoid it unless we have variable shift count
instruction (which we should get with avx2).

Reviewed-by: José Fonseca <jfonseca@vmware.com>

commit 87ebcb1bd71fa4c739451ec8ca89a7f29b168c08
Author: Roland Scheidegger <sroland@vmware.com>
Date: Wed Jul 4 02:09:55 2012 +0200

gallivm: (trivial) fix typo for wrap repeat mode in linear filtering aos code

This would lead to bogus coordinates at the edges.
(undetected by piglit because this path is only taken for block-based
formats).

Signed-off-by: José Fonseca <jfonseca@vmware.com>

commit 3a42717101b1619874c8932a580c0b9e6896b557
Author: José Fonseca <jfonseca@vmware.com>
Date: Tue Jul 3 19:42:49 2012 +0100

gallivm: Fix TGSI integer translation with AVX.

commit d71ff104085c196b16426081098fb0bde128ce4f
Author: José Fonseca <jfonseca@vmware.com>
Date: Fri Jun 29 15:17:41 2012 +0100

llvmpipe: Fix LLVM JIT linear path.

It was not working properly because it was looking at the JIT function
before it was actually compiled.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>

commit a94df0386213e1f5f9a6ed470c535f9688ec0a1b
Author: José Fonseca <jfonseca@vmware.com>
Date: Thu Jun 28 18:07:10 2012 +0100

gallivm: Refactor lp_build_broadcast(_scalar) to share code.

Doesn't really change the generated assembly, but produces more compact IR,
and of course, makes code more consistent.

Reviewed-by: Brian Paul <brianp@vmware.com>

commit 66712ba2731fc029fa246d4fc477d61ab785edb5
Author: José Fonseca <jfonseca@vmware.com>
Date: Wed Jun 27 17:30:13 2012 +0100

gallivm: Make LLVMContextRef a singleton.

There are many places inside LLVM that depend on it. Too many to attempt
to fix.

Reviewed-by: Brian Paul <brianp@vmware.com>

commit ff5fb7897495ac263f0b069370fab701b70dccef
Author: Roland Scheidegger <sroland@vmware.com>
Date: Thu Jun 28 18:15:27 2012 +0200

gallivm: don't use 8-wide texture fetch in aos path

This appears to be a slight loss usually.
There are probably several reasons for that:
- fetching itself is scalar
- filtering is pure int code hence needs splitting anyway, same
for the final texel offset calculations
- texture wrap related code, which can be done 8-wide, is slightly more
complex with floats (with clamp_to_edge) and float operations generally
more costly hence probably not much faster overall
- the code needed to split when encountering different mip levels for the
quads, adding complexity
So, just split always for aos path (but leave it 8-wide for soa, since we
do 8-wide filtering there when possible).
This should certainly be revisited if we'd have avx2 support.

Reviewed-by: José Fonseca <jfonseca@vmware.com>

commit ce8032b43dcd8e8d816cbab6428f54b0798f945d
Author: Roland Scheidegger <sroland@vmware.com>
Date: Wed Jun 27 18:41:19 2012 +0200

gallivm: (trivial) don't extract fparts variable if not needed

Did not have any consequences but unnecessary.

commit aaa9aaed8f80dc282492f62aa583a7ee23a4c6d5
Author: Roland Scheidegger <sroland@vmware.com>
Date: Wed Jun 27 18:09:06 2012 +0200

gallivm: fix precision issue in aos linear int wrap code

Now passes not just at a quick glance but also with piglit.
If we do the wrapping with floats, we also need to set the
weights accordingly. We can potentially end up with different
(integer) coordinates than what the integer calculations would
have chosen, which means the integer weights calculated previously
in this case are completely wrong. Well at least that's what I think
happens, at least recalculating the weights helps.
(Some day really should refactor all the wrapping, so we do whatever is
fastest independent of 16bit int aos or 32bit float soa filtering.)

Reviewed-by: José Fonseca <jfonseca@vmware.com>

commit fd6f18588ced7ac8e081892f3bab2916623ad7a2
Author: José Fonseca <jfonseca@vmware.com>
Date: Wed Jun 27 11:15:53 2012 +0100

gallium/util: Fix parsing of options with underscore.

For example

GALLIVM_DEBUG=no_brilinear

which was being parsed as two options, "no" and "brilinear".
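
A minimal sketch of the idea, assuming option names consist of alphanumeric characters and underscores and that anything else is a separator. The helper name sketch_has_option is hypothetical and is not the gallium/util function.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Characters that may appear inside an option name (assumption). */
static int
sketch_is_name_char(char c)
{
    return isalnum((unsigned char)c) || c == '_';
}

/* Return 1 if `name` appears in `str` as a whole option, else 0. */
static int
sketch_has_option(const char *str, const char *name)
{
    size_t name_len = strlen(name);
    assert(name_len > 0);

    while (*str) {
        const char *start;

        /* Skip separators such as ',' or ' '. */
        while (*str && !sketch_is_name_char(*str))
            str++;

        /* Consume one whole token; '_' does not end it. */
        start = str;
        while (*str && sketch_is_name_char(*str))
            str++;

        if ((size_t)(str - start) == name_len &&
            memcmp(start, name, name_len) == 0)
            return 1;
    }
    return 0;
}
```

With this tokenization, "no_brilinear" matches only the option of that exact name and no longer matches "no" or "brilinear".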

commit 09a8f809088178a03e49e409fa18f1ac89561837
Author: James Benton <jbenton@vmware.com>
Date: Tue Jun 26 15:00:14 2012 +0100

gallivm: Added a generic lp_build_print_value which prints a LLVMValueRef.

Updated lp_build_printf to share common code.
Removed specific lp_build_print_vecX.

Reviewed-by: José Fonseca <jfonseca@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>

commit e59bdcc2c075931bfba2a84967a5ecd1dedd6eb0
Author: José Fonseca <jfonseca@vmware.com>
Date: Wed May 16 15:00:23 2012 +0100

draw,llvmpipe: Avoid named struct types on LLVM 3.0 and later.

Starting with LLVM 3.0, named structures are meant not for debugging, but
for recursive data types, previously also known as opaque types.

The recursive nature of these types leads to several memory management
difficulties. Given that we don't actually need recursive types, avoid
them altogether.

This is an attempt to address fdo bugs 41791 and 44466. The issue is
somewhat random so there's no easy way to check how effective this is.

Cherry-picked from 9af1ba565dfd5cef9ee938bb7c04767d14878fbf

commit df6070f618a203c7a876d984c847cde4cbc26bdb
Author: Roland Scheidegger <sroland@vmware.com>
Date: Wed Jun 27 14:42:53 2012 +0200

gallivm: (trivial) fix typo in faster aos linear int wrap code

no longer crashes, now REALLY tested.

commit d8f98dce452c867214e6782e86dc08562643c862
Author: Roland Scheidegger <sroland@vmware.com>
Date: Tue Jun 26 18:20:58 2012 +0200

llvmpipe: (trivial) remove bogus optimization for float aos repeat wrap

This optimization for nearest filtering on the linear path generated
likely bogus results, and the int path didn't have any optimizations
there since the only shader using force_nearest apparently uses
clamp_to_edge not repeat wrap anyway.

Reviewed-by: José Fonseca <jfonseca@vmware.com>

commit c4e271a0631087c795e756a5bb6b046043b5099d
Author: Roland Scheidegger <sroland@vmware.com>
Date: Tue Jun 26 23:01:52 2012 +0200

gallivm: faster repeat wrap for linear aos path too

Even if we already have scaled integer coords, it's way faster to use
the original float coord (plus some conversions) rather than use URem.
The choice of what to do for texture wrapping is not really tied to int
aos or float soa filtering though for some modes there can be some gains
(because of easier weight calculations).

Reviewed-by: José Fonseca <jfonseca@vmware.com>

commit 1174a75b1806e92aee4264ffe0ffe7e70abbbfa3
Author: Roland Scheidegger <sroland@vmware.com>
Date: Tue Jun 26 14:39:22 2012 +0200

gallivm: improve npot tex wrap repeat in linear soa path

URem gets translated into series of scalar divisions so
just about anything else is faster.

Reviewed-by: José Fonseca <jfonseca@vmware.com>
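
To illustrate why the remainder can be avoided, here is a scalar sketch of repeat wrapping for a single nearest texel, done once with an integer remainder and once from the normalized float coordinate. It is an assumption-laden illustration (function names are hypothetical, the real code builds vectorized LLVM IR and also handles linear filtering), not the gallivm implementation.

```c
#include <assert.h>

/* Floor of a float as an int, without relying on libm. */
static int
sketch_floor_to_int(float x)
{
    int i = (int)x;               /* truncates toward zero */
    return (float)i > x ? i - 1 : i;
}

/* Remainder path: wrap an already scaled integer texel coordinate. */
static int
sketch_wrap_repeat_rem(int icoord, int size)
{
    int r;
    assert(size > 0);
    r = icoord % size;
    return r < 0 ? r + size : r;
}

/* Float path: take the fractional part of the normalized coordinate,
 * then scale by the texture size. No division or remainder is needed. */
static int
sketch_wrap_repeat_float(float s, int size)
{
    float frac;
    int i;
    assert(size > 0);
    frac = s - (float)sketch_floor_to_int(s);   /* in [0, 1) */
    i = (int)(frac * (float)size);
    return i >= size ? size - 1 : i;            /* guard against rounding */
}
```

Both paths select the same texel for the same input; the float path simply replaces the per-element division with a floor, a subtract and a multiply.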

commit f849ffaa499ed96fa0efd3594fce255c7f22891b
Author: Roland Scheidegger <sroland@vmware.com>
Date: Tue Jun 26 00:40:35 2012 +0100

gallivm: (trivial) fix near-invisible shift-space typo

I blame the keyboard.

commit 5298a0b19fe672aebeb70964c0797d5921b51cf0
Author: Roland Scheidegger <sroland@vmware.com>
Date: Mon Jun 25 16:24:28 2012 +0200

gallivm: add new intrinsic helper to deal with arbitrary vector length

This helper will split vectors which are too large for the hw, or expand
them if they are too small, so a caller of a function using intrinsics which
uses such sizes need not split (or expand) the vectors manually and the
function will still use the intrinsic instead of dropping back to generic
llvm code. It can also accept scalars for use with pseudo-vector intrinsics
(only useful for float arguments, all x86 scalar simd float intrinsics use
4vf32).
Only used for lp_build_min/max() for now (also added the scalar float case
for these while there). (Other basic binary functions could use it easily,
whereas functions with a different interface would need different helpers.)
Expanding vectors isn't widely used, because we always try to use
build contexts with native hw vector sizes. But it might (or not) be nicer
if this wouldn't need to be done, the generated code should in theory stay
the same (it does get hit by lp_build_rho though already since we
didn't have an intrinsic for the scalar lp_build_max case before).

v2: incorporated Brian's feedback, and also made the scalar min/max case work
instead of crash (all scalar simd float intrinsics take 4vf32 as argument,
probably the reason why it wasn't used before).
Moved to lp_bld_intr based on José's request, and passing intrinsic size
instead of length.
Ideally we'd derive the source type info from the passed in llvm value refs
and process some llvmtype return type so we could handle intrinsics where
the source and destination type isn't the same (like float/int conversions,
packing instructions) but that's a bit too complicated for now.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
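
The split-or-expand idea can be sketched with plain arrays, assuming a native width of 4 floats. sketch_max4 stands in for a fixed-width hardware intrinsic and sketch_max_any for the helper; both names are hypothetical, and the real helper operates on LLVM values rather than C arrays.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_NATIVE_LEN 4

/* Stand-in for an intrinsic that only accepts exactly 4 floats. */
static void
sketch_max4(const float a[SKETCH_NATIVE_LEN],
            const float b[SKETCH_NATIVE_LEN],
            float out[SKETCH_NATIVE_LEN])
{
    size_t i;
    for (i = 0; i < SKETCH_NATIVE_LEN; i++)
        out[i] = a[i] > b[i] ? a[i] : b[i];
}

/* Apply the 4-wide operation to n elements: longer inputs are split into
 * 4-wide chunks, shorter inputs (including a scalar, n == 1) are padded. */
static void
sketch_max_any(const float *a, const float *b, float *out, size_t n)
{
    size_t i, j;
    assert(n > 0);

    for (i = 0; i < n; i += SKETCH_NATIVE_LEN) {
        float ta[SKETCH_NATIVE_LEN] = {0};
        float tb[SKETCH_NATIVE_LEN] = {0};
        float tr[SKETCH_NATIVE_LEN];
        size_t chunk = n - i < SKETCH_NATIVE_LEN ? n - i : SKETCH_NATIVE_LEN;

        for (j = 0; j < chunk; j++) {
            ta[j] = a[i + j];
            tb[j] = b[i + j];
        }
        sketch_max4(ta, tb, tr);          /* padding lanes are ignored */
        for (j = 0; j < chunk; j++)
            out[i + j] = tr[j];
    }
}
```

The caller never has to know the native width; it passes whatever length it has and still ends up on the fixed-width path.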

commit 01aa760b99ec0b2dc8ce57a43650e83f8c1becdf
Author: Roland Scheidegger <sroland@vmware.com>
Date: Mon Jun 25 16:19:18 2012 +0200

gallivm: (trivial) increase max code size for shader disassembly

64kB was just short of what I needed (which caused a crash) hence
increase to 96kB (should probably be smarter about that).

commit 74aa739138d981311ce13076388382b5e89c6562
Author: Roland Scheidegger <sroland@vmware.com>
Date: Mon Jun 25 11:53:29 2012 +0100

gallivm: simplify aos float tex wrap repeat nearest

just handle pot and npot the same. The previous pot handling
ended up with exactly the same instructions plus 2 more (leave it
in the soa path though since it is probably still cheaper there).
While here also fix an issue which would cause a crash after an assert.

commit 0e1e755645e9e49cfaa2025191e3245ccd723564
Author: Roland Scheidegger <sroland@vmware.com>
Date: Mon Jun 25 11:29:24 2012 +0100

gallivm: (trivial) skip floor rounding in ifloor when not signed

This was only done for the non-sse41 case before, but even with
sse41 this is obviously unnecessary (some callers already call
itrunc in this case anyway but some might not).
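
As a scalar illustration of why the rounding step can be skipped: for non-negative inputs, truncation toward zero and floor give the same integer, so only the signed case needs the adjustment. The names below are hypothetical and the sketch ignores the SSE4.1 specifics of the real code.

```c
#include <assert.h>

/* Truncate toward zero (what a plain float-to-int conversion does). */
static int
sketch_itrunc(float x)
{
    return (int)x;
}

/* Floor to int. When the value is known to be unsigned, truncation
 * already equals floor, so the extra adjustment is skipped. */
static int
sketch_ifloor(float x, int is_signed)
{
    int i = sketch_itrunc(x);

    if (!is_signed) {
        assert(x >= 0.0f);
        return i;
    }
    /* Signed: truncation rounds negative non-integers up, so fix it. */
    return (float)i > x ? i - 1 : i;
}
```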

commit 7f01a62f27dcb1d52597b24825931e88bae76f33
Author: Roland Scheidegger <sroland@vmware.com>
Date: Mon Jun 25 11:23:12 2012 +0100

gallivm: (trivial) fix bogus comments

commit 5c85be25fd82e28490274c468ce7f3e6e8c1d416
Author: José Fonseca <jfonseca@vmware.com>
Date: Wed Jun 20 11:51:57 2012 +0100

translate: Free elt8_func/elt16_func too.

These were leaking.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>

commit 0ad498f36fb6f7458c7cffa73b6598adceee0a6c
Author: Roland Scheidegger <sroland@vmware.com>
Date: Tue Jun 19 15:55:34 2012 +0200

gallivm: fix bug for tex wrap repeat with linear sampling in aos float path

The comparison needs to be against length not length_minus_one, otherwise
the max texel is never chosen (for the second coordinate).

Fixes piglit texwrap-1D-npot-proj (and 2D/3D versions).

Reviewed-by: José Fonseca <jfonseca@vmware.com>
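
One way to read the fix above, sketched for a single coordinate: linear filtering samples texel coord0 and its neighbour coord0 + 1, and with repeat wrapping the neighbour wraps to 0 only once it reaches length. The function names are hypothetical and the buggy variant is a reconstruction from the commit message, not the original source.

```c
#include <assert.h>

/* Second texel for linear sampling with repeat wrap (fixed comparison). */
static int
sketch_second_texel(int coord0, int length)
{
    int coord1;
    assert(length > 0 && coord0 >= 0 && coord0 < length);
    coord1 = coord0 + 1;
    return coord1 < length ? coord1 : 0;
}

/* Reconstruction of the bug: comparing against length - 1 wraps one
 * texel too early, so the last texel is never picked as the neighbour. */
static int
sketch_second_texel_buggy(int coord0, int length)
{
    int coord1;
    assert(length > 0 && coord0 >= 0 && coord0 < length);
    coord1 = coord0 + 1;
    return coord1 < length - 1 ? coord1 : 0;
}
```

For a 4-texel row, the fixed version maps coord0 = 2 to neighbour 3, while the buggy one wraps it to 0.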

commit d1ad65937c5b76407dc2499b7b774ab59341209e
Author: Roland Scheidegger <sroland@vmware.com>
Date: Tue Jun 19 16:13:43 2012 +0200

gallivm: simplify soa tex wrap repeat with npot textures and no mip filtering

Similar to what is already done in aos sampling for the float path (but not
the int path since we don't get normalized float coordinates there).
URem is expensive and the calculation is done trivially with
normalized floats instead (at least with sse41-capable cpus).
(Some day should probably do the same for the mip filter path but it's much
more complicated there hence the gain is smaller.)

Reviewed-by: José Fonseca <jfonseca@vmware.com>

commit e1e23f57ba9b910295c306d148f15643acc3fc83
Author: Roland Scheidegger <sroland@vmware.com>
Date: Mon Jun 18 20:38:56 2012 +0200

llvmpipe: (trivial) remove duplicated function declaration

Reviewed-by: José Fonseca <jfonseca@vmware.com>

commit 07ca57eb09e04c48a157733255427ef5de620861
Author: Roland Scheidegger <sroland@vmware.com>
Date: Mon Jun 18 20:37:34 2012 +0200

llvmpipe: destroy setup variants on context destruction

lp_delete_setup_variants() used to be called in garbage collection,
but this no longer exists hence the setup shaders never got freed.

Reviewed-by: José Fonseca <jfonseca@vmware.com>

commit ed0003c633859a45f9963a479f4c15ae0ef1dca3
Author: Roland Scheidegger <sroland@vmware.com>
Date: Mon Jun 18 16:25:29 2012 +0100

gallivm: handle different ilod parts for multiple quad sampling

This fixes filtering when the integer part of the lod is not the same
for all quads. I'm not fully convinced of that solution yet as it just
splits the vector if the levels to be sampled from are different.
But otherwise we'd need to do things like some minify steps, and getting
mip level base address separately anyway hence it wouldn't really look
like much of a win (and making the code even more complex).
This should now give identical results to single quad sampling.

commit 8580ac4cfc43a64df55e84ac71ce1a774d33c0d2
Author: Roland Scheidegger <sroland@vmware.com>
Date: Thu Jun 14 18:14:47 2012 +0200

gallivm: de-duplicate sample code common to soa and aos sampling

There doesn't seem to be any reason why this code dealing with cube face
selection, lod and mip level calculation is separate in aos and
soa sampling, and I am sick of having to change it in both places.

commit fb541e5f957408ce305b272100196f1e12e5b1e8
Author: Roland Scheidegger <sroland@vmware.com>
Date: Thu Jun 14 18:15:41 2012 +0200

gallivm: do mip filtering with per quad lod_fpart

This gives better results for mip filtering, though the generated code might
not be optimal. For now it also creates some artifacts if the lod_ipart isn't
the same for all quads, since instead of using the same mip weight for all
quads as previously (which just caused non-smooth gradients) this now will
use the right weights but with the wrong mip level in this case (can easily
be seen with things like texfilt, mipmap_tunnel).
v2: use logic helper suggested by José, and fix issue with negative lod_fpart
values

commit f1cc84eef7d826a20fab6cd8ccef9a275ff78967
Author: Roland Scheidegger <sroland@vmware.com>
Date: Wed Jun 13 18:35:25 2012 +0200

gallivm: (trivial) fix bogus assert in lp_build_unpack_broadcast_aos_scalars

commit 7c17dbae8ae290df9ce0f50781a09e8ed640c044
Author: James Benton <jbenton@vmware.com>
Date: Tue Jun 12 12:11:14 2012 +0100

util: Reimplement half <-> float conversions.

Removed u_half.py used to generate the table for previous method.

The previous implementation of float to half conversion was faulty for
denormals and NaNs and would require extra logic to fix,
thus making the speedup of using tables irrelevant.
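
For orientation, a table-free conversion that handles denormals, infinities and NaNs can be written with a few bit operations. The sketch below shows the half to float direction only and assumes IEEE 754 binary16 and binary32 layouts; it is not the code from this commit, and the function name is hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Convert an IEEE 754 half (binary16) bit pattern to a float. */
static float
sketch_half_to_float(uint16_t h)
{
    uint32_t sign = (uint32_t)(h & 0x8000u) << 16;
    uint32_t exp = (h >> 10) & 0x1fu;
    uint32_t man = h & 0x3ffu;
    uint32_t bits;
    float f;

    assert(sizeof(float) == sizeof(uint32_t));

    if (exp == 0) {
        if (man == 0) {
            bits = sign;                       /* signed zero */
        } else {
            /* Denormal: shift until the implicit leading 1 appears. */
            int e = -1;
            do {
                e++;
                man <<= 1;
            } while ((man & 0x400u) == 0);
            man &= 0x3ffu;
            bits = sign | ((uint32_t)(127 - 15 - e) << 23) | (man << 13);
        }
    } else if (exp == 31) {
        /* Infinity (man == 0) or NaN (man != 0, payload preserved). */
        bits = sign | 0x7f800000u | (man << 13);
    } else {
        /* Normal number: rebias the exponent from 15 to 127. */
        bits = sign | ((exp + 112u) << 23) | (man << 13);
    }

    memcpy(&f, &bits, sizeof f);
    return f;
}
```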

commit 7762f59274070e1dd4b546f5cb431c2eb71ae5c3
Author: James Benton <jbenton@vmware.com>
Date: Tue Jun 12 12:12:16 2012 +0100

tests: Updated tests to properly handle NaN for half floats.

commit fa94c135aea5911fd93d5dfb6e6f157fb40dce5e
Author: Roland Scheidegger <sroland@vmware.com>
Date: Mon Jun 11 18:33:10 2012 +0200

gallivm: do mip level calculations per quad

This is the final piece which shouldn't change the rendering output yet.

Reviewed-by: José Fonseca <jfonseca@vmware.com>

commit 23cbeaddfe03c09ca18c45d28955515317ffcf4c
Author: Roland Scheidegger <sroland@vmware.com>
Date: Sat Jun 9 00:54:21 2012 +0200

gallivm: do per-quad cube face selection

Doesn't quite fix the piglit cubemap test (not sure why actually)
but doing per-quad face selection is doing the right thing and
definitely an improvement.

Reviewed-by: José Fonseca <jfonseca@vmware.com>

commit abfb372b3702ac97ac8b5aa80ad1b94a2cc39d33
Author: Roland Scheidegger <sroland@vmware.com>
Date: Mon Jun 11 18:22:59 2012 +0200

gallivm: do all lod calculations per quad

Still no functional change but lod is now converted to scalar after
lod calculations.

Reviewed-by: José Fonseca <jfonseca@vmware.com>

commit 519368632747ae03feb5bca9c655eccbc5b751b4
Author: James Benton <jbenton@vmware.com>
Date: Tue May 22 16:46:10 2012 +0100

gallivm: Added support for half-float to float conversion in lp_build_conv.

Updated various utility functions to support this change.

commit 135b4d683a4c95f7577ba27b9bffa4a6fbd2c2e7
Author: James Benton <jbenton@vmware.com>
Date: Tue May 22 16:02:46 2012 +0100

gallivm: Added function for half-float to float conversion.

Updated lp_build_format_aos_array to support half-float source.

commit 37d648827406a20c5007abeb177698723ed86673
Author: James Benton <jbenton@vmware.com>
Date: Tue May 22 14:55:18 2012 +0100

util: Updated u_format_tests to rigidly test half-float boundary values.

commit 2ad18165d96e578aa9046df7c93cb1c3284d8c6b
Author: James Benton <jbenton@vmware.com>
Date: Tue May 22 14:54:16 2012 +0100

llvmpipe: Updated lp_test_format to properly handle Inf/NaN results.

commit 78740acf25aeba8a7d146493dd5c966e22c27b73
Author: James Benton <jbenton@vmware.com>
Date: Tue May 22 14:53:30 2012 +0100

util: Added functions for checking NaN / Inf for double and half-floats.

commit 35e9f640ae01241f9e0d67fe893bbbf564c05809
Author: Roland Scheidegger <sroland@vmware.com>
Date: Thu May 24 21:05:13 2012 +0200

gallivm: Fix calculating rho for 3d textures for the single-quad case

Discovered by accident, this looks like a very old typo bug.

commit fc1220c636326536fd0541913154e62afa7cd1d8
Author: Roland Scheidegger <sroland@vmware.com>
Date: Thu May 24 21:04:59 2012 +0200

gallivm: do calcs per-quad in lp_build_rho

Still convert to scalar at the end of the function.

commit 50a887ffc550bf310a6988fa2cea5c24d38c1a41
Author: Roland Scheidegger <sroland@vmware.com>
Date: Mon May 21 23:21:50 2012 +0200

gallivm: (trivial) return scalar in lp_build_extract_range for length 1 vectors

Our type system on top of llvm's one doesn't generally support vectors of
length 1, instead using scalars. So we should return a scalar from this
function instead of having to bitcast the vector with length 1 later elsewhere.

commit 80c71c621f9391f0f9230460198d861643324876
Author: James Benton <jbenton@vmware.com>
Date: Tue May 22 17:49:15 2012 +0100

draw: Fixed bad merge error

commit c47401cfad0c9167de20ff560654f533579f452c
Author: James Benton <jbenton@vmware.com>
Date: Tue May 22 15:29:30 2012 +0100

draw: Updated store_clip to store whole vectors instead of individual elements.

commit 2d9c1ad74b0b0b41861fffcecde39f09cc27f1cf
Author: James Benton <jbenton@vmware.com>
Date: Tue May 22 15:28:32 2012 +0100

gallivm: Added lp_build_fetch_rgba_aos_array.

A version of lp_build_fetch_rgba_aos which is targeted at simple array formats.

Reads the whole vector from memory in one, instead of reading each element
individually.

Tested with mesa tests and demos.

commit ff7805dc2b6ef6d8b11ec4e54aab1633aef29ac8
Author: James Benton <jbenton@vmware.com>
Date: Tue May 22 15:27:40 2012 +0100

gallivm: Added lp_build_pad_vector.

This function pads a vector with undef to a desired length.

commit 701f50acef24a2791dabf4730e5b5687d6eb875d
Author: James Benton <jbenton@vmware.com>
Date: Fri May 18 17:27:19 2012 +0100

util: Added util_format_is_array.

This function checks whether a format description is in a simple array format.

commit 5e0a7fa543dcd009de26f34a7926674190fa6246
Author: James Benton <jbenton@vmware.com>
Date: Fri May 18 19:13:47 2012 +0100

draw: Removed draw_llvm_translate_from and draw/draw_llvm_translate.c.

This is "replaced" by adding an optimised path in lp_build_fetch_rgba_aos
in an upcoming patch.

commit 8c886d6a7dd3fb464ecf031de6f747cb33e5361d
Author: James Benton <jbenton@vmware.com>
Date: Wed May 16 15:02:31 2012 +0100

draw: Modified store_aos to write the vector as one, not individual elements.

commit 37337f3d657e21dfd662c7b26d61cb0f8cfa6f17
Author: James Benton <jbenton@vmware.com>
Date: Wed May 16 14:16:23 2012 +0100

draw: Changed aos_to_soa to use lp_build_transpose_aos.

commit bd2b69ce5d5c94b067944d1dcd5df9f8e84548f1
Author: James Benton <jbenton@vmware.com>
Date: Fri May 18 19:14:27 2012 +0100

draw: Changed soa_to_aos to use lp_build_transpose_aos.

commit 0b98a950d29a116e82ce31dfe7b82cdadb632f2b
Author: James Benton <jbenton@vmware.com>
Date: Fri May 18 18:57:45 2012 +0100

gallivm: Added lp_build_transpose_aos which converts between aos and soa.

commit 69ea84531ad46fd145eb619ed1cedbe97dde7cb5
Author: James Benton <jbenton@vmware.com>
Date: Fri May 18 18:57:01 2012 +0100

gallivm: Added lp_build_interleave2_half aimed at AVX unpack instructions.

commit 7a4cb1349dd35c18144ad5934525cfb9436792f9
Author: José Fonseca <jfonseca@vmware.com>
Date: Tue May 22 11:54:14 2012 +0100

gallivm: Fix build on Windows.

MC-JIT not yet supported there.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>

commit afd105fc16bb75d874e418046b80d9cc578818a1
Author: James Benton <jbenton@vmware.com>
Date: Fri May 18 16:17:26 2012 +0100

llvmpipe: Added an error counter to lp_test_conv.

Useful for keeping track of progress when fixing errors!

Signed-off-by: José Fonseca <jfonseca@vmware.com>

commit b644907d08c10a805657841330fc23db3963d59c
Author: James Benton <jbenton@vmware.com>
Date: Fri May 18 16:16:46 2012 +0100

llvmpipe: Changed known failures in lp_test_conv.

To comply with the recent fixes to lp_bld_conv.

Signed-off-by: José Fonseca <jfonseca@vmware.com>

commit d7061507bd94f6468581e218e61261b79c760d4f
Author: James Benton <jbenton@vmware.com>
Date: Fri May 18 16:14:38 2012 +0100

llvmpipe: Added fixed point types tests to lp_test_conv.

Signed-off-by: José Fonseca <jfonseca@vmware.com>

commit 146b3ea39b4726dbe125ac666bd8902ea3d6ca8c
Author: James Benton <jbenton@vmware.com>
Date: Fri May 18 16:26:35 2012 +0100

llvmpipe: Changed lp_test_conv src/dst alignment to be correct.

Now based on the define rather than a fixed number.

Signed-off-by: José Fonseca <jfonseca@vmware.com>

commit f3b57441f834833a4b142a951eb98df0aa874536
Author: James Benton <jbenton@vmware.com>
Date: Fri May 18 16:06:44 2012 +0100

gallivm: Fixed erroneous optimisation in lp_build_min/max.

Previously assumed normalised was 0 to 1, but it can be -1 to 1
if type is signed.
Tested with lp_test_conv and lp_test_format, reduced errors.

Signed-off-by: José Fonseca <jfonseca@vmware.com>

commit a0613382e5a215cd146bb277646a6b394d376ae4
Author: James Benton <jbenton@vmware.com>
Date: Fri May 18 16:04:49 2012 +0100

gallivm: Compensate for lp_const_offset in lp_build_conv.

Fixing a /*FIXME*/ to remove errors in integer conversion in lp_build_conv.
Tested using lp_test_conv and lp_test_format, reduced errors.

Signed-off-by: José Fonseca <jfonseca@vmware.com>

commit a3d2bf15ea345bc8a0664f8f441276fd566566f3
Author: James Benton <jbenton@vmware.com>
Date: Fri May 18 16:01:25 2012 +0100

gallivm: Fixed overflow in lp_build_clamped_float_to_unsigned_norm.

Tested with lp_test_conv and lp_test_format, reduced errors.

Signed-off-by: José Fonseca <jfonseca@vmware.com>

commit e7b1e76fe237613731fa6003b5e1601a2e506207
Author: José Fonseca <jfonseca@vmware.com>
Date: Mon May 21 20:07:51 2012 +0100

gallivm: Fix build with LLVM 2.6

Trivial, and useful.

commit d3c6bbe5c7f5ba1976710831281ab1b6a631082d
Author: José Fonseca <jfonseca@vmware.com>
Date: Tue May 15 17:15:59 2012 +0100

gallivm: Enable MCJIT/AVX with vanilla LLVM 3.1.

Add the necessary C++ glue, so that we don't need any modifications
to the soon to be released LLVM 3.1.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>

commit 724a019a14d40fdbed21759a204a2bec8a315636
Author: José Fonseca <jfonseca@vmware.com>
Date: Mon May 14 22:04:06 2012 +0100

gallivm: Use HAVE_LLVM 0x0301 consistently.

commit af6991e2a3868e40ad599b46278551b794839748
Author: José Fonseca <jfonseca@vmware.com>
Date: Mon May 14 21:49:06 2012 +0100

gallivm: Add MCRegisterInfo.h to silence benign warnings about missing implementation.

Trivial.

commit 6f8a1d75458daae2503a86c6b030ecc4bb494e23
Author: Vinson Lee <vlee@freedesktop.org>
Date: Mon Apr 2 22:14:15 2012 -0700

gallivm: Pass in a MCInstrInfo to createMCInstPrinter on llvm-3.1.

llvm-3.1svn r153860 makes MCInstrInfo available to the MCInstPrinter.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>

commit 62555b6ed8760545794f83064e27cddcb3ce5284
Author: Vinson Lee <vlee@freedesktop.org>
Date: Tue Mar 27 21:51:17 2012 -0700

gallivm: Fix method overriding in raw_debug_ostream.

Use matching type qualifiers to avoid method hiding.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: José Fonseca <jfonseca@vmware.com>

commit 6a9bd784f4ac68ad0a731dcd39e5a3c39989f2be
Author: Vinson Lee <vlee@freedesktop.org>
Date: Tue Mar 13 22:40:52 2012 -0700

gallivm: Fix createOProfileJITEventListener namespace with llvm-3.1.

llvm-3.1svn r152620 refactored the OProfile profiling code.
createOProfileJITEventListener was moved from the llvm namespace to the
llvm::JITEventListener namespace.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>

commit b674955d39adae272a779be85aa1bd665de24e3e
Author: Vinson Lee <vlee@freedesktop.org>
Date: Mon Mar 5 22:00:40 2012 -0800

gallivm: Pass in a MCRegisterInfo to MCInstPrinter on llvm-3.1.

llvm-3.1svn r152043 changes createMCInstPrinter to take an additional
MCRegisterInfo argument.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>

commit 11ab69971a8a31c62f6de74905dbf8c02884599f
Author: Vinson Lee <vlee@freedesktop.org>
Date: Wed Feb 29 21:20:53 2012 -0800

Revert "gallivm: Change getExtent and readByte to non-const with llvm-3.1."

This reverts commit d5a6c172547d8964f4d4bb79637651decaf9deee.

llvm-3.1svn r151687 makes MemoryObject accessor members const again.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>

commit 339960c82d2a9f5c928ee9035ed31dadb7f45537
Author: Roland Scheidegger <sroland@vmware.com>
Date: Mon May 14 16:19:56 2012 +0200

gallivm: (trivial) fix assertion failure for mipmapped 1d textures

In lp_build_rho, we may end up with a 1-element vector (for mipmapped 1d
textures), but in this case we require the type to be a non-vector type,
so need a cast.

commit 9d73edb727bd6d196030dc3026b7bf0c574b3e19
Author: Roland Scheidegger <sroland@vmware.com>
Date: Thu May 10 18:12:07 2012 +0200

gallivm: prepare for per-quad lod calculations for large vectors

to be able to handle multiple quads at once in texture sampling and still
do lod calculations per quad, it is necessary to get the per-quad derivatives
into the lp_build_rho function.
Until now these derivative values were just scalars, which isn't going to work.
So we now use vectors, and since the interface needs to change we also do some
different (slightly more efficient) packing of the values.
For 8-wide vectors the packed derivative values for 3 coords would look like
this (the layout scales to an arbitrary multiple-of-4 vector size):
ds1dx ds1dy dt1dx dt1dy ds2dx ds2dy dt2dx dt2dy
dr1dx dr1dy _____ _____ dr2dx dr2dy _____ _____
The second vector will be unused for 1d and 2d textures.
To facilitate future changes the derivative values are put into a struct, since
quite some functions just pass these values through.
The generated code seems to be very slightly better for 2d textures (with
4-wide vectors) than before with sse2 (if you have a cpu with physical 128bit
simd units - otherwise it's probably not a win).
v2: suggestions from José, rename variables, add comments, use swizzle helper
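
The packed layout described above can be sketched with plain arrays: four lanes per quad in each of the two vectors, with the last two lanes of the second vector unused. The struct and function names are hypothetical, and the unused lanes are zeroed here only to keep the example deterministic.

```c
#include <assert.h>
#include <stddef.h>

/* Per-quad derivatives of the s, t and r coordinates (hypothetical type). */
struct sketch_quad_derivs {
    float dsdx, dsdy;
    float dtdx, dtdy;
    float drdx, drdy;
};

/* Pack derivatives for num_quads quads into two vectors of
 * 4 * num_quads floats each:
 *   st: ds1dx ds1dy dt1dx dt1dy ds2dx ds2dy dt2dx dt2dy ...
 *   r:  dr1dx dr1dy _____ _____ dr2dx dr2dy _____ _____ ...
 */
static void
sketch_pack_derivs(const struct sketch_quad_derivs *q, size_t num_quads,
                   float *st, float *r)
{
    size_t i;
    assert(num_quads > 0);

    for (i = 0; i < num_quads; i++) {
        st[4 * i + 0] = q[i].dsdx;
        st[4 * i + 1] = q[i].dsdy;
        st[4 * i + 2] = q[i].dtdx;
        st[4 * i + 3] = q[i].dtdy;

        r[4 * i + 0] = q[i].drdx;
        r[4 * i + 1] = q[i].drdy;
        r[4 * i + 2] = 0.0f;   /* unused lane */
        r[4 * i + 3] = 0.0f;   /* unused lane */
    }
}
```

For 1d and 2d textures only the first vector would be consumed, matching the note that the second vector is unused in those cases.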

commit 0aa21de0d31466dac77b05c97005722e902517b8
Author: Roland Scheidegger <sroland@vmware.com>
Date: Thu May 10 18:10:31 2012 +0200

gallivm: add undefined swizzle handling to lp_build_swizzle_aos

This is useful for vectors with "holes", it lets llvm choose the most
efficient shuffle instructions if some elements aren't needed without having to
worry what elements to manually pick otherwise.

commit 00faf3f370e7ce92f5ef51002b0ea42ef856e181
Author: José Fonseca <jfonseca@vmware.com>
Date: Fri May 4 17:25:16 2012 +0100

gallivm: Get the LLVM IR optimization passes before JIT compilation.

MC-JIT engine compiles the module immediately on creation, so the optimization
passes were being run too late.

So now we create a target data layout from a string, that matches the
ABI parameters reported by the compiler.

The backend optimization passes were always being run, so the performance
improvement is modest (3% on multiarb mesa demo).

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>

commit 40a43f4e2ce3074b5ce9027179d657ebba68800a
Author: Roland Scheidegger <sroland@vmware.com>
Date: Wed May 2 16:03:54 2012 +0200

gallivm: (trivial) fix wrong define used in lp_build_pack2

should fix stack-smashing crashes.

commit e6371d0f4dffad4eb3b7a9d906c23f1c88a2ab9e
Author: Roland Scheidegger <sroland@vmware.com>
Date: Mon Apr 30 21:25:29 2012 +0200

gallivm: add perf warnings when not using intrinsics with 256bit vectors

Helper functions using integer sse2 intrinsics could split the vectors with AVX
instead of using generic fallback (which should be faster).
We don't actually expect to hit these paths (hence don't fix them up to actually
do the vector splitting) so just emit warnings (for those functions where it's
obvious doing split/intrinsic is faster than using generic path).
Only emit warnings for 256bit vectors since we _really_ don't expect to hit
arbitrary large vectors which would affect a lot more functions.
The warnings do not actually depend on avx since the same logic applies to
plain sse2 too (but of course again there's _really_ no reason we should hit
these functions with 256bit vectors without avx).

commit 8a9ea701ea7295181e846c6383bf66a5f5e47637
Author: Roland Scheidegger <sroland@vmware.com>
Date: Tue May 1 20:37:07 2012 +0200

gallivm: split vectors manually for avx in lp_build_pack2 (v2)

There are two reasons for this:
First, there's an llvm bug (fixed in 3.1) which generates tons of byte
inserts/extracts otherwise, and second, more importantly, we want to use
pack intrinsics instead of shuffles.
We do this in lp_build_pack2 and not the calling code (aos sample path)
because potentially other callers might find that useful too, even if
for larger sequences of code using non-native vector sizes it might be
better to manually split vectors.
This should boost texture performance in the aos path considerably.
v2: fix issues with intrinsics types with old llvm

commit 27ac5b48fa1f2ea3efeb5248e2ce32264aba466e
Author: Roland Scheidegger <sroland@vmware.com>
Date: Tue May 1 20:26:22 2012 +0200

llvmpipe: refactor lp_build_pack2 (v2)

prettify, and it's unnecessary to assert when there's no intrinsic due to
unsupported bit width - the shuffle path will work regardless.
In contrast, lp_build_packs2 should only rely on lp_build_pack2 doing the
clamping for element sizes for which there is a sse2 intrinsic.
v2: fix bug spotted by Jose regarding the intrinsic type for packusdw
on old llvm versions.

commit ddf279031f0111de4b18eaf783bdc0a1e47813c8
Author: Roland Scheidegger <sroland@vmware.com>
Date: Tue May 1 20:13:59 2012 +0200

gallivm: add src width check in lp_build_packs2()

not doing so would skip clamping even if no sse2 pack instruction is
available, which is incorrect (in theory only; such widths would also always
hit an (unnecessary) assertion in lp_build_pack2()).

commit e7f0ad7fe079975eae7712a6e0c54be4fae0114b
Author: Roland Scheidegger <sroland@vmware.com>
Date: Fri Apr 27 15:57:00 2012 +0200

gallivm: (trivial) fix crash-causing typo for npot textures with avx

commit 28a9d7f6f655b6ec508c8a3aa6ffefc1e79793a0
Author: Roland Scheidegger <sroland@vmware.com>
Date: Wed Apr 25 19:38:45 2012 +0200

gallivm: (trivial) remove code mistakenly added twice.

commit d5926537316f8ff67ad0a52e7242f7c5478d919b
Author: Roland Scheidegger <sroland@vmware.com>
Date: Tue Apr 24 21:16:15 2012 +0200

gallivm: add a new avx aos sample path (v2)

Try to avoid mixing float and int address calculations. This does texture wrap
modes with floats, and then the offset calculations still with ints (because
of lack of precision with floats, though we could make some effort to make it
work with not too large (16MB) textures).
This also handles wrap repeat mode with npot-sized textures differently than
either the old soa or aos int path (likely way faster but untested).
Otherwise the actual address wrap code is largely similar to the soa path (not
quite the same as this one also has some int code), it should get used by avx
soa sampling later as well but doesn't handle more complex address modes yet
(this will also have the benefit that we can use aos sampling path for all
texture address modes).
Generated code for that looks reasonable, but still does not split vectors
explicitly for fetch/filter, which means we still get hit by the llvm bug (fixed
upstream) that generates hundreds of pinsrb/pextrb instead of two shuffles.
It is not obvious though if it's much of a win over just doing address calcs
4-wide but with ints, even if it is definitely far fewer instructions on avx.
piglit's texwrap seems to look exactly the same but tests neither the
non-normalized nor the npot cases.
v2: fix comments, prettify based on Brian's and Jose's feedback.

commit bffecd22dea66fb416ecff8cffd10dd4bdb73fce
Author: Roland Scheidegger <sroland@vmware.com>
Date: Thu Apr 19 01:58:29 2012 +0200

gallivm: refactor aos lp_build_sample_image_nearest/linear

split them up to separate address calculations and fetching/filtering.
Need this for being able to do 8-wide float address calcs and 4-wide
fetch/filter later (for avx). Plus the functions were very big scary monsters
anyway (in particular lp_build_sample_image_linear).

commit a80b325c57529adddcfa367f96f03557725c4773
Author: Roland Scheidegger <sroland@vmware.com>
Date: Mon Apr 16 17:17:18 2012 +0200

gallivm: fix lp_build_resize when truncating width but expanding vector size

Missed this case which I thought was impossible - the assertion for it was
right after the division by zero...
(AoS) texture sampling may ask us to do this, for things like 8 4x32int
vectors to 1 32x8int vector conversion (eventually, we probably don't want
this to happen).

commit f9c8337caa3eb185830d18bce8b95676a065b1d7
Author: Roland Scheidegger <sroland@vmware.com>
Date: Sat Apr 14 18:00:59 2012 +0200

gallivm: fix cube maps with larger vectors

This makes the branchless cube face selection code work with larger vectors.
Because the complexity is quite high (cannot really be improved it seems,
per-face selection would reduce complexity a lot but this leads to errors
unless the derivatives are calculated all from the same face which almost
doubles the work to be done) it is still slower than the branching version,
hence only enable this with large vectors.
It doesn't actually do per-quad face selection yet (only makes sense with
matching lod selection, in fact it will select the same face for all pixels
based on the average of the first four pixels for now) but only different
shuffles are required to make it work (the branching version actually should
work with larger vectors too now thanks to the improved horizontal add but of
course it cannot be extended to really select the face per-quad unless doing
branching per quad).

commit 7780c58869fc9a00af4f23209902db7e058e8a66
Author: Roland Scheidegger <sroland@vmware.com>
Date: Fri Mar 30 21:11:12 2012 +0100

llvmpipe: (trivial) fix compiler warning

and also clarify comment regarding availability of popcnt instruction.

commit a266dccf477df6d29a611154e988e8895892277e
Author: Roland Scheidegger <sroland@vmware.com>
Date: Fri Mar 30 14:21:07 2012 +0100

gallivm: remove unneeded members in lp_build_sample_context

Minor cleanup, the texture width, height, depth aren't accessed in their
scalar form anywhere. Makes it more obvious those values should probably be
fetched already vectorized (but this requires more invasive changes)...

commit b678c57fb474e14f05e25658c829fc04d2792fff
Author: Roland Scheidegger <sroland@vmware.com>
Date: Thu Mar 29 15:53:55 2012 +0100

gallivm: add a helper for concatenating vectors

Similar to the extract_range helper intended to get around slow code generated
by llvm for 128bit insertelements.
Concatenating two 128bit vectors this way will result in a single vinsertf128
operation rather than two 64bit stores plus one 128bit load, though it might be
mildly useful for other purposes as well.

commit 415ff228bcd0cf5e44a4c15350a661f0f5520029
Author: Roland Scheidegger <sroland@vmware.com>
Date: Wed Mar 28 19:41:15 2012 +0100

gallivm: add a custom 2x8f->1x16ub avx conversion path

Similar to the existing 4x4f->1x16ub sse2 path, shaves off a couple
instructions (min/max mostly) because it relies on pack intrinsics clamping.
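
Why the clamping of pack instructions makes explicit min/max redundant can be
sketched with a scalar C emulation. The two-stage narrowing below (int32 to
int16 with signed saturation, then int16 to uint8 with unsigned saturation) is
an assumption about how such a path is usually built on SSE-style pack
instructions; the function names are hypothetical and rounding is simplified
to round-half-away-from-zero.

```c
#include <stdint.h>

/* Saturate int32 to the int16 range, as a signed pack instruction does. */
int16_t
sat_s32_to_s16(int32_t v)
{
    if (v > INT16_MAX)
        return INT16_MAX;
    if (v < INT16_MIN)
        return INT16_MIN;
    return (int16_t)v;
}

/* Saturate int16 to the uint8 range, as an unsigned pack instruction does. */
uint8_t
sat_s16_to_u8(int16_t v)
{
    if (v > 255)
        return 255;
    if (v < 0)
        return 0;
    return (uint8_t)v;
}

/*
 * Convert one float channel to an unsigned byte without any explicit
 * clamp to [0, 1]: out-of-range inputs are caught by the saturating packs.
 * Assumes f * 255 fits in an int32 (huge inputs are not handled here).
 */
uint8_t
float_to_ubyte_via_packs(float f)
{
    float scaled = f * 255.0f;
    int32_t i = (int32_t)(scaled >= 0.0f ? scaled + 0.5f : scaled - 0.5f);

    return sat_s16_to_u8(sat_s32_to_s16(i));
}
```

Negative inputs end up as 0 and inputs above 1.0 end up as 255, which is the
result a min/max pair in front of the conversion would have produced.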

commit 78c08fc89f8fbcc6dba09779981b1e873e2a0299
Author: Roland Scheidegger <sroland@vmware.com>
Date: Wed Mar 28 18:44:07 2012 +0100

gallivm: add avx arithmetic intrinsics

Add all avx intrinsics for arithmetic functions (with the exception
of the horizontal add function which needs another look).
Seems to pass basic tests.

Reviewed-by: José Fonseca <jfonseca@vmware.com>

commit a586caa2800aa5ce54c173f7c0d4fc48153dbc4e
Author: Roland Scheidegger <sroland@vmware.com>
Date: Wed Mar 28 15:31:35 2012 +0100

gallivm: add avx logic intrinsics

Add the blend intrinsics for 8-wide float and 4-wide double vectors.
Since we lack 256bit int instructions these are used for int vectors as well,
though obviously not for byte or word element values.
The comparison intrinsics aren't extended for avx since these are only used
for pre-2.7 llvm versions.

commit 70275e4c13c89315fc2560a4c488c0e6935d5caf
Author: Roland Scheidegger <sroland@vmware.com>
Date: Wed Mar 28 00:40:53 2012 +0100

gallivm: new helper function for extract shuffles.

Based on José's idea as we may need that in a couple of places.
Note that such shuffles should not be used lightly, since data layout
of <4 x i8> is different to <16 x i8> for instance, hence might cause
data rearrangement.

commit 4d586dbae1b0c55915dda1759d2faea631c0a1c2
Author: Roland Scheidegger <sroland@vmware.com>
Date: Tue Mar 27 18:27:25 2012 +0100

gallivm: (trivial) don't overallocate shuffle variable

using wrong define meant huge array...

commit 06b0ec1f6d665d98c135f9573ddf4ba04b2121ad
Author: Roland Scheidegger <sroland@vmware.com>
Date: Tue Mar 27 17:54:20 2012 +0100

gallivm: don't do per-element extract/insert for vector element resize

Instead of doing per-element extract/insert if the src vectors
and dst vector differ in total size (which generates atrocious code)
first change the src vectors size by using shuffles to destination
vector size.
We can still do better than that on AVX for packing to color buffer
(by exploiting pack intrinsics characteristics hence eliminating the
need for some clamps) but this already generates much better code.

v2: incorporate feedback from José, Keith and use shuffle instead of
bitcasts/extracts. Due to llvm deficiencies the latter cause all data
to get moved to GPRs and back in pieces (even though the data in the
regs actually stays the same...).

commit c9970d70e05f95d3f52fe7d2cd794176a52693aa
Author: Roland Scheidegger <sroland@vmware.com>
Date: Fri Mar 23 19:33:19 2012 +0000

gallivm: fix bug in simple position interpolation

Accidental use of position attribute instead of just pixel coordinates.
Caused failures in piglit glsl-fs-ceil and glsl-fs-floor.

commit d0b6fcdb008d04d7f73d3d725615321544da5a7e
Author: Roland Scheidegger <sroland@vmware.com>
Date: Fri Mar 23 15:31:14 2012 +0000

gallivm: fix emission of ceil opcode

lp_build_ceil seems more appropriate than lp_build_trunc.
This seems to never be hit, though; someone performs some ceil
to floor magic.

commit d97fafed7e62ffa6bf76560a92ea246a1a26d256
Author: Roland Scheidegger <sroland@vmware.com>
Date: Thu Mar 22 11:46:52 2012 +0000

gallivm: new vectorized path for cubemap calculations

should be faster when adapted to multiple quads as only selection masks need to be different.
The code is more or less a per-pixel version adapted to only do it per quad.
A per pixel version would be much simpler (could drop 2 selects, 6 broadcasts and the messy
horizontal add of 3 vectors at the expense of only 2 more absolute value instructions -
would also just work for arbitrarily large vectors).
This version doesn't yet work with larger vectors because the horizontal add isn't adjusted
to be able to work with 2x4 vectors (and also because face selection wouldn't be done per
quad just per block though that would be only a correctness issue just as with lod selection).
The downside is this code is quite a bit slower. On a Core2 it can be sped up by disabling the
hw blend instructions for selection and using logicop fallbacks instead, but it is still slower
than the old code, hence leave that in for now. Probably will choose one or the other version
based on vector length in the end.

commit b375fbb18a3fd46859b7fdd42f3e9908ea4ff9a3
Author: Roland Scheidegger <sroland@vmware.com>
Date: Wed Mar 21 14:42:29 2012 +0000

gallivm: fix optimized occlusion query intrinsic name

commit a9ba0a3b611e48efbb0e79eb09caa85033dbe9a2
Author: José Fonseca <jfonseca@vmware.com>
Date: Wed Mar 21 16:19:43 2012 +0000

draw,gallivm,llvmpipe: Call gallivm_verify_function everywhere.

commit f94c2238d2bc7383e088b8845b7410439a602071
Author: Roland Scheidegger <sroland@vmware.com>
Date: Tue Mar 20 18:54:10 2012 +0000

gallivm: optimize calculations for cube maps a bit

this does some more vectorized calculations and uses horizontal adds if possible.
A definite win with sse3; otherwise it doesn't seem to make much of a difference.
In any case this is arithmetically identical, cannot handle larger vectors.
Should be useful as a reference point against larger vector version later...

commit 21a2c1cf3c8e1ac648ff49e59fdc0e3be77e2ebb
Author: Roland Scheidegger <sroland@vmware.com>
Date: Tue Mar 20 15:16:27 2012 +0000

llvmpipe: slight optimization of occlusion queries

using movmskps when available.
While this is slightly better for cpus without popcnt we should
really sum the vectors ourselves (it is also possible to cast to i4 before
doing the popcnt but that doesn't help that much either since llvm
is using some optimized popcnt version for i32)
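
The idea of counting live pixels from a lane mask can be sketched in scalar C:
gather the sign bit of each 32-bit lane into one integer (what movmskps does
for a 4-wide float vector) and count the set bits. The function names are
hypothetical, and the sketch assumes at most 32 lanes whose values are either
all ones (pixel passed) or zero.

```c
#include <stddef.h>
#include <stdint.h>

/* Collect the sign bit of every lane into the low bits of the result. */
unsigned
mask_sign_bits(const int32_t *lanes, size_t n)
{
    unsigned bits = 0;

    for (size_t i = 0; i < n; i++)
        bits |= (unsigned)((uint32_t)lanes[i] >> 31) << i;
    return bits;
}

/* Portable population count; clears the lowest set bit per iteration. */
unsigned
popcount_u32(unsigned v)
{
    unsigned count = 0;

    while (v) {
        v &= v - 1;
        count++;
    }
    return count;
}

/* Number of lanes whose mask is set, i.e. pixels that passed. */
unsigned
count_live_pixels(const int32_t *lanes, size_t n)
{
    return popcount_u32(mask_sign_bits(lanes, n));
}
```

The compact bitmask means the population count only has to look at a handful
of bits instead of a full-width vector value.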

commit 5ab5a35f216619bcdf55eed52b0db275c4a06c1b
Author: Roland Scheidegger <sroland@vmware.com>
Date: Tue Mar 20 13:32:11 2012 +0000

llvmpipe: fix occlusion queries with larger vectors

need to adjust casts etc.

commit ff95e6fdf5f16d4ef999ffcf05ea6e8c7160b0d5
Author: José Fonseca <jfonseca@vmware.com>
Date: Mon Mar 19 20:15:25 2012 +0000

gallivm: Restore optimization passes.

commit 57b05b4b36451e351659e98946dae27be0959832
Author: Roland Scheidegger <sroland@vmware.com>
Date: Mon Mar 19 19:34:22 2012 +0000

llvmpipe: use existing min2 macro

commit bc9a20e19b4f600a439f45679451f2e87cd4b299
Author: Roland Scheidegger <sroland@vmware.com>
Date: Mon Mar 19 19:07:27 2012 +0000

llvmpipe: add some safeguards against really large vectors

As per José's suggestion, prevent things from blowing up if some cpu
would have 1024bit or larger vectors.

commit 0e2b525e5ca1c5bbaa63158bde52ad1c1564a3a9
Author: Roland Scheidegger <sroland@vmware.com>
Date: Mon Mar 19 18:31:08 2012 +0000

llvmpipe: fix mask generation for uberwide vectors

this was the only piece preventing 16-wide vectors from working
(apart from the LP_MAX_VECTOR_WIDTH define that is), which is the maximum
as we don't get more pixels in the fragment shader at once.
Hence adjust that so things could be tested properly with that size
even though there seems to be no practical value.

commit 3c8334162211c97f3a11c7f64e9e5a2a91ad9656
Author: Roland Scheidegger <sroland@vmware.com>
Date: Mon Mar 19 18:19:41 2012 +0000

llvmpipe: fix the simple interpolation method with larger vectors

so both methods actually _really_ work now. Makes textures look
nice with larger vectors...

commit 1cb0464ef8871be1778d43b0c56adf9c06843e2d
Author: Roland Scheidegger <sroland@vmware.com>
Date: Mon Mar 19 17:26:35 2012 +0000

llvmpipe: fix mask generation and position interpolation with 8-wide vectors

trivial bugs, with these things start to look somewhat reasonable.
Textures though have some swizzling issues it seems.

commit 168277a63ef5b72542cf063c337f2d701053ff4b
Author: Roland Scheidegger <sroland@vmware.com>
Date: Mon Mar 19 16:04:03 2012 +0000

llvmpipe: don't overallocate variables

we never have more than 16 (stamp size) / 4 (minimum possible vector size).
(With larger vectors those variables are still overallocated a bit.)

commit 409b54b30f81ed0aa9ed0b01affe15c72de9abd2
Author: Roland Scheidegger <sroland@vmware.com>
Date: Mon Mar 19 15:56:48 2012 +0000

llvmpipe: add some 32f8 formats to lp_test_conv

Also add the ability to handle different sized vectors.

commit 55dcd3af8366ebdac0af3cdb22c2588f24aa18ce
Author: Roland Scheidegger <sroland@vmware.com>
Date: Mon Mar 19 15:47:27 2012 +0000

gallivm: handle different sized vectors in conversion / pack

only fully generic path for now (extract/insert per element).

commit 9c040f78c54575fcd94a8808216cf415fe8868f6
Author: Roland Scheidegger <sroland@vmware.com>
Date: Sun Mar 18 00:58:28 2012 +0100

llvmpipe: fix harmless use of uninitialized values

commit 551e9d5468b92fc7d5aa2265db9a52bb1e368a36
Author: Roland Scheidegger <sroland@vmware.com>
Date: Fri Mar 16 23:31:21 2012 +0100

gallivm: drop special path in extract_broadcast with different sized vectors

Not needed, llvm can handle shuffles with different sized result vector just
fine. Should hopefully generate the same code in the end, but simpler IR.

commit 44da531119ffa07a421eaa041f63607cec88f6f8
Author: Roland Scheidegger <sroland@vmware.com>
Date: Fri Mar 16 23:28:49 2012 +0100

llvmpipe: adapt interpolation for handling multiple quads at once

this is still WIP. There are actually two methods possible, and I'm not quite
sure what makes the most sense, so there's code for both for now:
1) the iterative method as used before (compute attrib values at upper left
corner of stamp and upper left corner of each quad initially).
It is improved to handle more than one quad at once, and also do some more vectorized
calculations initially for slightly better code - newer cpus have full throughput with
4 wide float vectors, hence don't try to code up a path which might be faster if there's
just one channel active per attribute.
2) just do straight interpolation for each pixel.
Method 2) is more work per quad, but less initially - if all quads are executed
it is significantly more work overall though. But this might change with larger vector lengths.
This method would also be needed if we'd do some kind of active quad merging when
operating on multiple quads at once.
This path contains some hack to force llvm to generate better code, it is still far
from ideal though, still generates far too many unnecessary register spills/reloads.
Both methods should work with different sized vectors.
Not very well tested yet, still seems to work with four-wide vectors, need changes
elsewhere to be able to test with wider vectors.
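
The two interpolation methods can be compared with a scalar C sketch for one
attribute over a 4x4 stamp made of 2x2 quads. The plane equation
a(x, y) = a0 + dadx * x + dady * y and all names below are assumptions made
for illustration; the real code works on vectors and several attributes.

```c
#define STAMP_SIZE 4   /* 4x4 pixel stamp, i.e. four 2x2 quads */

/* Method 2: evaluate the plane equation directly for every pixel. */
void
interp_direct(float a0, float dadx, float dady, float x0, float y0,
              float out[STAMP_SIZE][STAMP_SIZE])
{
    for (int y = 0; y < STAMP_SIZE; y++)
        for (int x = 0; x < STAMP_SIZE; x++)
            out[y][x] = a0 + dadx * (x0 + x) + dady * (y0 + y);
}

/*
 * Method 1: evaluate once at the upper left corner of the stamp, derive
 * the upper left corner of each quad from it, then step inside the quad.
 */
void
interp_iterative(float a0, float dadx, float dady, float x0, float y0,
                 float out[STAMP_SIZE][STAMP_SIZE])
{
    float stamp = a0 + dadx * x0 + dady * y0;

    for (int qy = 0; qy < STAMP_SIZE; qy += 2) {
        for (int qx = 0; qx < STAMP_SIZE; qx += 2) {
            float quad = stamp + dadx * qx + dady * qy;

            out[qy][qx]         = quad;
            out[qy][qx + 1]     = quad + dadx;
            out[qy + 1][qx]     = quad + dady;
            out[qy + 1][qx + 1] = quad + dadx + dady;
        }
    }
}
```

Both functions fill the same table when the inputs are exactly representable;
with arbitrary floats the results may differ in the last bits because the
additions are ordered differently.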

commit be5d3e82e2fe14ad0a46529ab79f65bf2276cd28
Author: José Fonseca <jfonseca@vmware.com>
Date: Fri Mar 16 20:59:37 2012 +0000

draw: Cleanup.

commit f85bc12c7fbacb3de2a94e88c6cd2d5ee0ec0e8d
Author: José Fonseca <jfonseca@vmware.com>
Date: Fri Mar 16 20:43:30 2012 +0000

gallivm: More module compilation refactoring.

commit d76f093198f2a06a93b2204857e6fea5fd0b3ece
Author: José Fonseca <jfonseca@vmware.com>
Date: Thu Mar 15 21:29:11 2012 +0000

llvmpipe: Use gallivm_compile/free_function() in linear code.

Should have been done before.

commit 122e1adb613ce083ad739b153ced1cde61dfc8c0
Author: Roland Scheidegger <sroland@vmware.com>
Date: Tue Mar 13 14:47:10 2012 +0100

llvmpipe: generate partial pixel mask for multiple quads

still works with one quad, cannot be tested yet with more
At least for now always fixed order with multiple quads.

commit 4c4f15081d75ed585a01392cd2dcce0ad10e0ea8
Author: Roland Scheidegger <sroland@vmware.com>
Date: Thu Mar 8 22:09:24 2012 +0100

llvmpipe: refactor state setup a bit

Refactor to make it easier to emit (and potentially later fetch in fs)
coefficients for multiple attributes at once.
Need to think more about how to make this actually happen however, the
problem is different attributes can have different interpolation modes,
requiring different handling in both setup and fs (though linear and
perspective handling is close).

commit 9363e49722ff47094d688a4be6f015a03fba9c79
Author: Roland Scheidegger <sroland@vmware.com>
Date: Thu Mar 8 19:23:23 2012 +0100

llvmpipe: vectorize tri offset calc

cuts number of instructions in quad-offset-factor from 107 to 75.
This code actually duplicated the (scalar) code calculating the determinant
except it used different vertex order (leading to different sign but it doesn't
matter) hence llvm could not have figured out it's the same (of course with
determinant vectorized in the other place that wouldn't have worked any longer
either).
Note this particular piece doesn't actually vectorize well, not many arithmetic
instructions left but tons of shuffle instructions...
Probably would need to work on n tris at a time for better vectorization.

commit 63169dcb9dd445c94605625bf86d85306e2b4297
Author: Roland Scheidegger <sroland@vmware.com>
Date: Thu Mar 8 03:11:37 2012 +0100

llvmpipe: vectorize some scalar code in setup

reduces number of arithmetic instructions, and avoids loading
vector x,y values twice (once as scalars once as vectors).
Results in a reduction of instructions from 76 to 64 in fs setup for glxgears
(16%) on a cpu with sse41.
Since this code uses vec2 disguised as vec4, on old cpus which had physical
64bit sse units (pre-Core2) it probably is less of a win in practice (and if
you have no vectors you can only hope llvm eliminates the arithmetic for
unneeded elements).

commit 732ecb877f951ab89bf503ac5e35ab8d838b58a1
Author: Roland Scheidegger <sroland@vmware.com>
Date: Wed Mar 7 00:32:24 2012 +0100

draw: fix clipping

bug introduced by 4822fea3f0440b5205e957cd303838c3b128419c broke
clipping pretty badly (verified with lineclip test)

commit ef5d90b86d624c152d200c7c4056f47c3c6d2688
Author: Roland Scheidegger <sroland@vmware.com>
Date: Tue Mar 6 23:38:59 2012 +0100

draw: don't store vertex header per attribute

storing the vertex header once per attribute is totally unnecessary.
Some quick look at the generated assembly says llvm in fact cannot optimize
away the additional stores (maybe due to potentially aliasing pointers
somewhere).
Plus, this makes the code cleaner and also allows using a vector "or"
instead of scalar ones.

commit 6b3a5a57b0b9850854cfbd7b586e4e50102dda71
Author: Roland Scheidegger <sroland@vmware.com>
Date: Tue Mar 6 19:11:01 2012 +0100

draw: do the per-vertex "boolean" clipmask "or" with vectors

no point extracting the values and doing it per component.
Doesn't help that much since we still extract the values elsewhere anyway.

commit 36519caf1af40e4480251cc79a2d527350b7c61f
Author: Roland Scheidegger <sroland@vmware.com>
Date: Fri Mar 2 22:27:01 2012 +0100

gallivm: fix lp_build_extract_broadcast with different sized vectors

Fix the obviously wrong argument, so it doesn't blow up.

commit 76d0ac3ad85066d6058486638013afd02b069c58
Author: José Fonseca <jfonseca@vmware.com>
Date: Fri Mar 2 12:16:23 2012 +0000

draw: Compile per module and not per function (WIP).

Enough to get gears w/ LLVM draw + softpipe to work on AVX doing:

GALLIUM_DRIVER=softpipe SOFTPIPE_USE_LLVM=yes glxgears

But still hackish -- will need to rethink and refactor this.

commit 78e32b247d2a7a771be9a1a07eb000d1e54ea8bd
Author: José Fonseca <jfonseca@vmware.com>
Date: Wed Feb 29 12:01:05 2012 +0000

llvmpipe: Remove lp_state_setup_fallback.

Never used.

commit 6895d5e40d19b4972c361e8b83fdb7eecda3c225
Author: José Fonseca <jfonseca@vmware.com>
Date: Mon Feb 27 19:14:27 2012 +0000

llvmpipe: Don't emit EMMS on x86

We already take precautions to ensure that LLVM never emits MMX code.

commit 4822fea3f0440b5205e957cd303838c3b128419c
Author: Roland Scheidegger <sroland@vmware.com>
Date: Wed Feb 29 15:58:19 2012 +0100

draw: modifications for larger vector sizes

We want to be able to use larger vectors especially for running the vertex
shader. With this patch we build soa vectors which might have a different
length than 4.
Note that aos structures really remain the same, only when aos structures
are converted to soa potentially different sized vectors are used.
Samplers probably don't work yet, didn't look at them.
Testing done:
glxgears works with both 128bit and 256bit vectors.

commit f4950fc1ea784680ab767d3dd0dce589f4e70603
Author: José Fonseca <jfonseca@vmware.com>
Date: Wed Feb 29 15:51:57 2012 +0100

gallivm: override native vector width with LP_NATIVE_VECTOR_WIDTH env var for debug
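
A debug override of this kind can be sketched as follows. Only the variable
name LP_NATIVE_VECTOR_WIDTH comes from the commit; the helper names, the
fallback behaviour and the upper bound MAX_WIDTH_BITS are assumptions made for
this illustration.

```c
#include <stdlib.h>

#define MAX_WIDTH_BITS 512u   /* hypothetical sanity limit */

/*
 * Parse a vector width in bits. Returns fallback when the string is
 * missing, not a plain positive decimal number, or above the limit.
 */
unsigned
parse_vector_width(const char *s, unsigned fallback)
{
    char *end;
    unsigned long v;

    if (!s || !*s)
        return fallback;

    v = strtoul(s, &end, 10);
    if (*end != '\0' || v == 0 || v > MAX_WIDTH_BITS)
        return fallback;

    return (unsigned)v;
}

/* Use the environment override if present, else the detected width. */
unsigned
native_vector_width(unsigned detected)
{
    return parse_vector_width(getenv("LP_NATIVE_VECTOR_WIDTH"), detected);
}
```

Keeping the parser separate from the getenv() call makes the fallback logic
easy to exercise without touching the process environment.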

commit 6ad6dbf0c92f3bf68ae54e5f2aca035d19b76e53
Author: José Fonseca <jfonseca@vmware.com>
Date: Wed Feb 29 15:51:24 2012 +0100

draw: allocate storage with alignment according to native vector width

commit 7bf0e3e7c9bd2469ae7279cabf4c5229ae9880c1
Author: José Fonseca <jfonseca@vmware.com>
Date: Fri Feb 24 19:06:08 2012 +0000

gallivm: Fix comment grammar.

Was missing several words. Spotted by Roland.

commit b20f1b28eb890b2fa2de44a0399b9b6a0d453c52
Author: José Fonseca <jfonseca@vmware.com>
Date: Thu Feb 23 19:22:09 2012 +0000

gallivm: Use MC-JIT on LLVM 3.1+ (i.e., SVN)

MC-JIT

Note: MC-JIT is still WIP. For this to work correctly it requires
LLVM changes which are not yet upstream.

commit b1af4dfcadfc241fd4023f4c3f823a1286d452c0
Author: Roland Scheidegger <sroland@vmware.com>
Date: Thu Feb 23 20:03:15 2012 +0100

llvmpipe: use new lp_type_width() helper in lp_test_blend

commit 04e0a37e888237d4db2298f31973af459ef9c95f
Author: Roland Scheidegger <sroland@vmware.com>
Date: Thu Feb 23 19:50:34 2012 +0100

llvmpipe: clean up lp_test_blend a little

Using variables just sized and aligned right makes it a bit more obvious
what's going on.
The test still only tests vector length 4.
For AoS anything else probably isn't going to work.
For SoA other lengths should work (at least with floats).

commit e61c393d3ec392ddee0a3da170e985fda885a823
Author: José Fonseca <jfonseca@vmware.com>
Date: Thu Feb 23 17:48:30 2012 +0000

gallivm: Ensure vector width consistency.

Instead of assuming that everything is the max native size.

commit 330081ac7bc41c5754a92825e51456d231bf84dd
Author: José Fonseca <jfonseca@vmware.com>
Date: Thu Feb 23 17:44:14 2012 +0000

draw: More simd vector width consistency fixes.

commit d90ca002753596269e37297e2e6c139b19f29f03
Author: José Fonseca <jfonseca@vmware.com>
Date: Thu Feb 23 17:43:00 2012 +0000

gallivm: Remove unused lp_build_int32_vec4_type() helper.

commit cae23417824d75869c202aaf897808d73a2c1db0
Author: Roland Scheidegger <sroland@vmware.com>
Date: Thu Feb 23 17:32:16 2012 +0100

gallivm: use global variable for native vector width instead of define

We do not know the simd extensions (and hence the simd width we should use)
available at compile time.
At least for now keep a define for maximum vector width, since a global
variable obviously can't be used to adjust alignment of automatic stack
variables.
Leave the runtime-determined value at 128 for now in all cases.

commit 51270ace6349acc2c294fc6f34c025c707be538a
Author: José Fonseca <jfonseca@vmware.com>
Date: Thu Feb 23 15:41:02 2012 +0000

gallivm: Add a hunk inadvertently lost when rebasing.

commit bf256df9cfdd0236637a455cbaece949b1253e98
Author: José Fonseca <jfonseca@vmware.com>
Date: Thu Feb 23 14:24:23 2012 +0000

llvmpipe: Use consistent vector width in depth/stencil test.

commit 5543b0901677146662c44be2cfba655fd55da94b
Author: José Fonseca <jfonseca@vmware.com>
Date: Thu Feb 23 14:19:59 2012 +0000

draw: Use a consistent vector register width.

Instead of 4x32 sometimes, LP_NATIVE_VECTOR_WIDTH other times.

commit eada8bbd22a3a61f549f32fe2a7e408222e5c824
Author: José Fonseca <jfonseca@vmware.com>
Date: Thu Feb 23 12:08:04 2012 +0000

gallivm: Remove garbage collection.

MC-JIT will require one compilation per module (as opposed to one
compilation per function), therefore no state will be shared,
eliminating the need to do garbage collection.

commit 556697ea0ed72e0641851e4fbbbb862c470fd7eb
Author: José Fonseca <jfonseca@vmware.com>
Date: Thu Feb 23 10:33:41 2012 +0000

gallivm: Move all native target initialization to lp_set_target_options().

commit c518e8f3f2649d5dc265403511fab4bcbe2cc5c8
Author: José Fonseca <jfonseca@vmware.com>
Date: Thu Feb 23 09:52:32 2012 +0000

llvmpipe: Create one gallivm instance for each test.

commit 90f10af8920ec6be6f2b1e7365cfc477a0cb111d
Author: José Fonseca <jfonseca@vmware.com>
Date: Thu Feb 23 09:48:08 2012 +0000

gallivm: Avoid LLVMAddGlobalMapping() in lp_bld_assert().

Brittle, complex, and unnecessary. Just use function pointer constant.

commit 98fde550b33401e3fe006af59db4db628bcbf476
Author: José Fonseca <jfonseca@vmware.com>
Date: Thu Feb 23 09:21:26 2012 +0000

gallivm: Add a lp_build_const_func_pointer() helper.

To be reused in all places where we want to call C code.

commit 6cfedadb62c2ce5af8d75969bc95a607f3ece118
Author: José Fonseca <jfonseca@vmware.com>
Date: Thu Feb 23 09:44:41 2012 +0000

gallivm: Cleanup/simplify lp_build_const_string_variable.

- Move to lp_bld_const where it belongs
- Rename to lp_build_const_string
- take the length from the argument (and don't count the zero terminator twice)
- bitcast the constant to generic i8 *

commit db1d4018c0f1fa682a9da93c032977659adfb68c
Author: José Fonseca <jfonseca@vmware.com>
Date: Thu Feb 23 11:52:17 2012 +0000

gallivm: Set NoFramePointerElimNonLeaf to true where supported.

commit 088614164aa915baaa5044fede728aa898483183
Author: Roland Scheidegger <sroland@vmware.com>
Date: Wed Feb 22 19:38:47 2012 +0100

llvmpipe: pass in/out pointers rather than scalar floats in lp_bld_arit

we don't want llvm to potentially optimize away the vectors (though it doesn't
seem to currently), plus we want to be able to handle in/out vectors of arbitrary
length.

commit 3f5c4e04af8a7592fdffa54938a277c34ae76b51
Author: Roland Scheidegger <sroland@vmware.com>
Date: Tue Feb 21 23:22:55 2012 +0100

gallivm: fix lp_build_sqrt() for vector length 1

since we optimize away vectors with length 1 need to emit intrinsic
without vector type.

commit 79d94e5f93ed8ba6757b97e2026722ea31d32c06
Author: José Fonseca <jfonseca@vmware.com>
Date: Wed Feb 22 17:00:46 2012 +0000

llvmpipe: Remove lp_test_round.

commit 81f41b5aeb3f4126e06453cfc78990086b85b78d
Author: Roland Scheidegger <sroland@vmware.com>
Date: Tue Feb 21 23:56:24 2012 +0100

llvmpipe: subsume lp_test_round into lp_test_arit

Much simpler, and since the arguments aren't passed as 128bit values can run
on any arch.
This also uses the float instead of the double versions of the c functions
(which probably was the intention anyway).
In contrast to lp_test_round the output is much less verbose however.
Tested vector width of 32 to 512 bits - all pass except 32 (length 1) which
crashes in lp_build_sqrt() due to wrong type.

Signed-off-by: José Fonseca <jfonseca@vmware.com>

commit 945b338b421defbd274481d8c4f7e0910fd0e7eb
Author: José Fonseca <jfonseca@vmware.com>
Date: Wed Feb 22 09:55:03 2012 +0000

gallivm: Centralize the function compilation logic.

This simplifies a lot of code.

Also doing this in a central place will make it easier to carry out the
changes necessary to use MC-JIT in the future.

gallivm: Fix typo in explicit derivative shuffle.

Trivial.

draw: make DEBUG_STORE work again

adapt to lp_build_printf() interface changes

Reviewed-by: José Fonseca <jfonseca@vmware.com>

draw: get rid of vecnf_from_scalar()

just use lp_build_broadcast directly (cannot assign a name but don't really
need it, vecnf_from_scalar() was producing much uglier IR due to using
repeated insertelement instead of insertelement+shuffle).

Reviewed-by: José Fonseca <jfonseca@vmware.com>

llvmpipe: fix typo in complex interpolation code

Fixes position interpolation when using complex mode
(piglit fp-fragment-position and similar)

Reviewed-by: José Fonseca <jfonseca@vmware.com>

draw: fix clipvertex/position storing again

This appears to be the result of a bad merge.
Fixes piglit tests relying on clipping, like a lot of the interpolation tests.

Reviewed-by: José Fonseca <jfonseca@vmware.com>

gallivm: Fix explicit derivative manipulation.

Same counter variable was being used in two nested loops. Use more
meaningful variable names for the counter to fix and avoid this.

gallivm: Prevent buffer overflow in repeat wrap mode for NPOT.

Based on Roland's patch, discussion, and review.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>

gallivm: Fix dims for TGSI_TEXTURE_1D in emit_tex.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>

gallivm: Fix explicit volume texture derivatives.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>

gallivm: fix 1d shadow texture sampling

The r coordinate is always used, hence we need three coords, not two
(the second one is unused).

Reviewed-by: José Fonseca <jfonseca@vmware.com>

gallivm: Enable AVX support without MCJIT, where available.

For now, this just enables AVX on Windows for testing. If the code is
stable then we might consider preferring the old JIT wherever possible.

No change elsewhere.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
/external/mesa3d/src/gallium/auxiliary/draw/draw_private.h
185ed2105829d6f5eb19edb9abbf0d7977e157c3 25-May-2012 Brian Paul <brianp@vmware.com> draw: simplify index buffer specification

Replace draw_set_index_buffer() and draw_set_mapped_index_buffer() with
draw_set_indexes() which simply takes a pointer and an index size.
/external/mesa3d/src/gallium/auxiliary/draw/draw_private.h
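As an illustration of the "pointer plus index size" idea in the entry above, here is a minimal sketch of how an element could be read from such an index array. The helper name `sketch_fetch_index` and the assumption that the size is given in bytes (1, 2 or 4) are hypothetical and not taken from the Mesa source.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not part of Mesa: read element i from an index
 * array that is described only by a base pointer and a per-index size
 * in bytes (assumed to be 1, 2 or 4). */
static unsigned
sketch_fetch_index(const void *indexes, unsigned index_size, unsigned i)
{
   switch (index_size) {
   case 1:
      return ((const uint8_t *) indexes)[i];
   case 2:
      return ((const uint16_t *) indexes)[i];
   case 4:
      return ((const uint32_t *) indexes)[i];
   default:
      assert(0 && "unsupported index size");
      return 0;
   }
}
```

With this shape, the caller no longer needs a buffer object at all; any mapped memory and an element size are enough.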
8b4f7b0672d663273310fffa9490ad996f5b914a 06-Feb-2012 Christoph Bumiller <e0425955@student.tuwien.ac.at> gallium: add PIPE_CAP_QUADS_FOLLOW_PROVOKING_VERTEX_CONVENTION

Just let the hardware do it if it can and avoid drivers having to
check for the special case on each draw call.

v2: update the draw module
/external/mesa3d/src/gallium/auxiliary/draw/draw_private.h
b6d3a435a0e0e53a9e8cc4c4249dc7c2f897a83d 24-Jan-2011 Jakob Bornecrantz <wallbraker@gmail.com> draw: Only run prepare when state, prim and opt changes

In badly behaved applications like ipers, which issue a lot of draw calls
with no state changes, this greatly reduces the time spent in prepare.
In ipers around 7% of CPU time was spent in various prepare functions;
after this commit no prepare function shows up in the profile.

This commit also has the added benefit of now grouping all pipelined
drawing into a single draw call if the driver uses vbuf_render.

Reviewed-by: Stéphane Marchesin <marcheu@chromium.org>
Tested-by: Stéphane Marchesin <marcheu@chromium.org>
/external/mesa3d/src/gallium/auxiliary/draw/draw_private.h
4a79545bdfb9e948329a761ef350eb83a3d87496 05-Dec-2010 Jakob Bornecrantz <wallbraker@gmail.com> draw: Remove reduced_prim

Conflicts:

src/gallium/auxiliary/draw/draw_context.c

Reviewed-by: Stéphane Marchesin <marcheu@chromium.org>
Tested-by: Stéphane Marchesin <marcheu@chromium.org>
/external/mesa3d/src/gallium/auxiliary/draw/draw_private.h
1865f341d8f45b389061fc08d2da90b7aa8a6099 06-Jan-2012 Dave Airlie <airlied@redhat.com> draw: clipdistance support (v2)

Add support for using the clipdistance instead of clip plane.

Passes all piglit clipdistance tests.

v2: fixup some comments from Brian in review.

Signed-off-by: Dave Airlie <airlied@redhat.com>
/external/mesa3d/src/gallium/auxiliary/draw/draw_private.h
40c5987ed84f9f0b8bb1f707bb13c1aafc39330a 04-Jan-2012 Dave Airlie <airlied@redhat.com> draw/softpipe: add clip vertex support. (v2)

softpipe always clipped using the position vector, but for unclipped
vertices it stored the position in window coordinates. When position
and clipping are separated, we need to store both the clip-space position
and the clip-space clip vertex, so we can interpolate them separately.

This means we have to take the clip space position and store it to use later.

This allows softpipe to pass all the clip-vertex piglit tests.

v2: fix llvm draw regression, the structure being passed into llvm needed
updating, remove some hardcoded ints that should have been enums while there.

Signed-off-by: Dave Airlie <airlied@redhat.com>
/external/mesa3d/src/gallium/auxiliary/draw/draw_private.h
ec8cbd79ac4065111365a6720c9564de56855cc8 04-Jan-2012 Dave Airlie <airlied@redhat.com> draw/softpipe: EXT_transform_feedback support (v2)

This replaces the current code with an implementation compatible with
the new gallium interface. I've left some of the remains of the interface
intact so llvmpipe keeps building correctly, and I'll take a look at fixing
llvmpipe up later.

v2: fixup as per Brian's review

Signed-off-by: Dave Airlie <airlied@redhat.com>
/external/mesa3d/src/gallium/auxiliary/draw/draw_private.h
dc4c821f0817a3db716f965692fb701079f66340 10-Jan-2012 Marek Olšák <maraeo@gmail.com> Squash-merge branch 'gallium-clip-state'

Conflicts:
src/gallium/auxiliary/tgsi/tgsi_strings.c
src/mesa/state_tracker/st_atom_clip.c

commit d919791f2742e913173d6b335128e7d4c63c0840
Author: Christoph Bumiller <e0425955@student.tuwien.ac.at>
Date: Fri Jan 6 17:59:22 2012 +0100

d3d1x: adapt to new clip state

commit cfec82bca3fefcdefafca3f4555285ec1d1ae421
Author: Christoph Bumiller <e0425955@student.tuwien.ac.at>
Date: Fri Jan 6 14:16:51 2012 +0100

gallium/docs: update for clip state changes

commit c02bfeb81ad9f62041a2285ea6373bbbd602912a
Author: Christoph Bumiller <e0425955@student.tuwien.ac.at>
Date: Fri Jan 6 14:21:43 2012 +0100

tgsi: add TGSI_PROPERTY_PROHIBIT_UCPS

commit d4e0a785a6a23ad2f6819fd72e236acb9750028d
Author: Brian Paul <brianp@vmware.com>
Date: Thu Jan 5 08:30:00 2012 -0700

tgsi: consolidate TGSI string arrays in new tgsi_strings.h

There was some duplication between the tgsi_dump.c and tgsi_text.c
files. Also use some static assertions to help catch errors when
adding new TGSI values.

v2: put strings in tgsi_strings.c file instead of the .h file.

Reviewed-by: Dave Airlie <airlied@redhat.com>

commit c28584ce0d8c62bd92c8f140729d344f88a0b3cd
Author: Christoph Bumiller <e0425955@student.tuwien.ac.at>
Date: Fri Jan 6 12:48:09 2012 +0100

gallium: extend user_clip_plane_enable to apply to clip distances

commit f1d5016c07f786229ed057effbe55fbfd160b019
Author: Marek Olšák <maraeo@gmail.com>
Date: Fri Jan 6 02:39:09 2012 +0100

nvfx: adapt to new clip state

commit 6f6fa1c26bd19f797c1996731708e3569c9bfe24
Author: Marek Olšák <maraeo@gmail.com>
Date: Fri Jan 6 01:41:39 2012 +0100

st/mesa: fix DrawPixels with GL_DEPTH_CLAMP

commit c86ad730aa1c017788ae88a55f54071bf222be12
Author: Christoph Bumiller <e0425955@student.tuwien.ac.at>
Date: Tue Jan 3 23:51:30 2012 +0100

nv50: adapt to new clip state

commit 3a8ae6ac243bae5970729dc4057fe02d992543dc
Author: Christoph Bumiller <e0425955@student.tuwien.ac.at>
Date: Tue Jan 3 23:32:36 2012 +0100

nvc0: adapt to new clip state

commit 6243a8246997f8d2fcc69ab741a2c2dea080ff11
Author: Marek Olšák <maraeo@gmail.com>
Date: Thu Dec 29 01:32:51 2011 +0100

draw: initialize pt.user.planes in draw_init

This fixes a crash in glean/fpexceptions.

commit e3056524b19b56d473f4faff84ffa0eb41497408
Author: Marek Olšák <maraeo@gmail.com>
Date: Mon Dec 26 06:26:55 2011 +0100

svga: adapt to new clip state

commit c5bfa8b37d6d489271df457229081d6bbb51b4b7
Author: Marek Olšák <maraeo@gmail.com>
Date: Sun Dec 25 14:11:51 2011 +0100

r600g: adapt to new clip state

commit f11890905362f62627c4a28a8255b76eb7de7df2
Author: Marek Olšák <maraeo@gmail.com>
Date: Sun Dec 25 14:10:26 2011 +0100

r300g: adapt to new clip state

commit e37465327c79a01112f15f6278d9accc5bf3103f
Author: Marek Olšák <maraeo@gmail.com>
Date: Sun Dec 25 12:39:16 2011 +0100

draw: adapt to new clip state

This adds a regression in the LLVM clipping path. Can anybody see anything
wrong with the code? It works for every other case, just glean/fpexceptions
crashes when doing the "Infinite clip plane test".

commit b474d2b18c72d965eefae4e427c269cba5ce6ba2
Author: Marek Olšák <maraeo@gmail.com>
Date: Sun Dec 25 13:14:59 2011 +0100

u_blitter: don't save/set/restore clip state

commit 9dd240ea91f523a677af45e8d0adb9e661e28602
Author: Marek Olšák <maraeo@gmail.com>
Date: Sun Dec 25 13:11:56 2011 +0100

gallium: don't cso_save/set/restore clip state

The enable bits are in the rasterizer state.

commit a4f7031179f5f4ad524b34b394214b984ac950f6
Author: Marek Olšák <maraeo@gmail.com>
Date: Sun Dec 25 12:58:55 2011 +0100

gallium: default depth_clip to 1

depth_clip = !depth_clamp

commit fe21147a00ab90e549d63fe12ee4625c9c2ffcc3
Author: Marek Olšák <maraeo@gmail.com>
Date: Mon Dec 26 06:14:19 2011 +0100

trace,util: update state logging to new clip state

Also dump the other missing flags.

commit 2a3b96e84ac872dcc5bc1de049fe76bb58d64b23
Author: Marek Olšák <maraeo@gmail.com>
Date: Sun Dec 25 10:43:43 2011 +0100

st/mesa: adapt to new clip state

commit b7b656a42fca19d7c85267f42649a206a85a2c72
Author: Marek Olšák <maraeo@gmail.com>
Date: Sat Dec 17 15:45:19 2011 +0100

gallium: move state enable bits from clip_state to rasterizer_state
/external/mesa3d/src/gallium/auxiliary/draw/draw_private.h
861a029ddb31e91bb4d8e18ab708d0d172f63aad 15-Dec-2011 Marek Olšák <maraeo@gmail.com> gallium: interface changes necessary to implement transform feedback (v5)

Namely:
- EXT_transform_feedback
- ARB_transform_feedback2
- ARB_transform_feedback_instanced

The old interface was not useful for OpenGL and had to be reworked.

This interface was originally designed for OpenGL, but additional
changes have been made in order to make st/d3d1x support easier.

The most notable change is that the stream-out info must be linked
with a vertex or geometry shader and cannot be set independently.
This is due to limitations of existing hardware (special shader
instructions must be used to write into stream-out buffers),
and it's also how OpenGL works (stream outputs must be specified
prior to linking shaders).

Other than that, each stream output buffer has a "view" into it that
internally maintains the number of bytes which have been written
into it. (One buffer can be bound in several different transform
feedback objects in OpenGL, so we must be able to have several views
around.) The set_stream_output_targets function contains a parameter
saying whether new data should be appended or not.

Also, the view can optionally be used to provide the vertex
count for draw_vbo. Note that the count is supposed to be stored
in device memory and the CPU never gets to know its value.

OpenGL way | Gallium way
------------------------------------
BeginTF    = set_so_targets(append_bitmask = 0)
PauseTF    = set_so_targets(num_targets = 0)
ResumeTF   = set_so_targets(append_bitmask = ~0)
EndTF      = set_so_targets(num_targets = 0)
DrawTF     = use pipe_draw_info::count_from_stream_output

v2: * removed the reset_stream_output_targets function
    * added a parameter append_bitmask to set_stream_output_targets,
      each bit specifies whether new data should be appended to each
      buffer or not.
v3: * added PIPE_CAP_STREAM_OUTPUT_PAUSE_RESUME for ARB_tfb2,
      note that the draw-auto subset is always required (for d3d10),
      only the pause/resume functionality is limited if the CAP is not
      advertised
v4: * update gallium/docs
v5: * compactified struct pipe_stream_output_info, updated dump/trace
/external/mesa3d/src/gallium/auxiliary/draw/draw_private.h
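The OpenGL-to-Gallium mapping listed in the entry above can be sketched as a small lookup. Everything named `sketch_*` below is hypothetical and only models the two parameters the commit message mentions (`num_targets` and `append_bitmask`); it is not the real `set_stream_output_targets` signature.

```c
#include <assert.h>

/* Hypothetical model of the four transform feedback transitions. */
enum sketch_tf_op {
   SKETCH_TF_BEGIN,
   SKETCH_TF_PAUSE,
   SKETCH_TF_RESUME,
   SKETCH_TF_END
};

/* Only the two arguments named in the commit message are modelled. */
struct sketch_so_args {
   unsigned num_targets;
   unsigned append_bitmask;
};

static struct sketch_so_args
sketch_so_args_for(enum sketch_tf_op op, unsigned bound_targets)
{
   struct sketch_so_args args = {0, 0};

   switch (op) {
   case SKETCH_TF_BEGIN:
      /* Bind the targets and start writing from the beginning. */
      args.num_targets = bound_targets;
      args.append_bitmask = 0;
      break;
   case SKETCH_TF_RESUME:
      /* Rebind the targets and append after the existing data. */
      args.num_targets = bound_targets;
      args.append_bitmask = ~0u;
      break;
   case SKETCH_TF_PAUSE:
   case SKETCH_TF_END:
      /* Both unbind all targets; the views keep their byte counts. */
      break;
   }
   return args;
}
```

Note that pause and end look identical at this level; per the message, it is the byte count kept inside each view that makes a later resume possible.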
4eb3225b38ce12cb34ab3d90804c9683bd7b4ed3 08-Nov-2011 José Fonseca <jose.r.fonseca@gmail.com> Remove tgsi_sse2.

tgsi_exec is simple. llvm is fast. tgsi_sse2 ends up being neither.
/external/mesa3d/src/gallium/auxiliary/draw/draw_private.h
e6c237cfd6f53ff569f68255d5d6da15148cd0f5 11-Oct-2011 Brian Paul <brianp@vmware.com> draw/llvm: fix hard-coded number of total clip planes

Instead of 12 use DRAW_TOTAL_CLIP_PLANES. The max number of user-defined
clip planes was increased to 8 so the total number of planes is 14.
This doesn't fix any specific bug, but clearly the old code was wrong.

Reviewed-by: José Fonseca <jfonseca@vmware.com>
/external/mesa3d/src/gallium/auxiliary/draw/draw_private.h
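To make the arithmetic in the entry above concrete (8 user planes out of 14 total leaves 6 for the view frustum), here is a minimal sketch of a clip test that sizes its loop from a derived constant instead of a hard-coded number. The `SKETCH_*` macros and `sketch_clipmask` are hypothetical stand-ins, not the draw module's own definitions.

```c
#include <assert.h>

/* Hypothetical constants mirroring the counts in the commit message. */
#define SKETCH_FRUSTUM_PLANES     6
#define SKETCH_MAX_USER_PLANES    8
#define SKETCH_TOTAL_CLIP_PLANES  (SKETCH_FRUSTUM_PLANES + SKETCH_MAX_USER_PLANES)

/* Return a bitmask with bit i set when vertex v lies on the negative
 * side of plane i, i.e. dot(v, plane) < 0. */
static unsigned
sketch_clipmask(const float v[4], const float (*planes)[4],
                unsigned num_planes)
{
   unsigned mask = 0;
   unsigned i;

   assert(num_planes <= SKETCH_TOTAL_CLIP_PLANES);

   for (i = 0; i < num_planes; i++) {
      const float d = v[0] * planes[i][0] + v[1] * planes[i][1] +
                      v[2] * planes[i][2] + v[3] * planes[i][3];
      if (d < 0.0f)
         mask |= 1u << i;
   }
   return mask;
}
```

Deriving the total from its parts means that raising the user plane limit cannot silently leave a stale literal behind.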
4465efc3bf8d755a9afb7a4bb5382e2f5bf113e1 21-Sep-2011 Brian Paul <brianp@vmware.com> draw: add support for guard-band clipping
/external/mesa3d/src/gallium/auxiliary/draw/draw_private.h
a5c0fb51c6b1d5f7e6ea8f089da921719ad1b6c4 14-Jul-2010 José Fonseca <jfonseca@vmware.com> draw: Reduce the number of vertex shader variants per context to 128.
/external/mesa3d/src/gallium/auxiliary/draw/draw_private.h
3733da31e8b4405b65e1b6ca3b6599ecc5af5fe7 31-Mar-2011 José Fonseca <jfonseca@vmware.com> draw: Prevent out-of-bounds vertex buffer access.

Based on some code and ideas from Keith Whitwell.
/external/mesa3d/src/gallium/auxiliary/draw/draw_private.h
4c73030d47f39441d718157f7d9a59c136bbfac0 23-Jan-2011 Jakob Bornecrantz <wallbraker@gmail.com> draw: Init llvm if not provided
/external/mesa3d/src/gallium/auxiliary/draw/draw_private.h
efc82aef35a2aac5d2ed9774f6d28f2626796416 01-Dec-2010 Brian Paul <brianp@vmware.com> gallivm/llvmpipe: squash merge of the llvm-context branch

This branch defines a gallivm_state structure which contains the
LLVMBuilderRef, LLVMContextRef, etc. All data structures built with
this object can be periodically freed during a "garbage collection"
operation.

The gallivm_state object has to be passed to most of the builder
functions where LLVMBuilderRef used to be used.

Conflicts:
src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
src/gallium/drivers/llvmpipe/lp_state_setup.c
/external/mesa3d/src/gallium/auxiliary/draw/draw_private.h
08f890d4c3b8376d1840f90474f7c56329432d95 10-Oct-2010 delphi <tayhuiqithq@gmail.com> draw: some changes to allow for runtime changes to userclip planes
/external/mesa3d/src/gallium/auxiliary/draw/draw_private.h
e22e3927b056806e9bbb089734132ad0bcb98df1 18-Sep-2010 Brian Paul <brianp@vmware.com> gallium: rework handling of sprite_coord_enable state

Implement the pipe_rasterizer_state::sprite_coord_enable field
in the draw module (and softpipe) according to what's specified
in the documentation.

The draw module can now add any number of extra vertex attributes
to a post-transformed vertex and generate texcoords for those
attributes per sprite_coord_enable. Auto-generated texcoords
for sprites only worked for one texcoord unit before.

The frag shader gl_PointCoord input is now implemented like any
other generic/texcoord attribute.

The draw module now needs to be informed about fragment shaders
since we need to look at the fragment shader's inputs to know
which ones need auto-generated texcoords.

Only softpipe has been updated so far.
/external/mesa3d/src/gallium/auxiliary/draw/draw_private.h
6c0dc4bafbdbdc0cb4b6e5934fe064226dbd47ec 20-Aug-2010 Keith Whitwell <keithw@vmware.com> draw: specialized cliptesting routines
/external/mesa3d/src/gallium/auxiliary/draw/draw_private.h
4f024e0f642f4f743e4d051ec71c00e45bfd361f 25-Aug-2010 Chia-I Wu <olv@lunarg.com> draw: Add draw_set_index_buffer and others.

This commit adds draw_set_index_buffer, draw_set_mapped_index_buffer,
and draw_vbo. The idea behind the new functions is that an index buffer
should be a state.

draw_arrays and draw_set_mapped_element_buffer are preserved, but the
latter will be removed soon.
/external/mesa3d/src/gallium/auxiliary/draw/draw_private.h
c3fee80f2b35f6a7e48d6015bfc759c66b7e1a2c 07-Aug-2010 Chia-I Wu <olv@lunarg.com> draw: Remove DRAW_PIPE_MAX_VERTICES and DRAW_PIPE_FLAG_MASK.

The higher bits of draw elements are no longer used for the stipple or
edge flags.
/external/mesa3d/src/gallium/auxiliary/draw/draw_private.h
5a085c623faebf957be3fae2f82dc89ef6214585 07-Aug-2010 Chia-I Wu <olv@lunarg.com> draw: Replace vcache by vsplit.

vcache decomposes primitives while vsplit splits primitives. Splitting
is generally easier to do and is faster. More importantly, vcache
depends on flatshade_first to decompose. The outputs may have incorrect
vertex order which is significant to GS.
/external/mesa3d/src/gallium/auxiliary/draw/draw_private.h
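The difference between decomposing and splitting described in the entry above can be illustrated with a triangle list: splitting only cuts the vertex range into segments that each hold whole primitives, leaving vertex order untouched. This is a sketch under that assumption; `sketch_split_trilist` and `struct sketch_segment` are hypothetical and do not reflect the actual vsplit code.

```c
#include <assert.h>

/* Hypothetical description of one split segment. */
struct sketch_segment {
   unsigned start;   /* first vertex of the segment */
   unsigned count;   /* number of vertices, always a multiple of 3 */
};

/* Split a triangle list of `count` vertices into segments of at most
 * `max_verts` vertices each, never cutting through a triangle.
 * Returns the number of segments written to `out`. */
static unsigned
sketch_split_trilist(unsigned count, unsigned max_verts,
                     struct sketch_segment *out, unsigned max_out)
{
   const unsigned seg_max = max_verts - max_verts % 3;
   unsigned start = 0;
   unsigned n = 0;

   assert(seg_max >= 3);

   /* Ignore an incomplete trailing triangle. */
   count -= count % 3;

   while (start < count && n < max_out) {
      unsigned len = count - start;
      if (len > seg_max)
         len = seg_max;
      out[n].start = start;
      out[n].count = len;
      n++;
      start += len;
   }
   return n;
}
```

Because each segment is a contiguous slice of the input, the original vertex order is preserved, which is the property the message says matters for geometry shaders.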
5b6bf799e637e9020af3a4bebe514b53d7c38eca 07-Aug-2010 Chia-I Wu <olv@lunarg.com> draw: Replace varray by vsplit.

vsplit is a superset of varray. Compared to varray, it also sets the
split flags.
/external/mesa3d/src/gallium/auxiliary/draw/draw_private.h
f141abdc8fdbff41e16b0ce53fa3fa8fba32a7f9 07-Aug-2010 Chia-I Wu <olv@lunarg.com> draw: Add flags to draw_prim_info.

A primitive may be split in frontends. The split primitives
should convey certain flag bits so that the decomposer can correctly
decide the stipple or edge flags.

This commit adds flags to draw_prim_info and updates the decomposer to
honor the flags. Frontends and middle ends will be updated later.
/external/mesa3d/src/gallium/auxiliary/draw/draw_private.h
ba2cc3b8e6ad161181b67fd2575c6bc768584d23 29-Jul-2010 Brian Paul <brianp@vmware.com> gallium: implement bounds checking for constant buffers

Plumb the constant buffer sizes down into the tgsi interpreter where
we can do bounds checking. Optional debug code warns upon out-of-bounds
reading. Plus add a few other assertions in the TGSI interpreter.
/external/mesa3d/src/gallium/auxiliary/draw/draw_private.h
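A minimal sketch of the kind of check the entry above describes, assuming the buffer size is passed down in bytes alongside the pointer. The helper `sketch_fetch_const` is hypothetical, and returning zero for an out-of-bounds read is an assumption made for the example, not something the commit message states.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: fetch one float from a constant buffer whose
 * size in bytes is known. Out-of-bounds reads yield 0.0f here; that
 * fallback value is an assumption of this sketch. */
static float
sketch_fetch_const(const float *buf, size_t buf_size_bytes, unsigned index)
{
   const size_t num_floats = buf_size_bytes / sizeof(float);

   if ((size_t) index >= num_floats)
      return 0.0f;

   return buf[index];
}
```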
9ca48de1068d4ebce81853d29455c83b4898e25e 21-Jul-2010 Marek Olšák <maraeo@gmail.com> draw: disable depth clipping if depth clamp is enabled
/external/mesa3d/src/gallium/auxiliary/draw/draw_private.h
01eebfe1b6de2e36dd3af0952fc8329b7073a100 14-Jun-2010 Zack Rusin <zackr@vmware.com> draw: implement vertex texture sampling using llvm
/external/mesa3d/src/gallium/auxiliary/draw/draw_private.h
8ebfcf31eb905b7d47e520c04420620ae21bdf4e 26-Jun-2010 Zack Rusin <zackr@vmware.com> draw: limit the number of vertex shader variants kept around

We used to create and cache an unlimited number of variants; this
change limits the number of variants kept around to a fixed number.
The change is based on a similar patch by Roland for llvmpipe fragment
shaders.
/external/mesa3d/src/gallium/auxiliary/draw/draw_private.h
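The idea of capping the variant cache in the entry above can be sketched with a fixed-size table. The names `sketch_variant_cache` and `sketch_lookup_or_create`, the limit of 4, and the oldest-first eviction order are all assumptions for illustration; the commit message does not say which eviction policy is used.

```c
#include <assert.h>

#define SKETCH_MAX_VARIANTS 4   /* hypothetical limit for the example */

struct sketch_variant_cache {
   unsigned keys[SKETCH_MAX_VARIANTS];
   unsigned count;   /* number of valid entries */
   unsigned next;    /* slot to overwrite once the cache is full */
};

/* Return 1 on a cache hit. On a miss, store the key (standing in for
 * compiling a new variant), evicting the oldest entry if full, and
 * return 0. */
static int
sketch_lookup_or_create(struct sketch_variant_cache *cache, unsigned key)
{
   unsigned i;

   for (i = 0; i < cache->count; i++) {
      if (cache->keys[i] == key)
         return 1;
   }

   if (cache->count < SKETCH_MAX_VARIANTS) {
      cache->keys[cache->count++] = key;
   } else {
      cache->keys[cache->next] = key;
      cache->next = (cache->next + 1) % SKETCH_MAX_VARIANTS;
   }
   return 0;
}
```

The trade-off is that a workload cycling through more keys than the limit will recompile repeatedly, in exchange for bounded memory use.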
3560652ccf0d88bcc23c326ea99bbc7091b45f39 15-Jun-2010 Zack Rusin <zackr@vmware.com> gs: make sure we end primitives when finishing executing shaders
/external/mesa3d/src/gallium/auxiliary/draw/draw_private.h
a192b5eeafae80f9f9e7e7e442abc5b44d583d1a 15-Jun-2010 Zack Rusin <zackr@vmware.com> draw: finish the new pipeline setup

Keith came up with a new way of running the pipeline which involves passing
a few info structs around (for fetch, vertices and prims) and allows us
to correctly handle cases where we end up with multiple primitives generated
by the pipeline itself.
/external/mesa3d/src/gallium/auxiliary/draw/draw_private.h
b85a361ccbac956d2842251395c048a4b3f4c440 14-Jun-2010 Keith Whitwell <keithw@vmware.com> draw wip
/external/mesa3d/src/gallium/auxiliary/draw/draw_private.h
4d0baa73c9e1a40b4ac089c786af79dc7f1ff219 10-Jun-2010 Zack Rusin <zack@kde.org> draw: geometry shader fixes

don't overwrite the inputs and make sure the correct primitive
is used on entry
/external/mesa3d/src/gallium/auxiliary/draw/draw_private.h
a45b7f47ee0e38b288cc8fc4f6a1c013e8c227bc 28-May-2010 Zack Rusin <zack@kde.org> gallium: basic and initial implementation of the stream output interface

aka transform feedback
/external/mesa3d/src/gallium/auxiliary/draw/draw_private.h
6ecbbc3c056d177174c97ac4d1a57abed3ac3177 26-Apr-2010 José Fonseca <jfonseca@vmware.com> draw: Always use the llvm middle end when available & enabled.
/external/mesa3d/src/gallium/auxiliary/draw/draw_private.h
8cb223eb020560d59c8f73e09b832cef477933b7 21-Apr-2010 Brian Paul <brianp@vmware.com> gallium/draw: fix point sprite handling

New draw API function to indicate whether or not to convert points to
quads for sprite rasterization.

Fix point-to-quad conversion regression in the wide-point stage. We
need to check the pipe_rasterizer_state::point_quad_rasterization flag.
/external/mesa3d/src/gallium/auxiliary/draw/draw_private.h
a6171a9dd99713266091982215bf1008c9ac8e64 20-Apr-2010 José Fonseca <jfonseca@vmware.com> Merge branch 'gallium-index-bias'
e3e5faba89996c64f6d5b5a00b9028900ddbd64f 19-Apr-2010 Zack Rusin <zackr@vmware.com> draw llvm: fix typo (boolean, not bool)
/external/mesa3d/src/gallium/auxiliary/draw/draw_private.h
4df3e76949e1ca7b29f844ad9a715b442396a024 19-Apr-2010 Zack Rusin <zackr@vmware.com> draw llvm: allow runtime switching of pipelines (yes/no to llvm)

use DRAW_USE_LLVM to disable or enable (default) llvm
/external/mesa3d/src/gallium/auxiliary/draw/draw_private.h
2197fac47cb1f87387820678357cc67c9a2536b9 19-Apr-2010 José Fonseca <jfonseca@vmware.com> draw: Implement index bias.
/external/mesa3d/src/gallium/auxiliary/draw/draw_private.h
8f3bdeaad610d7d5a5c6e73e1e9c721219595754 19-Apr-2010 Brian Paul <brianp@vmware.com> Merge branch '7.8'

Conflicts:

src/gallium/auxiliary/draw/draw_context.c
src/gallium/auxiliary/draw/draw_pipe_aaline.c
src/gallium/drivers/llvmpipe/lp_context.c
e3a34cc7f6c9f959cdc2af4486e84587fab4d0d7 19-Apr-2010 Brian Paul <brianp@vmware.com> gallium/draw: use correct rasterization state for wide/AA points/lines

When points or lines are decomposed into triangles, we need to be sure
to disable polygon culling, stippling, "un-filled" modes, etc.

This patch sets the rasterization state to disable those things prior to
drawing points/lines with triangles, then restores the previous state
afterward.

The new piglit point-no-line-cull test checks this problem & solution.
/external/mesa3d/src/gallium/auxiliary/draw/draw_private.h
ea532f0e725bd68e7784189c9b7f6f7bf7f9d901 10-Apr-2010 José Fonseca <jfonseca@vmware.com> scons: Make LLVM a black-white dependency.

Now that draw depends on llvm it is very difficult to correctly handle
broken llvm installations. Either the user requests LLVM and must
supply a working installation, or they don't and get no LLVM-accelerated
pipe drivers.
/external/mesa3d/src/gallium/auxiliary/draw/draw_private.h
c5c5cd7132e18f4aad8e73d8ee879f8823c4c1e7 23-Feb-2010 Zack Rusin <zackr@vmware.com> gallium/draw: initial code to properly support llvm in the draw module

Code-generate big chunks of the vertex pipeline in order to speed up
software vertex processing.
/external/mesa3d/src/gallium/auxiliary/draw/draw_private.h
c61bf363937f40624a5632745630d4f2b9907082 09-Feb-2010 Zack Rusin <zackr@vmware.com> llvmpipe: export the tgsi translation code to a common layer

the llvmpipe tgsi translation is a lot more complete than what was in
gallivm so replacing the latter with the former. this is needed since
the draw llvm paths will use the same code. effectively the proven
llvmpipe code becomes gallivm.
/external/mesa3d/src/gallium/auxiliary/draw/draw_private.h
256f7f5ab2123bae9549e4f572276e200dc1ae76 03-Feb-2010 Brian Paul <brianp@vmware.com> draw: add const qualifiers, fix return types
/external/mesa3d/src/gallium/auxiliary/draw/draw_private.h
7c5f255201f42303188137f56ea8acc030444f0e 25-Jan-2010 Michal Krol <michal@vmware.com> gallium: Rename PIPE_MAX_CONSTANT to PIPE_MAX_CONSTANT_BUFFERS.
/external/mesa3d/src/gallium/auxiliary/draw/draw_private.h
9851644435f991a1a1bbb145333a97601627b37d 25-Jan-2010 Michal Krol <michal@vmware.com> gallium: Enable multiple constant buffers for vertex and geometry shaders.
/external/mesa3d/src/gallium/auxiliary/draw/draw_private.h
a5c03bd6f16517bf35c273741080492d70d64c29 16-Jan-2010 Jakob Bornecrantz <jakob@vmware.com> draw: Fix memory leak in gs code
/external/mesa3d/src/gallium/auxiliary/draw/draw_private.h
f7748d72b46465d807cf4209892d73af62485738 15-Jan-2010 Luca Barbieri <luca@luca-barbieri.com> draw: Add GALLIUM_DUMP_VS environment variable.

Add GALLIUM_DUMP_VS to dump the vertex shader to the console like
GALLIUM_DUMP_FS in softpipe.
/external/mesa3d/src/gallium/auxiliary/draw/draw_private.h
9b21b3c52a8a7d58d08151d1a6bf25c472dec213 05-Jan-2010 Michal Krol <michal@vmware.com> Merge branch 'master' into instanced-arrays

Conflicts:
src/gallium/auxiliary/tgsi/tgsi_dump.c
src/gallium/include/pipe/p_shader_tokens.h
7ca0ce38340144794267609646048b3820d594ab 29-Dec-2009 Michal Krol <michal@vmware.com> Implement draw_arrays_instanced() in softpipe.

Modify the translate module to respect instance divisors and accept
instance id as a parameter to calculate input vertex offset.
/external/mesa3d/src/gallium/auxiliary/draw/draw_private.h
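The instance divisor rule mentioned in the entry above can be sketched as follows: an element with a divisor of zero advances per vertex, while a non-zero divisor makes it advance once every `divisor` instances. The function `sketch_element_offset` and its parameters are hypothetical names for illustration, not the translate module's interface.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: byte offset of the input data for one vertex
 * element, given its stride and instance divisor. */
static size_t
sketch_element_offset(unsigned stride, unsigned instance_divisor,
                      unsigned vertex_index, unsigned instance_id)
{
   /* Divisor 0: per-vertex data. Otherwise: per-instance data that
    * advances once every `instance_divisor` instances. */
   const unsigned index = instance_divisor
                             ? instance_id / instance_divisor
                             : vertex_index;

   return (size_t) index * stride;
}
```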
89d8577fb3036547ef0b47498cc8dc5c77f886e0 14-Dec-2009 Zack Rusin <zackr@vmware.com> gallium: add geometry shader support to gallium
/external/mesa3d/src/gallium/auxiliary/draw/draw_private.h
a0127b6ced257919180ba3a1bf534b68d9c750be 14-Dec-2009 Roland Scheidegger <sroland@vmware.com> gallium: more work for edgeflags changes

fixes, cleanups, etc.
not working yet
/external/mesa3d/src/gallium/auxiliary/draw/draw_private.h
a08e348a84f57ed5e8bf5888f1ce13934d2ce8fa 09-Dec-2009 Keith Whitwell <keithw@vmware.com> gallium: first steps to treat edgeflags as regular vertex element

The idea here is to eliminate the set_edgeflags() call in pipe_context
by treating edgeflags as a regular vertex element.

Edgeflags provoke special treatment in hardware, which means we need to
label them in some way, in this case we'll be passing them through the
vertex shader and labelling the vertex shader output with a new TGSI
semantic (TGSI_SEMANTIC_EDGEFLAG).
/external/mesa3d/src/gallium/auxiliary/draw/draw_private.h
c202fe187cf7a08d60e23ce617a5820a8bc510fd 16-Jul-2009 Keith Whitwell <keith@tungstengraphics.com> gallium: reduce recursive include of tgsi_exec.h

A lot of draw code no longer needs to see this header.
/external/mesa3d/src/gallium/auxiliary/draw/draw_private.h
6175653d0bceedba1f599d27111bab14f312f134 16-Jul-2009 Keith Whitwell <keith@tungstengraphics.com> gallium: proper constructor and destructor for tgsi_exec_machine

Centralize the creation, initialization and destruction of this struct.
Use align_malloc instead of home-brew alternatives.
/external/mesa3d/src/gallium/auxiliary/draw/draw_private.h
0c31661e73dd2979df22a275452efc71c7064f81 11-Dec-2008 Brian Paul <brian.paul@tungstengraphics.com> Merge commit 'origin/gallium-0.1' into gallium-0.2
d0bc5293d6e1e9c34fa822b7c2928932ed22462c 11-Dec-2008 Brian Paul <brian.paul@tungstengraphics.com> gallium: added draw_set_mrd() function to fix polygon offset

The Minimum Resolvable Depth factor depends on the driver and can't just
be computed from the number of Z buffer bits.
Glean's polygon offset test now passes with softpipe.
Still need to determine the MRD factor for other gallium drivers, if they use
the draw module's polygon offset stage...
/external/mesa3d/src/gallium/auxiliary/draw/draw_private.h
f2bccfd3c806a879abf0c40858806ec3825d0628 03-Dec-2008 Brian <brian.paul@tungstengraphics.com> gallium: added draw_texture_samplers() to support texture fetches from vertex shaders

This may only be practical for the softpipe driver at this time.
/external/mesa3d/src/gallium/auxiliary/draw/draw_private.h
d7f1cb5b5a134b63227d5746a2dd1f05597c5c2f 10-Oct-2008 Keith Whitwell <keith@tungstengraphics.com> Merge commit 'origin/gallium-0.1' into gallium-0.2

Conflicts:

src/gallium/auxiliary/gallivm/instructionssoa.cpp
src/gallium/auxiliary/gallivm/soabuiltins.c
src/gallium/auxiliary/rtasm/rtasm_x86sse.c
src/gallium/auxiliary/rtasm/rtasm_x86sse.h
src/mesa/main/texenvprogram.c
src/mesa/shader/arbprogparse.c
src/mesa/shader/prog_statevars.c
src/mesa/state_tracker/st_draw.c
src/mesa/vbo/vbo_exec_draw.c
ca5224945ae11d3c2e80fd39b7e08464d019bbdd 10-Oct-2008 Alan Hourihane <alanh@tungstengraphics.com> gallium: silence warning
/external/mesa3d/src/gallium/auxiliary/draw/draw_private.h
c48da7d78b4e7bdbe056b3c9668756d49019be06 06-Oct-2008 Keith Whitwell <keith@tungstengraphics.com> draw: add switch for drivers to force vertex data passthrough
/external/mesa3d/src/gallium/auxiliary/draw/draw_private.h
c208a2c791fa24c7c5887fc496738cbddbfafc72 27-Jul-2008 José Fonseca <jrfonseca@tungstengraphics.com> Merge tgsi/exec and tgsi/util directories.
/external/mesa3d/src/gallium/auxiliary/draw/draw_private.h
2161b0fafcdc16703162dd489d2ec1e7114cce4c 12-Jun-2008 Keith Whitwell <keith@tungstengraphics.com> draw: don't assume vertex position is in data[0]
/external/mesa3d/src/gallium/auxiliary/draw/draw_private.h
0a4aea0e86a897d9afb9f2a0ec27f03faf8f1b21 02-Jun-2008 Keith Whitwell <keith@tungstengraphics.com> draw: respect driver's max vertex buffer size
/external/mesa3d/src/gallium/auxiliary/draw/draw_private.h
13581958bd99396ab8ec314f10cf61f717b18a9b 31-May-2008 Michal Krol <michal@tungstengraphics.com> draw: Remove const qualifier.
/external/mesa3d/src/gallium/auxiliary/draw/draw_private.h
807e7c4ccfdaebf8e568357fb1fd8090ccae638c 29-May-2008 Keith Whitwell <keith@tungstengraphics.com> draw: add more switches to turn FSE on/off
/external/mesa3d/src/gallium/auxiliary/draw/draw_private.h
bb2e13b9e82b68ec3b9fc56a4c35e7ead8fd138f 29-May-2008 Keith Whitwell <keith@tungstengraphics.com> draw: make sure constant buffer data is aligned before passing to aos.c
/external/mesa3d/src/gallium/auxiliary/draw/draw_private.h
82605d7bcd533d7c96cc619c45970efd7229dc3b 29-May-2008 Keith Whitwell <keith@tungstengraphics.com> draw: draw_range_elements trial
/external/mesa3d/src/gallium/auxiliary/draw/draw_private.h
62628c4d3d497cbca73fde869c9069fa90e6453e 29-May-2008 Keith Whitwell <keith@tungstengraphics.com> draw: share machine
/external/mesa3d/src/gallium/auxiliary/draw/draw_private.h
7c99d7fe60e7bb0b7cf103a851aeef4614278ca6 15-May-2008 Keith Whitwell <keith@tungstengraphics.com> draw: create specialized vs variants incorporating fetch & emit
/external/mesa3d/src/gallium/auxiliary/draw/draw_private.h
2f0d1396e4c1626b3b1ac799bd29e86a9530369e 13-May-2008 Keith Whitwell <keith@tungstengraphics.com> draw: move some state into a new 'vs' area
/external/mesa3d/src/gallium/auxiliary/draw/draw_private.h
b23706454bb165a62888d264e95a98a2e4cf139c 13-May-2008 Keith Whitwell <keith@tungstengraphics.com> draw: get rid of fetch-shade-emit frontend hack

The code is now living in its intended place as a pt middle end.
/external/mesa3d/src/gallium/auxiliary/draw/draw_private.h
bbda45ec769120324f44febf00c6bb170f594f23 12-May-2008 Keith Whitwell <keith@tungstengraphics.com> draw: turn fse path into a middle end

Also add some util functions in pt_util.c
/external/mesa3d/src/gallium/auxiliary/draw/draw_private.h
b5e5369da5fc50d63a6ece931fac44b555eb0314 12-May-2008 Keith Whitwell <keith@tungstengraphics.com> draw: add fetch-shade-emit path

Enable with TEST_FSE=t. Performs fetch from API-provided vertex buffers,
transformation with one of three (two working) hard-coded shaders, and
final emit to hardware vertices all in a single pass.

Currently only really useful for profiling in conjunction with SP_NO_RAST=t.
/external/mesa3d/src/gallium/auxiliary/draw/draw_private.h
fe586f8612dd517b9a1f0d87fbaf3a75e3caf588 07-May-2008 Zack Rusin <zack@tungstengraphics.com> redo the linear paths
/external/mesa3d/src/gallium/auxiliary/draw/draw_private.h
992d0b997f8f7e965e56852b81e01c290f8c13de 24-Apr-2008 Zack Rusin <zack@tungstengraphics.com> frontend for rendering without elts
/external/mesa3d/src/gallium/auxiliary/draw/draw_private.h
f93332da5655a31b6c44a1079629a15360ff999b 24-Apr-2008 Keith Whitwell <keith@tungstengraphics.com> draw: handle edgeflags and reset-line-stipple again
/external/mesa3d/src/gallium/auxiliary/draw/draw_private.h
14d1ca8d867d6e44c756cb759f92421107118b2e 24-Apr-2008 Brian Paul <brian.paul@tungstengraphics.com> gallium: fix issues in recursive flushing

When flushing/rendering, some stages (like AA line/point) need to set
pipe/driver state. Those driver functions often call draw_flush().
That leads to recursion.

Use new draw->suspend_flush flag to explicitly prevent that in the key places.
Remove the draw->vcache_flushing field.
Reuse draw->flushing as a debug/assertion var.
/external/mesa3d/src/gallium/auxiliary/draw/draw_private.h
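The recursion described in the entry above, and the flag that breaks it, can be modelled in a few lines. This is a sketch only: the `sketch_*` functions and the struct are hypothetical, with field names borrowed from the commit message (`suspend_flush`, `flushing`), and the control flow is a simplification of whatever the draw module actually does.

```c
#include <assert.h>

struct sketch_draw {
   int suspend_flush;      /* set while a stage is changing driver state */
   int flushing;           /* debug flag: a flush is in progress */
   unsigned flush_count;   /* number of flushes that actually ran */
};

static void sketch_flush(struct sketch_draw *draw);

/* Stands in for a driver state setter that calls back into flush. */
static void
sketch_driver_set_state(struct sketch_draw *draw)
{
   sketch_flush(draw);
}

/* Stands in for a stage (such as AA line/point) that must set driver
 * state while rendering. */
static void
sketch_stage_render(struct sketch_draw *draw)
{
   draw->suspend_flush = 1;
   sketch_driver_set_state(draw);
   draw->suspend_flush = 0;
}

static void
sketch_flush(struct sketch_draw *draw)
{
   if (draw->suspend_flush)
      return;                /* nested call from a state setter: ignore */

   assert(!draw->flushing);  /* a real recursive flush would trip here */
   draw->flushing = 1;
   draw->flush_count++;
   sketch_stage_render(draw);
   draw->flushing = 0;
}
```

Calling `sketch_flush` once performs exactly one flush even though the driver callback re-enters it, which is the behaviour the message is after.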
72fd5b9c5a78792ad8c1fe7c8713a3583008c50a 23-Apr-2008 Brian Paul <brian.paul@tungstengraphics.com> gallium: added a flushing_vcache flag, test in draw_do_flush()

Fixes broken polygon stipple, aaline, aapoint stages
/external/mesa3d/src/gallium/auxiliary/draw/draw_private.h
0588858702d1a5c9c08573ea6817e2e149473cf6 22-Apr-2008 Keith Whitwell <keith@tungstengraphics.com> draw: allow drivers to query pipeline state more easily

Also, provide a separate flag to say whether the driver can handle
clipping/rhw tasks, in addition to the API flag which indicates they
have already been done.
/external/mesa3d/src/gallium/auxiliary/draw/draw_private.h
40e0439db448a7d93ddb18faac7f14b47b1343c0 21-Apr-2008 José Fonseca <jrfonseca@tungstengraphics.com> gallium: Centralize SSE usage logic.
/external/mesa3d/src/gallium/auxiliary/draw/draw_private.h
7d72607e142c0412b88183b849fd701e698b8f79 19-Apr-2008 Keith Whitwell <keith@tungstengraphics.com> draw: move incoming vertex state into draw->pt

This state is effectively private to the vertex processing part
of the draw module.
/external/mesa3d/src/gallium/auxiliary/draw/draw_private.h
dcf6f776ce32b89b7ff784bb38030bd29698e005 19-Apr-2008 Keith Whitwell <keith@tungstengraphics.com> draw: make draw_reset_vertex_ids private to the draw_pipe_* code
/external/mesa3d/src/gallium/auxiliary/draw/draw_private.h
e7bac4276634ea1ee81ac71f6f6869f87e689872 19-Apr-2008 Keith Whitwell <keith@tungstengraphics.com> draw: put pipeline flushing behind a new interface
/external/mesa3d/src/gallium/auxiliary/draw/draw_private.h
bee1d31641674c67676de86fbb4b35ca5bf7f33f 19-Apr-2008 Keith Whitwell <keith@tungstengraphics.com> draw: move pt_pipeline code to draw_pipe.c

This is now the drawing interface to the pipeline. No more
calling into pipeline.first->tri(), etc.
/external/mesa3d/src/gallium/auxiliary/draw/draw_private.h
507fbe2d327efb8d608ce8e07436b97321560808 19-Apr-2008 Keith Whitwell <keith@tungstengraphics.com> draw: move some pipeline-specific code & state to draw_pipe.[ch]
/external/mesa3d/src/gallium/auxiliary/draw/draw_private.h
1246d06313f443c91dea07239b43a88ba2b86dde 19-Apr-2008 Keith Whitwell <keith@tungstengraphics.com> draw: remove named clipmask flags, tidy up pt middle ends
/external/mesa3d/src/gallium/auxiliary/draw/draw_private.h
251ebcc175d479dda8d0d5b64fc42f44e747197e 19-Apr-2008 Keith Whitwell <keith@tungstengraphics.com> draw: remove more dead data structures
/external/mesa3d/src/gallium/auxiliary/draw/draw_private.h
6094e79f4e3350d123c7532b1c73faa60834a62d 19-Apr-2008 Keith Whitwell <keith@tungstengraphics.com> draw: remove dead data structures
/external/mesa3d/src/gallium/auxiliary/draw/draw_private.h
709e33cf0bfd552220e46f44e8cfa2063c3cef69 18-Apr-2008 Keith Whitwell <keith@tungstengraphics.com> draw: remove old draw_vertex_shader_queue_flush function
/external/mesa3d/src/gallium/auxiliary/draw/draw_private.h
a41c05b20a36d2160aa232d08ed57d3095438025 18-Apr-2008 Keith Whitwell <keith@tungstengraphics.com> draw: switch over to draw_pt paths, will remove old code shortly
/external/mesa3d/src/gallium/auxiliary/draw/draw_private.h
a773f06e969a3992451dd7fe6fd55ea96b2774fa 18-Apr-2008 Keith Whitwell <keith@tungstengraphics.com> draw: split off all the extra functionality in the vertex shader

This will at least allow us to make the initial gains to get decent
vertex performance much more quickly & with higher confidence of getting
it right.

At some later point can look again at code-generating all the
fetch/cliptest/viewport extras in the same block as the vertex shader.
For now, just need to get some decent baseline performance.
/external/mesa3d/src/gallium/auxiliary/draw/draw_private.h
c503e55d74cf84f87f82b3dab3cb4d38b201d47a 17-Apr-2008 Keith Whitwell <keith@tungstengraphics.com> draw: move hw vertex emit to a new module
/external/mesa3d/src/gallium/auxiliary/draw/draw_private.h
280bcff1fa200b790d8712946a4ffbaa47a67433 17-Apr-2008 Keith Whitwell <keith@tungstengraphics.com> draw: add vertex shader run_linear function
/external/mesa3d/src/gallium/auxiliary/draw/draw_private.h
c96d565643de271c6bda066e892b25d0a97ea4d0 17-Apr-2008 Keith Whitwell <keith@tungstengraphics.com> draw: keep record of number of active vertex buffers
/external/mesa3d/src/gallium/auxiliary/draw/draw_private.h
a8582efaca35d09c8ca18918a243a9284583356d 16-Apr-2008 Keith Whitwell <keith@tungstengraphics.com> draw: make pt run pipeline when need_pipeline is true, not just when clipped
/external/mesa3d/src/gallium/auxiliary/draw/draw_private.h
e3309197855b5caf7c4c167d1e7beedf33ed2fdd 14-Apr-2008 Zack Rusin <zack@tungstengraphics.com> pass vertex size to shaders so that callee can decide on the size
of the vertices and not always have to use the maximum vertex
allocation size for them
/external/mesa3d/src/gallium/auxiliary/draw/draw_private.h
caf293343fd236e97ce399533ac0ada3c7afee7a 14-Apr-2008 Keith Whitwell <keith@tungstengraphics.com> draw: hide passthrough shading paths behind an environment variable
/external/mesa3d/src/gallium/auxiliary/draw/draw_private.h
3f7a3dd58c0ce2719af83ff1d89a26185d08c04c 13-Apr-2008 Zack Rusin <zack@tungstengraphics.com> Make shaders operate on a block of memory instead of arrays of vertex_header's
/external/mesa3d/src/gallium/auxiliary/draw/draw_private.h
808f968f3ad0cb32e86f517753d5715d00e9ec2c 12-Apr-2008 Zack Rusin <zack@tungstengraphics.com> return true if one of the vertices has been clipped
/external/mesa3d/src/gallium/auxiliary/draw/draw_private.h
aadbb1d7fbbaada6e378cb60194e5861cadf98d1 12-Apr-2008 Zack Rusin <zack@tungstengraphics.com> pass arbitrary number of vertices to the shader execution cycle
/external/mesa3d/src/gallium/auxiliary/draw/draw_private.h
84501e68f6294370d6f2f6aec4e7eab57bcc0e72 04-Apr-2008 Keith Whitwell <keith@tungstengraphics.com> gallium: Handle client-supplied edgeflags.

Also, implement support in the draw module. We were hardwiring these
to one for quite a long time...

Currently using a draw_set_edgeflags() function, may be better to push
the argument into the draw_arrays() function. TBD.
/external/mesa3d/src/gallium/auxiliary/draw/draw_private.h
0b20d1b9b5e0514a68ab460d748753d29df2e70b 04-Apr-2008 Keith Whitwell <keith@tungstengraphics.com> draw: move code to run pipeline from pt to new file

Add facility for draw_vbuf.c to reset these vertex ids on flushes.
Pre-initialize vertex ids correctly.
/external/mesa3d/src/gallium/auxiliary/draw/draw_private.h
d2cb4ba0bb2388c784f145c59f3798f914dc7f39 03-Apr-2008 Keith Whitwell <keith@tungstengraphics.com> draw: add passthrough path to the pipeline

This handles the case where bypass_vs is set, but vertices need to go
through the pipeline for some reason - eg unfilled polygon mode.

Demonstrates how to drive the pipeline from inside one of these things.
/external/mesa3d/src/gallium/auxiliary/draw/draw_private.h
594dab4769533afaeb30a588e1731a6753a93f0d 31-Mar-2008 Brian <brian.paul@tungstengraphics.com> gallium: move the test for bypass_vs into the vs_XXX_run() functions

Also:
1. Added an identity_viewport flag to skip viewport transformation when it
has no effect. Might also add an explicit bypass_viewport flag someday.
2. Separate the code for computing clip codes and doing the viewport transform.
Predicate them separately.
Note: even if bypass_vs is set, we still look at the shader to determine the
number of inputs and outputs.
/external/mesa3d/src/gallium/auxiliary/draw/draw_private.h
39038c11699bbc9baab744542e96d54e91cb452a 28-Mar-2008 Brian <brian.paul@tungstengraphics.com> gallium: replace PIPE_ATTRIB_MAX with PIPE_MAX_ATTRIBS

The latter follows the naming scheme of other limits.
Keep the old definition until all possible usage is updated.
/external/mesa3d/src/gallium/auxiliary/draw/draw_private.h
df1744c0433f3f73ebf4b06567fefa946a29c3d8 27-Mar-2008 Brian <brian.paul@tungstengraphics.com> gallium: remove temporary static var
/external/mesa3d/src/gallium/auxiliary/draw/draw_private.h
4505acf3b28f0b88bf97838ed7898f10e9200b93 25-Mar-2008 Keith Whitwell <keith@tungstengraphics.com> draw: take primitive into account when deciding if the pipeline is active
/external/mesa3d/src/gallium/auxiliary/draw/draw_private.h
f40357e25c0520ef1d64ffab03501da4c8b93529 23-Mar-2008 Keith Whitwell <keith@tungstengraphics.com> gallium: beginnings of draw module vertex rework

Trying to put a structure in place that we can actually optimize.
Initially just implementing a passthrough mode, this will fairly soon
replace all the vertex_cache/prim_queue/shader_queue stuff that's so
hard to understand...

Split the vertex processing into a couple of distinct stages:
- Frontend
  - Prepares two lists of elements (fetch and draw) to be processed
    by the next stage. This stage doesn't fetch or draw vertices, but
    makes the decision which to draw. Multiple implementations of this
    will implement different strategies, currently just a vcache
    implementation.
- MiddleEnd
  - Takes the list of fetch elements, fetches them, runs the vertex
    shader, cliptest, viewport transform on them to produce a
    linear array of vertex_header vertices.
  - Passes that list of vertices, plus the draw_elements (which index
    into that list) onto the backend
- Backend
  - Either the existing primitive/clipping pipeline, or the vbuf_render
    hardware backend provided by the driver.

Currently, the middle-end is the old passthrough code, and it builds hardware
vertices, not vertex_header vertices as above. It may be that passthrough
is a special case in this respect.
/external/mesa3d/src/gallium/auxiliary/draw/draw_private.h
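The frontend / middle-end / backend split described in the entry above can be sketched roughly as follows. All names here are hypothetical, the linear-search de-duplication is an assumption standing in for the vcache strategy, and the "shader" and "backend" are trivial placeholders.

```c
#define SKETCH_MAX 64

/* The two element lists a frontend hands to the middle end. */
struct sketch_lists {
   unsigned fetch_elts[SKETCH_MAX];  /* which source vertices to fetch */
   unsigned fetch_count;
   unsigned draw_elts[SKETCH_MAX];   /* indices into the fetched array */
   unsigned draw_count;
};

/* Frontend: decides what to fetch and how to index it. It does not
 * fetch or draw anything itself. Input beyond SKETCH_MAX is ignored. */
static void sketch_frontend(const unsigned *elts, unsigned count,
                            struct sketch_lists *out)
{
   out->fetch_count = 0;
   out->draw_count = 0;
   for (unsigned i = 0; i < count && i < SKETCH_MAX; i++) {
      unsigned slot;
      for (slot = 0; slot < out->fetch_count; slot++)
         if (out->fetch_elts[slot] == elts[i])
            break;
      if (slot == out->fetch_count)
         out->fetch_elts[out->fetch_count++] = elts[i];
      out->draw_elts[out->draw_count++] = slot;
   }
}

/* Middle end: fetches each listed vertex and processes it into a
 * linear output array. Doubling stands in for the vertex shader. */
static void sketch_middle_end(const float *vertex_buffer,
                              const struct sketch_lists *lists,
                              float *linear_out)
{
   for (unsigned i = 0; i < lists->fetch_count; i++)
      linear_out[i] = vertex_buffer[lists->fetch_elts[i]] * 2.0f;
}

/* Backend: consumes the linear vertices through the draw elements.
 * Summing stands in for the pipeline or the hardware backend. */
static float sketch_backend(const float *linear,
                            const struct sketch_lists *lists)
{
   float sum = 0.0f;
   for (unsigned i = 0; i < lists->draw_count; i++)
      sum += linear[lists->draw_elts[i]];
   return sum;
}
```

The point of the split in this sketch is that each unique vertex is processed once by the middle end, however many times the draw list refers to it.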
5a09ad8248ce452136ed96a3d46532b03c877618 14-Mar-2008 Brian <brian.paul@tungstengraphics.com> gallium: add explicit control for point sprites (convert points to textured quads)

New draw_enable_point_sprites() function.
Fixes spriteblast.c demo
/external/mesa3d/src/gallium/auxiliary/draw/draw_private.h
8b8c9acdb747499149e633179a8ad10b0e4206b1 13-Mar-2008 Brian <brian.paul@tungstengraphics.com> gallium: added draw_enable_line_stipple() function

Allows drivers that implement line stipple to turn off this drawing stage.
/external/mesa3d/src/gallium/auxiliary/draw/draw_private.h
12ab5f97013e398b9f6485b97d6691c3c170447a 12-Mar-2008 Brian <brian@poulsbo.localnet.net> gallium: change draw_vertex_shader->state from pointer to struct

We were sometimes keeping a pointer to a stack-allocated object.
Now make a copy of the pipe_shader_state object.
This should fix some seemingly random memory errors/crashes.
/external/mesa3d/src/gallium/auxiliary/draw/draw_private.h
297b3be25a7f097fb9b1a79e332acddc12dcc3fe 10-Mar-2008 Keith Whitwell <keith@tungstengraphics.com> draw: placeholder/prototype code for a passthrough draw path
/external/mesa3d/src/gallium/auxiliary/draw/draw_private.h
a1a13954885cd469faab49633b5386e5c889e3df 29-Feb-2008 Brian Paul <brian.paul@tungstengraphics.com> gallium: split draw_wide_prim stage into separate point/line stages.

This fixes a validation/code-path problem. Enabling the stage for the sake
of wide points also inadvertently caused wide lines to be converted to tris
when we actually want them passed through, such as for the AA line stage.
This is just cleaner now.
Also, replace draw_convert_wide_lines() with draw_wide_line_threshold() as
was done for points. Allows for 1-pixel lines to be converted too if needed.
/external/mesa3d/src/gallium/auxiliary/draw/draw_private.h
cddeca51adf0d2b736a223e47b60f6ef3be85bff 28-Feb-2008 Brian <brian@i915.localnet.net> gallium: remove dependencies on pipe_shader_state's semantic info

Use tgsi_scan_shader() to populate a tgsi_shader_info struct and use that instead.
/external/mesa3d/src/gallium/auxiliary/draw/draw_private.h
5e29aab1752c3e07ae2ebde4cb00e6550dab0eb2 26-Feb-2008 Brian <brian@poulsbo.localnet.net> gallium: replace draw_convert_wide_points() with draw_wide_point_threshold()

Specifying a threshold size is a bit more flexible, and allows the option
of converting even 1-pixel points to triangles (set threshold=0).

Also, remove 0.25 pixel bias in wide_point().
/external/mesa3d/src/gallium/auxiliary/draw/draw_private.h
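As a rough illustration of the threshold idea in the entry above: the helper name is hypothetical, and the strict greater-than comparison is an assumption chosen so that threshold=0 converts 1-pixel points, as the message describes.

```c
#include <stdbool.h>

/* Hypothetical helper: should a point of this size be converted to
 * triangles? Assumes conversion happens when size exceeds threshold. */
static bool sketch_point_needs_conversion(float point_size, float threshold)
{
   return point_size > threshold;
}
```

Under this assumption a driver that can draw points natively up to some size would pass that size as the threshold, and one that cannot draw points at all would pass 0.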
446bfc32a83008e0865ec869bc80b920c907f10f 22-Feb-2008 Brian <brian.paul@tungstengraphics.com> gallium: new draw stage for polygon stipple.

For hardware without native polygon stipple. Create a 32x32 alpha texture
that encodes the stipple pattern. Modify the user's fragment program to
sample the texture (with gl_FragCoord) and kill the fragment according to
the texel value.
Temporarily enabled in softpipe driver, replacing the sp_quad_stipple.c step.
/external/mesa3d/src/gallium/auxiliary/draw/draw_private.h
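The stipple-as-texture approach in the entry above might look like this on the CPU side. This is a sketch under assumptions: the function names are hypothetical, the texture is a flat array of 0/255 alpha values, and the bit order (bit 31 is the leftmost texel of a row) is chosen for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_STIPPLE_SIZE 32

/* Expand 32 rows of 32 pattern bits into a 32x32 alpha texture. */
static void sketch_build_stipple_texture(const uint32_t *pattern,
                                         uint8_t *texels)
{
   for (unsigned y = 0; y < SKETCH_STIPPLE_SIZE; y++) {
      for (unsigned x = 0; x < SKETCH_STIPPLE_SIZE; x++) {
         /* Assumption: bit 31 maps to x == 0. */
         bool set = (pattern[y] >> (31u - x)) & 1u;
         texels[y * SKETCH_STIPPLE_SIZE + x] = set ? 255 : 0;
      }
   }
}

/* What the modified fragment program does, in effect: sample the
 * texture at the fragment's window position (wrapping every 32
 * pixels) and kill the fragment where the texel is zero. */
static bool sketch_stipple_kill(const uint8_t *texels,
                                unsigned frag_x, unsigned frag_y)
{
   unsigned x = frag_x % SKETCH_STIPPLE_SIZE;
   unsigned y = frag_y % SKETCH_STIPPLE_SIZE;
   return texels[y * SKETCH_STIPPLE_SIZE + x] == 0;
}
```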
eb4dc2dd5ed62e6ccb55ccc2bc13f6a2f3fc1f76 22-Feb-2008 Brian <brian.paul@tungstengraphics.com> gallium: new AA point drawing stage

AA points are drawn by converting the point to a quad, then modifying the
user's fragment shader to compute a coverage value. The final fragment
color's alpha is modulated by the coverage value. Fragments outside the
point's radius are killed.
/external/mesa3d/src/gallium/auxiliary/draw/draw_private.h
30479ef11004c9498c4ef09048efc56227f104cc 15-Feb-2008 Keith Whitwell <keith@tungstengraphics.com> draw: vertex cache rework

Take a baby step to straightening out vertex paths.
/external/mesa3d/src/gallium/auxiliary/draw/draw_private.h
f430d95a36d55141cd9ef911aab70364ce4a4108 18-Feb-2008 José Fonseca <jrfonseca@tungstengraphics.com> Use gallium's rtasm module.
/external/mesa3d/src/gallium/auxiliary/draw/draw_private.h
aceeb80d4f706980aaf71b8e098d4c6718d8ac90 19-Feb-2008 Brian <brian.paul@tungstengraphics.com> gallium: antialiased line drawing

New draw/prim stage: draw_aaline. When installed, lines are replaced by
textured quads to do antialiasing. The current user-defined fragment shader
is modified to do a texture fetch and modulate fragment alpha.
/external/mesa3d/src/gallium/auxiliary/draw/draw_private.h
b29d8d27292c2ad956d3f0a307603f00ee01af28 15-Feb-2008 Keith Whitwell <keith@tungstengraphics.com> draw: subclass vertex shaders according to execution method

Create new files for shaders compiled/executed with llvm, sse, exec
respectively
/external/mesa3d/src/gallium/auxiliary/draw/draw_private.h
92fcbf6e7bc622dcace226bb70ff6d5cdbdbaecb 15-Feb-2008 José Fonseca <jrfonseca@tungstengraphics.com> Code reorganization: s/aux/auxiliary/.

"aux" is a reserved name on Windows (X_X)
/external/mesa3d/src/gallium/auxiliary/draw/draw_private.h