History log of /external/mesa3d/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nvc0.cpp
Revision Date Author Comments
a991960ca9bb3327624853ca931c104a423cebd8 19-Mar-2017 Karol Herbst <karolherbst@gmail.com> nvc0/ir: treat FMA like MAD for operand propagation

Helps mainly Feral-ported games, due to their use of fma()

shader-db changes:
total instructions in shared programs : 3901147 -> 3842505 (-1.50%)
total gprs used in shared programs : 471258 -> 467359 (-0.83%)
total local used in shared programs : 27405 -> 27361 (-0.16%)
total bytes used in shared programs : 35749888 -> 35214176 (-1.50%)

        local    gpr   inst  bytes
helped     17   1829   4091   4091
hurt        4     44      3      3

Signed-off-by: Karol Herbst <karolherbst@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 09f16de7e624938d46a63b8285fc5b21050962e9)
/external/mesa3d/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nvc0.cpp
948cce01964c1dd7365c49381f9a6cf1b6e5f7f9 25-Nov-2016 Samuel Pitoiset <samuel.pitoiset@gmail.com> gm107/ir: do not combine CONST loads

This allows using MOV instead of LD. The main advantage is
that MOV doesn't require a read dependency barrier while LD does,
so this reduces both barrier pressure and the number of
stall counts needed to read data from constant memory.

This is currently only for user uniform accesses. I should do
something similar when loading from the driver constant buffer,
but that seems a bit tricky to handle for now.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
/external/mesa3d/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nvc0.cpp
3abe68b8282496688186157b51da5600ac540906 14-Sep-2016 Samuel Pitoiset <samuel.pitoiset@gmail.com> nv50/ir: teach insnCanLoad() about SHLADD

Commutativity is not allowed with SHLADD, but src2 can accept
loads. To allow the load propagation pass to do its job, add a
special case like for SUCLAMP because src1 is always an immediate.

This IMAD to SHLADD optimization helps a bunch of shaders from Tomb
Raider, Victor Vran, UE4 demos (+15% perf with Elemental) and Shadow
Warrior.

GF100/GK104:

total instructions in shared programs : 2838045 -> 2834712 (-0.12%)
total gprs used in shared programs : 396684 -> 396386 (-0.08%)
total local used in shared programs : 34416 -> 34416 (0.00%)

        local    gpr   inst  bytes
helped      0    326   1105   1105
hurt        0     55      3      3

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
/external/mesa3d/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nvc0.cpp
85132c7453230960f34cfe7b7b7fcaaab158d79f 14-Sep-2016 Samuel Pitoiset <samuel.pitoiset@gmail.com> nv50/ir: add preliminary support for SHLADD

This instruction has been available since SM20 (Fermi) and computes
(a << b) + c in one shot. In some situations, IMAD should be
replaced by SHLADD when its multiplier is a power of 2, and ADD+SHL
should be replaced by SHLADD as well.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
/external/mesa3d/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nvc0.cpp
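
The SHLADD rewrite described in 85132c7 can be illustrated with plain integer arithmetic. This is only a sketch of the equivalence, not the codegen pass itself: the helper names (`shladd`, `imad`, `mulAddViaShift`) are hypothetical, 32-bit wrapping semantics are assumed, and the shift amount is assumed to be below 32.

```cpp
#include <cstdint>

// Assumed reference semantics for SHLADD: (a << shift) + c, wrapping at 32 bits.
inline uint32_t shladd(uint32_t a, uint32_t shift, uint32_t c)
{
   return (a << shift) + c;
}

// Assumed reference semantics for a 32-bit integer multiply-add (IMAD).
inline uint32_t imad(uint32_t a, uint32_t b, uint32_t c)
{
   return a * b + c;
}

// True when exactly one bit of v is set.
inline bool isPow2(uint32_t v)
{
   return v != 0 && (v & (v - 1)) == 0;
}

// Base-2 logarithm of a value known to be a power of two.
inline uint32_t log2Pow2(uint32_t v)
{
   uint32_t n = 0;
   while (v > 1) {
      v >>= 1;
      ++n;
   }
   return n;
}

// Hypothetical helper: evaluate a * b + c, using the shift form whenever the
// multiplier b is a power of two and falling back to IMAD otherwise.
inline uint32_t mulAddViaShift(uint32_t a, uint32_t b, uint32_t c)
{
   return isPow2(b) ? shladd(a, log2Pow2(b), c) : imad(a, b, c);
}
```

For a multiplier of 8, `mulAddViaShift(a, 8, c)` takes the shift path with a shift of 3 and yields the same value as `imad(a, 8, c)`.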
9b8b69b3c4671b2301f2926f5d310b319a221500 13-Sep-2016 Samuel Pitoiset <samuel.pitoiset@gmail.com> nvc0/ir: fix comments about instructions info

The comment for the commutative flags was wrong because OP_MUL is
before OP_MAD. While we are at it add missing opcodes, and fix
the comment about the short forms.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>
/external/mesa3d/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nvc0.cpp
ae7eb93e6caaf2a75fdaab071c0e5e9883376a82 13-Aug-2016 Karol Herbst <karolherbst@gmail.com> nvc0/ir: allow min/max instructions to be dual-issued in pairs

changes for GpuTest /test=pixmark_piano /benchmark /no_scorebox /msaa=0
/benchmark_duration_ms=60000 /width=1024 /height=640:

inst_executed: 1.03G
inst_issued1: 614M -> 580M
inst_issued2: 213M -> 230M

score: 1021 -> 1030

Signed-off-by: Karol Herbst <karolherbst@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
/external/mesa3d/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nvc0.cpp
d0cf7a6beb4470d945bccb4e753cc7eb6ca5dda8 13-Aug-2016 Karol Herbst <karolherbst@gmail.com> nvc0/ir: don't dual-issue ops that depend or interfere with each other

Signed-off-by: Karol Herbst <karolherbst@gmail.com>
Reviewed-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de>
[imirkin: rewrite to split up the helpers and move more logic to target]
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
/external/mesa3d/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nvc0.cpp
2aa1197eee442ab960f6ad6b84d4cf58511d6cb7 25-Apr-2016 Hans de Goede <hdegoede@redhat.com> nouveau: Add support for SV_WORK_DIM

Add support for SV_WORK_DIM for nvc0 and nve4.

Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
/external/mesa3d/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nvc0.cpp
1f895caba0accc0af3e637d6193ac0b673ce98bc 28-May-2016 Ilia Mirkin <imirkin@alum.mit.edu> nvc0/ir: limit max number of regs based on availability in SM

This effectively limits registers to 32 and 64 for fermi and kepler when
1024 threads are used, but allows the full amount to be used with
smaller thread sizes.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
/external/mesa3d/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nvc0.cpp
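
The limit described in 1f895ca amounts to dividing the SM register file among the threads and clamping to the per-thread cap. The sketch below shows that arithmetic only. The struct, the function name, and the figures used in the usage note (32768 registers per SM with a cap of 63 for Fermi, 65536 with a cap of 255 for a GK110-class Kepler) are assumptions, not values taken from this log.

```cpp
#include <algorithm>

// Hypothetical description of one SM: total 32-bit registers in the register
// file and the largest register count a single thread may use.
struct SmLimits {
   unsigned regsPerSm;
   unsigned isaMaxRegs;
};

// Registers available to each thread when `threads` threads share one SM.
// Assumes threads is non-zero.
inline unsigned maxRegsPerThread(const SmLimits &sm, unsigned threads)
{
   return std::min(sm.isaMaxRegs, sm.regsPerSm / threads);
}
```

Under those assumed figures, 1024 threads give 32 registers on Fermi and 64 on Kepler, matching the numbers in the commit message, while smaller thread counts fall back to the per-thread cap.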
5e32cc91921209ed27027c57d6bff3d25e189e5a 21-May-2016 Samuel Pitoiset <samuel.pitoiset@gmail.com> nv50/ir: fix a comment in canDualIssue()

Trivial.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
/external/mesa3d/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nvc0.cpp
61d52a5fb9379eede3bf68b011f9477176341ee9 17-Mar-2016 Hans de Goede <hdegoede@redhat.com> nouveau: codegen: Use FILE_MEMORY_BUFFER for buffers

Some of the lowering steps we currently do for FILE_MEMORY_GLOBAL only
apply to buffers, making it impossible to use FILE_MEMORY_GLOBAL for
OpenCL global buffers.

This commit changes the buffer code to use FILE_MEMORY_BUFFER at the
ir_from_tgsi and lowering steps, freeing FILE_MEMORY_GLOBAL
for use with OpenCL global buffers.

Note that after lowering, buffer accesses use the FILE_MEMORY_GLOBAL
register file.

Tested with piglit on a gf119 and a gk107:
./piglit run -o shader -t '.*arb_shader_storage_buffer_object.*' results/shader
[9/9] pass: 9 /
./piglit run -o shader -t '.*arb_compute_shader.*' results/shader
[20/20] skip: 4, pass: 16 |

Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
/external/mesa3d/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nvc0.cpp
37b67db6ae34fb6586d640a7a1b6232f091dd812 11-Jan-2016 Ilia Mirkin <imirkin@alum.mit.edu> nvc0/ir: be careful about propagating very large offsets into const load

Indirect constbuf indexing works by using very large offsets. However if
an indirect constbuf index load is const-propagated, it becomes a very
large const offset. Take that into account when legalizing the SSA by
moving the high parts of that offset into the file index. Also disallow
very large (or small) indices on most other instructions.

This fixes regressions in ubo_array_indexing/*-two-arrays piglit tests.

Fixes: abd326e81b (nv50/ir: propagate indirect loads into instructions)
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
/external/mesa3d/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nvc0.cpp
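
The legalization step in 37b67db, which moves the high part of an oversized constant offset into the file index, can be sketched as follows. This illustrates the idea only: the `ConstRef` struct, the function name, and the 0x10000-byte window per constant buffer are assumptions, and only non-negative offsets are handled.

```cpp
#include <cstdint>

// Hypothetical stand-in for a constant-memory reference: which constant
// buffer is addressed, and the byte offset inside it.
struct ConstRef {
   uint32_t fileIndex;
   uint32_t offset;
};

// Fold the bits above the assumed 16-bit offset window into the buffer
// index, leaving an offset that fits the window.
inline ConstRef legalizeConstOffset(ConstRef ref)
{
   ref.fileIndex += ref.offset >> 16;
   ref.offset &= 0xffff;
   return ref;
}
```

An offset of 0x30010 against buffer 0 becomes offset 0x10 against buffer 3, so the reference still names the same location under the assumed layout.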
517a93b346e720082e22e358b63b5dbc5c42aa09 30-Dec-2015 Ilia Mirkin <imirkin@alum.mit.edu> nvc0: add ARB_shader_draw_parameters support

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
/external/mesa3d/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nvc0.cpp
f97f755192210ce3690e67abccefa133d398d373 08-Dec-2015 Ilia Mirkin <imirkin@alum.mit.edu> nvc0/ir: fix up mul+add -> mad algebraic opt, enable for integers

For some reason this has been disabled for integers ever since codegen
was merged, despite there being emission code for IMAD. Seems to work.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
/external/mesa3d/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nvc0.cpp
2b98914fe01f1c7b2de8a096c8923b3ab0a69578 04-Dec-2015 Ilia Mirkin <imirkin@alum.mit.edu> nv50/ir: avoid looking at uninitialized srcMods entries

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org>
/external/mesa3d/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nvc0.cpp
9f2f8bda6e060cb85f6e099a4ad65c58cde36ba0 05-Nov-2015 Hans de Goede <hdegoede@redhat.com> nvc0/ir: Teach insnCanLoad about double immediates

Teach insnCanLoad about double immediates. Together with the
"Add support for merge-s to the ConstantFolding pass" commit,
this turns the following (nvc0) code:
1: mov u32 $r2 0x00000000 (8)
2: mov u32 $r3 0x3fe00000 (8)
3: add f64 $r0d $r0d $r2d (8)

Into:
1: add f64 $r0d $r0d 0.500000 (8)

Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
/external/mesa3d/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nvc0.cpp
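
The folded immediate in 9f2f8bd can be checked by hand: the two 32-bit moves supply the low and high halves of one 64-bit value. The sketch below assumes IEEE 754 binary64 doubles and assumes that $r2 holds the low word of $r2d and $r3 the high word. The function name is hypothetical.

```cpp
#include <cstdint>
#include <cstring>

// Combine two 32-bit halves into one 64-bit pattern and reinterpret it as a
// double, mirroring what merging the two u32 moves produces.
inline double mergeToDouble(uint32_t lo, uint32_t hi)
{
   const uint64_t bits = (static_cast<uint64_t>(hi) << 32) | lo;
   double d;
   std::memcpy(&d, &bits, sizeof d);
   return d;
}
```

With the values from the listing, `mergeToDouble(0x00000000, 0x3fe00000)` gives 0.5, the immediate that appears in the folded `add f64`.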
7e0036a49258326cc2d875f2960d18c6b3665036 24-Jul-2015 Ilia Mirkin <imirkin@alum.mit.edu> nvc0/ir: tess factors are now sysvals, adapt codegen to expect that

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
/external/mesa3d/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nvc0.cpp
88127874a3eacd379f3c721bbdacdbdad4d03125 07-Jul-2014 Ilia Mirkin <imirkin@alum.mit.edu> nvc0/ir: no instruction can load a double immediate

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
/external/mesa3d/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nvc0.cpp
874a9396c5adfdcff63139bf6ababb55c1253402 06-Sep-2014 Ilia Mirkin <imirkin@alum.mit.edu> nv50/ir: avoid array overrun when checking for supported mods

Reported by Coverity

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.2 10.3" <mesa-stable@lists.freedesktop.org>
/external/mesa3d/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nvc0.cpp
ecee4c42292f8e38e59d2ee2d3513694ec58406a 27-May-2014 Alexandre Courbot <acourbot@nvidia.com> nvc0/ir: use SM35 ISA with GK20A

GK20A is mostly compatible with GK104, but uses the SM35 ISA. Use
the GK110 path when this chip is detected.

Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
/external/mesa3d/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nvc0.cpp
7b9475fa652b9df6d599edbea8fa5049fdd995e1 09-May-2014 Ben Skeggs <bskeggs@redhat.com> nvc0: maxwell isa has no per-instruction join modifier

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
/external/mesa3d/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nvc0.cpp
0079a375a58b288caacc2721f5a34b8f1233e7d1 09-May-2014 Ben Skeggs <bskeggs@redhat.com> nvc0: allow for easier modification of compiler library routines

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
/external/mesa3d/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nvc0.cpp
b4b20d42f6a8cd5aec3ba529a0b8d6ea22e73305 26-Apr-2014 Ilia Mirkin <imirkin@alum.mit.edu> nvc0/ir: add support for new bitfield manipulation opcodes

This adds support for:

IBFE, UBFE, BFI, LSB, IMSB, UMSB, BREV, POPC

Which are all required for ARB_gs5 support.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
/external/mesa3d/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nvc0.cpp
b3a2398aded19e25124a4a1d228eb3843827f6b2 24-Apr-2014 Ilia Mirkin <imirkin@alum.mit.edu> nvc0/ir: add support for SAMPLEMASK sysval

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
/external/mesa3d/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nvc0.cpp
af38ef907c89ecb1125bf258cafa0793f79a5eb7 21-Apr-2014 Ilia Mirkin <imirkin@alum.mit.edu> nvc0: add support for PIPE_CAP_SAMPLE_SHADING

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
/external/mesa3d/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nvc0.cpp
34bf5e27c6d798bcaa63c7541ecea1d3e99fdd3b 14-Mar-2014 Ilia Mirkin <imirkin@alum.mit.edu> nv50/ir/gk110: add 64/128-bit fetch/export support

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
/external/mesa3d/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nvc0.cpp
92ceb327bad73cfde0b68aafb3921067351617fd 06-Dec-2013 Ben Skeggs <bskeggs@redhat.com> nvc0: fixup gk110 and up not being listed in various switch statements

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
/external/mesa3d/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nvc0.cpp
5eb7ff1175a644ffe3b0f1a75cb235400355f9fb 20-Aug-2013 Johannes Obermayr <johannesobermayr@gmx.de> Move nv30, nv50 and nvc0 to nouveau.

It is planned to ship openSUSE 13.1 with -shared libs.
nouveau.la, nv30.la, nv50.la and nvc0.la are currently LIBADDs in all nouveau
related targets.
This change makes it possible to easily build one shared libnouveau.so which is
then LIBADDed.
Also dlopen will be faster for one library instead of three and build time on
-jX will be reduced.

Whitespace fixes were requested by 'git am'.

Signed-off-by: Johannes Obermayr <johannesobermayr@gmx.de>
Acked-by: Christoph Bumiller <christoph.bumiller@speed.at>
Acked-by: Ian Romanick <ian.d.romanick@intel.com>
/external/mesa3d/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nvc0.cpp