a991960ca9bb3327624853ca931c104a423cebd8 |
|
19-Mar-2017 |
Karol Herbst <karolherbst@gmail.com> |
nvc0/ir: treat FMA like MAD for operand propagation

Helps mainly Feral-ported games, due to their use of fma()

shader-db changes:
total instructions in shared programs : 3901147 -> 3842505 (-1.50%)
total gprs used in shared programs    : 471258 -> 467359 (-0.83%)
total local used in shared programs   : 27405 -> 27361 (-0.16%)
total bytes used in shared programs   : 35749888 -> 35214176 (-1.50%)

          local    gpr   inst  bytes
helped       17   1829   4091   4091
hurt          4     44      3      3

Signed-off-by: Karol Herbst <karolherbst@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 09f16de7e624938d46a63b8285fc5b21050962e9)
/external/mesa3d/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nvc0.cpp
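The percentages in the shader-db summary above can be checked by hand. The sketch below assumes they are plain relative changes between the two totals; `percentChange` is a hypothetical helper written for this note and is not part of shader-db or Mesa.

```cpp
#include <cstdint>

// Hypothetical helper: relative change between two totals, in percent.
// A negative result means the total shrank, which is the desired
// direction for instruction, GPR, local and byte counts.
constexpr double percentChange(std::uint64_t before, std::uint64_t after)
{
   return (static_cast<double>(after) - static_cast<double>(before)) * 100.0 /
          static_cast<double>(before);
}
```

For example, `percentChange(3901147, 3842505)` evaluates to roughly -1.50, in line with the instruction-count figure quoted in the commit message.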
|
948cce01964c1dd7365c49381f9a6cf1b6e5f7f9 |
|
25-Nov-2016 |
Samuel Pitoiset <samuel.pitoiset@gmail.com> |
gm107/ir: do not combine CONST loads This allows using MOV instead of LD. The main advantage is that MOV doesn't require a read dependency barrier while LD does, so this reduces both barrier pressure and the number of stall counts needed to read data from constant memory. This is currently only for user uniform accesses. I should do something similar when loading from the driver constant buffer, but that seems a bit tricky to handle for now. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
/external/mesa3d/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nvc0.cpp
|
3abe68b8282496688186157b51da5600ac540906 |
|
14-Sep-2016 |
Samuel Pitoiset <samuel.pitoiset@gmail.com> |
nv50/ir: teach insnCanLoad() about SHLADD

Commutativity is not allowed with SHLADD, but src2 can accept loads. To allow the load propagation pass to do its job, add a special case like for SUCLAMP because src1 is always an immediate.

This IMAD to SHLADD optimization helps a bunch of shaders from Tomb Raider, Victor Vran, UE4 demos (+15% perf with Elemental) and Shadow Warrior.

GF100/GK104:
total instructions in shared programs : 2838045 -> 2834712 (-0.12%)
total gprs used in shared programs    : 396684 -> 396386 (-0.08%)
total local used in shared programs   : 34416 -> 34416 (0.00%)

          local    gpr   inst  bytes
helped        0    326   1105   1105
hurt          0     55      3      3

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
/external/mesa3d/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nvc0.cpp
|
85132c7453230960f34cfe7b7b7fcaaab158d79f |
|
14-Sep-2016 |
Samuel Pitoiset <samuel.pitoiset@gmail.com> |
nv50/ir: add preliminary support for SHLADD This instruction has been available since SM20 (Fermi) and allows doing (a << b) + c in one shot. In some situations, IMAD should be replaced by SHLADD when b is a power of 2, and ADD+SHL should be replaced by SHLADD as well. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
/external/mesa3d/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nvc0.cpp
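The SHLADD semantics quoted above, (a << b) + c, and the IMAD replacement can be illustrated with a small sketch. This is not the Mesa implementation: `shladd`, `imad`, `isPow2`, `log2Pow2` and `madLowered` are hypothetical names, and the sketch assumes 32-bit unsigned operands with wrap-around arithmetic.

```cpp
#include <cstdint>

// Semantics as stated in the commit message: (a << b) + c.
// The shift amount b is assumed to be below 32.
constexpr std::uint32_t shladd(std::uint32_t a, std::uint32_t b, std::uint32_t c)
{
   return (a << b) + c;
}

// Integer multiply-add, a * m + c, with the usual unsigned wrap-around.
constexpr std::uint32_t imad(std::uint32_t a, std::uint32_t m, std::uint32_t c)
{
   return a * m + c;
}

constexpr bool isPow2(std::uint32_t x)
{
   return x != 0 && (x & (x - 1)) == 0;
}

// log2 of a power of two (result is only meaningful when isPow2(x) holds).
constexpr std::uint32_t log2Pow2(std::uint32_t x)
{
   return x <= 1 ? 0 : 1 + log2Pow2(x >> 1);
}

// When the multiplier is a power of two, the multiply-add can be expressed
// as a shift-add; otherwise fall back to the plain multiply-add.
constexpr std::uint32_t madLowered(std::uint32_t a, std::uint32_t m, std::uint32_t c)
{
   return isPow2(m) ? shladd(a, log2Pow2(m), c) : imad(a, m, c);
}
```

With a multiplier of 8, `madLowered(7, 8, 5)` takes the shift-add path as `shladd(7, 3, 5)` and gives the same value as `imad(7, 8, 5)`.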
|
9b8b69b3c4671b2301f2926f5d310b319a221500 |
|
13-Sep-2016 |
Samuel Pitoiset <samuel.pitoiset@gmail.com> |
nvc0/ir: fix comments about instructions info The comment for the commutative flags was wrong because OP_MUL is before OP_MAD. While we are at it add missing opcodes, and fix the comment about the short forms. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>
/external/mesa3d/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nvc0.cpp
|
ae7eb93e6caaf2a75fdaab071c0e5e9883376a82 |
|
13-Aug-2016 |
Karol Herbst <karolherbst@gmail.com> |
nvc0/ir: allow min/max instructions to be dual-issued in pairs

changes for GpuTest /test=pixmark_piano /benchmark /no_scorebox /msaa=0 /benchmark_duration_ms=60000 /width=1024 /height=640:

inst_executed: 1.03G
inst_issued1: 614M -> 580M
inst_issued2: 213M -> 230M
score: 1021 -> 1030

Signed-off-by: Karol Herbst <karolherbst@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
/external/mesa3d/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nvc0.cpp
|
d0cf7a6beb4470d945bccb4e753cc7eb6ca5dda8 |
|
13-Aug-2016 |
Karol Herbst <karolherbst@gmail.com> |
nvc0/ir: don't dual-issue ops that depend or interfere with each other Signed-off-by: Karol Herbst <karolherbst@gmail.com> Reviewed-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de> [imirkin: rewrite to split up the helpers and move more logic to target] Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
/external/mesa3d/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nvc0.cpp
|
2aa1197eee442ab960f6ad6b84d4cf58511d6cb7 |
|
25-Apr-2016 |
Hans de Goede <hdegoede@redhat.com> |
nouveau: Add support for SV_WORK_DIM Add support for SV_WORK_DIM for nvc0 and nve4. Signed-off-by: Hans de Goede <hdegoede@redhat.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
/external/mesa3d/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nvc0.cpp
|
1f895caba0accc0af3e637d6193ac0b673ce98bc |
|
28-May-2016 |
Ilia Mirkin <imirkin@alum.mit.edu> |
nvc0/ir: limit max number of regs based on availability in SM This effectively limits registers to 32 and 64 for Fermi and Kepler, respectively, when 1024 threads are used, but allows the full amount to be used with smaller thread counts. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
/external/mesa3d/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nvc0.cpp
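The limit described above amounts to dividing the SM's register file among the threads and capping the result at the per-thread maximum. The sketch below is an illustration only: `maxRegsPerThread` is a hypothetical name, and the register-file sizes (32768 and 65536) and per-thread caps (63 and 255) used in the example are assumptions picked to reproduce the 32 and 64 figures from the commit message, not values taken from the source.

```cpp
// Hypothetical helper: registers available to each thread.
//   isaLimit - assumed per-thread cap imposed by the instruction encoding
//   smRegs   - assumed number of registers in one SM
//   threads  - threads sharing that SM (must be non-zero)
constexpr unsigned maxRegsPerThread(unsigned isaLimit, unsigned smRegs, unsigned threads)
{
   return (smRegs / threads) < isaLimit ? (smRegs / threads) : isaLimit;
}
```

Under these assumptions, `maxRegsPerThread(63, 32768, 1024)` is 32 and `maxRegsPerThread(255, 65536, 1024)` is 64, while a smaller thread count such as 256 leaves the full per-thread cap available.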
|
5e32cc91921209ed27027c57d6bff3d25e189e5a |
|
21-May-2016 |
Samuel Pitoiset <samuel.pitoiset@gmail.com> |
nv50/ir: fix a comment in canDualIssue() Trivial. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
/external/mesa3d/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nvc0.cpp
|
61d52a5fb9379eede3bf68b011f9477176341ee9 |
|
17-Mar-2016 |
Hans de Goede <hdegoede@redhat.com> |
nouveau: codegen: Use FILE_MEMORY_BUFFER for buffers

Some of the lowering steps we currently do for FILE_MEMORY_GLOBAL only apply to buffers, making it impossible to use FILE_MEMORY_GLOBAL for OpenCL global buffers.

This commit changes the buffer code to use FILE_MEMORY_BUFFER at the ir_from_tgsi and lowering steps, freeing use of FILE_MEMORY_GLOBAL for use with OpenCL global buffers.

Note that after lowering buffer accesses use the FILE_MEMORY_GLOBAL register file.

Tested with piglit on a gf119 and a gk107:

./piglit run -o shader -t '.*arb_shader_storage_buffer_object.*' results/shader
[9/9] pass: 9

./piglit run -o shader -t '.*arb_compute_shader.*' results/shader
[20/20] skip: 4, pass: 16

Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
/external/mesa3d/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nvc0.cpp
|
37b67db6ae34fb6586d640a7a1b6232f091dd812 |
|
11-Jan-2016 |
Ilia Mirkin <imirkin@alum.mit.edu> |
nvc0/ir: be careful about propagating very large offsets into const load Indirect constbuf indexing works by using very large offsets. However if an indirect constbuf index load is const-propagated, it becomes a very large const offset. Take that into account when legalizing the SSA by moving the high parts of that offset into the file index. Also disallow very large (or small) indices on most other instructions. This fixes regressions in ubo_array_indexing/*-two-arrays piglit tests. Fixes: abd326e81b (nv50/ir: propagate indirect loads into instructions) Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
/external/mesa3d/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nvc0.cpp
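Moving the high parts of a large constant offset into the file index can be sketched as follows. This is a simplified illustration, not the code from the commit: `ConstRef` and `foldLargeOffset` are hypothetical names, the 16-bit width of the in-buffer offset is an assumption, and negative offsets (which the commit message also mentions) are not handled here.

```cpp
#include <cstdint>

// Hypothetical description of a constant-buffer reference.
struct ConstRef {
   std::uint32_t fileIndex; // which constant buffer
   std::uint32_t offset;    // byte offset inside that buffer
};

// Assumption: only the low 16 bits are a real in-buffer offset; anything
// above that selects a different buffer and is folded into fileIndex.
constexpr ConstRef foldLargeOffset(ConstRef r)
{
   return ConstRef{ r.fileIndex + (r.offset >> 16), r.offset & 0xffffu };
}
```

Under that assumption, a reference with file index 0 and offset 0x30010 becomes file index 3 with offset 0x10, and offsets that already fit in 16 bits are left unchanged.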
|
517a93b346e720082e22e358b63b5dbc5c42aa09 |
|
30-Dec-2015 |
Ilia Mirkin <imirkin@alum.mit.edu> |
nvc0: add ARB_shader_draw_parameters support Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
/external/mesa3d/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nvc0.cpp
|
f97f755192210ce3690e67abccefa133d398d373 |
|
08-Dec-2015 |
Ilia Mirkin <imirkin@alum.mit.edu> |
nvc0/ir: fix up mul+add -> mad algebraic opt, enable for integers For some reason this has been disabled for integers ever since codegen was merged, despite there being emission code for IMAD. Seems to work. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
/external/mesa3d/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nvc0.cpp
|
2b98914fe01f1c7b2de8a096c8923b3ab0a69578 |
|
04-Dec-2015 |
Ilia Mirkin <imirkin@alum.mit.edu> |
nv50/ir: avoid looking at uninitialized srcMods entries Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org>
/external/mesa3d/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nvc0.cpp
|
9f2f8bda6e060cb85f6e099a4ad65c58cde36ba0 |
|
05-Nov-2015 |
Hans de Goede <hdegoede@redhat.com> |
nvc0/ir: Teach insnCanLoad about double immediates

Teach insnCanLoad about double immediates, together with the "Add support for merge-s to the ConstantFolding pass"

This turns the following (nvc0) code:

  1: mov u32 $r2 0x00000000 (8)
  2: mov u32 $r3 0x3fe00000 (8)
  3: add f64 $r0d $r0d $r2d (8)

Into:

  1: add f64 $r0d $r0d 0.500000 (8)

Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
/external/mesa3d/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nvc0.cpp
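The listing above loads the two 32-bit halves of a double into `$r2` and `$r3` and then reads them as the pair `$r2d`. The sketch below shows why those two words correspond to the immediate 0.5. It assumes IEEE-754 binary64 doubles and that the lower-numbered register holds the low word, as the listing suggests; `mergeToDouble` is a hypothetical helper and not a Mesa function.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper: combine the low and high 32-bit words of a register
// pair into the 64-bit floating-point value they encode.
static double mergeToDouble(std::uint32_t lo, std::uint32_t hi)
{
   const std::uint64_t bits = (static_cast<std::uint64_t>(hi) << 32) | lo;
   double value;
   static_assert(sizeof(value) == sizeof(bits), "expects a 64-bit double");
   std::memcpy(&value, &bits, sizeof(value)); // bit-exact reinterpretation
   return value;
}
```

Calling `mergeToDouble(0x00000000u, 0x3fe00000u)` yields the bit pattern 0x3fe0000000000000, which is 0.5, the immediate that appears in the folded `add f64`.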
|
7e0036a49258326cc2d875f2960d18c6b3665036 |
|
24-Jul-2015 |
Ilia Mirkin <imirkin@alum.mit.edu> |
nvc0/ir: tess factors are now sysvals, adapt codegen to expect that Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
/external/mesa3d/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nvc0.cpp
|
88127874a3eacd379f3c721bbdacdbdad4d03125 |
|
07-Jul-2014 |
Ilia Mirkin <imirkin@alum.mit.edu> |
nvc0/ir: no instruction can load a double immediate Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
/external/mesa3d/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nvc0.cpp
|
874a9396c5adfdcff63139bf6ababb55c1253402 |
|
06-Sep-2014 |
Ilia Mirkin <imirkin@alum.mit.edu> |
nv50/ir: avoid array overrun when checking for supported mods Reported by Coverity Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "10.2 10.3" <mesa-stable@lists.freedesktop.org>
/external/mesa3d/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nvc0.cpp
|
ecee4c42292f8e38e59d2ee2d3513694ec58406a |
|
27-May-2014 |
Alexandre Courbot <acourbot@nvidia.com> |
nvc0/ir: use SM35 ISA with GK20A GK20A is mostly compatible with GK104, but uses the SM35 ISA. Use the GK110 path when this chip is detected. Signed-off-by: Alexandre Courbot <acourbot@nvidia.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
/external/mesa3d/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nvc0.cpp
|
7b9475fa652b9df6d599edbea8fa5049fdd995e1 |
|
09-May-2014 |
Ben Skeggs <bskeggs@redhat.com> |
nvc0: maxwell isa has no per-instruction join modifier Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
/external/mesa3d/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nvc0.cpp
|
0079a375a58b288caacc2721f5a34b8f1233e7d1 |
|
09-May-2014 |
Ben Skeggs <bskeggs@redhat.com> |
nvc0: allow for easier modification of compiler library routines Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
/external/mesa3d/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nvc0.cpp
|
b4b20d42f6a8cd5aec3ba529a0b8d6ea22e73305 |
|
26-Apr-2014 |
Ilia Mirkin <imirkin@alum.mit.edu> |
nvc0/ir: add support for new bitfield manipulation opcodes This adds support for: IBFE, UBFE, BFI, LSB, IMSB, UMSB, BREV, POPC Which are all required for ARB_gs5 support. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
/external/mesa3d/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nvc0.cpp
|
b3a2398aded19e25124a4a1d228eb3843827f6b2 |
|
24-Apr-2014 |
Ilia Mirkin <imirkin@alum.mit.edu> |
nvc0/ir: add support for SAMPLEMASK sysval Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
/external/mesa3d/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nvc0.cpp
|
af38ef907c89ecb1125bf258cafa0793f79a5eb7 |
|
21-Apr-2014 |
Ilia Mirkin <imirkin@alum.mit.edu> |
nvc0: add support for PIPE_CAP_SAMPLE_SHADING Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
/external/mesa3d/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nvc0.cpp
|
34bf5e27c6d798bcaa63c7541ecea1d3e99fdd3b |
|
14-Mar-2014 |
Ilia Mirkin <imirkin@alum.mit.edu> |
nv50/ir/gk110: add 64/128-bit fetch/export support Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
/external/mesa3d/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nvc0.cpp
|
92ceb327bad73cfde0b68aafb3921067351617fd |
|
06-Dec-2013 |
Ben Skeggs <bskeggs@redhat.com> |
nvc0: fixup gk110 and up not being listed in various switch statements Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
/external/mesa3d/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nvc0.cpp
|
5eb7ff1175a644ffe3b0f1a75cb235400355f9fb |
|
20-Aug-2013 |
Johannes Obermayr <johannesobermayr@gmx.de> |
Move nv30, nv50 and nvc0 to nouveau. It is planned to ship openSUSE 13.1 with -shared libs. nouveau.la, nv30.la, nv50.la and nvc0.la are currently LIBADDs in all nouveau related targets. This change makes it possible to easily build one shared libnouveau.so which is then LIBADDed. Also dlopen will be faster for one library instead of three and build time on -jX will be reduced. Whitespace fixes were requested by 'git am'. Signed-off-by: Johannes Obermayr <johannesobermayr@gmx.de> Acked-by: Christoph Bumiller <christoph.bumiller@speed.at> Acked-by: Ian Romanick <ian.d.romanick@intel.com>
/external/mesa3d/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nvc0.cpp
|