300b5ad023962ee95322e890a9ba57396392407e |
|
10-Oct-2016 |
Ilia Mirkin <imirkin@alum.mit.edu> |
nv50/ir: copy over value's register id when resolving merge of a phi The offset needs to be properly copied over to the phi value, otherwise it will get assigned to the base of the merge instead of the proper location. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Cc: mesa-stable@lists.freedesktop.org
/external/mesa3d/src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp
|
d8bcd3ef3723e14a9deabd1cab35b13d80fbbcea |
|
03-Oct-2016 |
Karol Herbst <karolherbst@gmail.com> |
nv50/ra: let simplify return an error and handle that. Fixes a crash in the case simplify reports an error. Signed-off-by: Karol Herbst <karolherbst@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
/external/mesa3d/src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp
|
e14cb05ce138ffa4828a809509f975abd103d3a9 |
|
19-May-2016 |
Samuel Pitoiset <samuel.pitoiset@gmail.com> |
gm107/ra: fix constraints for surface operations Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
/external/mesa3d/src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp
|
0d911a720da9677ad0410fdfeab8e81546427102 |
|
09-Jul-2016 |
Ben Skeggs <bskeggs@redhat.com> |
nvc0: initial support for GP100 GPUs Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
/external/mesa3d/src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp
|
d873608bcf97cddaaca396d29f065657c1f63039 |
|
31-May-2016 |
Ilia Mirkin <imirkin@alum.mit.edu> |
nv50/ir: print relevant file's bitset when showing RA info Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
/external/mesa3d/src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp
|
8aa1fd321dd26a9e3cd348f218f102a6debebe92 |
|
27-Apr-2016 |
Samuel Pitoiset <samuel.pitoiset@gmail.com> |
nv50/ir: fix tex constraints for surface coords on Fermi Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
/external/mesa3d/src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp
|
be4caaf247d8a9908d50e31037421d22bed7a2d6 |
|
31-Jan-2016 |
Ilia Mirkin <imirkin@alum.mit.edu> |
nv50/ir: use moveSources to condense sources This makes sure that rIndirectSrc and other things stay updated. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
/external/mesa3d/src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp
|
70834d05cd2ac6ccceff3a8cbf7c797c6d3679ba |
|
20-May-2016 |
Samuel Pitoiset <samuel.pitoiset@gmail.com> |
nv50/ir: fix SUSTx constraints on Kepler To prevent out-of-bounds access and format mismatch we add a predicate on sustp, but we have to account for it when the sources are condensed because a predicate is a source. Using the range 3:6 will only condense the input data and it's always the case. This also fixes constraints when an indirect access is used. This ensures that sources are correctly aligned. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
/external/mesa3d/src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp
|
08f4faa542452a3dbdcc3fd960e75ec043b10390 |
|
05-Apr-2016 |
Samuel Pitoiset <samuel.pitoiset@gmail.com> |
nvc0/ir: fix constraints for OP_SUSTx on Kepler Destination type is actually always 32-bits, so typeSizeof() returns 4 and no sources are condensed. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
/external/mesa3d/src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp
|
121a0cedc80c5541d51599383486ba9a7397c6ce |
|
19-Apr-2016 |
Jose Fonseca <jfonseca@vmware.com> |
Revert "nv50/ra: `isinf()` is in namespace `std` since C++11." This reverts commit f525db6358fbaa7b4296d2e6484e0b1ae703ac78. It was superseded by commit 649704f1f7c9e1d0990d34a76154b2eb656bee42.
/external/mesa3d/src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp
|
f525db6358fbaa7b4296d2e6484e0b1ae703ac78 |
|
18-Mar-2016 |
Pierre Moreau <pierre.morrow@free.fr> |
nv50/ra: `isinf()` is in namespace `std` since C++11. This fixes a compile error while building Nouveau with C++11 enabled (and glibc >= 2.23). This happens if SWR is enabled, as it forces C++11. Signed-off-by: Pierre Moreau <pierre.morrow@free.fr> Signed-off-by: Jose Fonseca <jfonseca@vmware.com> https://bugs.freedesktop.org/show_bug.cgi?id=94907
/external/mesa3d/src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp
|
f96a403bc3e1ef45f92621e9ace48cf757db4059 |
|
19-Mar-2016 |
Pierre Moreau <pierre.morrow@free.fr> |
nv50/ir: Check for valid insn instead of def size This fixes a null pointer dereference during the register allocation pass, if a function had arguments. Functions arguments get a definition from the function itself, a definition which is therefore not linked to any instruction. If a value ends up having a definition but no linked instruction, the register allocation pass doesn't need to consider whether that value is generated by an instruction that can only handle "short" registers (on nv50). Signed-off-by: Pierre Moreau <pierre.morrow@free.fr>
/external/mesa3d/src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp
|
33ace5544e755b74ac7c02a7d590f3c64139cc3a |
|
15-Feb-2016 |
Ben Skeggs <bskeggs@redhat.com> |
nvc0: initial support for GM20x GPUs Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>
/external/mesa3d/src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp
|
19ae5de981e014e1b366b4652e14eb1ea0421574 |
|
26-Jan-2016 |
Karol Herbst <git@karolherbst.de> |
nv50/ir: fix memory corruption when spilling and redoing RA When RA fails, and we spill, we have to clean everything up before doing RA again. We were forgetting to reset the hi/lo linked lists - at least the hi list is guaranteed to still have pointers to now-deleted RIG nodes. Signed-off-by: Karol Herbst <nouveau@karolherbst.de> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: mesa-stable@lists.freedesktop.org
/external/mesa3d/src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp
|
ab70ea1353ac9859ee51d236482fe92a0493362d |
|
11-Dec-2015 |
Ilia Mirkin <imirkin@alum.mit.edu> |
nv50/ir: add short imad support Support emission of the short imad, but also include it in the various logic that tries to make it possible to emit. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
/external/mesa3d/src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp
|
44260d908062a4771c30ab635dd527f4266dbaec |
|
09-Dec-2015 |
Ilia Mirkin <imirkin@alum.mit.edu> |
nv50/ir: prefer to color mad def and src2 with the same color
This allows us to use the short encoding, and potentially fold immediates in later on.
total instructions in shared programs : 6379731 -> 6367861 (-0.19%)
total gprs used in shared programs    : 728502 -> 728683 (0.02%)
total local used in shared programs   : 9904 -> 9904 (0.00%)
total bytes used in shared programs   : 44661008 -> 44154976 (-1.13%)
          local    gpr   inst  bytes
helped        0     51   7267  20306
hurt          0    232    125    274
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
/external/mesa3d/src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp
|
c1c1248b94e17a1a4fa0e6f353377efa99efe602 |
|
09-Dec-2015 |
Ilia Mirkin <imirkin@alum.mit.edu> |
nv50/ir: reduce degree limit on ops that can't encode large reg dests Operations that take immediates can only encode registers up to 64. This fixes a shader in a "Powered by Unity" intro. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
/external/mesa3d/src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp
|
99581ca393037e10d17aab1f4c90ff2bdb1ec557 |
|
07-Dec-2015 |
Ilia Mirkin <imirkin@alum.mit.edu> |
nv50/ir: only unspill once ahead of a group of instructions
We already semi-did this, but the list of uses was unsorted, so it was unreliable. Sort the uses by bb and serial, and don't unspill for each instruction in a sequence. (And also don't unspill multiple times for a single instruction that uses the value in question multiple times.) This causes a minor reduction in generated instructions for shader-db (as few programs spill), but more importantly it brings determinism to each run's output.
On SM10:
total instructions in shared programs : 6387945 -> 6379359 (-0.13%)
total gprs used in shared programs    : 728544 -> 728544 (0.00%)
total local used in shared programs   : 9904 -> 9904 (0.00%)
          local    gpr   inst  bytes
helped        0      0    322    322
hurt          0      0      0      0
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
/external/mesa3d/src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp
|
6c6f28c35e793e098757cfa8fbc860961d52f9e7 |
|
26-Nov-2015 |
Ilia Mirkin <imirkin@alum.mit.edu> |
nv50/ir: fix moves to/from flags Noticed this when looking at a trace that caused flags to spill to/from registers. The flags source/destination wasn't encoded correctly according to both envydis and nvdisasm. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
/external/mesa3d/src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp
|
c672bf3b043ffd1b29d796f9c52a79d1014397ae |
|
29-Nov-2015 |
Samuel Pitoiset <samuel.pitoiset@gmail.com> |
nv50/ir: do not call textureMask() for surface ops That texture mask thing doesn't seem to be needed for surface ops, so, just as on nve4+, do that only for texture ops. This fixes a segfault with 'test_surface_st' from gallium/tests/trivial/compute.c on Fermi because this test uses sustp. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
/external/mesa3d/src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp
|
4deb118d06e96731f3481daa72c201d7258bfbbb |
|
18-Apr-2015 |
Ilia Mirkin <imirkin@alum.mit.edu> |
nv50/ir: fix (un)spilling of 3-wide results There are no 96-bit load/store operations, so we have to split it up into 32-bit parts, with a split/merge around it. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=90348 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org>
/external/mesa3d/src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp
|
d31005e3e5588b20760c774f14ac0ea80375a181 |
|
15-Oct-2015 |
Chih-Wei Huang <cwhuang@android-x86.org> |
nv50/ir: use C++11 standard std::unordered_map if possible Note that Android versions before Lollipop are not supported. Signed-off-by: Chih-Wei Huang <cwhuang@linux.org.tw> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: mesa-stable@lists.freedesktop.org
/external/mesa3d/src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp
|
a072ef8748a65d286e9b542bb9ea6e020fdcc7f8 |
|
10-Sep-2015 |
Ilia Mirkin <imirkin@alum.mit.edu> |
nv50/ir: make edge splitting fix up phi node sources Unfortunately nv50_ir phi nodes aren't directly connected to the CFG, so the mapping between source and the actual BB is by inbound edge order. So when manipulating edges one has to be extremely careful. We were insufficiently careful when splitting critical edges which resulted in the phi nodes being confused as to where their sources were coming from. This primarily manifests itself with the TXL-lowering logic on nv50, when it is inside of a conditional. I've been unable to trigger the issue anywhere else so far. This resolves rendering failures in a number of games like Two Worlds 2, Trine: Enchanted Edition, Trine 2, XCOM:Enemy Unknown, Stacking. It also improves the situation in Hearthstone, Sonic Generations, and The Raven: Legacy of a Master Thief. However more work needs to be done there (splitting a lot more edges solves it, so it's some other sort of RA-related issue). Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=90887 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "11.0" <mesa-stable@lists.freedesktop.org>
/external/mesa3d/src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp
|
2a4af36517333ef61d5f7ca2264fec3f49ee3662 |
|
19-Jun-2015 |
Chih-Wei Huang <cwhuang@android-x86.org> |
nv50/ir: support different unordered_set implementations If built with the C++11 standard, use std::unordered_set. Otherwise, if built on an old Android version with stlport, use std::tr1::unordered_set with a wrapper class. Otherwise use std::tr1::unordered_set. Signed-off-by: Chih-Wei Huang <cwhuang@linux.org.tw> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
/external/mesa3d/src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp
|
77672cdb64e9c19e974fe5985050709fc317498e |
|
23-Jul-2015 |
Ilia Mirkin <imirkin@alum.mit.edu> |
nvc0/ir: add hazard for 2nd dim of vfetch/load indirect argument Apparently a multi-word load can potentially overwrite the indirect sources, so make sure that RA picks different registers for those. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
/external/mesa3d/src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp
|
0532a5fd00cdddda0fd1727fb519cb4312f47e83 |
|
25-Sep-2014 |
Ilia Mirkin <imirkin@alum.mit.edu> |
gm107/ir: fix texture argument order Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "10.3" <mesa-stable@lists.freedesktop.org>
/external/mesa3d/src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp
|
0147c10c5f00b43696ba660aab604d674a75e83c |
|
25-Sep-2014 |
Ilia Mirkin <imirkin@alum.mit.edu> |
nv50/ir: avoid deleting pseudo instructions too early What happens is that a SPLIT operation is part of the spill node, and as a pseudo op, the instruction gets erased after processing its first def. However the later defs still need to refer to it, so instead delay deleting until after that whole RA node is done processing. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=79462 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "10.2 10.3" <mesa-stable@lists.freedesktop.org>
/external/mesa3d/src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp
|
b9f9e3ce03dbd8d044a72a00e1e8856a500b5f72 |
|
05-Sep-2014 |
Christoph Bumiller <e0425955@student.tuwien.ac.at> |
nv50/ir/util: fix BitSet issues BitSet::allocate() is being used with the expectation that it would leave the bitfield untouched if its size hasn't changed, however, the function always zeroed the last word, which led to obscure bugs with live set computation. This also fixes BitSet::resize(), which was broken, but luckily not being used. Cc: "10.2 10.3" <mesa-stable@lists.freedesktop.org> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
/external/mesa3d/src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp
|
dfb0ca16065c1d251101bb094f2cfd08cf3cda15 |
|
17-Jul-2014 |
Ilia Mirkin <imirkin@alum.mit.edu> |
nv50/ir: fix phi/union sources when their def has been merged In a situation where double-register values are used, the phi nodes can still end up being u32 values. They all get merged into one RA node though. When fixing up the merge (which comes after the phi node), the phi node's def would get fixed, but not its sources which would remain at the low register value. This maintains the invariant that a phi node's defs and sources are allocated the same register. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
/external/mesa3d/src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp
|
32702cceed6d6e0a83cb21821ee571e02d1d24fd |
|
17-Jul-2014 |
Ilia Mirkin <imirkin@alum.mit.edu> |
nv50/ir: fix hard-coded TYPE_U32 sized register Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
/external/mesa3d/src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp
|
a9b21015f5e3a6a37e53a8b3c755519f7b70479e |
|
08-Jul-2014 |
Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de> |
nv50/ir: use unordered_set instead of list to keep track of var uses The set of variable uses does not need to be ordered in any way, and removing/adding elements is a fairly common operation in various optimization passes. This shortens runtime of piglit test fp-long-alu to ~22s from ~4h Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
/external/mesa3d/src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp
|
4ebaabcccb125e3d29ab6e6ac3d23897287d7574 |
|
13-May-2014 |
Ilia Mirkin <imirkin@alum.mit.edu> |
nv50/ir: make sure that texprep/texquerylod's args get coalesced Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "10.2" <mesa-stable@lists.freedesktop.org> Reviewed-by: Ben Skeggs <bskeggs@redhat.com>
/external/mesa3d/src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp
|
d548d47edf9f05e6dbf9656abc2f8e78d02cb2f6 |
|
09-May-2014 |
Ben Skeggs <bskeggs@redhat.com> |
nvc0: add maxwell (sm50) compiler backend The big missing part here is proper sched data calculations, but hopefully the chosen placeholder will be sufficient for now. Passes piglit as well as GK107 does. Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
/external/mesa3d/src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp
|
19ba573a57ff6125a26ff9ae94cf43c36129645f |
|
20-Mar-2014 |
Ilia Mirkin <imirkin@alum.mit.edu> |
nvc0/ir: move sample id to second source arg to fix sampler2DMS The nvc0 texfetch instruction expects the sample id to be in the second source (usually used for the offset) rather than as part of the texture coordinate. This fixes all the sampler2DMS/Array tests on nvc0. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Christoph Bumiller <e0425955@student.tuwien.ac.at> Cc: "10.1" <mesa-stable@lists.freedesktop.org>
/external/mesa3d/src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp
|
1f4bfb8797d2f851f5e113f85bcbff519977fd99 |
|
19-Feb-2014 |
Christoph Bumiller <e0425955@student.tuwien.ac.at> |
nv50/ir/ra: fix SpillCodeInserter::offsetSlot usage We were turning non-memory spill slots into NULL. Cc: 10.1 <mesa-stable@lists.freedesktop.org>
/external/mesa3d/src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp
|
2e9ee44797fcce10e2f11ecb8520655f1e30280a |
|
08-Feb-2014 |
Christoph Bumiller <e0425955@student.tuwien.ac.at> |
nv50/ir/ra: some register spilling fixes Cc: 10.1 <mesa-stable@lists.freedesktop.org>
/external/mesa3d/src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp
|
92ceb327bad73cfde0b68aafb3921067351617fd |
|
06-Dec-2013 |
Ben Skeggs <bskeggs@redhat.com> |
nvc0: fixup gk110 and up not being listed in various switch statements Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
/external/mesa3d/src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp
|
5eb7ff1175a644ffe3b0f1a75cb235400355f9fb |
|
20-Aug-2013 |
Johannes Obermayr <johannesobermayr@gmx.de> |
Move nv30, nv50 and nvc0 to nouveau. It is planned to ship openSUSE 13.1 with -shared libs. nouveau.la, nv30.la, nv50.la and nvc0.la are currently LIBADDs in all nouveau related targets. This change makes it possible to easily build one shared libnouveau.so which is then LIBADDed. Also dlopen will be faster for one library instead of three and build time on -jX will be reduced. Whitespace fixes were requested by 'git am'. Signed-off-by: Johannes Obermayr <johannesobermayr@gmx.de> Acked-by: Christoph Bumiller <christoph.bumiller@speed.at> Acked-by: Ian Romanick <ian.d.romanick@intel.com>
/external/mesa3d/src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp
|