ccbbda2b716bcc0dd9ad7b6c7bf9079efa3fca23 |
|
03-Jul-2015 |
Douglas Leung <douglas.leung@imgtec.com> |
Add implicit null pointer and stack overflow checks for Mips. (cherry picked from commit 22bb5a2ebc1e2724179faf4660b2735dcb185f21) Bug: 21555893 Change-Id: I2a995be128a5603d08753c14956dd8c8240ac63c
|
1095793154d2ff33323ba9edaa4f83373bdb6c8e |
|
24-Mar-2015 |
Goran Jakovljevic <Goran.Jakovljevic@imgtec.com> |
[MIPS] Refactoring code for quick compiler Code from compiler/dex/quick/mips64 is merged with the code in the mips folder. Change-Id: I785983c21549141306484647da86a0bb4815daaa
|
f6737f7ed741b15cfd60c2530dab69f897540735 |
|
23-Mar-2015 |
Vladimir Marko <vmarko@google.com> |
Quick: Clean up Mir2Lir codegen. Clean up WrapPointer()/UnwrapPointer() and OpPcRelLoad(). Change-Id: I1a91f01e1e779599c77f3f6efcac2a6ad34629cf
|
027f0ff64c2512b9a5f1f54f3fea1bec481eb0f5 |
|
28-Feb-2015 |
Douglas Leung <douglas.leung@imgtec.com> |
ART: Add Mips32r6 backend support Add Mips32r6 compiler support. Don't use deprecated Mips32r2 instructions if running in Mips32r6 mode. Change-Id: I54e689aa8c026ccb75c4af515aa2794f471c9f67
|
6ce3eba0f2e6e505ed408cdc40d213c8a512238d |
|
16-Feb-2015 |
Vladimir Marko <vmarko@google.com> |
Add suspend checks to special methods. Generate suspend checks at the beginning of special methods. If we need to call into the runtime, go to the slow path, where we create a simplified but valid frame, spill all arguments, call art_quick_test_suspend, restore the necessary arguments and return to the fast path. This keeps the fast path overhead to a minimum. Bug: 19245639 Change-Id: I3de5aee783943941322a49c4cf2c4c94411dbaa2
|
0b9203e7996ee1856f620f95d95d8a273c43a3df |
|
23-Jan-2015 |
Andreas Gampe <agampe@google.com> |
ART: Some Quick cleanup Make several fields const in CompilationUnit. May benefit some Mir2Lir code that repeats tests, and in general immutability is good. Remove compiler_internals.h and refactor some other headers to reduce overly broad imports (and thus forced recompiles on changes). Change-Id: I898405907c68923581373b5981d8a85d2e5d185a
|
d500b53ff8742f76b63c9f7593082d9e8114b85f |
|
17-Jan-2015 |
Andreas Gampe <agampe@google.com> |
ART: Some Quick cleanup Move some definitions around. In case a method is already virtual, avoid instruction-set tests. Change-Id: I8d98f098e55ade1bc0cfa32bb2aad006caccd07d
|
8ebdc2bdbbae5dd014bce8d438f0eca02bad9ff9 |
|
14-Jan-2015 |
Andreas Gampe <agampe@google.com> |
ART: Fix indentation in Mips backend Fix the indentation to be standard. Change-Id: I39a16716be3429dfef6df0a585e24423b46363a2
|
717a3e447c6f7a922cf9c3efe522747a187a045d |
|
13-Nov-2014 |
Serguei Katkov <serguei.i.katkov@intel.com> |
Re-factor Quick ABI support Now every architecture must provide a mapper between VR parameters and physical registers. Additionally, as a helper, an architecture can provide a bulk copy helper for the GenDalvikArgs utility. Everything else becomes common code: GetArgMappingToPhysicalReg, GenDalvikArgsNoRange, GenDalvikArgsRange, FlushIns. The mapper now uses the shorty representation of the input parameters, because the location alone is not enough to determine the type of a parameter (fp or core). For the details see https://android-review.googlesource.com/#/c/113936/. Change-Id: Ie762b921e0acaa936518ee6b63c9a9d25f83e434 Signed-off-by: Serguei Katkov <serguei.i.katkov@intel.com>
|
bf535be514570fc33fc0a6347a87dcd9097d9bfd |
|
19-Nov-2014 |
Vladimir Marko <vmarko@google.com> |
Add card mark to filled-new-array. Bug: 18032332 Change-Id: I35576b27f9115e4d0b02a11afc5e483b9e93a04a
|
675e09b2753c2fcd521bd8f0230a0abf06e9b0e9 |
|
23-Oct-2014 |
Ningsheng Jian <ningsheng.jian@arm.com> |
ARM: Strength reduction for floating-point division For floating-point division by power of two constants, generate multiplication by the reciprocal instead. Change-Id: I39c79eeb26b60cc754ad42045362b79498c755be
|
6a3c1fcb4ba42ad4d5d142c17a3712a6ddd3866f |
|
31-Oct-2014 |
Ian Rogers <irogers@google.com> |
Remove -Wno-unused-parameter and -Wno-sign-promo from base cflags. Fix associated errors about unused parameters and implicit sign conversions. For sign conversion this was largely in the area of enums, so add ostream operators for the affected enums and fix tools/generate-operator-out.py. Tidy arena allocation code and arena allocated data types, rather than fixing new and delete operators. Remove dead code. Change-Id: I5b433e722d2f75baacfacae4d32aef4a828bfe1b
|
5c5676b26a08454b3f0133783778991bbe5dd681 |
|
30-Sep-2014 |
Razvan A Lupusoru <razvan.a.lupusoru@intel.com> |
ART: Add div/rem zero check elimination flag Just as with other throwing bytecodes, it is possible to prove in some cases that a divide/remainder won't throw ArithmeticException. For example, when two divides with the same denominator occur in sequence, the second one provably cannot throw if the first one did not. This patch adds the elimination flag and updates the signature of several Mir2Lir methods to take the instruction optimization flags into account. Change-Id: I0b078cf7f29899f0f059db1f14b65a37444b84e8 Signed-off-by: Razvan A Lupusoru <razvan.a.lupusoru@intel.com>
|
832336b3c9eb892045a8de1bb12c9361112ca3c5 |
|
09-Oct-2014 |
Ian Rogers <irogers@google.com> |
Don't copy fill array data to quick literal pool. Currently quick copies the fill array data from the dex file to the literal pool. It then has to go through hoops to pass this PC relative address down to out-of-line code. Instead, pass the offset of the table to the out-of-line code and use the CodeItem data associated with the ArtMethod. This reduces the size of oat code while greatly simplifying it. Unify the FillArrayData implementation in quick, portable and the interpreters. Change-Id: I9c6971cf46285fbf197856627368c0185fdc98ca
|
8d0d03e24325463f0060abfd05dba5598044e9b1 |
|
07-Jun-2014 |
Razvan A Lupusoru <razvan.a.lupusoru@intel.com> |
ART: Change temporaries to positive names Changes compiler temporaries to have positive names. The numbering now puts them above the code VRs (locals + ins, in that order). The patch also introduces APIs to query the number of temporaries, locals and ins. The compiler temp infrastructure suffered from several issues which are also addressed by this patch: -There is no longer a queue of compiler temps. This would be polluted with Method* when post opts were called multiple times. -Sanity checks have been added to allow requesting of temps from BE and to prevent temps after frame is committed. -None of the structures holding temps can overflow because they are allocated to allow holding maximum temps. Thus temps can be requested by BE with no problem. -Since the queue of compiler temps is no longer maintained, it is no longer possible to refer to a temp that has invalid ssa (because it was requested before ssa was run). -The BE can now request temps after all ME allocations and it is guaranteed to actually receive them. -ME temps are now treated like normal VRs in all cases with no special handling. Only the BE temps are handled specially because there are no references to them from MIRs. -Deprecated and removed several fields in CompilationUnit that saved register information and updated callsites to call the new interface from MIRGraph. Change-Id: Ia8b1fec9384a1a83017800a59e5b0498dfb2698c Signed-off-by: Razvan A Lupusoru <razvan.a.lupusoru@intel.com> Signed-off-by: Udayan Banerji <udayan.banerji@intel.com>
|
53c913bb71b218714823c8c87a1f92830c336f61 |
|
13-Aug-2014 |
Andreas Gampe <agampe@google.com> |
ART: Clean up compiler Clean up the compiler: less extern functions, dis-entangle compilers, hide some compiler specifics, lower global includes. Change-Id: Ibaf88d02505d86994d7845cf0075be5041cc8438
|
8c18c2aaedb171f9b03ec49c94b0e33449dc411b |
|
06-Aug-2014 |
Andreas Gampe <agampe@google.com> |
ART: Generate chained compare-and-branch for short switches Refactor Mir2Lir to generate chained compare-and-branch sequences for short switches on all architectures. Bug: 16241558 (cherry picked from commit 48971b3242e5126bcd800cc9c68df64596b43d13) Change-Id: I0bb3071b8676523e90e0258e9b0e3fd69c1237f4
|
48971b3242e5126bcd800cc9c68df64596b43d13 |
|
06-Aug-2014 |
Andreas Gampe <agampe@google.com> |
ART: Generate chained compare-and-branch for short switches Refactor Mir2Lir to generate chained compare-and-branch sequences for short switches on all architectures. Change-Id: Ie2a572ae69d462ba68a119e9fb93ae538cddd08f
|
c76c614d681d187d815760eb909e5faf488a3c35 |
|
05-Aug-2014 |
Andreas Gampe <agampe@google.com> |
ART: Refactor long ops in quick compiler Make GenArithOpLong virtual. Let the implementation in gen_common be very basic, without instruction-set checks, and meant as a fall-back. Backends should implement and dispatch to code for better implementations. This makes it possible to remove the GenXXXLong virtual methods from Mir2Lir and to clean up the backends (especially removing some LOG(FATAL) implementations). Change-Id: I6366443c0c325c1999582d281608b4fa229343cf
|
984305917bf57b3f8d92965e4715a0370cc5bcfb |
|
28-Jul-2014 |
Andreas Gampe <agampe@google.com> |
ART: Rework quick entrypoint code in Mir2Lir, cleanup To reduce the complexity of calling trampolines in generic code, introduce an enumeration for entrypoints. Introduce a header that lists the entrypoint enum and exposes a templatized method that translates an enum value to the corresponding thread offset value. Call helpers are rewritten to have an enum parameter instead of the thread offset. Also rewrite LoadHelper and GenConversionCall this way. It is now LoadHelper's duty to select the right thread offset size. Introduce an InvokeTrampoline virtual method in Mir2Lir. This allows further simplification of the call helpers, as well as making OpThreadMem specific to X86 only (removed from Mir2Lir). Make GenInlinedCharAt virtual, move a copy to the X86 backend, and simplify both copies. Remove LoadBaseIndexedDisp and OpRegMem from Mir2Lir, as they are now specific to X86 only. Remove StoreBaseIndexedDisp from Mir2Lir, as it was only ever used in the X86 backend. Remove OpTlsCmp from Mir2Lir, as it was only ever used in the X86 backend. Remove OpLea from Mir2Lir, as it was only ever defined in the X86 backend. Remove GenImmedCheck from Mir2Lir as it was neither used nor implemented. Change-Id: If0a6182288c5d57653e3979bf547840a4c47626e
|
bebee4fd10e5db6cb07f59bc0f73297c900ea5f0 |
|
16-Jul-2014 |
Andreas Gampe <agampe@google.com> |
ART: Refactor GenSelect, refactor gen_common accordingly This adds a GenSelect method meant for selection of constants. The general-purpose GenInstanceof code is refactored to take advantage of this. This cleans up code and squashes a branch-over on ARM64 to a cset. Also add a slow-path for type initialization in GenInstanceof. Bug: 16241558 (cherry picked from commit 90969af6deb19b1dbe356d62fe68d8f5698d3d8f) Change-Id: Ie4494858bb8c26d386cf2e628172b81bba911ae5
|
f9d6aede77c700118e225f8312cd888262b77862 |
|
17-Jul-2014 |
Vladimir Marko <vmarko@google.com> |
Use vabs/fabs on arm/arm64 for intrinsic abs(). Bug: 11579369 (cherry picked from 5030d3ee8c6fe10394912ede107cbc8df63b7b16) Change-Id: I7b0596a8e7e3c87a93b225519c5aeedfe4f22e6d
|
5030d3ee8c6fe10394912ede107cbc8df63b7b16 |
|
17-Jul-2014 |
Vladimir Marko <vmarko@google.com> |
Use vabs/fabs on arm/arm64 for intrinsic abs(). Bug: 11579369 Change-Id: If09da85e22786faa13a2d74f62cee68ea67bd087
|
90969af6deb19b1dbe356d62fe68d8f5698d3d8f |
|
16-Jul-2014 |
Andreas Gampe <agampe@google.com> |
ART: Refactor GenSelect, refactor gen_common accordingly This adds a GenSelect method meant for selection of constants. The general-purpose GenInstanceof code is refactored to take advantage of this. This cleans up code and squashes a branch-over on ARM64 to a cset. Also add a slow-path for type initialization in GenInstanceof. Change-Id: Ie4494858bb8c26d386cf2e628172b81bba911ae5
|
d9cb8ae2ed78f957a773af61759432d7a7bf78af |
|
09-Jul-2014 |
Douglas Leung <douglas@mips.com> |
Fix art test failures for Mips. This patch fixes the following art test failures for Mips: 003-omnibus-opcodes 030-bad-finalizer 041-narrowing 059-finalizer-throw Change-Id: I4e0e9ff75f949c92059dd6b8d579450dc15f4467 Signed-off-by: Douglas Leung <douglas@mips.com>
|
59a42afc2b23d2e241a7e301e2cd68a94fba51e5 |
|
04-Jul-2014 |
Serguei Katkov <serguei.i.katkov@intel.com> |
Update counting VR for promotion For 64-bit it makes sense to compute VR uses together for int and long because core reg is shared. Change-Id: Ie8676ece12c928d090da2465dfb4de4e91411920 Signed-off-by: Serguei Katkov <serguei.i.katkov@intel.com>
|
b5860fb459f1ed71f39d8a87b45bee6727d79fe8 |
|
22-Jun-2014 |
buzbee <buzbee@google.com> |
Register promotion support for 64-bit targets Not sufficiently tested for 64-bit targets, but should be fairly close. A significant amount of refactoring could still be done, (in later CLs). With this change we are not making any changes to the vmap scheme. As a result, it is a requirement that if a vreg is promoted to both a 32-bit view and the low half of a 64-bit view it must share the same physical register. We may change this restriction later on to allow for more flexibility for 32-bit Arm. For example, if v4, v5, v4/v5 and v5/v6 are all hot enough to promote, we'd end up with something like: v4 (as an int) -> r10 v4/v5 (as a long) -> r10 v5 (as an int) -> r11 v5/v6 (as a long) -> r11 Fix a couple of ARM64 bugs on the way... Change-Id: I6a152b9c164d9f1a053622266e165428045362f3
|
23abec955e2e733999a1e2c30e4e384e46e5dde4 |
|
02-Jul-2014 |
Serban Constantinescu <serban.constantinescu@arm.com> |
AArch64: Add a few more inline functions This patch adds inlining support for the following functions: * Math.max/min(long, long) * Math.max/min(float, float) * Math.max/min(double, double) * Integer.reverse(int) * Long.reverse(long) Change-Id: Ia2b1619fd052358b3a0d23e5fcbfdb823d2029b9 Signed-off-by: Serban Constantinescu <serban.constantinescu@arm.com>
|
2db3e269e3051dacb3c8a4af8f03fdad9b0fd740 |
|
26-Jun-2014 |
Douglas Leung <douglas@mips.com> |
Fix quick mode bugs for Mips. This patch enables quick mode for Mips and allows the emulator to boot. However the emulator is still not 100% functional. It still has problems launching some apps. Change-Id: Id46a39a649a2fd431a9f13b06ecf34cbd1d20930 Signed-off-by: Douglas Leung <douglas@mips.com>
|
de68676b24f61a55adc0b22fe828f036a5925c41 |
|
24-Jun-2014 |
Andreas Gampe <agampe@google.com> |
Revert "ART: Split out more cases of Load/StoreRef, volatile as parameter" This reverts commit 2689fbad6b5ec1ae8f8c8791a80c6fd3cf24144d. Breaks the build. Change-Id: I9faad4e9a83b32f5f38b2ef95d6f9a33345efa33
|
3c12c512faf6837844d5465b23b9410889e5eb11 |
|
24-Jun-2014 |
Andreas Gampe <agampe@google.com> |
Revert "Revert "ART: Split out more cases of Load/StoreRef, volatile as parameter"" This reverts commit de68676b24f61a55adc0b22fe828f036a5925c41. Fixes an API comment, and differentiates between inserting and appending. Change-Id: I0e9a21bb1d25766e3cbd802d8b48633ae251a6bf
|
2689fbad6b5ec1ae8f8c8791a80c6fd3cf24144d |
|
23-Jun-2014 |
Andreas Gampe <agampe@google.com> |
ART: Split out more cases of Load/StoreRef, volatile as parameter Splits out more cases of ref registers being loaded or stored. For code clarity, adds volatile as a flag parameter instead of a separate method. On ARM64, continue cleanup. Add flags to print/fatal on size mismatches. Change-Id: I30ed88433a6b4ff5399aefffe44c14a5e6f4ca4e
|
5aa6e04061ced68cca8111af1e9c19781b8a9c5d |
|
14-Jun-2014 |
Ian Rogers <irogers@google.com> |
Tidy x86 assembler. Use helper functions to compute when the kind has a SIB, a ModRM and RegReg form. Change-Id: I86a5cb944eec62451c63281265e6974cd7a08e07
|
8dea81ca9c0201ceaa88086b927a5838a06a3e69 |
|
06-Jun-2014 |
Vladimir Marko <vmarko@google.com> |
Rewrite use/def masks to support 128 bits. Reduce LIR memory usage by holding masks by pointers in the LIR rather than directly and using pre-defined const masks for the common cases, allocating very few on the arena. Change-Id: I0f6d27ef6867acd157184c8c74f9612cebfe6c16
|
a0cd2d701f29e0bc6275f1b13c0edfd4ec391879 |
|
01-Jun-2014 |
buzbee <buzbee@google.com> |
Quick compiler: reference cleanup For 32-bit targets, object references are 32 bits wide both in Dalvik virtual registers and in core physical registers. Because of this, object references and non-floating point values were both handled as if they had the same register class (kCoreReg). However, for 64-bit systems, references are 32 bits in Dalvik vregs, but 64 bits in physical registers. Although the same underlying physical core registers will still be used for object reference and non-float values, different register class views will be used to represent them. For example, an object reference in arm64 might be held in x3 at some point, while the same underlying physical register, w3, would be used to hold a 32-bit int. This CL breaks apart the handling of object reference and non-float values to allow the proper register class (or register view) to be used. A new register class, kRefReg, is introduced which will map to a 32-bit core register on 32-bit targets, and 64-bit core registers on 64-bit targets. From this point on, object references should be allocated registers in the kRefReg class rather than kCoreReg. Change-Id: I6166827daa8a0ea3af326940d56a6a14874f5810
|
ffddfdf6fec0b9d98a692e27242eecb15af5ead2 |
|
03-Jun-2014 |
Tim Murray <timmurray@google.com> |
DO NOT MERGE Merge ART from AOSP to lmp-preview-dev. Change-Id: I0f578733a4b8756fd780d4a052ad69b746f687a9
|
ed65c5e982705defdb597d94d1aa3f2997239c9b |
|
22-May-2014 |
Serban Constantinescu <serban.constantinescu@arm.com> |
AArch64: Enable LONG_* and INT_* opcodes. This patch fixes some of the issues with LONG and INT opcodes. The patch has been tested and passes all the dalvik tests except for 018 and 107. Change-Id: Idd1923ed935ee8236ab0c7e5fa969eaefeea8708 Signed-off-by: Serban Constantinescu <serban.constantinescu@arm.com>
|
b01bf15d18f9b08d77e7a3c6e2897af0e02bf8ca |
|
14-May-2014 |
buzbee <buzbee@google.com> |
64-bit temp register support. Add a 64-bit temp register allocation path. The recent physical register handling rework supports multiple views of the same physical register (or, such as for Arm's float/double regs, different parts of the same physical register). This CL adds a 64-bit core register view for 64-bit targets. In short, each core register will have a 64-bit name, and a 32-bit name. The different views will be kept in separate register pools, but aliasing will be tracked. The core temp register allocation routines will be largely identical - except for 32-bit targets, which will continue to use pairs of 32-bit core registers for holding long values. Change-Id: I8f118e845eac7903ad8b6dcec1952f185023c053
|
082833c8d577db0b2bebc100602f31e4e971613e |
|
18-May-2014 |
buzbee <buzbee@google.com> |
Quick compiler, out of registers fix It turns out that the register pool sanity checker was not working as expected, leaving some inconsistencies unreported. This could result in "out of registers" failures, as well as other more subtle problems. This CL fixes the sanity checker, adds many more checks and cleans up the previously undetected episodes of insanity. Cherry-pick of internal change 468162 Change-Id: Id2da97e99105a4c272c5fd256205a94b904ecea8
|
05d3aeb33683b16837741f9348d6fba9a8432068 |
|
18-May-2014 |
buzbee <buzbee@google.com> |
Quick compiler, out of registers fix Fixes b/15024623 It turns out that the register pool sanity checker was not working as expected, leaving some inconsistencies unreported. This CL fixes the sanity checker, adds many more checks and cleans up the previously undetected episodes of insanity. Change-Id: I4d67db864ca5926a1975db251e7e631b65a86275
|
b14329f90f725af0f67c45dfcb94933a426d63ce |
|
15-May-2014 |
Andreas Gampe <agampe@google.com> |
ART: Fix MonitorExit code on ARM We do not emit barriers on non-SMP systems. But on ARM, we have places that need to conditionally execute, which is done through an IT instruction. The guide of said instruction thus changes between SMP and non-SMP systems. To cleanly approach this, change the API so that GenMemBarrier returns whether it generated an instruction. ARM will have to query the result and update any dependent IT. Throw a build system error if TARGET_CPU_SMP is not set. Fix runtime/Android.mk to work with new multilib host. Bug: 14989275 Change-Id: I9e611b770e8a1cd4ca19367d7dae0573ec08dc61
|
2f244e9faccfcca68af3c5484c397a01a1c3a342 |
|
08-May-2014 |
Andreas Gampe <agampe@google.com> |
ART: Add more ThreadOffset in Mir2Lir and backends This duplicates all methods with ThreadOffset parameters, so that both ThreadOffset<4> and ThreadOffset<8> can be handled. Dynamic checks against the compilation unit's instruction set determine which pointer size to use and therefore which methods to call. Methods with unsupported pointer sizes should fatally fail, as this indicates an issue during method selection. Change-Id: Ifdb445b3732d3dc5e6a220db57374a55e91e1bf6
|
674744e635ddbdfb311fbd25b5a27356560d30c3 |
|
24-Apr-2014 |
Vladimir Marko <vmarko@google.com> |
Use atomic load/store for volatile IGET/IPUT/SGET/SPUT. Bug: 14112919 Change-Id: I79316f438dd3adea9b2653ffc968af83671ad282
|
3bf7c60a86d49bf8c05c5d2ac5ca8e9f80bd9824 |
|
07-May-2014 |
Vladimir Marko <vmarko@google.com> |
Cleanup ARM load/store wide and remove unused param s_reg. Use a single LDRD/VLDR instruction for wide load/store on ARM, adjust the base pointer if needed. Remove unused parameter s_reg from LoadBaseDisp(), LoadBaseIndexedDisp() and StoreBaseIndexedDisp() on all architectures. Change-Id: I25a9a42d523a68addbc11abe44ddc55a4401df98
|
455759b5702b9435b91d1b4dada22c4cce7cae3c |
|
06-May-2014 |
Vladimir Marko <vmarko@google.com> |
Remove LoadBaseDispWide and StoreBaseDispWide. Just pass k64 or kDouble to non-wide versions. Change-Id: I000619c3b78d3a71db42edc747c8a0ba1ee229be
|
091cc408e9dc87e60fb64c61e186bea568fc3d3a |
|
31-Mar-2014 |
buzbee <buzbee@google.com> |
Quick compiler: allocate doubles as doubles Significant refactoring of register handling to unify usage across all targets & 32/64 backends. Reworked RegStorage encoding to allow expanded use of x86 xmm registers; removed vector registers as a separate register type. Reworked RegisterInfo to describe aliased physical registers. Eliminated quite a bit of target-specific code and generalized common code. Use of RegStorage instead of int for registers now propagated down to the NewLIRx() level. In future CLs, the NewLIRx() routines will be replaced with versions that are explicit about what kind of operand they expect (RegStorage, displacement, etc.). The goal is to eventually use RegStorage all the way to the assembly phase. TBD: MIPS needs verification. TBD: Re-enable liveness tracking. Change-Id: I388c006d5fa9b3ea72db4e37a19ce257f2a15964
|
7a11ab09f93f54b1c07c0bf38dd65ed322e86bc6 |
|
29-Apr-2014 |
buzbee <buzbee@google.com> |
Quick compiler: debugging assists A few minor assists to ease A/B debugging in the Quick compiler: 1. To save time, the assemblers for some targets only update the object code offsets on instructions involved with pc-relative fixups. We add code to fix up all offsets when doing a verbose codegen listing. 2. Temp registers are normally allocated in a round-robin fashion. When disabling liveness tracking, we now reset the round-robin pool to 0 on each instruction boundary. This makes it easier to spot real codegen differences. 3. Self-register copies were previously emitted, but marked as nops. Minor change to avoid generating them in the first place and reduce clutter. Change-Id: I7954bba3b9f16ee690d663be510eac7034c93723
|
3a74d15ccc9a902874473ac9632e568b19b91b1c |
|
22-Apr-2014 |
Mingyao Yang <mingyao@google.com> |
Delete throw launchpads. Bug: 13170824 Change-Id: I9d5834f5a66f5eb00f2ac80774e8c27dea99949e
|
e643a179cf5585ba6bafdd4fa51730d9f50c06f6 |
|
08-Apr-2014 |
Mingyao Yang <mingyao@google.com> |
Use LIRSlowPath for throwing NPE. Get rid of launchpads for throwing NPE and use LIRSlowPath instead. Also clean up some code of using LIRSlowPath for checking div by zero. Bug: 13170824 Change-Id: I0c20a49c39feff3eb1f147755e557d9bc0ff15bb
|
3da67a558f1fd3d8a157d8044d521753f3f99ac8 |
|
03-Apr-2014 |
Dave Allison <dallison@google.com> |
Add OpEndIT() for marking the end of OpIT blocks In ARM we need to prevent code motion to the inside of an IT block. This was done using a GenBarrier() to mark the end, but it wasn't obvious that this is what was happening. This CL adds an explicit OpEndIT() that takes the LIR of the OpIT for future checks. Bug: 13751744 Change-Id: If41d2adea1f43f11ebb3b72906bd308252ce3d01
|
dd7624d2b9e599d57762d12031b10b89defc9807 |
|
15-Mar-2014 |
Ian Rogers <irogers@google.com> |
Allow mixing of thread offsets between 32 and 64bit architectures. Begin a fuller implementation of x86-64 REX prefixes. Doesn't implement 64bit thread offset support for the JNI compiler. Change-Id: If9af2f08a1833c21ddb4b4077f9b03add1a05147
|
e2143c0a4af68c08e811885eb2f3ea5bfdb21ab6 |
|
28-Mar-2014 |
Ian Rogers <irogers@google.com> |
Revert "Revert "Optimize easy multiply and easy div remainder."" This reverts commit 3654a6f50a948ead89627f398aaf86a2c2db0088. Remove the part of the change that confused !is_div with being multiply rather than implying remainder. Change-Id: I202610069c69351259a320e8852543cbed4c3b3e
|
3441512d61ac192c1bf0b9b1eb696d5a8a8d677e |
|
28-Mar-2014 |
Brian Carlstrom <bdc@google.com> |
Revert "Optimize easy multiply and easy div remainder." This reverts commit 08df4b3da75366e5db37e696eaa7e855cba01deb. (cherry picked from commit 3654a6f50a948ead89627f398aaf86a2c2db0088) Change-Id: If8befd7c7135b9dfe3d3e9111768aba89aaa0863
|
3654a6f50a948ead89627f398aaf86a2c2db0088 |
|
28-Mar-2014 |
Brian Carlstrom <bdc@google.com> |
Revert "Optimize easy multiply and easy div remainder." This reverts commit 08df4b3da75366e5db37e696eaa7e855cba01deb.
|
08df4b3da75366e5db37e696eaa7e855cba01deb |
|
25-Mar-2014 |
Zheng Xu <zheng.xu@arm.com> |
Optimize easy multiply and easy div remainder. Update OpRegRegShift and OpRegRegRegShift to use RegStorage parameters. Add special cases for *0 and *1. Add more easy multiply special cases for Arm. Reuse easy multiply in SmallLiteralDivRem() to support remainder cases. Change-Id: Icd76a993d3ac8d4988e9653c19eab4efca14fad0
|
2700f7e1edbcd2518f4978e4cd0e05a4149f91b6 |
|
07-Mar-2014 |
buzbee <buzbee@google.com> |
Continuing register cleanup Ready for review. Continue the process of using RegStorage rather than ints to hold register values in the top layers of codegen. Given the huge number of changes in this CL, I've attempted to minimize the number of actual logic changes. With this CL, the use of ints for registers has largely been eliminated except in the lowest utility levels. "Wide" utility routines have been updated to take a single RegStorage rather than a pair of ints representing low and high registers. Upcoming CLs will be smaller and more targeted. My expectations: o Allocate float double registers as a single double rather than a pair of float single registers. o Refactor to push code which assumes long and double Dalvik values are held in a pair of registers to the target dependent layer. o Clean-up of the xxx_mir.h files to reduce the amount of #defines for registers. May also do a register renumbering to make all of our targets' register naming more consistent. Possibly introduce a target-independent float/non-float test at the RegStorage level. Change-Id: I646de7392bdec94595dd2c6f76e0f1c4331096ff
|
b373e091eac39b1a79c11f2dcbd610af01e9e8a9 |
|
21-Feb-2014 |
Dave Allison <dallison@google.com> |
Implicit null/suspend checks (oat version bump) This adds the ability to use SEGV signals to throw NullPointerException exceptions from Java code rather than having the compiler generate explicit comparisons and branches. It does this by using sigaction to trap SIGSEGV and when triggered makes sure it's in compiled code and if so, sets the return address to the entry point to throw the exception. It also uses this signal mechanism to determine whether to check for thread suspension. Instead of the compiler generating calls to a function to check for threads being suspended, the compiler will now load indirectly via an address in the TLS area. To trigger a suspend, the contents of this address are changed from something valid to 0. A SIGSEGV will occur and the handler will check for a valid instruction pattern before invoking the thread suspension check code. If a user program traps SIGSEGV it will prevent our signal handler from working. This will cause a failure in the runtime. There are two signal handlers at present. You can control them individually using the flag -implicit-checks: on the runtime command line. This takes a string parameter, a comma-separated set of strings. Each can be one of: none (switch off), null (null pointer checks), suspend (suspend checks), all (all checks). So to switch only suspend checks on, pass: -implicit-checks:suspend There is also -explicit-checks to provide the reverse once we change the default. For dalvikvm, pass --runtime-arg -implicit-checks:foo,bar The default is -implicit-checks:none There is also a property 'dalvik.vm.implicit_checks' whose value is the same string as the command option. The default is 'none'. For example to switch on null checks using the option: setprop dalvik.vm.implicit_checks null It only works for ARM right now. Bumps OAT version number due to change to Thread offsets. Bug: 13121132 Change-Id: If743849138162f3c7c44a523247e413785677370
|
00e1ec6581b5b7b46ca4c314c2854e9caa647dd2 |
|
28-Feb-2014 |
Bill Buzbee <buzbee@android.com> |
Revert "Revert "Rework Quick compiler's register handling"" This reverts commit 86ec520fc8b696ed6f164d7b756009ecd6e4aace. Ready. Fixed the original typo, plus some mechanical changes for rebasing. Still needs additional testing, but the problem with the original CL appears to have been a typo in the definition of the x86 double return template RegLocation. Change-Id: I828c721f91d9b2546ef008c6ea81f40756305891
|
86ec520fc8b696ed6f164d7b756009ecd6e4aace |
|
26-Feb-2014 |
Bill Buzbee <buzbee@android.com> |
Revert "Rework Quick compiler's register handling" This reverts commit 2c1ed456dcdb027d097825dd98dbe48c71599b6c. Change-Id: If88d69ba88e0af0b407ff2240566d7e4545d8a99
|
2c1ed456dcdb027d097825dd98dbe48c71599b6c |
|
20-Feb-2014 |
buzbee <buzbee@google.com> |
Rework Quick compiler's register handling For historical reasons, the Quick backend found it convenient to consider all 64-bit Dalvik values held in registers to be contained in a pair of 32-bit registers. Though this worked well for ARM (with double-precision registers also treated as a pair of 32-bit single-precision registers) it doesn't play well with other targets. And, it is somewhat problematic for 64-bit architectures. This is the first of several CLs that will rework the way the Quick backend deals with physical registers. The goal is to eliminate the "64-bit value backed with 32-bit register pair" requirement from the target-independent portions of the backend and support 64-bit registers throughout. The key RegLocation struct, which describes the location of Dalvik virtual registers & register pairs, previously contained fields for high and low physical registers. The low_reg and high_reg fields are being replaced with a new type: RegStorage. There will be a single instance of RegStorage for each RegLocation. Note that RegStorage does not increase the space used. It is 16 bits wide, the same as the sum of the 8-bit low_reg and high_reg fields. At a target-independent level, it will describe whether the physical register storage associated with the Dalvik value is a single 32 bit, single 64 bit, pair of 32 bit or vector. The actual register number encoding is left to the target-dependent code layer. Because physical register handling is pervasive throughout the backend, this restructuring necessarily involves large CLs with lots of changes. I'm going to roll these out in stages, and attempt to segregate the CLs with largely mechanical changes from those which restructure or rework the logic. This CL is of the mechanical change variety - it replaces low_reg and high_reg from RegLocation and introduces RegStorage. It also includes a lot of new code (such as many calls to GetReg()) that should go away in upcoming CLs.
The tentative plan for the subsequent CLs is: o Rework standard register utilities such as AllocReg() and FreeReg() to use RegStorage instead of ints. o Rework the target-independent GenXXX, OpXXX, LoadValue, StoreValue, etc. routines to take RegStorage rather than int register encodings. o Take advantage of the vector representation and eliminate the current vector field in RegLocation. o Replace the "wide" variants of codegen utilities that take low_reg/high_reg pairs with versions that use RegStorage. o Add 64-bit register target independent codegen utilities where possible, and where not virtualize with 32-bit general register and 64-bit general register variants in the target dependent layer. o Expand/rework the LIR def/use flags to allow for more registers (currently, we lose out on 16 MIPS floating point regs as well as ARM's D16..D31 for lack of space in the masks). o [Possibly] move the float/non-float determination of a register from the target-dependent encoding to RegStorage. In other words, replace IsFpReg(register_encoding_bits). At the end of the day, all code in the target independent layer should be using RegStorage, as should much of the target dependent layer. Ideally, we won't be using the physical register number encoding extracted from RegStorage (i.e. GetReg()) until the NewLIRx() layer. Change-Id: Idc5c741478f720bdd1d7123b94e4288be5ce52cb
|
3bc01748ef1c3e43361bdf520947a9d656658bf8 |
|
06-Feb-2014 |
Razvan A Lupusoru <razvan.a.lupusoru@intel.com> |
GenSpecialCase support for x86. Moved GenSpecialCase from being ARM-specific to common code to allow it to be used by x86 quick as well. Change-Id: I728733e8f4c4da99af6091ef77e5c76ae0fee850 Signed-off-by: Razvan A Lupusoru <razvan.a.lupusoru@intel.com>
|
2c498d1f28e62e81fbdb477ff93ca7454e7493d7 |
|
30-Jan-2014 |
Razvan A Lupusoru <razvan.a.lupusoru@intel.com> |
Specializing x86 range argument copying. The ARM implementation of range argument copying was specialized in some cases. For all other architectures, it would fall back to generating memcpy. This patch updates the x86 implementation so it does not call memcpy and instead generates loads and stores, favoring movement of 128-bit chunks. Change-Id: Ic891e5609a4b0e81a47c29cc5a9b301bd10a1933 Signed-off-by: Razvan A Lupusoru <razvan.a.lupusoru@intel.com>
|
2bf31e67694da24a19fc1f328285cebb1a4b9964 |
|
23-Jan-2014 |
Mark Mendell <mark.p.mendell@intel.com> |
Improve x86 long divide. Implement inline division for literal and variable divisors. Use the general case for dividing by a literal by using a double-length multiply by the appropriate constant with fixups. This is the Hacker's Delight algorithm. Change-Id: I563c250f99d89fca5ff8bcbf13de74de13815cfe Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>
|
e02d48fb24747f90fd893e1c3572bb3c500afced |
|
15-Jan-2014 |
Mark Mendell <mark.p.mendell@intel.com> |
Optimize x86 long arithmetic. Be smarter about taking advantage of a constant operand for x86 long add/sub/and/or/xor. Using instructions with immediates and generating results directly into memory reduces the number of temporary registers and avoids hardcoded register usage. Also rewrite the existing non-const x86 arithmetic to avoid fixed register use, and use the fact that x86 instructions are two-operand. Pass the opcode to the XXXLong() routines to easily detect two-operand DEX opcodes. Add a new StoreFinalValueWide() routine, which is similar to StoreValueWide, but doesn't do an EvalLoc to allocate registers. The src operand must already be in registers, and it just updates the dest location, and calls the right live/dirty routines to get the src into the dest properly. Change-Id: Iefc16e7bc2236a73dc780d3d5137ae8343171f62 Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>
|
bd288c2c1206bc99fafebfb9120a83f13cf9723b |
|
21-Dec-2013 |
Razvan A Lupusoru <razvan.a.lupusoru@intel.com> |
Add conditional move support to x86 and allow GenMinMax to use it. X86 supports conditional moves, which are useful for reducing branchiness. This patch adds support to the x86 backend to generate conditional reg-to-reg operations. Both encoder and decoder support was added for cmov. The x86 version of GenMinMax, used for generating inlined versions of Math.min/max, has been updated to make use of the conditional move support. Change-Id: I92c5428e40aa8ff88bd3071619957ac3130efae7 Signed-off-by: Razvan A Lupusoru <razvan.a.lupusoru@intel.com>
|
5816ed48bc339c983b40dc493e96b97821ce7966 |
|
27-Nov-2013 |
Vladimir Marko <vmarko@google.com> |
Detect special methods at the end of verification. This moves special method handling to method inliner and prepares for eventual inlining of these methods. Change-Id: I51c51b940fb7bc714e33135cd61be69467861352
|
31c2aac7137b69d5622eea09597500731fbee2ef |
|
09-Dec-2013 |
Vladimir Marko <vmarko@google.com> |
Rename ClobberCalleeSave to *Caller*, fix it for x86. Change-Id: I6a72703a11985e2753fa9b4520c375a164301433
|
1c282e2b9a9b432e132b2c332f861cad9feb4a73 |
|
21-Nov-2013 |
Vladimir Marko <vmarko@google.com> |
Refactor intrinsic CAS, prepare for 64-bit version. Bug: 11391018 Change-Id: Ic0f740e0cd0eb47f2c915f81be02f52f7721f8a3
|
e508a2090b19fe705fbc6b99d76474037a74bbfb |
|
04-Nov-2013 |
Vladimir Marko <vmarko@google.com> |
Fix unaligned Memory peek/poke intrinsics. Change-Id: Id454464d0b28aa37f5239f1c6589ceb0b3bbbdea
|
0d82948094d9a198e01aa95f64012bdedd5b6fc9 |
|
12-Oct-2013 |
buzbee <buzbee@google.com> |
64-bit prep. Preparation for 64-bit roll. o Eliminated storing pointers in 32-bit int slots in LIR. o General size reductions of common structures to reduce impact of doubled pointer sizes: - BasicBlock struct was 72 bytes, now is 48. - MIR struct was 72 bytes, now is 64. - RegLocation was 12 bytes, now is 8. o Generally replaced uses of BasicBlock* pointers with 16-bit Ids. o Replaced several doubly-linked lists with singly-linked to save one stored pointer per node. o We had quite a few uses of uintptr_t's that were a holdover from the JIT (which used pointers to mapped dex & actual code cache addresses rather than trace-relative offsets). Replaced those with uint32_t's. o Clean up handling of embedded data for switch tables and array data. o Miscellaneous cleanup. I anticipate one or two additional CLs to reduce the size of MIR and LIR structs. Change-Id: I58e426d3f8e5efe64c1146b2823453da99451230
|
a9a8254c920ce8e22210abfc16c9842ce0aea28f |
|
04-Oct-2013 |
Ian Rogers <irogers@google.com> |
Improve quick codegen for aput-object. 1) don't type check known null. 2) if we know types in verify don't check at runtime. 3) if we're runtime checking then move all the code out-of-line. Also, don't set up a callee-save frame for check-cast, do an instance-of test then throw an exception if that fails. Tidy quick entry point of Ldivmod to Lmod which it is on x86 and mips. Fix monitor-enter/exit NPE for MIPS. Fix benign bug in mirror::Class::CannotBeAssignedFromOtherTypes, a byte[] cannot be assigned to from other types. Change-Id: I9cb3859ec70cca71ed79331ec8df5bec969d6745
|
d9c4fc94fa618617f94e1de9af5f034549100753 |
|
02-Oct-2013 |
Ian Rogers <irogers@google.com> |
Inflate contended lock word by suspending owner. Bug 6961405. Don't inflate monitors for Notify and NotifyAll. Tidy lock word, handle recursive lock case alongside unlocked case and move assembly out of line (except for ARM quick). Also handle null in out-of-line assembly as the test is quick and the enter/exit code is already a safepoint. To gain ownership of a monitor on behalf of another thread, monitor contenders must not hold the monitor_lock_, so they wait on a condition variable. Reduce size of per mutex contention log. Be consistent in calling thin lock thread ids just thread ids. Fix potential thread death races caused by the use of FindThreadByThreadId, make it invariant that returned threads are either self or suspended now. Code size reduction on ARM boot.oat 0.2%. Old nexus 7 speedup 0.25%, new nexus 7 speedup 1.4%, nexus 10 speedup 2.24%, nexus 4 speedup 2.09% on DeltaBlue. Change-Id: Id52558b914f160d9c8578fdd7fc8199a9598576a
|
b48819db07f9a0992a72173380c24249d7fc648a |
|
15-Sep-2013 |
buzbee <buzbee@google.com> |
Compile-time tuning: assembly phase. Not as much compile-time gain from reworking the assembly phase as I'd hoped, but still worthwhile. Should see ~2% improvement thanks to the assembly rework. On the other hand, expect some huge gains for some applications thanks to better detection of large machine-generated init methods. Thinkfree shows a 25% improvement. The major assembly change was to thread the LIR nodes that require fixup into a fixup chain. Only those are processed during the final assembly pass(es). This doesn't help for methods which only require a single pass to assemble, but does speed up the larger methods which required multiple assembly passes. Also replaced the block_map_ basic block lookup table (which contained space for a BasicBlock* for each dex instruction unit) with a block id map - cutting its space requirements by half in a 32-bit pointer environment. Changes: o Reduce size of LIR struct by 12.5% (one of the big memory users) o Repurpose the use/def portion of the LIR after optimization complete. o Encode instruction bits to LIR o Thread LIR nodes requiring pc fixup o Change follow-on assembly passes to only consider fixup LIRs o Switch on pc-rel fixup kind o Fast-path for small methods - single pass assembly o Avoid using cb[n]z for null checks (almost always exceed displacement) o Improve detection of large initialization methods. o Rework def/use flag setup. o Remove a sequential search from FindBlock using lookup table of 16-bit block ids rather than full block pointers. o Eliminate pcRelFixup and use fixup kind instead. o Add check for 16-bit overflow on dex offset. Change-Id: I4c6615f83fed46f84629ad6cfe4237205a9562b4
|
bd663de599b16229085759366c56e2ed5a1dc7ec |
|
11-Sep-2013 |
buzbee <buzbee@google.com> |
Compile-time tuning: register/bb utilities. This CL yields about a 4% improvement in the compilation phase of dex2oat (single-threaded; multi-threaded compilation is more difficult to accurately measure). The register utilities could stand to be completely rewritten, but this gets most of the easy benefit. Next up: the assembly phase. Change-Id: Ife5a474e9b1a6d9e501e888dda6749d34eb77e96
|
11b63d13f0a3be0f74390b66b58614a37f9aa6c1 |
|
27-Aug-2013 |
buzbee <buzbee@google.com> |
Quick compiler: division by literal fix. The constant propagation optimization pass attempts to identify constants in Dalvik virtual registers and handle them more efficiently. The use of small constants in division, though, was handled incorrectly in that the high-level code correctly detected the use of a constant, but the actual code generation routine was only expecting the use of a special constant form opcode. See b/10503566. Change-Id: I88aa4d2eafebb2b1af1a1e88049f1845aefae261
|
468532ea115657709bc32ee498e701a4c71762d4 |
|
05-Aug-2013 |
Ian Rogers <irogers@google.com> |
Entry point clean up. Create set of entry points needed for image methods to avoid fix-up at load time: - interpreter - bridge to interpreter, bridge to compiled code - jni - dlsym lookup - quick - resolution and bridge to interpreter - portable - resolution and bridge to interpreter Fix JNI work around to use JNI work around argument rewriting code that'd been accidentally disabled. Remove abstract method error stub, use interpreter bridge instead. Consolidate trampoline (previously stub) generation in generic helper. Simplify trampolines to jump directly into assembly code, keeps stack crawlable. Dex: replace use of int with ThreadOffset for values that are thread offsets. Tidy entry point routines between interpreter, jni, quick and portable. Change-Id: I52a7c2bbb1b7e0ff8a3c3100b774212309d0828e (cherry picked from commit 848871b4d8481229c32e0d048a9856e5a9a17ef9)
|
848871b4d8481229c32e0d048a9856e5a9a17ef9 |
|
05-Aug-2013 |
Ian Rogers <irogers@google.com> |
Entry point clean up. Create set of entry points needed for image methods to avoid fix-up at load time: - interpreter - bridge to interpreter, bridge to compiled code - jni - dlsym lookup - quick - resolution and bridge to interpreter - portable - resolution and bridge to interpreter Fix JNI work around to use JNI work around argument rewriting code that'd been accidentally disabled. Remove abstract method error stub, use interpreter bridge instead. Consolidate trampoline (previously stub) generation in generic helper. Simplify trampolines to jump directly into assembly code, keeps stack crawlable. Dex: replace use of int with ThreadOffset for values that are thread offsets. Tidy entry point routines between interpreter, jni, quick and portable. Change-Id: I52a7c2bbb1b7e0ff8a3c3100b774212309d0828e
|
0cd7ec2dcd8d7ba30bf3ca420b40dac52849876c |
|
18-Jul-2013 |
Brian Carlstrom <bdc@google.com> |
Fix cpplint whitespace/blank_line issues Change-Id: Ice937e95e23dd622c17054551d4ae4cebd0ef8a2
|
fc0e3219edc9a5bf81b166e82fd5db2796eb6a0d |
|
17-Jul-2013 |
Brian Carlstrom <bdc@google.com> |
Fix multiple inclusion guards to match new pathnames Change-Id: Id7735be1d75bc315733b1773fba45c1deb8ace43
|
7940e44f4517de5e2634a7e07d58d0fb26160513 |
|
12-Jul-2013 |
Brian Carlstrom <bdc@google.com> |
Create separate Android.mk for main build targets. The runtime, compiler, dex2oat, and oatdump are now in separate trees to prevent dependency creep. They can now be individually built without rebuilding the rest of the art projects. dalvikvm and jdwpspy were already this way. Builds in the art directory should behave as before, building everything including tests. Change-Id: Ic6b1151e5ed0f823c3dd301afd2b13eb2d8feb81
|