b3cd84a2fbd4875c605cfa5a4a362864b570f1e6 |
|
13-Jul-2016 |
Nicolas Geoffray <ngeoffray@google.com> |
Fix a bug in ClassTableGet code generation for IMTs. Introduced by: https://android-review.googlesource.com/#/c/244980/ test:566-polymorphic-inlining for fixing x86 crash. Also fixes a performance regression. bug:29188168 (cherry picked from commit ff484b95b25a5181a6a8a191cbd11da501c97651) Change-Id: Iae5a63cb24017222c3fefda695a0a39673719f51
|
df2d4f22d5e89692c90b443da82fe2930518418b |
|
30-Jun-2016 |
Artem Udovichenko <artem.u@samsung.com> |
Revert "Revert "Optimize IMT"" This reverts commit 88f288e3564d79d87c0cd8bb831ec5a791ba4861. Test: Includes smali tests to exercise cts failures that led to revert. These tests check that objects that don't implement any interfaces are handled properly when interface methods are invoked on them. Bug: 29188168 (for initial CL) Bug: 29778499 (reason for revert) Change-Id: I49605d53692cbec1e2622e23ff2893fc51ed4115
|
fd43db68d204caaa0e411ca79a37af15d1c001af |
|
29-Jun-2016 |
Jeff Hao <jeffhao@google.com> |
Revert "Optimize IMT" This reverts commit 0790af1391b316c5c12b4e135be357008c060696. Bug: 29188168 (for initial CL) Bug: 29778499 (reason for revert) Change-Id: I2c3e4ec2cebdd40faec67ddb721b7acdc8e90061
|
0790af1391b316c5c12b4e135be357008c060696 |
|
13-May-2016 |
Nelli Kim <nelli.kim@samsung.com> |
Optimize IMT * Remove IMT for classes which do not implement interfaces * Remove IMT for array classes * Share same IMT Saved memory (measured on hammerhead): boot.art: Total number of classes: 3854 Number of affected classes: 1637 Saved memory: 409kB Chrome (excluding classes in boot.art): Total number of classes: 2409 Number of affected classes: 1259 Saved memory: 314kB Google Maps (excluding classes in boot.art): Total number of classes: 6988 Number of affected classes: 2574 Saved memory: 643kB Performance regression on benchmarks/InvokeInterface.java benchmark (measured timeCall10Interface) 1st launch: 9.6% 2nd launch: 6.8% Bug: 29188168 (cherry picked from commit badee9820fcf5dca5f8c46c3215ae1779ee7736e) Change-Id: If8db765e3333cb78eb9ef0d66c2fc78a5f17f497
|
e5de54cfab5f14ba0b8ff25d8d60901c7021943f |
|
20-Apr-2016 |
Calin Juravle <calin@google.com> |
Split profile recording from jit compilation We still use ProfileInfo objects to record profile information. That gives us the flexibility to add the inline caches in the future and the convenience of the already implemented GC. If UseJIT is false and SaveProfilingInfo true, we will only record the ProfileInfo and never launch compilation tasks. Bug: 27916886 Change-Id: I6e4768dc5d58f2f85f947b276b4244aa11ce3fca
|
c393d63aa2b8f6984672fdd4de631bbeff14b6a2 |
|
15-Apr-2016 |
Alexandre Rames <alexandre.rames@linaro.org> |
Fix: correctly destruct VIXL labels. (cherry picked from commit c01a66465a398ad15da90ab2bdc35b7f4a609b17) Bug: 27505766 Change-Id: I077465e3d308f4331e7a861902e05865f9d99835
|
d1ee80948144526b985afb44a0574248cf7da58a |
|
13-Apr-2016 |
Vladimir Marko <vmarko@google.com> |
Move Assemblers to the Arena. And clean up some APIs to return std::unique_ptr<> instead of raw pointers that don't communicate ownership. (cherry picked from commit 93205e395f777c1dd81d3f164cf9a4aec4bde45f) Bug: 27505766 Change-Id: I3017302307a0253d661240750298802fb0d9585e
|
dee58d6bb6d567fcd0c4f39d8d690c3acaf0e432 |
|
07-Apr-2016 |
David Brazdil <dbrazdil@google.com> |
Revert "Revert "Refactor HGraphBuilder and SsaBuilder to remove HLocals"" This patch merges the instruction-building phases from HGraphBuilder and SsaBuilder into a single HInstructionBuilder class. As a result, it is not necessary to generate HLocal, HLoadLocal and HStoreLocal instructions any more, as the builder produces SSA form directly. Saves 5-15% of arena-allocated memory (see bug for more data): GMS 20.46MB => 19.26MB (-5.86%) Maps 24.12MB => 21.47MB (-10.98%) YouTube 28.60MB => 26.01MB (-9.05%) This CL fixed an issue with parsing quickened instructions. Bug: 27894376 Bug: 27998571 Bug: 27995065 Change-Id: I20dbe1bf2d0fe296377478db98cb86cba695e694
|
40ecb12f8eeb97b810e11f895278abbf7988ed4d |
|
06-Apr-2016 |
Vladimir Marko <vmarko@google.com> |
Optimizing: Fix codegens for MethodLoadKind::kDexCacheViaMethod. Use the original method index instead of the target method index because the target method may point to a different dex file. No regression test: this currently happens only if the codegen uses the kDexCacheViaMethod as a fallback for another load kind and we aim to avoid that fallback, so it would be difficult to write a reliable regression test. We could try and exploit current fallbacks for irreducible loops on x86 and arm but those fallbacks will eventually disappear anyway. Bug: 28036230 Change-Id: I4cc9e046480d3d60a7fb521f0ca6a98914625cdc
|
60328910cad396589474f8513391ba733d19390b |
|
04-Apr-2016 |
David Brazdil <dbrazdil@google.com> |
Revert "Refactor HGraphBuilder and SsaBuilder to remove HLocals" Bug: 27995065 This reverts commit e3ff7b293be2a6791fe9d135d660c0cffe4bd73f. Change-Id: I5363c7ce18f47fd422c15eed5423a345a57249d8
|
e3ff7b293be2a6791fe9d135d660c0cffe4bd73f |
|
02-Mar-2016 |
David Brazdil <dbrazdil@google.com> |
Refactor HGraphBuilder and SsaBuilder to remove HLocals This patch merges the instruction-building phases from HGraphBuilder and SsaBuilder into a single HInstructionBuilder class. As a result, it is not necessary to generate HLocal, HLoadLocal and HStoreLocal instructions any more, as the builder produces SSA form directly. Saves 5-15% of arena-allocated memory (see bug for more data): GMS 20.46MB => 19.26MB (-5.86%) Maps 24.12MB => 21.47MB (-10.98%) YouTube 28.60MB => 26.01MB (-9.05%) Bug: 27894376 Change-Id: Iefe28d40600c169c5d306fd2c77034ae19476d90
|
cac5a7e871f1f346b317894359ad06fa7bd67fba |
|
22-Feb-2016 |
Vladimir Marko <vmarko@google.com> |
Optimizing: Improve const-string code generation. For strings in the boot image, use either direct pointers or pc-relative addresses. For other strings, use PC-relative access to the dex cache arrays for AOT and direct address of the string's dex cache slot for JIT. For aosp_flounder-userdebug: - 32-bit boot.oat: -692KiB (-0.9%) - 64-bit boot.oat: -948KiB (-1.1%) - 32-bit dalvik cache total: -900KiB (-0.9%) - 64-bit dalvik cache total: -3672KiB (-1.5%) (contains more files than the 32-bit dalvik cache) For aosp_flounder-userdebug forced to compile PIC: - 32-bit boot.oat: -380KiB (-0.5%) - 64-bit boot.oat: -928KiB (-1.0%) - 32-bit dalvik cache total: -468KiB (-0.4%) - 64-bit dalvik cache total: -1928KiB (-0.8%) (contains more files than the 32-bit dalvik cache) Bug: 26884697 Change-Id: Iec7266ce67e6fedc107be78fab2e742a8dab2696
|
5b5b9319ff970979ed47d41a41283e4faeffb602 |
|
22-Mar-2016 |
Roland Levillain <rpl@google.com> |
Fix and improve shift and rotate operations. - Define maximum int and long shift & rotate distances as int32_t constants, as shift & rotate distances are 32-bit integer values. - Consider the (long, long) inputs case as invalid for static evaluation of shift & rotate operations. - Add more checks in shift & rotate operations constructors as well as in art::GraphChecker. Change-Id: I754b326c3a341c9cc567d1720b327dad6fcbf9d6
|
1a65388f1d86bb232c2e44fecb44cebe13105d2e |
|
18-Mar-2016 |
Roland Levillain <rpl@google.com> |
Clean up art::HConstant predicates. - Make the difference between arithmetic zero and zero-bit pattern unambiguous. - Introduce Boolean predicates in art::HIntConstant for when they are used as Booleans. - Introduce arithmetic positive and negative zero predicates for floating-point constants. Bug: 27639313 Change-Id: Ia04ecc6f6aa7450136028c5362ed429760c883bd
|
d28f4a00933a4a3b8d5e9db73b8532924d0f989d |
|
14-Mar-2016 |
David Srbecky <dsrbecky@google.com> |
Generate native debug stackmaps before calls as well. The debugger looks up PC of the call instruction, so the runtime's stackmap is not sufficient since it is at PC after the instruction. Change-Id: I0dd06c0b52e8079ea5d064ea10beb12c93584092
|
a5c4a4060edd03eda017abebc85f24cffb083ba7 |
|
15-Mar-2016 |
Roland Levillain <rpl@google.com> |
Make art::HCompare support boolean, byte, short and char inputs. Also extend tests covering the IntegerSignum, LongSignum, IntegerCompare and LongCompare intrinsics and their translation into an art::HCompare instruction. Bug: 27629913 Change-Id: I0afc75ee6e82602b01ec348bbb36a08e8abb8bb8
|
2ae48182573da7087bffc2873730bc758ec29696 |
|
16-Mar-2016 |
Calin Juravle <calin@google.com> |
Clean up NullCheck generation and record stats about it. This removes redundant code from the generators and allows for easier stat recording. Change-Id: Iccd4368f9e9d87a6fecb863dee4e2145c97851c4
|
e5671618d19489ad0781ec0d204c7765317170cf |
|
16-Mar-2016 |
Nicolas Geoffray <ngeoffray@google.com> |
Accept boolean as an input of HDivZeroCheck. All our arithmetic operations accept it. bug:27624718 Change-Id: I1f6bb95dc77ecb3fb2fcabb35a93b31c524bfa0a
|
7fc6350f6f1ab04b52b9cd7542e0790528296cbe |
|
09-Feb-2016 |
Artem Serov <artem.serov@linaro.org> |
Integrate BitwiseNegated into shared framework. Share implementation between arm and arm64. Change-Id: I0dd12e772cb23b4c181fd0b1e2a447470b1d8702
|
a1de9188a05afdecca8cd04ecc4fefbac8b9880f |
|
25-Feb-2016 |
Vladimir Marko <vmarko@google.com> |
Optimizing: Reduce memory usage of HInstructions. Pack narrow fields and flags into a single 32-bit field. Change-Id: Ib2f7abf987caee0339018d21f0d498f8db63542d
|
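The packing idea in the commit above can be illustrated with a minimal sketch. `PackedField`, `PackedInstruction`, and the chosen bit positions are hypothetical names for illustration and are not taken from the ART sources; the sketch only shows how several narrow fields and flags can share one 32-bit word.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper: a field of kSize bits starting at bit kPosition
// inside a shared 32-bit word.
template <typename T, size_t kPosition, size_t kSize>
struct PackedField {
  static_assert(kSize < 32 && kPosition + kSize <= 32, "field must fit in 32 bits");
  static constexpr uint32_t kMask = ((1u << kSize) - 1u) << kPosition;

  // Extract this field from the packed word.
  static constexpr T Decode(uint32_t packed) {
    return static_cast<T>((packed & kMask) >> kPosition);
  }
  // Return the packed word with only this field replaced.
  static constexpr uint32_t Update(uint32_t packed, T value) {
    return (packed & ~kMask) | ((static_cast<uint32_t>(value) << kPosition) & kMask);
  }
};

enum class Type : uint8_t { kInt = 0, kLong = 1, kFloat = 2, kDouble = 3 };

// Illustrative instruction that stores a 2-bit type and a 1-bit flag
// in a single 32-bit member instead of two separate members.
class PackedInstruction {
 public:
  using TypeField = PackedField<Type, 0, 2>;
  using CanThrowFlag = PackedField<bool, 2, 1>;

  Type GetType() const { return TypeField::Decode(packed_); }
  void SetType(Type type) { packed_ = TypeField::Update(packed_, type); }
  bool CanThrow() const { return CanThrowFlag::Decode(packed_); }
  void SetCanThrow(bool value) { packed_ = CanThrowFlag::Update(packed_, value); }

 private:
  uint32_t packed_ = 0;
};
```

Updating one field leaves the others untouched because each update masks only its own bits.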
9ff0d205fd60cba6753a91f613b198ca2d67f04d |
|
11-Jan-2016 |
Kevin Brodsky <kevin.brodsky@linaro.org> |
Optimizing: ARM64 negated bitwise operations simplification Use negated instructions on ARM64 to replace [bitwise operation + not] patterns, that is: a & ~b (BIC) a | ~b (ORN) a ^ ~b (EON) The simplification only happens if the Not is only used by the bitwise operation. It does not happen if both inputs are Not's (this should be handled by a generic simplification applying De Morgan's laws). Change-Id: I0e112b23fd8b8e10f09bfeff5994508a8ff96e9c
|
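As a small illustration of the three patterns named in the commit above, the following sketch spells out what each negated instruction computes. The function names mirror the ARM64 mnemonics but are plain C++ helpers written for this example, not ART or VIXL APIs.

```cpp
#include <cstdint>

// a & ~b, the value computed by a single BIC instruction.
constexpr uint32_t Bic(uint32_t a, uint32_t b) { return a & ~b; }
// a | ~b, the value computed by a single ORN instruction.
constexpr uint32_t Orn(uint32_t a, uint32_t b) { return a | ~b; }
// a ^ ~b, the value computed by a single EON instruction.
constexpr uint32_t Eon(uint32_t a, uint32_t b) { return a ^ ~b; }

// When both inputs are negated, De Morgan's laws apply instead:
// ~a & ~b == ~(a | b), which needs one Not rather than two.
static_assert((~0x0Cu & ~0x0Au) == ~(0x0Cu | 0x0Au), "De Morgan for AND");
static_assert((~0x0Cu | ~0x0Au) == ~(0x0Cu & 0x0Au), "De Morgan for OR");
```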
4a0dad67867f389e01a5a6c0fe381d210f687c0d |
|
25-Jan-2016 |
Artem Udovichenko <artem.u@samsung.com> |
Revert "Revert "ARM/ARM64: Extend support of instruction combining."" This reverts commit 6b5afdd144d2bb3bf994240797834b5666b2cf98. Change-Id: Ic27a10f02e21109503edd64e6d73d1bb0c6a8ac6
|
9cd6d378bd573cdc14d049d32bdd22a97fa4d84a |
|
09-Feb-2016 |
David Srbecky <dsrbecky@google.com> |
Associate slow paths with the instruction that they belong to. Almost all slow paths already know the instruction they belong to, this CL just moves the knowledge to the base class as well. This is needed to be able to get the corresponding dex pc for slow path, which allows us to generate better native line numbers, which in turn fixes some native debugging stepping issues. Change-Id: I568dbe78a7cea6a43a4a71a014b3ad135782c270
|
c7098ff991bb4e00a800d315d1c36f52a9cb0149 |
|
09-Feb-2016 |
David Srbecky <dsrbecky@google.com> |
Remove HNativeDebugInfo from start of basic blocks. We do not require full environment at the start of basic block. The dex pc contained in basic block is sufficient for line mapping. Change-Id: I5ba9e5f5acbc4a783ad544769f9a73bb33e2bafa
|
c0b601b5e4c1add5eefd45f2f4d2c376a20ba4d4 |
|
08-Feb-2016 |
David Brazdil <dbrazdil@google.com> |
ART: Implement HSelect with CSEL/FCSEL on arm64 Change-Id: I549af0cba3c5048066a2d1206b78a70b496d349e
|
6e332529c33be4d7dae5dad3609a839f4c0d3bfc |
|
02-Feb-2016 |
David Brazdil <dbrazdil@google.com> |
ART: Remove HTemporary Change-Id: I21b984224370a9ce7a4a13a9652503cfb03c5f03
|
ca0bf0349f8da35b284df49732e30eeb62591034 |
|
09-Feb-2016 |
Roland Levillain <rpl@google.com> |
Fix ARM64 Baker's read barrier fast path for ArraySet. Do not exhaust the pool of scratch (temporary) registers gratuitously when emitting an instrumented array load with a large constant index. Bug: 26817006 Bug: 12687968 Change-Id: I65a4fe676aa3c9e2c8d7e26195d9af6432c83ff9
|
a19616e3363276e7f2c471eb2839fb16f1d43f27 |
|
02-Feb-2016 |
Aart Bik <ajcbik@google.com> |
Implemented compare/signum intrinsics as HCompare (with all code generation for all) Rationale: At HIR level, many more optimizations are possible, while ultimately generated code can take advantage of full semantics. Change-Id: I6e2ee0311784e5e336847346f7f3c4faef4fd17e
|
a42363f79832a6e14f348514664dc6dc3edf9da2 |
|
17-Dec-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Implement first kind of polymorphic inlining. Add HClassTableGet to fetch an ArtMethod from the vtable or imt, and compare it to the only method the profiling saw. Change-Id: I76afd3689178f10e3be048aa3ac9a97c6f63295d
|
74eb1b264691c4eb399d0858015a7fc13c476ac6 |
|
14-Dec-2015 |
David Brazdil <dbrazdil@google.com> |
ART: Implement HSelect This patch adds a new HIR instruction to Optimizing. HSelect returns one of two inputs based on the outcome of a condition. This is only initial implementation which: - defines the new instruction, - repurposes BooleanSimplifier to emit it, - extends InstructionSimplifier to statically resolve it, - updates existing code and tests accordingly. Code generators currently emit fallback if/then/else code and will be updated in follow-up CLs to use platform-specific conditional moves when possible. Change-Id: Ib61b17146487ebe6b55350c2b589f0b971dcaaee
|
b3e773eea39a156b3eacf915ba84e3af1a5c14fa |
|
26-Jan-2016 |
David Brazdil <dbrazdil@google.com> |
ART: Implement support for instruction inlining Optimizing HIR contains 'non-materialized' instructions which are emitted at their use sites rather than their defining sites. This was not properly handled by the liveness analysis which did not adjust the use positions of the inputs of such instructions. Despite the analysis being incorrect, the current use cases never produce incorrect code. This patch generalizes the concept of inlined instructions and updates liveness analysis to compute the use positions correctly. Change-Id: Id703c154b20ab861241ae5c715a150385d3ff621
|
4a6a67ca93289b232a620bdf8bf30ff8b7b0b428 |
|
27-Jan-2016 |
Serban Constantinescu <serban.constantinescu@linaro.org> |
Remove unused DMB code paths in the ARM64 Optimizing Compiler Currently all ARM64 CPUs will be using the acquire-release code paths. This patch removes the instruction set feature PreferAcquireRelease() as well as all the unused DMB code paths. Change-Id: I61c320d6d685f96c9e260f25eac3593907793830 Signed-off-by: Serban Constantinescu <serban.constantinescu@linaro.org>
|
44015868a5ed9f6915d510ade42e84949b719e3a |
|
22-Jan-2016 |
Roland Levillain <rpl@google.com> |
Revert "Revert "ARM64 Baker's read barrier fast path implementation."" This reverts commit 28a2ff0bd6c30549f3f6465d8316f5707b1d072f. Bug: 12687968 Change-Id: I6e25c70f303368629cdb1084f1d7039261cbb79a
|
6b5afdd144d2bb3bf994240797834b5666b2cf98 |
|
22-Jan-2016 |
Nicolas Geoffray <ngeoffray@google.com> |
Revert "ARM/ARM64: Extend support of instruction combining." The test fails its checker parts. This reverts commit debeb98aaa8950caf1a19df490f2ac9bf563075b. Change-Id: I49929e15950c7814da6c411ecd2b640d12de80df
|
28a2ff0bd6c30549f3f6465d8316f5707b1d072f |
|
21-Jan-2016 |
Mathieu Chartier <mathieuc@google.com> |
Revert "ARM64 Baker's read barrier fast path implementation." This reverts commit c8f1df9965ca7f97ba9e6289f8c7a717765a59a9. This breaks master. Change-Id: Ic07f602af8732e2835bd11f65e3b9e766d3349c7
|
debeb98aaa8950caf1a19df490f2ac9bf563075b |
|
11-Dec-2015 |
Ilmir Usmanov <i.usmanov@samsung.com> |
ARM/ARM64: Extend support of instruction combining. Combine multiply instructions in the following way: ARM64: MUL/NEG -> MNEG ARM32 (32-bit integers only): MUL/ADD -> MLA MUL/SUB -> MLS Change-Id: If20f2d8fb060145ab6fbceeb5a8f1a3d02e0ecdb
|
086d27e2ef9d11138f8832190d09a56e72346f15 |
|
21-Jan-2016 |
Aart Bik <ajcbik@google.com> |
Fix missing case in ARM64 codegen. Rationale: Rather than excluding conditions that are not handled, the optimized code for a zero right-hand side now lists the handled conditions explicitly. bug=26689526 Change-Id: I636e01548659c579d9e318f07bda2c24a12371e5
|
c8f1df9965ca7f97ba9e6289f8c7a717765a59a9 |
|
20-Jan-2016 |
Roland Levillain <rpl@google.com> |
ARM64 Baker's read barrier fast path implementation. Introduce an ARM64 fast path implementation in Optimizing for Baker's read barriers (for both heap reference loads and GC root loads). The marking phase of the read barrier is performed by a slow path, invoking the runtime entry point artReadBarrierMark. Other read barrier algorithms continue to use the original slow path based implementation, which has been renamed as GenerateReadBarrierSlow/GenerateReadBarrierForRootSlow. Bug: 12687968 Bug: 26601270 Change-Id: I60da15249b58a8ee1a065ed9be2c4e438ee17150
|
58282f4510961317b8d5a364a6f740a78926716f |
|
14-Jan-2016 |
David Brazdil <dbrazdil@google.com> |
ART: Remove Baseline compiler We don't need Baseline any more and it hasn't been maintained for a while anyway. Let's remove it. Change-Id: I442ed26855527be2df3c79935403a25b1ee55df6
|
d6e069b16a7d4964e546daf3d340ea11756ab090 |
|
18-Jan-2016 |
Vladimir Marko <vmarko@google.com> |
Optimizing: Improve floating point comparisons on arm and arm64. Avoid the extra check for unordered inputs by using the appropriate arm/arm64 condition. Change-Id: Ib5e775a90428db7a2cf377ad9fd6a3192d670617
|
cd3d0fb5a4c113cfdb610454d133762a2ab0e6de |
|
15-Jan-2016 |
Roland Levillain <rpl@google.com> |
Do not use HArm64IntermediateAddress with read barriers. This ARM64 instruction simplification does not yet work correctly with the read barrier compiler instrumentation. Bug: 26601270 Bug: 12687968 Change-Id: I0c3c5d0043ebd936e00984740efbae8b3025c7ca
|
6de1938e562b0d06e462512dd806166e754035ea |
|
08-Jan-2016 |
David Brazdil <dbrazdil@google.com> |
ART: Remove incorrect HFakeString optimization Simplification of HFakeString assumes that it cannot be used until String.<init> is called which is not true and causes different behaviour between the compiler and the interpreter. This patch removes the optimization together with the HFakeString instruction. Instead, HNewInstance is generated and an empty String allocated until it is replaced with the result of the StringFactory call. This is consistent with the behaviour of the interpreter but is too conservative. A follow-up CL will attempt to optimize out the initial allocation when possible. Bug: 26457745 Bug: 26486014 Change-Id: I7139e37ed00a880715bfc234896a930fde670c44
|
42249c3602c3d0243396ee3627ffb5906aa77c1e |
|
08-Jan-2016 |
Aart Bik <ajcbik@google.com> |
Reduce code size by sharing slow paths. Rationale: Sharing identical slow path code reduces code size. Background: Currently, slow paths with the same dex-pc, same physical register spilling code, and identical stack maps are shared (making this only useful for deopt slow paths). The newly introduced mechanism is sufficiently general to allow future improvements by e.g. allowing different dex-pc (by passing this to runtime) or even the kind of slow paths (by passing runtime addresses to the slowpath). Change-Id: I819615c47b4fd98440a241f681f93e4fc22d12e0
|
b7070a2db8b0b7eca14f01f932be305be64ded57 |
|
08-Jan-2016 |
David Srbecky <dsrbecky@google.com> |
Generate Nops to ensure that debug stack maps have distinct PC. Change-Id: I5740ec958a20d236634b66df0e675382ed5c16fc
|
68f6289fbc1b14ed814722c023b3f343c1e59a79 |
|
04-Jan-2016 |
Nicolas Geoffray <ngeoffray@google.com> |
Don't use std::abs on INT_MIN/LONG_MIN, it's undefined. bug:25494265 Change-Id: I560a3a589b92440020285f9adfdf7c9efb06217c
|
0cf4493166ff28518c8eafa2d0463f6e817cce75 |
|
09-Dec-2015 |
David Srbecky <dsrbecky@google.com> |
Generate more stack maps during native debugging. Generate extra stack map at the start of each java statement. The stack maps are later translated to DWARF which allows LLDB to set breakpoints and view local variables. Change-Id: If00ab875513308e4a1399d1e12e0fe8934a6f0c3
|
5f7b58ea1adfc0639dd605b65f59198d3763f801 |
|
23-Nov-2015 |
Vladimir Marko <vmarko@google.com> |
Rewrite HInstruction::Is/As<type>(). Make Is<type>() and As<type>() non-virtual for concrete instruction types, relying on GetKind(), and mark GetKind() as PURE to improve optimization opportunities. This reduces the number of relocations in libart-compiler.so's .rel.dyn section by ~4K, or ~44%, and in .data.rel.ro by ~18K, or ~65%. The file is 96KiB smaller for Nexus 5, including 8KiB reduction of the .text section. Unfortunately, the g++/clang++ __attribute__((pure)) is not strong enough to avoid duplicated virtual calls and we would need the C++ [[pure]] attribute proposed in n3744 instead. To work around this deficiency, we introduce an extra non-virtual indirection for GetKind(), so that the compiler can optimize common expressions such as instruction->IsAdd() || instruction->IsSub() or instruction->IsAdd() && instruction->AsAdd()->... which contain two virtual calls to GetKind() after inlining. Change-Id: I83787de0671a5cb9f5b0a5f4a536cef239d5b401
|
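The shape described in the commit above can be sketched as follows. The class names and the `GetKindInternal` name are illustrative assumptions, and the sketch leaves out the `PURE` annotation the commit mentions; it only shows non-virtual `Is<type>()` and `As<type>()` helpers built on a single non-virtual `GetKind()` that forwards to one virtual call.

```cpp
enum class Kind { kAdd, kSub };

class Add;
class Sub;

class Instruction {
 public:
  virtual ~Instruction() = default;

  // Non-virtual entry point; the only virtual call is hidden behind it.
  Kind GetKind() const { return GetKindInternal(); }

  // Non-virtual type tests, expressed purely in terms of GetKind().
  bool IsAdd() const { return GetKind() == Kind::kAdd; }
  bool IsSub() const { return GetKind() == Kind::kSub; }

  // Checked downcasts: null when the kind does not match.
  const Add* AsAdd() const;
  const Sub* AsSub() const;

 private:
  virtual Kind GetKindInternal() const = 0;
};

class Add final : public Instruction {
 private:
  Kind GetKindInternal() const override { return Kind::kAdd; }
};

class Sub final : public Instruction {
 private:
  Kind GetKindInternal() const override { return Kind::kSub; }
};

inline const Add* Instruction::AsAdd() const {
  return IsAdd() ? static_cast<const Add*>(this) : nullptr;
}

inline const Sub* Instruction::AsSub() const {
  return IsSub() ? static_cast<const Sub*>(this) : nullptr;
}
```

With this layout each concrete class needs one virtual table slot for the kind query instead of a pair of virtual `Is`/`As` functions per instruction type, which is consistent with the relocation savings the commit reports.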
f3e0ee27f46aa6434b900ab33f12cd3157578234 |
|
17-Dec-2015 |
Vladimir Marko <vmarko@google.com> |
Revert "Revert "ART: Reduce the instructions generated by packed switch."" This reverts commit b4c137630fd2226ad07dfd178ab15725374220f1. The underlying issue was fixed by https://android-review.googlesource.com/188271 . Bug: 26121945 Change-Id: I58b08eb1a9f0a5c861f8cda93522af64bcf63920
|
b4c137630fd2226ad07dfd178ab15725374220f1 |
|
16-Dec-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Revert "ART: Reduce the instructions generated by packed switch." This reverts commit 59f054d98f519a3efa992b1c688eb97bdd8bbf55. bug:26121945 Change-Id: I8a5ad7ef1f1de8d44787c27528fa3f7f5c2e9cd3
|
351dddf4025f07477161209e374741f089d97cb4 |
|
11-Dec-2015 |
Vladimir Marko <vmarko@google.com> |
Optimizing: Clean up after HRor. Change-Id: I96bd7fa2e8bdccb87a3380d063dad0dd57fed9d7
|
40a04bf64e5837fa48aceaffe970c9984c94084a |
|
11-Dec-2015 |
Scott Wakeling <scott.wakeling@linaro.org> |
Replace rotate patterns and invokes with HRor IR. Replace constant and register version bitfield rotate patterns, and rotateRight/Left intrinsic invokes, with new HRor IR. Where k is constant and r is a register, with the UShr and Shl on either side of a |, +, or ^, the following patterns are replaced: x >>> #k OP x << #(reg_size - k) x >>> #k OP x << #-k x >>> r OP x << (#reg_size - r) x >>> (#reg_size - r) OP x << r x >>> r OP x << -r x >>> -r OP x << r Implemented for ARM/ARM64 & X86/X86_64. Tests changed to not be inlined to prevent optimization from folding them out. Additional tests added for constant rotate amounts. Change-Id: I5847d104c0a0348e5792be6c5072ce5090ca2c34
|
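The rotate patterns listed in the commit above can be checked with a short sketch. It models Java's 32-bit shift semantics, where only the low five bits of the distance are used, and compares two of the listed patterns with a direct rotate. `UShr`, `Shl`, `Ror`, and the two `Pattern...` helpers are names made up for this example, and distances are restricted to 1..31 here so that the two shifted halves never overlap.

```cpp
#include <cstdint>

// Java-style >>> on a 32-bit value: the distance is masked to its low 5 bits.
constexpr uint32_t UShr(uint32_t x, int32_t dist) {
  return x >> (static_cast<uint32_t>(dist) & 31u);
}

// Java-style << on a 32-bit value, with the same distance masking.
constexpr uint32_t Shl(uint32_t x, int32_t dist) {
  return x << (static_cast<uint32_t>(dist) & 31u);
}

// Reference rotate right; the inner mask keeps the left shift below 32.
constexpr uint32_t Ror(uint32_t x, int32_t dist) {
  return (x >> (static_cast<uint32_t>(dist) & 31u)) |
         (x << ((32u - (static_cast<uint32_t>(dist) & 31u)) & 31u));
}

// x >>> k | x << (reg_size - k)
constexpr uint32_t PatternRegSizeMinusK(uint32_t x, int32_t k) {
  return UShr(x, k) | Shl(x, 32 - k);
}

// x >>> k | x << -k
constexpr uint32_t PatternNegK(uint32_t x, int32_t k) {
  return UShr(x, k) | Shl(x, -k);
}

// Ror(0x80000001, 1) moves the low bit to the top: 0xC0000000.
static_assert(Ror(0x80000001u, 1) == 0xC0000000u, "rotate right by one");
```

For distances 1..31 the two shifted halves occupy disjoint bits, which is why `+` and `^` give the same result as `|` in these patterns.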
917d01680714b2295f109f8fea0aa06764a30b70 |
|
24-Nov-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Don't generate a slow path for strings in the dex cache. Change-Id: I1d258f1a89bf0ec7c7ddd134be9215d480f0b09a
|
59f054d98f519a3efa992b1c688eb97bdd8bbf55 |
|
07-Dec-2015 |
Zheng Xu <zheng.xu@linaro.org> |
ART: Reduce the instructions generated by packed switch. Implement Vladimir Marko's suggestion. The new compare/jump series reduce the number of instructions from (2*n+1) to (1.5*n+3). Generate normal compare/jump series when numEntries <= 3. Generate optimal compare/jump series when numEntries <= threshold. Generate jump tables otherwise. Change-Id: I425547b6787057c7fa84e71f17c145b63b208633
|
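The two instruction-count formulas quoted in the commit above can be compared directly. The sketch below only evaluates the formulas as stated, (2*n+1) and (1.5*n+3); the function names are hypothetical, and the jump-table threshold is not modeled because the commit does not give its value.

```cpp
// Instruction count of the plain compare/jump series, per the commit message.
constexpr double NaiveCompareJumpCount(int num_entries) {
  return 2.0 * num_entries + 1.0;
}

// Instruction count of the improved compare/jump series, per the commit message.
constexpr double ImprovedCompareJumpCount(int num_entries) {
  return 1.5 * num_entries + 3.0;
}

// 2n + 1 <= 1.5n + 3 exactly when n <= 4, so the plain series is no worse
// for very small switches and the improved series wins as n grows.
static_assert(NaiveCompareJumpCount(3) < ImprovedCompareJumpCount(3), "7 < 7.5");
static_assert(NaiveCompareJumpCount(4) == ImprovedCompareJumpCount(4), "9 == 9");
static_assert(NaiveCompareJumpCount(8) > ImprovedCompareJumpCount(8), "17 > 15");
```

This matches the commit's choice to keep the normal series when numEntries <= 3.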
e523423a053af5cb55837f07ceae9ff2fd581712 |
|
02-Dec-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Revert "Revert "Don't use the compiler driver for method resolution."" This reverts commit c88ef3a10c474045a3476a02ae75d07ddd3230b7. Change-Id: I0ed88a48b313a8d28bc39fae40631123aadb13ef
|
c88ef3a10c474045a3476a02ae75d07ddd3230b7 |
|
01-Dec-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Revert "Don't use the compiler driver for method resolution." Fails 425 in debuggable mode. This reverts commit 4db0bf9c4db6a09716c3388b7d2f88d534470339. Change-Id: I346df8f75674564fc4fb241c60f23e250fc7f0a7
|
4db0bf9c4db6a09716c3388b7d2f88d534470339 |
|
23-Nov-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Don't use the compiler driver for method resolution. The compiler driver makes assumptions that don't hold for the optimizing compiler, and will for example always go to slow path for an invoke-super when there's no verified method. Also fix GenerateInvokeVirtual in the presence of intrinsics. Next change will address some of the TODOs in sharpening.cc. Change-Id: I2b0e543ee9b9bebcadb2d26de29e850c59ad58b9
|
8626b741716390a0119ffeb88b5b9fcf08e13010 |
|
25-Nov-2015 |
Alexandre Rames <alexandre.rames@linaro.org> |
ARM64: Use the shifter operands. This introduces architecture-specific instruction simplification. On ARM64 we try to merge shifts and sign-extension operations into arithmetic and logical instructions. For example for the Java code int res = a + (b << 5); we would generate lsl w3, w2, #5 add w0, w1, w3 and we now generate add w0, w1, w2, lsl #5 Change-Id: Ic03bdff44a1c12e21ddff1b0513bd32a730742b7
|
42e372e5a34d0fef88007bc5f40dd0fc7c03b58b |
|
24-Nov-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Optimize HLoadClass when we know the class is in the cache. Change-Id: Iaa74591eed0f2eabc9ba9f9988681d9582faa320
|
22ccc3a93d32fa6991535eaebb17daf5abaf4ebf |
|
24-Nov-2015 |
Roland Levillain <rpl@google.com> |
ARM64 read barrier support for concurrent GC in Optimizing. This first implementation uses slow paths to instrument heap reference loads and GC root loads for the concurrent copying collector, respectively calling the artReadBarrierSlow and artReadBarrierForRootSlow runtime entry points. Notes: - This implementation does not instrument HInvokeVirtual nor HInvokeInterface instructions (for class reference loads), as the corresponding read barriers are not strictly required with the current concurrent copying collector. - Intrinsics which may eventually call (on slow path) are disabled when read barriers are enabled, as the current slow path infrastructure does not support this case. - When read barriers are enabled, the code generated for a HArraySet instruction always goes into the array set slow path for object arrays (delegating the operation to the runtime), as we are lacking a mechanism to keep a temporary register live across a runtime call (needed for the instrumentation of type checking code, which requires two successive read barriers). Bug: 12687968 Change-Id: Icfb74f67bf23ae80e7723ee6a0c9ff34ba325d48
|
888d067a67640e7d9fc349b0451dfe845acad562 |
|
23-Nov-2015 |
Roland Levillain <rpl@google.com> |
Revamp art::CheckEntrypointTypes uses. Change-Id: I6e13e594539e766ed94524ac3282cec292ba91da
|
729645a937eb9f04a311b3c22471dcf3ebe9bcec |
|
19-Nov-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Explicitly add HLoadClass/HClinitCheck for HNewInstance. bug:25735083 bug:25173758 Change-Id: Ie81cfa4fa9c47cc025edb291cdedd7af209a03db
|
418318f4d50e0cfc2d54330d7623ee030d4d727d |
|
20-Nov-2015 |
Alexandre Rames <alexandre.rames@linaro.org> |
ARM64: Add support for multiply-accumulate. Change-Id: I88dc313df520480f3fd16bbabda27f9435d25368
|
c53c0797a78a89d637e4230503cc1feb27e855a8 |
|
19-Nov-2015 |
Vladimir Marko <vmarko@google.com> |
Clean up the special input in HInvokeStaticOrDirect. Change-Id: I4042aefbdac1a8c236d00e2e7145349a64f6486b
|
3927c8b8361336f1b16aae6eb2ed7577b20560f4 |
|
18-Nov-2015 |
Zheng Xu <zheng.xu@linaro.org> |
Opt compiler: Arm64 packed-switch jump tables. In this patch, we set a rough threshold and only generate a jump table when the graph contains a limited number of HIRs. This is because current VIXL can only handle Adr with a label in the range of +/-1MB. Change-Id: I42bff2095ec26caeacc5efc90afebe34e229b518
|
0debae7bc89eb05f7a2bf7dccd223318fad7c88d |
|
12-Nov-2015 |
David Brazdil <dbrazdil@google.com> |
ART: Refactor GenerateTestAndBranch Each code generator implements a method for generating condition evaluation and branching to arbitrary labels. This patch refactors it for better clarity but also to generate fewer jumps when the true branch is the fallthrough successor. This is preliminary work for implementing HSelect. Change-Id: Iaa545a5ecbacb761c5aa241fa69140cf6eb5952f
|
6dc01748c61a7ad41d4ab701d3e27897bd39a899 |
|
12-Nov-2015 |
Alexandre Rames <alexandre.rames@linaro.org> |
Minor fixes and cleaning of arm64 static and direct calls code. Fixes: The proper way to prevent the MacroAssembler from generating code before or after an instruction is to block the pools (usually via `vixl::BlockPoolsScope`). Here we can use `vixl::SingleEmissionCheckScope`, which checks that we generate only one instruction and also blocks the pools. In practice the current code would have worked fine because VIXL would not have generated anything after `Bl()` or `Ldr()`, but that was not guaranteed. Cleaning: - `XRegisterFrom()` returns an X register. Calling `.X()` is not required. - Since we are sure (after the previous fixes) that nothing will be emitted around the instructions we care about, update the code to bind labels before the instructions for simplicity. Change-Id: I42d49976721e380e66bcd7a5b345f1777009434a
|
0f7dca4ca0be8d2f8776794d35edf8b51b5bc997 |
|
02-Nov-2015 |
Vladimir Marko <vmarko@google.com> |
Optimizing/X86: PC-relative dex cache array addressing. Add PC-relative dex cache array addressing for X86 and use it for better invoke-static/-direct dispatch. Also delay the initialization to the PC-relative base until needed. Change-Id: Ib8634d5edce4920cd70172fd13211809cf6948d1
|
dc151b2346bb8a4fdeed0c06e54c2fca21d59b5d |
|
15-Oct-2015 |
Vladimir Marko <vmarko@google.com> |
Optimizing: Determine invoke-static/-direct dispatch early. Determine the dispatch type of invoke-static/-direct in a special pass right after the type inference. This allows the inliner to pass the "needs dex cache" check and inline more. It also allows the code generator to avoid requesting a register location for the ArtMethod* for kDexCachePcRelative and direct methods. The supported dispatch check handles also situations that the CompilerDriver currently doesn't allow. The cleanup of the CompilerDriver and required changes to Quick will come in a separate change. Change-Id: I3f8e903a119949e95871d8ab0a995f4731a13a07
|
bb245d199a5240b4c520263fd2c8c10dba79eadc |
|
19-Oct-2015 |
Aart Bik <ajcbik@google.com> |
Generalize codegen and simplification of deopt. Rationale: the de-opt instruction is very similar to an if, so the existing assumption that it always has a conditional "under the hood" is very unsafe, since optimizations may have replaced conditionals with actual values; this CL generalizes handling of deopt. Change-Id: I1c6cb71fdad2af869fa4714b38417dceed676459
|
e6dbf48d7a549e58a3d798bbbdc391e4d091b432 |
|
19-Oct-2015 |
Alexandre Rames <alexandre.rames@linaro.org> |
ARM64: Instruction simplification for array accesses. HArrayGet and HArraySet with variable indexes generate two instructions on arm64, like add temp, obj, #data_offset ldr out, [temp, index LSL #shift_amount] When we have multiple accesses to the same array, the initial `add` instruction is redundant. This patch introduces the first instruction simplification in the arm64-specific instruction simplification pass. It splits HArrayGet and HArraySet using the new arm64-specific IR HIntermediateAddress. After that we run GVN again to squash the multiple occurrences of HIntermediateAddress. Change-Id: I2e3d12fbb07fed07b2cb2f3f47f99f5a032f8312
|
4b8f1ecd3aa5a29ec1463ff88fee9db365f257dc |
|
26-Aug-2015 |
Roland Levillain <rpl@google.com> |
Use ATTRIBUTE_UNUSED more. Use it in lieu of UNUSED(), which had some incorrect uses. Change-Id: If247dce58b72056f6eea84968e7196f0b5bef4da
|
e9f37600e98ba21308ad4f70d9d68cf6c057bdbe |
|
09-Oct-2015 |
Aart Bik <ajcbik@google.com> |
Added support for unsigned comparisons Rationale: even though not directly supported in input graph, having the ability to express unsigned comparisons in HIR is useful for all sorts of optimizations. Change-Id: I4543c96a8c1895c3d33aaf85685afbf80fe27d72
|
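To make the notion of an unsigned comparison on signed inputs concrete, here is a minimal sketch of one standard way to express it: flipping the sign bit of both operands turns an unsigned ordering into a signed one. `Below` and `kSignBit` are illustrative names, and nothing here is claimed to be how the commit above implements the new conditions.

```cpp
#include <cstdint>
#include <limits>

// The sign bit of a 32-bit two's-complement integer.
constexpr int32_t kSignBit = std::numeric_limits<int32_t>::min();

// Unsigned "a < b" on values stored as int32_t: XOR with the sign bit maps the
// unsigned range [0, 2^32) onto the signed range in order-preserving fashion.
constexpr bool Below(int32_t a, int32_t b) {
  return (a ^ kSignBit) < (b ^ kSignBit);
}

// -1 has the bit pattern 0xFFFFFFFF, the largest unsigned value.
static_assert(Below(1, -1), "1 is below 0xFFFFFFFF when compared unsigned");
static_assert(!Below(-1, 1), "0xFFFFFFFF is not below 1 when compared unsigned");
```

A typical use of such a comparison is a bounds check, where a single unsigned test of index against length covers both the negative and the too-large case.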
ec7802a102d49ab5c17495118d4fe0bcc7287beb |
|
01-Oct-2015 |
Vladimir Marko <vmarko@google.com> |
Add DCHECKs to ArenaVector and ScopedArenaVector. Implement dchecked_vector<> template that DCHECK()s element access and insert()/emplace()/erase() positions. Change the ArenaVector<> and ScopedArenaVector<> aliases to use the new template instead of std::vector<>. Remove DCHECK()s that have now become unnecessary from the Optimizing compiler. Change-Id: Ib8506bd30d223f68f52bd4476c76d9991acacadc
|
580b609cd6cfef46108156457df42254d11e72a7 |
|
06-Oct-2015 |
Calin Juravle <calin@google.com> |
Fix location summary for LoadClass Don't request a register for the current method if we're gonna call the runtime. Change-Id: I9760d15108bd95efb2a34e6eacd84b60841781d7
|
98893e146b0ff0e1fd1d7c29252f1d1e75a163f2 |
|
02-Oct-2015 |
Calin Juravle <calin@google.com> |
Add support for unresolved classes in optimizing. Change-Id: I0e299a81e560eb9cb0737ec46125dffc99333b54
|
ecf680d5e1fe6fcdd57962334a7c7865720503cc |
|
05-Oct-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Block callee save fp registers in debuggable. This is a simple but conservative implementation. We could extend it by using the registers but still saving them before a call and at method entry. bug: 21057237 Change-Id: Ia2e9e0e2efae0b01625e0f4165d0535c4bf9ba62
|
75d5b9bbd48edbe221d00dc85d25093977c6fa41 |
|
05-Oct-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Revert "Don't use floating point callee saves in debuggable." bug:24602865 bug:24605078 This reverts commit 88a95ba893fcda974d492917dd77a9b11693dbf2. Change-Id: Iba97eeab5c2ba725f66cc138f740dac337344828
|
e460d1df1f789c7c8bb97024a8efbd713ac175e9 |
|
29-Sep-2015 |
Calin Juravle <calin@google.com> |
Revert "Revert "Support unresolved fields in optimizing"" The CL also changes the calling convention for 64bit static field set to use kArg2 instead of kArg1. This allows optimizing to keep the assumptions: - arm pairs are always of form (even_reg, odd_reg) - ecx_edx is not used as a register on x86. This reverts commit e6f49b47b6a4dc9c7684e4483757872cfc7ff1a1. Change-Id: I93159917565824084abc96775f31be1a4249f2f3
|
a8a0fe2283b2186198695a2e1c485c205cc12a73 |
|
01-Oct-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Fix another poisoning problem. We were using the wrong temp. Change-Id: Id79d5079cc85f61eb1a45d741a67f24d33e8fa03
|
61b1dbe32e1066d112605c9199370fe88981f1bf |
|
01-Oct-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Fix poisoning bug in arm64. Change-Id: I30ca7f237009d81c9d83fabb6a4c76bf4c74d451
|
88a95ba893fcda974d492917dd77a9b11693dbf2 |
|
30-Sep-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Don't use floating point callee saves in debuggable. The runtime stubs don't save them, so GetVReg and SetVReg won't work on them. Not having callee saves will increase code size and reduce performance of fp-heavy methods. But we need to do it for proper debugging. Change-Id: I40354c29718af49b6b3adf61d724d3bb93680107
|
e0395dd58454e27fc47c0ca273913929fb658e6c |
|
25-Sep-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Optimize ArraySet for x86/x64/arm/arm64. Change-Id: I5bc8c6adf7f82f3b211f0c21067f5bb54dd0c040
|
5233f93ee336b3581ccdb993ff6342c52fec34b0 |
|
29-Sep-2015 |
Vladimir Marko <vmarko@google.com> |
Optimizing: Tag even more arena allocations. Tag previously "Misc" arena allocations with more specific allocation types. Move some native heap allocations to the arena in BCE. Bug: 23736311 Change-Id: If8ef15a8b614dc3314bdfb35caa23862c9d4d25c
|
225b6464a58ebe11c156144653f11a1c6607f4eb |
|
28-Sep-2015 |
Vladimir Marko <vmarko@google.com> |
Optimizing: Tag arena allocations in code generators. And completely remove the deprecated GrowableArray. Replace GrowableArray with ArenaVector in code generators and related classes and tag arena allocations. Label arrays use direct allocations from ArenaAllocator because Label is non-copyable and non-movable and as such cannot be really held in a container. The GrowableArray never actually constructed them, instead relying on the zero-initialized storage from the arena allocator to be correct. We now actually construct the labels. Also avoid StackMapStream::ComputeDexRegisterMapSize() being passed null references, even though unused. Change-Id: I26a46fdd406b23a3969300a67739d55528df8bf4
|
abfcf18fa2fe723bd683edcb685ed5058d9c7cf3 |
|
21-Sep-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Further refinements to checkcast/instanceof. - Use setcc when possible. - Do an exact check in the Object[] case before checking the component type. Change-Id: Ic11c60643af9b41fe4ef2beb59dfe7769bef388f
|
fe57faa2e0349418dda38e77ef1c0ac29db75f4d |
|
18-Sep-2015 |
Mark Mendell <mark.p.mendell@intel.com> |
[optimizing] Add basic PackedSwitch support Add HPackedSwitch, and generate it from the builder. Code generators convert this to a series of compare/branch tests. Better implementation in the code generators as a real jump table will follow as separate CLs. Change-Id: If14736fa4d62809b6ae95280148c55682e856911 Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>
|
85c7bab43d11180d552179c506c2ffdf34dd749c |
|
18-Sep-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Revert "Revert "Optimize code generation of check-cast and instance-of."" This reverts commit 7537437c6a2f89249a48e30effcc27d4e7c5a04f. Change-Id: If759cb08646e47b62829bebc3c5b1e2f2969cf84
|
7537437c6a2f89249a48e30effcc27d4e7c5a04f |
|
17-Sep-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Revert "Optimize code generation of check-cast and instance-of." Failures with libcore tests. This reverts commit 64acf303eaa2f32c0b1d8cfcbf044a822c5eec08. Change-Id: Ie6f323fcf5d86bae5c334c1352bb21f1bad60a88
|
e6f49b47b6a4dc9c7684e4483757872cfc7ff1a1 |
|
17-Sep-2015 |
Calin Juravle <calin@google.com> |
Revert "Support unresolved fields in optimizing" breaks debuggable tests. This reverts commit 23a8e35481face09183a24b9d11e505597c75ebb. Change-Id: I8e60b5c8f48525975f25d19e5e8066c1c94bd2e5
|
64acf303eaa2f32c0b1d8cfcbf044a822c5eec08 |
|
14-Sep-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Optimize code generation of check-cast and instance-of. On x86/x64/arm/arm64. Improve code size of selected apks from 0.3% to 1%, and performance of DeltaBlue by 20%. Change-Id: Ib5799f7a53443cd880a121dd7f21932ae9f5c7aa
|
23a8e35481face09183a24b9d11e505597c75ebb |
|
08-Sep-2015 |
Calin Juravle <calin@google.com> |
Support unresolved fields in optimizing Change-Id: I9941fa5fcb6ef0a7a253c7a0b479a44a0210aad4
|
175dc732c80e6f2afd83209348124df349290ba8 |
|
25-Aug-2015 |
Calin Juravle <calin@google.com> |
Support unresolved methods in Optimizing Change-Id: If2da02b50d2fa668cd58f134a005f1752e7746b1
|
77a48ae01bbc5b05ca009cf09e2fcb53e4c8ff23 |
|
15-Sep-2015 |
David Brazdil <dbrazdil@google.com> |
Revert "Revert "ART: Register allocation and runtime support for try/catch"" The original CL triggered b/24084144 which has been fixed by Ib72e12a018437c404e82f7ad414554c66a4c6f8c. This reverts commit 659562aaf133c41b8d90ec9216c07646f0f14362. Change-Id: Id8980436172457d0fcb276349c4405f7c4110a55
|
659562aaf133c41b8d90ec9216c07646f0f14362 |
|
14-Sep-2015 |
David Brazdil <dbrazdil@google.com> |
Revert "ART: Register allocation and runtime support for try/catch" Breaks libcore test org.apache.harmony.security.tests.java.security.KeyStorePrivateKeyEntryTest#testGetCertificateChain. Need to investigate. This reverts commit b022fa1300e6d78639b3b910af0cf85c43df44bb. Change-Id: Ib24d3a80064d963d273e557a93469c95f37b1f6f
|
b022fa1300e6d78639b3b910af0cf85c43df44bb |
|
20-Aug-2015 |
David Brazdil <dbrazdil@google.com> |
ART: Register allocation and runtime support for try/catch This patch completes a series of CLs that add support for try/catch in the Optimizing compiler. With it, Optimizing can compile all methods containing try/catch, provided they don't contain catch loops. Future work will focus on improving performance of the generated code. SsaLivenessAnalysis was updated to propagate liveness information of instructions live at catch blocks, and to keep location information on instructions which may be caught by catch phis. RegisterAllocator was extended to spill values used after catch, and to allocate spill slots for catch phis. Catch phis generated for the same vreg share a spill slot as the raw value must be the same. Location builders and slow paths were updated to reflect the fact that throwing an exception may not lead to escaping the method. Instruction code generators are forbidden from using implicit null checks in try blocks as live registers need to be saved before handing over to the runtime. CodeGenerator emits a stack map for each catch block, storing locations of catch phis. CodeInfo and StackMapStream recognize this new type of stack map and store them separately from other stack maps to avoid dex_pc conflicts. After having found the target catch block to deliver an exception to, QuickExceptionHandler looks up the dex register maps at the throwing instruction and the catch block and copies the values over to their respective locations. The runtime-support approach was selected because it allows for the best performance in the normal control-flow path, since no propagation of catch phi values is necessary until the exception is thrown. In addition, it also greatly simplifies the register allocation phase. ConstantHoisting was removed from LICMTest because it instantiated (now abstract) HConstant and was bogus anyway (constants are always in the entry block). Change-Id: Ie31038ad8e3ee0c13a5bbbbaf5f0b3e532310e4e
|
bfb5ba90cd6425ce49c2125a87e3b12222cc2601 |
|
01-Sep-2015 |
Andreas Gampe <agampe@google.com> |
Revert "Revert "Do a second check for testing intrinsic types."" This reverts commit a14b9fef395b94fa9a32147862c198fe7c22e3d7. When an intrinsic with invoke-type virtual is recognized, replace the instruction with a new HInvokeStaticOrDirect. Minimal update for dex-cache rework. Fix includes. Change-Id: I1c8e735a2fa7cda4419f76ca0717125ef236d332
|
05792b98980741111b4d0a24d68cff2a8e070a3a |
|
03-Aug-2015 |
Vladimir Marko <vmarko@google.com> |
ART: Move DexCache arrays to native. This CL has a companion CL in libcore/ https://android-review.googlesource.com/162985 Change-Id: Icbc9e20ad1b565e603195b12714762bb446515fa
|
5a6cc49ed4f36dd11d6ec1590857b884ad8da6ab |
|
13-Aug-2015 |
Serban Constantinescu <serban.constantinescu@linaro.org> |
SlowPath: Remove the use of Locations in the SlowPath constructors. The main motivation is that using locations in the SlowPath constructors ties us to creating the SlowPaths after register allocation, since before the locations are invalid. A later patch of the series will be moving the SlowPath creation to the LocationsBuilder visitors. This will enable us to add more checking as well as consider sharing multiple SlowPaths of the same type. Change-Id: I7e96dcc2b5586d15153c942373e9281ecfe013f0 Signed-off-by: Serban Constantinescu <serban.constantinescu@linaro.org>
|
ecc4366670e12b4812ef1653f7c8d52234ca1b1f |
|
13-Aug-2015 |
Serban Constantinescu <serban.constantinescu@linaro.org> |
Add OptimizingCompilerStats to the CodeGenerator class. Just refactoring, not yet used, but will be used by the incoming patch series and future CodeGen specific stats. Change-Id: I7d20489907b82678120518a77bdab9c4cc58f937 Signed-off-by: Serban Constantinescu <serban.constantinescu@linaro.org>
|
4dff2fdca6dc0032032ff324161c6343e675e4b0 |
|
20-Aug-2015 |
Alexandre Rames <alexandre.rames@linaro.org> |
ARM64: Minor optimization for conversions from long to int. Change-Id: Ice7febba8dd09a4548ab235fc8aee76d7e7676a1
|
581550137ee3a068a14224870e71aeee924a0646 |
|
19-Aug-2015 |
Vladimir Marko <vmarko@google.com> |
Revert "Revert "Optimizing: Better invoke-static/-direct dispatch."" Fixed kCallArtMethod to use correct callee location for kRecursive. This combination is used when compiling with debuggable flag set. This reverts commit b2c431e80e92eb6437788cc544cee6c88c3156df. Change-Id: Idee0f2a794199ebdf24892c60f8a5dcf057db01c
|
b2c431e80e92eb6437788cc544cee6c88c3156df |
|
19-Aug-2015 |
Vladimir Marko <vmarko@google.com> |
Revert "Optimizing: Better invoke-static/-direct dispatch." Reverting due to failing ndebug tests. This reverts commit 9b688a095afbae21112df5d495487ac5231b12d0. Change-Id: Ie4f69da6609df3b7c8443412b6cf7f5c43c2c5d9
|
9b688a095afbae21112df5d495487ac5231b12d0 |
|
06-May-2015 |
Vladimir Marko <vmarko@google.com> |
Optimizing: Better invoke-static/-direct dispatch. Add framework for different types of loading ArtMethod* and code pointer retrieval. Implement invoke-static and invoke-direct calls the same way as Quick. Document the dispatch kinds in HInvokeStaticOrDirect's new enumerations MethodLoadKind and CodePtrLocation. PC-relative loads from dex cache arrays are used only for x86-64 and arm64. The implementation for other architectures will be done in separate CLs. Change-Id: I468ca4d422dbd14748e1ba6b45289f0d31734d94
|
3887c468d731420e929e6ad3acf190d5431e94fc |
|
12-Aug-2015 |
Roland Levillain <rpl@google.com> |
Remove unnecessary `explicit` qualifiers on constructors. Change-Id: Id12e392ad50f66a6e2251a68662b7959315dc567
|
78e3ef6bc5f8aa149f2f8bf0c78ce854c2f910fa |
|
12-Aug-2015 |
Alexandre Rames <alexandre.rames@linaro.org> |
Add a GVN dependency 'GC' for garbage collection. This will be used by incoming architecture specific optimizations. The dependencies must be conservative. When an HInstruction is created we may not be sure whether it can trigger GC. In that case the 'ChangesGC' dependency must be set. We control at code-generation time that HInstructions that can call have the 'ChangesGC' dependency set. Change-Id: Iea6a7f430009f37a9599b0a0039207049906e45d
|
8c0676ce786f33b8f9c8eedf1ace48988c750932 |
|
03-Aug-2015 |
Serguei Katkov <serguei.i.katkov@intel.com> |
ART-Optimizing: Fix the type of HDivZeroCheck HDivZeroCheck is created while building the CFG, and at that point its type is not completely known, so it is set to int or long. However, the SSA builder can later insert a type conversion, and the input of HDivZeroCheck can then become byte or short while the type of HDivZeroCheck remains the same. In reality the type of HDivZeroCheck should always be equal to the type of its input. To fix this inconsistency we return the type of the input as the type of HDivZeroCheck. Code generators are updated accordingly. Change-Id: I6a5aedc8d479cfc6328704e7ddf252bca830076b Signed-off-by: Serguei Katkov <serguei.i.katkov@intel.com>
|
8158f28b6689314213eb4dbbe14166073be71f7e |
|
07-Aug-2015 |
Alexandre Rames <alexandre.rames@linaro.org> |
Ensure coherency of call kinds for LocationSummary. The coherency is enforced with checks added in the `InvokeRuntime` helper, that we now also use on x86 and x86_64. Change-Id: I8cb92b042f25dc3c5fd390e9c61a45b477d081f4
|
cb1c0557033065f2436ee79e7fa6c19d87064801 |
|
04-Aug-2015 |
David Brazdil <dbrazdil@google.com> |
ART: Move exception clearing into own instruction Runtime delivers exceptions only to catch blocks which begin with a MOVE_EXCEPTION instruction (in DEX). In that case, the catch block is expected to clear the thread-local exception storage after having read the exception reference. This patch changes Optimizing to represent MOVE_EXCEPTION with two instructions - HLoadException and HClearException - instead of one. If the exception reference is not used, HLoadException can be safely removed, saving a memory load without breaking the runtime behaviour. Change-Id: Idad8a714467bf9d9d5fccefbc43c0bd8ae13ddba
|
7f63c52c8e94ed1340b7a1d04b046ff12819d2bc |
|
13-Jul-2015 |
Roland Levillain <rpl@google.com> |
Revert "Revert "Fuse long and FP compare & condition on ARM64 in Optimizing."" This reverts commit bed50d2430e02a3d6b94972e8ab4873d7b3b8be0. Bug: 21120453 Change-Id: I5e4aab2703966d9324ebde25bd8b83056fdb10ed
|
9b1eba39322781d361a19f51c9f46520bf078558 |
|
13-Jul-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Fix baseline for arm64. A HFakeString acts like a null constant. Other backends have different code paths for handling it, so it was only arm64 failing. Change-Id: Iba44d87c8d114b916404db0302574c7059143010
|
2e7cd752452d02499a2f5fbd604c5427aa372f00 |
|
10-Jul-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
[optimizing] Don't rely on the verifier for String.<init>. Continue work on cutting the dependency on the verifier. Change-Id: I0f95b1eb2e10fd8f6bf54817f1202bdf6dfdb0fe
|
bed50d2430e02a3d6b94972e8ab4873d7b3b8be0 |
|
10-Jul-2015 |
Roland Levillain <rpl@google.com> |
Revert "Fuse long and FP compare & condition on ARM64 in Optimizing." This reverts commit 5cfe61f27ed9203498169355bb95db756486d292. Change-Id: I9879e76e7f8315cace05700e3b571a6a4749bf1a
|
5cfe61f27ed9203498169355bb95db756486d292 |
|
10-Jul-2015 |
Roland Levillain <rpl@google.com> |
Fuse long and FP compare & condition on ARM64 in Optimizing. Bug: 21120453 Change-Id: I701e808600fb5ba9ff4d0f5e19e4ce22b1d34b29
|
82000b0cf9bb32fc55cdb125bf37c884d44a8671 |
|
07-Jul-2015 |
Alexandre Rames <alexandre.rames@linaro.org> |
Improve code generation for ARM64 VisitArrayGet/Set. We prefer the code sequence
    add temp, obj, #offset
    ldr out, [temp, index LSL #shift_amount]
to
    add temp, obj, index LSL #shift_amount
    ldr out, [temp, #offset]
Change-Id: I98f51a1b5a5ecd84c677d6dbd4c4bfc0f157f5e2
|
4d02711ea578dbb789abb30cbaf12f9926e13d81 |
|
01-Jul-2015 |
Roland Levillain <rpl@google.com> |
Implement heap poisoning in ART's Optimizing compiler. - Instrument ARM, ARM64, x86 and x86-64 code generators. - Note: To turn heap poisoning on in Optimizing, set the environment variable `ART_HEAP_POISONING' to "true" before compiling ART. Bug: 12687968 Change-Id: Ib3120b38cf805a8a50207a314b9ccc90c8d93740
|
fc6a86ab2b70781e72b807c1798b83829ca7f931 |
|
26-Jun-2015 |
David Brazdil <dbrazdil@google.com> |
Revert "Revert "ART: Implement try/catch blocks in Builder"" This patch enables the GraphBuilder to generate blocks and edges which represent the exceptional control flow when try/catch blocks are present in the code. Actual compilation is still delegated to Quick and Baseline ignores the additional code. To represent the relationship between try and catch blocks, Builder splits the edges which enter/exit a try block and links the newly created blocks to the corresponding exception handlers. This layout will later enable the SsaBuilder to correctly infer the dominators of the catch blocks and to produce the appropriate reverse post ordering. It will not, however, allow for building the complete SSA form of the catch blocks and consequently optimizing such blocks. To this end, a new TryBoundary control-flow instruction is introduced. Codegen treats it the same as a Goto but it allows for additional successors (the handlers). This reverts commit 3e18738bd338e9f8363b26bc895f38c0ec682824. Change-Id: I4f5ea961848a0b83d8db3673763861633e9bfcfb
|
3e18738bd338e9f8363b26bc895f38c0ec682824 |
|
26-Jun-2015 |
David Brazdil <dbrazdil@google.com> |
Revert "ART: Implement try/catch blocks in Builder" Causes OutOfMemory issues, need to investigate. This reverts commit 0b5c7d1994b76090afcc825e737f2b8c546da2f8. Change-Id: I263e6cc4df5f9a56ad2ce44e18932ca51d7e349f
|
0b5c7d1994b76090afcc825e737f2b8c546da2f8 |
|
11-Jun-2015 |
David Brazdil <dbrazdil@google.com> |
ART: Implement try/catch blocks in Builder This patch enables the GraphBuilder to generate blocks and edges which represent the exceptional control flow when try/catch blocks are present in the code. Actual compilation is still delegated to Quick and Baseline ignores the additional code. To represent the relationship between try and catch blocks, Builder splits the edges which enter/exit a try block and links the newly created blocks to the corresponding exception handlers. This layout will later enable the SsaBuilder to correctly infer the dominators of the catch blocks and to produce the appropriate reverse post ordering. It will not, however, allow for building the complete SSA form of the catch blocks and consequently optimizing such blocks. To this end, a new TryBoundary control-flow instruction is introduced. Codegen treats it the same as a Goto but it allows for additional successors (the handlers). Change-Id: I415b985596d5bebb7b1bb358a46e08b7b04bb53a
|
9931f319cf86c56c2855d800339a3410697633a6 |
|
19-Jun-2015 |
Alexandre Rames <alexandre.rames@linaro.org> |
Opt compiler: Add a description to slow paths. Change-Id: I22160d90de3fe0ab3e6a2acc440bda8daa00e0f0
|
69aa60163989c33a008115205d39732a76ecc1dc |
|
09-Jun-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Revert "Revert "Pass current method to HNewInstance and HNewArray."" Problem exposed by this change was fixed in: https://android-review.googlesource.com/#/c/154031/ This reverts commit 7b0e353b49ac3f464c662f20e20e240f0231afff. Change-Id: I680c13dc9db9ba223ab11c7af255222860b4e6d2
|
ae71a0539451a8350bdd9d46c76ddab7b763f209 |
|
09-Jun-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Fix a crash in optimizing compiler with the current method. Crash was due to overwriting the location of the current method in the slow path of an intrinsic. Change-Id: I6ca58ef5b3cea19925e60b9500aef543bc5f71ef
|
7b0e353b49ac3f464c662f20e20e240f0231afff |
|
09-Jun-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Revert "Pass current method to HNewInstance and HNewArray." 082-inline-execute fails on x86. This reverts commit e21aa42e1341d34250742abafdd83311ad9fa737. Change-Id: Ib3fd25faee2e0128001e40d3d51a74f959bc4449
|
94015b939060f5041d408d48717f22443e55b6ad |
|
04-Jun-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Revert "Revert "Use HCurrentMethod in HInvokeStaticOrDirect."" Fix was to special case baseline for x86, which does not have enough registers to allocate the current method. This reverts commit c345f141f11faad177aa9635a78088d00cf66086. Change-Id: I5997aa52f8d4df373ae5ff4d4150dac0c44c4c10
|
e21aa42e1341d34250742abafdd83311ad9fa737 |
|
08-Jun-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Pass current method to HNewInstance and HNewArray. Also remove unused CodeGenerator::LoadCurrentMethod. Change-Id: I4b8d3f2a30b8e2c76b6b329a72555483c993cb73
|
c345f141f11faad177aa9635a78088d00cf66086 |
|
04-Jun-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Revert "Use HCurrentMethod in HInvokeStaticOrDirect." Fails on baseline/x86. This reverts commit 38207af82afb6f99c687f64b15601ed20d82220a. Change-Id: Ib71018367eb7c6046965494a7e996c22af3de403
|
38207af82afb6f99c687f64b15601ed20d82220a |
|
01-Jun-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Use HCurrentMethod in HInvokeStaticOrDirect. Change-Id: I0d15244b6b44c8b10079398c55da5071a3e3af66
|
925e56296665b36fe4dee4e65c956396969b6288 |
|
03-Jun-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Allow void to get in ARM64ReturnLocation. It can now be called with it. Change-Id: Idd10dbf5c9cb5f418504cb4c9252930e6eb4942d
|
fd88f16100cceafbfde1b4f095f17e89444d6fa8 |
|
03-Jun-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Factorize code for common LocationSummary of HInvoke. This is one step forward, we could factorize more, but I wanted to get this out of the way first. Change-Id: I6ae411a737eebaecb64974f47af507ce0cfbae85
|
3d21bdf8894e780d349c481e5c9e29fe1556051c |
|
22-Apr-2015 |
Mathieu Chartier <mathieuc@google.com> |
Move mirror::ArtMethod to native Optimizing + quick tests are passing, devices boot. TODO: Test and fix bugs in mips64. Saves 16 bytes per most ArtMethod, 7.5MB reduction in system PSS. Some of the savings are from removal of virtual methods and direct methods object arrays. Bug: 19264997 (cherry picked from commit e401d146407d61eeb99f8d6176b2ac13c4df1e33) Change-Id: I622469a0cfa0e7082a2119f3d6a9491eb61e3f3d Fix some ArtMethod related bugs Added root visiting for runtime methods, not currently required since the GcRoots in these methods are null. Added missing GetInterfaceMethodIfProxy in GetMethodLine, fixes --trace run-tests 005, 044. Fixed optimizing compiler bug where we used a normal stack location instead of double on ARM64, this fixes the debuggable tests. TODO: Fix JDWP tests. Bug: 19264997 Change-Id: I7c55f69c61d1b45351fd0dc7185ffe5efad82bd3 ART: Fix casts for 64-bit pointers on 32-bit compiler. Bug: 19264997 Change-Id: Ief45cdd4bae5a43fc8bfdfa7cf744e2c57529457 Fix JDWP tests after ArtMethod change Fixes Throwable::GetStackDepth for exception event detection after internal stack trace representation change. Adds missing ArtMethod::GetInterfaceMethodIfProxy call in case of proxy method. Bug: 19264997 Change-Id: I363e293796848c3ec491c963813f62d868da44d2 Fix accidental IMT and root marking regression Was always using the conflict trampoline. Also included fix for regression in GC time caused by extra roots. Most of the regression was IMT. Fixed bug in DumpGcPerformanceInfo where we would get SIGABRT due to detached thread. EvaluateAndApplyChanges: From ~2500 -> ~1980 GC time: 8.2s -> 7.2s due to 1s less of MarkConcurrentRoots Bug: 19264997 Change-Id: I4333e80a8268c2ed1284f87f25b9f113d4f2c7e0 Fix bogus image test assert Previously we were comparing the size of the non moving space to size of the image file. Now we properly compare the size of the image space against the size of the image file. 
Bug: 19264997 Change-Id: I7359f1f73ae3df60c5147245935a24431c04808a [MIPS64] Fix art_quick_invoke_stub argument offsets. ArtMethod reference's size got bigger, so we need to move other args and leave enough space for ArtMethod* and 'this' pointer. This fixes mips64 boot. Bug: 19264997 Change-Id: I47198d5f39a4caab30b3b77479d5eedaad5006ab
|
e3b034a6f6f0d80d519ab08bdd18be4de2a4a2db |
|
31-May-2015 |
Mathieu Chartier <mathieuc@google.com> |
Fix some ArtMethod related bugs Added root visiting for runtime methods, not currently required since the GcRoots in these methods are null. Added missing GetInterfaceMethodIfProxy in GetMethodLine, fixes --trace run-tests 005, 044. Fixed optimizing compiler bug where we used a normal stack location instead of double on ARM64, this fixes the debuggable tests. TODO: Fix JDWP tests. Bug: 19264997 Change-Id: I7c55f69c61d1b45351fd0dc7185ffe5efad82bd3
|
e401d146407d61eeb99f8d6176b2ac13c4df1e33 |
|
22-Apr-2015 |
Mathieu Chartier <mathieuc@google.com> |
Move mirror::ArtMethod to native Optimizing + quick tests are passing, devices boot. TODO: Test and fix bugs in mips64. Saves 16 bytes per most ArtMethod, 7.5MB reduction in system PSS. Some of the savings are from removal of virtual methods and direct methods object arrays. Bug: 19264997 Change-Id: I622469a0cfa0e7082a2119f3d6a9491eb61e3f3d
|
fbdaa30a448029d75422c76f29087a4e39630f4a |
|
29-May-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Use the new HCurrentMethod in HLoadString. Change-Id: I23d27e5e10736d127519eb3238ff8f25df3843a2
|
76b1e1799a713a19218de26b171b0aef48a59e98 |
|
27-May-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Add a HCurrentMethod node. This enables register allocation for the current method, so that users of it don't always load it from the stack. Currently only used by HLoadClass. Will make follow-up CLs for the other users. Change-Id: If73324d85643102faba47fabbbd2755eb258c59c
|
80afd02024d20e60b197d3adfbb43cc303cf29e0 |
|
19-May-2015 |
Vladimir Marko <vmarko@google.com> |
ART: Clean up arm64 kNumberOfXRegisters usage. Avoid undefined behavior for arm64 stemming from 1u << 32 in loops with upper bound kNumberOfXRegisters. Create iterators for enumerating bits in an integer either from high to low or from low to high and use them for <arch>Context::FillCalleeSaves() on all architectures. Refactor runtime/utils.{h,cc} by moving all bit-fiddling functions to runtime/base/bit_utils.{h,cc} (together with the new bit iterators) and all time-related functions to runtime/base/time_utils.{h,cc}. Improve test coverage and fix some corner cases for the bit-fiddling functions. Bug: 13925192 Change-Id: I704884dab15b41ecf7a1c47d397ab1c3fc7ee0f7
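The undefined behavior mentioned here arises because shifting a 32-bit value by 32 or more is undefined in C++. The sketch below shows one way to build a register mask and enumerate its set bits without ever shifting by the full width. All names are hypothetical, and this does not reproduce the iterators the commit added to `runtime/base/bit_utils.h`.

```cpp
#include <cstdint>
#include <vector>

// Mask with the low `count` bits set. The count >= 32 case is handled
// separately because `1u << 32` is undefined behavior for a 32-bit type.
inline std::uint32_t LowBitsMask(unsigned count) {
  return count >= 32u ? ~std::uint32_t{0} : (std::uint32_t{1} << count) - 1u;
}

// Indices of set bits, lowest first. The mask is shifted right one bit at
// a time, so no shift amount ever reaches 32.
inline std::vector<unsigned> SetBitsLowToHigh(std::uint32_t mask) {
  std::vector<unsigned> indices;
  for (unsigned i = 0; mask != 0; ++i, mask >>= 1) {
    if ((mask & 1u) != 0) indices.push_back(i);
  }
  return indices;
}

// Indices of set bits, highest first. The shift amount stays within 0..31.
inline std::vector<unsigned> SetBitsHighToLow(std::uint32_t mask) {
  std::vector<unsigned> indices;
  for (unsigned i = 32; i-- > 0;) {
    if (((mask >> i) & 1u) != 0) indices.push_back(i);
  }
  return indices;
}
```

Iterating over the set bits of a mask, rather than over register numbers up to a fixed bound, removes the loop pattern that produced the out-of-range shift.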
|
07276db28d654594e0e86e9e467cad393f752e6e |
|
18-May-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Don't do a null test in MarkGCCard if the value cannot be null. Change-Id: I45687f6d3505178e2fc3689eac9cb6ab1b2c1e29
|
c66671076b12a0ee8b9d1ae782732cc91beacb73 |
|
15-May-2015 |
Zheng Xu <zheng.xu@arm.com> |
Opt compiler: Speed up div/rem by constants on arm32 and arm64. This patch also includes:
1. Add java test for div/rem negative constants.
2. Fix a thumb2 encoding issue where the last operand is "reg, shift #amount" in some instructions.
3. Support a simple filter in arm32 assembler test to filter out unsupported cases, such as "smull r0, r0, r1, r2".
4. Add smull arm32 assembler test.
5. Add smull/umull thumb2 test.
6. Add test for the thumb2 encoding issue which is fixed in this patch.
Change-Id: I1601bc9c38f70f11909f2816fe3ec105a158951e
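The commit does not spell out its technique. The `smull` tests suggest the usual multiply-by-reciprocal approach, so the sketch below illustrates that approach for the single divisor 3. It is an assumption about the general method, not a transcription of the ART code generator, and the function names are hypothetical.

```cpp
#include <cstdint>

// Signed division by 3 without a divide instruction.
// kMagic is ceil(2^32 / 3); the high 32 bits of the 64-bit product
// approximate n / 3, rounded toward negative infinity.
inline std::int32_t DivideBy3(std::int32_t n) {
  constexpr std::int64_t kMagic = 0x55555556;
  std::int64_t product = kMagic * static_cast<std::int64_t>(n);  // smull-style widening multiply
  // Assumes >> on a negative value is an arithmetic shift (guaranteed since
  // C++20, and the behavior of mainstream compilers before that).
  std::int32_t quotient = static_cast<std::int32_t>(product >> 32);
  // Add 1 for negative inputs so the result truncates toward zero.
  quotient += static_cast<std::int32_t>(static_cast<std::uint32_t>(n) >> 31);
  return quotient;
}

// The remainder follows from the quotient: n - (n / 3) * 3.
inline std::int32_t RemainderBy3(std::int32_t n) {
  return n - DivideBy3(n) * 3;
}
```

Other divisors need a different magic constant and, in general, an extra shift of the high word; the sign correction step stays the same.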
|
c74652867cd9293e86232324e5e057cd73c48e74 |
|
13-May-2015 |
David Brazdil <dbrazdil@google.com> |
ART: Refactor GraphVisualizer attribute printing This patch unifies the way GraphVisualizer prints instruction attributes in preparation of changes to the Checker syntax. Change-Id: I44e91e36c660985ddfe039a9f410fedc48b496ec
|
db216f4d49ea1561a74261c29f1264952232728a |
|
05-May-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Relax the only one back-edge restriction. The rule is in the way for better register allocation, as it creates an artificial join point between multiple paths. Change-Id: Ia4392890f95bcea56d143138f28ddce6c572ad58
|
896e32d3c4087e141821271b81e7d82d745e4db3 |
|
05-May-2015 |
Roland Levillain <rpl@google.com> |
Small correction in Optimizing's ARM64 code generator. art::arm64::CodeGeneratorARM64::InvokeRuntime should expect its `instruction' argument to be non-null. Change-Id: Idfa949aa9a5f038394092aaea0901e1aa7f97c2c
|
2d27c8e338af7262dbd4aaa66127bb8fa1758b86 |
|
28-Apr-2015 |
Roland Levillain <rpl@google.com> |
Refactor InvokeDexCallingConventionVisitor in Optimizing. Change-Id: I7ede0f59d5109644887bf5d39201d4e1bf043f34
|
3e3d73349a2de81d14e2279f60ffbd9ab3f3ac28 |
|
28-Apr-2015 |
Roland Levillain <rpl@google.com> |
Have HInvoke instructions know their number of actual arguments. Add an art::HInvoke::GetNumberOfArguments routine so that art::HInvoke and its subclasses can return the number of actual arguments of the called method. Use it in code generators and intrinsics handlers. Consequently, no longer remove a clinit check as last input of a static invoke if it is still present during baseline code generation, but ensure that static invokes have no such check as last input in optimized compilations. Change-Id: Iaf9e07d1057a3b15b83d9638538c02b70211e476
|
848f70a3d73833fc1bf3032a9ff6812e429661d9 |
|
15-Jan-2014 |
Jeff Hao <jeffhao@google.com> |
Replace String CharArray with internal uint16_t array. Summary of high level changes: - Adds compiler inliner support to identify string init methods - Adds compiler support (quick & optimizing) with new invoke code path that calls method off the thread pointer - Adds thread entrypoints for all string init methods - Adds map to verifier to log when receiver of string init has been copied to other registers. used by compiler and interpreter Change-Id: I797b992a8feb566f9ad73060011ab6f51eb7ce01
|
0379f82393237798616d485ad99952e73e480e12 |
|
25-Apr-2015 |
Roland Levillain <rpl@google.com> |
Fix DCHECKs about clinit checks in Optimizing's code generators. These assertions are not true for the baseline compiler. As a temporary workaround, remove a clinit check as last input of a static invoke if it is still present at the stage of code generation. Change-Id: I5655f4a0873e2e7ee7790b6a341c18b4b7b52af1
|
4c0eb42259d790fddcd9978b66328dbb3ab65615 |
|
24-Apr-2015 |
Roland Levillain <rpl@google.com> |
Ensure inlined static calls perform clinit checks in Optimizing. Calls to static methods have implicit class initialization (clinit) checks of the method's declaring class in Optimizing. However, when such a static call is inlined, the implicit clinit check vanishes, possibly leading to incorrect behavior. To ensure that inlining static methods does not change the behavior of a program, add explicit class initialization checks (art::HClinitCheck) as well as load class instructions (art::HLoadClass) as last input of static calls (art::HInvokeStaticOrDirect) in Optimizing's control flow graphs, when the declaring class is reachable and not known to be already initialized. Then when considering the inlining of a static method call, proceed only if the method has no implicit clinit check requirement. The added explicit clinit checks are already removed by the art::PrepareForRegisterAllocation visitor. This CL also extends this visitor to turn explicit clinit checks from static invokes into implicit ones after the inlining step, by removing the added art::HLoadClass nodes mentioned above. Change-Id: I9ba452b8bd09ae1fdd9a3797ef556e3e7e19c651
|
5ea536aa4a6414db01beaf6f8bd8cb9adc5cfc92 |
|
20-Apr-2015 |
Vladimir Marko <vmarko@google.com> |
Remove ArtMethod* parameter from dex cache entry points. Load the ArtMethod* using an optimized stack walk instead. This reduces the size of the generated code. Three of the entry points are called only from a slow-path and the fourth (InitializeTypeAndVerifyAccess) is rare and already slow enough that the one or two extra loads (depending on whether we already have the ArtMethod* in a register) are insignificant. And as we're starting to use PC-relative addressing of the dex cache arrays (already done by Quick for the boot image), having the ArtMethod* in a register becomes less likely anyway. Change-Id: Ib19b9d204e355e13bf386662a8b158178bf8ad28
|
da40309f61f98c16d7d58e4c34cc0f5eef626f93 |
|
24-Apr-2015 |
Zheng Xu <zheng.xu@arm.com> |
Opt compiler: ARM64: Use ldp/stp on arm64 for slow paths. It should be a bit faster than load/store single registers and reduce the code size. Change-Id: I67b8302adf6174b7bb728f7c2afd2c237e34ffde
|
af88835231c2508509eb19aa2d21b92879351962 |
|
20-Apr-2015 |
Guillaume "Vermeille" Sanchez <guillaumesa@google.com> |
Remove unnecessary null checks in CheckCast and InstanceOf Change-Id: I6fd81cabd8673be360f369e6318df0de8b18b634
|
97833a0d26e265c5885e27af4b8e8969ccb9612a |
|
16-Apr-2015 |
Alexandre Rames <alexandre.rames@arm.com> |
Opt compiler: Minor object store optimizations for ARM64. This is an adaptation of af07bc121121d7bd7e8329c55dfe24782207b561 for ARM64. Change-Id: I5f4984ac86ede89cdf7c915f4bbf8d091059a0eb
|
d921d64c09b9222b8422f78da6b34b0a61e305c9 |
|
16-Apr-2015 |
Alexandre Rames <alexandre.rames@arm.com> |
Opt compiler: ARM64: Block VIXL pools when recording the pc. VIXL automatically handles and generates literal and veneer pools when using the MacroAssembler. In general, the pools can be emitted anywhere. Helpers are provided to forbid VIXL from emitting pools locally. So when writing the pseudo-code
    __ Fmov(d0, 1.2345);
    __ Ldr(dst, MemOperand(src, offset));
    FunctionRecordingCurrentPC();
    __ Add(x0, x1, x2);
VIXL might generate code looking like
    0x00: ldr s0, [pc, 0xc]
    0x04: ldr dst, [src, offset]
    0x08: b #0x10
    0x0c: <literal 1.2345>
    0x10: add x0, x1, x2
and the program counter recorded by the helper will point after the literal pool. So we explicitly stop VIXL from emitting pools when dealing with code where we care about the program counter. Change-Id: Ib964860539bdb10f5704c290bdf74e5db149e462
|
09a99965bb27649f5b1d373f76bfbec6a2500c9e |
|
15-Apr-2015 |
Alexandre Rames <alexandre.rames@arm.com> |
Opt compiler: ARM64: Follow other archs for a few codegen stubs. Code generation for HInstanceFieldGet, HInstanceFieldSet, HStaticFieldGet, and HStaticFieldSet is refactored to follow the structure used for other backends. Change-Id: I34a3bd17effa042238c6bf199848cbc2ec26ac5d
|
27df758e2e7baebb6e3f393f9732fd0d064420c8 |
|
17-Apr-2015 |
Calin Juravle <calin@google.com> |
[optimizing] Add memory barriers in constructors when needed If a class has final fields we must add a memory barrier before returning from constructor. This makes sure the fields are visible to other threads. Bug: 19851497 Change-Id: If8c485092fc512efb9636cd568cb0543fb27688e
|
88c13cddc3a4184908662b0f3de796565d348c76 |
|
14-Apr-2015 |
Alexandre Rames <alexandre.rames@arm.com> |
Opt compiler: Correctly require register or FPU register. Also add a check that location summaries are correctly typed with the HInstruction. Change-Id: I699762ff4e8f4e321c7db01ea005236ea1934af9
|
ad4450e5c3ffaa9566216cc6fafbf5c11186c467 |
|
17-Apr-2015 |
Zheng Xu <zheng.xu@arm.com> |
Opt compiler: Implement parallel move resolver without using swap. The algorithm of ParallelMoveResolverNoSwap() is almost the same as that of ParallelMoveResolverWithSwap(), except for the way we resolve circular dependencies. NoSwap() uses an additional scratch register to resolve the circular dependency. For example, (0->1) (1->2) (2->0) will be performed as (2->scratch) (1->2) (0->1) (scratch->0). On architectures without swap register support, NoSwap() can reduce the number of moves from 3x(N-1) to (N+1) when there is a circular dependency with N moves. Also, the NoSwap() algorithm does not depend on architecture register layout information, which means it can support register pairs on arm32 and X/W, D/S registers on arm64 without additional modification. Change-Id: Idf56bd5469bb78c0e339e43ab16387428a082318
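The cycle-breaking step described above can be sketched as follows. This is an illustrative sketch only, not the ART implementation: `Move`, `kScratch`, and `ResolveCycleNoSwap` are hypothetical names, and locations are modelled as plain integers.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical model: a location is an int, and -1 stands for the scratch register.
constexpr int kScratch = -1;
using Move = std::pair<int, int>;  // (source, destination)

// Resolves one circular dependency cycle[0] -> cycle[1] -> ... -> cycle[n-1] -> cycle[0]
// using a scratch location instead of swaps. Emits exactly n + 1 moves.
std::vector<Move> ResolveCycleNoSwap(const std::vector<int>& cycle) {
  assert(cycle.size() >= 2);
  const std::size_t n = cycle.size();
  std::vector<Move> out;
  // Save the value that the first emitted move would otherwise overwrite.
  out.emplace_back(cycle[n - 1], kScratch);
  // Walk the cycle backwards so that each destination has already been read.
  for (std::size_t i = n - 1; i > 0; --i) {
    out.emplace_back(cycle[i - 1], cycle[i]);
  }
  // Restore the saved value into the head of the cycle.
  out.emplace_back(kScratch, cycle[0]);
  return out;
}
```

For the cycle {0, 1, 2} this yields (2->scratch) (1->2) (0->1) (scratch->0), the four-move sequence quoted in the commit message.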
|
13b4718ecd52a674b25eac106e654d8e89872750 |
|
15-Apr-2015 |
David Brazdil <dbrazdil@google.com> |
ART: Remove DCHECKs for boolean type Since bool and int are interchangeable types, checking whether an input is kPrimBoolean can fail when replaced with 0/1 constant or a phi. This patch removes the problematic DCHECKs, adds a best-effort verification into SSAChecker but leaves the phi case empty until a suitable analysis is implemented. Change-Id: I31e8daf27dd33d2fd74049b82bed1cb7c240c8c6
|
9021825d1e73998b99c81e89c73796f6f2845471 |
|
15-Apr-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Type MoveOperands. The ParallelMoveResolver implementation needs to know if a move is for 64bits or not, to handle swaps correctly. Bug found, and test case courtesy of Serguei I. Katkov. Change-Id: I9a0917a1cfed398c07e57ad6251aea8c9b0b8506
|
66d126ea06ce3f507d86ca5f0d1f752170ac9be1 |
|
03-Apr-2015 |
David Brazdil <dbrazdil@google.com> |
ART: Implement HBooleanNot instruction Optimizations simplifying operations on boolean values (boolean simplifier, instruction simplifier) can benefit from having a special HInstruction for negating booleans in order to perform more transforms and produce faster machine code. This patch implements HBooleanNot as 'x xor 1', assuming that booleans are 1-bit integers and allowing for a single-instruction negation on all supported platforms. Change-Id: I33a2649c1821255b18a86ca68ed16416063c739f
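A minimal sketch of the 'x xor 1' idea, assuming booleans are stored as the integers 0 and 1. The function names are hypothetical and only illustrate why logical negation differs from the bitwise complement:

```cpp
#include <cstdint>

// Logical negation of a 0/1 boolean, expressed as a single XOR.
constexpr std::int32_t BooleanNot(std::int32_t b) { return b ^ 1; }

// Bitwise complement, shown for contrast: it does not map 1 to 0.
constexpr std::int32_t BitwiseNot(std::int32_t b) { return ~b; }

static_assert(BooleanNot(0) == 1, "not false is true");
static_assert(BooleanNot(1) == 0, "not true is false");
static_assert(BitwiseNot(1) == -2, "~1 is not a valid boolean");
```

The XOR form is only valid under the stated 0/1 assumption; any other input value would produce a non-boolean result.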
|
69a503050fb8a7b3a79b2cd2cdc2d8fbc594575d |
|
14-Apr-2015 |
Zheng Xu <zheng.xu@arm.com> |
ARM64: Remove suspend register. It also cleans up build/remove frame used by the JNI compiler and generates stp/ldp instead of str/ldr. Also x19 has been unblocked in both the quick and optimizing compilers. Change-Id: Idbeac0942265f493266b2ef9b7a65bb4054f0e2d
|
c34dc9362b9ec624b3bdd97d36b6b2098814cd73 |
|
12-Apr-2015 |
David Srbecky <dsrbecky@google.com> |
Move 'ret' instruction generation inside GenerateFrameExit. Change-Id: I0c594d9a2356a006a5ce8dfd41d307cf7c3704ba
|
c6b4dd8980350aaf250f0185f73e9c42ec17cd57 |
|
07-Apr-2015 |
David Srbecky <dsrbecky@google.com> |
Implement CFI for Optimizing. CFI is necessary for stack unwinding in gdb, lldb, and libunwind. Change-Id: I1a3480e3a4a99f48bf7e6e63c4e83a80cfee40a2
|
4388dcc30e2a8aa6897a57c44e6865960712a007 |
|
03-Feb-2015 |
Alexandre Rames <alexandre.rames@arm.com> |
Opt compiler: ARM64: Use TBZ and TBNZ in VisitIf. TBZ and TBNZ have a short range. Now that VIXL supports veneers, we can use them safely without the danger of running out of range. Change-Id: Iaf77a441ccf86282c1793a2213a69a2091ca829a
|
760d8efd535764e54500bf65a944ed3f2a54c123 |
|
28-Mar-2015 |
Serban Constantinescu <serban.constantinescu@arm.com> |
Opt Compiler: ARM64 goodness This patch: * Switches on PreferAcquireRelease() (used to decide if load/store volatile should use acquire release-semantics or explicit memory barriers). Note that for ARMv8 CPUs we should always prefer this (as proved by synthetic benchmarks on A53, A57 and Denver). * Enables the use of constants for HBoundsCheck Change-Id: I42524451772c05a1c74af73e97a59a95f49ba6d4 Signed-off-by: Serban Constantinescu <serban.constantinescu@arm.com>
|
fc3ee8f0c1fa0a5a6ca87c0f9c4320c5bc00e062 |
|
02-Apr-2015 |
Serban Constantinescu <serban.constantinescu@arm.com> |
Revert "ART: Valgrind hotfix for VIXL 1.9" Fixed in external/vixl and upstream VIXL. Change-Id: I7d1901da06a7c1517c88a2583bc668a3e23ef852
|
d43b3ac88cd46b8815890188c9c2b9a3f1564648 |
|
01-Apr-2015 |
Mingyao Yang <mingyao@google.com> |
Revert "Revert "Deoptimization-based bce."" This reverts commit 0ba627337274ccfb8c9cb9bf23fffb1e1b9d1430. Change-Id: I1ca10d15bbb49897a0cf541ab160431ec180a006
|
75fda57d0aa3106c7ebad88656c3eea056e5ea6a |
|
01-Apr-2015 |
Andreas Gampe <agampe@google.com> |
ART: Valgrind hotfix for VIXL 1.9 Make sure recommended_checkpoint_ is initialized in the VIXL macro assembler by calling EmitLiteralPool with an empty pool. Change-Id: I08589b8fb092a33a8f4aad824e91b5c16ff761b6
|
d75948ac93a4a317feaf136cae78823071234ba5 |
|
27-Mar-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Intrinsify String.compareTo. Change-Id: Ia540df98755ac493fe61bd63f0bd94f6d97fbb57
|
46e2a3915aa68c77426b71e95b9f3658250646b7 |
|
16-Mar-2015 |
David Brazdil <dbrazdil@google.com> |
ART: Boolean simplifier The optimization recognizes the negation pattern generated by 'javac' and replaces it with a single condition. To this end, boolean values are now consistently assumed to be represented by an integer. This is a first optimization which deletes blocks from the HGraph and does so by replacing the corresponding entries with null. Hence, existing code can continue indexing the list of blocks with the block ID, but must check for null when iterating over the list. Change-Id: I7779da69cfa925c6521938ad0bcc11bc52335583
|
0ba627337274ccfb8c9cb9bf23fffb1e1b9d1430 |
|
24-Mar-2015 |
Andreas Gampe <agampe@google.com> |
Revert "Deoptimization-based bce." This breaks compiling the core image: Error after BCE: art::SSAChecker: Instruction 219 in block 1 does not dominate use 221 in block 1. This reverts commit e295e6ec5beaea31be5d7d3c996cd8cfa2053129. Change-Id: Ieeb48797d451836ed506ccb940872f1443942e4e
|
e295e6ec5beaea31be5d7d3c996cd8cfa2053129 |
|
07-Mar-2015 |
Mingyao Yang <mingyao@google.com> |
Deoptimization-based bce. A mechanism is introduced by which a runtime method can be called from code compiled with the optimizing compiler to deoptimize into the interpreter. This can be used to establish invariants in the managed code. If the invariant does not hold at runtime, we will deoptimize and continue execution in the interpreter. This allows optimizing the managed code as if the invariant was proven during compile time. However, the exception will be thrown according to the semantics demanded by the spec. The invariant and optimization included in this patch are based on the length of an array. Given a set of array accesses with constant indices {c1, ..., cn}, we can optimize away all bounds checks iff 0 <= min(ci) and max(ci) < array-length. The first can be proven statically. The second can be established with a deoptimization-based invariant. This replaces n bounds checks with one invariant check (plus slow-path code). Change-Id: I8c6e34b56c85d25b91074832d13dba1db0a81569
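The invariant above can be modelled as a compile-time part and a run-time part. This is a hedged sketch rather than ART code: `BoundsGuard`, `MakeGuard`, and `ShouldDeoptimize` are hypothetical names used only for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Summary of a set of constant-index accesses to the same array.
struct BoundsGuard {
  bool statically_safe;    // 0 <= min(ci), decidable at compile time
  std::int32_t max_index;  // max(ci), compared against the length at run time
};

// "Compile-time" step: reduce n constant indices to a single guard.
BoundsGuard MakeGuard(const std::vector<std::int32_t>& indices) {
  assert(!indices.empty());
  auto [lo, hi] = std::minmax_element(indices.begin(), indices.end());
  return BoundsGuard{*lo >= 0, *hi};
}

// "Run-time" step: one check replaces the n individual bounds checks.
// When it fails, execution would fall back to the interpreter, which
// throws the exception at the access the spec requires.
bool ShouldDeoptimize(const BoundsGuard& guard, std::int32_t array_length) {
  return guard.max_index >= array_length;
}
```

For indices {0, 2, 5}, the guard passes for an array of length 6 and requests deoptimization for an array of length 5.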
|
68e15009173f92fe717546a621b56413d5e9fba1 |
|
17-Mar-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
PREOPT compiles using dex2oatd so don't emit debug instructions. Change-Id: I8d2ab8d956ad0ce313928918c658d49f490ad081
|
2d35d9d7490ef3880ee366ccbf8f6e791f398c47 |
|
22-Feb-2015 |
Serban Constantinescu <serban.constantinescu@arm.com> |
Opt Compiler: Materialise constants that cannot be encoded The VIXL MacroAssembler deals gracefully with any immediate. However when the constant has multiple uses and cannot be encoded in the instruction's immediate field we are better off using a register for the constant and thus sharing the constant generation between multiple uses. Eg:
    var += #Const; // #Const cannot be encoded.
    var += #Const;
    Before:                 After:
    mov wip0, #Const        mov w4, #Const
    add w0, w0, wip0        add w0, w0, w4
    mov wip0, #Const        add w0, w0, w4
    add w0, w0, wip0
Change-Id: Ied8577c879845777e52867aced16b2b45e06ac6c Signed-off-by: Serban Constantinescu <serban.constantinescu@arm.com>
|
eeefa1276e83776f08704a3db4237423b0627e20 |
|
13-Mar-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Update locations of registers after slow paths spilling. Change-Id: Id9aafcc13c1a085c17ce65d704c67b73f9de695d
|
a8ac9130b872c080299afacf5dcaab513d13ea87 |
|
13-Mar-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Refactor code in preparation of correct stack maps in slow path. Move the logic of saving/restoring live registers in slow path in the SlowPathCode method. Also add a RecordPcInfo helper to SlowPathCode, that will act as the placeholder of saving correct stack maps. Change-Id: I25c2bc7a642ef854bbc8a3eb570e5c8c8d2d030c
|
3ce57abd8fe50a0a772d14e033a9e7c34beff6cb |
|
12-Mar-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Revert "Opt Compiler: Materialise constants that cannot be encoded" Fails building the core image. This reverts commit 758c2f65805564e0c51cccaacf8307e52a9e312b. Change-Id: Ic3ebd8a08a3d17a513d820035b430f6de4125866
|
758c2f65805564e0c51cccaacf8307e52a9e312b |
|
22-Feb-2015 |
Serban Constantinescu <serban.constantinescu@arm.com> |
Opt Compiler: Materialise constants that cannot be encoded The VIXL MacroAssembler deals gracefully with any immediate. However when the constant has multiple uses and cannot be encoded in the instruction's immediate field we are better off using a register for the constant and thus sharing the constant generation between multiple uses. Eg:
    var += #Const; // #Const cannot be encoded.
    var += #Const;
    Before:                 After:
    mov wip0, #Const        mov w4, #Const
    add w0, w0, wip0        add w0, w0, w4
    mov wip0, #Const        add w0, w0, w4
    add w0, w0, wip0
Change-Id: I8d1f620872d1241cf582fb4f3b45b5091b790146 Signed-off-by: Serban Constantinescu <serban.constantinescu@arm.com>
|
579885a26d761f5ba9550f2a1cd7f0f598c2e1e3 |
|
22-Feb-2015 |
Serban Constantinescu <serban.constantinescu@arm.com> |
Opt Compiler: ARM64: Enable explicit memory barriers over acquire/release Implement remaining explicit memory barrier code paths and temporarily enable the use of explicit memory barriers for testing. This CL also enables the use of instruction set features in the ARM64 backend. kUseAcquireRelease has been replaced with PreferAcquireRelease(), which for now is statically set to false (prefer explicit memory barriers). Please note that we still prefer acquire-release for the ARM64 Optimizing Compiler, but we would like to exercise the explicit memory barrier code path too. Change-Id: I84e047ecd43b6fbefc5b82cf532e3f5c59076458 Signed-off-by: Serban Constantinescu <serban.constantinescu@arm.com>
|
d8ef2e991a1a65f47a26a1eb8c6b34c92b775d6b |
|
24-Feb-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
not-int can also take non-int (byte and short) instructions. So we should use the result type instead of the input type for knowing what instruction to use. Bug: 19454010 Change-Id: I88782ad27ae8c8e1b7868afede5057d26f14685a
|
b1498f67b444c897fa8f1530777ef118e05aa631 |
|
16-Feb-2015 |
Calin Juravle <calin@google.com> |
Improve type propagation with if-contexts This works by adding a new instruction (HBoundType) after each `if (a instanceof ClassA) {}` to bound the type that `a` can take in the True- dominated blocks. Change-Id: Iae6a150b353486d4509b0d9b092164675732b90c
|
d6138ef1ea13d07ae555542f8898b30d89e9ac9a |
|
18-Feb-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Ensure the graph is correctly typed. We used to be forgiving because of HIntConstant(0) also being used for null. We now create a special HNullConstant for such uses. Also, we need to run the dead phi elimination twice during ssa building to ensure the correctness. Change-Id: If479efa3680d3358800aebb1cca692fa2d94f6e5
|
9341546f1e5177a0328c67c5899ee81d19bd5d88 |
|
17-Feb-2015 |
Alexandre Rames <alexandre.rames@arm.com> |
Opt compiler: ARM64: Optimise floating-point comparison with 0.0. Change-Id: I297ed92445f20fae2ebf301e90e97772072da364
|
a3ec39425e09f92421775d1485660eb633f97aec |
|
15-Feb-2015 |
Zheng Xu <zheng.xu@arm.com> |
Opt compiler: ARM64: Fix blocking fp registers. VIXL's reserved floating-point registers were not blocked correctly. Change-Id: Ie7131d86bbaff48c431e3e26abd2fa26389ac687
|
c0572a451944f78397619dec34a38c36c11e9d2a |
|
06-Feb-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Optimize leaf methods. Avoid suspend checks and stack changes when not needed. Change-Id: I0fdb31e8c631e99091b818874a558c9aa04b1628
|
3d087decd1886b818adcccd4f16802e5e54dd03e |
|
28-Jan-2015 |
Serban Constantinescu <serban.constantinescu@arm.com> |
Opt Compiler: ARM64: Enable Callee-saved register, as defined by AAPCS64. For now we block kQuickSuspendRegister - x19, since Quick and the runtime use this as a suspend counter register. Change-Id: I090d386670e81e7924e4aa9a3864ef30d0580a30 Signed-off-by: Serban Constantinescu <serban.constantinescu@arm.com>
|
cb1b00aedd94785e7599f18065a0b97b314e64f6 |
|
28-Jan-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Use the non access check entrypoint when possible. Change-Id: I0b53d63141395e26816d5d2ce3fa6a297bb39b54
|
542361f6e9ff05e3ca1f56c94c88bc3efeddd9c4 |
|
29-Jan-2015 |
Alexandre Rames <alexandre.rames@arm.com> |
Introduce primitive type helpers. Change-Id: I81e909a185787f109c0afafa27b4335050a0dcdf
|
0a299b9305d42074a47a7922aff855c840a76cfd |
|
29-Jan-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Fix bad rebase. Change-Id: Ia66c5ec4a612908b749b058d85f374d1f1b72a2a
|
1cf95287364948689f6a1a320567acd7728e94a3 |
|
12-Dec-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Small optimization for recursive calls: avoid dex cache. Change-Id: I044757a2f06e535cdc1480c4fc8182b89635baf6
|
878d58cbaf6b17a9e3dcab790754527f3ebc69e5 |
|
16-Jan-2015 |
Andreas Gampe <agampe@google.com> |
ART: Arm64 optimizing compiler intrinsics Implement most intrinsics for the optimizing compiler for Arm64. Change-Id: Idb459be09f0524cb9aeab7a5c7fccb1c6b65a707
|
d97dc40d186aec46bfd318b6a2026a98241d7e9c |
|
22-Jan-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Support callee save floating point registers on x64. - Share the computation of core_spill_mask and fpu_spill_mask between backends. - Remove explicit stack overflow check support: we need to adjust them and since they are not tested, they will easily bitrot. Change-Id: I0b619b8de4e1bdb169ea1ae7c6ede8df0d65837a
|
988939683c26c0b1c8808fc206add6337319509a |
|
21-Jan-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Enable core callee-save on x64. Will work on other architectures and FP support in other CLs. Change-Id: I8cef0343eedc7202d206f5217fdf0349035f0e4d
|
fa93b504b324784dd9a96e28e6e8f3f1b1ac456a |
|
21-Jan-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Do not use HNot for creating !bool. HNot folds to ~, not !. Change-Id: I681f968449a2ade7110b2f316146ad16ba5da74c
|
6c2dff8ff8e1440fa4d9e1b2ba2a44d036882801 |
|
21-Jan-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Revert "Revert "Fully support pairs in the register allocator."" This reverts commit c399fdc442db82dfda66e6c25518872ab0f1d24f. Change-Id: I19f8215c4b98f2f0827e04bf7806c3ca439794e5
|
77520bca97ec44e3758510cebd0f20e3bb4584ea |
|
12-Jan-2015 |
Calin Juravle <calin@google.com> |
Record implicit null checks at the actual invoke time. ImplicitNullChecks are recorded only for instructions directly (see NB below) preceded by NullChecks in the graph. This way we avoid recording redundant safepoints and minimize the code size increase. NB: ParallelMoves might be inserted by the register allocator between the NullChecks and their uses. These modify the environment and the correct action would be to reverse their modification. This will be addressed in a follow-up CL. Change-Id: Ie50006e5a4bd22932dcf11348f5a655d253cd898
|
c399fdc442db82dfda66e6c25518872ab0f1d24f |
|
21-Jan-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Revert "Fully support pairs in the register allocator." Libcore tests fail. This reverts commit 41aedbb684ccef76ff8373f39aba606ce4cb3194. Change-Id: I2572f120d4bbaeb7a4d4cbfd47ab00c9ea39ac6c
|
41aedbb684ccef76ff8373f39aba606ce4cb3194 |
|
14-Jan-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Fully support pairs in the register allocator. Enabled on ARM for longs and doubles. Change-Id: Id8792d08bd7ca9fb049c5db8a40ae694bafc2d8b
|
93edf73a5fecd526920fbd870068fa592376ac8a |
|
20-Jan-2015 |
Calin Juravle <calin@google.com> |
Use CompilerOptions for implicit stack overflow checks Change-Id: I52744382a7e3d2c6c11a43e027d87bf43ec4e62b
|
cd6dffedf1bd8e6dfb3fb0c933551f9a90f7de3f |
|
08-Jan-2015 |
Calin Juravle <calin@google.com> |
Add implicit null checks for the optimizing compiler - for backends: arm, arm64, x86, x86_64 - fixed parameter passing for CodeGenerator - 003-omnibus-opcodes test verifies that NullPointerExceptions work as expected Change-Id: I1b302acd353342504716c9169a80706cf3aba2c8
|
71fb52fee246b7d511f520febbd73dc7a9bbca79 |
|
30-Dec-2014 |
Andreas Gampe <agampe@google.com> |
ART: Optimizing compiler intrinsics Add intrinsics infrastructure to the optimizing compiler. Add almost all intrinsics supported by Quick to the x86-64 backend. Further intrinsics require more assembler support. Change-Id: I48de9b44c82886bb298d16e74e12a9506b8e8807
|
02d81cc8d162a31f0664249535456775e397b608 |
|
05-Jan-2015 |
Serban Constantinescu <serban.constantinescu@arm.com> |
Opt Compiler: ARM64: Add support for rem-float, rem-double and volatile. Add support for rem-float, rem-double and volatile memory accesses using acquire-release and memory barriers. Change-Id: I96a24dff66002c3b772c3d8e6ed792e3cb59048a Signed-off-by: Serban Constantinescu <serban.constantinescu@arm.com>
|
1cc7dbabd03e0a6c09d68161417a21bd6f9df371 |
|
18-Dec-2014 |
Andreas Gampe <agampe@google.com> |
ART: Reorder entrypoint argument order Shuffle the ArtMethod* referrer backwards for easier removal. Clean up ARM & MIPS assembly code. Change some macros to make future changes easier. Change-Id: Ie2862b68bd6e519438e83eecd9e1611df51d7945
|
5b4b898ed8725242ee6b7229b94467c3ea3054c8 |
|
18-Dec-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Revert "Don't block quick callee saved registers for optimizing." X64 has one libcore test failing, and codegen_test on arm is failing. This reverts commit 6004796d6c630696127df2494dcd4f30d1367a34. Change-Id: I20e00431fa18e11ce4c0cb6fffa91977fa8e9b4f
|
6004796d6c630696127df2494dcd4f30d1367a34 |
|
15-Dec-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Don't block quick callee saved registers for optimizing. This change builds on: https://android-review.googlesource.com/#/c/118983/ - Also fix x86_64 assembler bug triggered by this change. - Fix (and improve) x86's backend byte register usage. - Fix a bug in baseline register allocator: a fixed out register must prevent inputs from allocating it. Change-Id: I4883862e29b4e4b6470f1823cf7eab7e7863d8ad
|
4e44c829e282b3979a73bfcba92510e64fbec209 |
|
17-Dec-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Revert "Small optimization for recursive calls: avoid dex cache." Fails on target. This reverts commit 390f59f9bec64fd81b05e796dfaeb03ab6d4cc81. Change-Id: Ic3865b8897068ba20df0fbc2bcf561faf6c290c1
|
390f59f9bec64fd81b05e796dfaeb03ab6d4cc81 |
|
12-Dec-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Small optimization for recursive calls: avoid dex cache. Change-Id: Ic4054b6c38f0a2a530ba6ef747647f86cee0b1b8
|
e53798a7e3267305f696bf658e418c92e63e0834 |
|
01-Dec-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Inlining support in optimizing. Currently only inlines simple things that don't require an environment, such as: - Returning a constant. - Returning a parameter. - Returning an arithmetic operation. Change-Id: Ie844950cb44f69e104774a3cf7a8dea66bc85661
|
3e69f16ae3fddfd24f4f0e29deb106d564ab296c |
|
10-Dec-2014 |
Alexandre Rames <alexandre.rames@arm.com> |
Opt compiler: Add arm64 support for register allocation. Change-Id: Idc6e84eee66170de4a9c0a5844c3da038c083aa7
|
01fcc9ee556f98d0163cc9b524e989760826926f |
|
01-Dec-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Remove type conversion nodes converting to the same type. When optimizing, we ensure these conversions do not reach the code generators. When not optimizing, we cannot get such situations. Change-Id: I717247c957667675dc261183019c88efa3a38452
|
02164b352a1474c616771582ca9a73a2cc514c1f |
|
13-Nov-2014 |
Serban Constantinescu <serban.constantinescu@arm.com> |
Opt Compiler: Arm64: Add support for more IRs plus various fixes. Add support for more IRs and update others. Change-Id: Iae1bef01dc3c0d238a46fbd2800e71c38288b1d2 Signed-off-by: Serban Constantinescu <serban.constantinescu@arm.com>
|
32f5b4d2c8c9b52e9522941c159577b21752d0fa |
|
25-Nov-2014 |
Serban Constantinescu <serban.constantinescu@arm.com> |
Vixl: Update the VIXL interface to VIXL 1.7 and enable VIXL debug. This patch updates the interface to VIXL 1.7 and enables the debug version of VIXL when ART is built in debug mode. Change-Id: I443fb941bec3cffefba7038f93bb972e6b7d8db5 Signed-off-by: Serban Constantinescu <serban.constantinescu@arm.com>
|
eace45873190a27302b3644c32ec82854b59d299 |
|
25-Nov-2014 |
Mathieu Chartier <mathieuc@google.com> |
Move dexCacheStrings from ArtMethod to Class Adds one load for const strings which are not direct. Saves >= 60KB of memory avg per app. Image size: -350KB. Bug: 17643507 Change-Id: I2d1a3253d9de09682be9bc6b420a29513d592cc8 (cherry picked from commit f521f423b66e952f746885dd9f6cf8ef2788955d)
|
9aec02fc5df5518c16f1e5a9b6cb198a192db973 |
|
19-Nov-2014 |
Calin Juravle <calin@google.com> |
[optimizing compiler] Add shifts Added SHL, SHR, USHR for arm, x86, x86_64. Change-Id: I971f594e270179457e6958acf1401ff7630df07e
|
86a8d7afc7f00ff0f5ea7b8aaf4d50514250a4e6 |
|
19-Nov-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Consistently use k{InstructionSet}WordSize. These constants were defined prior to k{InstructionSet}PointerSize. So use them consistently in optimizing as a first step. We can discuss whether we should remove them in a second step. Change-Id: If129de1a3bb8b65f8d9c816a8ad466815fb202e6
|
4a962e5373bc0992d6e9ba6a43bb65845e7a8783 |
|
18-Nov-2014 |
Andreas Gampe <agampe@google.com> |
ART: Build fix Change-Id: I0c4d1c2981bdfb95e12c8c624826349281ada0cf
|
2d7210188805292e463be4bcf7a133b654d7e0ea |
|
10-Nov-2014 |
Mathieu Chartier <mathieuc@google.com> |
Change 64 bit ArtMethod fields to be pointer sized Changed the 64 bit entrypoint and gc map fields in ArtMethod to be pointer sized. This saves a large amount of memory on 32 bit systems. Reduces ArtMethod size by 16 bytes on 32 bit. Total number of ArtMethod on low memory mako: 169957 Image size: 49203 methods -> 787248 image size reduction. Zygote space size: 1070 methods -> 17120 size reduction. App methods: ~120k -> 2 MB savings. Savings per app on low memory mako: 125K+ per app (less active apps -> more image methods per app). Savings depend on how often the shared methods are on dirty pages vs shared. TODO in another CL, delete gc map field from ArtMethod since we should be able to get it from the Oat method header. Bug: 17643507 Change-Id: Ie9508f05907a9f693882d4d32a564460bf273ee8 (cherry picked from commit e832e64a7e82d7f72aedbd7d798fb929d458ee8f)
|
67555f7e9a05a9d436e034f67ae683bbf02d072d |
|
18-Nov-2014 |
Alexandre Rames <alexandre.rames@arm.com> |
Opt compiler: Add support for more IRs on arm64. Change-Id: I4b6425135d1af74912a206411288081d2516f8bf
|
bacfec30ee9f2f6fdfd190f11b105b609938efca |
|
14-Nov-2014 |
Calin Juravle <calin@google.com> |
[optimizing compiler] Add REM_INT, REM_LONG - for arm, x86, x86_64 - minor cleanup/fix in div tests Change-Id: I240874010206a5a9b3aaffbc81a885b94c248f93
|
9574c4b5f5ef039d694ac12c97e25ca02eca83c0 |
|
12-Nov-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Implement and/or/xor in optimizing. Change-Id: I7cf6da1fd334a7177a5580931b8f174dd40b7cec
|
b7baf5c58d0e864f8c3f889357c51288aed42e61 |
|
11-Nov-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Implement monitorenter/monitorexit. Pretty simple as they just invoke the runtime. Change-Id: I5fcb2c783deac27e55e28d8b3da3e68ea4b77363
|
57a88d4ac205874dc85d22f9f6a9ca3c4c373eeb |
|
10-Nov-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Implement checkcast for optimizing. - Ended up not using HTypeCheck because of how instanceof and checkcast end up having different logic for code generation. - Fix an x86_64 assembler bug triggered by now enabling more methods to be compiled. Difficult to test today without b/18117217. Change-Id: I3022e7ae03befb1d10bea9637ad21fadc430abe0
|
fc19de8b201475231751b9df08fce01a093e5c2b |
|
07-Nov-2014 |
Alexandre Rames <alexandre.rames@arm.com> |
Opt compiler: Add arm64 support for a few more IRs. Change-Id: I781ddcbc61eb2b04ae80b1c7697e1ed5694bd5b9
|
a89086e3be94fb262c4c4feb15241b30616c3b8f |
|
07-Nov-2014 |
Alexandre Rames <alexandre.rames@arm.com> |
Opt compiler: Add arm64 support for floating-point. Change-Id: I0d97ab0f5ab770fee62c819505743febbce8835e
|
52839d17c06175e19ca4a093fb878450d1c4310d |
|
07-Nov-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Support invoke-interface in optimizing. Change-Id: Ic18d7c3d2810557231caf0571956e0c431f5d384
|
4e5965100bd508162c4990fc8e779d6c25e38b9c |
|
07-Nov-2014 |
Alexandre Rames <alexandre.rames@arm.com> |
Opt compiler: Fix HNot on ARM64.
|
5dffc05f6a64d44b4045f3bc00ca40082452875d |
|
07-Nov-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Fix codegen_test: HNot has only one input. Change-Id: I13e54d39dfbf80593f2e9592dbd286c54938e95a
|
6f5c41f9e409bc4da53b5d7c385202255e391e72 |
|
06-Nov-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Implement instanceof in optimizing. - Only fast-path for now: null or same class. - Use pQuickInstanceofNonTrivial for slow path. Change-Id: Ic5196b94bef792f081f3cb4d15157058e1381e6b
|
fb4e5fac5fbaf9171a38aeb4bc082f3c2b3122dd |
|
06-Nov-2014 |
Alexandre Rames <alexandre.rames@arm.com> |
Opt compiler: specify that inputs and outputs don't overlap on arm64. Change-Id: I062b70c6534c0d203674dccddbf11f94da72cdb4
|
f43083d560565aea46c602adb86423daeefe589d |
|
07-Nov-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Do not update Out after it has a valid location. Slow paths use LocationSummary to know where to move things around, and they are executed at the end of the code generation. This fix is needed for https://android-review.googlesource.com/#/c/113345/. Change-Id: Id336c6409479b1de6dc839b736a7234d08a7774a
|
de58ab2c03ff8112b07ab827c8fa38f670dfc656 |
|
05-Nov-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Implement try/catch/throw in optimizing. - We currently don't run optimizations in the presence of a try/catch. - We therefore implement Quick's mapping table. - Also fix a missing null check on array-length. Change-Id: I6917dfcb868e75c1cf6eff32b7cbb60b6cfbd68f
|
55dcfb5e0dd626993bb2b7b9f692c1b02b5d955f |
|
24-Oct-2014 |
Roland Levillain <rpl@google.com> |
Add support for not-long on ARM64 in the optimizing compiler. Change-Id: I3e98ff411ba358d92774def18a12daccdc4f558f
|
d0d4852847432368b090c184d6639e573538dccf |
|
04-Nov-2014 |
Calin Juravle <calin@google.com> |
[optimizing compiler] Add div-int and exception handling. - for backends: arm, x86, x86_64 - fixed a register allocator bug: the request for a fixed register for the first input was ignored if the output was kSameAsFirstInput - added divide by zero exception - more tests - shuffle around some code in the builder to reduce the number of lines of code for a single function. Change-Id: Id3a515e02bfbc66cd9d16cb9746f7551bdab3d42
|
dff1f2812ecdaea89978c5351f0c70cdabbc0821 |
|
05-Nov-2014 |
Roland Levillain <rpl@google.com> |
Support int-to-long conversions in the optimizing compiler. - Add support for the int-to-long Dex instruction in the optimizing compiler. - Add a HTypeConversion node type for control-flow graphs. - Generate x86, x86-64 and ARM (but not ARM64) code for int-to-long HTypeConversion nodes. - Add a 64-bit "Move doubleword to quadword with sign-extension" (MOVSXD) instruction to the x86-64 assembler. - Add related tests to test/422-type-conversion. Change-Id: Ieb8ec5380f9c411857119c79aa8d0728fd10f780
|
277ccbd200ea43590dfc06a93ae184a765327ad0 |
|
04-Nov-2014 |
Andreas Gampe <agampe@google.com> |
ART: More warnings Enable -Wno-conversion-null, -Wredundant-decls and -Wshadow in general, and -Wunused-but-set-parameter for GCC builds. Change-Id: I81bbdd762213444673c65d85edae594a523836e5
|
6a3c1fcb4ba42ad4d5d142c17a3712a6ddd3866f |
|
31-Oct-2014 |
Ian Rogers <irogers@google.com> |
Remove -Wno-unused-parameter and -Wno-sign-promo from base cflags. Fix associated errors about unused parameters and implicit sign conversions. For sign conversion this was largely in the area of enums, so add ostream operators for the affected enums and fix tools/generate-operator-out.py. Tidy arena allocation code and arena allocated data types, rather than fixing new and delete operators. Remove dead code. Change-Id: I5b433e722d2f75baacfacae4d32aef4a828bfe1b
|
b5f62b3dc5ac2731ba8ad53cdf3d9bdb14fbf86b |
|
30-Oct-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Support for CONST_STRING in optimizing compiler. Change-Id: Iab8517bdadd1d15ffbe570010f093660be7c51aa
|
42d641bfd9ef3c03c68177b2a429b20056670d86 |
|
27-Oct-2014 |
Alexandre Rames <alexandre.rames@arm.com> |
Opt compiler: Add ARM64 support for the Mul IR. Also disable compilation and use of the boot image with the optimizing compiler: this won't work with the way we're bringing up arm64 and we need to find a better solution. Bug: 18147756 Change-Id: I6ec0de73681f9226d095bc3db92338dbd46499aa
|
19a19cffd197a28ae4c9c3e59eff6352fd392241 |
|
22-Oct-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Add support for static fields in optimizing compiler. Change-Id: Id2f010589e2bd6faf42c05bb33abf6816ebe9fa9
|
7c4954d429626a6ceafbf05be41bf5f840894e44 |
|
28-Oct-2014 |
Calin Juravle <calin@google.com> |
[optimizing compiler] Add division for floats and doubles backends: x86, x86_64, arm. Also: - ordered instructions based on their name. - add missing kNoOutputOverlap to add/sub/mul. Change-Id: Ie47cde3b15ac74e7a1660c67a2eed1d7871f0ad0
|
5319defdf502fc4569316473846b83180ec08035 |
|
23-Oct-2014 |
Alexandre Rames <alexandre.rames@arm.com> |
ART: optimizing compiler: initial support for ARM64. The ARM64 port uses VIXL for code generation, to which it defers work like label binding and branch resolving, register type coherency checking, and immediate values handling. Change-Id: I0a44508c0c991f472a63e67b3469cdd878fe1a68 Signed-off-by: Serban Constantinescu <serban.constantinescu@arm.com> Signed-off-by: Alexandre Rames <alexandre.rames@arm.com>
|