b3cd84a2fbd4875c605cfa5a4a362864b570f1e6 |
|
13-Jul-2016 |
Nicolas Geoffray <ngeoffray@google.com> |
Fix a bug in ClassTableGet code generation for IMTs. Introduced by: https://android-review.googlesource.com/#/c/244980/ Test: 566-polymorphic-inlining (covers the x86 crash). Also fixes a performance regression. bug:29188168 (cherry picked from commit ff484b95b25a5181a6a8a191cbd11da501c97651) Change-Id: Iae5a63cb24017222c3fefda695a0a39673719f51
|
df2d4f22d5e89692c90b443da82fe2930518418b |
|
30-Jun-2016 |
Artem Udovichenko <artem.u@samsung.com> |
Revert "Revert "Optimize IMT"" This reverts commit 88f288e3564d79d87c0cd8bb831ec5a791ba4861. Test: Includes smali tests to exercise cts failures that led to revert. These tests check that objects that don't implement any interfaces are handled properly when interface methods are invoked on them. Bug: 29188168 (for initial CL) Bug: 29778499 (reason for revert) Change-Id: I49605d53692cbec1e2622e23ff2893fc51ed4115
|
fd43db68d204caaa0e411ca79a37af15d1c001af |
|
29-Jun-2016 |
Jeff Hao <jeffhao@google.com> |
Revert "Optimize IMT" This reverts commit 0790af1391b316c5c12b4e135be357008c060696. Bug: 29188168 (for initial CL) Bug: 29778499 (reason for revert) Change-Id: I2c3e4ec2cebdd40faec67ddb721b7acdc8e90061
|
0790af1391b316c5c12b4e135be357008c060696 |
|
13-May-2016 |
Nelli Kim <nelli.kim@samsung.com> |
Optimize IMT
* Remove the IMT for classes which do not implement any interfaces
* Remove the IMT for array classes
* Share identical IMTs
Saved memory (measured on hammerhead):
  boot.art: 3854 classes total, 1637 affected, 409kB saved
  Chrome (excluding classes in boot.art): 2409 classes total, 1259 affected, 314kB saved
  Google Maps (excluding classes in boot.art): 6988 classes total, 2574 affected, 643kB saved
Performance regression on the benchmarks/InvokeInterface.java benchmark (measured timeCall10Interface): 9.6% on 1st launch, 6.8% on 2nd launch.
Bug: 29188168 (cherry picked from commit badee9820fcf5dca5f8c46c3215ae1779ee7736e) Change-Id: If8db765e3333cb78eb9ef0d66c2fc78a5f17f497
|
e5de54cfab5f14ba0b8ff25d8d60901c7021943f |
|
20-Apr-2016 |
Calin Juravle <calin@google.com> |
Split profile recording from jit compilation We still use ProfileInfo objects to record profile information. That gives us the flexibility to add the inline caches in the future and the convenience of the already implemented GC. If UseJIT is false and SaveProfilingInfo is true, we will only record the ProfileInfo and never launch compilation tasks. Bug: 27916886 Change-Id: I6e4768dc5d58f2f85f947b276b4244aa11ce3fca
|
d1ee80948144526b985afb44a0574248cf7da58a |
|
13-Apr-2016 |
Vladimir Marko <vmarko@google.com> |
Move Assemblers to the Arena. And clean up some APIs to return std::unique_ptr<> instead of raw pointers that don't communicate ownership. (cherry picked from commit 93205e395f777c1dd81d3f164cf9a4aec4bde45f) Bug: 27505766 Change-Id: I3017302307a0253d661240750298802fb0d9585e
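A minimal sketch of the ownership clean-up described above, under illustrative names (ArenaAllocator stands in for the compiler's arena; the factory signature is not ART's actual API):

    #include <memory>

    class ArenaAllocator {};  // stand-in for the compiler's arena

    class Assembler {
     public:
      explicit Assembler(ArenaAllocator* arena) : arena_(arena) {}
      virtual ~Assembler() = default;
     private:
      ArenaAllocator* const arena_;
    };

    // Before: a raw pointer said nothing about who must free the assembler.
    //   Assembler* CreateAssembler(ArenaAllocator* arena);
    // After: std::unique_ptr makes the caller's ownership explicit.
    std::unique_ptr<Assembler> CreateAssembler(ArenaAllocator* arena) {
      return std::unique_ptr<Assembler>(new Assembler(arena));
    }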
|
dee58d6bb6d567fcd0c4f39d8d690c3acaf0e432 |
|
07-Apr-2016 |
David Brazdil <dbrazdil@google.com> |
Revert "Revert "Refactor HGraphBuilder and SsaBuilder to remove HLocals"" This patch merges the instruction-building phases from HGraphBuilder and SsaBuilder into a single HInstructionBuilder class. As a result, it is not necessary to generate HLocal, HLoadLocal and HStoreLocal instructions any more, as the builder produces SSA form directly. Saves 5-15% of arena-allocated memory (see bug for more data): GMS 20.46MB => 19.26MB (-5.86%) Maps 24.12MB => 21.47MB (-10.98%) YouTube 28.60MB => 26.01MB (-9.05%) This CL fixed an issue with parsing quickened instructions. Bug: 27894376 Bug: 27998571 Bug: 27995065 Change-Id: I20dbe1bf2d0fe296377478db98cb86cba695e694
|
40ecb12f8eeb97b810e11f895278abbf7988ed4d |
|
06-Apr-2016 |
Vladimir Marko <vmarko@google.com> |
Optimizing: Fix codegens for MethodLoadKind::kDexCacheViaMethod. Use the original method index instead of the target method index because the target method may point to a different dex file. No regression test: this currently happens only if the codegen uses kDexCacheViaMethod as a fallback for another load kind, and we aim to avoid that fallback, so it would be difficult to write a reliable regression test. We could try to exploit the current fallbacks for irreducible loops on x86 and arm, but those fallbacks will eventually disappear anyway. Bug: 28036230 Change-Id: I4cc9e046480d3d60a7fb521f0ca6a98914625cdc
|
60328910cad396589474f8513391ba733d19390b |
|
04-Apr-2016 |
David Brazdil <dbrazdil@google.com> |
Revert "Refactor HGraphBuilder and SsaBuilder to remove HLocals" Bug: 27995065 This reverts commit e3ff7b293be2a6791fe9d135d660c0cffe4bd73f. Change-Id: I5363c7ce18f47fd422c15eed5423a345a57249d8
|
e3ff7b293be2a6791fe9d135d660c0cffe4bd73f |
|
02-Mar-2016 |
David Brazdil <dbrazdil@google.com> |
Refactor HGraphBuilder and SsaBuilder to remove HLocals This patch merges the instruction-building phases from HGraphBuilder and SsaBuilder into a single HInstructionBuilder class. As a result, it is not necessary to generate HLocal, HLoadLocal and HStoreLocal instructions any more, as the builder produces SSA form directly. Saves 5-15% of arena-allocated memory (see bug for more data): GMS 20.46MB => 19.26MB (-5.86%) Maps 24.12MB => 21.47MB (-10.98%) YouTube 28.60MB => 26.01MB (-9.05%) Bug: 27894376 Change-Id: Iefe28d40600c169c5d306fd2c77034ae19476d90
|
4c858cd53e0efba9ee0c3e20d035900fcf14c145 |
|
16-Mar-2016 |
Pavel Vyssotski <pavel.n.vyssotski@intel.com> |
ART: Fix TypeConversion from long const to float on x86_64 LocationsBuilderX86_64::VisitTypeConversion should load a 32-bit constant for the float type. Change-Id: I24335568af65e6b98bf07d36f90c8696497dd137 Signed-off-by: Pavel Vyssotski <pavel.n.vyssotski@intel.com>
|
cac5a7e871f1f346b317894359ad06fa7bd67fba |
|
22-Feb-2016 |
Vladimir Marko <vmarko@google.com> |
Optimizing: Improve const-string code generation.
For strings in the boot image, use either direct pointers or pc-relative addresses. For other strings, use PC-relative access to the dex cache arrays for AOT and the direct address of the string's dex cache slot for JIT.
For aosp_flounder-userdebug:
  - 32-bit boot.oat: -692KiB (-0.9%)
  - 64-bit boot.oat: -948KiB (-1.1%)
  - 32-bit dalvik cache total: -900KiB (-0.9%)
  - 64-bit dalvik cache total: -3672KiB (-1.5%) (contains more files than the 32-bit dalvik cache)
For aosp_flounder-userdebug forced to compile PIC:
  - 32-bit boot.oat: -380KiB (-0.5%)
  - 64-bit boot.oat: -928KiB (-1.0%)
  - 32-bit dalvik cache total: -468KiB (-0.4%)
  - 64-bit dalvik cache total: -1928KiB (-0.8%) (contains more files than the 32-bit dalvik cache)
Bug: 26884697 Change-Id: Iec7266ce67e6fedc107be78fab2e742a8dab2696
|
5b5b9319ff970979ed47d41a41283e4faeffb602 |
|
22-Mar-2016 |
Roland Levillain <rpl@google.com> |
Fix and improve shift and rotate operations.
- Define maximum int and long shift & rotate distances as int32_t constants, as shift & rotate distances are 32-bit integer values.
- Consider the (long, long) inputs case invalid for static evaluation of shift & rotate operations.
- Add more checks in shift & rotate operation constructors as well as in art::GraphChecker.
Change-Id: I754b326c3a341c9cc567d1720b327dad6fcbf9d6
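A standalone sketch of the distance handling described above; the constant names follow ART's kMaxIntShiftDistance convention, but the folding function itself is illustrative:

    #include <cstdint>

    // Shift/rotate distances are 32-bit values; only the low bits matter.
    static constexpr int32_t kMaxIntShiftDistance = 0x1f;   // 31
    static constexpr int32_t kMaxLongShiftDistance = 0x3f;  // 63

    // Static evaluation of an int `value << distance`, masking the distance
    // exactly as the runtime instruction would.
    int32_t EvaluateShl(int32_t value, int32_t distance) {
      return static_cast<int32_t>(static_cast<uint32_t>(value)
                                  << (distance & kMaxIntShiftDistance));
    }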
|
1a65388f1d86bb232c2e44fecb44cebe13105d2e |
|
18-Mar-2016 |
Roland Levillain <rpl@google.com> |
Clean up art::HConstant predicates. - Make the difference between arithmetic zero and the zero-bit pattern unambiguous. - Introduce Boolean predicates in art::HIntConstant for when they are used as Booleans. - Introduce arithmetic positive and negative zero predicates for floating-point constants. Bug: 27639313 Change-Id: Ia04ecc6f6aa7450136028c5362ed429760c883bd
|
d28f4a00933a4a3b8d5e9db73b8532924d0f989d |
|
14-Mar-2016 |
David Srbecky <dsrbecky@google.com> |
Generate native debug stack maps before calls as well. The debugger looks up the PC of the call instruction, so the runtime's stack map is not sufficient, since it is at the PC after the instruction. Change-Id: I0dd06c0b52e8079ea5d064ea10beb12c93584092
|
a5c4a4060edd03eda017abebc85f24cffb083ba7 |
|
15-Mar-2016 |
Roland Levillain <rpl@google.com> |
Make art::HCompare support boolean, byte, short and char inputs. Also extend tests covering the IntegerSignum, LongSignum, IntegerCompare and LongCompare intrinsics and their translation into an art::HCompare instruction. Bug: 27629913 Change-Id: I0afc75ee6e82602b01ec348bbb36a08e8abb8bb8
|
2ae48182573da7087bffc2873730bc758ec29696 |
|
16-Mar-2016 |
Calin Juravle <calin@google.com> |
Clean up NullCheck generation and record stats about it. This removes redundant code from the generators and allows for easier stat recording. Change-Id: Iccd4368f9e9d87a6fecb863dee4e2145c97851c4
|
e5671618d19489ad0781ec0d204c7765317170cf |
|
16-Mar-2016 |
Nicolas Geoffray <ngeoffray@google.com> |
Accept boolean as an input of HDivZeroCheck. All our arithmetic operations accept it. bug:27624718 Change-Id: I1f6bb95dc77ecb3fb2fcabb35a93b31c524bfa0a
|
a1de9188a05afdecca8cd04ecc4fefbac8b9880f |
|
25-Feb-2016 |
Vladimir Marko <vmarko@google.com> |
Optimizing: Reduce memory usage of HInstructions. Pack narrow fields and flags into a single 32-bit field. Change-Id: Ib2f7abf987caee0339018d21f0d498f8db63542d
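A hedged sketch of the packing: boolean flags and narrow fields that each used to occupy their own member now share one 32-bit word. Field names and widths here are illustrative, not HInstruction's actual layout:

    #include <cstddef>
    #include <cstdint>

    class PackedInstruction {
     public:
      bool HasSideEffects() const { return (packed_fields_ & kFlagSideEffects) != 0u; }
      void SetSideEffects() { packed_fields_ |= kFlagSideEffects; }

      uint32_t GetKind() const { return (packed_fields_ >> kFieldKindShift) & kFieldKindMask; }
      void SetKind(uint32_t kind) {
        packed_fields_ = (packed_fields_ & ~(kFieldKindMask << kFieldKindShift)) |
                         ((kind & kFieldKindMask) << kFieldKindShift);
      }

     private:
      static constexpr uint32_t kFlagSideEffects = 1u << 0;
      static constexpr size_t kFieldKindShift = 1u;
      static constexpr uint32_t kFieldKindMask = 0xffu;  // 8 bits for the kind

      uint32_t packed_fields_ = 0u;  // flags and narrow fields share 32 bits
    };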
|
9cd6d378bd573cdc14d049d32bdd22a97fa4d84a |
|
09-Feb-2016 |
David Srbecky <dsrbecky@google.com> |
Associate slow paths with the instruction that they belong to. Almost all slow paths already know the instruction they belong to; this CL just moves that knowledge to the base class as well. This is needed to be able to get the corresponding dex pc for a slow path, which allows us to generate better native line numbers, which in turn fixes some native debugging stepping issues. Change-Id: I568dbe78a7cea6a43a4a71a014b3ad135782c270
|
c7098ff991bb4e00a800d315d1c36f52a9cb0149 |
|
09-Feb-2016 |
David Srbecky <dsrbecky@google.com> |
Remove HNativeDebugInfo from the start of basic blocks. We do not require a full environment at the start of a basic block. The dex pc contained in the basic block is sufficient for line mapping. Change-Id: I5ba9e5f5acbc4a783ad544769f9a73bb33e2bafa
|
ed009780b56a81c5046e6b5a344e12117ea45357 |
|
22-Feb-2016 |
Vladimir Marko <vmarko@google.com> |
Optimizing/x86-64: Use MOVL in Load64BitValue() if IsUint<32>(). Change-Id: Ie8bfb1861a384d0906f2aff9e8a94be0925c65b6
|
b52bbde2870e5ab5d126612961dcb3da8e5236ee |
|
12-Feb-2016 |
Vladimir Marko <vmarko@google.com> |
Optimizing: Simplify consecutive type conversions. Merge two consecutive type conversions into one if the result of such a merged conversion is guaranteed to be the same, and remove all implicit conversions, not just conversions to the same type. Improve codegens to handle conversions from long to integral types smaller than int. This will make it easier to simplify `(byte) (x & 0xffL)` to `(byte) x`, where the conversion from long to byte is done by two dex instructions, long-to-int and int-to-byte. Bug: 23965701 Change-Id: I833f193556671136ad2cd3f5b31cdfbc2d99c19d
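A small standalone check of the merge's soundness for the example above: converting long to int and then int to byte yields the same byte as converting long to byte directly, which is what lets the two dex conversions collapse into one. A sketch of the reasoning, not ART code:

    #include <cassert>
    #include <cstdint>

    int8_t LongToIntToByte(int64_t x) {
      return static_cast<int8_t>(static_cast<int32_t>(x));  // long-to-int, int-to-byte
    }

    int8_t LongToByte(int64_t x) {
      return static_cast<int8_t>(x);  // the merged conversion
    }

    int main() {
      for (int64_t x : {int64_t{0x1234567890abcdef}, int64_t{-1}, int64_t{0x80}}) {
        assert(LongToIntToByte(x) == LongToByte(x));    // same low 8 bits
        assert(LongToByte(x & 0xff) == LongToByte(x));  // the `& 0xffL` is redundant
      }
      return 0;
    }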
|
dee1b9aec66d1ab1472df3954d8c1cc25896f62e |
|
12-Feb-2016 |
Mark Mendell <mark.p.mendell@intel.com> |
X86_64: Allow HSelect to generate CMOV from memory Use cmov with an Address operand to allow CMOV from a stack location. Change-Id: Ia2f856c7b5003c413f23adaabe19be06f38c78ab Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>
|
6e332529c33be4d7dae5dad3609a839f4c0d3bfc |
|
02-Feb-2016 |
David Brazdil <dbrazdil@google.com> |
ART: Remove HTemporary Change-Id: I21b984224370a9ce7a4a13a9652503cfb03c5f03
|
86503785cd6414b8692e5c83cadaa2972b6a099b |
|
11-Feb-2016 |
Roland Levillain <rpl@google.com> |
Fix x86-64 Baker's read barrier fast path for CheckCast. Use an art::x86_64::Label instead of an art::x86_64::NearLabel as end label when emitting code for a HCheckCast instruction, as the range of the latter may sometimes be too short when Baker's read barriers are enabled. Bug: 12687968 Change-Id: Ia9742dce65be7d4fb104688f3c4717b65df1fb54
|
a19616e3363276e7f2c471eb2839fb16f1d43f27 |
|
02-Feb-2016 |
Aart Bik <ajcbik@google.com> |
Implemented the compare/signum intrinsics as HCompare (with code generation for all architectures). Rationale: at the HIR level, many more optimizations are possible, while the ultimately generated code can take advantage of the full semantics. Change-Id: I6e2ee0311784e5e336847346f7f3c4faef4fd17e
|
7c0b44f180f1b8cf82c568091d250071d1130954 |
|
01-Feb-2016 |
Mark Mendell <mark.p.mendell@intel.com> |
Support CMOV for x86_64 Select If possible, generate CMOV to implement HSelect. Tricky cases are an FP condition (no single CC generated), FP inputs (no FP CMOV) and when the condition is a boolean or not emitted at the use site. In these cases, keep using the existing HSelect code. Added Load32BitValue for int and FP and used that to remove code duplication. Added minimal checker test for int/long CMOV generation. Change-Id: Id71e515f0afa5a30f53c5de3a5244de1ea429aae Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>
|
c5d4754198aadb2ada2d3f5daacd10d79bc13f38 |
|
28-Jan-2016 |
Aart Bik <ajcbik@google.com> |
Implementation of integer intrinsics on x86_64 Rationale: Efficient implementations of common integer operations. Already tested in: 564-checker-bitcount 565-checker-rotate 566-checker-signum 567-checker-compare 568-checker-onebit (extended to deal with run-time zero) Change-Id: Ib48c76eee751e7925056d7f26797e9a9b5ae60dd
|
a42363f79832a6e14f348514664dc6dc3edf9da2 |
|
17-Dec-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Implement first kind of polymorphic inlining. Add HClassTableGet to fetch an ArtMethod from the vtable or imt, and compare it to the only method the profiling saw. Change-Id: I76afd3689178f10e3be048aa3ac9a97c6f63295d
|
74eb1b264691c4eb399d0858015a7fc13c476ac6 |
|
14-Dec-2015 |
David Brazdil <dbrazdil@google.com> |
ART: Implement HSelect This patch adds a new HIR instruction to Optimizing. HSelect returns one of two inputs based on the outcome of a condition. This is only an initial implementation which:
- defines the new instruction,
- repurposes BooleanSimplifier to emit it,
- extends InstructionSimplifier to statically resolve it,
- updates existing code and tests accordingly.
Code generators currently emit fallback if/then/else code and will be updated in follow-up CLs to use platform-specific conditional moves when possible. Change-Id: Ib61b17146487ebe6b55350c2b589f0b971dcaaee
|
b3e773eea39a156b3eacf915ba84e3af1a5c14fa |
|
26-Jan-2016 |
David Brazdil <dbrazdil@google.com> |
ART: Implement support for instruction inlining Optimizing HIR contains 'non-materialized' instructions which are emitted at their use sites rather than their defining sites. This was not properly handled by the liveness analysis, which did not adjust the use positions of the inputs of such instructions. Despite the analysis being incorrect, the current use cases never produce incorrect code. This patch generalizes the concept of inlined instructions and updates the liveness analysis to compute the use positions correctly. Change-Id: Id703c154b20ab861241ae5c715a150385d3ff621
|
95e7ffc28ea4d6deba356e636b16120ae49b62e2 |
|
22-Jan-2016 |
Roland Levillain <rpl@google.com> |
Improve documentation and assertions of read barrier instrumentation. For ARM, x86, x86-64 back ends. The case of the ARM64 back end is already handled in https://android-review.googlesource.com/#/c/197870/. Bug: 12687968 Change-Id: I6df1128cc100cbdb89020876e1a54de719508be3
|
e3f43ac79e50a4693ea4d46acf5cffca64910cee |
|
19-Jan-2016 |
Roland Levillain <rpl@google.com> |
Some read barrier clean-up in Optimizing. These changes make the read barrier compiler instrumentation code more uniform among the ARM, ARM64, x86 and x86-64 back ends. Bug: 12687968 Change-Id: I6b1c0cf2bc22ed6cd6b14754136bef4a2a036ea5
|
58282f4510961317b8d5a364a6f740a78926716f |
|
14-Jan-2016 |
David Brazdil <dbrazdil@google.com> |
ART: Remove Baseline compiler We don't need Baseline any more and it hasn't been maintained for a while anyway. Let's remove it. Change-Id: I442ed26855527be2df3c79935403a25b1ee55df6
|
6de1938e562b0d06e462512dd806166e754035ea |
|
08-Jan-2016 |
David Brazdil <dbrazdil@google.com> |
ART: Remove incorrect HFakeString optimization Simplification of HFakeString assumes that it cannot be used until String.<init> is called, which is not true and causes different behaviour between the compiler and the interpreter. This patch removes the optimization together with the HFakeString instruction. Instead, HNewInstance is generated and an empty String allocated until it is replaced with the result of the StringFactory call. This is consistent with the behaviour of the interpreter but is too conservative. A follow-up CL will attempt to optimize out the initial allocation when possible. Bug: 26457745 Bug: 26486014 Change-Id: I7139e37ed00a880715bfc234896a930fde670c44
|
42249c3602c3d0243396ee3627ffb5906aa77c1e |
|
08-Jan-2016 |
Aart Bik <ajcbik@google.com> |
Reduce code size by sharing slow paths. Rationale: Sharing identical slow path code reduces code size. Background: Currently, slow paths with the same dex-pc, same physical register spilling code, and identical stack maps are shared (making this only useful for deopt slow paths). The newly introduced mechanism is sufficiently general to allow future improvements by e.g. allowing different dex-pc (by passing this to runtime) or even the kind of slow paths (by passing runtime addresses to the slowpath). Change-Id: I819615c47b4fd98440a241f681f93e4fc22d12e0
|
b7070a2db8b0b7eca14f01f932be305be64ded57 |
|
08-Jan-2016 |
David Srbecky <dsrbecky@google.com> |
Generate Nops to ensure that debug stack maps have distinct PC. Change-Id: I5740ec958a20d236634b66df0e675382ed5c16fc
|
68f6289fbc1b14ed814722c023b3f343c1e59a79 |
|
04-Jan-2016 |
Nicolas Geoffray <ngeoffray@google.com> |
Don't use std::abs on INT_MIN/LONG_MIN; it's undefined. bug:25494265 Change-Id: I560a3a589b92440020285f9adfdf7c9efb06217c
|
8a1c728e40813d30a85a1f27afaf16a3f105d32a |
|
29-Jun-2015 |
Mark Mendell <mark.p.mendell@intel.com> |
X86_64: Replace x86_64 xchg instruction use Replacing 'xchg' to exchange two registers with a three-instruction move sequence using the 'TMP' register r10 seems to be a big win. This is because xchg is a serializing instruction, even when used on registers. Change-Id: I1c0f7687630936e7f3d2efc4b30ad11233bd484c Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>
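A sketch of the replacement, with the two x86-64 instruction sequences shown as comments over a trivial C++ equivalent; r10 is the scratch 'TMP' register mentioned in the commit:

    #include <cstdint>

    // One instruction, but serializing (per the commit) even for registers:
    //   xchg %rbx, %rax
    // Three cheap moves through the scratch register instead:
    //   movq %rax, %r10   // TMP = a
    //   movq %rbx, %rax   // a = b
    //   movq %r10, %rbx   // b = TMP
    void SwapViaTmp(int64_t& a, int64_t& b) {
      int64_t tmp = a;
      a = b;
      b = tmp;
    }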
|
152408f8c2188a7ed950cad04883b2f67dc74e84 |
|
31-Dec-2015 |
Mark Mendell <mark.p.mendell@intel.com> |
X86: templatize GenerateTestAndBranch and friends Allow the use of NearLabel as well as Label. This will be used by the HSelect patch. Replace a couple of Label(s) with NearLabel(s) as well. Change-Id: I8e674c89e691bcdbccf4a5cdc07ad13b29ec21dd Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>
|
6ce017304099d1df97ffa016ce0efce79c67f344 |
|
30-Dec-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
On x64, cmpl can never take an int64 immediate. Fix a wrong type widening in the x64 code generator and add CHECKs in the assembler. Change-Id: Id35f5d47c6cf78ed07e73ab783db09712d3c437f
|
7f59d59ff5a716283c9ba0ead17ab7c51bc2e525 |
|
29-Dec-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Fix code generation for String.<init> on x64. The ArtMethod is a 64-bit pointer, so it should be loaded with movq. Change-Id: I80803046a9144776d7f069e8baee61e39ae289d5
|
0cf4493166ff28518c8eafa2d0463f6e817cce75 |
|
09-Dec-2015 |
David Srbecky <dsrbecky@google.com> |
Generate more stack maps during native debugging. Generate an extra stack map at the start of each Java statement. The stack maps are later translated to DWARF, which allows LLDB to set breakpoints and view local variables. Change-Id: If00ab875513308e4a1399d1e12e0fe8934a6f0c3
|
5f7b58ea1adfc0639dd605b65f59198d3763f801 |
|
23-Nov-2015 |
Vladimir Marko <vmarko@google.com> |
Rewrite HInstruction::Is/As<type>(). Make Is<type>() and As<type>() non-virtual for concrete instruction types, relying on GetKind(), and mark GetKind() as PURE to improve optimization opportunities. This reduces the number of relocations in libart-compiler.so's .rel.dyn section by ~4K, or ~44%, and in .data.rel.ro by ~18K, or ~65%. The file is 96KiB smaller for Nexus 5, including 8KiB reduction of the .text section. Unfortunately, the g++/clang++ __attribute__((pure)) is not strong enough to avoid duplicated virtual calls and we would need the C++ [[pure]] attribute proposed in n3744 instead. To work around this deficiency, we introduce an extra non-virtual indirection for GetKind(), so that the compiler can optimize common expressions such as instruction->IsAdd() || instruction->IsSub() or instruction->IsAdd() && instruction->AsAdd()->... which contain two virtual calls to GetKind() after inlining. Change-Id: I83787de0671a5cb9f5b0a5f4a536cef239d5b401
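A hedged sketch of the rewrite: the per-type checks become non-virtual comparisons against a single GetKind() value, reached through a non-virtual wrapper so the compiler can merge repeated calls. Names are illustrative:

    class HAdd;

    class HInstruction {
     public:
      enum InstructionKind { kAdd, kSub /* ... one value per concrete type */ };
      virtual ~HInstruction() = default;

      // The only virtual call; everything below is non-virtual.
      virtual InstructionKind GetKindInternal() const = 0;
      // Non-virtual indirection that the compiler can fold across
      // `IsAdd() || IsSub()`-style expressions.
      InstructionKind GetKind() const { return GetKindInternal(); }

      bool IsAdd() const { return GetKind() == kAdd; }
      inline HAdd* AsAdd();
    };

    class HAdd final : public HInstruction {
     public:
      InstructionKind GetKindInternal() const override { return kAdd; }
    };

    inline HAdd* HInstruction::AsAdd() {
      return IsAdd() ? static_cast<HAdd*>(this) : nullptr;
    }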
|
f3e0ee27f46aa6434b900ab33f12cd3157578234 |
|
17-Dec-2015 |
Vladimir Marko <vmarko@google.com> |
Revert "Revert "ART: Reduce the instructions generated by packed switch."" This reverts commit b4c137630fd2226ad07dfd178ab15725374220f1. The underlying issue was fixed by https://android-review.googlesource.com/188271 . Bug: 26121945 Change-Id: I58b08eb1a9f0a5c861f8cda93522af64bcf63920
|
17077d888a6752a2e5f8161eee1b2c3285783d12 |
|
16-Dec-2015 |
Mark P Mendell <mark.p.mendell@intel.com> |
Revert "Revert "X86: Use locked add rather than mfence"" This reverts commit 0da3b9117706760e8722029f407da6d0297cc943. Fix a compilation failure that slipped in somehow. Change-Id: Ide8681cdc921febb296ea47aa282cc195f154049
|
0da3b9117706760e8722029f407da6d0297cc943 |
|
16-Dec-2015 |
Aart Bik <ajcbik@google.com> |
Revert "X86: Use locked add rather than mfence" This reverts commit 7b3e4f99b25c31048a33a08688557b133ad345ab. Reason: build error on sdk (linux) in git_mirror-aosp-master-with-vendor , please fix first art/compiler/optimizing/code_generator_x86_64.cc:4032:7: error: use of undeclared identifier 'codegen_' codegen_->MemoryFence(); Change-Id: I91f8542cfd944b7425d1981c35872dcdcb901e18
|
b4c137630fd2226ad07dfd178ab15725374220f1 |
|
16-Dec-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Revert "ART: Reduce the instructions generated by packed switch." This reverts commit 59f054d98f519a3efa992b1c688eb97bdd8bbf55. bug:26121945 Change-Id: I8a5ad7ef1f1de8d44787c27528fa3f7f5c2e9cd3
|
7b3e4f99b25c31048a33a08688557b133ad345ab |
|
19-Nov-2015 |
Mark Mendell <mark.p.mendell@intel.com> |
X86: Use locked add rather than mfence
Java semantics for memory ordering can be satisfied using lock addl $0,0(SP) rather than mfence. The locked add synchronizes the memory caches, but doesn't affect device memory.
Timing on a micro benchmark with an mfence or lock add $0,0(sp) in a loop with 600000000 iterations:
  time ./mfence      real 0m5.411s  user 0m5.408s  sys 0m0.000s
  time ./locked_add  real 0m3.552s  user 0m3.550s  sys 0m0.000s
Implement this as an instruction-set-feature lock_add. It is off by default (uses mfence), and enabled for atom & silvermont variants. Generation of mfence can be forced by a parameter to MemoryFence.
Change-Id: I5cb4fded61f4cbbd7b7db42a1b6902e43e458911 Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>
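The two fence idioms as GCC/Clang inline assembly, for illustration; this mirrors the idea of the commit, not ART's actual MemoryFence() code:

    // Full fence via MFENCE (the previous code).
    static inline void FenceMfence() {
      __asm__ __volatile__("mfence" ::: "memory");
    }

    // Same ordering guarantee for Java volatile semantics via a locked
    // read-modify-write of the top of the stack; cheaper on atom/silvermont.
    static inline void FenceLockedAdd() {
      __asm__ __volatile__("lock addl $0, (%%rsp)" ::: "memory", "cc");
    }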
|
1e7f8db01a929ac816ca122868edc067c3c6cd17 |
|
15-Dec-2015 |
Roland Levillain <rpl@google.com> |
x86-64 Baker's read barrier fast path implementation. Introduce an x86-64 fast path implementation in Optimizing for Baker's read barriers (for both heap reference loads and GC root loads). The marking phase of the read barrier is performed by a slow path, invoking the runtime entry point artReadBarrierMark. Other read barrier algorithms continue to use the original slow path based implementation, which has been renamed as GenerateReadBarrierSlow/GenerateReadBarrierForRootSlow. Bug: 12687968 Change-Id: I9329293ddca7f9bcb512132bde6675aa202b98b2
|
351dddf4025f07477161209e374741f089d97cb4 |
|
11-Dec-2015 |
Vladimir Marko <vmarko@google.com> |
Optimizing: Clean up after HRor. Change-Id: I96bd7fa2e8bdccb87a3380d063dad0dd57fed9d7
|
40a04bf64e5837fa48aceaffe970c9984c94084a |
|
11-Dec-2015 |
Scott Wakeling <scott.wakeling@linaro.org> |
Replace rotate patterns and invokes with HRor IR.
Replace constant and register version bitfield rotate patterns, and rotateRight/Left intrinsic invokes, with the new HRor IR. Where k is a constant and r is a register, with the UShr and Shl on either side of a |, +, or ^, the following patterns are replaced:
  x >>> #k OP x << #(reg_size - k)
  x >>> #k OP x << #-k
  x >>> r OP x << (#reg_size - r)
  x >>> (#reg_size - r) OP x << r
  x >>> r OP x << -r
  x >>> -r OP x << r
Implemented for ARM/ARM64 & X86/X86_64. Tests changed to not be inlined, to prevent the optimization from folding them out. Additional tests added for constant rotate amounts.
Change-Id: I5847d104c0a0348e5792be6c5072ce5090ca2c34
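The first register pattern above, written out in C++; HRor lets the backends emit this whole expression as a single rotate instruction (e.g. ror on x86 and ARM) instead of two shifts and an or. A sketch, not ART's pattern matcher:

    #include <cstdint>

    // Source-level pattern: x >>> r | x << (32 - r)  (reg_size == 32).
    // With HRor this becomes one rotate-right.
    uint32_t RotateRight(uint32_t x, uint32_t r) {
      r &= 31u;                                    // distances are taken mod 32
      return (x >> r) | (x << ((32u - r) & 31u));  // well-defined even for r == 0
    }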
|
917d01680714b2295f109f8fea0aa06764a30b70 |
|
24-Nov-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Don't generate a slow path for strings in the dex cache. Change-Id: I1d258f1a89bf0ec7c7ddd134be9215d480f0b09a
|
59f054d98f519a3efa992b1c688eb97bdd8bbf55 |
|
07-Dec-2015 |
Zheng Xu <zheng.xu@linaro.org> |
ART: Reduce the instructions generated by packed switch. Implement Vladimir Marko's suggestion. The new compare/jump series reduces the number of instructions from (2*n+1) to (1.5*n+3). Generate a normal compare/jump series when numEntries <= 3. Generate an optimal compare/jump series when numEntries <= threshold. Generate jump tables otherwise. Change-Id: I425547b6787057c7fa84e71f17c145b63b208633
|
e523423a053af5cb55837f07ceae9ff2fd581712 |
|
02-Dec-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Revert "Revert "Don't use the compiler driver for method resolution."" This reverts commit c88ef3a10c474045a3476a02ae75d07ddd3230b7. Change-Id: I0ed88a48b313a8d28bc39fae40631123aadb13ef
|
c88ef3a10c474045a3476a02ae75d07ddd3230b7 |
|
01-Dec-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Revert "Don't use the compiler driver for method resolution." Fails 425 in debuggable mode. This reverts commit 4db0bf9c4db6a09716c3388b7d2f88d534470339. Change-Id: I346df8f75674564fc4fb241c60f23e250fc7f0a7
|
4db0bf9c4db6a09716c3388b7d2f88d534470339 |
|
23-Nov-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Don't use the compiler driver for method resolution. The compiler driver makes assumptions that don't hold for the optimizing compiler, and will, for example, always go to the slow path for an invoke-super when there's no verified method. Also fix GenerateInvokeVirtual in the presence of intrinsics. The next change will address some of the TODOs in sharpening.cc. Change-Id: I2b0e543ee9b9bebcadb2d26de29e850c59ad58b9
|
42e372e5a34d0fef88007bc5f40dd0fc7c03b58b |
|
24-Nov-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Optimize HLoadClass when we know the class is in the cache. Change-Id: Iaa74591eed0f2eabc9ba9f9988681d9582faa320
|
888d067a67640e7d9fc349b0451dfe845acad562 |
|
23-Nov-2015 |
Roland Levillain <rpl@google.com> |
Revamp art::CheckEntrypointTypes uses. Change-Id: I6e13e594539e766ed94524ac3282cec292ba91da
|
4f6b0b551ee549af12fce75c8379f5137fe4cfad |
|
23-Nov-2015 |
Roland Levillain <rpl@google.com> |
Clean up read barrier related comments in Optimizing. Bug: 12687968 Change-Id: Idf2e371e01e10d9d32c95b150735e2c96244232e
|
729645a937eb9f04a311b3c22471dcf3ebe9bcec |
|
19-Nov-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Explicitly add HLoadClass/HClinitCheck for HNewInstance. bug:25735083 bug:25173758 Change-Id: Ie81cfa4fa9c47cc025edb291cdedd7af209a03db
|
c53c0797a78a89d637e4230503cc1feb27e855a8 |
|
19-Nov-2015 |
Vladimir Marko <vmarko@google.com> |
Clean up the special input in HInvokeStaticOrDirect. Change-Id: I4042aefbdac1a8c236d00e2e7145349a64f6486b
|
0debae7bc89eb05f7a2bf7dccd223318fad7c88d |
|
12-Nov-2015 |
David Brazdil <dbrazdil@google.com> |
ART: Refactor GenerateTestAndBranch Each code generator implements a method for generating condition evaluation and branching to arbitrary labels. This patch refactors it for better clarity but also to generate fewer jumps when the true branch is the fallthrough successor. This is preliminary work for implementing HSelect. Change-Id: Iaa545a5ecbacb761c5aa241fa69140cf6eb5952f
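A self-contained sketch of the fallthrough optimization this refactoring enables; the labels and condition codes are stand-ins for the codegen's types:

    #include <cstdio>

    enum Condition { kEqual, kNotEqual };
    struct Label { const char* name; };

    Condition Opposite(Condition c) { return c == kEqual ? kNotEqual : kEqual; }
    void EmitCondJump(Condition c, Label* t) {
      std::printf("j%s %s\n", c == kEqual ? "e" : "ne", t->name);
    }
    void EmitJump(Label* t) { std::printf("jmp %s\n", t->name); }

    // When the true block is the fallthrough successor, one inverted
    // conditional jump suffices; the old code always emitted two jumps.
    void GenerateTestAndBranch(Condition cond, Label* true_target,
                               Label* false_target, Label* fallthrough) {
      if (true_target == fallthrough) {
        EmitCondJump(Opposite(cond), false_target);
      } else {
        EmitCondJump(cond, true_target);
        if (false_target != fallthrough) {
          EmitJump(false_target);
        }
      }
    }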
|
0d5a281c671444bfa75d63caf1427a8c0e6e1177 |
|
13-Nov-2015 |
Roland Levillain <rpl@google.com> |
x86/x86-64 read barrier support for concurrent GC in Optimizing. This first implementation uses slow paths to instrument heap reference loads and GC root loads for the concurrent copying collector, respectively calling the artReadBarrierSlow and artReadBarrierForRootSlow (new) runtime entry points. Notes: - This implementation does not instrument HInvokeVirtual nor HInvokeInterface instructions (for class reference loads), as the corresponding read barriers are not strictly required with the current concurrent copying collector. - Intrinsics which may eventually call (on slow path) are disabled when read barriers are enabled, as the current slow path infrastructure does not support this case. - When read barriers are enabled, the code generated for a HArraySet instruction always goes into the array set slow path for object arrays (delegating the operation to the runtime), as we are lacking a mechanism to keep a temporary register live across a runtime call (needed for the instrumentation of type checking code, which requires two successive read barriers). Bug: 12687968 Change-Id: I14cd6107233c326389120336f93955b28ffbb329
|
0f7dca4ca0be8d2f8776794d35edf8b51b5bc997 |
|
02-Nov-2015 |
Vladimir Marko <vmarko@google.com> |
Optimizing/X86: PC-relative dex cache array addressing. Add PC-relative dex cache array addressing for X86 and use it for better invoke-static/-direct dispatch. Also delay the initialization to the PC-relative base until needed. Change-Id: Ib8634d5edce4920cd70172fd13211809cf6948d1
|
ea5af68d6dda832bdfb5978a0c5d6f86a3f67e80 |
|
22-Oct-2015 |
Mark Mendell <mark.p.mendell@intel.com> |
X86-64: Split long/double constant array/field set A long constant needs to be in a register to store to memory. By allowing stores of constants that are outside of the range of int32_t, we reduce register usage. Also support sets of float/double constants by using integer stores. Rename RegisterOrInt32LongConstant to RegisterOrInt32Constant as it now handles any type of constant. Change-Id: I025d9ef889a5a433e45aa03b376bae40f14197d2 Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>
|
b8b97695d178337736b61609220613b92f344d45 |
|
22-May-2015 |
Mark Mendell <mark.p.mendell@intel.com> |
Fix conditional jump over jmp (X86/X86-64/ARM32) Optimize the code generation for 'if' statements to jump to the 'false' block if the next block to be generated is the 'true' block. Add an X86-64 test for this case. Note that ARM64 & MIPS64 have not been updated. Change-Id: Iebb1352feb9d3bd0142d8b0621a2e3069a708ea7 Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>
|
dc151b2346bb8a4fdeed0c06e54c2fca21d59b5d |
|
15-Oct-2015 |
Vladimir Marko <vmarko@google.com> |
Optimizing: Determine invoke-static/-direct dispatch early. Determine the dispatch type of invoke-static/-direct in a special pass right after the type inference. This allows the inliner to pass the "needs dex cache" check and inline more. It also allows the code generator to avoid requesting a register location for the ArtMethod* for kDexCachePcRelative and direct methods. The supported dispatch check also handles situations that the CompilerDriver currently doesn't allow. The cleanup of the CompilerDriver and required changes to Quick will come in a separate change. Change-Id: I3f8e903a119949e95871d8ab0a995f4731a13a07
|
bb245d199a5240b4c520263fd2c8c10dba79eadc |
|
19-Oct-2015 |
Aart Bik <ajcbik@google.com> |
Generalize codegen and simplification of deopt. Rationale: the de-opt instruction is very similar to an if, so the existing assumption that it always has a conditional "under the hood" is very unsafe, since optimizations may have replaced conditionals with actual values; this CL generalizes handling of deopt. Change-Id: I1c6cb71fdad2af869fa4714b38417dceed676459
|
4b8f1ecd3aa5a29ec1463ff88fee9db365f257dc |
|
26-Aug-2015 |
Roland Levillain <rpl@google.com> |
Use ATTRIBUTE_UNUSED more. Use it in lieu of UNUSED(), which had some incorrect uses. Change-Id: If247dce58b72056f6eea84968e7196f0b5bef4da
|
e9f37600e98ba21308ad4f70d9d68cf6c057bdbe |
|
09-Oct-2015 |
Aart Bik <ajcbik@google.com> |
Added support for unsigned comparisons Rationale: even though not directly supported in the input graph, having the ability to express unsigned comparisons in HIR is useful for all sorts of optimizations. Change-Id: I4543c96a8c1895c3d33aaf85685afbf80fe27d72
|
9c86b485bc6169eadf846dd5f7cdf0958fe1eb23 |
|
18-Sep-2015 |
Mark Mendell <mark.p.mendell@intel.com> |
X86_64 jump tables for PackedSwitch Implement PackedSwitch using a jump table of offsets to blocks. Bug: 24092914 Bug: 21119474 Change-Id: I83430086c03ef728d30d79b4022607e9245ef98f Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>
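A portable sketch of the jump-table dispatch this implements; the real codegen uses a table of code offsets to blocks, modeled here with function pointers:

    #include <cstdio>

    void Case0() { std::puts("case 0"); }
    void Case1() { std::puts("case 1"); }
    void Case2() { std::puts("case 2"); }
    void Default() { std::puts("default"); }

    // PackedSwitch lowering: subtract the lower bound, bounds-check once,
    // then take one indirect jump through the table of targets.
    void DispatchPackedSwitch(int value, int lower_bound) {
      static void (*const targets[])() = {Case0, Case1, Case2};
      unsigned index = static_cast<unsigned>(value - lower_bound);
      if (index >= sizeof(targets) / sizeof(targets[0])) {
        Default();
      } else {
        targets[index]();
      }
    }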
|
ee3cf0731d0ef0787bc2947c8e3ca432b513956b |
|
06-Oct-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Intrinsify System.arraycopy. Currently on x64, will do the other architectures in different changes. Change-Id: I15fbbadb450dd21787809759a8b14b21b1e42624
|
ec7802a102d49ab5c17495118d4fe0bcc7287beb |
|
01-Oct-2015 |
Vladimir Marko <vmarko@google.com> |
Add DCHECKs to ArenaVector and ScopedArenaVector. Implement dchecked_vector<> template that DCHECK()s element access and insert()/emplace()/erase() positions. Change the ArenaVector<> and ScopedArenaVector<> aliases to use the new template instead of std::vector<>. Remove DCHECK()s that have now become unnecessary from the Optimizing compiler. Change-Id: Ib8506bd30d223f68f52bd4476c76d9991acacadc
|
580b609cd6cfef46108156457df42254d11e72a7 |
|
06-Oct-2015 |
Calin Juravle <calin@google.com> |
Fix location summary for LoadClass Don't request a register for the current method if we're going to call the runtime. Change-Id: I9760d15108bd95efb2a34e6eacd84b60841781d7
|
98893e146b0ff0e1fd1d7c29252f1d1e75a163f2 |
|
02-Oct-2015 |
Calin Juravle <calin@google.com> |
Add support for unresolved classes in optimizing. Change-Id: I0e299a81e560eb9cb0737ec46125dffc99333b54
|
e460d1df1f789c7c8bb97024a8efbd713ac175e9 |
|
29-Sep-2015 |
Calin Juravle <calin@google.com> |
Revert "Revert "Support unresolved fields in optimizing" The CL also changes the calling convetion for 64bit static field set to use kArg2 instead of kArg1. This allows optimizing to keep the asumptions: - arm pairs are always of form (even_reg, odd_reg) - ecx_edx is not used as a register on x86. This reverts commit e6f49b47b6a4dc9c7684e4483757872cfc7ff1a1. Change-Id: I93159917565824084abc96775f31be1a4249f2f3
|
e0395dd58454e27fc47c0ca273913929fb658e6c |
|
25-Sep-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Optimize ArraySet for x86/x64/arm/arm64. Change-Id: I5bc8c6adf7f82f3b211f0c21067f5bb54dd0c040
|
5233f93ee336b3581ccdb993ff6342c52fec34b0 |
|
29-Sep-2015 |
Vladimir Marko <vmarko@google.com> |
Optimizing: Tag even more arena allocations. Tag previously "Misc" arena allocations with more specific allocation types. Move some native heap allocations to the arena in BCE. Bug: 23736311 Change-Id: If8ef15a8b614dc3314bdfb35caa23862c9d4d25c
|
225b6464a58ebe11c156144653f11a1c6607f4eb |
|
28-Sep-2015 |
Vladimir Marko <vmarko@google.com> |
Optimizing: Tag arena allocations in code generators. And completely remove the deprecated GrowableArray. Replace GrowableArray with ArenaVector in code generators and related classes and tag arena allocations. Label arrays use direct allocations from ArenaAllocator because Label is non-copyable and non-movable and as such cannot really be held in a container. The GrowableArray never actually constructed them, instead relying on the zero-initialized storage from the arena allocator to be correct. We now actually construct the labels. Also avoid StackMapStream::ComputeDexRegisterMapSize() being passed null references, even though unused. Change-Id: I26a46fdd406b23a3969300a67739d55528df8bf4
|
abfcf18fa2fe723bd683edcb685ed5058d9c7cf3 |
|
21-Sep-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Further refinements to checkcast/instanceof. - Use setcc when possible. - Do an exact check in the Object[] case before checking the component type. Change-Id: Ic11c60643af9b41fe4ef2beb59dfe7769bef388f
|
fe57faa2e0349418dda38e77ef1c0ac29db75f4d |
|
18-Sep-2015 |
Mark Mendell <mark.p.mendell@intel.com> |
[optimizing] Add basic PackedSwitch support Add HPackedSwitch, and generate it from the builder. Code generators convert this to a series of compare/branch tests. A better implementation in the code generators, as a real jump table, will follow in separate CLs. Change-Id: If14736fa4d62809b6ae95280148c55682e856911 Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>
|
85c7bab43d11180d552179c506c2ffdf34dd749c |
|
18-Sep-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Revert "Revert "Optimize code generation of check-cast and instance-of."" This reverts commit 7537437c6a2f89249a48e30effcc27d4e7c5a04f. Change-Id: If759cb08646e47b62829bebc3c5b1e2f2969cf84
|
85b62f23fc6dfffe2ddd3ddfa74611666c9ff41d |
|
09-Sep-2015 |
Andreas Gampe <agampe@google.com> |
ART: Refactor intrinsics slow-paths Refactor slow paths so that there is a default implementation for common cases (only arm64 with vixl is special). Write a generic intrinsic slow-path that can be reused for the specific architectures. Move helper functions into CodeGenerator so that they are accessible. Change-Id: Ibd788dce432601c6a9f7e6f13eab31f28dcb8550
|
7537437c6a2f89249a48e30effcc27d4e7c5a04f |
|
17-Sep-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Revert "Optimize code generation of check-cast and instance-of." Failures with libcore tests. This reverts commit 64acf303eaa2f32c0b1d8cfcbf044a822c5eec08. Change-Id: Ie6f323fcf5d86bae5c334c1352bb21f1bad60a88
|
e6f49b47b6a4dc9c7684e4483757872cfc7ff1a1 |
|
17-Sep-2015 |
Calin Juravle <calin@google.com> |
Revert "Support unresolved fields in optimizing" breaks debuggable tests. This reverts commit 23a8e35481face09183a24b9d11e505597c75ebb. Change-Id: I8e60b5c8f48525975f25d19e5e8066c1c94bd2e5
|
64acf303eaa2f32c0b1d8cfcbf044a822c5eec08 |
|
14-Sep-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Optimize code generation of check-cast and instance-of. On x86/x64/arm/arm64. Improves code size of selected apks by 0.3% to 1%, and performance of DeltaBlue by 20%. Change-Id: Ib5799f7a53443cd880a121dd7f21932ae9f5c7aa
|
23a8e35481face09183a24b9d11e505597c75ebb |
|
08-Sep-2015 |
Calin Juravle <calin@google.com> |
Support unresolved fields in optimizing Change-Id: I9941fa5fcb6ef0a7a253c7a0b479a44a0210aad4
|
175dc732c80e6f2afd83209348124df349290ba8 |
|
25-Aug-2015 |
Calin Juravle <calin@google.com> |
Support unresolved methods in Optimizing Change-Id: If2da02b50d2fa668cd58f134a005f1752e7746b1
|
77a48ae01bbc5b05ca009cf09e2fcb53e4c8ff23 |
|
15-Sep-2015 |
David Brazdil <dbrazdil@google.com> |
Revert "Revert "ART: Register allocation and runtime support for try/catch"" The original CL triggered b/24084144 which has been fixed by Ib72e12a018437c404e82f7ad414554c66a4c6f8c. This reverts commit 659562aaf133c41b8d90ec9216c07646f0f14362. Change-Id: Id8980436172457d0fcb276349c4405f7c4110a55
|
659562aaf133c41b8d90ec9216c07646f0f14362 |
|
14-Sep-2015 |
David Brazdil <dbrazdil@google.com> |
Revert "ART: Register allocation and runtime support for try/catch" Breaks libcore test org.apache.harmony.security.tests.java.security.KeyStorePrivateKeyEntryTest#testGetCertificateChain. Need to investigate. This reverts commit b022fa1300e6d78639b3b910af0cf85c43df44bb. Change-Id: Ib24d3a80064d963d273e557a93469c95f37b1f6f
|
b022fa1300e6d78639b3b910af0cf85c43df44bb |
|
20-Aug-2015 |
David Brazdil <dbrazdil@google.com> |
ART: Register allocation and runtime support for try/catch
This patch completes a series of CLs that add support for try/catch in the Optimizing compiler. With it, Optimizing can compile all methods containing try/catch, provided they don't contain catch loops. Future work will focus on improving performance of the generated code.
SsaLivenessAnalysis was updated to propagate liveness information of instructions live at catch blocks, and to keep location information on instructions which may be caught by catch phis.
RegisterAllocator was extended to spill values used after catch, and to allocate spill slots for catch phis. Catch phis generated for the same vreg share a spill slot, as the raw value must be the same.
Location builders and slow paths were updated to reflect the fact that throwing an exception may not lead to escaping the method. Instruction code generators are forbidden from using implicit null checks in try blocks, as live registers need to be saved before handing over to the runtime.
CodeGenerator emits a stack map for each catch block, storing locations of catch phis. CodeInfo and StackMapStream recognize this new type of stack map and store them separately from other stack maps to avoid dex_pc conflicts.
After having found the target catch block to deliver an exception to, QuickExceptionHandler looks up the dex register maps at the throwing instruction and the catch block and copies the values over to their respective locations.
The runtime-support approach was selected because it allows for the best performance in the normal control-flow path, since no propagation of catch phi values is necessary until the exception is thrown. In addition, it also greatly simplifies the register allocation phase.
ConstantHoisting was removed from LICMTest because it instantiated (now abstract) HConstant and was bogus anyway (constants are always in the entry block).
Change-Id: Ie31038ad8e3ee0c13a5bbbbaf5f0b3e532310e4e
|
bfb5ba90cd6425ce49c2125a87e3b12222cc2601 |
|
01-Sep-2015 |
Andreas Gampe <agampe@google.com> |
Revert "Revert "Do a second check for testing intrinsic types."" This reverts commit a14b9fef395b94fa9a32147862c198fe7c22e3d7. When an intrinsic with invoke-type virtual is recognized, replace the instruction with a new HInvokeStaticOrDirect. Minimal update for dex-cache rework. Fix includes. Change-Id: I1c8e735a2fa7cda4419f76ca0717125ef236d332
|
05792b98980741111b4d0a24d68cff2a8e070a3a |
|
03-Aug-2015 |
Vladimir Marko <vmarko@google.com> |
ART: Move DexCache arrays to native. This CL has a companion CL in libcore/ https://android-review.googlesource.com/162985 Change-Id: Icbc9e20ad1b565e603195b12714762bb446515fa
|
5a6cc49ed4f36dd11d6ec1590857b884ad8da6ab |
|
13-Aug-2015 |
Serban Constantinescu <serban.constantinescu@linaro.org> |
SlowPath: Remove the use of Locations in the SlowPath constructors. The main motivation is that using locations in the SlowPath constructors ties us to creating the SlowPaths after register allocation, since before that the locations are invalid. A later patch of the series will move the SlowPath creation to the LocationsBuilder visitors. This will enable us to add more checking as well as consider sharing multiple SlowPaths of the same type. Change-Id: I7e96dcc2b5586d15153c942373e9281ecfe013f0 Signed-off-by: Serban Constantinescu <serban.constantinescu@linaro.org>
|
ecc4366670e12b4812ef1653f7c8d52234ca1b1f |
|
13-Aug-2015 |
Serban Constantinescu <serban.constantinescu@linaro.org> |
Add OptimizingCompilerStats to the CodeGenerator class. Just refactoring, not yet used, but will be used by the incoming patch series and future CodeGen specific stats. Change-Id: I7d20489907b82678120518a77bdab9c4cc58f937 Signed-off-by: Serban Constantinescu <serban.constantinescu@linaro.org>
|
0c9497da9485ba688c592e5f452b7b1305a519c0 |
|
21-Aug-2015 |
Mark Mendell <mark.p.mendell@intel.com> |
X86: Use short forward jumps if possible The optimizing compiler uses 32-bit relative jumps for all forward jumps, just in case the offset is too large to fit in one byte. Some of the generated code knows that the jumps will in fact fit. Use the 'NearLabel' class for the code generator and intrinsics. Use the jecxz/jrcxz instruction for string intrinsics. Unfortunately, conditional jumps to basic blocks don't know enough to use this, as we don't know how much code will be generated. This saves a whopping 0.24% for core.oat and boot.oat sizes, but every little bit helps, and it reduces icache footprint slightly. Change-Id: I633fe3b2e0e810b4ce12fdad8c02135644b63506 Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>
|
581550137ee3a068a14224870e71aeee924a0646 |
|
19-Aug-2015 |
Vladimir Marko <vmarko@google.com> |
Revert "Revert "Optimizing: Better invoke-static/-direct dispatch."" Fixed kCallArtMethod to use correct callee location for kRecursive. This combination is used when compiling with debuggable flag set. This reverts commit b2c431e80e92eb6437788cc544cee6c88c3156df. Change-Id: Idee0f2a794199ebdf24892c60f8a5dcf057db01c
|
b2c431e80e92eb6437788cc544cee6c88c3156df |
|
19-Aug-2015 |
Vladimir Marko <vmarko@google.com> |
Revert "Optimizing: Better invoke-static/-direct dispatch." Reverting due to failing ndebug tests. This reverts commit 9b688a095afbae21112df5d495487ac5231b12d0. Change-Id: Ie4f69da6609df3b7c8443412b6cf7f5c43c2c5d9
|
9b688a095afbae21112df5d495487ac5231b12d0 |
|
06-May-2015 |
Vladimir Marko <vmarko@google.com> |
Optimizing: Better invoke-static/-direct dispatch. Add framework for different types of loading ArtMethod* and code pointer retrieval. Implement invoke-static and invoke-direct calls the same way as Quick. Document the dispatch kinds in HInvokeStaticOrDirect's new enumerations MethodLoadKind and CodePtrLocation. PC-relative loads from dex cache arrays are used only for x86-64 and arm64. The implementation for other architectures will be done in separate CLs. Change-Id: I468ca4d422dbd14748e1ba6b45289f0d31734d94
|
3887c468d731420e929e6ad3acf190d5431e94fc |
|
12-Aug-2015 |
Roland Levillain <rpl@google.com> |
Remove unnecessary `explicit` qualifiers on constructors. Change-Id: Id12e392ad50f66a6e2251a68662b7959315dc567
|
78e3ef6bc5f8aa149f2f8bf0c78ce854c2f910fa |
|
12-Aug-2015 |
Alexandre Rames <alexandre.rames@linaro.org> |
Add a GVN dependency 'GC' for garbage collection. This will be used by incoming architecture-specific optimizations. The dependencies must be conservative: when an HInstruction is created we may not be sure whether it can trigger GC, and in that case the 'ChangesGC' dependency must be set. We verify at code-generation time that HInstructions that can call have the 'ChangesGC' dependency set. Change-Id: Iea6a7f430009f37a9599b0a0039207049906e45d
|
cfa410b0ea561318f74a76c5323f0f6cd8eaaa50 |
|
25-May-2015 |
Mark Mendell <mark.p.mendell@intel.com> |
[optimizing] More x86_64 code improvements Use the constant area some more, use 32-bit immediates in movq instructions when possible, and make other small tweaks. Remove the commented-out code for Math.Abs(float/double), as it would fail for the baseline compiler due to the output being the same as the input. Change-Id: Ifa39f1865b94cec2e1c0a99af3066a645e9d3618 Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>
|
8c0676ce786f33b8f9c8eedf1ace48988c750932 |
|
03-Aug-2015 |
Serguei Katkov <serguei.i.katkov@intel.com> |
ART-Optimizing: Fix the type of HDivZeroCheck HDivZeroCheck is created while building the CFG, at which point its type is not completely known, so it sets the type to int or long. However, the SSA builder can later insert a type conversion, and the type of the input of HDivZeroCheck can become byte or short while the type of HDivZeroCheck remains the same. In reality the type of HDivZeroCheck should always equal that of its input parameter. To fix this inconsistency we return the input type as the type of HDivZeroCheck. Code generators are updated accordingly. Change-Id: I6a5aedc8d479cfc6328704e7ddf252bca830076b Signed-off-by: Serguei Katkov <serguei.i.katkov@intel.com>
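A hedged sketch of the fix: HDivZeroCheck reports its input's type instead of a type cached at graph-building time. Type and method names approximate ART's of that era:

    #include <cstddef>

    enum class Primitive { kInt8, kInt16, kInt32, kInt64 };

    class HInstruction {
     public:
      virtual ~HInstruction() = default;
      virtual Primitive GetType() const = 0;
      HInstruction* InputAt(size_t i) const { return inputs_[i]; }
     protected:
      HInstruction* inputs_[1] = {nullptr};
    };

    class HDivZeroCheck final : public HInstruction {
     public:
      // Before the fix: returned a type_ fixed when the CFG was built (int/long).
      // After: always agree with the input, even if SSA building narrowed it.
      Primitive GetType() const override { return InputAt(0)->GetType(); }
    };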
|
8158f28b6689314213eb4dbbe14166073be71f7e |
|
07-Aug-2015 |
Alexandre Rames <alexandre.rames@linaro.org> |
Ensure coherency of call kinds for LocationSummary. The coherency is enforced with checks added in the `InvokeRuntime` helper, that we now also use on x86 and x86_64. Change-Id: I8cb92b042f25dc3c5fd390e9c61a45b477d081f4
|
cb1c0557033065f2436ee79e7fa6c19d87064801 |
|
04-Aug-2015 |
David Brazdil <dbrazdil@google.com> |
ART: Move exception clearing into own instruction Runtime delivers exceptions only to catch blocks which begin with a MOVE_EXCEPTION instruction (in DEX). In that case, the catch block is expected to clear the thread-local exception storage after having read the exception reference. This patch changes Optimizing to represent MOVE_EXCEPTION with two instructions - HLoadException and HClearException - instead of one. If the exception reference is not used, HLoadException can be safely removed, saving a memory load without breaking the runtime behaviour. Change-Id: Idad8a714467bf9d9d5fccefbc43c0bd8ae13ddba
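A sketch of the two-instruction split described above; Thread here is a stand-in for the runtime's thread object, not ART's actual class:

    struct Thread {
      void* exception = nullptr;  // thread-local exception storage
    };

    // HLoadException: read the exception reference.
    void* LoadException(Thread* self) { return self->exception; }

    // HClearException: null out the storage, independently of the load, so
    // the load can be removed when the catch block ignores the exception.
    void ClearException(Thread* self) { self->exception = nullptr; }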
|
4a2aa4af61e653a89f88d776dcdc55f6c7ca05f2 |
|
27-Jul-2015 |
Mark Mendell <mark.p.mendell@intel.com> |
Optimizing: Use more X86 3-operand multiplies The X86_64 code generator generated 3-operand multiplies for long multiplication only. Add support for 3-operand multiplication for int as well, for both X86 and X86_64. Note that the RHS operand must be a 32-bit constant, and that it is possible for the constant to end up in a register (!) due to a previous use by another instruction. Handle this case by checking the operand; otherwise the first input might not be the same as the output, due to the use of Any(). Also allow stack operands for multiplication. Change-Id: I8f3d14cc01e9a91210f418258aa18065ee87979d Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>
|
2e7cd752452d02499a2f5fbd604c5427aa372f00 |
|
10-Jul-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
[optimizing] Don't rely on the verifier for String.<init>. Continue work on cutting the dependency on the verifier. Change-Id: I0f95b1eb2e10fd8f6bf54817f1202bdf6dfdb0fe
|
4fa13f65ece3b68fe3d8722d679ebab8656bbf99 |
|
06-Jul-2015 |
Roland Levillain <rpl@google.com> |
Fuse long and FP compare & condition on ARM in Optimizing. Also: - Stylistic changes in corresponding parts on the x86 and x86-64 code generators. - Update and improve the documentation of art::arm::Condition. Bug: 21120453 Change-Id: If144772046e7d21362c3c2086246cb7d011d49ce
|
c470193cfc522fc818eb2eaab896aef9caf0c75a |
|
10-Apr-2015 |
Mark Mendell <mark.p.mendell@intel.com> |
Fuse long and FP compare & condition on x86/x86-64 in Optimizing. This is a preliminary implementation of fusing long/float/double compares with conditions to avoid materializing the result from the compare and condition. The information from a HCompare is transferred to the HCondition if it is legal. There must be only a single use of the HCompare, the HCompare and HCondition must be in the same block, the HCondition must not need materialization. Added GetOppositeCondition() to HCondition to return the flipped condition. Bug: 21120453 Change-Id: I1f1db206e6dc336270cd71070ed3232dedc754d6 Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>
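A sketch of the legality conditions listed above, written as a standalone predicate; the field names mirror ART's style but the snippet is illustrative:

    class HBasicBlock;

    struct InstrInfo {
      const HBasicBlock* block;
      int non_environment_uses;
      bool needs_materialization;
    };

    // The HCompare may be fused into the HCondition only when:
    //  - the compare's one and only use is the condition,
    //  - both sit in the same block, and
    //  - the condition's result does not itself need materialization.
    bool CanFuseCompareAndCondition(const InstrInfo& compare,
                                    const InstrInfo& condition) {
      return compare.non_environment_uses == 1 &&
             compare.block == condition.block &&
             !condition.needs_materialization;
    }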
|
4d02711ea578dbb789abb30cbaf12f9926e13d81 |
|
01-Jul-2015 |
Roland Levillain <rpl@google.com> |
Implement heap poisoning in ART's Optimizing compiler. - Instrument ARM, ARM64, x86 and x86-64 code generators. - Note: To turn heap poisoning on in Optimizing, set the environment variable `ART_HEAP_POISONING' to "true" before compiling ART. Bug: 12687968 Change-Id: Ib3120b38cf805a8a50207a314b9ccc90c8d93740
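A minimal sketch of heap reference poisoning, assuming the negation scheme ART's code generators use for 32-bit heap references; real code poisons on reference stores and unpoisons on loads:

    #include <cstdint>

    // Poisoned references are stored negated (0 - ref), so dereferencing a
    // value that was never unpoisoned faults instead of silently reading a
    // live object. Round trip: Unpoison(Poison(x)) == x.
    inline uint32_t PoisonHeapReference(uint32_t ref) { return 0u - ref; }
    inline uint32_t UnpoisonHeapReference(uint32_t poisoned) { return 0u - poisoned; }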
|
06b66d05a6251d91b5e2516f579bfff5fa49191c |
|
01-Jul-2015 |
Roland Levillain <rpl@google.com> |
Fix a MOV instruction in Optimizing's x86-64 code generator. Use `movl' instead of `movw' to store a 32-bit immediate (integer or reference) into a field. Also fix art::Location::RegisterOrInt32LongConstant to properly handle non-long constants. Change-Id: I34c6ec8eaa1632822a31969f87c9c2d6c5b96326
|
fc6a86ab2b70781e72b807c1798b83829ca7f931 |
|
26-Jun-2015 |
David Brazdil <dbrazdil@google.com> |
Revert "Revert "ART: Implement try/catch blocks in Builder"" This patch enables the GraphBuilder to generate blocks and edges which represent the exceptional control flow when try/catch blocks are present in the code. Actual compilation is still delegated to Quick and Baseline ignores the additional code. To represent the relationship between try and catch blocks, Builder splits the edges which enter/exit a try block and links the newly created blocks to the corresponding exception handlers. This layout will later enable the SsaBuilder to correctly infer the dominators of the catch blocks and to produce the appropriate reverse post ordering. It will not, however, allow for building the complete SSA form of the catch blocks and consequently optimizing such blocks. To this end, a new TryBoundary control-flow instruction is introduced. Codegen treats it the same as a Goto but it allows for additional successors (the handlers). This reverts commit 3e18738bd338e9f8363b26bc895f38c0ec682824. Change-Id: I4f5ea961848a0b83d8db3673763861633e9bfcfb
|
3e18738bd338e9f8363b26bc895f38c0ec682824 |
|
26-Jun-2015 |
David Brazdil <dbrazdil@google.com> |
Revert "ART: Implement try/catch blocks in Builder" Causes OutOfMemory issues, need to investigate. This reverts commit 0b5c7d1994b76090afcc825e737f2b8c546da2f8. Change-Id: I263e6cc4df5f9a56ad2ce44e18932ca51d7e349f
|
0b5c7d1994b76090afcc825e737f2b8c546da2f8 |
|
11-Jun-2015 |
David Brazdil <dbrazdil@google.com> |
ART: Implement try/catch blocks in Builder This patch enables the GraphBuilder to generate blocks and edges which represent the exceptional control flow when try/catch blocks are present in the code. Actual compilation is still delegated to Quick and Baseline ignores the additional code. To represent the relationship between try and catch blocks, Builder splits the edges which enter/exit a try block and links the newly created blocks to the corresponding exception handlers. This layout will later enable the SsaBuilder to correctly infer the dominators of the catch blocks and to produce the appropriate reverse post ordering. It will not, however, allow for building the complete SSA form of the catch blocks and consequently optimizing such blocks. To this end, a new TryBoundary control-flow instruction is introduced. Codegen treats it the same as a Goto but it allows for additional successors (the handlers). Change-Id: I415b985596d5bebb7b1bb358a46e08b7b04bb53a
|
9931f319cf86c56c2855d800339a3410697633a6 |
|
19-Jun-2015 |
Alexandre Rames <alexandre.rames@linaro.org> |
Opt compiler: Add a description to slow paths. Change-Id: I22160d90de3fe0ab3e6a2acc440bda8daa00e0f0
|
cad65427d39c8ca9849d49d049ca6d263ada938a |
|
19-Jun-2015 |
Jeff Hao <jeffhao@google.com> |
Fix StringChange for the optimizing compiler. Uses the optimizing compiler more and fixes x86_64 invoke codegen. Bug: 21902634 (cherry-picked from commit e0a9a53ec4b4ccbf9b1d67957fb99a45b469ccc2) Change-Id: I56881889bee7092b8401b090af1c0f1004c11667
|
e0a9a53ec4b4ccbf9b1d67957fb99a45b469ccc2 |
|
19-Jun-2015 |
Jeff Hao <jeffhao@google.com> |
Fix StringChange for the optimizing compiler. Uses the optimizing compiler more and fixes x86_64 invoke codegen. Bug: 21902634 Change-Id: Ia2a87d013c4746b107014a04a22a0a37269cfdb2
|
33d6903e570daf8f3cf7c1f6ebd9a6dd22c7c23c |
|
18-Jun-2015 |
Roland Levillain <rpl@google.com> |
Replace some run-time assertions with compile-time ones in ART. Change-Id: I16c3fad45c4b98b94b7c83d071374096e81d407a
|
69aa60163989c33a008115205d39732a76ecc1dc |
|
09-Jun-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Revert "Revert "Pass current method to HNewInstance and HNewArray."" Problem exposed by this change was fixed in: https://android-review.googlesource.com/#/c/154031/ This reverts commit 7b0e353b49ac3f464c662f20e20e240f0231afff. Change-Id: I680c13dc9db9ba223ab11c7af255222860b4e6d2
|
ae71a0539451a8350bdd9d46c76ddab7b763f209 |
|
09-Jun-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Fix a crash in the optimizing compiler with the current method. The crash was due to overwriting the location of the current method in the slow path of an intrinsic. Change-Id: I6ca58ef5b3cea19925e60b9500aef543bc5f71ef
|
7b0e353b49ac3f464c662f20e20e240f0231afff |
|
09-Jun-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Revert "Pass current method to HNewInstance and HNewArray." 082-inline-execute fails on x86. This reverts commit e21aa42e1341d34250742abafdd83311ad9fa737. Change-Id: Ib3fd25faee2e0128001e40d3d51a74f959bc4449
|
94015b939060f5041d408d48717f22443e55b6ad |
|
04-Jun-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Revert "Revert "Use HCurrentMethod in HInvokeStaticOrDirect."" Fix was to special case baseline for x86, which does not have enough registers to allocate the current method. This reverts commit c345f141f11faad177aa9635a78088d00cf66086. Change-Id: I5997aa52f8d4df373ae5ff4d4150dac0c44c4c10
|
e21aa42e1341d34250742abafdd83311ad9fa737 |
|
08-Jun-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Pass current method to HNewInstance and HNewArray. Also remove the unused CodeGenerator::LoadCurrentMethod. Change-Id: I4b8d3f2a30b8e2c76b6b329a72555483c993cb73
|
c345f141f11faad177aa9635a78088d00cf66086 |
|
04-Jun-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Revert "Use HCurrentMethod in HInvokeStaticOrDirect." Fails on baseline/x86. This reverts commit 38207af82afb6f99c687f64b15601ed20d82220a. Change-Id: Ib71018367eb7c6046965494a7e996c22af3de403
|
38207af82afb6f99c687f64b15601ed20d82220a |
|
01-Jun-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Use HCurrentMethod in HInvokeStaticOrDirect. Change-Id: I0d15244b6b44c8b10079398c55da5071a3e3af66
|
0d1652e1e3768b30e4d80f31d59db580312581d8 |
|
03-Jun-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Fix compilation errors with gcc. Change-Id: If88d4f639658db2d6d71f5abcad563211138fc4a
|
fd88f16100cceafbfde1b4f095f17e89444d6fa8 |
|
03-Jun-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Factorize code for common LocationSummary of HInvoke. This is one step forward, we could factorize more, but I wanted to get this out of the way first. Change-Id: I6ae411a737eebaecb64974f47af507ce0cfbae85
|
3d21bdf8894e780d349c481e5c9e29fe1556051c |
|
22-Apr-2015 |
Mathieu Chartier <mathieuc@google.com> |
Move mirror::ArtMethod to native Optimizing + quick tests are passing, devices boot. TODO: Test and fix bugs in mips64. Saves 16 bytes for most ArtMethods, 7.5MB reduction in system PSS. Some of the savings are from removal of virtual methods and direct methods object arrays. Bug: 19264997 (cherry picked from commit e401d146407d61eeb99f8d6176b2ac13c4df1e33) Change-Id: I622469a0cfa0e7082a2119f3d6a9491eb61e3f3d Fix some ArtMethod related bugs Added root visiting for runtime methods, not currently required since the GcRoots in these methods are null. Added missing GetInterfaceMethodIfProxy in GetMethodLine, fixes --trace run-tests 005, 044. Fixed optimizing compiler bug where we used a normal stack location instead of double on ARM64, this fixes the debuggable tests. TODO: Fix JDWP tests. Bug: 19264997 Change-Id: I7c55f69c61d1b45351fd0dc7185ffe5efad82bd3 ART: Fix casts for 64-bit pointers on 32-bit compiler. Bug: 19264997 Change-Id: Ief45cdd4bae5a43fc8bfdfa7cf744e2c57529457 Fix JDWP tests after ArtMethod change Fixes Throwable::GetStackDepth for exception event detection after internal stack trace representation change. Adds missing ArtMethod::GetInterfaceMethodIfProxy call in case of proxy method. Bug: 19264997 Change-Id: I363e293796848c3ec491c963813f62d868da44d2 Fix accidental IMT and root marking regression Was always using the conflict trampoline. Also included fix for regression in GC time caused by extra roots. Most of the regression was IMT. Fixed bug in DumpGcPerformanceInfo where we would get SIGABRT due to detached thread. EvaluateAndApplyChanges: From ~2500 -> ~1980 GC time: 8.2s -> 7.2s due to 1s less of MarkConcurrentRoots Bug: 19264997 Change-Id: I4333e80a8268c2ed1284f87f25b9f113d4f2c7e0 Fix bogus image test assert Previously we were comparing the size of the non moving space to the size of the image file. Now we properly compare the size of the image space against the size of the image file. Bug: 19264997 Change-Id: I7359f1f73ae3df60c5147245935a24431c04808a [MIPS64] Fix art_quick_invoke_stub argument offsets. ArtMethod reference's size got bigger, so we need to move other args and leave enough space for ArtMethod* and 'this' pointer. This fixes mips64 boot. Bug: 19264997 Change-Id: I47198d5f39a4caab30b3b77479d5eedaad5006ab
|
62a46b2b4ac066a740fb22e58a246c18501fa909 |
|
01-Jun-2015 |
Roland Levillain <rpl@google.com> |
Use down_cast instead of reinterpret_cast in Optimizing codegens. Change-Id: Ifa23023ffaca631a4f6b5745dd7492c39521a26f
|
e3b034a6f6f0d80d519ab08bdd18be4de2a4a2db |
|
31-May-2015 |
Mathieu Chartier <mathieuc@google.com> |
Fix some ArtMethod related bugs Added root visiting for runtime methods, not currently required since the GcRoots in these methods are null. Added missing GetInterfaceMethodIfProxy in GetMethodLine, fixes --trace run-tests 005, 044. Fixed optimizing compiler bug where we used a normal stack location instead of double on ARM64, this fixes the debuggable tests. TODO: Fix JDWP tests. Bug: 19264997 Change-Id: I7c55f69c61d1b45351fd0dc7185ffe5efad82bd3
|
e401d146407d61eeb99f8d6176b2ac13c4df1e33 |
|
22-Apr-2015 |
Mathieu Chartier <mathieuc@google.com> |
Move mirror::ArtMethod to native Optimizing + quick tests are passing, devices boot. TODO: Test and fix bugs in mips64. Saves 16 bytes for most ArtMethods, 7.5MB reduction in system PSS. Some of the savings are from removal of virtual methods and direct methods object arrays. Bug: 19264997 Change-Id: I622469a0cfa0e7082a2119f3d6a9491eb61e3f3d
|
fbdaa30a448029d75422c76f29087a4e39630f4a |
|
29-May-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Use the new HCurrentMethod in HLoadString. Change-Id: I23d27e5e10736d127519eb3238ff8f25df3843a2
|
76b1e1799a713a19218de26b171b0aef48a59e98 |
|
27-May-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Add a HCurrentMethod node. This enables register allocation for the current method, so that users of it don't always load it from the stack. Currently only used by HLoadClass. Will make follow-up CLs for the other users. Change-Id: If73324d85643102faba47fabbbd2755eb258c59c
|
0d37cd0a895cedb1653cf9897d9f9058855e2aee |
|
27-May-2015 |
Roland Levillain <rpl@google.com> |
Rename VisitCondition's argument in code generators. This argument is a condition instruction, not a comparison. Change-Id: I026f799d2161df58b0c8a84600eb8fffd6f7b998
|
33bf2459e6cfe477a9be0c45aec3f6f359ee077c |
|
27-May-2015 |
Mark Mendell <mark.p.mendell@intel.com> |
[optimizing] x86: Prefer add over lea if possible Looking at some generated code, I noticed an lea being used when an add was sufficient. Check for that case, and generate the add. Fixed for x86 and x86_64. Change-Id: I110304ff0fed8837ada96d34353a293d29022ce5 Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>
|
07276db28d654594e0e86e9e467cad393f752e6e |
|
18-May-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Don't do a null test in MarkGCCard if the value cannot be null. Change-Id: I45687f6d3505178e2fc3689eac9cb6ab1b2c1e29
|
c74652867cd9293e86232324e5e057cd73c48e74 |
|
13-May-2015 |
David Brazdil <dbrazdil@google.com> |
ART: Refactor GraphVisualizer attribute printing This patch unifies the way GraphVisualizer prints instruction attributes in preparation of changes to the Checker syntax. Change-Id: I44e91e36c660985ddfe039a9f410fedc48b496ec
|
92e83bf8c0b2df8c977ffbc527989631d94b1819 |
|
07-May-2015 |
Mark Mendell <mark.p.mendell@intel.com> |
[optimizing] Tune some x86_64 moves Generate Moves of constant FP values by loading from the constant table. Use 'movl' to load a 64 bit register for positive 32-bit values, saving a byte in the generated code by taking advantage of the implicit zero extension. Change a couple of xorq(reg, reg) to xorl to (potentially) save a byte of code per xor. Change-Id: I5b2a807f0d3b29294fd4e7b8ef6d654491fa0b01 Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>
|
db216f4d49ea1561a74261c29f1264952232728a |
|
05-May-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Relax the only one back-edge restriction. The rule is in the way for better register allocation, as it creates an artificial join point between multiple paths. Change-Id: Ia4392890f95bcea56d143138f28ddce6c572ad58
|
2d27c8e338af7262dbd4aaa66127bb8fa1758b86 |
|
28-Apr-2015 |
Roland Levillain <rpl@google.com> |
Refactor InvokeDexCallingConventionVisitor in Optimizing. Change-Id: I7ede0f59d5109644887bf5d39201d4e1bf043f34
|
3e3d73349a2de81d14e2279f60ffbd9ab3f3ac28 |
|
28-Apr-2015 |
Roland Levillain <rpl@google.com> |
Have HInvoke instructions know their number of actual arguments. Add an art::HInvoke::GetNumberOfArguments routine so that art::HInvoke and its subclasses can return the number of actual arguments of the called method. Use it in code generators and intrinsics handlers. Consequently, no longer remove a clinit check as last input of a static invoke if it is still present during baseline code generation, but ensure that static invokes have no such check as last input in optimized compilations. Change-Id: Iaf9e07d1057a3b15b83d9638538c02b70211e476
|
848f70a3d73833fc1bf3032a9ff6812e429661d9 |
|
15-Jan-2014 |
Jeff Hao <jeffhao@google.com> |
Replace String CharArray with internal uint16_t array. Summary of high level changes: - Adds compiler inliner support to identify string init methods - Adds compiler support (quick & optimizing) with new invoke code path that calls method off the thread pointer - Adds thread entrypoints for all string init methods - Adds map to verifier to log when receiver of string init has been copied to other registers. Used by compiler and interpreter. Change-Id: I797b992a8feb566f9ad73060011ab6f51eb7ce01
|
99dbd6883f5dab7743d5fb5d0ad2e82c75a7011e |
|
22-Apr-2015 |
Mark Mendell <mark.p.mendell@intel.com> |
[optimizing] Handle x86 const length BoundsCheck Allow a constant length for BoundsCheck. Change-Id: I2c7adc6e733cf8ce6997aba76aa763d0835bd2d6 Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>
|
0379f82393237798616d485ad99952e73e480e12 |
|
25-Apr-2015 |
Roland Levillain <rpl@google.com> |
Fix DCHECKs about clinit checks in Optimizing's code generators. These assertions are not true for the baseline compiler. As a temporary workaround, remove a clinit check as last input of a static invoke if it is still present at the stage of code generation. Change-Id: I5655f4a0873e2e7ee7790b6a341c18b4b7b52af1
|
4c0eb42259d790fddcd9978b66328dbb3ab65615 |
|
24-Apr-2015 |
Roland Levillain <rpl@google.com> |
Ensure inlined static calls perform clinit checks in Optimizing. Calls to static methods have implicit class initialization (clinit) checks of the method's declaring class in Optimizing. However, when such a static call is inlined, the implicit clinit check vanishes, possibly leading to an incorrect behavior. To ensure that inlining static methods does not change the behavior of a program, add explicit class initialization checks (art::HClinitCheck) as well as load class instructions (art::HLoadClass) as last input of static calls (art::HInvokeStaticOrDirect) in Optimizing' control flow graphs, when the declaring class is reachable and not known to be already initialized. Then when considering the inlining of a static method call, proceed only if the method has no implicit clinit check requirement. The added explicit clinit checks are already removed by the art::PrepareForRegisterAllocation visitor. This CL also extends this visitor to turn explicit clinit checks from static invokes into implicit ones after the inlining step, by removing the added art::HLoadClass nodes mentioned hereinbefore. Change-Id: I9ba452b8bd09ae1fdd9a3797ef556e3e7e19c651
|
5ea536aa4a6414db01beaf6f8bd8cb9adc5cfc92 |
|
20-Apr-2015 |
Vladimir Marko <vmarko@google.com> |
Remove ArtMethod* parameter from dex cache entry points. Load the ArtMethod* using an optimized stack walk instead. This reduces the size of the generated code. Three of the entry points are called only from a slow-path and the fourth (InitializeTypeAndVerifyAccess) is rare and already slow enough that the one or two extra loads (depending on whether we already have the ArtMethod* in a register) are insignificant. And as we're starting to use PC-relative addressing of the dex cache arrays (already done by Quick for the boot image), having the ArtMethod* in a register becomes less likely anyway. Change-Id: Ib19b9d204e355e13bf386662a8b158178bf8ad28
|
af88835231c2508509eb19aa2d21b92879351962 |
|
20-Apr-2015 |
Guillaume "Vermeille" Sanchez <guillaumesa@google.com> |
Remove unnecessary null checks in CheckCast and InstanceOf Change-Id: I6fd81cabd8673be360f369e6318df0de8b18b634
|
40741f394b2737e503f2c08be0ae9dd490fb106b |
|
21-Apr-2015 |
Mark Mendell <mark.p.mendell@intel.com> |
[optimizing] Use more X86_64 addressing modes Allow constant and memory addresses to more X86_64 instructions. Add memory formats to X86_64 instructions to match. Fix a bug in cmpq(CpuRegister, const Address&). Allow mov <addr>,immediate (instruction 0xC7) to be a valid faulting instruction. Change-Id: I5b8a409444426633920cd08e09f687a7afc88a39 Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>
|
27df758e2e7baebb6e3f393f9732fd0d064420c8 |
|
17-Apr-2015 |
Calin Juravle <calin@google.com> |
[optimizing] Add memory barriers in constructors when needed If a class has final fields, we must add a memory barrier before returning from the constructor. This makes sure the fields are visible to other threads. Bug: 19851497 Change-Id: If8c485092fc512efb9636cd568cb0543fb27688e
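The pattern being protected is safe publication of an object with final fields. The following stand-alone C++ analogy is only a sketch (Point, g_shared and Publish are illustrative names, not ART code); the release fence plays the role of the StoreStore barrier emitted before the constructor returns:

    #include <atomic>

    struct Point { int x; int y; };         // stand-ins for final fields
    std::atomic<Point*> g_shared{nullptr};

    void Publish() {
      Point* p = new Point{1, 2};                           // field writes in the "constructor"
      std::atomic_thread_fence(std::memory_order_release);  // StoreStore: fields before reference
      g_shared.store(p, std::memory_order_relaxed);         // publish; a reader that loads the
    }                                                       // pointer and issues an acquire fence
                                                            // is guaranteed to see x == 1, y == 2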
|
88c13cddc3a4184908662b0f3de796565d348c76 |
|
14-Apr-2015 |
Alexandre Rames <alexandre.rames@arm.com> |
Opt compiler: Correctly require register or FPU register. Also add a check that location summary are correctly typed with the HInstruction. Change-Id: I699762ff4e8f4e321c7db01ea005236ea1934af9
|
13b4718ecd52a674b25eac106e654d8e89872750 |
|
15-Apr-2015 |
David Brazdil <dbrazdil@google.com> |
ART: Remove DCHECKs for boolean type Since bool and int are interchangeable types, checking whether an input is kPrimBoolean can fail when replaced with a 0/1 constant or a phi. This patch removes the problematic DCHECKs and adds a best-effort verification into SSAChecker, but leaves the phi case empty until a suitable analysis is implemented. Change-Id: I31e8daf27dd33d2fd74049b82bed1cb7c240c8c6
|
e14590bdfed24df30e6b7545fc819ba03ff8bba1 |
|
15-Apr-2015 |
Guillaume Sanchez <guillaumesa@google.com> |
Revert "[optimizing] Improve x86 parallel moves/swaps" This reverts commit a5c19ce8d200d68a528f2ce0ebff989106c4a933. This commit introduces a performance regression on CaffeineLogic of 30%. Change-Id: I917e206e249d44e1748537bc1b2d31054ea4959d
|
9021825d1e73998b99c81e89c73796f6f2845471 |
|
15-Apr-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Type MoveOperands. The ParallelMoveResolver implementation needs to know if a move is for 64bits or not, to handle swaps correctly. Bug found, and test case courtesy of Serguei I. Katkov. Change-Id: I9a0917a1cfed398c07e57ad6251aea8c9b0b8506
|
66d126ea06ce3f507d86ca5f0d1f752170ac9be1 |
|
03-Apr-2015 |
David Brazdil <dbrazdil@google.com> |
ART: Implement HBooleanNot instruction Optimizations simplifying operations on boolean values (boolean simplifier, instruction simplifier) can benefit from having a special HInstruction for negating booleans in order to perform more transforms and produce faster machine code. This patch implements HBooleanNot as 'x xor 1', assuming that booleans are 1-bit integers and allowing for a single-instruction negation on all supported platforms. Change-Id: I33a2649c1821255b18a86ca68ed16416063c739f
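Because booleans are represented as 1-bit integers, the negation described above is a single exclusive-or; a minimal sketch in C++ (BooleanNot is an illustrative name):

    int BooleanNot(int x) {
      return x ^ 1;  // 0 -> 1, 1 -> 0; one instruction on all supported platforms
    }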
|
9d8606de5e274c00242ee73ffb693bc34589f184 |
|
12-Apr-2015 |
David Srbecky <dsrbecky@google.com> |
Whitespace cleanup in DWARFReg helper functions. Change-Id: Iedc05969b05be6d93e40467ff23287faaae08fb3
|
c34dc9362b9ec624b3bdd97d36b6b2098814cd73 |
|
12-Apr-2015 |
David Srbecky <dsrbecky@google.com> |
Move 'ret' instruction generation inside GenerateFrameExit. Change-Id: I0c594d9a2356a006a5ce8dfd41d307cf7c3704ba
|
a5c19ce8d200d68a528f2ce0ebff989106c4a933 |
|
01-Apr-2015 |
Mark Mendell <mark.p.mendell@intel.com> |
[optimizing] Improve x86 parallel moves/swaps Add a new constructor to ScratchRegisterScope that will supply a register if there is a free one, but not spill to force one. Use this to generated alternate code that doesn't use a temporary, as the spill/restore of a register generates extra instructions that aren't necessary on x86. Here is the benefit for a 32 bit memory-to-memory exchange with no free registers: < 50 push eax < 53 push ebx < 8B44244C mov eax, [esp + 76] < 8B5C246C mov ebx, [esp + 108] < 8944246C mov [esp + 108], eax < 895C244C mov [esp + 76], ebx < 5B pop ebx < 58 pop eax --- > FF742444 push [esp + 68] > FF742468 push [esp + 104] > 8F44244C pop [esp + 72] > 8F442468 pop [esp + 100] Avoid using xchg instruction, as it is slow on smaller processors. Change-Id: Id29ee3abd998577baaee552d55d23e60ae0c7871 Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>
|
39dcf55a56da746e04f477f89e7b00ba1de03880 |
|
10-Apr-2015 |
Mark Mendell <mark.p.mendell@intel.com> |
[optimizing] Address x86_64 RIP patch comments Nicolas had some comments after the patch https://android-review.googlesource.com/#/c/144100 had merged. Fix the problems that he found. Change-Id: I40e8a4273997860db7511dc8f1986281b72bead2 Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>
|
b19930c5cba3cf662dce5ee057fcc9829b4cbb9c |
|
09-Apr-2015 |
Guillaume Sanchez <guillaumesa@google.com> |
Follow up of "div/rem on x86 and x86_64", to tidy up the code a little. Change-Id: Ibf39cbc8ac1d773599d70be2cb1e941674b60f1d
|
c6b4dd8980350aaf250f0185f73e9c42ec17cd57 |
|
07-Apr-2015 |
David Srbecky <dsrbecky@google.com> |
Implement CFI for Optimizing. CFI is necessary for stack unwinding in gdb, lldb, and libunwind. Change-Id: I1a3480e3a4a99f48bf7e6e63c4e83a80cfee40a2
|
f55c3e0825cdfc4c5a27730031177d1a0198ec5a |
|
27-Mar-2015 |
Mark Mendell <mark.p.mendell@intel.com> |
[optimizing] Add RIP support for x86_64 Support a constant area addressed using RIP on x86_64. Use it for FP operations to avoid loading constants into a CPU register and moving to a XMM register. Change-Id: I58421759ef2a8475538876c20e696ec787015a72 Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>
|
0f88e87085b7cf6544dadff3f555773966a6853e |
|
30-Mar-2015 |
Guillaume Sanchez <guillaumesa@google.com> |
Speedup div/rem by constants on x86 and x86_64 This is done using the algorithms in Hacker's Delight chapter 10. Change-Id: I7bacefe10067569769ed31a1f7834f796fb41119
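To illustrate the kind of transformation chapter 10 of Hacker's Delight describes (a sketch with illustrative names, not the ART implementation): signed division by a constant becomes a multiply by a precomputed "magic" reciprocal plus a sign fix-up, e.g. for n / 3 on 32-bit values:

    #include <cstdint>

    // Magic number 0x55555556 is 2^32/3 rounded up; the high 32 bits of the
    // 64-bit product give the quotient, and adding the sign bit of n rounds
    // negative results toward zero. Assumes arithmetic right shift on int64_t.
    int32_t Div3(int32_t n) {
      int32_t q = static_cast<int32_t>((INT64_C(0x55555556) * n) >> 32);
      return q + static_cast<int32_t>(static_cast<uint32_t>(n) >> 31);
    }

The remainder then follows as n - Div3(n) * 3, which is how a constant REM is typically derived from the same multiply.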
|
65b798ea10dd716c1bb3dda029f9bf255435af72 |
|
06-Apr-2015 |
Andreas Gampe <agampe@google.com> |
ART: Enable more Clang warnings Change-Id: Ie6aba02f4223b1de02530e1515c63505f37e184c
|
d43b3ac88cd46b8815890188c9c2b9a3f1564648 |
|
01-Apr-2015 |
Mingyao Yang <mingyao@google.com> |
Revert "Revert "Deoptimization-based bce."" This reverts commit 0ba627337274ccfb8c9cb9bf23fffb1e1b9d1430. Change-Id: I1ca10d15bbb49897a0cf541ab160431ec180a006
|
fb8d279bc011b31d0765dc7ca59afea324fd0d0c |
|
01-Apr-2015 |
Mark Mendell <mark.p.mendell@intel.com> |
[optimizing] Implement x86/x86_64 math intrinsics Implement floor/ceil/round/RoundFloat on x86 and x86_64. Implement RoundDouble on x86_64. Add support for roundss and roundsd on both architectures. Support them in the disassembler as well. Add the instruction set features for x86, as the 'round' instruction is only supported if SSE4.1 is supported. Fix the tests to handle the addition of passing the instruction set features to x86 and x86_64. Add assembler tests for roundsd and roundss to x86_64 assembler tests. Change-Id: I9742d5930befb0bbc23f3d6c83ce0183ed9fe04f Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>
|
12bca97a9934a00c60776768dcaee958c4981bb6 |
|
30-Mar-2015 |
Zheng Xu <zheng.xu@arm.com> |
Opt compiler: Fix move from constant. Change-Id: Ifadb190569d349560ae9a2c49b7cabcffac362c8
|
d75948ac93a4a317feaf136cae78823071234ba5 |
|
27-Mar-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Intrinsify String.compareTo. Change-Id: Ia540df98755ac493fe61bd63f0bd94f6d97fbb57
|
b2bd1c5f9171f35fa5b71ada42d1a9e11189428d |
|
25-Mar-2015 |
David Brazdil <dbrazdil@google.com> |
ART: Formatting and comments in BooleanSimplifier Change-Id: I9a5aa3f2aa8b0a29d7b0f1e5e247397cf8e9e379
|
46e2a3915aa68c77426b71e95b9f3658250646b7 |
|
16-Mar-2015 |
David Brazdil <dbrazdil@google.com> |
ART: Boolean simplifier The optimization recognizes the negation pattern generated by 'javac' and replaces it with a single condition. To this end, boolean values are now consistently assumed to be represented by an integer. This is a first optimization which deletes blocks from the HGraph and does so by replacing the corresponding entries with null. Hence, existing code can continue indexing the list of blocks with the block ID, but must check for null when iterating over the list. Change-Id: I7779da69cfa925c6521938ad0bcc11bc52335583
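The javac pattern in question is a control-flow diamond that materializes 0 or 1 across two blocks; a simplified before/after sketch in C++ (illustrative names):

    // before: the diamond javac emits for 'boolean r = !cond;'
    int NotViaBranches(int cond) {
      int r;
      if (cond != 0) r = 0; else r = 1;   // two blocks merged by a phi
      return r;
    }

    // after simplification: the blocks are deleted, replaced by one condition
    int NotSimplified(int cond) {
      return cond ^ 1;   // valid because booleans are 0/1 integers
    }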
|
da4d79bc9a4aeb9da7c6259ce4c9c1c3bf545eb8 |
|
24-Mar-2015 |
Roland Levillain <rpl@google.com> |
Unify ART's various implementations of bit_cast. ART had several implementations of art::bit_cast: 1. one in runtime/base/casts.h, declared as: template <class Dest, class Source> inline Dest bit_cast(const Source& source); 2. another one in runtime/utils.h, declared as: template<typename U, typename V> static inline V bit_cast(U in); 3. and a third local version, in runtime/memory_region.h, similar to the previous one: template<typename Source, typename Destination> static Destination MemoryRegion::local_bit_cast(Source in); This CL removes versions 2. and 3. and changes their callers to use 1. instead. That version was chosen over the others as: - it was the oldest one in the code base; and - its syntax was closer to the standard C++ cast operators, as it supports the following use: bit_cast<Destination>(source) since `Source' can be deduced from `source'. Change-Id: I7334fd5d55bf0b8a0c52cb33cfbae6894ff83633
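For reference, the retained shape (version 1) can be written as a self-contained template; the body shown here is an illustrative sketch, not the ART source:

    #include <cstring>

    template <class Dest, class Source>
    inline Dest bit_cast(const Source& source) {
      static_assert(sizeof(Dest) == sizeof(Source), "sizes must match");
      Dest dest;
      std::memcpy(&dest, &source, sizeof(dest));  // well-defined type pun
      return dest;
    }

    // `Source' is deduced from the argument, which is the usability point the CL makes:
    //   uint32_t bits = bit_cast<uint32_t>(1.0f);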
|
0ba627337274ccfb8c9cb9bf23fffb1e1b9d1430 |
|
24-Mar-2015 |
Andreas Gampe <agampe@google.com> |
Revert "Deoptimization-based bce." This breaks compiling the core image: Error after BCE: art::SSAChecker: Instruction 219 in block 1 does not dominate use 221 in block 1. This reverts commit e295e6ec5beaea31be5d7d3c996cd8cfa2053129. Change-Id: Ieeb48797d451836ed506ccb940872f1443942e4e
|
e295e6ec5beaea31be5d7d3c996cd8cfa2053129 |
|
07-Mar-2015 |
Mingyao Yang <mingyao@google.com> |
Deoptimization-based bce. A mechanism is introduced so that a runtime method can be called from code compiled with the optimizing compiler to deoptimize into the interpreter. This can be used to establish invariants in the managed code. If the invariant does not hold at runtime, we will deoptimize and continue execution in the interpreter. This allows us to optimize the managed code as if the invariant was proven during compile time. However, the exception will be thrown according to the semantics demanded by the spec. The invariant and optimization included in this patch are based on the length of an array. Given a set of array accesses with constant indices {c1, ..., cn}, we can optimize away all bounds checks iff 0 <= min(ci) and max(ci) < array-length. The first can be proven statically. The second can be established with a deoptimization-based invariant. This replaces n bounds checks with one invariant check (plus slow-path code). Change-Id: I8c6e34b56c85d25b91074832d13dba1db0a81569
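A sketch of the effect on constant indices (Fill is an illustrative name): three checked accesses collapse into one invariant check plus a deoptimizing slow path.

    // Constant indices {0, 5, 9}: min(ci) = 0 >= 0 is proven statically;
    // max(ci) = 9 < a_length is guarded once. On failure we deoptimize to the
    // interpreter, which throws with exactly the semantics the spec demands.
    void Fill(int32_t* a, int32_t a_length) {
      if (a_length <= 9) { /* deoptimize to interpreter */ }
      a[0] = 1;   // bounds check eliminated
      a[5] = 2;   // bounds check eliminated
      a[9] = 3;   // bounds check eliminated
    }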
|
68e15009173f92fe717546a621b56413d5e9fba1 |
|
17-Mar-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
PREOPT compiles using dex2oatd so don't emit debug instructions. Change-Id: I8d2ab8d956ad0ce313928918c658d49f490ad081
|
3f6c7f61855172d3d9b7a9221baba76136088e7c |
|
13-Mar-2015 |
Mark Mendell <mark.p.mendell@intel.com> |
[optimizing] Improve x86, x86_64 code Tweak the generated code to allow more use of constants and other small changes - Use test vs. compare to 0 - EmitMove of 0.0 should use xorps - VisitCompare kPrimLong can use constants - cmp/add/sub/mul on x86_64 can use constants if in int32_t range - long bit operations on x86 examine long constant high/low to optimize - Use 3 operand imulq if constant is in int32_t range Change-Id: I2dd4010fdffa129fe00905b0020590fe95f3f926 Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>
|
a8ac9130b872c080299afacf5dcaab513d13ea87 |
|
13-Mar-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Refactor code in preparation of correct stack maps in slow path. Move the logic of saving/restoring live registers in slow path in the SlowPathCode method. Also add a RecordPcInfo helper to SlowPathCode, that will act as the placeholder of saving correct stack maps. Change-Id: I25c2bc7a642ef854bbc8a3eb570e5c8c8d2d030c
|
2ed20afc6a1032e9e0cf919cb8d1b2b41e147182 |
|
06-Mar-2015 |
Alexandre Rames <alexandre.rames@arm.com> |
Opt compiler: Clean the use of `virtual` and `OVERRIDE`. Change-Id: I806ec522b979334cee8f344fc95e8660c019160a
|
f60c90ba8d1eee6f137a9e1a8a65e4d6bec35d6d |
|
04-Mar-2015 |
Mark Mendell <mark.p.mendell@intel.com> |
[optimizing] Improve x86/x86_64 bound check code Don't force a constant index into a register just to compare to the array size. Allow a constant, and compare the constant to the size. Change-Id: I1c5732fbd42e63f7eac5c6219a19e9c431c22664 Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>
|
dc5ac731f6369b53b42f1cee3404f3b3384cec34 |
|
25-Feb-2015 |
Mingyao Yang <mingyao@google.com> |
Opt compiler: enhance gvn for commutative ops. Change-Id: I415b50d58b30cab4ec38077be22373eb9598ec40
|
09b8463493aeb6ea2bce05f67d3457d5fcc8a7d9 |
|
13-Feb-2015 |
Mark Mendell <mark.p.mendell@intel.com> |
[optimizing compiler] x86 goodness Implement the x86 version of https://android-review.googlesource.com/#/c/129560/, which made some enhancements to x86_64 code. - Use leal to implement 3 operand adds - Use testl rather than cmpl to 0 for registers - Use leaq for x86_64 for adds with constant in int32_t range Note: - The range and register allocator tests seem quite fragile. I had to change ADD_INT_LIT8 to XOR_INT_LIT8 for the register allocator test to get the code to run. It seems like this is a bit hard-coded to expected code generation sequences. I also changed BuildTwoAdds to BuildTwoSubs for the same reason. - For the live range test, I just changed the expected output, as the Locations were different. Change-Id: I402f2e95ddc8be4eb0befb3dae1b29feadfa29ab Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>
|
d8ef2e991a1a65f47a26a1eb8c6b34c92b775d6b |
|
24-Feb-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
not-int can also take non-int (byte and short) inputs. So we should use the result type instead of the input type for knowing what instruction to use. Bug: 19454010 Change-Id: I88782ad27ae8c8e1b7868afede5057d26f14685a
|
b1498f67b444c897fa8f1530777ef118e05aa631 |
|
16-Feb-2015 |
Calin Juravle <calin@google.com> |
Improve type propagation with if-contexts This works by adding a new instruction (HBoundType) after each `if (a instanceof ClassA) {}` to bound the type that `a` can take in the True-dominated blocks. Change-Id: Iae6a150b353486d4509b0d9b092164675732b90c
|
d6138ef1ea13d07ae555542f8898b30d89e9ac9a |
|
18-Feb-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Ensure the graph is correctly typed. We used to be forgiving because of HIntConstant(0) also being used for null. We now create a special HNullConstant for such uses. Also, we need to run the dead phi elimination twice during ssa building to ensure the correctness. Change-Id: If479efa3680d3358800aebb1cca692fa2d94f6e5
|
748f140d5f0631780dbeecb033c1416faf78930d |
|
27-Jan-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
x64 goodness. - Use test instead of cmp when comparing against 0. - Make it possible to use lea for add. - Use xor instead of mov when loading 0. Change-Id: Ide95c4e2d9b773e952412892f2df6869600c324e
|
c0572a451944f78397619dec34a38c36c11e9d2a |
|
06-Feb-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Optimize leaf methods. Avoid suspend checks and stack changes when not needed. Change-Id: I0fdb31e8c631e99091b818874a558c9aa04b1628
|
cb1b00aedd94785e7599f18065a0b97b314e64f6 |
|
28-Jan-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Use the non access check entrypoint when possible. Change-Id: I0b53d63141395e26816d5d2ce3fa6a297bb39b54
|
1cf95287364948689f6a1a320567acd7728e94a3 |
|
12-Dec-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Small optimization for recursive calls: avoid dex cache. Change-Id: I044757a2f06e535cdc1480c4fc8182b89635baf6
|
4dee636d21d9ce54386cdfbb824e5eb2a9c1af0d |
|
23-Jan-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Support callee-save registers on ARM. Change-Id: I7c519b7a828c9891b1141a8e51e12d6a8bc84118
|
4597b5b7648169fbdca1af69b7643e27a6c8a523 |
|
23-Jan-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Fix valgrind error. Also introduce kLastCpuRegister to define kFakeReturnRegister. Change-Id: I58cef6186c0452d45b5d2dcba9298cbe07f3552d
|
d97dc40d186aec46bfd318b6a2026a98241d7e9c |
|
22-Jan-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Support callee save floating point registers on x64. - Share the computation of core_spill_mask and fpu_spill_mask between backends. - Remove explicit stack overflow check support: we need to adjust them and since they are not tested, they will easily bitrot. Change-Id: I0b619b8de4e1bdb169ea1ae7c6ede8df0d65837a
|
a26369a8d42f9e2a4b0b8a02fc38d2d31f42e08e |
|
22-Jan-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Fix lint error. Change-Id: Iccba489098dd2a5b8796beefc781284006624f74
|
988939683c26c0b1c8808fc206add6337319509a |
|
21-Jan-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Enable core callee-save on x64. Will work on other architectures and FP support in other CLs. Change-Id: I8cef0343eedc7202d206f5217fdf0349035f0e4d
|
fa93b504b324784dd9a96e28e6e8f3f1b1ac456a |
|
21-Jan-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Do not use HNot for creating !bool. HNot folds to ~, not !. Change-Id: I681f968449a2ade7110b2f316146ad16ba5da74c
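The distinction in one runnable sketch: on a 0/1 boolean, bit-wise '~' yields the wrong value, which is exactly why HNot cannot express '!':

    #include <cassert>

    int main() {
      int b = 1;              // true, as a 1-bit integer
      assert((b ^ 1) == 0);   // logical not: correct
      assert(~b != 0);        // bit-wise not: 0xFFFFFFFE, i.e. still "truthy"
      return 0;
    }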
|
77520bca97ec44e3758510cebd0f20e3bb4584ea |
|
12-Jan-2015 |
Calin Juravle <calin@google.com> |
Record implicit null checks at the actual invoke time. ImplicitNullChecks are recorded only for instructions directly (see NB below) preceded by NullChecks in the graph. This way we avoid recording redundant safepoints and minimize the code size increase. NB: ParallelMoves might be inserted by the register allocator between the NullChecks and their uses. These modify the environment and the correct action would be to reverse their modification. This will be addressed in a follow-up CL. Change-Id: Ie50006e5a4bd22932dcf11348f5a655d253cd898
|
24f2dfae084b2382c053f5d688fd6bb26cb8a328 |
|
15-Jan-2015 |
Mark Mendell <mark.p.mendell@intel.com> |
[optimizing compiler] Implement inline x86 FP '%' Replace the calls to fmod/fmodf by inline code as is done in the Quick compiler. Remove the quick fmod/fmodf runtime entries, as they are no longer in use. 64 bit code generator Move() routine needed to be enhanced to handle constants, as Location::Any() allows them to be generated. Change-Id: I6b6a42f6faeed4b0b3c940453e487daf5b25d184 Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>
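Java's floating-point '%' takes the sign of the dividend, i.e. it has fmod semantics, so the inline sequence must compute the same value as the library call it replaces; a reference sketch (illustrative names):

    #include <cmath>

    double DoubleRem(double x, double y) { return std::fmod(x, y); }
    float  FloatRem(float x, float y)    { return std::fmod(x, y); }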
|
93edf73a5fecd526920fbd870068fa592376ac8a |
|
20-Jan-2015 |
Calin Juravle <calin@google.com> |
Use CompilerOptions for implicit stack overflow checks Change-Id: I52744382a7e3d2c6c11a43e027d87bf43ec4e62b
|
cd6dffedf1bd8e6dfb3fb0c933551f9a90f7de3f |
|
08-Jan-2015 |
Calin Juravle <calin@google.com> |
Add implicit null checks for the optimizing compiler - for backends: arm, arm64, x86, x86_64 - fixed parameter passing for CodeGenerator - 003-omnibus-opcodes test verifies that NullPointerExceptions work as expected Change-Id: I1b302acd353342504716c9169a80706cf3aba2c8
|
71fb52fee246b7d511f520febbd73dc7a9bbca79 |
|
30-Dec-2014 |
Andreas Gampe <agampe@google.com> |
ART: Optimizing compiler intrinsics Add intrinsics infrastructure to the optimizing compiler. Add almost all intrinsics supported by Quick to the x86-64 backend. Further intrinsics require more assembler support. Change-Id: I48de9b44c82886bb298d16e74e12a9506b8e8807
|
1cc7dbabd03e0a6c09d68161417a21bd6f9df371 |
|
18-Dec-2014 |
Andreas Gampe <agampe@google.com> |
ART: Reorder entrypoint argument order Shuffle the ArtMethod* referrer backwards for easier removal. Clean up ARM & MIPS assembly code. Change some macros to make future changes easier. Change-Id: Ie2862b68bd6e519438e83eecd9e1611df51d7945
|
52c489645b6e9ae33623f1ec24143cde5444906e |
|
16-Dec-2014 |
Calin Juravle <calin@google.com> |
[optimizing compiler] Add support for volatile - for backends: arm, x86, x86_64 - added necessary instructions to assemblies - clean up code gen for field set/get - fixed InstructionDataEquals for some instructions - fixed comments in compiler_enums * 003-opcode test verifies basic volatile functionality Change-Id: I144393efa312dfb2c332cb84056b00edffee338a
|
5b4b898ed8725242ee6b7229b94467c3ea3054c8 |
|
18-Dec-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Revert "Don't block quick callee saved registers for optimizing." X64 has one libcore test failing, and codegen_test on arm is failing. This reverts commit 6004796d6c630696127df2494dcd4f30d1367a34. Change-Id: I20e00431fa18e11ce4c0cb6fffa91977fa8e9b4f
|
6004796d6c630696127df2494dcd4f30d1367a34 |
|
15-Dec-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Don't block quick callee saved registers for optimizing. This change builds on: https://android-review.googlesource.com/#/c/118983/ - Also fix x86_64 assembler bug triggered by this change. - Fix (and improve) x86's backend byte register usage. - Fix a bug in baseline register allocator: a fixed out register must prevent inputs from allocating it. Change-Id: I4883862e29b4e4b6470f1823cf7eab7e7863d8ad
|
4e44c829e282b3979a73bfcba92510e64fbec209 |
|
17-Dec-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Revert "Small optimization for recursive calls: avoid dex cache." Fails on target. This reverts commit 390f59f9bec64fd81b05e796dfaeb03ab6d4cc81. Change-Id: Ic3865b8897068ba20df0fbc2bcf561faf6c290c1
|
390f59f9bec64fd81b05e796dfaeb03ab6d4cc81 |
|
12-Dec-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Small optimization for recursive calls: avoid dex cache. Change-Id: Ic4054b6c38f0a2a530ba6ef747647f86cee0b1b8
|
e53798a7e3267305f696bf658e418c92e63e0834 |
|
01-Dec-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Inlining support in optimizing. Currently only inlines simple things that don't require an environment, such as: - Returning a constant. - Returning a parameter. - Returning an arithmetic operation. Change-Id: Ie844950cb44f69e104774a3cf7a8dea66bc85661
|
486cc19e1e2eca4231f760117e95090c03e2d8c6 |
|
08-Dec-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Explicitly mask constants in shift operations. The assemblers expect an int8, so we mask ahead of calling them. Change-Id: Id668cda6853fa365ac02531bf7aae288cad20fcd
|
d2ec87d84057174d4884ee16f652cbcfd31362e9 |
|
08-Dec-2014 |
Calin Juravle <calin@google.com> |
[optimizing compiler] Add REM_FLOAT and REM_DOUBLE - for arm, x86, x86_64 backends - reinstated fmod quick entry points for x86. This is a partial revert of bd3682eada753de52975ae2b4a712bd87dc139a6 which added inline assembly for floating point rem on x86. Note that Quick still uses the inline version. - fix rem tests for longs Change-Id: I73be19a9f2f2bcf3f718d9ca636e67bdd72b5440
|
4c0b61f506644bb6b647be05d02c5fb45b9ceb48 |
|
05-Dec-2014 |
Roland Levillain <rpl@google.com> |
Add support for double-to-int & double-to-long in optimizing. - Add support for the double-to-int and double-to-long Dex instructions in the optimizing compiler. - Add S1 to the list of ARM FPU parameter registers so that a double value can be passed as parameter during a call to the runtime through D0. - Have art::x86_64::X86_64Assembler::cvttsd2si work with 64-bit operands. - Generate x86, x86-64 and ARM (but not ARM64) code for double to int and double to long HTypeConversion nodes. - Add related tests to test/422-type-conversion. Change-Id: Ic93b9ec6630c26e940f7966a3346ad3fd5a2ab3a
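The need for runtime calls and fix-ups comes from Java's conversion semantics, which the bare hardware instruction does not match on special values (x86's cvttsd2si yields the "integer indefinite" 0x80000000 on NaN and overflow); a sketch of the required behavior (DoubleToInt is an illustrative name):

    #include <cmath>
    #include <cstdint>
    #include <limits>

    // Java double-to-int: NaN maps to 0, out-of-range values saturate,
    // everything else truncates toward zero.
    int32_t DoubleToInt(double v) {
      if (std::isnan(v)) return 0;
      if (v >= 2147483647.0) return std::numeric_limits<int32_t>::max();
      if (v <= -2147483648.0) return std::numeric_limits<int32_t>::min();
      return static_cast<int32_t>(v);
    }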
|
8964e2b689d80fe546604ac8c724078645095cf1 |
|
04-Dec-2014 |
Roland Levillain <rpl@google.com> |
Add support for float-to-double & double-to-float in optimizing. Change-Id: I41b0fee5a28c83757697c8d000b7e224cf5a4534
|
624279f3c70f9904cbaf428078981b05d3b324c0 |
|
04-Dec-2014 |
Roland Levillain <rpl@google.com> |
Add support for float-to-long in the optimizing compiler. - Add support for the float-to-long Dex instruction in the optimizing compiler. - Add a Dex PC field to art::HTypeConversion to allow the x86 and ARM code generators to produce runtime calls. - Instruct art::CodeGenerator::RecordPcInfo not to record PC information for HTypeConversion instructions. - Add S0 to the list of ARM FPU parameter registers. - Have art::x86_64::X86_64Assembler::cvttss2si work with 64-bit operands. - Generate x86, x86-64 and ARM (but not ARM64) code for float to long HTypeConversion nodes. - Add related tests to test/422-type-conversion. Change-Id: I954214f0d537187883f83f7a83a1bb2dd8a21fd4
|
3f8f936aff35f29d86183d31c20597ea17e9789d |
|
02-Dec-2014 |
Roland Levillain <rpl@google.com> |
Add support for float-to-int in the optimizing compiler. - Add support for the float-to-int Dex instruction in the optimizing compiler. - Factor type conversion related lines in compiler/optimizing/builder.cc. - Generate x86, x86-64 and ARM (but not ARM64) code for float to int HTypeConversion nodes. - Add related tests to test/422-type-conversion. Change-Id: I2382dfc04bf394ed75f675148cfcf98216d65bc6
|
01fcc9ee556f98d0163cc9b524e989760826926f |
|
01-Dec-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Remove type conversion nodes converting to the same type. When optimizing, we ensure these conversions do not reach the code generators. When not optimizing, we cannot get such situations. Change-Id: I717247c957667675dc261183019c88efa3a38452
|
6d0e483dd2e0b63e952de060738c10e2abd12ff7 |
|
27-Nov-2014 |
Roland Levillain <rpl@google.com> |
Add support for long-to-float in the optimizing compiler. - Add support for the long-to-float Dex instruction in the optimizing compiler. - Have art::x86_64::X86_64Assembler::cvtsi2ss work with 64-bit operands. - Generate x86, x86-64 and ARM (but not ARM64) code for long to float HTypeConversion nodes. - Add related tests to test/422-type-conversion. Change-Id: Ic983cbeb1ae2051add40bc519a8f00a6196166c9
|
199f336af1fc8212646fda67675df0361ece33d6 |
|
27-Nov-2014 |
Roland Levillain <rpl@google.com> |
Wrap long lines in the optimizing compiler. Change-Id: I5dee0c65e6652de574ae952b1f1dfc7355859e45
|
271ab9c916980209fbc6b26e5545d76e58471569 |
|
27-Nov-2014 |
Roland Levillain <rpl@google.com> |
Ensure opt. compiler doesn't get core & FP registers mixed up. Replace Location::As<T>() with two methods (Location::AsRegister<T>() and Location::AsFpuRegister<T>()) checking the kind of the location (register). Change-Id: I22b4abee1a124b684becd2dc1caf33652b911070
|
5368c219a462defc90c4b896b34eb7506ba5c142 |
|
27-Nov-2014 |
Roland Levillain <rpl@google.com> |
Fix neg-float & neg-double for null values in opt. compiler. - Implement float and double negation as an exclusive or with a bit sign mask in x86 and x86-64 code generators. - Enable requests of temporary FPU (double) registers during register allocation. - Update test cases in test/415-optimizing-arith-neg. Change-Id: I9572c24b27c645ba698825e60cd5b3956b4895fa
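The sign-mask trick in isolation (NegateFloat is an illustrative name): flipping bit 31 negates the value, correctly turning 0.0f into -0.0f and negating NaNs without raising exceptions, which a subtraction from zero would not do:

    #include <cstdint>
    #include <cstring>

    float NegateFloat(float f) {
      uint32_t bits;
      std::memcpy(&bits, &f, sizeof(bits));
      bits ^= UINT32_C(0x80000000);    // flip the sign bit only (what XORPS with a mask does)
      std::memcpy(&f, &bits, sizeof(f));
      return f;
    }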
|
ddb7df25af45d7cd19ed1138e537973735cc78a5 |
|
25-Nov-2014 |
Calin Juravle <calin@google.com> |
[optimizing compiler] Add CMP{L,G}_{FLOAT,DOUBLE} Adds: - float comparison for arm, x86, x86_64 backends. - ucomis{s,d} assembly to x86 and x86_64. - vmrs assembly for thumb2 - new assembly tests Change-Id: Ie3e19d0c08b3b875cd0a4be4ee4e9c8a4a076290
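The L/G suffix only decides where an unordered (NaN) comparison sorts; a sketch of the Dex cmpg-float semantics the backends implement (CmpgFloat is an illustrative name):

    #include <cstdint>

    // cmpg-float treats unordered as +1; cmpl-float would return -1 instead.
    int32_t CmpgFloat(float a, float b) {
      if (a > b) return 1;
      if (a == b) return 0;
      if (a < b) return -1;
      return 1;   // at least one operand is NaN
    }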
|
647b9ed41cdb7cf302fd356627a3ba372419b78c |
|
27-Nov-2014 |
Roland Levillain <rpl@google.com> |
Add support for long-to-double in the optimizing compiler. - Add support for the long-to-double Dex instruction in the optimizing compiler. - Enable requests of temporary FPU (double) registers during code generation. - Fix art::x86::X86Assembler::LoadLongConstant and extend it to int64_t values. - Have art::x86_64::X86_64Assembler::cvtsi2sd work with 64-bit operands. - Generate x86, x86-64 and ARM (but not ARM64) code for long to double HTypeConversion nodes. - Add related tests to test/422-type-conversion. Change-Id: Ie73d9e5e25bd2e15f585c371e8fc2dcb83438ccd
|
91debbc3da3e3376416e4394155d9f9e355255cb |
|
26-Nov-2014 |
Calin Juravle <calin@google.com> |
Revert "[optimizing compiler] Add CMP{L,G}_{FLOAT,DOUBLE}" Fails on arm due to missing vmrs op after vcmp. I revert this instead of pushing the fix because I don't understand yet why it compiles with run-test but not with dex2oat. This reverts commit fd861249f31ab360c12dd1ffb131d50f02b0bfc6. Change-Id: Idc2d30f6a0f39ddd3596aa18a532ae90f8aaf62f
|
fd861249f31ab360c12dd1ffb131d50f02b0bfc6 |
|
25-Nov-2014 |
Calin Juravle <calin@google.com> |
[optimizing compiler] Add CMP{L,G}_{FLOAT,DOUBLE} - adds float comparison for arm, x86, x86_64 backends. - adds ucomis{s,d} assembly to x86 and x86_64. Change-Id: I232d2b6e9ecf373beb5cc63698dd97a658ff9c83
|
799f506b8d48bcceef5e6cf50f3f5eb6bcea05e1 |
|
26-Nov-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Revert "[optimizing compiler] Add CMP{L,G}_{FLOAT,DOUBLE}" Fails on x86_64 and target. This reverts commit cea28ec4b9e94ec942899acf1dbf20f8999b36b4. Change-Id: I30c1d188c7ecfe765f137a307022ede84f15482c
|
cea28ec4b9e94ec942899acf1dbf20f8999b36b4 |
|
25-Nov-2014 |
Calin Juravle <calin@google.com> |
[optimizing compiler] Add CMP{L,G}_{FLOAT,DOUBLE} - adds float comparison for arm, x86, x86_64 backends. - adds ucomis{s,d} assembly to x86 and x86_64. Change-Id: Ie91e04bfb402025073054f3803a3a569e4705caa
|
eace45873190a27302b3644c32ec82854b59d299 |
|
25-Nov-2014 |
Mathieu Chartier <mathieuc@google.com> |
Move dexCacheStrings from ArtMethod to Class Adds one load for const strings which are not direct. Saves >= 60KB of memory avg per app. Image size: -350KB. Bug: 17643507 Change-Id: I2d1a3253d9de09682be9bc6b420a29513d592cc8 (cherry picked from commit f521f423b66e952f746885dd9f6cf8ef2788955d)
|
3159674c0863f53cfbc1913d493550221ac47f02 |
|
24-Nov-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Fix a bug in the type analysis phase of optimizing. Dex code can lead to the creation of a phi with one float input and one integer input. Since the SSA builder trusts the verifier, it assumes that the integer input must be converted to float. However, when the register is not used afterwards, the verifier hasn't ensured that. Therefore, the compiler must remove the phi prior to doing type propagation. Change-Id: Idcd51c4dccce827c59d1f2b253bc1c919bc07df5
|
9aec02fc5df5518c16f1e5a9b6cb198a192db973 |
|
19-Nov-2014 |
Calin Juravle <calin@google.com> |
[optimizing compiler] Add shifts Added SHL, SHR, USHR for arm, x86, x86_64. Change-Id: I971f594e270179457e6958acf1401ff7630df07e
|
86a8d7afc7f00ff0f5ea7b8aaf4d50514250a4e6 |
|
19-Nov-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Consistently use k{InstructionSet}WordSize. These constants were defined prior to k{InstructionSet}PointerSize. So use them consistently in optimizing as a first step. We can discuss whether we should remove them in a second step. Change-Id: If129de1a3bb8b65f8d9c816a8ad466815fb202e6
|
2d7210188805292e463be4bcf7a133b654d7e0ea |
|
10-Nov-2014 |
Mathieu Chartier <mathieuc@google.com> |
Change 64 bit ArtMethod fields to be pointer sized Changed the 64 bit entrypoint and gc map fields in ArtMethod to be pointer sized. This saves a large amount of memory on 32 bit systems. Reduces ArtMethod size by 16 bytes on 32 bit. Total number of ArtMethod on low memory mako: 169957 Image size: 49203 methods -> 787248 image size reduction. Zygote space size: 1070 methods -> 17120 size reduction. App methods: ~120k -> 2 MB savings. Savings per app on low memory mako: 125K+ per app (less active apps -> more image methods per app). Savings depend on how often the shared methods are on dirty pages vs shared. TODO in another CL, delete gc map field from ArtMethod since we should be able to get it from the Oat method header. Bug: 17643507 Change-Id: Ie9508f05907a9f693882d4d32a564460bf273ee8 (cherry picked from commit e832e64a7e82d7f72aedbd7d798fb929d458ee8f)
|
e832e64a7e82d7f72aedbd7d798fb929d458ee8f |
|
10-Nov-2014 |
Mathieu Chartier <mathieuc@google.com> |
Change 64 bit ArtMethod fields to be pointer sized Changed the 64 bit entrypoint and gc map fields in ArtMethod to be pointer sized. This saves a large amount of memory on 32 bit systems. Reduces ArtMethod size by 16 bytes on 32 bit. Total number of ArtMethod on low memory mako: 169957 Image size: 49203 methods -> 787248 image size reduction. Zygote space size: 1070 methods -> 17120 size reduction. App methods: ~120k -> 2 MB savings. Savings per app on low memory mako: 125K+ per app (less active apps -> more image methods per app). Savings depend on how often the shared methods are on dirty pages vs shared. TODO in another CL, delete gc map field from ArtMethod since we should be able to get it from the Oat method header. Bug: 17643507 Change-Id: Ie9508f05907a9f693882d4d32a564460bf273ee8
|
cff137481eda0eb8dbdf9d2a303ae2bdac2c7322 |
|
17-Nov-2014 |
Roland Levillain <rpl@google.com> |
Add support for int-to-float & int-to-double in optimizing. - Add support for the int-to-float and int-to-double Dex instructions in the optimizing compiler. - Generate x86, x86-64 and ARM (but not ARM64) code for byte to float, short to float, int to float, char to float, byte to double, short to double, int to double and char to double HTypeConversion nodes. - Add related tests to test/422-type-conversion. Change-Id: I963f9d0184a5d3721af2d8f593f133d5af7aa6a3
|
bacfec30ee9f2f6fdfd190f11b105b609938efca |
|
14-Nov-2014 |
Calin Juravle <calin@google.com> |
[optimizing compiler] Add REM_INT, REM_LONG - for arm, x86, x86_64 - minor cleanup/fix in div tests Change-Id: I240874010206a5a9b3aaffbc81a885b94c248f93
|
01a8d7135c59b4a664d1e0c0e4d8db343d4118ef |
|
14-Nov-2014 |
Roland Levillain <rpl@google.com> |
Add support for int-to-short in the optimizing compiler. - Add support for the int-to-short Dex instruction in the optimizing compiler. - Generate x86, x86-64 and ARM (but not ARM64) code for byte to short, int to short and char to short HTypeConversion nodes. - Add related tests to test/422-type-conversion. Change-Id: If1829549708d9c3473efaa641f7f0bcfa6080ae9
|
af07bc121121d7bd7e8329c55dfe24782207b561 |
|
12-Nov-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Minor object store optimizations. - Avoid emitting write barrier when the value is null. - Do not do a typecheck on an arraystore when storing something that was loaded from the same array. Change-Id: I902492928692e4553b5af0fc99cce3c2186c442a
|
981e45424f52735b1c61ae0eac7e299ed313f8db |
|
14-Nov-2014 |
Roland Levillain <rpl@google.com> |
Add support for int-to-char in the optimizing compiler. - Add support for the int-to-char Dex instruction in the optimizing compiler. - Implement the ARM and Thumb-2 UBFX instructions and add tests for them. - Generate x86, x86-64 and ARM (but not ARM64) code for byte to char, short to char, int to char (and char to char!) HTypeConversion nodes. - Add related tests to test/422-type-conversion. Change-Id: I5cd4c6d86f0f6a966c059715b98db35cc8f9de76
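Both of these narrowing conversions reduce to bit-field extraction, which is why UBFX (and SBFX in the int-to-byte entry below) map onto them directly; sketches of the required arithmetic (illustrative names):

    #include <cstdint>

    // int-to-char zero-extends the low 16 bits (ARM: UBFX).
    int32_t IntToChar(int32_t v) { return v & 0xFFFF; }

    // int-to-byte sign-extends the low 8 bits (ARM: SBFX); the narrowing
    // cast is modular reduction on all relevant targets.
    int32_t IntToByte(int32_t v) { return static_cast<int8_t>(v); }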
|
51d3fc40637fc73d4156ad617cd451b844cbb75e |
|
13-Nov-2014 |
Roland Levillain <rpl@google.com> |
Add support for int-to-byte in the optimizing compiler. - Add support for the int-to-byte Dex instruction in the optimizing compiler. - Implement the ARM and Thumb-2 SBFX instructions. - Generate x86, x86-64 and ARM (but not ARM64) code for char to byte, short to byte and int to byte HTypeConversion nodes. - Add related tests to test/422-type-conversion. Change-Id: Ic8b8911b90d4b5281fad15bcee96bc3ee85dc577
|
a21f598fd4dfdb95dc8597d3156120cc20d94c02 |
|
13-Nov-2014 |
Calin Juravle <calin@google.com> |
[optimizing compiler] Fix Move for instruction with constant output Change-Id: I15d89292dc62f8dd8643530f95ace2e8be034411
|
d6fb6cfb6f2d0d9595f55e8cc18d2753be5d9a13 |
|
11-Nov-2014 |
Calin Juravle <calin@google.com> |
[optimizing compiler] Add DIV_LONG - for backends: arm, x86, x86_64 - added cqo, idivq, testq assembly for x86_64 - small cleanups Change-Id: I762ef37880749038ed25d6014370be9a61795200
|
f0e3937b87453234d0d7970b8712082062709b8d |
|
12-Nov-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Do a parallel move in BoundsCheckSlowPath. The two locations of the index and length could overlap, so we need a parallel move. Also factorize the code for doing a parallel move based on two locations. Change-Id: Iee8b3459e2eed6704d45e9a564fb2cd050741ea4
|
9574c4b5f5ef039d694ac12c97e25ca02eca83c0 |
|
12-Nov-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Implement and/or/xor in optimizing. Change-Id: I7cf6da1fd334a7177a5580931b8f174dd40b7cec
|
b7baf5c58d0e864f8c3f889357c51288aed42e61 |
|
11-Nov-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Implement monitorenter/monitorexit. Pretty simple as they just invoke the runtime. Change-Id: I5fcb2c783deac27e55e28d8b3da3e68ea4b77363
|
57a88d4ac205874dc85d22f9f6a9ca3c4c373eeb |
|
10-Nov-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Implement checkcast for optimizing. - Ended up not using HTypeCheck because of how instanceof and checkcast end up having different logic for code generation. - Fix a x86_64 assembler bug triggered by now enabling more methods to be compiled. Difficult to test today without b/18117217. Change-Id: I3022e7ae03befb1d10bea9637ad21fadc430abe0
|
946e143941d456a4ec666f7f54719c65c5aa3f5d |
|
11-Nov-2014 |
Roland Levillain <rpl@google.com> |
Revert "Revert "Add support for long-to-int in the optimizing compiler."" This reverts commit 3adfd1b4fb20ac2b0217b5d2737bfe30ad90257a. Change-Id: Iacf0c6492d49267e24f1b727dbf6379b21fd02db
|
3adfd1b4fb20ac2b0217b5d2737bfe30ad90257a |
|
11-Nov-2014 |
Roland Levillain <rpl@google.com> |
Revert "Add support for long-to-int in the optimizing compiler." This reverts commit 647b96f29cb81832e698f863884fdba06674c9de. Change-Id: I552f23585463c676acbd547521b4d3ee5c0342eb
|
647b96f29cb81832e698f863884fdba06674c9de |
|
11-Nov-2014 |
Roland Levillain <rpl@google.com> |
Add support for long-to-int in the optimizing compiler. - Add support for the long-to-int Dex instruction in the optimizing compiler. - Generate x86, x86-64 and ARM (but not ARM64) code for long-to-int HTypeConversion nodes. - Add related tests to test/422-type-conversion. - Also fix comments in test/415-optimizing-arith-neg and in test/416-optimizing-arith-not. Change-Id: I3084af30f2a495d178362ae1154dc7ceb7bf3a58
|
666c732cfa211abf44ed90120a87bf8c18138e55 |
|
10-Nov-2014 |
Roland Levillain <rpl@google.com> |
Support Java conversions from char to long in opt. compiler. These char to long conversions generate int-to-long Dex instructions. Change-Id: I6a8e71b57870cf5e8d5bc638fabce0fc7593f0b2
|
52839d17c06175e19ca4a093fb878450d1c4310d |
|
07-Nov-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Support invoke-interface in optimizing. Change-Id: Ic18d7c3d2810557231caf0571956e0c431f5d384
|
6f5c41f9e409bc4da53b5d7c385202255e391e72 |
|
06-Nov-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Implement instanceof in optimizing. - Only fast-path for now: null or same class. - Use pQuickInstanceofNonTrivial for slow path. Change-Id: Ic5196b94bef792f081f3cb4d15157058e1381e6b
|
f43083d560565aea46c602adb86423daeefe589d |
|
07-Nov-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Do not update Out after it has a valid location. Slow paths use LocationSummary to know where to move things around, and they are executed at the end of the code generation. This fix is needed for https://android-review.googlesource.com/#/c/113345/. Change-Id: Id336c6409479b1de6dc839b736a7234d08a7774a
|
52e832b1278449e62d9eb502d54d5ff18f8606ed |
|
06-Nov-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Support floats and doubles in fields. Change-Id: I19832106633405403f0461b3fe13b268abe39db3
|
de58ab2c03ff8112b07ab827c8fa38f670dfc656 |
|
05-Nov-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Implement try/catch/throw in optimizing. - We currently don't run optimizations in the presence of a try/catch. - We therefore implement Quick's mapping table. - Also fix a missing null check on array-length. Change-Id: I6917dfcb868e75c1cf6eff32b7cbb60b6cfbd68f
|
3dbcb38a8b2237b0da290ae35dc0caab3cb47b3d |
|
28-Oct-2014 |
Roland Levillain <rpl@google.com> |
Support float & double negation in the optimizing compiler. - Add support for the neg-float and neg-double Dex instructions in the optimizing compiler. - Generate x86, x86-64 and ARM (but not ARM64) code for float and double HNeg nodes. - Add related tests to test/415-optimizing-arith-neg. Change-Id: I29739a86e13dbe6f64e191641d01637c867cba6c
|
d0d4852847432368b090c184d6639e573538dccf |
|
04-Nov-2014 |
Calin Juravle <calin@google.com> |
[optimizing compiler] Add div-int and exception handling. - for backends: arm, x86, x86_64 - fixed a register allocator bug: the request for a fixed register for the first input was ignored if the output was kSameAsFirstInput - added divide by zero exception - more tests - shuffle around some code in the builder to reduce the number of lines of code for a single function. Change-Id: Id3a515e02bfbc66cd9d16cb9746f7551bdab3d42
|
dff1f2812ecdaea89978c5351f0c70cdabbc0821 |
|
05-Nov-2014 |
Roland Levillain <rpl@google.com> |
Support int-to-long conversions in the optimizing compiler. - Add support for the int-to-long Dex instruction in the optimizing compiler. - Add a HTypeConversion node type for control-flow graphs. - Generate x86, x86-64 and ARM (but not ARM64) code for int-to-long HTypeConversion nodes. - Add a 64-bit "Move doubleword to quadword with sign-extension" (MOVSXD) instruction to the x86-64 assembler. - Add related tests to test/422-type-conversion. Change-Id: Ieb8ec5380f9c411857119c79aa8d0728fd10f780
|
424f676379f2f872acd1478672022f19f3240fc1 |
|
03-Nov-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Implement CONST_CLASS in optimizing compiler. Change-Id: Ia8c8dfbef87cb2f7893bfb6e178466154eec9efd
|
6a3c1fcb4ba42ad4d5d142c17a3712a6ddd3866f |
|
31-Oct-2014 |
Ian Rogers <irogers@google.com> |
Remove -Wno-unused-parameter and -Wno-sign-promo from base cflags. Fix associated errors about unused parameters and implicit sign conversions. For sign conversion this was largely in the area of enums, so add ostream operators for the affected enums and fix tools/generate-operator-out.py. Tidy arena allocation code and arena allocated data types, rather than fixing new and delete operators. Remove dead code. Change-Id: I5b433e722d2f75baacfacae4d32aef4a828bfe1b
|
b5f62b3dc5ac2731ba8ad53cdf3d9bdb14fbf86b |
|
30-Oct-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Support for CONST_STRING in optimizing compiler. Change-Id: Iab8517bdadd1d15ffbe570010f093660be7c51aa
|
19a19cffd197a28ae4c9c3e59eff6352fd392241 |
|
22-Oct-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Add support for static fields in optimizing compiler. Change-Id: Id2f010589e2bd6faf42c05bb33abf6816ebe9fa9
|
7c4954d429626a6ceafbf05be41bf5f840894e44 |
|
28-Oct-2014 |
Calin Juravle <calin@google.com> |
[optimizing compiler] Add division for floats and doubles backends: x86, x86_64, arm. Also: - ordered instructions based on their name. - add missing kNoOutputOverlap to add/sub/mul. Change-Id: Ie47cde3b15ac74e7a1660c67a2eed1d7871f0ad0
|
705664321a5cc1418255172f92d7d7195cf60a7b |
|
24-Oct-2014 |
Roland Levillain <rpl@google.com> |
Add long bitwise not instruction in the optimizing compiler. - Add support for the not-long (long integer one's complement negation) instruction in the optimizing compiler. - Add a 64-bit NOT instruction (notq) to the x86-64 assembler. - Generate ARM, x86 and x86-64 code for long HNot nodes. - Gather not-related tests in test/416-optimizing-arith-not. Change-Id: I2d5b75e9875664d6032d04f8401b2bbb84506948
|
2e07b4f0a84a7968b4690c2b1be2e2f75cc6fa8e |
|
23-Oct-2014 |
Roland Levillain <rpl@google.com> |
Revert "Revert "Implement long negate instruction in the optimizing compiler."" This reverts commit 30ca3d847fe72cfa33e1b2473100ea2d8bea4517. Change-Id: I188ca8d460d55d3a9966bcf31e0588575afa77d2
|
30ca3d847fe72cfa33e1b2473100ea2d8bea4517 |
|
23-Oct-2014 |
Roland Levillain <rpl@google.com> |
Revert "Implement long negate instruction in the optimizing compiler." This reverts commit 66ce173a40eff4392e9949ede169ccf3108be2db.
|
66ce173a40eff4392e9949ede169ccf3108be2db |
|
23-Oct-2014 |
Roland Levillain <rpl@google.com> |
Implement long negate instruction in the optimizing compiler. - Add support for the neg-long (long integer two's complement negate) instruction in the optimizing compiler. - Add a 64-bit NEG instruction (negq) to the x86-64 assembler. - Generate ARM, x86 and x86-64 code for integer HNeg nodes. - Put neg-related tests into test/415-optimizing-arith-neg. Change-Id: I1fbe9611e134408a6b8745d1df20ab6ffa5e50f2
|
1135168a1a9e2a6493657be8c5e91d67e5f224a7 |
|
23-Oct-2014 |
Calin Juravle <calin@google.com> |
[optimizing compiler] Add float/double subtraction - for arm, x86, x86_64 - add tests - a bit of clean up Change-Id: I3761b0d908aca3e3c5d60da481fafb423ff7c9b9
|
1cc5f251df558b0e22cea5000626365eb644c727 |
|
22-Oct-2014 |
Roland Levillain <rpl@google.com> |
Implement int bit-wise not operation in the optimizing compiler. - Add support for the not-int (integer one's complement negate) instruction in the optimizing compiler. - Extend the HNot control-flow graph node type and make it inherit from HUnaryOperation. - Generate ARM, x86 and x86-64 code for integer HNot nodes. - Exercise these additions in the codegen_test gtest, as there is no direct way to assess the support of not-int from a Java source. Indeed, compiling a Java expression such as `~a' using javac and then dx generates an xor-int/lit8 Dex instruction instead of the expected not-int Dex instruction. This is probably because the Java bytecode has an `ixor' instruction, but no instruction directly corresponding to a bit-wise not operation. Change-Id: I223aed75c4dac5785e04d99da0d22e8d699aee2b
|
b5bfa96ff20e86316961327dec5c859239dab6a0 |
|
21-Oct-2014 |
Calin Juravle <calin@google.com> |
Add multiplication for floats/doubles in optimizing compiler Change-Id: I61de8ce1d9e37e30db62e776979b3f22dc643894
|
a3d05a40de076aabf12ea284c67c99ff28b43dbf |
|
20-Oct-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Implement the array-creation-related DEX instructions: new-array, filled-new-array, and fill-array-data. Change-Id: I405560d66777a57d881e384265322617ac5d3ce3
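As a rough guide to which Java constructs reach these three instructions (a sketch; the exact lowering depends on the dx version and on the element type):

    class ArrayExample {
      static int[] dynamic(int n) {
        return new int[n];               // new-array
      }

      static int[] constants() {
        return new int[] { 1, 2, 3, 4 }; // new-array, then fill-array-data
                                         // copies in the constant payload
      }

      static String[] fromRegisters(String x, String y) {
        return new String[] { x, y };    // small initializers whose elements
                                         // live in registers may instead be
                                         // lowered to filled-new-array
      }
    }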
|
c8147a76ed2f440f38329dc08ff889d393b5c535 |
|
21-Oct-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Fix off by one errors in linear scan register allocator. Change-Id: I65eea3cc125e12106a7160d30cb91c5d173bd405
|
102cbed1e52b7c5f09458b44903fe97bb3e14d5f |
|
15-Oct-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Implement register allocator for floating point registers. Also: - Fix cases where the x86_64 assembler emitted the rex prefix incorrectly. - Fix movaps code generation in the x86_64 assembler. Change-Id: Ib6dcf6e7c4a9c43368cfc46b02ba50f69ae69cbe
|
88cb1755e1d6acaed0f66ce65d7a2a4465053342 |
|
20-Oct-2014 |
Roland Levillain <rpl@google.com> |
Implement int negate instruction in the optimizing compiler. - Add support for the neg-int (integer two's complement negation) instruction in the optimizing compiler. - Add an HNeg node type for control-flow graphs and an intermediate HUnaryOperation base class. - Generate ARM, x86 and x86-64 code for integer HNeg nodes. Change-Id: I72fd3e1e5311a75c38a8cb665a9211a20325a42e
|
8e3964b766652a0478e8e0e303e8556c997675f1 |
|
17-Oct-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Remove the notion of dies at entry. - Instead, explicitly say that the output does not overlap. - Inputs that must be in a fixed register do die at entry, as we know they have a location that others cannot take. - There is also no need to differentiate between an input move and a connecting sibling move - those can be put in the same parallel move instruction. Change-Id: I1b2b2827906601f822b59fb9d6a21d48e43bae27
|
34bacdf7eb46c0ffbf24ba7aa14a904bc9176fb2 |
|
07-Oct-2014 |
Calin Juravle <calin@google.com> |
Add multiplication for integral types This also fixes an issue where we could allocate a pair register even if one of its parts was already blocked. Change-Id: I4869175933409add2a56f1ccfb369c3d3dd3cb01
|
92a73aef279be78e3c2b04db1713076183933436 |
|
16-Oct-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Don't use assembler classes in code_generator.h. The arm64 backend uses its own assembler and does not share the same classes as the other backends. To avoid conflicts or unnecessary mappings, just don't use those classes in the shared part of the code generator. Change-Id: I9e5fa40c1021d2e83a4ef14c52cd1ccd03f2f73d
|
3a3fd0f8d3981691aa2331077a8fae5feee08dd1 |
|
10-Oct-2014 |
Roland Levillain <rpl@google.com> |
Turn constant conditional jumps into unconditional jumps. If a condition (the input of an art::HIf instruction) is constant (an art::HConstant object), evaluate it at compile time and replace the conditional jump with an unconditional branch to the successor that is actually taken. Change-Id: I262e43ffe66d5c25dbbfa98092a41c8b3c4c75d6
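javac itself folds conditions that are constant at the source level, so in practice the constant HIf input usually appears only after earlier compiler transformations. A sketch of the shape being optimized (names are illustrative):

    class ConstantIfExample {
      static int pick(boolean flag) {
        // If `flag' is known to the compiler to be an HConstant (say, the
        // graph was built for a context where flag is always true), the HIf
        // below needs no compare: an unconditional jump to the taken block
        // suffices.
        if (flag) {
          return 1;
        }
        return 2;
      }
    }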
|
71175b7f19a4f6cf9cc264feafd820dbafa371fb |
|
09-Oct-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Cleanup baseline register allocator. - Use three arrays for blocked registers instead of one array with computed offsets. - Don't pass blocked_registers_ to methods, just use the field. Change-Id: Ib698564c31127c59b5a64c80f4262394b8394dc6
|
fc787ecd91127b2c8458afd94e5148e2ae51a1f5 |
|
10-Oct-2014 |
Ian Rogers <irogers@google.com> |
Enable -Wimplicit-fallthrough. On a clang build, switch cases that fall through must now be annotated with the FALLTHROUGH_INTENDED macro. Bug: 17731372 Change-Id: I836451cd5f96b01d1ababdbf9eef677fe8fa8324
|
476df557fed5f0b3f32f8d11a654674bb403a8f8 |
|
09-Oct-2014 |
Roland Levillain <rpl@google.com> |
Use Is*() helpers to shorten code in the optimizing compiler. Change-Id: I79f31833bc9a0aa2918381aa3fb0b05d45f75689
|
360231a056e796c36ffe62348507e904dc9efb9b |
|
08-Oct-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Fix code generation of materialized conditions. Move the logic for knowing if a condition needs to be materialized into an optimization pass (so that the information does not change as a side effect of another optimization). Also clean up arm and x86_64 codegen: - arm: ldr and str are for power-users when a constant is in play. We should use LoadFromOffset and StoreToOffset. - x86_64: fix misuses of movq instead of movl. Change-Id: I01a03b91803624be2281a344a13ad5efbf4f3ef3
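For readers unfamiliar with the term: a condition is materialized when its boolean value must actually be produced in a register, rather than feeding a branch directly. A minimal illustration (names are illustrative):

    class MaterializeExample {
      static boolean materialized(int x, int y) {
        // The comparison result is stored into a variable, so it must be
        // materialized as an actual 0/1 value in a register.
        boolean b = x > y;
        return b;
      }

      static int notMaterialized(int x, int y) {
        // The comparison only feeds the branch; codegen can emit a
        // compare-and-branch without ever producing a boolean value.
        if (x > y) {
          return 1;
        }
        return 0;
      }
    }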
|
56b9ee6fe1d6880c5fca0e7feb28b25a1ded2e2f |
|
09-Oct-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Stop converting from Location to ManagedRegister. Now the source of truth is the Location object that knows which register (core, pair, fpu) it needs to refer to. Change-Id: I62401343d7479ecfb24b5ed161ec7829cda5a0b1
|
7e70b002c4552347ed1af8c002a0e13f08864f20 |
|
08-Oct-2014 |
Ian Rogers <irogers@google.com> |
Header file clean up. Remove runtime.h from object.h. Move TypeStaticIf to its own header file to avoid bringing utils.h into allocator.h. Move Array::DataOffset into -inl.h as it now has a utils.h dependency. Fix include issues arising from this. Change-Id: I4605b1aa4ff5f8dc15706a0132e15df03c7c8ba0
|
01ef345767ea609417fc511e42007705c9667546 |
|
01-Oct-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Add trivial register hints to the register allocator. - Add hints for phis, for outputs that should reuse the first input's register, and for expected registers. - Make the if instruction accept non-condition instructions. Change-Id: I34fa68393f0d0c19c68128f017b7a05be556fbe5
|
7fb49da8ec62e8a10ed9419ade9f32c6b1174687 |
|
06-Oct-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Add support for floats and doubles. - Follows Quick conventions. - Currently only works with the baseline register allocator. Change-Id: Ie4b8e298f4f5e1cd82364da83e4344d4fc3621a3
|
26a25ef62a13f409f941aa39825a51b4d6f0f047 |
|
30-Sep-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Add a prepare-for-register-allocation pass. - Currently the pass just changes the uses of checks to the actual values. - Also optimize array access, now that inputs can be constants. - And fix another bug in the register allocator revealed by this change. Change-Id: I43be0dbde9330ee5c8f9d678de11361292d8bd98
|
9ae0daa60c568f98ef0020e52366856ff314615f |
|
30-Sep-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Add support for inputs dying at entry of instructions. - Start using it in places where it makes sense. - Also improve suspend check on arm to use subs directly. Change-Id: I09ac0589f5ccb9b850ee757c76dcbcf35ee8cd01
|
5799fc0754da7ff2b50b472e05c65cd4ba32dda2 |
|
25-Sep-2014 |
Roland Levillain <rpl@google.com> |
Optimizing compiler: remove unnecessary `explicit' keywords. Change-Id: I5927fd92d53308c81e14edbd6e7d1c943bfa085b
|
3c04974a90b0e03f4b509010bff49f0b2a3da57f |
|
24-Sep-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Optimize suspend checks in optimizing compiler. - Remove the ones added during graph build (they were added for the baseline code generator). - Emit them at loop back edges after phi moves, so that the test can directly jump to the loop header. - Fix x86 and x86_64 suspend checks by using cmpw instead of cmpl. Change-Id: I6fad5795a55705d86c9e1cb85bf5d63dadfafa2a
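In source terms, the new placement looks like this (a sketch of where the check lands, not actual compiler output):

    class SuspendCheckExample {
      static int sum(int[] values) {
        int total = 0;
        for (int i = 0; i < values.length; i++) {
          total += values[i];
          // Loop back edge: the suspend check is emitted here, after the phi
          // moves for `i' and `total', so the loop test can jump directly to
          // the loop header.
        }
        return total;
      }
    }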
|
3bca0df855f0e575c6ee020ed016999fc8f14122 |
|
19-Sep-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Support for saving and restoring live registers in a slow path. And use it in suspend check slow paths. Change-Id: I79caf28f334c145a36180c79a6e2fceae3990c31
|
18efde5017369e005f1e8bcd3bbfb04e85053640 |
|
22-Sep-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Fix code generation with materialized conditions. Change-Id: I8630af3c13fc1950d3fa718d7488407b00898796
|
e982f0b8e809cece6f460fa2d8df25873aa69de4 |
|
13-Aug-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Implement invoke virtual in optimizing compiler. Also refactor 004 tests to make them work with both Quick and Optimizing. Change-Id: I87e275cb0ae0258fc3bb32b612140000b1d2adf8
|
fbc695f9b8e2084697e19c1355ab925f99f0d235 |
|
15-Sep-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Revert "Revert "Implement suspend checks in new compiler."" This reverts commit 7e3652c45c30c1f2f840e6088e24e2db716eaea7. Change-Id: Ib489440c34e41cba9e9e297054f9274f6e81a2d8
|
7e3652c45c30c1f2f840e6088e24e2db716eaea7 |
|
15-Sep-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Revert "Implement suspend checks in new compiler." This reverts commit 6fbce029fba3ed5da6c36017754ed408e6bcb632. Change-Id: Ia915c27873b021e658a10212e559095dfc91284e
|
6fbce029fba3ed5da6c36017754ed408e6bcb632 |
|
10-Sep-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Implement suspend checks in new compiler. For simplicity, they are currently placed on all (dex-level) back edges, and at method entry. Change-Id: I6e833e244d559dd788c69727e22fe40aff5b3435
|
3946844c34ad965515f677084b07d663d70ad1b8 |
|
02-Sep-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Runtime support for the new stack maps for the opt compiler. Now most of the methods supported by the compiler can be optimized, instead of using the baseline. Change-Id: I80ab36a34913fa4e7dd576c7bf55af63594dc1fa
|
03c9785a8a6d712775cf406c4371d0227c44148f |
|
14-Aug-2014 |
Dave Allison <dallison@google.com> |
Revert "Revert "Reduce stack usage for overflow checks"" Fixes stack protection issue. Fixes mac build issue. This reverts commit 83b1940e6482b9d8feba5c492507735686650ea5. Change-Id: I7ba17252882b23a740bcda2ea94aacf398255406
|
83b1940e6482b9d8feba5c492507735686650ea5 |
|
14-Aug-2014 |
Dave Allison <dallison@google.com> |
Revert "Reduce stack usage for overflow checks" This reverts commit 63c051a540e6dfc806f656b88ac3a63e99395429. Change-Id: I282a048994fcd130fe73842b16c21680053c592f
|
63c051a540e6dfc806f656b88ac3a63e99395429 |
|
26-Jul-2014 |
Dave Allison <dallison@google.com> |
Reduce stack usage for overflow checks. This reduces the stack space reserved for overflow checks to 12K, split into an 8K gap and a 4K protected region. GC needs over 8K when running in a stack overflow situation. Also prevents signal runaway by detecting a signal inside code that resulted from a signal handler invocation. And adds a max signal count to the SignalTest to prevent it running forever. Also reduces the number of iterations for the InterfaceTest as this was taking (almost) forever with the --trace option on run-test. Bug: 15435566 Change-Id: Id4fd46f22d52d42a9eb431ca07948673e8fda694 Conflicts: compiler/optimizing/code_generator_x86_64.cc runtime/arch/x86/fault_handler_x86.cc runtime/arch/x86_64/quick_entrypoints_x86_64.S
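The situation the reserved region guards against is plain unbounded recursion (a sketch; the sizes are the ones quoted above):

    class OverflowExample {
      // Each frame pushes the stack pointer closer to the 4K protected
      // region, which faults first; the 8K gap above it leaves the runtime
      // (including the GC, which needs over 8K here) enough room to raise
      // StackOverflowError.
      static int deep(int n) {
        return deep(n + 1) + 1;
      }
    }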
|
648d7112609dd19c38131b3e71c37bcbbd19d11e |
|
26-Jul-2014 |
Dave Allison <dallison@google.com> |
Reduce stack usage for overflow checks. This reduces the stack space reserved for overflow checks to 12K, split into an 8K gap and a 4K protected region. GC needs over 8K when running in a stack overflow situation. Also prevents signal runaway by detecting a signal inside code that resulted from a signal handler invocation. And adds a max signal count to the SignalTest to prevent it running forever. Also reduces the number of iterations for the InterfaceTest as this was taking (almost) forever with the --trace option on run-test. Bug: 15435566 Change-Id: Id4fd46f22d52d42a9eb431ca07948673e8fda694
|
f6e206c820fe75a341c98ef12410475d33028640 |
|
07-Aug-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Support x86_64 stack overflow checks in opt compiler. Also re-enable SignalTest on optimizing-32. Change-Id: I2ca13f6f9ea775c654ee07cc5026c985263d6380
|
3c7bb98698f77af10372cf31824d3bb115d9bf0f |
|
23-Jul-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Implement array get and array put in optimizing. Also fix a couple of assembler/disassembler issues. Change-Id: I705c8572988c1a9c4df3172b304678529636d5f6
|
f12feb8e0e857f2832545b3f28d31bad5a9d3903 |
|
17-Jul-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Stack overflow checks and NPE checks for optimizing. Change-Id: I59e97448bf29778769b79b51ee4ea43f43493d96
|
1a43dd78d054dbad8d7af9ba4829ea2f1cb70b53 |
|
17-Jul-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Add write barriers to optimizing compiler. Change-Id: I43a40954757f51d49782e70bc28f7c314d6dbe17
|
96f89a290eb67d7bf4b1636798fa28df14309cc7 |
|
11-Jul-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Add assembly operations with constants in optimizing compiler. Change-Id: I5bcc35ab50d4457186effef5592a75d7f4e5b65f
|
ab032bc1ff57831106fdac6a91a136293609401f |
|
15-Jul-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Fix a braino in the stack layout. Also do some refactoring to have this code be just in CodeGenerator. Change-Id: I88de109889138af8d60027973c12a64bee813cb7
|
e50383288a75244255d3ecedcc79ffe9caf774cb |
|
04-Jul-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Support fields in optimizing compiler. - Required support for temporaries, to be used only by the baseline compiler. - Also fixed a few invalid assumptions around locations and instructions that don't need materialization. These instructions should not have an Out. Change-Id: Idc4a30dd95dd18015137300d36bec55fc024cf62
|
412f10cfed002ab617c78f2621d68446ca4dd8bd |
|
19-Jun-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Support longs in the register allocator for x86_64. Change-Id: I7fb6dfb761bc5cf9e5705682032855a0a70ca867
|
20dfc797dc631bf8d655dcf123f46f13332d3074 |
|
17-Jun-2014 |
Dave Allison <dallison@google.com> |
Add some more instruction support to optimizing compiler. This adds a few more DEX instructions to the optimizing compiler's builder (constants, moves, if_xx, etc). Also: * Changes the codegen for IF_XX instructions to use a condition rather than comparing a value against 0. * Fixes some instructions in the ARM disassembler. * Fixes PushList and PopList in the thumb2 assembler. * Switches the assembler for the optimizing compiler to thumb2 rather than ARM. Change-Id: Iaafcd02243ccc5b03a054ef7a15285b84c06740f
|
ecb2f9ba57b08ceac4204ddd6a0a88a0524f8741 |
|
13-Jun-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Enable the register allocator on x86_64. Also fix an x86_64 assembler bug for movl. Change-Id: I8d17c68cd35ddd1d8df159f2d6173a013a7c3347
|
9cf35523764d829ae0470dae2d5dd99be469c841 |
|
09-Jun-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Add x86_64 support to the optimizing compiler. Change-Id: I4462d9ae15be56c4a3dc1bd4d1c0c6548c1b94be
|