c393d63aa2b8f6984672fdd4de631bbeff14b6a2 |
|
15-Apr-2016 |
Alexandre Rames <alexandre.rames@linaro.org> |
Fix: correctly destruct VIXL labels. (cherry picked from commit c01a66465a398ad15da90ab2bdc35b7f4a609b17) Bug: 27505766 Change-Id: I077465e3d308f4331e7a861902e05865f9d99835
|
d58b837ae41c6d8ce010c362e8f85bd938715900 |
|
12-Apr-2016 |
Vladimir Marko <vmarko@google.com> |
Allocate code generators on the arena. Change-Id: If8cf0ee43711f6e13171443e3c057ff370ccfbaa
|
dee58d6bb6d567fcd0c4f39d8d690c3acaf0e432 |
|
07-Apr-2016 |
David Brazdil <dbrazdil@google.com> |
Revert "Revert "Refactor HGraphBuilder and SsaBuilder to remove HLocals"" This patch merges the instruction-building phases from HGraphBuilder and SsaBuilder into a single HInstructionBuilder class. As a result, it is not necessary to generate HLocal, HLoadLocal and HStoreLocal instructions any more, as the builder produces SSA form directly. Saves 5-15% of arena-allocated memory (see bug for more data): GMS 20.46MB => 19.26MB (-5.86%) Maps 24.12MB => 21.47MB (-10.98%) YouTube 28.60MB => 26.01MB (-9.05%) This CL fixed an issue with parsing quickened instructions. Bug: 27894376 Bug: 27998571 Bug: 27995065 Change-Id: I20dbe1bf2d0fe296377478db98cb86cba695e694
|
60328910cad396589474f8513391ba733d19390b |
|
04-Apr-2016 |
David Brazdil <dbrazdil@google.com> |
Revert "Refactor HGraphBuilder and SsaBuilder to remove HLocals" Bug: 27995065 This reverts commit e3ff7b293be2a6791fe9d135d660c0cffe4bd73f. Change-Id: I5363c7ce18f47fd422c15eed5423a345a57249d8
|
e3ff7b293be2a6791fe9d135d660c0cffe4bd73f |
|
02-Mar-2016 |
David Brazdil <dbrazdil@google.com> |
Refactor HGraphBuilder and SsaBuilder to remove HLocals This patch merges the instruction-building phases from HGraphBuilder and SsaBuilder into a single HInstructionBuilder class. As a result, it is not necessary to generate HLocal, HLoadLocal and HStoreLocal instructions any more, as the builder produces SSA form directly. Saves 5-15% of arena-allocated memory (see bug for more data): GMS 20.46MB => 19.26MB (-5.86%) Maps 24.12MB => 21.47MB (-10.98%) YouTube 28.60MB => 26.01MB (-9.05%) Bug: 27894376 Change-Id: Iefe28d40600c169c5d306fd2c77034ae19476d90
|
cac5a7e871f1f346b317894359ad06fa7bd67fba |
|
22-Feb-2016 |
Vladimir Marko <vmarko@google.com> |
Optimizing: Improve const-string code generation. For strings in the boot image, use either direct pointers or pc-relative addresses. For other strings, use PC-relative access to the dex cache arrays for AOT and direct address of the string's dex cache slot for JIT. For aosp_flounder-userdebug: - 32-bit boot.oat: -692KiB (-0.9%) - 64-bit boot.oat: -948KiB (-1.1%) - 32-bit dalvik cache total: -900KiB (-0.9%) - 64-bit dalvik cache total: -3672KiB (-1.5%) (contains more files than the 32-bit dalvik cache) For aosp_flounder-userdebug forced to compile PIC: - 32-bit boot.oat: -380KiB (-0.5%) - 64-bit boot.oat: -928KiB (-1.0%) - 32-bit dalvik cache total: -468KiB (-0.4%) - 64-bit dalvik cache total: -1928KiB (-0.8%) (contains more files than the 32-bit dalvik cache) Bug: 26884697 Change-Id: Iec7266ce67e6fedc107be78fab2e742a8dab2696
|
d28f4a00933a4a3b8d5e9db73b8532924d0f989d |
|
14-Mar-2016 |
David Srbecky <dsrbecky@google.com> |
Generate native debug stackmaps before calls as well. The debugger looks up PC of the call instruction, so the runtime's stackmap is not sufficient since it is at PC after the instruction. Change-Id: I0dd06c0b52e8079ea5d064ea10beb12c93584092
|
2ae48182573da7087bffc2873730bc758ec29696 |
|
16-Mar-2016 |
Calin Juravle <calin@google.com> |
Clean up NullCheck generation and record stats about it. This removes redundant code from the generators and allows for easier stat recording. Change-Id: Iccd4368f9e9d87a6fecb863dee4e2145c97851c4
|
9cd6d378bd573cdc14d049d32bdd22a97fa4d84a |
|
09-Feb-2016 |
David Srbecky <dsrbecky@google.com> |
Associate slow paths with the instruction that they belong to. Almost all slow paths already know the instruction they belong to, this CL just moves the knowledge to the base class as well. This is needed to be be able to get the corresponding dex pc for slow path, which allows us generate better native line numbers, which in turn fixes some native debugging stepping issues. Change-Id: I568dbe78a7cea6a43a4a71a014b3ad135782c270
|
c7098ff991bb4e00a800d315d1c36f52a9cb0149 |
|
09-Feb-2016 |
David Srbecky <dsrbecky@google.com> |
Remove HNativeDebugInfo from start of basic blocks. We do not require full environment at the start of basic block. The dex pc contained in basic block is sufficient for line mapping. Change-Id: I5ba9e5f5acbc4a783ad544769f9a73bb33e2bafa
|
6e332529c33be4d7dae5dad3609a839f4c0d3bfc |
|
02-Feb-2016 |
David Brazdil <dbrazdil@google.com> |
ART: Remove HTemporary Change-Id: I21b984224370a9ce7a4a13a9652503cfb03c5f03
|
b331febbab8e916680faba722cc84b66b84218a3 |
|
05-Feb-2016 |
Nicolas Geoffray <ngeoffray@google.com> |
Revert "Revert "Implement on-stack replacement for arm/arm64/x86/x86_64."" This reverts commit bd89a5c556324062b7d841843b039392e84cfaf4. Change-Id: I08d190431520baa7fcec8fbdb444519f25ac8d44
|
bd89a5c556324062b7d841843b039392e84cfaf4 |
|
05-Feb-2016 |
David Brazdil <dbrazdil@google.com> |
Revert "Implement on-stack replacement for arm/arm64/x86/x86_64." DCHECK whether loop headers are covered fails. This reverts commit 891bc286963892ed96134ca1adb7822737af9710. Change-Id: I0f9a90630b014b16d20ba1dfba31ce63e6648021
|
891bc286963892ed96134ca1adb7822737af9710 |
|
29-Jan-2016 |
Nicolas Geoffray <ngeoffray@google.com> |
Implement on-stack replacement for arm/arm64/x86/x86_64. High-level overview: - osr_method_threshold is used to know when to compile a method in osr mode (-> treat all loops as irreducible). - branch instructions in the compiler query whether they can jump to an osr method. - An osr entry point is found through the stack maps: if a stack map is duplicated in the CodeInfo, it is an osr entry point. Change-Id: Ifb39338cd281e2c7eccce67f4e18d46428be71e4
|
58282f4510961317b8d5a364a6f740a78926716f |
|
14-Jan-2016 |
David Brazdil <dbrazdil@google.com> |
ART: Remove Baseline compiler We don't need Baseline any more and it hasn't been maintained for a while anyway. Let's remove it. Change-Id: I442ed26855527be2df3c79935403a25b1ee55df6
|
42249c3602c3d0243396ee3627ffb5906aa77c1e |
|
08-Jan-2016 |
Aart Bik <ajcbik@google.com> |
Reduce code size by sharing slow paths. Rationale: Sharing identical slow path code reduces code size. Background: Currently, slow paths with the same dex-pc, same physical register spilling code, and identical stack maps are shared (making this only useful for deopt slow paths). The newly introduced mechanism is sufficiently general to allow future improvements by e.g. allowing different dex-pc (by passing this to runtime) or even the kind of slow paths (by passing runtime addresses to the slowpath). Change-Id: I819615c47b4fd98440a241f681f93e4fc22d12e0
|
b7070a2db8b0b7eca14f01f932be305be64ded57 |
|
08-Jan-2016 |
David Srbecky <dsrbecky@google.com> |
Generate Nops to ensure that debug stack maps have distinct PC. Change-Id: I5740ec958a20d236634b66df0e675382ed5c16fc
|
f71b3ade9c99ce2fec2f5049ce9c5968721e1b81 |
|
08-Dec-2015 |
David Srbecky <dsrbecky@google.com> |
Get source mapping table from stack maps. Stack maps contain pc to dex mapping. Reuse them instead of maintaining separate map. Change-Id: Iaaec9a6bd2603eace1dfc8f4344087883d88cce3
|
0d5a281c671444bfa75d63caf1427a8c0e6e1177 |
|
13-Nov-2015 |
Roland Levillain <rpl@google.com> |
x86/x86-64 read barrier support for concurrent GC in Optimizing. This first implementation uses slow paths to instrument heap reference loads and GC root loads for the concurrent copying collector, respectively calling the artReadBarrierSlow and artReadBarrierForRootSlow (new) runtime entry points. Notes: - This implementation does not instrument HInvokeVirtual nor HInvokeInterface instructions (for class reference loads), as the corresponding read barriers are not stricly required with the current concurrent copying collector. - Intrinsics which may eventually call (on slow path) are disabled when read barriers are enabled, as the current slow path infrastructure does not support this case. - When read barriers are enabled, the code generated for a HArraySet instruction always go into the array set slow path for object arrays (delegating the operation to the runtime), as we are lacking a mechanism to keep a temporary register live accross a runtime call (needed for the instrumentation of type checking code, which requires two successive read barriers). Bug: 12687968 Change-Id: I14cd6107233c326389120336f93955b28ffbb329
|
0f7dca4ca0be8d2f8776794d35edf8b51b5bc997 |
|
02-Nov-2015 |
Vladimir Marko <vmarko@google.com> |
Optimizing/X86: PC-relative dex cache array addressing. Add PC-relative dex cache array addressing for X86 and use it for better invoke-static/-direct dispatch. Also delay the initialization to the PC-relative base until needed. Change-Id: Ib8634d5edce4920cd70172fd13211809cf6948d1
|
d28b969c273ab777ca9b147b87fcef671b4f695f |
|
04-Nov-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Code cleanup to avoid CompilerDriver abstractions in JIT. Avoids allocating a CompiledMethod. Change-Id: I35b4aa0d7c74daba68e827a01e71c300fce3b3bf
|
dc151b2346bb8a4fdeed0c06e54c2fca21d59b5d |
|
15-Oct-2015 |
Vladimir Marko <vmarko@google.com> |
Optimizing: Determine invoke-static/-direct dispatch early. Determine the dispatch type of invoke-static/-direct in a special pass right after the type inference. This allows the inliner to pass the "needs dex cache" check and inline more. It also allows the code generator to avoid requesting a register location for the ArtMethod* for kDexCachePcRelative and direct methods. The supported dispatch check handles also situations that the CompilerDriver currently doesn't allow. The cleanup of the CompilerDriver and required changes to Quick will come in a separate change. Change-Id: I3f8e903a119949e95871d8ab0a995f4731a13a07
|
5bd05a5c9492189ec28edaf6396d6a39ddf03367 |
|
13-Oct-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Implement System.arraycopy intrinsic for arm. Change-Id: I58ae1af5103e281fe59fbe022b718d6d8f293a5e
|
b95fb775cc4c08349d0d905adbc96ad85e50601d |
|
30-Sep-2015 |
Vladimir Marko <vmarko@google.com> |
Optimizing: Clean up after tagging arena allocations. Change-Id: Id6ee1fe44c4c57d373db7a39530f29a5ca9aee18
|
98893e146b0ff0e1fd1d7c29252f1d1e75a163f2 |
|
02-Oct-2015 |
Calin Juravle <calin@google.com> |
Add support for unresolved classes in optimizing. Change-Id: I0e299a81e560eb9cb0737ec46125dffc99333b54
|
e460d1df1f789c7c8bb97024a8efbd713ac175e9 |
|
29-Sep-2015 |
Calin Juravle <calin@google.com> |
Revert "Revert "Support unresolved fields in optimizing" The CL also changes the calling convetion for 64bit static field set to use kArg2 instead of kArg1. This allows optimizing to keep the asumptions: - arm pairs are always of form (even_reg, odd_reg) - ecx_edx is not used as a register on x86. This reverts commit e6f49b47b6a4dc9c7684e4483757872cfc7ff1a1. Change-Id: I93159917565824084abc96775f31be1a4249f2f3
|
5233f93ee336b3581ccdb993ff6342c52fec34b0 |
|
29-Sep-2015 |
Vladimir Marko <vmarko@google.com> |
Optimizing: Tag even more arena allocations. Tag previously "Misc" arena allocations with more specific allocation types. Move some native heap allocations to the arena in BCE. Bug: 23736311 Change-Id: If8ef15a8b614dc3314bdfb35caa23862c9d4d25c
|
225b6464a58ebe11c156144653f11a1c6607f4eb |
|
28-Sep-2015 |
Vladimir Marko <vmarko@google.com> |
Optimizing: Tag arena allocations in code generators. And completely remove the deprecated GrowableArray. Replace GrowableArray with ArenaVector in code generators and related classes and tag arena allocations. Label arrays use direct allocations from ArenaAllocator because Label is non-copyable and non-movable and as such cannot be really held in a container. The GrowableArray never actually constructed them, instead relying on the zero-initialized storage from the arena allocator to be correct. We now actually construct the labels. Also avoid StackMapStream::ComputeDexRegisterMapSize() being passed null references, even though unused. Change-Id: I26a46fdd406b23a3969300a67739d55528df8bf4
|
85b62f23fc6dfffe2ddd3ddfa74611666c9ff41d |
|
09-Sep-2015 |
Andreas Gampe <agampe@google.com> |
ART: Refactor intrinsics slow-paths Refactor slow paths so that there is a default implementation for common cases (only arm64 with vixl is special). Write a generic intrinsic slow-path that can be reused for the specific architectures. Move helper functions into CodeGenerator so that they are accessible. Change-Id: Ibd788dce432601c6a9f7e6f13eab31f28dcb8550
|
e6f49b47b6a4dc9c7684e4483757872cfc7ff1a1 |
|
17-Sep-2015 |
Calin Juravle <calin@google.com> |
Revert "Support unresolved fields in optimizing" breaks debuggable tests. This reverts commit 23a8e35481face09183a24b9d11e505597c75ebb. Change-Id: I8e60b5c8f48525975f25d19e5e8066c1c94bd2e5
|
23a8e35481face09183a24b9d11e505597c75ebb |
|
08-Sep-2015 |
Calin Juravle <calin@google.com> |
Support unresolved fields in optimizing Change-Id: I9941fa5fcb6ef0a7a253c7a0b479a44a0210aad4
|
175dc732c80e6f2afd83209348124df349290ba8 |
|
25-Aug-2015 |
Calin Juravle <calin@google.com> |
Support unresolved methods in Optimizing Change-Id: If2da02b50d2fa668cd58f134a005f1752e7746b1
|
fa6b93c4b69e6d7ddfa2a4ed0aff01b0608c5a3a |
|
15-Sep-2015 |
Vladimir Marko <vmarko@google.com> |
Optimizing: Tag arena allocations in HGraph. Replace GrowableArray with ArenaVector in HGraph and related classes HEnvironment, HLoopInformation, HInvoke and HPhi, and tag allocations with new arena allocation types. Change-Id: I3d79897af405b9a1a5b98bfc372e70fe0b3bc40d
|
77a48ae01bbc5b05ca009cf09e2fcb53e4c8ff23 |
|
15-Sep-2015 |
David Brazdil <dbrazdil@google.com> |
Revert "Revert "ART: Register allocation and runtime support for try/catch"" The original CL triggered b/24084144 which has been fixed by Ib72e12a018437c404e82f7ad414554c66a4c6f8c. This reverts commit 659562aaf133c41b8d90ec9216c07646f0f14362. Change-Id: Id8980436172457d0fcb276349c4405f7c4110a55
|
659562aaf133c41b8d90ec9216c07646f0f14362 |
|
14-Sep-2015 |
David Brazdil <dbrazdil@google.com> |
Revert "ART: Register allocation and runtime support for try/catch" Breaks libcore test org.apache.harmony.security.tests.java.security.KeyStorePrivateKeyEntryTest#testGetCertificateChain. Need to investigate. This reverts commit b022fa1300e6d78639b3b910af0cf85c43df44bb. Change-Id: Ib24d3a80064d963d273e557a93469c95f37b1f6f
|
b022fa1300e6d78639b3b910af0cf85c43df44bb |
|
20-Aug-2015 |
David Brazdil <dbrazdil@google.com> |
ART: Register allocation and runtime support for try/catch This patch completes a series of CLs that add support for try/catch in the Optimizing compiler. With it, Optimizing can compile all methods containing try/catch, provided they don't contain catch loops. Future work will focus on improving performance of the generated code. SsaLivenessAnalysis was updated to propagate liveness information of instructions live at catch blocks, and to keep location information on instructions which may be caught by catch phis. RegisterAllocator was extended to spill values used after catch, and to allocate spill slots for catch phis. Catch phis generated for the same vreg share a spill slot as the raw value must be the same. Location builders and slow paths were updated to reflect the fact that throwing an exception may not lead to escaping the method. Instruction code generators are forbidden from using of implicit null checks in try blocks as live registers need to be saved before handing over to the runtime. CodeGenerator emits a stack map for each catch block, storing locations of catch phis. CodeInfo and StackMapStream recognize this new type of stack map and store them separate from other stack maps to avoid dex_pc conflicts. After having found the target catch block to deliver an exception to, QuickExceptionHandler looks up the dex register maps at the throwing instruction and the catch block and copies the values over to their respective locations. The runtime-support approach was selected because it allows for the best performance in the normal control-flow path, since no propagation of catch phi values is necessary until the exception is thrown. In addition, it also greatly simplifies the register allocation phase. ConstantHoisting was removed from LICMTest because it instantiated (now abstract) HConstant and was bogus anyway (constants are always in the entry block). Change-Id: Ie31038ad8e3ee0c13a5bbbbaf5f0b3e532310e4e
|
2a7c1ef95c850abae915b3a59fbafa87e6833967 |
|
22-Jul-2015 |
Yevgeny Rouban <yevgeny.y.rouban@intel.com> |
Add more dwarf debug line info for Optimized methods. Optimizing compiler generates minimum debug line info that is built using the dex_pc information about suspend points. This is not enough for performance and debugging needs. This CL generates additional debug line information for instructions which have known dex_pc and it ensures that whole call sites are mapped (as opposed to suspend points which map only one instruction past the function call). Bug: 23157336 Change-Id: I9f2b1c2038e3560847c175b8121cf9496b8b58fa Signed-off-by: Yevgeny Rouban <yevgeny.y.rouban@intel.com>
|
f9f6441c665b5ff9004d3ed55014f46d416fb1bb |
|
02-Sep-2015 |
Vladimir Marko <vmarko@google.com> |
Optimizing: Tag Arena allocations with their source. This adds the ability to track where we allocate memory when the kArenaAllocatorCountAllocations flag is turned on. Also move some allocations from native heap to the Arena and remove some unnecessary utilities. Bug: 23736311 Change-Id: I1aaef3fd405d1de444fe9e618b1ce7ecef07ade3
|
ecc4366670e12b4812ef1653f7c8d52234ca1b1f |
|
13-Aug-2015 |
Serban Constantinescu <serban.constantinescu@linaro.org> |
Add OptimizingCompilerStats to the CodeGenerator class. Just refactoring, not yet used, but will be used by the incoming patch series and future CodeGen specific stats. Change-Id: I7d20489907b82678120518a77bdab9c4cc58f937 Signed-off-by: Serban Constantinescu <serban.constantinescu@linaro.org>
|
4ab02352db4051d590b793f34d166a0b5c633c4a |
|
12-Aug-2015 |
Serban Constantinescu <serban.constantinescu@linaro.org> |
Use CodeGenerator::RecordPcInfo instead of SlowPathCode::RecordPcInfo. Part of a clean-up and refactoring series. SlowPathCode::RecordPcInfo is currently just a wrapper around CodGenerator::RecordPcInfo. Change-Id: Iffabef4ef37c365051130bf98a6aa6dc0a0fb254 Signed-off-by: Serban Constantinescu <serban.constantinescu@linaro.org>
|
581550137ee3a068a14224870e71aeee924a0646 |
|
19-Aug-2015 |
Vladimir Marko <vmarko@google.com> |
Revert "Revert "Optimizing: Better invoke-static/-direct dispatch."" Fixed kCallArtMethod to use correct callee location for kRecursive. This combination is used when compiling with debuggable flag set. This reverts commit b2c431e80e92eb6437788cc544cee6c88c3156df. Change-Id: Idee0f2a794199ebdf24892c60f8a5dcf057db01c
|
b2c431e80e92eb6437788cc544cee6c88c3156df |
|
19-Aug-2015 |
Vladimir Marko <vmarko@google.com> |
Revert "Optimizing: Better invoke-static/-direct dispatch." Reverting due to failing ndebug tests. This reverts commit 9b688a095afbae21112df5d495487ac5231b12d0. Change-Id: Ie4f69da6609df3b7c8443412b6cf7f5c43c2c5d9
|
9b688a095afbae21112df5d495487ac5231b12d0 |
|
06-May-2015 |
Vladimir Marko <vmarko@google.com> |
Optimizing: Better invoke-static/-direct dispatch. Add framework for different types of loading ArtMethod* and code pointer retrieval. Implement invoke-static and invoke-direct calls the same way as Quick. Document the dispatch kinds in HInvokeStaticOrDirect's new enumerations MethodLoadKind and CodePtrLocation. PC-relative loads from dex cache arrays are used only for x86-64 and arm64. The implementation for other architectures will be done in separate CLs. Change-Id: I468ca4d422dbd14748e1ba6b45289f0d31734d94
|
78e3ef6bc5f8aa149f2f8bf0c78ce854c2f910fa |
|
12-Aug-2015 |
Alexandre Rames <alexandre.rames@linaro.org> |
Add a GVN dependency 'GC' for garbage collection. This will be used by incoming architecture specific optimizations. The dependencies must be conservative. When an HInstruction is created we may not be sure whether it can trigger GC. In that case the 'ChangesGC' dependency must be set. We control at code-generation time that HInstructions that can call have the 'ChangesGC' dependency set. Change-Id: Iea6a7f430009f37a9599b0a0039207049906e45d
|
8158f28b6689314213eb4dbbe14166073be71f7e |
|
07-Aug-2015 |
Alexandre Rames <alexandre.rames@linaro.org> |
Ensure coherency of call kinds for LocationSummary. The coherency is enforced with checks added in the `InvokeRuntime` helper, that we now also use on x86 and x86_64. Change-Id: I8cb92b042f25dc3c5fd390e9c61a45b477d081f4
|
45b83aff85a8a8dfcae0da90d010fa2d7eb299a7 |
|
06-Jul-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Revert "Revert "Fix LSRA bug with explicit register temporaries"" This reverts commit a5fc140ff315dda9bc0a8e59963ed547676cd941. Change-Id: Ic322484176e55d0c7cd7250d629b9e5046006a4f
|
a5fc140ff315dda9bc0a8e59963ed547676cd941 |
|
06-Jul-2015 |
Calin Juravle <calin@google.com> |
Revert "Fix LSRA bug with explicit register temporaries" register_allocator_test32 fails. This reverts commit 283b8541546e7673d33d104241623d07c91cf500. Change-Id: I2a46f3c68de3e8273e402102065c13797045c481
|
283b8541546e7673d33d104241623d07c91cf500 |
|
03-Jul-2015 |
Mark Mendell <mark.p.mendell@intel.com> |
Fix LSRA bug with explicit register temporaries A temporary with an explicit RegisterLocation, such as ESI on x86 didn't have the register marked as allocated. This caused it to not be saved/restored in the prologue/epilogue, causing problems in the caller routine, which expected it to be saved. Found while implementing https://android-review.googlesource.com/#/c/157522/. Change-Id: I22ca2b24c2d21b1c6ab6cfb7dec26cb38034a891 Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>
|
eb7b7399dbdb5e471b8ae00a567bf4f19edd3907 |
|
19-Jun-2015 |
Alexandre Rames <alexandre.rames@linaro.org> |
Opt compiler: Add disassembly to the '.cfg' output. This is automatically added to the '.cfg' output when using the usual `--dump-cfg` option. Change-Id: I864bfc3a8299c042e72e451cc7730ad8271e4deb
|
9931f319cf86c56c2855d800339a3410697633a6 |
|
19-Jun-2015 |
Alexandre Rames <alexandre.rames@linaro.org> |
Opt compiler: Add a description to slow paths. Change-Id: I22160d90de3fe0ab3e6a2acc440bda8daa00e0f0
|
cf93a5cd9c978f59113d42f9f642fab5e2cc8877 |
|
16-Jun-2015 |
Vladimir Marko <vmarko@google.com> |
Revert "Revert "ART: Implement literal pool for arm, fix branch fixup."" This reverts commit fbeb4aede0ddc5b1e6a5a3a40cc6266fe8518c98. Adjust block label positions. Bad catch block labels were the reason for the revert. Change-Id: Ia6950d639d46b9da6b07f3ade63ab46d03d63310
|
fbeb4aede0ddc5b1e6a5a3a40cc6266fe8518c98 |
|
16-Jun-2015 |
Vladimir Marko <vmarko@google.com> |
Revert "ART: Implement literal pool for arm, fix branch fixup." This reverts commit f38caa68cce551fb153dff37d01db518e58ed00f. Change-Id: Id88b82cc949d288cfcdb3c401b96f884b777fc40 Reason: broke the tests.
|
f38caa68cce551fb153dff37d01db518e58ed00f |
|
29-May-2015 |
Vladimir Marko <vmarko@google.com> |
ART: Implement literal pool for arm, fix branch fixup. Change-Id: Iecc91418bb4ee1c957f42fefb737d0ee2ba960e7
|
bd8c725e465cc7f44062745a6f2b73248f5159ed |
|
12-Jun-2015 |
Vladimir Marko <vmarko@google.com> |
Optimizing: Remove PcInfo, use the StackMapStream instead. Change-Id: I474f3a89f6c7ee5c7accd21791b1c1e311104158
|
fd88f16100cceafbfde1b4f095f17e89444d6fa8 |
|
03-Jun-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Factorize code for common LocationSummary of HInvoke. This is one step forward, we could factorize more, but I wanted to get this out of the way first. Change-Id: I6ae411a737eebaecb64974f47af507ce0cfbae85
|
3d21bdf8894e780d349c481e5c9e29fe1556051c |
|
22-Apr-2015 |
Mathieu Chartier <mathieuc@google.com> |
Move mirror::ArtMethod to native Optimizing + quick tests are passing, devices boot. TODO: Test and fix bugs in mips64. Saves 16 bytes per most ArtMethod, 7.5MB reduction in system PSS. Some of the savings are from removal of virtual methods and direct methods object arrays. Bug: 19264997 (cherry picked from commit e401d146407d61eeb99f8d6176b2ac13c4df1e33) Change-Id: I622469a0cfa0e7082a2119f3d6a9491eb61e3f3d Fix some ArtMethod related bugs Added root visiting for runtime methods, not currently required since the GcRoots in these methods are null. Added missing GetInterfaceMethodIfProxy in GetMethodLine, fixes --trace run-tests 005, 044. Fixed optimizing compiler bug where we used a normal stack location instead of double on ARM64, this fixes the debuggable tests. TODO: Fix JDWP tests. Bug: 19264997 Change-Id: I7c55f69c61d1b45351fd0dc7185ffe5efad82bd3 ART: Fix casts for 64-bit pointers on 32-bit compiler. Bug: 19264997 Change-Id: Ief45cdd4bae5a43fc8bfdfa7cf744e2c57529457 Fix JDWP tests after ArtMethod change Fixes Throwable::GetStackDepth for exception event detection after internal stack trace representation change. Adds missing ArtMethod::GetInterfaceMethodIfProxy call in case of proxy method. Bug: 19264997 Change-Id: I363e293796848c3ec491c963813f62d868da44d2 Fix accidental IMT and root marking regression Was always using the conflict trampoline. Also included fix for regression in GC time caused by extra roots. Most of the regression was IMT. Fixed bug in DumpGcPerformanceInfo where we would get SIGABRT due to detached thread. EvaluateAndApplyChanges: From ~2500 -> ~1980 GC time: 8.2s -> 7.2s due to 1s less of MarkConcurrentRoots Bug: 19264997 Change-Id: I4333e80a8268c2ed1284f87f25b9f113d4f2c7e0 Fix bogus image test assert Previously we were comparing the size of the non moving space to size of the image file. Now we properly compare the size of the image space against the size of the image file. Bug: 19264997 Change-Id: I7359f1f73ae3df60c5147245935a24431c04808a [MIPS64] Fix art_quick_invoke_stub argument offsets. ArtMethod reference's size got bigger, so we need to move other args and leave enough space for ArtMethod* and 'this' pointer. This fixes mips64 boot. Bug: 19264997 Change-Id: I47198d5f39a4caab30b3b77479d5eedaad5006ab
|
e401d146407d61eeb99f8d6176b2ac13c4df1e33 |
|
22-Apr-2015 |
Mathieu Chartier <mathieuc@google.com> |
Move mirror::ArtMethod to native Optimizing + quick tests are passing, devices boot. TODO: Test and fix bugs in mips64. Saves 16 bytes per most ArtMethod, 7.5MB reduction in system PSS. Some of the savings are from removal of virtual methods and direct methods object arrays. Bug: 19264997 Change-Id: I622469a0cfa0e7082a2119f3d6a9491eb61e3f3d
|
b1d0f3f7e92fdcc92fe2d4c48cbb1262c005583f |
|
14-May-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Support InlineInfo in StackMap. Change-Id: I9956091775cedc609fdae7dec1433fcb8858a477
|
e82549b14c7def0a45461183964f7e6a34cbb70c |
|
06-May-2015 |
Mark Mendell <mark.p.mendell@intel.com> |
[optimizing] Fold HTypeConversion of constants While looking into optimizing long shifts on x86, I found that the compiler wasn't folding HTypeConversion of constants. Add simple conversions of constants, taking care of float/double values with NaNs and small/large values, ensuring Java conversion semantics. Add checker cases to see that constant folding of HTypeConversion is done. Ensure 422-type-conversion type conversion routiness don't get inlined to avoid compile time folding. Change-Id: I5a4eb376b64bc4e41bf908af5875bed312efb228 Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>
|
ec525fc30848189051b888da53ba051bc0878b78 |
|
28-Apr-2015 |
Roland Levillain <rpl@google.com> |
Factor MoveArguments methods in Optimizing's intrinsics handlers. Also add a precondition similar to the one present in code generators, regarding static invoke related explicit clinit check elimination in non-baseline compilations. Change-Id: I26f4dcb5d02824d7556f90b4b0c85b08b737fa53
|
2d27c8e338af7262dbd4aaa66127bb8fa1758b86 |
|
28-Apr-2015 |
Roland Levillain <rpl@google.com> |
Refactor InvokeDexCallingConventionVisitor in Optimizing. Change-Id: I7ede0f59d5109644887bf5d39201d4e1bf043f34
|
3e3d73349a2de81d14e2279f60ffbd9ab3f3ac28 |
|
28-Apr-2015 |
Roland Levillain <rpl@google.com> |
Have HInvoke instructions know their number of actual arguments. Add an art::HInvoke::GetNumberOfArguments routine so that art::HInvoke and its subclasses can return the number of actual arguments of the called method. Use it in code generators and intrinsics handlers. Consequently, no longer remove a clinit check as last input of a static invoke if it is still present during baseline code generation, but ensure that static invokes have no such check as last input in optimized compilations. Change-Id: Iaf9e07d1057a3b15b83d9638538c02b70211e476
|
da40309f61f98c16d7d58e4c34cc0f5eef626f93 |
|
24-Apr-2015 |
Zheng Xu <zheng.xu@arm.com> |
Opt compiler: ARM64: Use ldp/stp on arm64 for slow paths. It should be a bit faster than load/store single registers and reduce the code size. Change-Id: I67b8302adf6174b7bb728f7c2afd2c237e34ffde
|
9021825d1e73998b99c81e89c73796f6f2845471 |
|
15-Apr-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Type MoveOperands. The ParallelMoveResolver implementation needs to know if a move is for 64bits or not, to handle swaps correctly. Bug found, and test case courtesy of Serguei I. Katkov. Change-Id: I9a0917a1cfed398c07e57ad6251aea8c9b0b8506
|
c6b4dd8980350aaf250f0185f73e9c42ec17cd57 |
|
07-Apr-2015 |
David Srbecky <dsrbecky@google.com> |
Implement CFI for Optimizing. CFI is necessary for stack unwinding in gdb, lldb, and libunwind. Change-Id: I1a3480e3a4a99f48bf7e6e63c4e83a80cfee40a2
|
da4d79bc9a4aeb9da7c6259ce4c9c1c3bf545eb8 |
|
24-Mar-2015 |
Roland Levillain <rpl@google.com> |
Unify ART's various implementations of bit_cast. ART had several implementations of art::bit_cast: 1. one in runtime/base/casts.h, declared as: template <class Dest, class Source> inline Dest bit_cast(const Source& source); 2. another one in runtime/utils.h, declared as: template<typename U, typename V> static inline V bit_cast(U in); 3. and a third local version, in runtime/memory_region.h, similar to the previous one: template<typename Source, typename Destination> static Destination MemoryRegion::local_bit_cast(Source in); This CL removes versions 2. and 3. and changes their callers to use 1. instead. That version was chosen over the others as: - it was the oldest one in the code base; and - its syntax was closer to the standard C++ cast operators, as it supports the following use: bit_cast<Destination>(source) since `Source' can be deduced from `source'. Change-Id: I7334fd5d55bf0b8a0c52cb33cfbae6894ff83633
|
522e2241f5b5f331d0aa2f8508f4c97973f7f012 |
|
17-Mar-2015 |
David Brazdil <dbrazdil@google.com> |
ART: Fix condition for StoreNeedsWriteBarrier Codegen's StoreNeedsWriteBarrier assumed nulls are represented as integer constants and generated a barrier when not needed. This patch fixes the bug. Change-Id: I79247f1009b1fe6f24dba0d57e846ecc55806d4d
|
eeefa1276e83776f08704a3db4237423b0627e20 |
|
13-Mar-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Update locations of registers after slow paths spilling. Change-Id: Id9aafcc13c1a085c17ce65d704c67b73f9de695d
|
fead4e4f397455aa31905b2982d4d861126ab89d |
|
13-Mar-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
[optimizing] Don't record None locations in the stack maps. - moved environment recording from code generator to stack map stream - added creation/loading factory methods for the DexRegisterMap (hides internal details) - added new tests Change-Id: Ic8b6d044f0d8255c6759c19a41df332ef37876fe
|
a8ac9130b872c080299afacf5dcaab513d13ea87 |
|
13-Mar-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Refactor code in preparation of correct stack maps in slow path. Move the logic of saving/restoring live registers in slow path in the SlowPathCode method. Also add a RecordPcInfo helper to SlowPathCode, that will act as the placeholder of saving correct stack maps. Change-Id: I25c2bc7a642ef854bbc8a3eb570e5c8c8d2d030c
|
234d69d075d1608f80adb647f7935077b62b6376 |
|
09-Mar-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Revert "Revert "[optimizing] Enable x86 long support."" This reverts commit 154552e666347d41d95d7619c6ee56249ff4feca. Change-Id: Idc726551c249a888b7ff5fde8508ae50e81b2e13
|
154552e666347d41d95d7619c6ee56249ff4feca |
|
06-Mar-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Revert "[optimizing] Enable x86 long support." Few libcore failures. This reverts commit b4ba354cf8d22b261205494875cc014f18587b50. Change-Id: I4a28d853e730dff9b69aec9555505803cf2fcd63
|
b4ba354cf8d22b261205494875cc014f18587b50 |
|
05-Mar-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
[optimizing] Enable x86 long support. Change-Id: I9006972a65a1f191c45691104a960366747f9d16
|
5f8741860d465410bfed495dbb5f794590d338da |
|
04-Mar-2015 |
Mark Mendell <mark.p.mendell@intel.com> |
[optimizing] Use callee-save registers for x86 Add ESI, EDI, EBP to available registers for non-baseline mode. Ensure that they aren't used when byte addressible registers are needed. Change-Id: Ie7130d4084c2ae9cfcd1e47c26eb3e5dcac1ebd6 Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>
|
d6138ef1ea13d07ae555542f8898b30d89e9ac9a |
|
18-Feb-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Ensure the graph is correctly typed. We used to be forgiving because of HIntConstant(0) also being used for null. We now create a special HNullConstant for such uses. Also, we need to run the dead phi elimination twice during ssa building to ensure the correctness. Change-Id: If479efa3680d3358800aebb1cca692fa2d94f6e5
|
aa9b7c48069699e2aabedc6c0f62cb131fee0c73 |
|
17-Feb-2015 |
Roland Levillain <rpl@google.com> |
Have the opt. compiler set the size of "empty" frames to zero. This is to mimic Quick's behavior and honor stack frame alignment constraints after changes introduced by Change-Id I0fdb31e8c631e99091b818874a558c9aa04b1628. This issue use to make oatdump crash on oat files produced by the optimized compiler (e.g. out/host/linux-x86/framework/x86_64/core-optimizing.oat). Change-Id: I8ba52601edb0a0993eaf8923eba55aafdce5043e
|
dc23d8318db08cb42e20f1d16dbc416798951a8b |
|
16-Feb-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Avoid generating jmp +0. When a block branches to a non-following block, but blocks in-between do branch to it, we can avoid doing the branch. Change-Id: I9b343f662a4efc718cd4b58168f93162a24e1219
|
c0572a451944f78397619dec34a38c36c11e9d2a |
|
06-Feb-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Optimize leaf methods. Avoid suspend checks and stack changes when not needed. Change-Id: I0fdb31e8c631e99091b818874a558c9aa04b1628
|
4c204bafbc8d596894f8cb8ec696f5be1c6f12d8 |
|
03-Feb-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Use a different block order when not compiling baseline. Use the linearized order instead, as it puts blocks logically next to each other in a better way. Also, it does not contain dead blocks. Change-Id: Ie65b56041a093c8155e6c1e06351cb36a4053505
|
4dee636d21d9ce54386cdfbb824e5eb2a9c1af0d |
|
23-Jan-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Support callee-save registers on ARM. Change-Id: I7c519b7a828c9891b1141a8e51e12d6a8bc84118
|
d97dc40d186aec46bfd318b6a2026a98241d7e9c |
|
22-Jan-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Support callee save floating point registers on x64. - Share the computation of core_spill_mask and fpu_spill_mask between backends. - Remove explicit stack overflow check support: we need to adjust them and since they are not tested, they will easily bitrot. Change-Id: I0b619b8de4e1bdb169ea1ae7c6ede8df0d65837a
|
988939683c26c0b1c8808fc206add6337319509a |
|
21-Jan-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Enable core callee-save on x64. Will work on other architectures and FP support in other CLs. Change-Id: I8cef0343eedc7202d206f5217fdf0349035f0e4d
|
77520bca97ec44e3758510cebd0f20e3bb4584ea |
|
12-Jan-2015 |
Calin Juravle <calin@google.com> |
Record implicit null checks at the actual invoke time. ImplicitNullChecks are recorded only for instructions directly (see NB below) preceeded by NullChecks in the graph. This way we avoid recording redundant safepoints and minimize the code size increase. NB: ParallalelMoves might be inserted by the register allocator between the NullChecks and their uses. These modify the environment and the correct action would be to reverse their modification. This will be addressed in a follow-up CL. Change-Id: Ie50006e5a4bd22932dcf11348f5a655d253cd898
|
cd6dffedf1bd8e6dfb3fb0c933551f9a90f7de3f |
|
08-Jan-2015 |
Calin Juravle <calin@google.com> |
Add implicit null checks for the optimizing compiler - for backends: arm, arm64, x86, x86_64 - fixed parameter passing for CodeGenerator - 003-omnibus-opcodes test verifies that NullPointerExceptions work as expected Change-Id: I1b302acd353342504716c9169a80706cf3aba2c8
|
f85a9ca9859ad843dc03d3a2b600afbaf2e9bbdd |
|
13-Jan-2015 |
Mark Mendell <mark.p.mendell@intel.com> |
[optimizing compiler] Compute live spill size The current stack frame calculation assumes that each live register to be saved/restored has the word size of the machine. This fails for X86, where a double in an XMM register takes up 8 bytes. Change the calculation to keep track of the number of core registers and number of fp registers to handle this distinction. This is slightly pessimal, as the registers may not be active at the same time, but the only way to handle this would be to allocate both classes of registers simultaneously, or remember all the active intervals, matching them up and compute the size of each safepoint interval. Change-Id: If7860aa319b625c214775347728cdf49a56946eb Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>
|
12df9ebf72255544b0147c81b1dca6644a29764e |
|
09-Jan-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Move code around in OptimizingCompiler::Compile to reduce stack space. Also fix an (intentional) memory leak, by allocating the CodeGenerator on the heap instead of the arena: they construct an Assembler object that requires destruction. BUG:18787334 Change-Id: I8cf0667cb70ce5b14d4ac334bd4487a562635f1b
|
840e5461a85f8908f51e7f6cd562a9129ff0e7ce |
|
07-Jan-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Implement double and float support for arm in register allocator. The basic approach is: - An instruction that needs two registers gets two intervals. - When allocating the low part, we also allocate the high part. - When splitting a low (or high) interval, we also split the high (or low) equivalent. - Allocation follows the (S/D register) requirement that low registers are always even and the high equivalent is low + 1. Change-Id: I06a5148e05a2ffc7e7555d08e871ed007b4c2797
|
3416601a9e9be81bb7494864287fd3602d18ef13 |
|
19-Dec-2014 |
Calin Juravle <calin@google.com> |
Look at instruction set features when generating volatiles code Change-Id: Ia882405719fdd60b63e4102af7e085f7cbe0bb2a
|
e21dc3db191df04c100620965bee4617b3b24397 |
|
09-Dec-2014 |
Andreas Gampe <agampe@google.com> |
ART: Swap-space in the compiler Introduce a swap-space and corresponding allocator to transparently switch native allocations to memory backed by a file. Bug: 18596910 (cherry picked from commit 62746d8d9c4400e4764f162b22bfb1a32be287a9) Change-Id: I131448f3907115054a592af73db86d2b9257ea33
|
5b4b898ed8725242ee6b7229b94467c3ea3054c8 |
|
18-Dec-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Revert "Don't block quick callee saved registers for optimizing." X64 has one libcore test failing, and codegen_test on arm is failing. This reverts commit 6004796d6c630696127df2494dcd4f30d1367a34. Change-Id: I20e00431fa18e11ce4c0cb6fffa91977fa8e9b4f
|
6004796d6c630696127df2494dcd4f30d1367a34 |
|
15-Dec-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Don't block quick callee saved registers for optimizing. This change builds on: https://android-review.googlesource.com/#/c/118983/ - Also fix x86_64 assembler bug triggered by this change. - Fix (and improve) x86's backend byte register usage. - Fix a bug in baseline register allocator: a fixed out register must prevent inputs from allocating it. Change-Id: I4883862e29b4e4b6470f1823cf7eab7e7863d8ad
|
624279f3c70f9904cbaf428078981b05d3b324c0 |
|
04-Dec-2014 |
Roland Levillain <rpl@google.com> |
Add support for float-to-long in the optimizing compiler. - Add support for the float-to-long Dex instruction in the optimizing compiler. - Add a Dex PC field to art::HTypeConversion to allow the x86 and ARM code generators to produce runtime calls. - Instruct art::CodeGenerator::RecordPcInfo not to record PC information for HTypeConversion instructions. - Add S0 to the list of ARM FPU parameter registers. - Have art::x86_64::X86_64Assembler::cvttss2si work with 64-bit operands. - Generate x86, x86-64 and ARM (but not ARM64) code for float to long HTypeConversion nodes. - Add related tests to test/422-type-conversion. Change-Id: I954214f0d537187883f83f7a83a1bb2dd8a21fd4
|
3f8f936aff35f29d86183d31c20597ea17e9789d |
|
02-Dec-2014 |
Roland Levillain <rpl@google.com> |
Add support for float-to-int in the optimizing compiler. - Add support for the float-to-int Dex instruction in the optimizing compiler. - Factor type conversion related lines in compiler/optimizing/builder.cc. - Generate x86, x86-64 and ARM (but not ARM64) code for float to int HTypeConversion nodes. - Add related tests to test/422-type-conversion. Change-Id: I2382dfc04bf394ed75f675148cfcf98216d65bc6
|
32f5b4d2c8c9b52e9522941c159577b21752d0fa |
|
25-Nov-2014 |
Serban Constantinescu <serban.constantinescu@arm.com> |
Vixl: Update the VIXL interface to VIXL 1.7 and enable VIXL debug. This patch updates the interface to VIXL 1.7 and enables the debug version of VIXL when ART is built in debug mode. Change-Id: I443fb941bec3cffefba7038f93bb972e6b7d8db5 Signed-off-by: Serban Constantinescu <serban.constantinescu@arm.com>
|
6d0e483dd2e0b63e952de060738c10e2abd12ff7 |
|
27-Nov-2014 |
Roland Levillain <rpl@google.com> |
Add support for long-to-float in the optimizing compiler. - Add support for the long-to-float Dex instruction in the optimizing compiler. - Have art::x86_64::X86_64Assembler::cvtsi2ss work with 64-bit operands. - Generate x86, x86-64 and ARM (but not ARM64) code for long to float HTypeConversion nodes. - Add related tests to test/422-type-conversion. Change-Id: Ic983cbeb1ae2051add40bc519a8f00a6196166c9
|
900f6eb15db5215deeea23e4e087b553b4f696f7 |
|
17-Nov-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Fix lint error. Change-Id: Ia0fa12f2208507b6bec0581edf4345025b877580
|
af07bc121121d7bd7e8329c55dfe24782207b561 |
|
12-Nov-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Minor object store optimizations. - Avoid emitting write barrier when the value is null. - Do not do a typecheck on an arraystore when storing something that was loaded from the same array. Change-Id: I902492928692e4553b5af0fc99cce3c2186c442a
|
d582fa4ea62083a7598dded5b82dc2198b3daac7 |
|
06-Nov-2014 |
Ian Rogers <irogers@google.com> |
Instruction set features for ARM64, MIPS and X86. Also, refactor how feature strings are handled so they are additive or subtractive. Make MIPS have features for FPU 32-bit and MIPS v2. Use in the quick compiler rather than #ifdefs that wouldn't have worked in cross-compilation. Add SIMD features for x86/x86-64 proposed in: https://android-review.googlesource.com/#/c/112370/ Bug: 18056890 Change-Id: Ic88ff84a714926bd277beb74a430c5c7d5ed7666
|
f0e3937b87453234d0d7970b8712082062709b8d |
|
12-Nov-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Do a parallel move in BoundsCheckSlowPath. The two locations of the index and length could overlap, so we need a parallel move. Also factorize the code for doing a parallel move based on two locations. Change-Id: Iee8b3459e2eed6704d45e9a564fb2cd050741ea4
|
de58ab2c03ff8112b07ab827c8fa38f670dfc656 |
|
05-Nov-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Implement try/catch/throw in optimizing. - We currently don't run optimizations in the presence of a try/catch. - We therefore implement Quick's mapping table. - Also fix a missing null check on array-length. Change-Id: I6917dfcb868e75c1cf6eff32b7cbb60b6cfbd68f
|
6a3c1fcb4ba42ad4d5d142c17a3712a6ddd3866f |
|
31-Oct-2014 |
Ian Rogers <irogers@google.com> |
Remove -Wno-unused-parameter and -Wno-sign-promo from base cflags. Fix associated errors about unused paramenters and implict sign conversions. For sign conversion this was largely in the area of enums, so add ostream operators for the effected enums and fix tools/generate-operator-out.py. Tidy arena allocation code and arena allocated data types, rather than fixing new and delete operators. Remove dead code. Change-Id: I5b433e722d2f75baacfacae4d32aef4a828bfe1b
|
19a19cffd197a28ae4c9c3e59eff6352fd392241 |
|
22-Oct-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Add support for static fields in optimizing compiler. Change-Id: Id2f010589e2bd6faf42c05bb33abf6816ebe9fa9
|
3c03503d66df3b4440f851ae7d0c4fae5e7872df |
|
28-Oct-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Follow-up CL after hard float changes. Addressing comments from Zheng Xu. Change-Id: I8c599cdfab03373e82a1b90b711005c490bc6ca0
|
1ba0f596e9e4ddd778ab431237d11baa85594eba |
|
27-Oct-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Support hard float on arm in optimizing compiler. Also bump oat version, needed after latest hard float switch. Change-Id: Idf5acfb36c07e74acff00edab998419a3c6b2965
|
102cbed1e52b7c5f09458b44903fe97bb3e14d5f |
|
15-Oct-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Implement register allocator for floating point registers. Also: - Fix misuses of emitting the rex prefix in the x86_64 assembler. - Fix movaps code generation in the x86_64 assembler. Change-Id: Ib6dcf6e7c4a9c43368cfc46b02ba50f69ae69cbe
|
92a73aef279be78e3c2b04db1713076183933436 |
|
16-Oct-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Don't use assembler classes in code_generator.h. The arm64 backend uses its own assembler and does not share the same classes as the other backends. To avoid conflicts or unnecessary mappings, just don't use those classes in the shared part of the code generator. Change-Id: I9e5fa40c1021d2e83a4ef14c52cd1ccd03f2f73d
|
71175b7f19a4f6cf9cc264feafd820dbafa371fb |
|
09-Oct-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Cleanup baseline register allocator. - Use three arrays for blocking regsters instead of one and computing offsets in that array.] - Don't pass blocked_registers_ to methods, just use the field. Change-Id: Ib698564c31127c59b5a64c80f4262394b8394dc6
|
56b9ee6fe1d6880c5fca0e7feb28b25a1ded2e2f |
|
09-Oct-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Stop converting from Location to ManagedRegister. Now the source of truth is the Location object that knows which register (core, pair, fpu) it needs to refer to. Change-Id: I62401343d7479ecfb24b5ed161ec7829cda5a0b1
|
7fb49da8ec62e8a10ed9419ade9f32c6b1174687 |
|
06-Oct-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Add support for floats and doubles. - Follows Quick conventions. - Currently only works with baseline register allocator. Change-Id: Ie4b8e298f4f5e1cd82364da83e4344d4fc3621a3
|
3c04974a90b0e03f4b509010bff49f0b2a3da57f |
|
24-Sep-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Optimize suspend checks in optimizing compiler. - Remove the ones added during graph build (they were added for the baseline code generator). - Emit them at loop back edges after phi moves, so that the test can directly jump to the loop header. - Fix x86 and x86_64 suspend check by using cmpw instead of cmpl. Change-Id: I6fad5795a55705d86c9e1cb85bf5d63dadfafa2a
|
3bca0df855f0e575c6ee020ed016999fc8f14122 |
|
19-Sep-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Support for saving and restoring live registers in a slow path. And use it in suspend check slow paths. Change-Id: I79caf28f334c145a36180c79a6e2fceae3990c31
|
3946844c34ad965515f677084b07d663d70ad1b8 |
|
02-Sep-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Runtime support for the new stack maps for the opt compiler. Now most of the methods supported by the compiler can be optimized, instead of using the baseline. Change-Id: I80ab36a34913fa4e7dd576c7bf55af63594dc1fa
|
4361beff5bc540c43ab7c072c99994adc4ed78f9 |
|
20-Aug-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Fix bug introduced in https://android-review.googlesource.com/102610. Also make oatdump work again. Change-Id: Iab96971645f40585bc04769d410f2273d3977f51
|
e3ea83811d47152c00abea24a9b420651a33b496 |
|
08-Aug-2014 |
Yevgeny Rouban <yevgeny.y.rouban@intel.com> |
ART source line debug info in OAT files OAT files have source line information enough for ART runtime needs like jump to/from interpreter and thread suspension. But this information is not enough for finer grained source level debugging and low-level profiling (VTune or perf). This patch adds to OAT files two additional sections: .debug_line - DWARF formatted Elf32 section with detailed source line information (mapping from native PC to Java source lines). In addition to the debugging symbols added using the dex2oat option --include-debug-symbols, the source line information is added to the section .debug_line. The source line info can be read by many Elf reading tools like objdump, readelf, dwarfdump, gdb, perf, VTune, ... gdb can use this debug line information in x86. In 64-bit mode the information can be used if the oat file is mapped in the lower address space (address has higher 32 bits zeroed). Relocation works. Testing: 1. art/test/run-test --host --gdb [--64] 001-HelloWorld 2. in gdb: break Main.java:19 3. in gdb: break Runtime.java:111 4. in gdb: run - stops at void java.lang.Runtime.<init>() 5. in gdb: backtrace - shows call stack down to main() 6. in gdb: continue - stops at void Main.main() (only in 32-bit mode) 7. in gdb: backtrace - shows call stack down to main() 8. objdump -W <oat-file> - addresses are from VMA range of .text section reported by objdump -h <file> 9. dwarfdump -ka <oat-file> - no errors expected Size of aosp-x86-eng boot.oat increased by 11% from 80.5Mb to 89.2Mb with two sections added .debug_line (7.2Mb) and .rel.debug (1.5Mb). Change-Id: Ib8828832686e49782a63d5529008ff4814ed9cda Signed-off-by: Yevgeny Rouban <yevgeny.y.rouban@intel.com>
|
73e80c3ae76fafdb53afe3a85306dcb491fb5b00 |
|
22-Jul-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Make unit test tell if a method is a leaf. The runtime is not initialized completely in gtests, so we cannot run code (such as explicit stack overflow checks) that look at tls values. Change-Id: I74a4449b01eb203f1b411dda700e9459878d0d55
|
f12feb8e0e857f2832545b3f28d31bad5a9d3903 |
|
17-Jul-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Stack overflow checks and NPE checks for optimizing. Change-Id: I59e97448bf29778769b79b51ee4ea43f43493d96
|
ab032bc1ff57831106fdac6a91a136293609401f |
|
15-Jul-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Fix a braino in the stack layout. Also do some refactoring to have this code be just in CodeGenerator. Change-Id: I88de109889138af8d60027973c12a64bee813cb7
|
e50383288a75244255d3ecedcc79ffe9caf774cb |
|
04-Jul-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Support fields in optimizing compiler. - Required support for temporaries, to be only used by baseline compiler. - Also fixed a few invalid assumptions around locations and instructions that don't need materialization. These instructions should not have an Out. Change-Id: Idc4a30dd95dd18015137300d36bec55fc024cf62
|
412f10cfed002ab617c78f2621d68446ca4dd8bd |
|
19-Jun-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Support longs in the register allocator for x86_64. Change-Id: I7fb6dfb761bc5cf9e5705682032855a0a70ca867
|
e27f31a81636ad74bd3376ee39cf215941b85c0e |
|
12-Jun-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Enable the register allocator on ARM. - Also fixes a few bugs/wrong assumptions in code not hit by x86. - We need to differentiate between moves due to connecting siblings within a block, and moves due to control flow resolution. Change-Id: Idd05cf138a71c8f36f5531c473de613c0166fe38
|
86dbb9a12119273039ce272b41c809fa548b37b6 |
|
04-Jun-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Final CL to enable register allocation on x86. This CL implements: 1) Resolution after allocation: connecting the locations allocated to an interval within a block and between blocks. 2) Handling of fixed registers: some instructions require inputs/output to be at a specific location, and the allocator needs to deal with them in a special way. 3) ParallelMoveResolver::EmitNativeCode for x86. Change-Id: I0da6bd7eb66877987148b87c3be6a983b4e3f858
|
9cf35523764d829ae0470dae2d5dd99be469c841 |
|
09-Jun-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Add x86_64 support to the optimizing compiler. Change-Id: I4462d9ae15be56c4a3dc1bd4d1c0c6548c1b94be
|
ffddfdf6fec0b9d98a692e27242eecb15af5ead2 |
|
03-Jun-2014 |
Tim Murray <timmurray@google.com> |
DO NOT MERGE Merge ART from AOSP to lmp-preview-dev. Change-Id: I0f578733a4b8756fd780d4a052ad69b746f687a9
|
a7062e05e6048c7f817d784a5b94e3122e25b1ec |
|
22-May-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Add a linear scan register allocator to the optimizing compiler. This is a "by-the-book" implementation. It currently only deals with allocating registers, with no hint optimizations. The changes remaining to make it functional are: - Allocate spill slots. - Resolution and placements of Move instructions. - Connect it to the code generator. Change-Id: Ie0b2f6ba1b98da85425be721ce4afecd6b4012a4
|
4e3d23aa1523718ea1fdf3a32516d2f9d81e84fe |
|
22-May-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Import Dart's parallel move resolver. And write a few tests while at it. A parallel move resolver will be needed for performing multiple moves that are conceptually parallel, for example moves at a block exit that branches to a block with phi nodes. Change-Id: Ib95b247b4fc3f2c2fcab3b8c8d032abbd6104cd7
|
804d09372cc3d80d537da1489da4a45e0e19aa5d |
|
02-May-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Build live-in, live-out and kill sets for each block. This information will be used when computing live ranges of instructions. Change-Id: I345ee833c1ccb4a8e725c7976453f6d58d350d74
|
a7aca370a7d62ca04a1e24423d90e8020d6f1a58 |
|
28-Apr-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Setup policies for register allocation. Change-Id: I857e77530fca3e2fb872fc142a916af1b48400dc
|
c32e770f21540e4e9eda6dc7f770e745d33f1b9f |
|
24-Apr-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Add a Transform to SSA phase to the optimizing compiler. Change-Id: Ia9700756a0396d797a00b529896487d52c989329
|
a747a392fb5f88d2ecc4c6021edf9f1f6615ba16 |
|
17-Apr-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Code cleanup in preparation for x64 backend. - Use InvokeDexCallingConventionVisitor for setting up HParameterValues - Use kVregSize instead of kX86WordSize when dealing with virtual registers. Change-Id: Ia520223010194c70a3ff0ed659077f55cec4e7d8
|
01bc96d007b67fdb7fe349232a83e4b354ce3d08 |
|
11-Apr-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Long support in optimizing compiler. - Add stack locations to the Location class. - Change logic of parameter passing/setup by setting the location of such instructions the ones for the calling convention. Change-Id: I4730ad58732813dcb9c238f44f55dfc0baa18799
|
b55f835d66a61e5da6fc1895ba5a0482868c9552 |
|
07-Apr-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Test control flow instruction with optimizing compiler. Add support for basic instructions to implement these tests. Change-Id: I3870bf9301599043b3511522bb49dc6364c9b4c0
|
707c809f661554713edfacf338365adca8dfd3a3 |
|
04-Apr-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Use target-specific word instead of runtime word. Change-Id: Ia11dc3cc520a1a5c7bd017013e5699af9570ce91
|
4a34a428c6a2588e0857ef6baf88f1b73ce65958 |
|
03-Apr-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Support passing arguments to invoke-static* instructions. - Stop using the frame pointer for accessing locals. - Stop emulating a stack when doing code generation. Instead, rely on dex register model, where instructions only reference registers. Change-Id: Id51bd7d33ac430cb87a53c9f4b0c864eeb1006f9
|
8ccc3f5d06fd217cdaabd37e743adab2031d3720 |
|
19-Mar-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Add support for invoke-static in optimizing compiler. Support is limited to calls without parameters and returning void. For simplicity, we currently follow the Quick ABI. Change-Id: I54805161141b7eac5959f1cae0dc138dd0b2e8a5
|
92cf83e001357329cbf41fa15a6e053fab6f4933 |
|
18-Mar-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Run Java tests with the optimizing compiler. Also fix a vector.reserve -> vector.resize braino, and build a GC map that dex2oat expects. Change-Id: I6acf2f90a4c32f90b79bf7709bf2e43931b98757
|
787c3076635cf117eb646c5a89a9014b2072fb44 |
|
17-Mar-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Plug new optimizing compiler in compilation pipeline. Also rename accessors to ART's conventions. Change-Id: I344807055b98aa4b27215704ec362191464acecc
|
bab4ed7057799a4fadc6283108ab56f389d117d4 |
|
11-Mar-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
More code generation for the optimizing compiler. - Add HReturn instruction - Generate code for locals/if/return - Setup infrastructure for register allocation. Currently emulate a stack. Change-Id: Ib28c2dba80f6c526177ed9a7b09c0689ac8122fb
|
d4dd255db1d110ceb5551f6d95ff31fb57420994 |
|
28-Feb-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Add codegen support to the optimizing compiler. Change-Id: I9aae76908ff1d6e64fb71a6718fc1426b67a5c28
|