c393d63aa2b8f6984672fdd4de631bbeff14b6a2 |
|
15-Apr-2016 |
Alexandre Rames <alexandre.rames@linaro.org> |
Fix: correctly destruct VIXL labels. (cherry picked from commit c01a66465a398ad15da90ab2bdc35b7f4a609b17) Bug: 27505766 Change-Id: I077465e3d308f4331e7a861902e05865f9d99835
|
dee58d6bb6d567fcd0c4f39d8d690c3acaf0e432 |
|
07-Apr-2016 |
David Brazdil <dbrazdil@google.com> |
Revert "Revert "Refactor HGraphBuilder and SsaBuilder to remove HLocals"" This patch merges the instruction-building phases from HGraphBuilder and SsaBuilder into a single HInstructionBuilder class. As a result, it is not necessary to generate HLocal, HLoadLocal and HStoreLocal instructions any more, as the builder produces SSA form directly. Saves 5-15% of arena-allocated memory (see bug for more data): GMS 20.46MB => 19.26MB (-5.86%) Maps 24.12MB => 21.47MB (-10.98%) YouTube 28.60MB => 26.01MB (-9.05%) This CL fixed an issue with parsing quickened instructions. Bug: 27894376 Bug: 27998571 Bug: 27995065 Change-Id: I20dbe1bf2d0fe296377478db98cb86cba695e694
|
60328910cad396589474f8513391ba733d19390b |
|
04-Apr-2016 |
David Brazdil <dbrazdil@google.com> |
Revert "Refactor HGraphBuilder and SsaBuilder to remove HLocals" Bug: 27995065 This reverts commit e3ff7b293be2a6791fe9d135d660c0cffe4bd73f. Change-Id: I5363c7ce18f47fd422c15eed5423a345a57249d8
|
e3ff7b293be2a6791fe9d135d660c0cffe4bd73f |
|
02-Mar-2016 |
David Brazdil <dbrazdil@google.com> |
Refactor HGraphBuilder and SsaBuilder to remove HLocals This patch merges the instruction-building phases from HGraphBuilder and SsaBuilder into a single HInstructionBuilder class. As a result, it is not necessary to generate HLocal, HLoadLocal and HStoreLocal instructions any more, as the builder produces SSA form directly. Saves 5-15% of arena-allocated memory (see bug for more data): GMS 20.46MB => 19.26MB (-5.86%) Maps 24.12MB => 21.47MB (-10.98%) YouTube 28.60MB => 26.01MB (-9.05%) Bug: 27894376 Change-Id: Iefe28d40600c169c5d306fd2c77034ae19476d90
|
cac5a7e871f1f346b317894359ad06fa7bd67fba |
|
22-Feb-2016 |
Vladimir Marko <vmarko@google.com> |
Optimizing: Improve const-string code generation. For strings in the boot image, use either direct pointers or pc-relative addresses. For other strings, use PC-relative access to the dex cache arrays for AOT and direct address of the string's dex cache slot for JIT. For aosp_flounder-userdebug: - 32-bit boot.oat: -692KiB (-0.9%) - 64-bit boot.oat: -948KiB (-1.1%) - 32-bit dalvik cache total: -900KiB (-0.9%) - 64-bit dalvik cache total: -3672KiB (-1.5%) (contains more files than the 32-bit dalvik cache) For aosp_flounder-userdebug forced to compile PIC: - 32-bit boot.oat: -380KiB (-0.5%) - 64-bit boot.oat: -928KiB (-1.0%) - 32-bit dalvik cache total: -468KiB (-0.4%) - 64-bit dalvik cache total: -1928KiB (-0.8%) (contains more files than the 32-bit dalvik cache) Bug: 26884697 Change-Id: Iec7266ce67e6fedc107be78fab2e742a8dab2696
|
2ae48182573da7087bffc2873730bc758ec29696 |
|
16-Mar-2016 |
Calin Juravle <calin@google.com> |
Clean up NullCheck generation and record stats about it. This removes redundant code from the generators and allows for easier stat recording. Change-Id: Iccd4368f9e9d87a6fecb863dee4e2145c97851c4
|
c7098ff991bb4e00a800d315d1c36f52a9cb0149 |
|
09-Feb-2016 |
David Srbecky <dsrbecky@google.com> |
Remove HNativeDebugInfo from start of basic blocks. We do not require full environment at the start of basic block. The dex pc contained in basic block is sufficient for line mapping. Change-Id: I5ba9e5f5acbc4a783ad544769f9a73bb33e2bafa
|
6e332529c33be4d7dae5dad3609a839f4c0d3bfc |
|
02-Feb-2016 |
David Brazdil <dbrazdil@google.com> |
ART: Remove HTemporary Change-Id: I21b984224370a9ce7a4a13a9652503cfb03c5f03
|
a19616e3363276e7f2c471eb2839fb16f1d43f27 |
|
02-Feb-2016 |
Aart Bik <ajcbik@google.com> |
Implemented compare/signum intrinsics as HCompare (with all code generation for all) Rationale: At HIR level, many more optimizations are possible, while ultimately generated code can take advantage of full semantics. Change-Id: I6e2ee0311784e5e336847346f7f3c4faef4fd17e
|
7c0b44f180f1b8cf82c568091d250071d1130954 |
|
01-Feb-2016 |
Mark Mendell <mark.p.mendell@intel.com> |
Support CMOV for x86_64 Select If possible, generate CMOV to implement HSelect. Tricky cases are an FP condition (no single CC generated), FP inputs (no FP CMOV) and when the condition is a boolean or not emitted at the use site. In these cases, keep using the existing HSelect code. Added Load32BitValue for int and FP and used that to remove code duplication. Added minimal checker test for int/long CMOV generation. Change-Id: Id71e515f0afa5a30f53c5de3a5244de1ea429aae Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>
|
c5d4754198aadb2ada2d3f5daacd10d79bc13f38 |
|
28-Jan-2016 |
Aart Bik <ajcbik@google.com> |
Implementation of integer intrinsics on x86_64 Rationale: Efficient implementations of common integer operations. Already tested in: 564-checker-bitcount 565-checker-rotate 566-checker-signum 567-checker-compare 568-checker-onebit (extended to deal with run-time zero) Change-Id: Ib48c76eee751e7925056d7f26797e9a9b5ae60dd
|
95e7ffc28ea4d6deba356e636b16120ae49b62e2 |
|
22-Jan-2016 |
Roland Levillain <rpl@google.com> |
Improve documentation and assertions of read barrier instrumentation. For ARM, x86, x86-64 back ends. The case of the ARM64 back end is already handled in https://android-review.googlesource.com/#/c/197870/. Bug: 12687968 Change-Id: I6df1128cc100cbdb89020876e1a54de719508be3
|
e3f43ac79e50a4693ea4d46acf5cffca64910cee |
|
19-Jan-2016 |
Roland Levillain <rpl@google.com> |
Some read barrier clean-up in Optimizing. These changes make the read barrier compiler instrumentation code more uniform among the ARM, ARM64, x86 and x86-64 back ends. Bug: 12687968 Change-Id: I6b1c0cf2bc22ed6cd6b14754136bef4a2a036ea5
|
58282f4510961317b8d5a364a6f740a78926716f |
|
14-Jan-2016 |
David Brazdil <dbrazdil@google.com> |
ART: Remove Baseline compiler We don't need Baseline any more and it hasn't been maintained for a while anyway. Let's remove it. Change-Id: I442ed26855527be2df3c79935403a25b1ee55df6
|
42249c3602c3d0243396ee3627ffb5906aa77c1e |
|
08-Jan-2016 |
Aart Bik <ajcbik@google.com> |
Reduce code size by sharing slow paths. Rationale: Sharing identical slow path code reduces code size. Background: Currently, slow paths with the same dex-pc, same physical register spilling code, and identical stack maps are shared (making this only useful for deopt slow paths). The newly introduced mechanism is sufficiently general to allow future improvements by e.g. allowing different dex-pc (by passing this to runtime) or even the kind of slow paths (by passing runtime addresses to the slowpath). Change-Id: I819615c47b4fd98440a241f681f93e4fc22d12e0
|
8a1c728e40813d30a85a1f27afaf16a3f105d32a |
|
29-Jun-2015 |
Mark Mendell <mark.p.mendell@intel.com> |
X86_64: Replace x86_64 xchg instruction use Replacing 'xchg' to exchange two registers with a three instruction move sequence using the 'TMP' register r10 seems to be a big win. This is because xchg is a serializing instruction, even when used on registers. Change-Id: I1c0f7687630936e7f3d2efc4b30ad11233bd484c Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>
|
152408f8c2188a7ed950cad04883b2f67dc74e84 |
|
31-Dec-2015 |
Mark Mendell <mark.p.mendell@intel.com> |
X86: templatize GenerateTestAndBranch and friends Allow the use of NearLabel as well as Label. This will be used by the HSelect patch. Replace a couple of Label(s) with NearLabel(s) as well. Change-Id: I8e674c89e691bcdbccf4a5cdc07ad13b29ec21dd Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>
|
5f7b58ea1adfc0639dd605b65f59198d3763f801 |
|
23-Nov-2015 |
Vladimir Marko <vmarko@google.com> |
Rewrite HInstruction::Is/As<type>(). Make Is<type>() and As<type>() non-virtual for concrete instruction types, relying on GetKind(), and mark GetKind() as PURE to improve optimization opportunities. This reduces the number of relocations in libart-compiler.so's .rel.dyn section by ~4K, or ~44%, and in .data.rel.ro by ~18K, or ~65%. The file is 96KiB smaller for Nexus 5, including 8KiB reduction of the .text section. Unfortunately, the g++/clang++ __attribute__((pure)) is not strong enough to avoid duplicated virtual calls and we would need the C++ [[pure]] attribute proposed in n3744 instead. To work around this deficiency, we introduce an extra non-virtual indirection for GetKind(), so that the compiler can optimize common expressions such as instruction->IsAdd() || instruction->IsSub() or instruction->IsAdd() && instruction->AsAdd()->... which contain two virtual calls to GetKind() after inlining. Change-Id: I83787de0671a5cb9f5b0a5f4a536cef239d5b401
|
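The non-virtual Is/As pattern described in the entry above can be sketched roughly as follows. This is a reduced, hypothetical model: the class names, the `Kind` enum and `GetKindInternal()` are illustrative assumptions, not code copied from ART.

```cpp
#include <cassert>

// Hypothetical, reduced model of an instruction hierarchy; the names below
// are illustrative and are not taken from the ART sources.
enum class Kind { kAdd, kSub };

class Add;

class Instruction {
 public:
  virtual ~Instruction() = default;

  // Non-virtual wrapper around the single virtual call. After inlining,
  // repeated GetKind() calls become ordinary expressions in the caller.
  Kind GetKind() const { return GetKindInternal(); }

  // Non-virtual type tests implemented on top of GetKind().
  bool IsAdd() const { return GetKind() == Kind::kAdd; }
  bool IsSub() const { return GetKind() == Kind::kSub; }

  // Returns this instruction as an Add, or nullptr if the kind differs.
  const Add* AsAdd() const;

 private:
  virtual Kind GetKindInternal() const = 0;
};

class Add : public Instruction {
 private:
  Kind GetKindInternal() const override { return Kind::kAdd; }
};

class Sub : public Instruction {
 private:
  Kind GetKindInternal() const override { return Kind::kSub; }
};

inline const Add* Instruction::AsAdd() const {
  return IsAdd() ? static_cast<const Add*>(this) : nullptr;
}

// Usage sketch: the type test and the checked cast agree with each other.
inline void DemoIsAs() {
  Add add;
  assert(add.IsAdd() && add.AsAdd() == &add);
}
```

Only `GetKindInternal()` is virtual here, so each concrete class contributes one virtual slot instead of one per Is/As pair.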
17077d888a6752a2e5f8161eee1b2c3285783d12 |
|
16-Dec-2015 |
Mark P Mendell <mark.p.mendell@intel.com> |
Revert "Revert "X86: Use locked add rather than mfence"" This reverts commit 0da3b9117706760e8722029f407da6d0297cc943. Fix a compilation failure that slipped in somehow. Change-Id: Ide8681cdc921febb296ea47aa282cc195f154049
|
0da3b9117706760e8722029f407da6d0297cc943 |
|
16-Dec-2015 |
Aart Bik <ajcbik@google.com> |
Revert "X86: Use locked add rather than mfence" This reverts commit 7b3e4f99b25c31048a33a08688557b133ad345ab. Reason: build error on sdk (linux) in git_mirror-aosp-master-with-vendor , please fix first art/compiler/optimizing/code_generator_x86_64.cc:4032:7: error: use of undeclared identifier 'codegen_' codegen_->MemoryFence(); Change-Id: I91f8542cfd944b7425d1981c35872dcdcb901e18
|
7b3e4f99b25c31048a33a08688557b133ad345ab |
|
19-Nov-2015 |
Mark Mendell <mark.p.mendell@intel.com> |
X86: Use locked add rather than mfence Java semantics for memory ordering can be satisfied using lock addl $0,0(SP) rather than mfence. The locked add synchronizes the memory caches, but doesn't affect device memory. Timing on a micro benchmark with a mfence or lock add $0,0(sp) in a loop with 600000000 iterations: time ./mfence real 0m5.411s user 0m5.408s sys 0m0.000s time ./locked_add real 0m3.552s user 0m3.550s sys 0m0.000s Implement this as an instruction-set-feature lock_add. This is off by default (uses mfence), and enabled for atom & silvermont variants. Generation of mfence can be forced by a parameter to MemoryFence. Change-Id: I5cb4fded61f4cbbd7b7db42a1b6902e43e458911 Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>
|
1e7f8db01a929ac816ca122868edc067c3c6cd17 |
|
15-Dec-2015 |
Roland Levillain <rpl@google.com> |
x86-64 Baker's read barrier fast path implementation. Introduce an x86-64 fast path implementation in Optimizing for Baker's read barriers (for both heap reference loads and GC root loads). The marking phase of the read barrier is performed by a slow path, invoking the runtime entry point artReadBarrierMark. Other read barrier algorithms continue to use the original slow path based implementation, which has been renamed as GenerateReadBarrierSlow/GenerateReadBarrierForRootSlow. Bug: 12687968 Change-Id: I9329293ddca7f9bcb512132bde6675aa202b98b2
|
a4f1220c1518074db18ca1044e9201492975750b |
|
06-Aug-2015 |
Mark Mendell <mark.p.mendell@intel.com> |
Optimizing: Add direct calls to math intrinsics Support the double forms of: cos, sin, acos, asin, atan, atan2, cbrt, cosh, exp, expm1, hypot, log, log10, nextAfter, sinh, tan, tanh Add these entries to the vector addressed off the thread pointer. Call the libc routines directly, which means that we have to implement the native ABI, not the ART one. For x86_64, that includes saving XMM12-15 as the native ABI considers them caller-save, while the ART ABI considers them callee-save. We save them by marking them as used by the call to the math function. For x86, this is not an issue, as all the XMM registers are caller-save. Other architectures will call Java as before until they are ready to implement the new intrinsics. Bump the OAT version since we are incompatible with old boot.oat files. Change-Id: Ic6332c3555c09393a17d1ad4daf62932488722fb Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>
|
0debae7bc89eb05f7a2bf7dccd223318fad7c88d |
|
12-Nov-2015 |
David Brazdil <dbrazdil@google.com> |
ART: Refactor GenerateTestAndBranch Each code generator implements a method for generating condition evaluation and branching to arbitrary labels. This patch refactors it for better clarity but also to generate fewer jumps when the true branch is the fallthrough successor. This is preliminary work for implementing HSelect. Change-Id: Iaa545a5ecbacb761c5aa241fa69140cf6eb5952f
|
0d5a281c671444bfa75d63caf1427a8c0e6e1177 |
|
13-Nov-2015 |
Roland Levillain <rpl@google.com> |
x86/x86-64 read barrier support for concurrent GC in Optimizing. This first implementation uses slow paths to instrument heap reference loads and GC root loads for the concurrent copying collector, respectively calling the artReadBarrierSlow and artReadBarrierForRootSlow (new) runtime entry points. Notes: - This implementation does not instrument HInvokeVirtual nor HInvokeInterface instructions (for class reference loads), as the corresponding read barriers are not strictly required with the current concurrent copying collector. - Intrinsics which may eventually call (on slow path) are disabled when read barriers are enabled, as the current slow path infrastructure does not support this case. - When read barriers are enabled, the code generated for a HArraySet instruction always goes into the array set slow path for object arrays (delegating the operation to the runtime), as we are lacking a mechanism to keep a temporary register live across a runtime call (needed for the instrumentation of type checking code, which requires two successive read barriers). Bug: 12687968 Change-Id: I14cd6107233c326389120336f93955b28ffbb329
|
0f7dca4ca0be8d2f8776794d35edf8b51b5bc997 |
|
02-Nov-2015 |
Vladimir Marko <vmarko@google.com> |
Optimizing/X86: PC-relative dex cache array addressing. Add PC-relative dex cache array addressing for X86 and use it for better invoke-static/-direct dispatch. Also delay the initialization to the PC-relative base until needed. Change-Id: Ib8634d5edce4920cd70172fd13211809cf6948d1
|
ea5af68d6dda832bdfb5978a0c5d6f86a3f67e80 |
|
22-Oct-2015 |
Mark Mendell <mark.p.mendell@intel.com> |
X86-64: Split long/double constant array/field set A long constant needs to be in a register to store to memory. By allowing stores of constants that are outside of the range of int32_t, we reduce register usage. Also support sets of float/double constants by using integer stores. Rename RegisterOrInt32LongConstant to RegisterOrInt32Constant as it now handles any type of constant. Change-Id: I025d9ef889a5a433e45aa03b376bae40f14197d2 Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>
|
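As a rough illustration of the entry above, a 64-bit constant that does not fit in an int32_t can be written to memory as two 32-bit halves, and a double can be handled the same way through its bit pattern. The helper names below are hypothetical and the memory model is a plain byte buffer, so this is a sketch of the idea rather than the code generator's actual logic.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical helpers, named for illustration only.
inline uint32_t Low32Bits(uint64_t value) { return static_cast<uint32_t>(value); }
inline uint32_t High32Bits(uint64_t value) { return static_cast<uint32_t>(value >> 32); }

// Reinterprets a double as its 64-bit pattern so that it can be written
// with integer stores.
inline uint64_t DoubleBits(double value) {
  uint64_t bits;
  static_assert(sizeof(bits) == sizeof(value), "double must be 64 bits wide");
  std::memcpy(&bits, &value, sizeof(bits));
  return bits;
}

// Models storing a 64-bit constant as two 32-bit immediate stores:
// the low half at `address`, the high half at `address + 4`.
// No register is needed to hold the full 64-bit value.
inline void StoreConstantAsTwoInts(uint8_t* address, uint64_t bits) {
  assert(address != nullptr);
  const uint32_t low = Low32Bits(bits);
  const uint32_t high = High32Bits(bits);
  std::memcpy(address, &low, sizeof(low));
  std::memcpy(address + sizeof(low), &high, sizeof(high));
}
```

The low-half-first layout matches a little-endian target such as x86-64; that ordering is an assumption of this sketch.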
dc151b2346bb8a4fdeed0c06e54c2fca21d59b5d |
|
15-Oct-2015 |
Vladimir Marko <vmarko@google.com> |
Optimizing: Determine invoke-static/-direct dispatch early. Determine the dispatch type of invoke-static/-direct in a special pass right after the type inference. This allows the inliner to pass the "needs dex cache" check and inline more. It also allows the code generator to avoid requesting a register location for the ArtMethod* for kDexCachePcRelative and direct methods. The supported dispatch check handles also situations that the CompilerDriver currently doesn't allow. The cleanup of the CompilerDriver and required changes to Quick will come in a separate change. Change-Id: I3f8e903a119949e95871d8ab0a995f4731a13a07
|
9c86b485bc6169eadf846dd5f7cdf0958fe1eb23 |
|
18-Sep-2015 |
Mark Mendell <mark.p.mendell@intel.com> |
X86_64 jump tables for PackedSwitch Implement PackedSwitch using a jump table of offsets to blocks. Bug: 24092914 Bug: 21119474 Change-Id: I83430086c03ef728d30d79b4022607e9245ef98f Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>
|
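The jump-table approach from the entry above can be modelled in a few lines. This is a hedged sketch: the `PackedSwitchTable` struct, its field names and the integer "offsets" standing in for block addresses are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical model of a packed switch: consecutive case values starting
// at `lower_bound`, each mapped to the offset of its target block.
struct PackedSwitchTable {
  int32_t lower_bound;
  std::vector<int32_t> offsets;  // One entry per case, in case order.
  int32_t default_offset;        // Target used when no case matches.
};

// Selects the target offset for `value` with one bias and one unsigned
// range check, followed by an indexed table load.
inline int32_t Dispatch(const PackedSwitchTable& table, int32_t value) {
  assert(!table.offsets.empty());
  // Subtracting in unsigned arithmetic makes values below lower_bound wrap
  // to large indices, so a single comparison covers both ends of the range.
  const uint32_t index =
      static_cast<uint32_t>(value) - static_cast<uint32_t>(table.lower_bound);
  if (index >= table.offsets.size()) {
    return table.default_offset;
  }
  return table.offsets[index];
}
```

Compared with a chain of compare-and-branch pairs, the cost of the lookup in this model does not grow with the number of cases.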
ee3cf0731d0ef0787bc2947c8e3ca432b513956b |
|
06-Oct-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Intrinsify System.arraycopy. Currently on x64, will do the other architectures in different changes. Change-Id: I15fbbadb450dd21787809759a8b14b21b1e42624
|
e460d1df1f789c7c8bb97024a8efbd713ac175e9 |
|
29-Sep-2015 |
Calin Juravle <calin@google.com> |
Revert "Revert "Support unresolved fields in optimizing"" The CL also changes the calling convention for 64bit static field set to use kArg2 instead of kArg1. This allows optimizing to keep the assumptions: - arm pairs are always of form (even_reg, odd_reg) - ecx_edx is not used as a register on x86. This reverts commit e6f49b47b6a4dc9c7684e4483757872cfc7ff1a1. Change-Id: I93159917565824084abc96775f31be1a4249f2f3
|
225b6464a58ebe11c156144653f11a1c6607f4eb |
|
28-Sep-2015 |
Vladimir Marko <vmarko@google.com> |
Optimizing: Tag arena allocations in code generators. And completely remove the deprecated GrowableArray. Replace GrowableArray with ArenaVector in code generators and related classes and tag arena allocations. Label arrays use direct allocations from ArenaAllocator because Label is non-copyable and non-movable and as such cannot be really held in a container. The GrowableArray never actually constructed them, instead relying on the zero-initialized storage from the arena allocator to be correct. We now actually construct the labels. Also avoid StackMapStream::ComputeDexRegisterMapSize() being passed null references, even though unused. Change-Id: I26a46fdd406b23a3969300a67739d55528df8bf4
|
85b62f23fc6dfffe2ddd3ddfa74611666c9ff41d |
|
09-Sep-2015 |
Andreas Gampe <agampe@google.com> |
ART: Refactor intrinsics slow-paths Refactor slow paths so that there is a default implementation for common cases (only arm64 with vixl is special). Write a generic intrinsic slow-path that can be reused for the specific architectures. Move helper functions into CodeGenerator so that they are accessible. Change-Id: Ibd788dce432601c6a9f7e6f13eab31f28dcb8550
|
e6f49b47b6a4dc9c7684e4483757872cfc7ff1a1 |
|
17-Sep-2015 |
Calin Juravle <calin@google.com> |
Revert "Support unresolved fields in optimizing" breaks debuggable tests. This reverts commit 23a8e35481face09183a24b9d11e505597c75ebb. Change-Id: I8e60b5c8f48525975f25d19e5e8066c1c94bd2e5
|
23a8e35481face09183a24b9d11e505597c75ebb |
|
08-Sep-2015 |
Calin Juravle <calin@google.com> |
Support unresolved fields in optimizing Change-Id: I9941fa5fcb6ef0a7a253c7a0b479a44a0210aad4
|
175dc732c80e6f2afd83209348124df349290ba8 |
|
25-Aug-2015 |
Calin Juravle <calin@google.com> |
Support unresolved methods in Optimizing Change-Id: If2da02b50d2fa668cd58f134a005f1752e7746b1
|
fa6b93c4b69e6d7ddfa2a4ed0aff01b0608c5a3a |
|
15-Sep-2015 |
Vladimir Marko <vmarko@google.com> |
Optimizing: Tag arena allocations in HGraph. Replace GrowableArray with ArenaVector in HGraph and related classes HEnvironment, HLoopInformation, HInvoke and HPhi, and tag allocations with new arena allocation types. Change-Id: I3d79897af405b9a1a5b98bfc372e70fe0b3bc40d
|
bfb5ba90cd6425ce49c2125a87e3b12222cc2601 |
|
01-Sep-2015 |
Andreas Gampe <agampe@google.com> |
Revert "Revert "Do a second check for testing intrinsic types."" This reverts commit a14b9fef395b94fa9a32147862c198fe7c22e3d7. When an intrinsic with invoke-type virtual is recognized, replace the instruction with a new HInvokeStaticOrDirect. Minimal update for dex-cache rework. Fix includes. Change-Id: I1c8e735a2fa7cda4419f76ca0717125ef236d332
|
ecc4366670e12b4812ef1653f7c8d52234ca1b1f |
|
13-Aug-2015 |
Serban Constantinescu <serban.constantinescu@linaro.org> |
Add OptimizingCompilerStats to the CodeGenerator class. Just refactoring, not yet used, but will be used by the incoming patch series and future CodeGen specific stats. Change-Id: I7d20489907b82678120518a77bdab9c4cc58f937 Signed-off-by: Serban Constantinescu <serban.constantinescu@linaro.org>
|
581550137ee3a068a14224870e71aeee924a0646 |
|
19-Aug-2015 |
Vladimir Marko <vmarko@google.com> |
Revert "Revert "Optimizing: Better invoke-static/-direct dispatch."" Fixed kCallArtMethod to use correct callee location for kRecursive. This combination is used when compiling with debuggable flag set. This reverts commit b2c431e80e92eb6437788cc544cee6c88c3156df. Change-Id: Idee0f2a794199ebdf24892c60f8a5dcf057db01c
|
b2c431e80e92eb6437788cc544cee6c88c3156df |
|
19-Aug-2015 |
Vladimir Marko <vmarko@google.com> |
Revert "Optimizing: Better invoke-static/-direct dispatch." Reverting due to failing ndebug tests. This reverts commit 9b688a095afbae21112df5d495487ac5231b12d0. Change-Id: Ie4f69da6609df3b7c8443412b6cf7f5c43c2c5d9
|
9b688a095afbae21112df5d495487ac5231b12d0 |
|
06-May-2015 |
Vladimir Marko <vmarko@google.com> |
Optimizing: Better invoke-static/-direct dispatch. Add framework for different types of loading ArtMethod* and code pointer retrieval. Implement invoke-static and invoke-direct calls the same way as Quick. Document the dispatch kinds in HInvokeStaticOrDirect's new enumerations MethodLoadKind and CodePtrLocation. PC-relative loads from dex cache arrays are used only for x86-64 and arm64. The implementation for other architectures will be done in separate CLs. Change-Id: I468ca4d422dbd14748e1ba6b45289f0d31734d94
|
cfa410b0ea561318f74a76c5323f0f6cd8eaaa50 |
|
25-May-2015 |
Mark Mendell <mark.p.mendell@intel.com> |
[optimizing] More x86_64 code improvements Use the constant area some more, use 32-bit immediates in movq instructions when possible, and other small tweaks. Remove the commented out code for Math.Abs(float/double) as it would fail for baseline compiler due to the output being the same as the input. Change-Id: Ifa39f1865b94cec2e1c0a99af3066a645e9d3618 Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>
|
8158f28b6689314213eb4dbbe14166073be71f7e |
|
07-Aug-2015 |
Alexandre Rames <alexandre.rames@linaro.org> |
Ensure coherency of call kinds for LocationSummary. The coherency is enforced with checks added in the `InvokeRuntime` helper, that we now also use on x86 and x86_64. Change-Id: I8cb92b042f25dc3c5fd390e9c61a45b477d081f4
|
c470193cfc522fc818eb2eaab896aef9caf0c75a |
|
10-Apr-2015 |
Mark Mendell <mark.p.mendell@intel.com> |
Fuse long and FP compare & condition on x86/x86-64 in Optimizing. This is a preliminary implementation of fusing long/float/double compares with conditions to avoid materializing the result from the compare and condition. The information from a HCompare is transferred to the HCondition if it is legal. There must be only a single use of the HCompare, the HCompare and HCondition must be in the same block, the HCondition must not need materialization. Added GetOppositeCondition() to HCondition to return the flipped condition. Bug: 21120453 Change-Id: I1f1db206e6dc336270cd71070ed3232dedc754d6 Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>
|
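The legality conditions listed in the entry above, together with a condition-flipping helper, might look roughly like this. The structs and field names are hypothetical, and the sketch assumes that "flipped" means logical negation (EQ becomes NE, LT becomes GE); NaN handling for floating-point compares is not modelled.

```cpp
#include <cassert>

// Hypothetical condition codes, for illustration only.
enum class Condition { kEQ, kNE, kLT, kLE, kGT, kGE };

// Assumption: the opposite condition is the logical negation.
inline Condition GetOppositeCondition(Condition cond) {
  switch (cond) {
    case Condition::kEQ: return Condition::kNE;
    case Condition::kNE: return Condition::kEQ;
    case Condition::kLT: return Condition::kGE;
    case Condition::kLE: return Condition::kGT;
    case Condition::kGT: return Condition::kLE;
    case Condition::kGE: return Condition::kLT;
  }
  assert(false && "unreachable condition code");
  return cond;
}

// Hypothetical summaries of the two instructions involved in the fusion.
struct CompareInfo {
  int use_count;  // Number of users of the compare result.
  int block_id;   // Basic block containing the compare.
};

struct ConditionInfo {
  int block_id;                // Basic block containing the condition.
  bool needs_materialization;  // True if the boolean must live in a register.
};

// Fusion is legal only if the compare has a single use, both instructions
// are in the same block, and the condition is not materialized.
inline bool CanFuse(const CompareInfo& compare, const ConditionInfo& condition) {
  return compare.use_count == 1 &&
         compare.block_id == condition.block_id &&
         !condition.needs_materialization;
}
```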
fc6a86ab2b70781e72b807c1798b83829ca7f931 |
|
26-Jun-2015 |
David Brazdil <dbrazdil@google.com> |
Revert "Revert "ART: Implement try/catch blocks in Builder"" This patch enables the GraphBuilder to generate blocks and edges which represent the exceptional control flow when try/catch blocks are present in the code. Actual compilation is still delegated to Quick and Baseline ignores the additional code. To represent the relationship between try and catch blocks, Builder splits the edges which enter/exit a try block and links the newly created blocks to the corresponding exception handlers. This layout will later enable the SsaBuilder to correctly infer the dominators of the catch blocks and to produce the appropriate reverse post ordering. It will not, however, allow for building the complete SSA form of the catch blocks and consequently optimizing such blocks. To this end, a new TryBoundary control-flow instruction is introduced. Codegen treats it the same as a Goto but it allows for additional successors (the handlers). This reverts commit 3e18738bd338e9f8363b26bc895f38c0ec682824. Change-Id: I4f5ea961848a0b83d8db3673763861633e9bfcfb
|
3e18738bd338e9f8363b26bc895f38c0ec682824 |
|
26-Jun-2015 |
David Brazdil <dbrazdil@google.com> |
Revert "ART: Implement try/catch blocks in Builder" Causes OutOfMemory issues, need to investigate. This reverts commit 0b5c7d1994b76090afcc825e737f2b8c546da2f8. Change-Id: I263e6cc4df5f9a56ad2ce44e18932ca51d7e349f
|
0b5c7d1994b76090afcc825e737f2b8c546da2f8 |
|
11-Jun-2015 |
David Brazdil <dbrazdil@google.com> |
ART: Implement try/catch blocks in Builder This patch enables the GraphBuilder to generate blocks and edges which represent the exceptional control flow when try/catch blocks are present in the code. Actual compilation is still delegated to Quick and Baseline ignores the additional code. To represent the relationship between try and catch blocks, Builder splits the edges which enter/exit a try block and links the newly created blocks to the corresponding exception handlers. This layout will later enable the SsaBuilder to correctly infer the dominators of the catch blocks and to produce the appropriate reverse post ordering. It will not, however, allow for building the complete SSA form of the catch blocks and consequently optimizing such blocks. To this end, a new TryBoundary control-flow instruction is introduced. Codegen treats it the same as a Goto but it allows for additional successors (the handlers). Change-Id: I415b985596d5bebb7b1bb358a46e08b7b04bb53a
|
eb7b7399dbdb5e471b8ae00a567bf4f19edd3907 |
|
19-Jun-2015 |
Alexandre Rames <alexandre.rames@linaro.org> |
Opt compiler: Add disassembly to the '.cfg' output. This is automatically added to the '.cfg' output when using the usual `--dump-cfg` option. Change-Id: I864bfc3a8299c042e72e451cc7730ad8271e4deb
|
ef20f71e16f035a39a329c8524d7e59ca6a11f04 |
|
09-Jun-2015 |
Alexandre Rames <alexandre.rames@linaro.org> |
Add boilerplate code for architecture-specific HInstructions. Change-Id: I2723cd96e5f03012c840863dd38d7b2168117db8
|
69aa60163989c33a008115205d39732a76ecc1dc |
|
09-Jun-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Revert "Revert "Pass current method to HNewInstance and HNewArray."" Problem exposed by this change was fixed in: https://android-review.googlesource.com/#/c/154031/ This reverts commit 7b0e353b49ac3f464c662f20e20e240f0231afff. Change-Id: I680c13dc9db9ba223ab11c7af255222860b4e6d2
|
7b0e353b49ac3f464c662f20e20e240f0231afff |
|
09-Jun-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Revert "Pass current method to HNewInstance and HNewArray." 082-inline-execute fails on x86. This reverts commit e21aa42e1341d34250742abafdd83311ad9fa737. Change-Id: Ib3fd25faee2e0128001e40d3d51a74f959bc4449
|
94015b939060f5041d408d48717f22443e55b6ad |
|
04-Jun-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Revert "Revert "Use HCurrentMethod in HInvokeStaticOrDirect."" Fix was to special case baseline for x86, which does not have enough registers to allocate the current method. This reverts commit c345f141f11faad177aa9635a78088d00cf66086. Change-Id: I5997aa52f8d4df373ae5ff4d4150dac0c44c4c10
|
e21aa42e1341d34250742abafdd83311ad9fa737 |
|
08-Jun-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Pass current method to HNewInstance and HNewArray. Also remove unused CodeGenerator::LoadCurrentMethod. Change-Id: I4b8d3f2a30b8e2c76b6b329a72555483c993cb73
|
c345f141f11faad177aa9635a78088d00cf66086 |
|
04-Jun-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Revert "Use HCurrentMethod in HInvokeStaticOrDirect." Fails on baseline/x86. This reverts commit 38207af82afb6f99c687f64b15601ed20d82220a. Change-Id: Ib71018367eb7c6046965494a7e996c22af3de403
|
38207af82afb6f99c687f64b15601ed20d82220a |
|
01-Jun-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Use HCurrentMethod in HInvokeStaticOrDirect. Change-Id: I0d15244b6b44c8b10079398c55da5071a3e3af66
|
fd88f16100cceafbfde1b4f095f17e89444d6fa8 |
|
03-Jun-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Factorize code for common LocationSummary of HInvoke. This is one step forward, we could factorize more, but I wanted to get this out of the way first. Change-Id: I6ae411a737eebaecb64974f47af507ce0cfbae85
|
3d21bdf8894e780d349c481e5c9e29fe1556051c |
|
22-Apr-2015 |
Mathieu Chartier <mathieuc@google.com> |
Move mirror::ArtMethod to native Optimizing + quick tests are passing, devices boot. TODO: Test and fix bugs in mips64. Saves 16 bytes per most ArtMethod, 7.5MB reduction in system PSS. Some of the savings are from removal of virtual methods and direct methods object arrays. Bug: 19264997 (cherry picked from commit e401d146407d61eeb99f8d6176b2ac13c4df1e33) Change-Id: I622469a0cfa0e7082a2119f3d6a9491eb61e3f3d Fix some ArtMethod related bugs Added root visiting for runtime methods, not currently required since the GcRoots in these methods are null. Added missing GetInterfaceMethodIfProxy in GetMethodLine, fixes --trace run-tests 005, 044. Fixed optimizing compiler bug where we used a normal stack location instead of double on ARM64, this fixes the debuggable tests. TODO: Fix JDWP tests. Bug: 19264997 Change-Id: I7c55f69c61d1b45351fd0dc7185ffe5efad82bd3 ART: Fix casts for 64-bit pointers on 32-bit compiler. Bug: 19264997 Change-Id: Ief45cdd4bae5a43fc8bfdfa7cf744e2c57529457 Fix JDWP tests after ArtMethod change Fixes Throwable::GetStackDepth for exception event detection after internal stack trace representation change. Adds missing ArtMethod::GetInterfaceMethodIfProxy call in case of proxy method. Bug: 19264997 Change-Id: I363e293796848c3ec491c963813f62d868da44d2 Fix accidental IMT and root marking regression Was always using the conflict trampoline. Also included fix for regression in GC time caused by extra roots. Most of the regression was IMT. Fixed bug in DumpGcPerformanceInfo where we would get SIGABRT due to detached thread. EvaluateAndApplyChanges: From ~2500 -> ~1980 GC time: 8.2s -> 7.2s due to 1s less of MarkConcurrentRoots Bug: 19264997 Change-Id: I4333e80a8268c2ed1284f87f25b9f113d4f2c7e0 Fix bogus image test assert Previously we were comparing the size of the non moving space to size of the image file. Now we properly compare the size of the image space against the size of the image file. Bug: 19264997 Change-Id: I7359f1f73ae3df60c5147245935a24431c04808a [MIPS64] Fix art_quick_invoke_stub argument offsets. ArtMethod reference's size got bigger, so we need to move other args and leave enough space for ArtMethod* and 'this' pointer. This fixes mips64 boot. Bug: 19264997 Change-Id: I47198d5f39a4caab30b3b77479d5eedaad5006ab
|
e401d146407d61eeb99f8d6176b2ac13c4df1e33 |
|
22-Apr-2015 |
Mathieu Chartier <mathieuc@google.com> |
Move mirror::ArtMethod to native Optimizing + quick tests are passing, devices boot. TODO: Test and fix bugs in mips64. Saves 16 bytes per most ArtMethod, 7.5MB reduction in system PSS. Some of the savings are from removal of virtual methods and direct methods object arrays. Bug: 19264997 Change-Id: I622469a0cfa0e7082a2119f3d6a9491eb61e3f3d
|
07276db28d654594e0e86e9e467cad393f752e6e |
|
18-May-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Don't do a null test in MarkGCCard if the value cannot be null. Change-Id: I45687f6d3505178e2fc3689eac9cb6ab1b2c1e29
|
92e83bf8c0b2df8c977ffbc527989631d94b1819 |
|
07-May-2015 |
Mark Mendell <mark.p.mendell@intel.com> |
[optimizing] Tune some x86_64 moves Generate Moves of constant FP values by loading from the constant table. Use 'movl' to load a 64 bit register for positive 32-bit values, saving a byte in the generated code by taking advantage of the implicit zero extension. Change a couple of xorq(reg, reg) to xorl to (potentially) save a byte of code per xor. Change-Id: I5b2a807f0d3b29294fd4e7b8ef6d654491fa0b01 Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>
|
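The move-selection idea in the entry above can be sketched as a small decision function. The enum and function names are hypothetical, and the three-way split (zero, fits in an unsigned 32-bit value, everything else) is an assumption based on the entry's wording rather than a copy of the code generator.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical labels for the instruction forms mentioned in the entry.
enum class LoadEncoding {
  kXorl,  // xorl reg, reg: clears the full 64-bit register.
  kMovl,  // movl imm32, reg: the upper 32 bits are implicitly zeroed.
  kMovq,  // movq imm, reg: full 64-bit form.
};

// Picks a form for loading `value` into a 64-bit register.
inline LoadEncoding ChooseLoadEncoding(int64_t value) {
  if (value == 0) {
    return LoadEncoding::kXorl;
  }
  // Positive values that fit in 32 bits can rely on implicit zero extension.
  const int64_t max_uint32 =
      static_cast<int64_t>(std::numeric_limits<uint32_t>::max());
  if (value > 0 && value <= max_uint32) {
    return LoadEncoding::kMovl;
  }
  // Negative or wider values need the 64-bit form in this sketch.
  return LoadEncoding::kMovq;
}

// Usage sketch.
inline void DemoChooseLoadEncoding() {
  assert(ChooseLoadEncoding(42) == LoadEncoding::kMovl);
}
```

Negative values fall through to the 64-bit form because zero extension would change them; -1 loaded with movl would read back as 0xFFFFFFFF.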
2d27c8e338af7262dbd4aaa66127bb8fa1758b86 |
|
28-Apr-2015 |
Roland Levillain <rpl@google.com> |
Refactor InvokeDexCallingConventionVisitor in Optimizing. Change-Id: I7ede0f59d5109644887bf5d39201d4e1bf043f34
|
848f70a3d73833fc1bf3032a9ff6812e429661d9 |
|
15-Jan-2014 |
Jeff Hao <jeffhao@google.com> |
Replace String CharArray with internal uint16_t array. Summary of high level changes: - Adds compiler inliner support to identify string init methods - Adds compiler support (quick & optimizing) with new invoke code path that calls method off the thread pointer - Adds thread entrypoints for all string init methods - Adds map to verifier to log when receiver of string init has been copied to other registers. used by compiler and interpreter Change-Id: I797b992a8feb566f9ad73060011ab6f51eb7ce01
|
ad4450e5c3ffaa9566216cc6fafbf5c11186c467 |
|
17-Apr-2015 |
Zheng Xu <zheng.xu@arm.com> |
Opt compiler: Implement parallel move resolver without using swap. The algorithm of ParallelMoveResolverNoSwap() is almost the same as ParallelMoveResolverWithSwap(), except for the way we resolve the circular dependency. NoSwap() uses an additional scratch register to resolve the circular dependency. For example, (0->1) (1->2) (2->0) will be performed as (2->scratch) (1->2) (0->1) (scratch->0). On architectures without swap register support, NoSwap() can reduce the number of moves from 3x(N-1) to (N+1) when there is a circular dependency with N moves. Also, the NoSwap() algorithm does not depend on architecture register layout information, which means it can support register pairs on arm32 and X/W, D/S registers on arm64 without additional modification. Change-Id: Idf56bd5469bb78c0e339e43ab16387428a082318
|
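The scratch-register scheme described in the commit above can be illustrated with a minimal Python sketch. This is not ART's ParallelMoveResolverNoSwap() code: the function name `resolve_parallel_moves`, the `"scratch"` location name, and the assumption that each destination is written by at most one move are all hypothetical choices made for illustration.

```python
def resolve_parallel_moves(moves, scratch="scratch"):
    """Turn parallel moves [(src, dst), ...] into a safe sequential order.

    Assumes each destination is written by at most one move (hypothetical
    simplification). Cycles are broken with a single scratch location.
    """
    pending = {dst: src for src, dst in moves if src != dst}
    ordered = []
    while pending:
        # A move is safe once no other pending move still reads its destination.
        sources = set(pending.values())
        ready = [dst for dst in pending if dst not in sources]
        if ready:
            for dst in ready:
                ordered.append((pending.pop(dst), dst))
            continue
        # Only cycles remain: save one source in scratch, then redirect
        # the move that needed it to read from scratch instead.
        dst = list(pending)[-1]
        ordered.append((pending[dst], scratch))
        pending[dst] = scratch
    return ordered


print(resolve_parallel_moves([(0, 1), (1, 2), (2, 0)]))
```

For the three-move cycle from the commit message this emits four moves (N+1), in the same order the message lists them.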
e14590bdfed24df30e6b7545fc819ba03ff8bba1 |
|
15-Apr-2015 |
Guillaume Sanchez <guillaumesa@google.com> |
Revert "[optimizing] Improve x86 parallel moves/swaps" This reverts commit a5c19ce8d200d68a528f2ce0ebff989106c4a933. This commit introduces a performance regression on CaffeineLogic of 30%. Change-Id: I917e206e249d44e1748537bc1b2d31054ea4959d
|
a5c19ce8d200d68a528f2ce0ebff989106c4a933 |
|
01-Apr-2015 |
Mark Mendell <mark.p.mendell@intel.com> |
[optimizing] Improve x86 parallel moves/swaps Add a new constructor to ScratchRegisterScope that will supply a register if there is a free one, but not spill to force one. Use this to generate alternate code that doesn't use a temporary, as the spill/restore of a register generates extra instructions that aren't necessary on x86. Here is the benefit for a 32 bit memory-to-memory exchange with no free registers: < 50 push eax < 53 push ebx < 8B44244C mov eax, [esp + 76] < 8B5C246C mov ebx, [esp + 108] < 8944246C mov [esp + 108], eax < 895C244C mov [esp + 76], ebx < 5B pop ebx < 58 pop eax --- > FF742444 push [esp + 68] > FF742468 push [esp + 104] > 8F44244C pop [esp + 72] > 8F442468 pop [esp + 100] Avoid using xchg instruction, as it is slow on smaller processors. Change-Id: Id29ee3abd998577baaee552d55d23e60ae0c7871 Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>
|
39dcf55a56da746e04f477f89e7b00ba1de03880 |
|
10-Apr-2015 |
Mark Mendell <mark.p.mendell@intel.com> |
[optimizing] Address x86_64 RIP patch comments Nicolas had some comments after the patch https://android-review.googlesource.com/#/c/144100 had merged. Fix the problems that he found. Change-Id: I40e8a4273997860db7511dc8f1986281b72bead2 Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>
|
b19930c5cba3cf662dce5ee057fcc9829b4cbb9c |
|
09-Apr-2015 |
Guillaume Sanchez <guillaumesa@google.com> |
Follow up of "div/rem on x86 and x86_64", to tidy up the code a little. Change-Id: Ibf39cbc8ac1d773599d70be2cb1e941674b60f1d
|
f55c3e0825cdfc4c5a27730031177d1a0198ec5a |
|
27-Mar-2015 |
Mark Mendell <mark.p.mendell@intel.com> |
[optimizing] Add RIP support for x86_64 Support a constant area addressed using RIP on x86_64. Use it for FP operations to avoid loading constants into a CPU register and moving to a XMM register. Change-Id: I58421759ef2a8475538876c20e696ec787015a72 Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>
|
0f88e87085b7cf6544dadff3f555773966a6853e |
|
30-Mar-2015 |
Guillaume Sanchez <guillaumesa@google.com> |
Speedup div/rem by constants on x86 and x86_64 This is done using the algorithms in Hacker's Delight chapter 10. Change-Id: I7bacefe10067569769ed31a1f7834f796fb41119
|
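As one concrete instance of the division-by-constant technique the commit above refers to, the following Python sketch divides a signed 32-bit value by 3 using a multiply and a shift. The constant 0x55555556 is the well-known multiplier for this divisor; the function name `div3_by_multiply` is hypothetical, and this is not the code ART emits, which handles arbitrary constants.

```python
def div3_by_multiply(n):
    """Truncating signed 32-bit division by 3 without a divide instruction."""
    assert -2**31 <= n < 2**31
    magic = 0x55555556        # ceil(2**32 / 3)
    q = (n * magic) >> 32     # high 32 bits of the signed 64-bit product
    if n < 0:
        q += 1                # add the sign bit so the result rounds toward zero
    return q
```

The shift yields a floor, so the correction for negative inputs is what makes the result match truncating division, as Java's `/` operator requires.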
d43b3ac88cd46b8815890188c9c2b9a3f1564648 |
|
01-Apr-2015 |
Mingyao Yang <mingyao@google.com> |
Revert "Revert "Deoptimization-based bce."" This reverts commit 0ba627337274ccfb8c9cb9bf23fffb1e1b9d1430. Change-Id: I1ca10d15bbb49897a0cf541ab160431ec180a006
|
fb8d279bc011b31d0765dc7ca59afea324fd0d0c |
|
01-Apr-2015 |
Mark Mendell <mark.p.mendell@intel.com> |
[optimizing] Implement x86/x86_64 math intrinsics Implement floor/ceil/round/RoundFloat on x86 and x86_64. Implement RoundDouble on x86_64. Add support for roundss and roundsd on both architectures. Support them in the disassembler as well. Add the instruction set features for x86, as the 'round' instruction is only supported if SSE4.1 is supported. Fix the tests to handle the addition of passing the instruction set features to x86 and x86_64. Add assembler tests for roundsd and roundss to x86_64 assembler tests. Change-Id: I9742d5930befb0bbc23f3d6c83ce0183ed9fe04f Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>
|
d75948ac93a4a317feaf136cae78823071234ba5 |
|
27-Mar-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Intrinsify String.compareTo. Change-Id: Ia540df98755ac493fe61bd63f0bd94f6d97fbb57
|
0ba627337274ccfb8c9cb9bf23fffb1e1b9d1430 |
|
24-Mar-2015 |
Andreas Gampe <agampe@google.com> |
Revert "Deoptimization-based bce." This breaks compiling the core image: Error after BCE: art::SSAChecker: Instruction 219 in block 1 does not dominate use 221 in block 1. This reverts commit e295e6ec5beaea31be5d7d3c996cd8cfa2053129. Change-Id: Ieeb48797d451836ed506ccb940872f1443942e4e
|
e295e6ec5beaea31be5d7d3c996cd8cfa2053129 |
|
07-Mar-2015 |
Mingyao Yang <mingyao@google.com> |
Deoptimization-based bce. A mechanism is introduced by which a runtime method can be called from code compiled with the optimizing compiler to deoptimize into the interpreter. This can be used to establish invariants in the managed code. If the invariant does not hold at runtime, we will deoptimize and continue execution in the interpreter. This allows optimizing the managed code as if the invariant was proven during compile time. However, the exception will be thrown according to the semantics demanded by the spec. The invariant and optimization included in this patch are based on the length of an array. Given a set of array accesses with constant indices {c1, ..., cn}, we can optimize away all bounds checks iff all 0 <= min(ci) and max(ci) < array-length. The first can be proven statically. The second can be established with a deoptimization-based invariant. This replaces n bounds checks with one invariant check (plus slow-path code). Change-Id: I8c6e34b56c85d25b91074832d13dba1db0a81569
|
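The invariant described in the commit above can be modelled with a small Python sketch. Everything here is hypothetical illustration rather than ART code: `INDICES` stands in for the constant indices {c1, ..., cn}, `interpreted` models the interpreter's per-access checks, and `compiled` models the optimized code with a single guard and a fall-back standing in for deoptimization.

```python
INDICES = (0, 3, 5)  # hypothetical constant indices c1..cn


def interpreted(array):
    """Reference semantics: every access is bounds-checked individually."""
    total = 0
    for i in INDICES:
        if not 0 <= i < len(array):
            raise IndexError(f"index {i} out of bounds for length {len(array)}")
        total += array[i]
    return total


def compiled(array):
    """Fast path: one invariant check replaces the n bounds checks."""
    # 0 <= min(INDICES) is known statically; only the upper bound needs a
    # run-time check. A real compiler would fold max(INDICES) to a constant.
    if not max(INDICES) < len(array):
        # "Deoptimize": continue in the slow path, which raises the
        # exception at the access where the spec says it must occur.
        return interpreted(array)
    return array[INDICES[0]] + array[INDICES[1]] + array[INDICES[2]]
```

When the guard fails, the exception still comes from the first offending access (index 3 for a two-element array), not from the guard itself.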
dc23d8318db08cb42e20f1d16dbc416798951a8b |
|
16-Feb-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Avoid generating jmp +0. When a block branches to a non-following block, but blocks in-between do branch to it, we can avoid doing the branch. Change-Id: I9b343f662a4efc718cd4b58168f93162a24e1219
|
1cf95287364948689f6a1a320567acd7728e94a3 |
|
12-Dec-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Small optimization for recursive calls: avoid dex cache. Change-Id: I044757a2f06e535cdc1480c4fc8182b89635baf6
|
878d58cbaf6b17a9e3dcab790754527f3ebc69e5 |
|
16-Jan-2015 |
Andreas Gampe <agampe@google.com> |
ART: Arm64 optimizing compiler intrinsics Implement most intrinsics for the optimizing compiler for Arm64. Change-Id: Idb459be09f0524cb9aeab7a5c7fccb1c6b65a707
|
d97dc40d186aec46bfd318b6a2026a98241d7e9c |
|
22-Jan-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Support callee save floating point registers on x64. - Share the computation of core_spill_mask and fpu_spill_mask between backends. - Remove explicit stack overflow check support: we need to adjust them and since they are not tested, they will easily bitrot. Change-Id: I0b619b8de4e1bdb169ea1ae7c6ede8df0d65837a
|
988939683c26c0b1c8808fc206add6337319509a |
|
21-Jan-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Enable core callee-save on x64. Will work on other architectures and FP support in other CLs. Change-Id: I8cef0343eedc7202d206f5217fdf0349035f0e4d
|
24f2dfae084b2382c053f5d688fd6bb26cb8a328 |
|
15-Jan-2015 |
Mark Mendell <mark.p.mendell@intel.com> |
[optimizing compiler] Implement inline x86 FP '%' Replace the calls to fmod/fmodf by inline code as is done in the Quick compiler. Remove the quick fmod/fmodf runtime entries, as they are no longer in use. 64 bit code generator Move() routine needed to be enhanced to handle constants, as Location::Any() allows them to be generated. Change-Id: I6b6a42f6faeed4b0b3c940453e487daf5b25d184 Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>
|
cd6dffedf1bd8e6dfb3fb0c933551f9a90f7de3f |
|
08-Jan-2015 |
Calin Juravle <calin@google.com> |
Add implicit null checks for the optimizing compiler - for backends: arm, arm64, x86, x86_64 - fixed parameter passing for CodeGenerator - 003-omnibus-opcodes test verifies that NullPointerExceptions work as expected Change-Id: I1b302acd353342504716c9169a80706cf3aba2c8
|
71fb52fee246b7d511f520febbd73dc7a9bbca79 |
|
30-Dec-2014 |
Andreas Gampe <agampe@google.com> |
ART: Optimizing compiler intrinsics Add intrinsics infrastructure to the optimizing compiler. Add almost all intrinsics supported by Quick to the x86-64 backend. Further intrinsics require more assembler support. Change-Id: I48de9b44c82886bb298d16e74e12a9506b8e8807
|
f85a9ca9859ad843dc03d3a2b600afbaf2e9bbdd |
|
13-Jan-2015 |
Mark Mendell <mark.p.mendell@intel.com> |
[optimizing compiler] Compute live spill size The current stack frame calculation assumes that each live register to be saved/restored has the word size of the machine. This fails for X86, where a double in an XMM register takes up 8 bytes. Change the calculation to keep track of the number of core registers and number of fp registers to handle this distinction. This is slightly pessimal, as the registers may not be active at the same time, but the only way to handle this would be to allocate both classes of registers simultaneously, or remember all the active intervals, matching them up and compute the size of each safepoint interval. Change-Id: If7860aa319b625c214775347728cdf49a56946eb Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>
|
840e5461a85f8908f51e7f6cd562a9129ff0e7ce |
|
07-Jan-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Implement double and float support for arm in register allocator. The basic approach is: - An instruction that needs two registers gets two intervals. - When allocating the low part, we also allocate the high part. - When splitting a low (or high) interval, we also split the high (or low) equivalent. - Allocation follows the (S/D register) requirement that low registers are always even and the high equivalent is low + 1. Change-Id: I06a5148e05a2ffc7e7555d08e871ed007b4c2797
|
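The pairing rule in the commit above (the low register is even and the high register is low + 1) can be sketched as follows. This is a hypothetical illustration, not the register allocator's code: `allocate_pair` and the representation of free registers as a set of S-register numbers are assumptions.

```python
def allocate_pair(free):
    """Return (low, high) for the first free S-register pair, or None.

    `free` is a set of free register numbers; the low half must be even
    and the high half must be low + 1. The set is not modified.
    """
    for low in sorted(free):
        if low % 2 == 0 and low + 1 in free:
            return low, low + 1
    return None
```

With registers 1, 2, 4 and 5 free, the only valid pair is (4, 5): register 2 is even, but its partner 3 is taken.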
52c489645b6e9ae33623f1ec24143cde5444906e |
|
16-Dec-2014 |
Calin Juravle <calin@google.com> |
[optimizing compiler] Add support for volatile - for backends: arm, x86, x86_64 - added necessary instructions to assemblies - clean up code gen for field set/get - fixed InstructionDataEquals for some instructions - fixed comments in compiler_enums * 003-opcode test verifies basic volatile functionality Change-Id: I144393efa312dfb2c332cb84056b00edffee338a
|
9aec02fc5df5518c16f1e5a9b6cb198a192db973 |
|
19-Nov-2014 |
Calin Juravle <calin@google.com> |
[optimizing compiler] Add shifts Added SHL, SHR, USHR for arm, x86, x86_64. Change-Id: I971f594e270179457e6958acf1401ff7630df07e
|
86a8d7afc7f00ff0f5ea7b8aaf4d50514250a4e6 |
|
19-Nov-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Consistently use k{InstructionSet}WordSize. These constants were defined prior to k{InstructionSet}PointerSize. So use them consistently in optimizing as a first step. We can discuss whether we should remove them in a second step. Change-Id: If129de1a3bb8b65f8d9c816a8ad466815fb202e6
|
bacfec30ee9f2f6fdfd190f11b105b609938efca |
|
14-Nov-2014 |
Calin Juravle <calin@google.com> |
[optimizing compiler] Add REM_INT, REM_LONG - for arm, x86, x86_64 - minor cleanup/fix in div tests Change-Id: I240874010206a5a9b3aaffbc81a885b94c248f93
|
f0e3937b87453234d0d7970b8712082062709b8d |
|
12-Nov-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Do a parallel move in BoundsCheckSlowPath. The two locations of the index and length could overlap, so we need a parallel move. Also factorize the code for doing a parallel move based on two locations. Change-Id: Iee8b3459e2eed6704d45e9a564fb2cd050741ea4
|
9574c4b5f5ef039d694ac12c97e25ca02eca83c0 |
|
12-Nov-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Implement and/or/xor in optimizing. Change-Id: I7cf6da1fd334a7177a5580931b8f174dd40b7cec
|
de58ab2c03ff8112b07ab827c8fa38f670dfc656 |
|
05-Nov-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Implement try/catch/throw in optimizing. - We currently don't run optimizations in the presence of a try/catch. - We therefore implement Quick's mapping table. - Also fix a missing null check on array-length. Change-Id: I6917dfcb868e75c1cf6eff32b7cbb60b6cfbd68f
|
424f676379f2f872acd1478672022f19f3240fc1 |
|
03-Nov-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Implement CONST_CLASS in optimizing compiler. Change-Id: Ia8c8dfbef87cb2f7893bfb6e178466154eec9efd
|
19a19cffd197a28ae4c9c3e59eff6352fd392241 |
|
22-Oct-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Add support for static fields in optimizing compiler. Change-Id: Id2f010589e2bd6faf42c05bb33abf6816ebe9fa9
|
102cbed1e52b7c5f09458b44903fe97bb3e14d5f |
|
15-Oct-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Implement register allocator for floating point registers. Also: - Fix misuses of emitting the rex prefix in the x86_64 assembler. - Fix movaps code generation in the x86_64 assembler. Change-Id: Ib6dcf6e7c4a9c43368cfc46b02ba50f69ae69cbe
|
92a73aef279be78e3c2b04db1713076183933436 |
|
16-Oct-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Don't use assembler classes in code_generator.h. The arm64 backend uses its own assembler and does not share the same classes as the other backends. To avoid conflicts or unnecessary mappings, just don't use those classes in the shared part of the code generator. Change-Id: I9e5fa40c1021d2e83a4ef14c52cd1ccd03f2f73d
|
71175b7f19a4f6cf9cc264feafd820dbafa371fb |
|
09-Oct-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Cleanup baseline register allocator. - Use three arrays for blocking registers instead of one and computing offsets in that array. - Don't pass blocked_registers_ to methods, just use the field. Change-Id: Ib698564c31127c59b5a64c80f4262394b8394dc6
|
360231a056e796c36ffe62348507e904dc9efb9b |
|
08-Oct-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Fix code generation of materialized conditions. Move the logic for knowing if a condition needs to be materialized in an optimization pass (so that the information does not change as a side effect of another optimization). Also clean-up arm and x86_64 codegen: - arm: ldr and str are for power-users when a constant is in play. We should use LoadFromOffset and StoreToOffset. - x86_64: fix misuses of movq instead of movl. Change-Id: I01a03b91803624be2281a344a13ad5efbf4f3ef3
|
56b9ee6fe1d6880c5fca0e7feb28b25a1ded2e2f |
|
09-Oct-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Stop converting from Location to ManagedRegister. Now the source of truth is the Location object that knows which register (core, pair, fpu) it needs to refer to. Change-Id: I62401343d7479ecfb24b5ed161ec7829cda5a0b1
|
7fb49da8ec62e8a10ed9419ade9f32c6b1174687 |
|
06-Oct-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Add support for floats and doubles. - Follows Quick conventions. - Currently only works with baseline register allocator. Change-Id: Ie4b8e298f4f5e1cd82364da83e4344d4fc3621a3
|
3c04974a90b0e03f4b509010bff49f0b2a3da57f |
|
24-Sep-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Optimize suspend checks in optimizing compiler. - Remove the ones added during graph build (they were added for the baseline code generator). - Emit them at loop back edges after phi moves, so that the test can directly jump to the loop header. - Fix x86 and x86_64 suspend check by using cmpw instead of cmpl. Change-Id: I6fad5795a55705d86c9e1cb85bf5d63dadfafa2a
|
3bca0df855f0e575c6ee020ed016999fc8f14122 |
|
19-Sep-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Support for saving and restoring live registers in a slow path. And use it in suspend check slow paths. Change-Id: I79caf28f334c145a36180c79a6e2fceae3990c31
|
e982f0b8e809cece6f460fa2d8df25873aa69de4 |
|
13-Aug-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Implement invoke virtual in optimizing compiler. Also refactor 004 tests to make them work with both Quick and Optimizing. Change-Id: I87e275cb0ae0258fc3bb32b612140000b1d2adf8
|
3c7bb98698f77af10372cf31824d3bb115d9bf0f |
|
23-Jul-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Implement array get and array put in optimizing. Also fix a couple of assembler/disassembler issues. Change-Id: I705c8572988c1a9c4df3172b304678529636d5f6
|
96f89a290eb67d7bf4b1636798fa28df14309cc7 |
|
11-Jul-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Add assembly operations with constants in optimizing compiler. Change-Id: I5bcc35ab50d4457186effef5592a75d7f4e5b65f
|
ab032bc1ff57831106fdac6a91a136293609401f |
|
15-Jul-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Fix a braino in the stack layout. Also do some refactoring to have this code be just in CodeGenerator. Change-Id: I88de109889138af8d60027973c12a64bee813cb7
|
e50383288a75244255d3ecedcc79ffe9caf774cb |
|
04-Jul-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Support fields in optimizing compiler. - Required support for temporaries, to be only used by baseline compiler. - Also fixed a few invalid assumptions around locations and instructions that don't need materialization. These instructions should not have an Out. Change-Id: Idc4a30dd95dd18015137300d36bec55fc024cf62
|
412f10cfed002ab617c78f2621d68446ca4dd8bd |
|
19-Jun-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Support longs in the register allocator for x86_64. Change-Id: I7fb6dfb761bc5cf9e5705682032855a0a70ca867
|
ecb2f9ba57b08ceac4204ddd6a0a88a0524f8741 |
|
13-Jun-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Enable the register allocator on x86_64. Also fix an x86_64 assembler bug for movl. Change-Id: I8d17c68cd35ddd1d8df159f2d6173a013a7c3347
|
9cf35523764d829ae0470dae2d5dd99be469c841 |
|
09-Jun-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Add x86_64 support to the optimizing compiler. Change-Id: I4462d9ae15be56c4a3dc1bd4d1c0c6548c1b94be
|