c15a2f4f45661a7f5f542e406282c146ea1a968d |
|
21-Apr-2017 |
Andreas Gampe <agampe@google.com> |
ART: Add object-readbarrier-inl.h Move some read-barrier code into a new header. This prunes the include tree for the concurrent-copying collector. Clean up other related includes. Test: mmma art Change-Id: I40ce4e74f2e5d4c692529ffb4df933230b6fd73e
|
c6ea7d00ad069a2736f603daa3d8eaa9a1f8ea11 |
|
02-Feb-2017 |
Andreas Gampe <agampe@google.com> |
ART: Clean up art_method.h Clean up the header. Fix up other headers including the -inl file, in an effort to prune the include graph. Fix broken transitive includes by making includes explicit. Introduce new -inl files for method handles and reference visiting. Test: source build/envsetup.sh && lunch aosp_angler-userdebug && mmma art Test: source build/envsetup.sh && lunch aosp_mips64-userdebug && mmma art Change-Id: I8f60f1160c2a702fdf3598149dae38f6fa6bc851
|
cbcedbf9382bc773713cd3552ed96f417bf1daeb |
|
13-Mar-2017 |
Mathieu Chartier <mathieuc@google.com> |
Add method info to oat files The method info data is stored separately from the code info to reduce oat size by improving deduplication of stack maps. To reduce code size, this moves the invoke info and inline info method indices to this table. Oat size for a large app (arm64): 77746816 -> 74023552 (-4.8%) Average oat size reduction for golem (arm64): 2% Repurposed unused SrcMapElem deduping to be for MethodInfo. TODO: Delete SrcMapElem in a follow up CL. Bug: 36124906 Test: clean-oat-host && test-art-host-run-test Change-Id: I2241362e728389030b959f42161ce817cf6e2009
|
0c95c12a102cb8c1514410be6e264f9730d847a7 |
|
26-Feb-2017 |
Andreas Gampe <agampe@google.com> |
ART: Fix underflow in codegen Check for count == 0 first before accessing non-existent stack map. Test: m ART_TEST_JIT=true test-art-host Test: m ART_TEST_JIT=true test-art-host-run-test-913-heaps Change-Id: Id4cad8e791d731147860b8a9a0d90cc893cc6972
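
Note: the class of bug fixed here can be sketched as follows. This is a minimal illustration, not the ART code; `LastEntryHasPc` and the flat vector of PCs are hypothetical stand-ins for the stack map stream. With an unsigned count, `count - 1` wraps around when the count is zero, so the emptiness check has to come first.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: reports whether the most recently recorded entry has
// the given native PC. The vector stands in for a list of stack map entries.
inline bool LastEntryHasPc(const std::vector<uint32_t>& pcs, uint32_t pc) {
  const size_t count = pcs.size();
  if (count == 0) {
    // Without this check, count - 1 would wrap to SIZE_MAX and the access
    // below would read a non-existent entry.
    return false;
  }
  return pcs[count - 1] == pc;
}
```

Checking the count before indexing keeps the fast path a single comparison and avoids relying on the caller to guarantee a non-empty list.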
|
d776ff08e07494327716f0d2ea1a774b2ebfbca9 |
|
17-Jan-2017 |
Mathieu Chartier <mathieuc@google.com> |
Add invoke infos to stack maps Invoke info records the invoke type and dex method index for invokes that may reach artQuickResolutionTrampoline. Having this information recorded allows the runtime to avoid reading the dex code and pulling in extra pages. Code size increase for a large app: 93886360 -> 95811480 (2.05% increase) 1/2 of the code size increase is from fewer stack maps being deduped. I suspect there is less deduping because of the invoke info method index. Merged disabled until we measure the RAM savings. Test: test-art-host, N6P boots Bug: 34109702 Change-Id: I6c5e4a60675a1d7c76dee0561a12909e4ab6d5d9
|
fa4333dcb481e564f54726b4e6f8153612df835e |
|
14-Feb-2017 |
Andreas Gampe <agampe@google.com> |
ART: Add operator == and != with nullptr to Handle Get it in line with ObjPtr and prettify our code. Test: m Change-Id: I1322e2a9bc7a85d7f2441034a19bf4d807b81a0e
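
Note: a rough sketch of what comparing a handle against `nullptr` looks like. The `Handle` class below is a simplified, hypothetical wrapper around a pointer slot, not ART's actual `Handle`; only the idea of overloading `==` and `!=` for `std::nullptr_t` comes from the commit message.

```cpp
#include <cassert>
#include <cstddef>

// Simplified stand-in for a handle: it refers to a slot that holds the
// object pointer, so the referenced object can be updated behind its back.
template <typename T>
class Handle {
 public:
  explicit Handle(T** reference) : reference_(reference) {
    assert(reference != nullptr);  // The slot itself must exist.
  }

  T* Get() const { return *reference_; }

  // Allow `handle == nullptr` and `handle != nullptr` instead of
  // `handle.Get() == nullptr`.
  bool operator==(std::nullptr_t) const { return Get() == nullptr; }
  bool operator!=(std::nullptr_t) const { return Get() != nullptr; }

 private:
  T** reference_;
};
```

Call sites can then write `if (h == nullptr)` rather than spelling out `h.Get()`, which is the readability gain the commit describes.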
|
b048cb74b742b03eb6dd5f1d6dd49e559f730b36 |
|
23-Jan-2017 |
Nicolas Geoffray <ngeoffray@google.com> |
Add per array size allocation entrypoints. - Update architectures that have fast paths for array allocation to use it. - Will add more fast paths in follow-up CLs. Test: test-art-target test-art-host. Change-Id: I138cccd16464a85de22a8ed31c915f876e78fb04
|
a2f526f889be06f96ea59624c9dfb1223b3839f3 |
|
19-Jan-2017 |
Mathieu Chartier <mathieuc@google.com> |
Compressed native PC for stack maps Compress native PC based on instruction alignment. This reduces the size of stack maps, boot.oat is 0.4% smaller for arm64. Test: test-art-host, test-art-target, N6P booting Change-Id: I2b70eecabda88b06fa80a85688fd992070d54278
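
Note: the idea of compressing a native PC by the instruction alignment can be sketched like this. The enum, the function names and the alignment table are assumptions made for illustration (4 bytes for ARM64, 2 for Thumb-2, 1 for x86); they are not taken from the ART sources.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical instruction set list for the sketch.
enum class Isa { kArm64, kThumb2, kX86 };

// Assumed minimum instruction alignment in bytes for each instruction set.
inline uint32_t InstructionAlignment(Isa isa) {
  switch (isa) {
    case Isa::kArm64: return 4;
    case Isa::kThumb2: return 2;
    case Isa::kX86: return 1;
  }
  return 1;
}

// Every valid PC is a multiple of the alignment, so the low bits are always
// zero and need not be stored.
inline uint32_t CompressPc(uint32_t native_pc, Isa isa) {
  const uint32_t alignment = InstructionAlignment(isa);
  assert(native_pc % alignment == 0);
  return native_pc / alignment;
}

inline uint32_t DecompressPc(uint32_t compressed_pc, Isa isa) {
  return compressed_pc * InstructionAlignment(isa);
}
```

On a 4-byte-aligned instruction set this saves two bits per stored PC, which is where a size reduction of the kind reported above would come from.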
|
5d37c152f21a0807459c6f53bc25e2d84f56d259 |
|
12-Jan-2017 |
Nicolas Geoffray <ngeoffray@google.com> |
Put inlined ArtMethod pointer in stack maps. Currently done for JIT. Can be extended for AOT and inlined boot image methods. Also refactor the lookup of an inlined method at runtime to not rely on the dex cache, but look at the class loader tables. bug: 30933338 test: test-art-host, test-art-target Change-Id: I58bd4d763b82ab8ca3023742835ac388671d1794
|
4155998a2f5c7a252a6611e3926943e931ea280a |
|
06-Jan-2017 |
Vladimir Marko <vmarko@google.com> |
Make runtime call on main for HLoadClass/kDexCacheViaMethod. Remove dependency of the compiled code on types dex cache array in preparation for changing to a hash-based array. Test: m test-art-host Test: m test-art-target on Nexus 9 Bug: 30627598 Change-Id: I3c426ed762c12eb9eb4bb61ea9a23a0659abf0a2
|
ac141397dc29189ad2b2df41f8d4312246beec60 |
|
13-Jan-2017 |
Orion Hodson <oth@google.com> |
Revert "Revert "ART: Compiler support for invoke-polymorphic."" This reverts commit 0fb5af1c8287b1ec85c55c306a1c43820c38a337. This takes us back to the original change and attempts to fix the issues encountered: - Adds transition record push/pop around artInvokePolymorphic. - Changes X86/X64 relocations for MacSDK. - Implements MIPS entrypoint for art_quick_invoke_polymorphic. - Corrects size of returned reference in art_quick_invoke_polymorphic on ARM. Bug: 30550796,33191393 Test: art/test/run-test 953 Test: m test-art-run-test Change-Id: Ib6b93e00b37b9d4ab743a3470ab3d77fe857cda8
|
0fb5af1c8287b1ec85c55c306a1c43820c38a337 |
|
11-Jan-2017 |
Orion Hodson <oth@google.com> |
Revert "ART: Compiler support for invoke-polymorphic." This reverts commit 02e3092f8d98f339588e48691db77f227b48ac1e. Reasons for revert: - Breaks MIPS/MIPS64 build. - Fails under GCStress test on x64. - Different x64 build configuration doesn't like relocation. Change-Id: I512555b38165d05f8a07e8aed528f00302061001
|
02e3092f8d98f339588e48691db77f227b48ac1e |
|
01-Dec-2016 |
Orion Hodson <oth@google.com> |
ART: Compiler support for invoke-polymorphic. Adds basic support to invoke method handles in compiled code. Enables method verification for methods containing invoke-polymorphic. Adds k45cc/k45rc output to Instruction::DumpString() which was found to be missing when enabling verification. Include stack traces in test 957-methodhandle-transforms for failures so they can be easily identified. Bug: 30550796,33191393 Test: art/test/run-test 953 Test: m test-art-run-test Change-Id: Ic9a96ea24906087597d96ad8159a5bc349d06950
|
f0acfe7a812a332122011832074142718c278dae |
|
09-Jan-2017 |
Nicolas Geoffray <ngeoffray@google.com> |
Keep resolved String in HLoadString. For the following reasons: - Avoids needing to do a lookup again in CodeGenerator::EmitJitRoots. - Fixes races where the string was GC'ed before CodeGenerator::EmitJitRoots. - Makes it possible to do GVN on the same string but defined in different dex files. Test: test-art-host, test-art-target Change-Id: If2b5d3079f7555427b1b96ab04546b3373fcf921
|
22384aeab988df7fa5ccdc48a668589c5f602c39 |
|
12-Dec-2016 |
Nicolas Geoffray <ngeoffray@google.com> |
Revert "Revert "Add kJitTableAddress for HLoadClass."" This reverts commit d2d5262c8370309e1f2a009f00aafc24f1cf00a0. Change-Id: I6149d5c7d5df0b0fc5cb646a802a2eea8d01ac08
|
d2d5262c8370309e1f2a009f00aafc24f1cf00a0 |
|
12-Dec-2016 |
Nicolas Geoffray <ngeoffray@google.com> |
Revert "Add kJitTableAddress for HLoadClass." One test failure after merge. This reverts commit 5b12f7973636bfea29da3956a9baa7a6bbe2b666. Change-Id: I120c49e53274471fc1c82a10d52e99c83f5f85cc
|
5b12f7973636bfea29da3956a9baa7a6bbe2b666 |
|
09-Dec-2016 |
Nicolas Geoffray <ngeoffray@google.com> |
Add kJitTableAddress for HLoadClass. This new kind loads classes from the root table associated with JIT compiled code. Also remove kDexCacheAddress, which is replaced by kJitTableAddress. test: ART_TEST_JIT=true test-art-host-jit test-art-target-jit Change-Id: Ia23029688d1a60c178bf2ffa7463927c5d5de4d0
|
063fc772b5b8aed7d769cd7cccb6ddc7619326ee |
|
02-Aug-2016 |
Mingyao Yang <mingyao@google.com> |
Class Hierarchy Analysis (CHA) The class linker now tracks whether a method has a single implementation and if so, the JIT compiler will try to devirtualize a virtual call for the method into a direct call. If the single-implementation assumption is violated due to additional class linking, compiled code that makes the assumption is invalidated. Deoptimization is triggered for compiled code live on stack. Instead of patching return pc's on stack, a CHA guard is added which checks a hidden should_deoptimize flag for deoptimization. This approach limits the number of deoptimization points. This CL does not devirtualize abstract/interface method invocation. Slides on CHA: https://docs.google.com/a/google.com/presentation/d/1Ax6cabP1vM44aLOaJU3B26n5fTE9w5YU-1CRevIDsBc/edit?usp=sharing Change-Id: I18bf716a601b6413b46312e925a6ad9e4008efa4 Test: ART_TEST_JIT=true m test-art-host/target-run-test test-art-host-gtest
|
132d8363bf8cb043d910836672192ec8c36649b6 |
|
16-Nov-2016 |
Nicolas Geoffray <ngeoffray@google.com> |
Revert "Revert "Revert "Revert "JIT root tables."""" Test: 626-set-resolved-string, test-art-host, test-art-target Test: run-libcore-tests.sh Test: phone boots and runs This reverts commit 3395fbc20bcd20948bec8958db91b304c17cacd8. Change-Id: I104b73d093e3eb6a271d564cfdb9ab09c1c8cf24
|
3395fbc20bcd20948bec8958db91b304c17cacd8 |
|
14-Nov-2016 |
Nicolas Geoffray <ngeoffray@google.com> |
Revert "Revert "Revert "JIT root tables.""" libcore failures: dalvikvm32 F 11-14 03:04:06 14870 14870 jit_code_cache.cc:310] Check failed: new_string != nullptr This reverts commit 75afcdd3503a8a8518e5b23d21b6e73306ce39ce. Change-Id: I5a6b6b48aa79a763d1ff1ba4d85d63811254787d
|
75afcdd3503a8a8518e5b23d21b6e73306ce39ce |
|
10-Nov-2016 |
Nicolas Geoffray <ngeoffray@google.com> |
Revert "Revert "JIT root tables."" Also contains Revert "Support kJitTableAddress in x86/arm/arm64." This reverts commit 4acd03638fcdb4e5d1666f8eec7eb3bf6d6be035. This reverts commit 997d1217830c0a18b70faeabd53c04700a87d7d9. Test: ART_USE_READ_BARRIER=true/false test-art-host test-art-target Change-Id: I77cb1e9bf8f1b4c58b72d3cf5ca31ced2aaa1ea3
|
4acd03638fcdb4e5d1666f8eec7eb3bf6d6be035 |
|
09-Nov-2016 |
Nicolas Geoffray <ngeoffray@google.com> |
Revert "JIT root tables." May be the offender for jit-gcstress failure of 902. This reverts commit ac3ebc3150760425ed00abd56da48f9a6e0666bc. Change-Id: I9ea6c9236fd1729fed7d1868dd8a111172932308
|
ac3ebc3150760425ed00abd56da48f9a6e0666bc |
|
05-Oct-2016 |
Nicolas Geoffray <ngeoffray@google.com> |
JIT root tables. Implement root tables for the JIT. Each JIT compiled method gets a table allocated before the stack maps. The table gets visited through Runtime::SweepSystemWeaks. Implement String roots for x86_64 as an example. Test: test-art-host test-art-target Change-Id: Id3d5bc67479e08b52dd4b253e970201203a0f0d2
|
2c45bc9137c29f886e69923535aff31a74d90829 |
|
25-Oct-2016 |
Vladimir Marko <vmarko@google.com> |
Remove H[Reverse]PostOrderIterator and HInsertionOrderIterator. Use range-based loops instead, introducing helper functions ReverseRange() for iteration in reverse order in containers. When the contents of the underlying container change inside the loop, use an index-based loop that better exposes the container data modifications, compared to the old iterator interface that's hiding it which may lead to subtle bugs. Test: m test-art-host Change-Id: I2a4e6c508b854c37a697fc4b1e8423a8c92c5ea0
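
Note: a helper in the spirit of `ReverseRange()` could look like the sketch below. Only the name `ReverseRange` appears in the commit message; the `IterationRange` wrapper and the exact signature are assumptions for illustration.

```cpp
#include <vector>

// Small wrapper that exposes a pair of iterators as a range usable in a
// range-based for loop.
template <typename Iter>
class IterationRange {
 public:
  IterationRange(Iter first, Iter last) : first_(first), last_(last) {}
  Iter begin() const { return first_; }
  Iter end() const { return last_; }

 private:
  Iter first_;
  Iter last_;
};

// Iterate over a container back to front. The container must outlive the
// returned range and must not be resized while iterating.
template <typename Container>
auto ReverseRange(Container& c) -> IterationRange<decltype(c.rbegin())> {
  return IterationRange<decltype(c.rbegin())>(c.rbegin(), c.rend());
}
```

A loop such as `for (int x : ReverseRange(values))` then visits the elements in reverse order. When the loop body modifies the container, an index-based loop remains the safer choice, as the commit message points out.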
|
5e4e11e171f90d9a3ea178fc8e72aac909de55d5 |
|
22-Sep-2016 |
Nicolas Geoffray <ngeoffray@google.com> |
Clean-up sharpening and compiler driver. Remove dependency on compiler driver for sharpening and dex2dex (the methods called on the compiler driver were doing unnecessary work), and remove the now unused methods in compiler driver. Also remove test that is now invalid, as sharpening always succeeds. test: m test-art-host m test-art-target Change-Id: I54e91c6839bd5b0b86182f2f43ba5d2c112ef908
|
fe8854609898b5a148d2c4094aa9970af1a4ec59 |
|
22-Sep-2016 |
Scott Wakeling <scott.wakeling@linaro.org> |
Revert "Revert "ARM: VIXL32: Add an initial code generator that passes codegen_tests."" This VIXL32-based code generator is not enabled in the optimizing compiler by default. Changes in codegen_test.cc test it in parallel with the existing ARM backend. This patch provides a base for further work, the new backend will not be enabled in the optimizing compiler until parity is proven with the current ARM backend and assembler. Test: gtest-codegen_test on host and target This reverts commit 7863a2152865a12ad9593d8caad32698264153c1. Change-Id: Ia09627bac22e78732ca982d207dc0b00bda435bb
|
7863a2152865a12ad9593d8caad32698264153c1 |
|
21-Sep-2016 |
Nicolas Geoffray <ngeoffray@google.com> |
Revert "ARM: VIXL32: Add an initial code generator that passes codegen_tests." Failing with: art/compiler/optimizing/code_generator_arm_vixl.cc:396:47: error: too few arguments to function call, expected 3, have 2 ValidateInvokeRuntime(instruction, slow_path); This reverts commit b138dfbd76f9d8b64fb9dbaf1a7c25e2549b2a8c. Change-Id: Idccfe076f5905ea92ecbe3afbc7c8c64ecda94be
|
804b03ffb9b9dc6cc3153e004c2cd38667508b13 |
|
14-Sep-2016 |
Vladimir Marko <vmarko@google.com> |
Change remaining slow path throw entrypoints to save everything. Change DivZeroCheck, BoundsCheck and explicit NullCheck slow path entrypoints to conform to kSaveEverything. On Nexus 9, AOSP ToT, the boot.oat size reduction is prebuilt multi-part boot image: - 32-bit boot.oat: -12KiB (-0.04%) - 64-bit boot.oat: -24KiB (-0.06%) on-device built single boot image: - 32-bit boot.oat: -8KiB (-0.03%) - 64-bit boot.oat: -16KiB (-0.04%) Test: Run ART test suite including gcstress on host and Nexus 9. Test: Manually disable implicit null checks and test as above. Change-Id: If82a8082ea9ae571c5d03b5e545e67fcefafb163
|
91a6516103b8bf8bb75c3a2840cbdec7521e74a7 |
|
19-Sep-2016 |
Alexandre Rames <alexandre.rames@linaro.org> |
Remove the `CanTriggerGC` side-effects on a few instructions. The side-effect was specified for these instructions as they call runtime. We now have a list of entrypoints that we know cannot trigger GC. We can avoid requiring the side-effect for those. Test: Run ART test suite on Nexus 5X and host. Change-Id: I0e0e6a4d701ce6c75aff486cb0d1bc7fe2e8dda4
|
b138dfbd76f9d8b64fb9dbaf1a7c25e2549b2a8c |
|
26-Jul-2016 |
Scott Wakeling <scott.wakeling@linaro.org> |
ARM: VIXL32: Add an initial code generator that passes codegen_tests. This VIXL32-based code generator is not enabled in the optimizing compiler by default. Changes in codegen_test.cc test it in parallel with the existing ARM backend. This patch provides a base for further work, the new backend will not be enabled in the optimizing compiler until parity is proven with the current ARM backend and assembler. Test: gtest-codegen_test on host and target Change-Id: Id556a975b2645bf1d98ab2984650e8435b2312c2
|
3b7537bfc5a6b7ccb18b3970d8edf14b72464af7 |
|
13-Sep-2016 |
Vladimir Marko <vmarko@google.com> |
Revert "Revert "Use implicit null checks inside try blocks."" Fix implicit checks in try blocks to emit stack maps. Fix arm64 null exception from signal entrypoint to call the runtime handler instead of simply jumping there. On Nexus 9, AOSP ToT, the boot.oat size reduction is prebuilt multi-part boot image: - 32-bit boot.oat: -448KiB (-1.3%) - 64-bit boot.oat: -528KiB (-1.2%) on-device built single boot image: - 32-bit boot.oat: -448KiB (-1.4%) - 64-bit boot.oat: -528KiB (-1.3%) Note that the oat files no longer contain dex files which have been moved to vdex, so the percentages are not directly comparable with those reported in the original commit. Test: Run ART test suite including gc-stress on host and Nexus 9. Bug: 30212852 Bug: 31468464 This reverts commit 0719b5b9b458cb3eb9f0823f0dacdfe1a71214dd. Change-Id: If8a9da8c11adf2aad203e93b6684ce16ed776285
|
0719b5b9b458cb3eb9f0823f0dacdfe1a71214dd |
|
13-Sep-2016 |
Nicolas Geoffray <ngeoffray@google.com> |
Revert "Use implicit null checks inside try blocks." Fails gcstress tests. This reverts commit 7aa7560683626c7893011271c241b3265ded1dc3. Change-Id: I4f5c89048b9ffddbafa02f3001e329ff87058ca2
|
7aa7560683626c7893011271c241b3265ded1dc3 |
|
07-Sep-2016 |
Vladimir Marko <vmarko@google.com> |
Use implicit null checks inside try blocks. Make implicit null check entrypoint save all registers, use platform-specific approach to still pass the fault address. Allow implicit null checks in try blocks. On Nexus 9, AOSP ToT, the boot.oat size reduction is prebuilt multi-part boot image: - 32-bit boot.oat: -452KiB (-0.7%) - 64-bit boot.oat: -482KiB (-0.7%) on-device built single boot image: - 32-bit boot.oat: -444KiB (-0.7%) - 64-bit boot.oat: -488KiB (-0.7%) Test: Run ART test suite on host and Nexus 9. Test: Build aosp_mips64-eng. Change-Id: I279f3ab57e2e2f338131c5cac45c51b673bdca19
|
70e97462116a47ef2e582ea29a037847debcc029 |
|
09-Aug-2016 |
Vladimir Marko <vmarko@google.com> |
Avoid excessive spill slots for slow paths. Reducing the frame size makes stack maps smaller as we need fewer bits for stack masks and some dex register locations may use short location kind rather than long. On Nexus 9, AOSP ToT, the boot.oat size reduction is prebuilt multi-part boot image: - 32-bit boot.oat: -416KiB (-0.6%) - 64-bit boot.oat: -635KiB (-0.9%) prebuilt multi-part boot image with read barrier: - 32-bit boot.oat: -483KiB (-0.7%) - 64-bit boot.oat: -703KiB (-0.9%) on-device built single boot image: - 32-bit boot.oat: -380KiB (-0.6%) - 64-bit boot.oat: -632KiB (-0.9%) on-device built single boot image with read barrier: - 32-bit boot.oat: -448KiB (-0.6%) - 64-bit boot.oat: -692KiB (-0.9%) The other benefit is that at runtime, threads may need fewer pages for their stacks, reducing overall memory usage. We defer the calculation of the maximum spill size from the main register allocator (linear scan or graph coloring) to the RegisterAllocationResolver and do it based on the live registers at slow path safepoints. The old notion of an artificial slow path safepoint interval is removed as it is no longer needed. Test: Run ART test suite on host and Nexus 9. Bug: 30212852 Change-Id: I40b3d114e278e2c5807982904fa49bf6642c6275
|
57eb0f58419e0e6773f69cf6e0c78e5fed0464cd |
|
30-Jul-2016 |
Alexey Frunze <Alexey.Frunze@imgtec.com> |
MIPS32: Fill branch delay slots Test: booted MIPS32 in QEMU Test: test-art-host-gtest Test: test-art-target-gtest Test: test-art-target-run-test-optimizing on CI20 Change-Id: I727e80753395ab99fff004cb5d2e0a06409150d7
|
16d9f949698faed28435af7aa9c9ebacbfd5d1a8 |
|
25-Aug-2016 |
Roland Levillain <rpl@google.com> |
Re-enable the ArraySet fast path with Baker read barriers. Benchmarks (ARM64) score variations on Nexus 5X with CPU cores clamped at 960000 Hz (aosp_bullhead-userdebug build): - Ritzperf - average (lower is better): -0.95% (virtually unchanged) - CaffeineMark - average (higher is better): +2.50% (slightly better) - DeltaBlue (lower is better): -0.55% (virtually unchanged) - Richards - average (lower is better): +0.67% (virtually unchanged) - SciMark2 - average (higher is better): -0.10% (virtually unchanged) Details about Ritzperf benchmarks with meaningful variations (lower is better): - GenericCalcActions.MemAllocTest: -5.05% (better) Details about CaffeineMark benchmarks with meaningful variations (higher is better): - Method: +16.88% (better) Details about Richards benchmarks with meaningful variations (lower is better): - deutsch_acc_interface: +9.86% (worse) Boot image code size variation on Nexus 5X (aosp_bullhead-userdebug build): - total ARM64 framework Oat files size change: 105933472 bytes -> 106027680 bytes (+0.09%) - total ARM framework Oat files size change: 89157936 bytes -> 89239856 bytes (+0.09%) Test: ART host and target (ARM, ARM64) tests. Bug: 29516974 Bug: 29506760 Bug: 12687968 Change-Id: Ib9e9709712295e17804b8888ac10e3d518ff2e70
|
0b671c0408e98824e1f92b1ee951b210c090fe7a |
|
19-Aug-2016 |
Roland Levillain <rpl@google.com> |
Add support for Baker read barriers in SystemArrayCopy intrinsics. Benchmarks (ARM64) score variations on Nexus 5X with CPU cores clamped at 960000 Hz (aosp_bullhead-userdebug build): - Ritzperf - average (lower is better): -3.03% (slightly better) - CaffeineMark - average (higher is better): +1.26% (slightly better) - DeltaBlue (lower is better): -10.50% (better) - Richards - average (lower is better): -3.36% (slightly better) - SciMark2 - average (higher is better): +0.26% (virtually unchanged) Details about Ritzperf benchmarks with meaningful variations (lower is better): - FormulaEvaluationActions.EvaluateAndApplyChanges: -13.26% (better) - FormulaEvaluationActions.EvaluateCascadingSums: -10.94% (better) - FormulaEvaluationActions.EvaluateComplexFormulas: -15.50% (better) - FormulaEvaluationActions.EvaluateFibonacci: -10.41% (better) - FormulaEvaluationActions.EvaluateLargeSums: +6.02% (worse) Boot image code size variation on Nexus 5X (aosp_bullhead-userdebug build): - total ARM64 framework Oat files size change: 107047632 bytes -> 107154128 bytes (+0.10%) - total ARM framework Oat files size change: 90932028 bytes -> 91009852 bytes (+0.09%) Test: ART host and target (ARM, ARM64) tests + Nexus 5X boot. Bug: 29516905 Bug: 29506760 Bug: 12687968 Change-Id: I85431368d09965687a0301ae2eb3c991f276ce5d
|
952dbb19cd094b8bfb01dbb33e0878db429e499a |
|
28-Jul-2016 |
Vladimir Marko <vmarko@google.com> |
Change suspend entrypoint to save all registers. We avoid the need to save/restore registers in slow paths and get significant code size savings. On Nexus 9, AOSP: - 32-bit boot.oat: -1.4MiB (-1.9%) - 64-bit boot.oat: -2.0MiB (-2.3%) - other 32-bit oat files in dalvik-cache: -200KiB (-1.7%) - other 64-bit oat files in dalvik-cache: -2.3MiB (-2.1%) Test: Run ART test suite on host and Nexus 9 with gc stress. Bug: 30212852 Change-Id: I7015afc1e7d30341618c9200a3dc9ae277afd134
|
542451cc546779f5c67840e105c51205a1b0a8fd |
|
26-Jul-2016 |
Andreas Gampe <agampe@google.com> |
ART: Convert pointer size to enum Move away from size_t to dedicated enum (class). Bug: 30373134 Bug: 30419309 Test: m test-art-host Change-Id: Id453c330f1065012e7d4f9fc24ac477cc9bb9269
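
Note: moving from a raw `size_t` to a dedicated enum class might look like the sketch below. The enumerator names and the `EntrypointOffset` helper are illustrative assumptions, not a copy of the ART definitions.

```cpp
#include <cstddef>

// A scoped enum restricts pointer sizes to the two supported values, so an
// arbitrary size_t (for example an element count) cannot be passed by mistake.
enum class PointerSize : size_t {
  k32 = 4,
  k64 = 8
};

// Pointer size of the machine this code is compiled for.
constexpr PointerSize kRuntimePointerSize =
    sizeof(void*) == 8 ? PointerSize::k64 : PointerSize::k32;

// Hypothetical helper: byte offset of the index-th pointer-sized slot.
inline size_t EntrypointOffset(size_t index, PointerSize pointer_size) {
  return index * static_cast<size_t>(pointer_size);
}
```

The explicit `static_cast` marks every place where the enum is turned back into a number, which makes accidental mixing of sizes and other integers a compile error instead of a silent bug.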
|
806f0122e923581f559043e82cf958bab5defc87 |
|
09-Mar-2016 |
Serban Constantinescu <serban.constantinescu@linaro.org> |
Add support for CallKind::kCallOnMainAndSlowPath Some of the intrinsics call on both the main and slowpath. This patch adds support for such a CallKind and marks the intrinsics accordingly. This will be exercised by a later patch that refactors all the runtime calls to use InvokeRuntime(). Please note that without this patch, the calls to ValidateInvokeRuntime() exercised by the following patches would fail. Change-Id: I450571b8b47280a004b714996189ba6db13fb57d
|
dec8f63fdf50815f24efe1c03af64208da15f339 |
|
22-Jul-2016 |
Roland Levillain <rpl@google.com> |
Do not emit stack maps for runtime calls to ReadBarrierMarkRegX. * Boot image code size variation on Nexus 5X (aosp_bullhead-userdebug build): - total ARM64 framework Oat files size change: 115584120 bytes -> 109124728 bytes (-5.59%) - total ARM framework Oat files size change: 97387728 bytes -> 92517584 (-5.00%) Test: ART host and target (ARM, ARM64) tests. Bug: 29506760 Bug: 12687968 Change-Id: I979d9fb2b4e09f4c0c7bf33af2cd91750a67f989
|
68bd9b9b165ffca1a49b80bb437ce9f87b738264 |
|
15-Jul-2016 |
Alexandre Rames <alexandre.rames@linaro.org> |
ARM64: Improve code generated to spill/restore for slow paths. Aligning the accesses allows generating better code. Before: add x16, sp, #0x44 (68) stp x0, x1, [x16, #-16] After: stp x0, x1, [sp, #56] Change-Id: I3e20ad3fa59d00aee4b4d14ea9d59c7cd546509e
|
54ff482710910929900f8348a19c5b875e519237 |
|
07-Jul-2016 |
Serban Constantinescu <serban.constantinescu@linaro.org> |
Rename kCall to kCallOnMainOnly This patch renames kCall to kCallOnMainOnly in preparation for the next patch in this series which will be adding kCallOnMainAndSlowPath. Note: With this patch there will be places where we use kCallOnMainOnly even though we call on the slow path too. The next patch in this series will fix that. Test: ART host tests. Change-Id: Iabfdb0901990d163be5d780f3bdd2fab6fa17b32
|
e90049140fdfb89080e5cc9b000b0c9be8c18bcd |
|
16-Jun-2016 |
Vladimir Marko <vmarko@google.com> |
Create a typedef for HInstruction::GetInputs() return type. And some other cleanup after https://android-review.googlesource.com/230742 Test: No new tests. ART test suite passed (tested on host). Change-Id: I4743bf17544d0234c6ccb46dd0c1b9aae5c93e17
|
87f3fcbd0db352157fc59148e94647ef21b73bce |
|
28-Apr-2016 |
Vladimir Marko <vmarko@google.com> |
Replace String.charAt() with HIR. Replace String.charAt() with HArrayLength, HBoundsCheck and HArrayGet. This allows GVN on the HArrayLength and BCE on the HBoundsCheck as well as using the infrastructure for HArrayGet, i.e. better handling of constant indexes than the old intrinsic and using the HArm64IntermediateAddress. Bug: 28330359 Change-Id: I32bf1da7eeafe82537a60416abf6ac412baa80dc
|
372f10e5b0b34e2bb6e2b79aeba6c441e14afd1f |
|
17-May-2016 |
Vladimir Marko <vmarko@google.com> |
Refactor handling of input records. Introduce HInstruction::GetInputRecords(), a new virtual function that returns an ArrayRef<> to all input records. Implement all other functions dealing with input records as wrappers around GetInputRecords(). Rewrite functions that previously used multiple virtual calls to deal with input records, especially in loops, to prefetch the ArrayRef<> only once for each instruction. Besides avoiding all the extra calls, this also allows the compiler (clang++) to perform additional optimizations. This speeds up the Nexus 5 boot image compilation by ~0.5s (4% of "Compile Dex File", 2% of dex2oat time) on AOSP ToT. Change-Id: Id8ebe0fb9405e38d918972a11bd724146e4ca578
|
288c7a8664e516d7486ab85267050e676e84cc39 |
|
16-May-2016 |
Serguei Katkov <serguei.i.katkov@intel.com> |
Revert "Revert "ART: Reference.getReferent intrinsic for x86 and x86_64"" This reverts commit 0997d24e67d78f2146ebae2888eda0d7d254789a. ART_HEAP_POISONING=true mode is fixed. Change-Id: I83f6d5c101ea6a86802753f81b3e4348a263fb21 Signed-off-by: Serguei Katkov <serguei.i.katkov@intel.com>
|
0997d24e67d78f2146ebae2888eda0d7d254789a |
|
13-May-2016 |
Nicolas Geoffray <ngeoffray@google.com> |
Revert "ART: Reference.getReferent intrinsic for x86 and x86_64" Fails heap poisoning configuration. This reverts commit afdc97ebcb4e58afb7cf54d846d30314e6499d83. Change-Id: I50e53756a2b85059b89cfb8950f8c9e2b032743c
|
afdc97ebcb4e58afb7cf54d846d30314e6499d83 |
|
05-May-2016 |
Serguei Katkov <serguei.i.katkov@intel.com> |
ART: Reference.getReferent intrinsic for x86 and x86_64 Change-Id: I7a7ac9244847dd80d9fa4e4b5ebc5bf451c628ff Signed-off-by: Serguei Katkov <serguei.i.katkov@intel.com>
|
dce016eab87302f02b0bd903dd2cd86ae512df2d |
|
28-Apr-2016 |
Vladimir Marko <vmarko@google.com> |
Intrinsify String.length() and String.isEmpty() as HIR. Use HArrayLength for String.length() in anticipation of changing the String.charAt() to HBoundsCheck+HArrayGet to allow the existing BCE to seamlessly work for strings. Use HArrayLength+HEqual for String.isEmpty(). We previously relied on inlining but we now want to apply the new intrinsics even when we do not inline, i.e. when compiling debuggable (as is currently the case for boot image) or when we hit inlining limits, i.e. depth, size, or the number of accumulated dex registers. Bug: 28330359 Change-Id: Iab9d2f6d2967bdd930a72eb461f27efe8f37c103
|
c393d63aa2b8f6984672fdd4de631bbeff14b6a2 |
|
15-Apr-2016 |
Alexandre Rames <alexandre.rames@linaro.org> |
Fix: correctly destruct VIXL labels. (cherry picked from commit c01a66465a398ad15da90ab2bdc35b7f4a609b17) Bug: 27505766 Change-Id: I077465e3d308f4331e7a861902e05865f9d99835
|
c01a66465a398ad15da90ab2bdc35b7f4a609b17 |
|
15-Apr-2016 |
Alexandre Rames <alexandre.rames@linaro.org> |
Fix: correctly destruct VIXL labels. Bug: 27505766 Change-Id: I077465e3d308f4331e7a861902e05865f9d99835
|
d58b837ae41c6d8ce010c362e8f85bd938715900 |
|
12-Apr-2016 |
Vladimir Marko <vmarko@google.com> |
Allocate code generators on the arena. Change-Id: If8cf0ee43711f6e13171443e3c057ff370ccfbaa
|
dee58d6bb6d567fcd0c4f39d8d690c3acaf0e432 |
|
07-Apr-2016 |
David Brazdil <dbrazdil@google.com> |
Revert "Revert "Refactor HGraphBuilder and SsaBuilder to remove HLocals"" This patch merges the instruction-building phases from HGraphBuilder and SsaBuilder into a single HInstructionBuilder class. As a result, it is not necessary to generate HLocal, HLoadLocal and HStoreLocal instructions any more, as the builder produces SSA form directly. Saves 5-15% of arena-allocated memory (see bug for more data): GMS 20.46MB => 19.26MB (-5.86%) Maps 24.12MB => 21.47MB (-10.98%) YouTube 28.60MB => 26.01MB (-9.05%) This CL fixed an issue with parsing quickened instructions. Bug: 27894376 Bug: 27998571 Bug: 27995065 Change-Id: I20dbe1bf2d0fe296377478db98cb86cba695e694
|
60328910cad396589474f8513391ba733d19390b |
|
04-Apr-2016 |
David Brazdil <dbrazdil@google.com> |
Revert "Refactor HGraphBuilder and SsaBuilder to remove HLocals" Bug: 27995065 This reverts commit e3ff7b293be2a6791fe9d135d660c0cffe4bd73f. Change-Id: I5363c7ce18f47fd422c15eed5423a345a57249d8
|
9d07e3d128ccfa0ef7670feadd424a825e447d1d |
|
31-Mar-2016 |
Vladimir Marko <vmarko@google.com> |
Clean up OatQuickMethodHeader after Quick removal. This reduces the size of the pre-header by 8 bytes, reducing oat file size and mmapped .text section size. The memory needed to store a CompiledMethod by dex2oat is also reduced, for 32-bit dex2oat by 8B and for 64-bit dex2oat by 16B. The aosp_flounder-userdebug 32-bit and 64-bit boot.oat are each about 1.1MiB smaller. Disable the broken StubTest.IMT, b/27991555 . Change-Id: I05fe45c28c8ffb7a0fa8b1117b969786748b1039
|
e3ff7b293be2a6791fe9d135d660c0cffe4bd73f |
|
02-Mar-2016 |
David Brazdil <dbrazdil@google.com> |
Refactor HGraphBuilder and SsaBuilder to remove HLocals This patch merges the instruction-building phases from HGraphBuilder and SsaBuilder into a single HInstructionBuilder class. As a result, it is not necessary to generate HLocal, HLoadLocal and HStoreLocal instructions any more, as the builder produces SSA form directly. Saves 5-15% of arena-allocated memory (see bug for more data): GMS 20.46MB => 19.26MB (-5.86%) Maps 24.12MB => 21.47MB (-10.98%) YouTube 28.60MB => 26.01MB (-9.05%) Bug: 27894376 Change-Id: Iefe28d40600c169c5d306fd2c77034ae19476d90
|
86ea7eeabe30c98bbe1651a51d03cb89776724e7 |
|
16-Feb-2016 |
David Brazdil <dbrazdil@google.com> |
Build dominator tree before generating HInstructions Second CL in the series of merging HGraphBuilder and SsaBuilder. This patch refactors the builders so that dominator tree can be built before any HInstructions are generated. This puts the SsaBuilder removal of HLoadLocals/HStoreLocals straight after HGraphBuilder's HInstruction generation phase. Next CL will therefore be able to merge them. This patch also adds util classes for iterating bytecode and switch tables which made it possible to simplify the code. Bug: 27894376 Change-Id: Ic425d298b2e6e7980481ed697230b1a0b7904526
|
09ed09866da6d8c7448ef297c148bfa577a247c2 |
|
12-Feb-2016 |
David Srbecky <dsrbecky@google.com> |
Pack stack map entries on bit level to save space. Use only the minimum number of bits required to store stack map data. For example, if native_pc needs 5 bits and dex_pc needs 3 bits, they will share the first byte of the stack map entry. The header is changed to store bit offsets of the fields rather than byte sizes. Offsets also make it easier to access later fields without calculating sum of all previous sizes. All of the header fields are byte sized or encoded as ULEB128 instead of the previous fixed size encoding. This shrinks it by about half. It saves 3.6 MB from non-debuggable boot.oat (AOSP). It saves 3.1 MB from debuggable boot.oat (AOSP). It saves 2.8 MB (of 99.4 MB) from /system/framework/arm/ (GOOG). It saves 1.0 MB (of 27.8 MB) from /system/framework/oat/arm/ (GOOG). Field loads from stackmaps seem to get around 10% faster. (based on the time it takes to load all stackmap entries from boot.oat) Bug: 27640410 Change-Id: I8bf0996b4eb24300c1b0dfc6e9d99fe85d04a1b7
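
Note: the bit-level packing described above can be sketched with a pair of helpers that store and load fields at arbitrary bit offsets. The function names and the byte-vector storage are assumptions for illustration; the real encoding also carries a header that is not modeled here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Number of bits needed to represent value (0 needs 0 bits).
inline size_t MinimumBitsToStore(uint32_t value) {
  size_t bits = 0;
  while (value != 0) {
    ++bits;
    value >>= 1;
  }
  return bits;
}

// Write the low bit_count bits of value starting at bit_offset, growing the
// buffer as needed. Bits are stored least significant first.
inline void StoreBits(std::vector<uint8_t>& data, size_t bit_offset,
                      size_t bit_count, uint32_t value) {
  assert(bit_count <= 32);
  for (size_t i = 0; i < bit_count; ++i) {
    const size_t bit = bit_offset + i;
    if (bit / 8 >= data.size()) {
      data.resize(bit / 8 + 1, 0);
    }
    const uint8_t mask = static_cast<uint8_t>(1u << (bit % 8));
    if (((value >> i) & 1u) != 0) {
      data[bit / 8] |= mask;
    } else {
      data[bit / 8] &= static_cast<uint8_t>(~mask);
    }
  }
}

// Read bit_count bits starting at bit_offset.
inline uint32_t LoadBits(const std::vector<uint8_t>& data, size_t bit_offset,
                         size_t bit_count) {
  assert(bit_count <= 32);
  uint32_t value = 0;
  for (size_t i = 0; i < bit_count; ++i) {
    const size_t bit = bit_offset + i;
    value |= static_cast<uint32_t>((data[bit / 8] >> (bit % 8)) & 1u) << i;
  }
  return value;
}
```

With a 5-bit native_pc stored at bit offset 0 and a 3-bit dex_pc at bit offset 5, both fields fit in the first byte, matching the example in the commit message. Storing bit offsets rather than sizes lets a reader jump straight to a later field without summing the widths of the earlier ones.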
|
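The bit-level packing described in the entry above can be illustrated with a minimal sketch. It is not ART's actual stack map encoder: `MinBitsFor`, `BitBuffer` and `EncodeUleb128` are hypothetical names chosen for illustration, and only the idea (fields sharing a byte, addressed by bit offsets, with ULEB128 for header values) comes from the commit message.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: smallest number of bits able to represent `value`.
inline size_t MinBitsFor(uint32_t value) {
  size_t bits = 0;
  while (value != 0) {
    ++bits;
    value >>= 1;
  }
  return bits;
}

// Hypothetical bit-granular buffer (an assumption, not an ART class).
class BitBuffer {
 public:
  // Appends the low `bit_count` bits of `value`, least significant bit first.
  void Append(uint32_t value, size_t bit_count) {
    for (size_t i = 0; i < bit_count; ++i) {
      if (bit_size_ % 8 == 0) {
        bytes_.push_back(0);  // Start a new byte only when the previous one is full.
      }
      if ((value >> i) & 1u) {
        bytes_[bit_size_ / 8] |= static_cast<uint8_t>(1u << (bit_size_ % 8));
      }
      ++bit_size_;
    }
  }

  // Reads `bit_count` bits starting at `bit_offset`. Knowing the offset means
  // no sum over the sizes of earlier fields is needed.
  uint32_t Load(size_t bit_offset, size_t bit_count) const {
    assert(bit_offset + bit_count <= bit_size_);
    uint32_t result = 0;
    for (size_t i = 0; i < bit_count; ++i) {
      size_t pos = bit_offset + i;
      uint32_t bit = (bytes_[pos / 8] >> (pos % 8)) & 1u;
      result |= bit << i;
    }
    return result;
  }

  size_t SizeInBytes() const { return bytes_.size(); }

 private:
  std::vector<uint8_t> bytes_;
  size_t bit_size_ = 0;
};

// Standard ULEB128: 7 payload bits per byte, high bit set on all but the last byte.
inline void EncodeUleb128(std::vector<uint8_t>* out, uint32_t value) {
  do {
    uint8_t byte = value & 0x7f;
    value >>= 7;
    if (value != 0) {
      byte |= 0x80;
    }
    out->push_back(byte);
  } while (value != 0);
}
```

With a 5-bit native_pc and a 3-bit dex_pc, as in the commit's own example, both fields fit in the first byte of the entry, and the second field is read back from bit offset 5.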
bf9611f821697b14bf9e170f503c3f47613b046b |
|
26-Mar-2016 |
Andreas Gampe <agampe@google.com> |
ART: Clean up verifier Clean up verifier post-Quick. Change-Id: I0b05e10dd06edd228fe2068c8afffc4b7d7fdffa
|
f6a35de9eeefb20f6446f1b4815b4dcb0161d09c |
|
21-Mar-2016 |
Vladimir Marko <vmarko@google.com> |
Optimizing: Fix register allocator validation memory usage. Also attribute ArenaBitVector allocations to appropriate passes. This was used to track down the source of the excessive memory allocations. Bug: 27690481 Change-Id: Ib895984cb7c04e24cbc7abbd8322079bab8ab100
|
d28f4a00933a4a3b8d5e9db73b8532924d0f989d |
|
14-Mar-2016 |
David Srbecky <dsrbecky@google.com> |
Generate native debug stackmaps before calls as well. The debugger looks up PC of the call instruction, so the runtime's stackmap is not sufficient since it is at PC after the instruction. Change-Id: I0dd06c0b52e8079ea5d064ea10beb12c93584092
|
2ae48182573da7087bffc2873730bc758ec29696 |
|
16-Mar-2016 |
Calin Juravle <calin@google.com> |
Clean up NullCheck generation and record stats about it. This removes redundant code from the generators and allows for easier stat recording. Change-Id: Iccd4368f9e9d87a6fecb863dee4e2145c97851c4
|
1693a1f9c83a0bf5a29fa18ddc2d87e04e049233 |
|
15-Mar-2016 |
Roland Levillain <rpl@google.com> |
Make art::HCompare side effect free. All our back ends implement all comparisons without making a runtime call, so we can mark art::HCompare as a side effect free instruction unconditionally. Change-Id: I9a9e7c09156c642edb6af1fe84408f887e762f2e
|
d1c4045fb4d4703642f3f79985727b9a12cf5c49 |
|
03-Mar-2016 |
Aart Bik <ajcbik@google.com> |
Avoid generating dead code on frame enter/exit. This includes stack operations and, on x86, call/pop to read PC. bug=26997690 Rationale: (1) If method is fully intrinsified, and makes no calls in slow path or uses special input, no need to require current method. (2) Invoke instructions with HasPcRelativeDexCache() generate code that reads the PC (call/pop) on x86. However, if the invoke is an intrinsic that is later replaced with actual code, this PC reading code may be dead. Example X86 (before/after): 0x0000108c: 83EC0C sub esp, 12 0x0000108f: 890424 mov [esp], eax <-- not needed 0x00001092: E800000000 call +0 (0x00001097) 0x00001097: 58 pop eax <-- dead code to read PC 0x00001098: F30FB8C1 popcnt eax, ecx 0x0000109c: F30FB8DA popcnt ebx, edx 0x000010a0: 03D8 add ebx, eax 0x000010a2: 89D8 mov eax, ebx 0x000010a4: 83C40C add esp, 12 <-- not needed 0x000010a7: C3 ret 0x0000103c: F30FB8C1 popcnt eax, ecx 0x00001040: F30FB8DA popcnt ebx, edx 0x00001044: 03D8 add ebx, eax 0x00001046: 89D8 mov eax, ebx 0x00001048: C3 ret Example ARM64 (before/after): 0x0000103c: f81e0fe0 str x0, [sp, #-32]! 0x00001040: f9000ffe str lr, [sp, #24] 0x00001044: dac01020 clz x0, x1 0x00001048: f9400ffe ldr lr, [sp, #24] 0x0000104c: 910083ff add sp, sp, #0x20 (32) 0x00001050: d65f03c0 ret 0x0000103c: dac01020 clz x0, x1 0x00001040: d65f03c0 ret Change-Id: I8377db80c9a901a08fff4624927cf4a6e585da0c
|
9cd6d378bd573cdc14d049d32bdd22a97fa4d84a |
|
09-Feb-2016 |
David Srbecky <dsrbecky@google.com> |
Associate slow paths with the instruction that they belong to. Almost all slow paths already know the instruction they belong to, this CL just moves the knowledge to the base class as well. This is needed to be able to get the corresponding dex pc for slow path, which allows us to generate better native line numbers, which in turn fixes some native debugging stepping issues. Change-Id: I568dbe78a7cea6a43a4a71a014b3ad135782c270
|
c7098ff991bb4e00a800d315d1c36f52a9cb0149 |
|
09-Feb-2016 |
David Srbecky <dsrbecky@google.com> |
Remove HNativeDebugInfo from start of basic blocks. We do not require full environment at the start of basic block. The dex pc contained in basic block is sufficient for line mapping. Change-Id: I5ba9e5f5acbc4a783ad544769f9a73bb33e2bafa
|
6e332529c33be4d7dae5dad3609a839f4c0d3bfc |
|
02-Feb-2016 |
David Brazdil <dbrazdil@google.com> |
ART: Remove HTemporary Change-Id: I21b984224370a9ce7a4a13a9652503cfb03c5f03
|
b331febbab8e916680faba722cc84b66b84218a3 |
|
05-Feb-2016 |
Nicolas Geoffray <ngeoffray@google.com> |
Revert "Revert "Implement on-stack replacement for arm/arm64/x86/x86_64."" This reverts commit bd89a5c556324062b7d841843b039392e84cfaf4. Change-Id: I08d190431520baa7fcec8fbdb444519f25ac8d44
|
bd89a5c556324062b7d841843b039392e84cfaf4 |
|
05-Feb-2016 |
David Brazdil <dbrazdil@google.com> |
Revert "Implement on-stack replacement for arm/arm64/x86/x86_64." DCHECK whether loop headers are covered fails. This reverts commit 891bc286963892ed96134ca1adb7822737af9710. Change-Id: I0f9a90630b014b16d20ba1dfba31ce63e6648021
|
891bc286963892ed96134ca1adb7822737af9710 |
|
29-Jan-2016 |
Nicolas Geoffray <ngeoffray@google.com> |
Implement on-stack replacement for arm/arm64/x86/x86_64. High-level overview: - osr_method_threshold is used to know when to compile a method in osr mode (-> treat all loops as irreducible). - branch instructions in the compiler query whether they can jump to an osr method. - An osr entry point is found through the stack maps: if a stack map is duplicated in the CodeInfo, it is an osr entry point. Change-Id: Ifb39338cd281e2c7eccce67f4e18d46428be71e4
|
58282f4510961317b8d5a364a6f740a78926716f |
|
14-Jan-2016 |
David Brazdil <dbrazdil@google.com> |
ART: Remove Baseline compiler We don't need Baseline any more and it hasn't been maintained for a while anyway. Let's remove it. Change-Id: I442ed26855527be2df3c79935403a25b1ee55df6
|
780aeced2a8ef918901d8f450864de934f79c555 |
|
13-Jan-2016 |
Alexandre Rames <alexandre.rames@linaro.org> |
Update `ValidateInvokeRuntime()` and HDivZeroCheck. Change-Id: I35beab2777a8c83bd508d56966afa1ceff9ee24f
|
b7070a2db8b0b7eca14f01f932be305be64ded57 |
|
08-Jan-2016 |
David Srbecky <dsrbecky@google.com> |
Generate Nops to ensure that debug stack maps have distinct PC. Change-Id: I5740ec958a20d236634b66df0e675382ed5c16fc
|
f71b3ade9c99ce2fec2f5049ce9c5968721e1b81 |
|
08-Dec-2015 |
David Srbecky <dsrbecky@google.com> |
Get source mapping table from stack maps. Stack maps contain pc to dex mapping. Reuse them instead of maintaining separate map. Change-Id: Iaaec9a6bd2603eace1dfc8f4344087883d88cce3
|
c53c0797a78a89d637e4230503cc1feb27e855a8 |
|
19-Nov-2015 |
Vladimir Marko <vmarko@google.com> |
Clean up the special input in HInvokeStaticOrDirect. Change-Id: I4042aefbdac1a8c236d00e2e7145349a64f6486b
|
0d5a281c671444bfa75d63caf1427a8c0e6e1177 |
|
13-Nov-2015 |
Roland Levillain <rpl@google.com> |
x86/x86-64 read barrier support for concurrent GC in Optimizing. This first implementation uses slow paths to instrument heap reference loads and GC root loads for the concurrent copying collector, respectively calling the artReadBarrierSlow and artReadBarrierForRootSlow (new) runtime entry points. Notes: - This implementation does not instrument HInvokeVirtual nor HInvokeInterface instructions (for class reference loads), as the corresponding read barriers are not strictly required with the current concurrent copying collector. - Intrinsics which may eventually call (on slow path) are disabled when read barriers are enabled, as the current slow path infrastructure does not support this case. - When read barriers are enabled, the code generated for a HArraySet instruction always goes into the array set slow path for object arrays (delegating the operation to the runtime), as we are lacking a mechanism to keep a temporary register live across a runtime call (needed for the instrumentation of type checking code, which requires two successive read barriers). Bug: 12687968 Change-Id: I14cd6107233c326389120336f93955b28ffbb329
|
0f7dca4ca0be8d2f8776794d35edf8b51b5bc997 |
|
02-Nov-2015 |
Vladimir Marko <vmarko@google.com> |
Optimizing/X86: PC-relative dex cache array addressing. Add PC-relative dex cache array addressing for X86 and use it for better invoke-static/-direct dispatch. Also delay the initialization to the PC-relative base until needed. Change-Id: Ib8634d5edce4920cd70172fd13211809cf6948d1
|
d28b969c273ab777ca9b147b87fcef671b4f695f |
|
04-Nov-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Code cleanup to avoid CompilerDriver abstractions in JIT. Avoids allocating a CompiledMethod. Change-Id: I35b4aa0d7c74daba68e827a01e71c300fce3b3bf
|
dc151b2346bb8a4fdeed0c06e54c2fca21d59b5d |
|
15-Oct-2015 |
Vladimir Marko <vmarko@google.com> |
Optimizing: Determine invoke-static/-direct dispatch early. Determine the dispatch type of invoke-static/-direct in a special pass right after the type inference. This allows the inliner to pass the "needs dex cache" check and inline more. It also allows the code generator to avoid requesting a register location for the ArtMethod* for kDexCachePcRelative and direct methods. The supported dispatch check handles also situations that the CompilerDriver currently doesn't allow. The cleanup of the CompilerDriver and required changes to Quick will come in a separate change. Change-Id: I3f8e903a119949e95871d8ab0a995f4731a13a07
|
f652cecb984c104d44a0223c3c98400ef8ed8ce2 |
|
25-Aug-2015 |
Goran Jakovljevic <Goran.Jakovljevic@imgtec.com> |
MIPS: Initial version of optimizing compiler for MIPS32 Change-Id: I370388e8d5de52c7001552b513877ef5833aa621
|
5bd05a5c9492189ec28edaf6396d6a39ddf03367 |
|
13-Oct-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Implement System.arraycopy intrinsic for arm. Change-Id: I58ae1af5103e281fe59fbe022b718d6d8f293a5e
|
ec7802a102d49ab5c17495118d4fe0bcc7287beb |
|
01-Oct-2015 |
Vladimir Marko <vmarko@google.com> |
Add DCHECKs to ArenaVector and ScopedArenaVector. Implement dchecked_vector<> template that DCHECK()s element access and insert()/emplace()/erase() positions. Change the ArenaVector<> and ScopedArenaVector<> aliases to use the new template instead of std::vector<>. Remove DCHECK()s that have now become unnecessary from the Optimizing compiler. Change-Id: Ib8506bd30d223f68f52bd4476c76d9991acacadc
|
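The idea behind the checked vector in the entry above can be sketched as a thin wrapper whose accessors assert their preconditions. This is an assumption-laden illustration, not the real `dchecked_vector<>`: the class name `CheckedVector` is hypothetical, only a few members are shown, and plain `assert()` stands in for ART's `DCHECK()`.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical, heavily simplified stand-in for a debug-checked vector.
// assert() plays the role of DCHECK(): active in debug builds, compiled out with NDEBUG.
template <typename T>
class CheckedVector {
 public:
  using iterator = typename std::vector<T>::iterator;
  using const_iterator = typename std::vector<T>::const_iterator;

  void push_back(const T& value) { data_.push_back(value); }
  size_t size() const { return data_.size(); }
  iterator begin() { return data_.begin(); }
  iterator end() { return data_.end(); }

  // Element access is checked against the current size.
  T& operator[](size_t index) {
    assert(index < data_.size());
    return data_[index];
  }
  const T& operator[](size_t index) const {
    assert(index < data_.size());
    return data_[index];
  }

  // insert() accepts any position in [begin, end].
  iterator insert(const_iterator pos, const T& value) {
    assert(data_.cbegin() <= pos && pos <= data_.cend());
    return data_.insert(pos, value);
  }

  // erase() requires a dereferenceable position in [begin, end).
  iterator erase(const_iterator pos) {
    assert(data_.cbegin() <= pos && pos < data_.cend());
    return data_.erase(pos);
  }

 private:
  std::vector<T> data_;
};
```

Putting the checks inside the container is what lets callers drop their own index checks, which matches the commit's removal of now-redundant DCHECK()s from the Optimizing compiler.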
580b609cd6cfef46108156457df42254d11e72a7 |
|
06-Oct-2015 |
Calin Juravle <calin@google.com> |
Fix location summary for LoadClass Don't request a register for the current method if we're gonna call the runtime. Change-Id: I9760d15108bd95efb2a34e6eacd84b60841781d7
|
98893e146b0ff0e1fd1d7c29252f1d1e75a163f2 |
|
02-Oct-2015 |
Calin Juravle <calin@google.com> |
Add support for unresolved classes in optimizing. Change-Id: I0e299a81e560eb9cb0737ec46125dffc99333b54
|
e460d1df1f789c7c8bb97024a8efbd713ac175e9 |
|
29-Sep-2015 |
Calin Juravle <calin@google.com> |
Revert "Revert "Support unresolved fields in optimizing"" The CL also changes the calling convention for 64bit static field set to use kArg2 instead of kArg1. This allows optimizing to keep the assumptions: - arm pairs are always of form (even_reg, odd_reg) - ecx_edx is not used as a register on x86. This reverts commit e6f49b47b6a4dc9c7684e4483757872cfc7ff1a1. Change-Id: I93159917565824084abc96775f31be1a4249f2f3
|
225b6464a58ebe11c156144653f11a1c6607f4eb |
|
28-Sep-2015 |
Vladimir Marko <vmarko@google.com> |
Optimizing: Tag arena allocations in code generators. And completely remove the deprecated GrowableArray. Replace GrowableArray with ArenaVector in code generators and related classes and tag arena allocations. Label arrays use direct allocations from ArenaAllocator because Label is non-copyable and non-movable and as such cannot be really held in a container. The GrowableArray never actually constructed them, instead relying on the zero-initialized storage from the arena allocator to be correct. We now actually construct the labels. Also avoid StackMapStream::ComputeDexRegisterMapSize() being passed null references, even though unused. Change-Id: I26a46fdd406b23a3969300a67739d55528df8bf4
|
e6f49b47b6a4dc9c7684e4483757872cfc7ff1a1 |
|
17-Sep-2015 |
Calin Juravle <calin@google.com> |
Revert "Support unresolved fields in optimizing" breaks debuggable tests. This reverts commit 23a8e35481face09183a24b9d11e505597c75ebb. Change-Id: I8e60b5c8f48525975f25d19e5e8066c1c94bd2e5
|
23a8e35481face09183a24b9d11e505597c75ebb |
|
08-Sep-2015 |
Calin Juravle <calin@google.com> |
Support unresolved fields in optimizing Change-Id: I9941fa5fcb6ef0a7a253c7a0b479a44a0210aad4
|
175dc732c80e6f2afd83209348124df349290ba8 |
|
25-Aug-2015 |
Calin Juravle <calin@google.com> |
Support unresolved methods in Optimizing Change-Id: If2da02b50d2fa668cd58f134a005f1752e7746b1
|
fa6b93c4b69e6d7ddfa2a4ed0aff01b0608c5a3a |
|
15-Sep-2015 |
Vladimir Marko <vmarko@google.com> |
Optimizing: Tag arena allocations in HGraph. Replace GrowableArray with ArenaVector in HGraph and related classes HEnvironment, HLoopInformation, HInvoke and HPhi, and tag allocations with new arena allocation types. Change-Id: I3d79897af405b9a1a5b98bfc372e70fe0b3bc40d
|
77a48ae01bbc5b05ca009cf09e2fcb53e4c8ff23 |
|
15-Sep-2015 |
David Brazdil <dbrazdil@google.com> |
Revert "Revert "ART: Register allocation and runtime support for try/catch"" The original CL triggered b/24084144 which has been fixed by Ib72e12a018437c404e82f7ad414554c66a4c6f8c. This reverts commit 659562aaf133c41b8d90ec9216c07646f0f14362. Change-Id: Id8980436172457d0fcb276349c4405f7c4110a55
|
659562aaf133c41b8d90ec9216c07646f0f14362 |
|
14-Sep-2015 |
David Brazdil <dbrazdil@google.com> |
Revert "ART: Register allocation and runtime support for try/catch" Breaks libcore test org.apache.harmony.security.tests.java.security.KeyStorePrivateKeyEntryTest#testGetCertificateChain. Need to investigate. This reverts commit b022fa1300e6d78639b3b910af0cf85c43df44bb. Change-Id: Ib24d3a80064d963d273e557a93469c95f37b1f6f
|
b022fa1300e6d78639b3b910af0cf85c43df44bb |
|
20-Aug-2015 |
David Brazdil <dbrazdil@google.com> |
ART: Register allocation and runtime support for try/catch This patch completes a series of CLs that add support for try/catch in the Optimizing compiler. With it, Optimizing can compile all methods containing try/catch, provided they don't contain catch loops. Future work will focus on improving performance of the generated code. SsaLivenessAnalysis was updated to propagate liveness information of instructions live at catch blocks, and to keep location information on instructions which may be caught by catch phis. RegisterAllocator was extended to spill values used after catch, and to allocate spill slots for catch phis. Catch phis generated for the same vreg share a spill slot as the raw value must be the same. Location builders and slow paths were updated to reflect the fact that throwing an exception may not lead to escaping the method. Instruction code generators are forbidden from using implicit null checks in try blocks as live registers need to be saved before handing over to the runtime. CodeGenerator emits a stack map for each catch block, storing locations of catch phis. CodeInfo and StackMapStream recognize this new type of stack map and store them separate from other stack maps to avoid dex_pc conflicts. After having found the target catch block to deliver an exception to, QuickExceptionHandler looks up the dex register maps at the throwing instruction and the catch block and copies the values over to their respective locations. The runtime-support approach was selected because it allows for the best performance in the normal control-flow path, since no propagation of catch phi values is necessary until the exception is thrown. In addition, it also greatly simplifies the register allocation phase. ConstantHoisting was removed from LICMTest because it instantiated (now abstract) HConstant and was bogus anyway (constants are always in the entry block). Change-Id: Ie31038ad8e3ee0c13a5bbbbaf5f0b3e532310e4e
|
6058455d486219994921b63a2d774dc9908415a2 |
|
03-Sep-2015 |
Vladimir Marko <vmarko@google.com> |
Optimizing: Tag basic block allocations with their source. Replace GrowableArray with ArenaVector in HBasicBlock and, to track the source of allocations, assign one new and two Quick's arena allocation types to these vectors. Rename kArenaAllocSuccessor to kArenaAllocSuccessors. Bug: 23736311 Change-Id: Ib52e51698890675bde61f007fe6039338cf1a025
|
05792b98980741111b4d0a24d68cff2a8e070a3a |
|
03-Aug-2015 |
Vladimir Marko <vmarko@google.com> |
ART: Move DexCache arrays to native. This CL has a companion CL in libcore/ https://android-review.googlesource.com/162985 Change-Id: Icbc9e20ad1b565e603195b12714762bb446515fa
|
145acc5361deb769eed998f057bc23abaef6e116 |
|
03-Sep-2015 |
Vladimir Marko <vmarko@google.com> |
Revert "Optimizing: Tag basic block allocations with their source." Reverting so that we can have more discussion about the STL API. This reverts commit 91e11c0c840193c6822e66846020b6647de243d5. Change-Id: I187fe52f2c16b6e7c5c9d49c42921eb6c7063dba
|
91e11c0c840193c6822e66846020b6647de243d5 |
|
02-Sep-2015 |
Vladimir Marko <vmarko@google.com> |
Optimizing: Tag basic block allocations with their source. Replace GrowableArray with ArenaVector in HBasicBlock and, to track the source of allocations, assign one new and two Quick's arena allocation types to these vectors. Rename kArenaAllocSuccessor to kArenaAllocSuccessors. Bug: 23736311 Change-Id: I984aef6e615ae2380a532f5c6726af21015f43f5
|
2a7c1ef95c850abae915b3a59fbafa87e6833967 |
|
22-Jul-2015 |
Yevgeny Rouban <yevgeny.y.rouban@intel.com> |
Add more dwarf debug line info for Optimized methods. Optimizing compiler generates minimum debug line info that is built using the dex_pc information about suspend points. This is not enough for performance and debugging needs. This CL generates additional debug line information for instructions which have known dex_pc and it ensures that whole call sites are mapped (as opposed to suspend points which map only one instruction past the function call). Bug: 23157336 Change-Id: I9f2b1c2038e3560847c175b8121cf9496b8b58fa Signed-off-by: Yevgeny Rouban <yevgeny.y.rouban@intel.com>
|
f9f6441c665b5ff9004d3ed55014f46d416fb1bb |
|
02-Sep-2015 |
Vladimir Marko <vmarko@google.com> |
Optimizing: Tag Arena allocations with their source. This adds the ability to track where we allocate memory when the kArenaAllocatorCountAllocations flag is turned on. Also move some allocations from native heap to the Arena and remove some unnecessary utilities. Bug: 23736311 Change-Id: I1aaef3fd405d1de444fe9e618b1ce7ecef07ade3
|
ecc4366670e12b4812ef1653f7c8d52234ca1b1f |
|
13-Aug-2015 |
Serban Constantinescu <serban.constantinescu@linaro.org> |
Add OptimizingCompilerStats to the CodeGenerator class. Just refactoring, not yet used, but will be used by the incoming patch series and future CodeGen specific stats. Change-Id: I7d20489907b82678120518a77bdab9c4cc58f937 Signed-off-by: Serban Constantinescu <serban.constantinescu@linaro.org>
|
4ab02352db4051d590b793f34d166a0b5c633c4a |
|
12-Aug-2015 |
Serban Constantinescu <serban.constantinescu@linaro.org> |
Use CodeGenerator::RecordPcInfo instead of SlowPathCode::RecordPcInfo. Part of a clean-up and refactoring series. SlowPathCode::RecordPcInfo is currently just a wrapper around CodeGenerator::RecordPcInfo. Change-Id: Iffabef4ef37c365051130bf98a6aa6dc0a0fb254 Signed-off-by: Serban Constantinescu <serban.constantinescu@linaro.org>
|
d9cb68e3212d31d61445fb7e8446f68991720009 |
|
25-Aug-2015 |
David Brazdil <dbrazdil@google.com> |
ART: Add (Fpu)RegHigh stack map location kinds When running Optimized code on 64-bit, high value of vreg pair may be stored in the high 32 bits of a CPU register. This is not reflected in stack maps which would encode both the low and high vreg as kInRegister with the same register number, making it indistinguishable from two non-wide vregs with the same value in the lower 32 bits. Deoptimization deals with this by running the verifier and thus obtaining vreg pair information, but this would be too slow for try/ catch. This patch therefore adds two new stack map location kinds: kInRegisterHigh and kInFpuRegisterHigh to differentiate between the two cases. Note that this also applies to floating-point registers on x86. Change-Id: I15092323e56a661673e77bee1f0fca4261374732
|
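The ambiguity that the entry above resolves can be shown with a small sketch. The kind names `kInRegister` and `kInRegisterHigh` come from the commit message. Everything else (`VRegLocation`, `ReadVReg`, the register array) is a hypothetical simplification and not the runtime's actual stack map reader.

```cpp
#include <cstddef>
#include <cstdint>

// Only the two core-register kinds are modelled; the FPU variants would be analogous.
enum class LocationKind { kInRegister, kInRegisterHigh };

// Hypothetical decoded stack map entry for one 32-bit vreg.
struct VRegLocation {
  LocationKind kind;
  size_t reg;  // Index of the 64-bit CPU register holding the value.
};

// Reads a 32-bit vreg from a (hypothetical) array of saved 64-bit registers.
// Without kInRegisterHigh, both halves of a wide pair would be described as
// kInRegister with the same register number and would both read the low half.
inline uint32_t ReadVReg(const VRegLocation& loc, const uint64_t* cpu_regs) {
  uint64_t raw = cpu_regs[loc.reg];
  if (loc.kind == LocationKind::kInRegisterHigh) {
    return static_cast<uint32_t>(raw >> 32);
  }
  return static_cast<uint32_t>(raw);
}
```

Encoding the half in the location kind lets the reader recover each vreg of a pair directly from the stack map, without rerunning the verifier to learn which vregs form a wide pair.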
581550137ee3a068a14224870e71aeee924a0646 |
|
19-Aug-2015 |
Vladimir Marko <vmarko@google.com> |
Revert "Revert "Optimizing: Better invoke-static/-direct dispatch."" Fixed kCallArtMethod to use correct callee location for kRecursive. This combination is used when compiling with debuggable flag set. This reverts commit b2c431e80e92eb6437788cc544cee6c88c3156df. Change-Id: Idee0f2a794199ebdf24892c60f8a5dcf057db01c
|
b2c431e80e92eb6437788cc544cee6c88c3156df |
|
19-Aug-2015 |
Vladimir Marko <vmarko@google.com> |
Revert "Optimizing: Better invoke-static/-direct dispatch." Reverting due to failing ndebug tests. This reverts commit 9b688a095afbae21112df5d495487ac5231b12d0. Change-Id: Ie4f69da6609df3b7c8443412b6cf7f5c43c2c5d9
|
9b688a095afbae21112df5d495487ac5231b12d0 |
|
06-May-2015 |
Vladimir Marko <vmarko@google.com> |
Optimizing: Better invoke-static/-direct dispatch. Add framework for different types of loading ArtMethod* and code pointer retrieval. Implement invoke-static and invoke-direct calls the same way as Quick. Document the dispatch kinds in HInvokeStaticOrDirect's new enumerations MethodLoadKind and CodePtrLocation. PC-relative loads from dex cache arrays are used only for x86-64 and arm64. The implementation for other architectures will be done in separate CLs. Change-Id: I468ca4d422dbd14748e1ba6b45289f0d31734d94
|
50fa993d67f8a20322c27c1a77e7efcf826531fc |
|
11-Aug-2015 |
Alex Light <allight@google.com> |
Svelter libart-compiler Added new environment variable ART_{TARGET,HOST}_CODEGEN_ARCHS which may be set to 'all', 'svelte' or a space separated list of architectures. When compiled with ART_{TARGET,HOST}_CODEGEN_ARCHS='all' (the default value) dex2oat will be able to generate output for all supported architectures. When compiled with ART_TARGET_CODEGEN_ARCHS='svelte' only the architectures of the TARGET will be included. When ART_HOST_CODEGEN_ARCHS='svelte' all architectures the target includes and the host architectures will be included on the host dex2oat. If a list of architectures is given only those will be included. Change-Id: I87f4ad0131ab1b37544d8799e947ce4733b6daec
|
df3f8227badd0276177774a72f1bcb181688d954 |
|
13-Aug-2015 |
Roland Levillain <rpl@google.com> |
Adjust art::HTypeConversion's side effects for MIPS64. Also improve debugging information in art::CodeGenerator::ValidateInvokeRuntime. Change-Id: Icfcd1a5cfa5e5449a316251dc20547de6badecb5
|
78e3ef6bc5f8aa149f2f8bf0c78ce854c2f910fa |
|
12-Aug-2015 |
Alexandre Rames <alexandre.rames@linaro.org> |
Add a GVN dependency 'GC' for garbage collection. This will be used by incoming architecture specific optimizations. The dependencies must be conservative. When an HInstruction is created we may not be sure whether it can trigger GC. In that case the 'ChangesGC' dependency must be set. We control at code-generation time that HInstructions that can call have the 'ChangesGC' dependency set. Change-Id: Iea6a7f430009f37a9599b0a0039207049906e45d
|
a1935c4fa255b5c20f5e9b2abce6be2d0f7cb0a8 |
|
26-Jun-2015 |
Roland Levillain <rpl@google.com> |
MIPS: Initial version of optimizing compiler for MIPS64R6. (cherry picked from commit 4dda3376b71209fae07f5c3c8ac3eb4b54207aa8) (amended for mnc-dev) Bug: 21555893 Change-Id: I874dc356eee6ab061a32f8f3df5f8ac3a4ab7dcf Signed-off-by: Alexey Frunze <Alexey.Frunze@imgtec.com> Signed-off-by: Douglas Leung <douglas.leung@imgtec.com>
|
fc6a86ab2b70781e72b807c1798b83829ca7f931 |
|
26-Jun-2015 |
David Brazdil <dbrazdil@google.com> |
Revert "Revert "ART: Implement try/catch blocks in Builder"" This patch enables the GraphBuilder to generate blocks and edges which represent the exceptional control flow when try/catch blocks are present in the code. Actual compilation is still delegated to Quick and Baseline ignores the additional code. To represent the relationship between try and catch blocks, Builder splits the edges which enter/exit a try block and links the newly created blocks to the corresponding exception handlers. This layout will later enable the SsaBuilder to correctly infer the dominators of the catch blocks and to produce the appropriate reverse post ordering. It will not, however, allow for building the complete SSA form of the catch blocks and consequently optimizing such blocks. To this end, a new TryBoundary control-flow instruction is introduced. Codegen treats it the same as a Goto but it allows for additional successors (the handlers). This reverts commit 3e18738bd338e9f8363b26bc895f38c0ec682824. Change-Id: I4f5ea961848a0b83d8db3673763861633e9bfcfb
|
3e18738bd338e9f8363b26bc895f38c0ec682824 |
|
26-Jun-2015 |
David Brazdil <dbrazdil@google.com> |
Revert "ART: Implement try/catch blocks in Builder" Causes OutOfMemory issues, need to investigate. This reverts commit 0b5c7d1994b76090afcc825e737f2b8c546da2f8. Change-Id: I263e6cc4df5f9a56ad2ce44e18932ca51d7e349f
|
0b5c7d1994b76090afcc825e737f2b8c546da2f8 |
|
11-Jun-2015 |
David Brazdil <dbrazdil@google.com> |
ART: Implement try/catch blocks in Builder This patch enables the GraphBuilder to generate blocks and edges which represent the exceptional control flow when try/catch blocks are present in the code. Actual compilation is still delegated to Quick and Baseline ignores the additional code. To represent the relationship between try and catch blocks, Builder splits the edges which enter/exit a try block and links the newly created blocks to the corresponding exception handlers. This layout will later enable the SsaBuilder to correctly infer the dominators of the catch blocks and to produce the appropriate reverse post ordering. It will not, however, allow for building the complete SSA form of the catch blocks and consequently optimizing such blocks. To this end, a new TryBoundary control-flow instruction is introduced. Codegen treats it the same as a Goto but it allows for additional successors (the handlers). Change-Id: I415b985596d5bebb7b1bb358a46e08b7b04bb53a
|
eb7b7399dbdb5e471b8ae00a567bf4f19edd3907 |
|
19-Jun-2015 |
Alexandre Rames <alexandre.rames@linaro.org> |
Opt compiler: Add disassembly to the '.cfg' output. This is automatically added to the '.cfg' output when using the usual `--dump-cfg` option. Change-Id: I864bfc3a8299c042e72e451cc7730ad8271e4deb
|
4dda3376b71209fae07f5c3c8ac3eb4b54207aa8 |
|
02-Jun-2015 |
Alexey Frunze <Alexey.Frunze@imgtec.com> |
MIPS: Initial version of optimizing compiler for MIPS64R6. Bug: 21555893 Change-Id: I874dc356eee6ab061a32f8f3df5f8ac3a4ab7dcf Signed-off-by: Alexey Frunze <Alexey.Frunze@imgtec.com> Signed-off-by: Douglas Leung <douglas.leung@imgtec.com>
|
dd3c7d2d6124ceb346b4ed9aa7115f75fc6d3f9f |
|
18-Jun-2015 |
David Brazdil <dbrazdil@google.com> |
ART: Remove old DCHECK that trips Baseline Codegen verified that the entry block always falls through to the next block. While this is the case with Optimizing, it doesn't hold for Baseline but it doesn't need to since codegen handles it fine. Bug:21913514 Change-Id: I751ef227e6cf103af3e7fc35fca4b01c663385a1 (cherry picked from commit 015c7e63604c038e866d7af3850c557403cddc8b)
|
015c7e63604c038e866d7af3850c557403cddc8b |
|
18-Jun-2015 |
David Brazdil <dbrazdil@google.com> |
ART: Remove old DCHECK that trips Baseline Codegen verified that the entry block always falls through to the next block. While this is the case with Optimizing, it doesn't hold for Baseline but it doesn't need to since codegen handles it fine. Bug:21913514 Change-Id: I751ef227e6cf103af3e7fc35fca4b01c663385a1
|
bd8c725e465cc7f44062745a6f2b73248f5159ed |
|
12-Jun-2015 |
Vladimir Marko <vmarko@google.com> |
Optimizing: Remove PcInfo, use the StackMapStream instead. Change-Id: I474f3a89f6c7ee5c7accd21791b1c1e311104158
|
94015b939060f5041d408d48717f22443e55b6ad |
|
04-Jun-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Revert "Revert "Use HCurrentMethod in HInvokeStaticOrDirect."" Fix was to special case baseline for x86, which does not have enough registers to allocate the current method. This reverts commit c345f141f11faad177aa9635a78088d00cf66086. Change-Id: I5997aa52f8d4df373ae5ff4d4150dac0c44c4c10
|
c345f141f11faad177aa9635a78088d00cf66086 |
|
04-Jun-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Revert "Use HCurrentMethod in HInvokeStaticOrDirect." Fails on baseline/x86. This reverts commit 38207af82afb6f99c687f64b15601ed20d82220a. Change-Id: Ib71018367eb7c6046965494a7e996c22af3de403
|
38207af82afb6f99c687f64b15601ed20d82220a |
|
01-Jun-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Use HCurrentMethod in HInvokeStaticOrDirect. Change-Id: I0d15244b6b44c8b10079398c55da5071a3e3af66
|
4e40c2691d42608f871b48b102155c80cf8b27e3 |
|
03-Jun-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Fix lint error. Change-Id: Ie485d52dc8c6670ab717f14081200572dab0357f
|
fd88f16100cceafbfde1b4f095f17e89444d6fa8 |
|
03-Jun-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Factorize code for common LocationSummary of HInvoke. This is one step forward, we could factorize more, but I wanted to get this out of the way first. Change-Id: I6ae411a737eebaecb64974f47af507ce0cfbae85
|
3d21bdf8894e780d349c481e5c9e29fe1556051c |
|
22-Apr-2015 |
Mathieu Chartier <mathieuc@google.com> |
Move mirror::ArtMethod to native Optimizing + quick tests are passing, devices boot. TODO: Test and fix bugs in mips64. Saves 16 bytes per most ArtMethod, 7.5MB reduction in system PSS. Some of the savings are from removal of virtual methods and direct methods object arrays. Bug: 19264997 (cherry picked from commit e401d146407d61eeb99f8d6176b2ac13c4df1e33) Change-Id: I622469a0cfa0e7082a2119f3d6a9491eb61e3f3d Fix some ArtMethod related bugs Added root visiting for runtime methods, not currently required since the GcRoots in these methods are null. Added missing GetInterfaceMethodIfProxy in GetMethodLine, fixes --trace run-tests 005, 044. Fixed optimizing compiler bug where we used a normal stack location instead of double on ARM64, this fixes the debuggable tests. TODO: Fix JDWP tests. Bug: 19264997 Change-Id: I7c55f69c61d1b45351fd0dc7185ffe5efad82bd3 ART: Fix casts for 64-bit pointers on 32-bit compiler. Bug: 19264997 Change-Id: Ief45cdd4bae5a43fc8bfdfa7cf744e2c57529457 Fix JDWP tests after ArtMethod change Fixes Throwable::GetStackDepth for exception event detection after internal stack trace representation change. Adds missing ArtMethod::GetInterfaceMethodIfProxy call in case of proxy method. Bug: 19264997 Change-Id: I363e293796848c3ec491c963813f62d868da44d2 Fix accidental IMT and root marking regression Was always using the conflict trampoline. Also included fix for regression in GC time caused by extra roots. Most of the regression was IMT. Fixed bug in DumpGcPerformanceInfo where we would get SIGABRT due to detached thread. EvaluateAndApplyChanges: From ~2500 -> ~1980 GC time: 8.2s -> 7.2s due to 1s less of MarkConcurrentRoots Bug: 19264997 Change-Id: I4333e80a8268c2ed1284f87f25b9f113d4f2c7e0 Fix bogus image test assert Previously we were comparing the size of the non moving space to size of the image file. Now we properly compare the size of the image space against the size of the image file. 
Bug: 19264997 Change-Id: I7359f1f73ae3df60c5147245935a24431c04808a [MIPS64] Fix art_quick_invoke_stub argument offsets. ArtMethod reference's size got bigger, so we need to move other args and leave enough space for ArtMethod* and 'this' pointer. This fixes mips64 boot. Bug: 19264997 Change-Id: I47198d5f39a4caab30b3b77479d5eedaad5006ab
|
e3b034a6f6f0d80d519ab08bdd18be4de2a4a2db |
|
31-May-2015 |
Mathieu Chartier <mathieuc@google.com> |
Fix some ArtMethod related bugs Added root visiting for runtime methods, not currently required since the GcRoots in these methods are null. Added missing GetInterfaceMethodIfProxy in GetMethodLine, fixes --trace run-tests 005, 044. Fixed optimizing compiler bug where we used a normal stack location instead of double on ARM64, this fixes the debuggable tests. TODO: Fix JDWP tests. Bug: 19264997 Change-Id: I7c55f69c61d1b45351fd0dc7185ffe5efad82bd3
|
e401d146407d61eeb99f8d6176b2ac13c4df1e33 |
|
22-Apr-2015 |
Mathieu Chartier <mathieuc@google.com> |
Move mirror::ArtMethod to native Optimizing + quick tests are passing, devices boot. TODO: Test and fix bugs in mips64. Saves 16 bytes per most ArtMethod, 7.5MB reduction in system PSS. Some of the savings are from removal of virtual methods and direct methods object arrays. Bug: 19264997 Change-Id: I622469a0cfa0e7082a2119f3d6a9491eb61e3f3d
|
b176d7c6c8c01a50317f837a78de5da57ee84fb2 |
|
20-May-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Also encode the InvokeType in an InlineInfo. This will be needed to recover the call stack. Change-Id: I2fe10785eb1167939c8cce1862b2d7f4066e16ec
|
b1d0f3f7e92fdcc92fe2d4c48cbb1262c005583f |
|
14-May-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Support InlineInfo in StackMap. Change-Id: I9956091775cedc609fdae7dec1433fcb8858a477
|
0a23d74dc2751440822960eab218be4cb8843647 |
|
07-May-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Add a parent environment to HEnvironment. This code has no functionality change. It adds a placeholder for chaining inlined frames. Change-Id: I5ec57335af76ee406052345b947aad98a6a4423a
|
3e3d73349a2de81d14e2279f60ffbd9ab3f3ac28 |
|
28-Apr-2015 |
Roland Levillain <rpl@google.com> |
Have HInvoke instructions know their number of actual arguments. Add an art::HInvoke::GetNumberOfArguments routine so that art::HInvoke and its subclasses can return the number of actual arguments of the called method. Use it in code generators and intrinsics handlers. Consequently, no longer remove a clinit check as last input of a static invoke if it is still present during baseline code generation, but ensure that static invokes have no such check as last input in optimized compilations. Change-Id: Iaf9e07d1057a3b15b83d9638538c02b70211e476
|
4f46ac5179967dda5966f2dcecf2cf08977951ef |
|
23-Apr-2015 |
Calin Juravle <calin@google.com> |
Cleanup and improve stack map stream - transform AddStackMapEntry into BeginStackMapEntry/EndStackMapEntry. This allows for nicer code and less assumptions when searching for equal dex register maps. - store the components sizes and their start positions as fields to avoid re-computation. - store the current stack map entry as a field to avoid the copy semantic when updating its value in the stack maps array. - remove redundant methods and fix visibility for the remaining ones. Change-Id: Ica2d2969d7e15993bdbf8bc41d9df083cddafd24
|
641547a5f18ca2ea54469cceadcfef64f132e5e0 |
|
21-Apr-2015 |
Calin Juravle <calin@google.com> |
[optimizing] Fix a bug in moving the null check to the user. When taking the decision to move a null check to the user we did not verify if the next instruction checks the same object. Change-Id: I2f4533a4bb18aa4b0b6d5e419f37dcccd60354d2
|
88c13cddc3a4184908662b0f3de796565d348c76 |
|
14-Apr-2015 |
Alexandre Rames <alexandre.rames@arm.com> |
Opt compiler: Correctly require register or FPU register. Also add a check that location summary are correctly typed with the HInstruction. Change-Id: I699762ff4e8f4e321c7db01ea005236ea1934af9
|
9021825d1e73998b99c81e89c73796f6f2845471 |
|
15-Apr-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Type MoveOperands. The ParallelMoveResolver implementation needs to know if a move is for 64bits or not, to handle swaps correctly. Bug found, and test case courtesy of Serguei I. Katkov. Change-Id: I9a0917a1cfed398c07e57ad6251aea8c9b0b8506
|
c6b4dd8980350aaf250f0185f73e9c42ec17cd57 |
|
07-Apr-2015 |
David Srbecky <dsrbecky@google.com> |
Implement CFI for Optimizing. CFI is necessary for stack unwinding in gdb, lldb, and libunwind. Change-Id: I1a3480e3a4a99f48bf7e6e63c4e83a80cfee40a2
|
65b798ea10dd716c1bb3dda029f9bf255435af72 |
|
06-Apr-2015 |
Andreas Gampe <agampe@google.com> |
ART: Enable more Clang warnings Change-Id: Ie6aba02f4223b1de02530e1515c63505f37e184c
|
fb8d279bc011b31d0765dc7ca59afea324fd0d0c |
|
01-Apr-2015 |
Mark Mendell <mark.p.mendell@intel.com> |
[optimizing] Implement x86/x86_64 math intrinsics Implement floor/ceil/round/RoundFloat on x86 and x86_64. Implement RoundDouble on x86_64. Add support for roundss and roundsd on both architectures. Support them in the disassembler as well. Add the instruction set features for x86, as the 'round' instruction is only supported if SSE4.1 is supported. Fix the tests to handle the addition of passing the instruction set features to x86 and x86_64. Add assembler tests for roundsd and roundss to x86_64 assembler tests. Change-Id: I9742d5930befb0bbc23f3d6c83ce0183ed9fe04f Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>
|
46e2a3915aa68c77426b71e95b9f3658250646b7 |
|
16-Mar-2015 |
David Brazdil <dbrazdil@google.com> |
ART: Boolean simplifier The optimization recognizes the negation pattern generated by 'javac' and replaces it with a single condition. To this end, boolean values are now consistently assumed to be represented by an integer. This is a first optimization which deletes blocks from the HGraph and does so by replacing the corresponding entries with null. Hence, existing code can continue indexing the list of blocks with the block ID, but must check for null when iterating over the list. Change-Id: I7779da69cfa925c6521938ad0bcc11bc52335583
|
da4d79bc9a4aeb9da7c6259ce4c9c1c3bf545eb8 |
|
24-Mar-2015 |
Roland Levillain <rpl@google.com> |
Unify ART's various implementations of bit_cast. ART had several implementations of art::bit_cast: 1. one in runtime/base/casts.h, declared as: template <class Dest, class Source> inline Dest bit_cast(const Source& source); 2. another one in runtime/utils.h, declared as: template<typename U, typename V> static inline V bit_cast(U in); 3. and a third local version, in runtime/memory_region.h, similar to the previous one: template<typename Source, typename Destination> static Destination MemoryRegion::local_bit_cast(Source in); This CL removes versions 2. and 3. and changes their callers to use 1. instead. That version was chosen over the others as: - it was the oldest one in the code base; and - its syntax was closer to the standard C++ cast operators, as it supports the following use: bit_cast<Destination>(source) since `Source' can be deduced from `source'. Change-Id: I7334fd5d55bf0b8a0c52cb33cfbae6894ff83633
|
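A minimal sketch of the call shape this change settles on, `bit_cast<Destination>(source)`, where only the destination type is spelled out and `Source` is deduced. The `sketch` namespace, the `memcpy`-based body, and the helper `FloatToRawBits` are assumptions made for illustration; this is not the code in runtime/base/casts.h.

```cpp
#include <cstdint>
#include <cstring>
#include <type_traits>

namespace sketch {

// Same declaration shape as version 1 in the commit message:
//   template <class Dest, class Source> inline Dest bit_cast(const Source& source);
// The body below is an assumed implementation, not ART's.
template <class Dest, class Source>
inline Dest bit_cast(const Source& source) {
  static_assert(sizeof(Dest) == sizeof(Source), "sizes must match");
  static_assert(std::is_trivially_copyable<Dest>::value, "Dest must be trivially copyable");
  static_assert(std::is_trivially_copyable<Source>::value, "Source must be trivially copyable");
  Dest dest;
  // Copy the object representation byte for byte; no value conversion happens.
  std::memcpy(&dest, &source, sizeof(dest));
  return dest;
}

// Hypothetical helper: Source (float) is deduced from the argument.
inline uint32_t FloatToRawBits(float f) {
  return bit_cast<uint32_t>(f);
}

}  // namespace sketch
```

Putting the destination type first is what allows the call to read like `static_cast<T>(x)`; with the `template<typename U, typename V>` ordering of version 2, the source type would have to be written out before the destination type could be named.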
eeefa1276e83776f08704a3db4237423b0627e20 |
|
13-Mar-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Update locations of registers after slow paths spilling. Change-Id: Id9aafcc13c1a085c17ce65d704c67b73f9de695d
|
fead4e4f397455aa31905b2982d4d861126ab89d |
|
13-Mar-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
[optimizing] Don't record None locations in the stack maps. - moved environment recording from code generator to stack map stream - added creation/loading factory methods for the DexRegisterMap (hides internal details) - added new tests Change-Id: Ic8b6d044f0d8255c6759c19a41df332ef37876fe
|
a8ac9130b872c080299afacf5dcaab513d13ea87 |
|
13-Mar-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Refactor code in preparation of correct stack maps in slow path. Move the logic of saving/restoring live registers in slow path in the SlowPathCode method. Also add a RecordPcInfo helper to SlowPathCode, that will act as the placeholder of saving correct stack maps. Change-Id: I25c2bc7a642ef854bbc8a3eb570e5c8c8d2d030c
|
a4d120c88e79eece333e66eec64c4e909d770e3e |
|
13-Mar-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Fix build breakage. Change-Id: I86959eca5d8f5458ff75c78776b0af9db9c26800
|
915b9d0c13bb5091875d868fbfa551d7b65d7477 |
|
11-Mar-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Tweak liveness when instructions are used in environments. Instructions remain live when debuggable, but only instructions with object types remain live when non-debuggable. Enable StackVisitor::GetThisObject for optimizing. Change-Id: Id87b2cbf33a02450059acc9993995782e5f28987
|
a2d8ec6876325e89e5d82f5dbeca59f96ced3ec1 |
|
12-Mar-2015 |
Roland Levillain <rpl@google.com> |
Compress the Dex register maps built by the optimizing compiler. - Replace the current list-based (fixed-size) Dex register encoding in stack maps emitted by the optimizing compiler with another list-based variable-size Dex register encoding compressing short locations on 1 byte (3 bits for the location kind, 5 bits for the value); other (large) values remain encoded on 5 bytes. - In addition, use slot offsets instead of byte offsets to encode the location of Dex registers placed in stack slots at small offsets, as it enables more values to use the short (1-byte wide) encoding instead of the large (5-byte wide) one. - Rename art::DexRegisterMap::LocationKind as art::DexRegisterLocation::Kind, turn it into a strongly-typed enum based on a uint8_t, and extend it to support new kinds (kInStackLargeOffset and kConstantLargeValue). - Move art::DexRegisterEntry from compiler/optimizing/stack_map_stream.h to runtime/stack_map.h and rename it as art::DexRegisterLocation. - Adjust art::StackMapStream, art::CodeGenerator::RecordPcInfo, art::CheckReferenceMapVisitor::CheckOptimizedMethod, art::StackVisitor::GetVRegFromOptimizedCode, and art::StackVisitor::SetVRegFromOptimizedCode. - Implement unaligned memory accesses in art::MemoryRegion. - Use them to manipulate data in Dex register maps. - Adjust oatdump to support the new Dex register encoding. - Update compiler/optimizing/stack_map_test.cc. Change-Id: Icefaa2e2b36b3c80bb1b882fe7ea2f77ba85c505
|
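The short/large split described in the entry above can be sketched as follows: a location whose value fits in 5 bits is packed with a 3-bit kind into one byte, anything else takes 5 bytes, and stack locations are stored as slot indices so that more of them fit the short form. Only `kInStackLargeOffset` is named in the commit message. The `kInStack` name, the numeric kind values, the placement of the kind in the high bits, the 4-byte slot size, and the little-endian payload are all assumptions made for this illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

namespace sketch {

// Assumed kinds and values; only kInStackLargeOffset appears in the commit message.
enum class Kind : uint8_t {
  kInStack = 1,
  kInStackLargeOffset = 5,
};

constexpr unsigned kValueBits = 5;                         // assumed: low 5 bits hold the value
constexpr int32_t kMaxShortValue = (1 << kValueBits) - 1;  // 31
constexpr int32_t kSlotSize = 4;                           // assumed stack slot width in bytes

// Appends the encoding of a stack location to `out` and returns the number
// of bytes written: 1 for the short form, 5 for the large form.
inline size_t EncodeStackLocation(int32_t byte_offset, std::vector<uint8_t>* out) {
  const int32_t slot = byte_offset / kSlotSize;
  if (byte_offset >= 0 && byte_offset % kSlotSize == 0 && slot <= kMaxShortValue) {
    // Short form: 3-bit kind and 5-bit slot index share a single byte.
    const unsigned kind_bits = static_cast<unsigned>(Kind::kInStack) << kValueBits;
    out->push_back(static_cast<uint8_t>(kind_bits | static_cast<unsigned>(slot)));
    return 1;
  }
  // Large form: one kind byte followed by the 32-bit byte offset.
  out->push_back(static_cast<uint8_t>(Kind::kInStackLargeOffset));
  const uint32_t raw = static_cast<uint32_t>(byte_offset);
  for (int i = 0; i < 4; ++i) {
    out->push_back(static_cast<uint8_t>((raw >> (8 * i)) & 0xffu));
  }
  return 5;
}

}  // namespace sketch
```

Under these assumptions a byte offset of 64 becomes slot 16 and fits the 1-byte form, whereas the raw offset 64 would exceed the 5-bit range and need the 5-byte form.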
5f8741860d465410bfed495dbb5f794590d338da |
|
04-Mar-2015 |
Mark Mendell <mark.p.mendell@intel.com> |
[optimizing] Use callee-save registers for x86 Add ESI, EDI, EBP to available registers for non-baseline mode. Ensure that they aren't used when byte addressable registers are needed. Change-Id: Ie7130d4084c2ae9cfcd1e47c26eb3e5dcac1ebd6 Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>
|
579885a26d761f5ba9550f2a1cd7f0f598c2e1e3 |
|
22-Feb-2015 |
Serban Constantinescu <serban.constantinescu@arm.com> |
Opt Compiler: ARM64: Enable explicit memory barriers over acquire/release Implement remaining explicit memory barrier code paths and temporarily enable the use of explicit memory barriers for testing. This CL also enables the use of instruction set features in the ARM64 backend. kUseAcquireRelease has been replaced with PreferAcquireRelease(), which for now is statically set to false (prefer explicit memory barriers). Please note that we still prefer acquire-release for the ARM64 Optimizing Compiler, but we would like to exercise the explicit memory barrier code path too. Change-Id: I84e047ecd43b6fbefc5b82cf532e3f5c59076458 Signed-off-by: Serban Constantinescu <serban.constantinescu@arm.com>
|
d6138ef1ea13d07ae555542f8898b30d89e9ac9a |
|
18-Feb-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Ensure the graph is correctly typed. We used to be forgiving because of HIntConstant(0) also being used for null. We now create a special HNullConstant for such uses. Also, we need to run the dead phi elimination twice during ssa building to ensure the correctness. Change-Id: If479efa3680d3358800aebb1cca692fa2d94f6e5
|
442b46a087c389a91a0b51547ac9205058432364 |
|
18-Feb-2015 |
Roland Levillain <rpl@google.com> |
Display optimizing compiler's CodeInfo objects in oatdump. A few elements are not displayed yet (stack mask, inline info) though. Change-Id: I5e51a801c580169abc5d1ef43ad581aadc110754
|
dc23d8318db08cb42e20f1d16dbc416798951a8b |
|
16-Feb-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Avoid generating jmp +0. When a block branches to a non-following block, but blocks in-between do branch to it, we can avoid doing the branch. Change-Id: I9b343f662a4efc718cd4b58168f93162a24e1219
|
c0572a451944f78397619dec34a38c36c11e9d2a |
|
06-Feb-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Optimize leaf methods. Avoid suspend checks and stack changes when not needed. Change-Id: I0fdb31e8c631e99091b818874a558c9aa04b1628
|
829280cc90b7a84db42864589b4bafb4c94a79d9 |
|
28-Jan-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Finally implement Location::kNoOutputOverlap. The [i, i + 1) interval scheme we chose for representing lifetime positions is not optimal for doing this optimization. It however doesn't prevent recognizing a non-split interval during the TryAllocateFreeReg phase, and try to re-use its inputs' registers. Change-Id: I80a2823b0048d3310becfc5f5fb7b1230dfd8201
|
4c204bafbc8d596894f8cb8ec696f5be1c6f12d8 |
|
03-Feb-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Use a different block order when not compiling baseline. Use the linearized order instead, as it puts blocks logically next to each other in a better way. Also, it does not contain dead blocks. Change-Id: Ie65b56041a093c8155e6c1e06351cb36a4053505
|
4dee636d21d9ce54386cdfbb824e5eb2a9c1af0d |
|
23-Jan-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Support callee-save registers on ARM. Change-Id: I7c519b7a828c9891b1141a8e51e12d6a8bc84118
|
d97dc40d186aec46bfd318b6a2026a98241d7e9c |
|
22-Jan-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Support callee save floating point registers on x64. - Share the computation of core_spill_mask and fpu_spill_mask between backends. - Remove explicit stack overflow check support: we need to adjust them and since they are not tested, they will easily bitrot. Change-Id: I0b619b8de4e1bdb169ea1ae7c6ede8df0d65837a
|
988939683c26c0b1c8808fc206add6337319509a |
|
21-Jan-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Enable core callee-save on x64. Will work on other architectures and FP support in other CLs. Change-Id: I8cef0343eedc7202d206f5217fdf0349035f0e4d
|
0ada95d8de4b04b5f201b4b7e9c3c2fd2cc321ae |
|
04-Dec-2014 |
Jean Christophe Beyler <jean.christophe.beyler@intel.com> |
ART: Replace NULL to nullptr in the optimizing compiler Replace macro NULL to the nullptr variation for C++. Change-Id: Ib6e48dd4bb3c254343383011b67372622578ca76 Signed-off-by: Jean Christophe Beyler <jean.christophe.beyler@intel.com>
|
6c2dff8ff8e1440fa4d9e1b2ba2a44d036882801 |
|
21-Jan-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Revert "Revert "Fully support pairs in the register allocator."" This reverts commit c399fdc442db82dfda66e6c25518872ab0f1d24f. Change-Id: I19f8215c4b98f2f0827e04bf7806c3ca439794e5
|
77520bca97ec44e3758510cebd0f20e3bb4584ea |
|
12-Jan-2015 |
Calin Juravle <calin@google.com> |
Record implicit null checks at the actual invoke time. ImplicitNullChecks are recorded only for instructions directly (see NB below) preceded by NullChecks in the graph. This way we avoid recording redundant safepoints and minimize the code size increase. NB: ParallelMoves might be inserted by the register allocator between the NullChecks and their uses. These modify the environment and the correct action would be to reverse their modification. This will be addressed in a follow-up CL. Change-Id: Ie50006e5a4bd22932dcf11348f5a655d253cd898
|
c399fdc442db82dfda66e6c25518872ab0f1d24f |
|
21-Jan-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Revert "Fully support pairs in the register allocator." Libcore tests fail. This reverts commit 41aedbb684ccef76ff8373f39aba606ce4cb3194. Change-Id: I2572f120d4bbaeb7a4d4cbfd47ab00c9ea39ac6c
|
41aedbb684ccef76ff8373f39aba606ce4cb3194 |
|
14-Jan-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Fully support pairs in the register allocator. Enabled on ARM for longs and doubles. Change-Id: Id8792d08bd7ca9fb049c5db8a40ae694bafc2d8b
|
cd6dffedf1bd8e6dfb3fb0c933551f9a90f7de3f |
|
08-Jan-2015 |
Calin Juravle <calin@google.com> |
Add implicit null checks for the optimizing compiler - for backends: arm, arm64, x86, x86_64 - fixed parameter passing for CodeGenerator - 003-omnibus-opcodes test verifies that NullPointerExceptions work as expected Change-Id: I1b302acd353342504716c9169a80706cf3aba2c8
|
42d1f5f006c8bdbcbf855c53036cd50f9c69753e |
|
16-Jan-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Do not use register pair in a parallel move. The ParallelMoveResolver does not work with pairs. Instead, decompose the pair into two individual moves. Change-Id: Ie9d3f0b078cef8dc20640c98b20bb20cc4971a7f
|
f85a9ca9859ad843dc03d3a2b600afbaf2e9bbdd |
|
13-Jan-2015 |
Mark Mendell <mark.p.mendell@intel.com> |
[optimizing compiler] Compute live spill size The current stack frame calculation assumes that each live register to be saved/restored has the word size of the machine. This fails for X86, where a double in an XMM register takes up 8 bytes. Change the calculation to keep track of the number of core registers and number of fp registers to handle this distinction. This is slightly pessimal, as the registers may not be active at the same time, but the only way to handle this would be to allocate both classes of registers simultaneously, or remember all the active intervals, matching them up and compute the size of each safepoint interval. Change-Id: If7860aa319b625c214775347728cdf49a56946eb Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>
|
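The arithmetic behind the entry above, as a small sketch: counting core and floating-point registers separately lets each class contribute its own slot width, instead of charging every live register one machine word. The function names and the example sizes (4-byte core registers, 8-byte doubles in XMM registers on x86) are assumptions chosen for illustration, not the names used in the code generator.

```cpp
#include <cstddef>

namespace sketch {

// Old-style estimate: every live register is assumed to occupy one machine word.
inline size_t WordSizedEstimate(size_t live_core, size_t live_fp, size_t word_size) {
  return (live_core + live_fp) * word_size;
}

// Per-class estimate: the maximum number of live core and FP registers are
// tracked separately and each class is multiplied by its own spill size.
inline size_t SpillAreaSize(size_t max_live_core, size_t max_live_fp,
                            size_t core_spill_size, size_t fp_spill_size) {
  return max_live_core * core_spill_size + max_live_fp * fp_spill_size;
}

}  // namespace sketch
```

With 3 live core registers and 2 live doubles on x86, the word-sized estimate reserves 20 bytes while the per-class computation reserves 28, which is the shortfall the commit addresses. As the message notes, using the per-class maxima is slightly pessimistic because the two maxima need not occur at the same safepoint.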
12df9ebf72255544b0147c81b1dca6644a29764e |
|
09-Jan-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Move code around in OptimizingCompiler::Compile to reduce stack space. Also fix an (intentional) memory leak, by allocating the CodeGenerator on the heap instead of the arena: they construct an Assembler object that requires destruction. BUG:18787334 Change-Id: I8cf0667cb70ce5b14d4ac334bd4487a562635f1b
|
840e5461a85f8908f51e7f6cd562a9129ff0e7ce |
|
07-Jan-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Implement double and float support for arm in register allocator. The basic approach is: - An instruction that needs two registers gets two intervals. - When allocating the low part, we also allocate the high part. - When splitting a low (or high) interval, we also split the high (or low) equivalent. - Allocation follows the (S/D register) requirement that low registers are always even and the high equivalent is low + 1. Change-Id: I06a5148e05a2ffc7e7555d08e871ed007b4c2797
|
3416601a9e9be81bb7494864287fd3602d18ef13 |
|
19-Dec-2014 |
Calin Juravle <calin@google.com> |
Look at instruction set features when generating volatiles code Change-Id: Ia882405719fdd60b63e4102af7e085f7cbe0bb2a
|
e21dc3db191df04c100620965bee4617b3b24397 |
|
09-Dec-2014 |
Andreas Gampe <agampe@google.com> |
ART: Swap-space in the compiler Introduce a swap-space and corresponding allocator to transparently switch native allocations to memory backed by a file. Bug: 18596910 (cherry picked from commit 62746d8d9c4400e4764f162b22bfb1a32be287a9) Change-Id: I131448f3907115054a592af73db86d2b9257ea33
|
5b4b898ed8725242ee6b7229b94467c3ea3054c8 |
|
18-Dec-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Revert "Don't block quick callee saved registers for optimizing." X64 has one libcore test failing, and codegen_test on arm is failing. This reverts commit 6004796d6c630696127df2494dcd4f30d1367a34. Change-Id: I20e00431fa18e11ce4c0cb6fffa91977fa8e9b4f
|
6004796d6c630696127df2494dcd4f30d1367a34 |
|
15-Dec-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Don't block quick callee saved registers for optimizing. This change builds on: https://android-review.googlesource.com/#/c/118983/ - Also fix x86_64 assembler bug triggered by this change. - Fix (and improve) x86's backend byte register usage. - Fix a bug in baseline register allocator: a fixed out register must prevent inputs from allocating it. Change-Id: I4883862e29b4e4b6470f1823cf7eab7e7863d8ad
|
e53798a7e3267305f696bf658e418c92e63e0834 |
|
01-Dec-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Inlining support in optimizing. Currently only inlines simple things that don't require an environment, such as: - Returning a constant. - Returning a parameter. - Returning an arithmetic operation. Change-Id: Ie844950cb44f69e104774a3cf7a8dea66bc85661
|
d2ec87d84057174d4884ee16f652cbcfd31362e9 |
|
08-Dec-2014 |
Calin Juravle <calin@google.com> |
[optimizing compiler] Add REM_FLOAT and REM_DOUBLE - for arm, x86, x86_64 backends - reinstated fmod quick entry points for x86. This is a partial revert of bd3682eada753de52975ae2b4a712bd87dc139a6 which added inline assembly for floating point rem on x86. Note that Quick still uses the inline version. - fix rem tests for longs Change-Id: I73be19a9f2f2bcf3f718d9ca636e67bdd72b5440
|
624279f3c70f9904cbaf428078981b05d3b324c0 |
|
04-Dec-2014 |
Roland Levillain <rpl@google.com> |
Add support for float-to-long in the optimizing compiler. - Add support for the float-to-long Dex instruction in the optimizing compiler. - Add a Dex PC field to art::HTypeConversion to allow the x86 and ARM code generators to produce runtime calls. - Instruct art::CodeGenerator::RecordPcInfo not to record PC information for HTypeConversion instructions. - Add S0 to the list of ARM FPU parameter registers. - Have art::x86_64::X86_64Assembler::cvttss2si work with 64-bit operands. - Generate x86, x86-64 and ARM (but not ARM64) code for float to long HTypeConversion nodes. - Add related tests to test/422-type-conversion. Change-Id: I954214f0d537187883f83f7a83a1bb2dd8a21fd4
|
32f5b4d2c8c9b52e9522941c159577b21752d0fa |
|
25-Nov-2014 |
Serban Constantinescu <serban.constantinescu@arm.com> |
Vixl: Update the VIXL interface to VIXL 1.7 and enable VIXL debug. This patch updates the interface to VIXL 1.7 and enables the debug version of VIXL when ART is built in debug mode. Change-Id: I443fb941bec3cffefba7038f93bb972e6b7d8db5 Signed-off-by: Serban Constantinescu <serban.constantinescu@arm.com>
|
647b9ed41cdb7cf302fd356627a3ba372419b78c |
|
27-Nov-2014 |
Roland Levillain <rpl@google.com> |
Add support for long-to-double in the optimizing compiler. - Add support for the long-to-double Dex instruction in the optimizing compiler. - Enable requests of temporary FPU (double) registers during code generation. - Fix art::x86::X86Assembler::LoadLongConstant and extend it to int64_t values. - Have art::x86_64::X86_64Assembler::cvtsi2sd work with 64-bit operands. - Generate x86, x86-64 and ARM (but not ARM64) code for long to double HTypeConversion nodes. - Add related tests to test/422-type-conversion. Change-Id: Ie73d9e5e25bd2e15f585c371e8fc2dcb83438ccd
|
87d03761f35ad6cbe0bffbf1ec739875a471da6d |
|
19-Nov-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Fix safepoint bug when computing live registers. Change-Id: I8f28dd287c0e04223c49dea6a323058c1b210913
|
f97f9fbfdf7f2e23c662f21081fadee6af37809d |
|
11-Nov-2014 |
Calin Juravle <calin@google.com> |
[optimizing compiler] add HTemporary support for long and doubles Change-Id: I5247ecd71d0193050484b7632c804c9bfd20f924
|
f0e3937b87453234d0d7970b8712082062709b8d |
|
12-Nov-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Do a parallel move in BoundsCheckSlowPath. The two locations of the index and length could overlap, so we need a parallel move. Also factorize the code for doing a parallel move based on two locations. Change-Id: Iee8b3459e2eed6704d45e9a564fb2cd050741ea4
|
52839d17c06175e19ca4a093fb878450d1c4310d |
|
07-Nov-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Support invoke-interface in optimizing. Change-Id: Ic18d7c3d2810557231caf0571956e0c431f5d384
|
f43083d560565aea46c602adb86423daeefe589d |
|
07-Nov-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Do not update Out after it has a valid location. Slow paths use LocationSummary to know where to move things around, and they are executed at the end of the code generation. This fix is needed for https://android-review.googlesource.com/#/c/113345/. Change-Id: Id336c6409479b1de6dc839b736a7234d08a7774a
|
de58ab2c03ff8112b07ab827c8fa38f670dfc656 |
|
05-Nov-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Implement try/catch/throw in optimizing. - We currently don't run optimizations in the presence of a try/catch. - We therefore implement Quick's mapping table. - Also fix a missing null check on array-length. Change-Id: I6917dfcb868e75c1cf6eff32b7cbb60b6cfbd68f
|
19a19cffd197a28ae4c9c3e59eff6352fd392241 |
|
22-Oct-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Add support for static fields in optimizing compiler. Change-Id: Id2f010589e2bd6faf42c05bb33abf6816ebe9fa9
|
3c03503d66df3b4440f851ae7d0c4fae5e7872df |
|
28-Oct-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Follow-up CL after hard float changes. Addressing comments from Zheng Xu. Change-Id: I8c599cdfab03373e82a1b90b711005c490bc6ca0
|
1ba0f596e9e4ddd778ab431237d11baa85594eba |
|
27-Oct-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Support hard float on arm in optimizing compiler. Also bump oat version, needed after latest hard float switch. Change-Id: Idf5acfb36c07e74acff00edab998419a3c6b2965
|
5319defdf502fc4569316473846b83180ec08035 |
|
23-Oct-2014 |
Alexandre Rames <alexandre.rames@arm.com> |
ART: optimizing compiler: initial support for ARM64. The ARM64 port uses VIXL for code generation, to which it defers work like label binding and branch resolving, register type coherency checking, and immediate values handling. Change-Id: I0a44508c0c991f472a63e67b3469cdd878fe1a68 Signed-off-by: Serban Constantinescu <serban.constantinescu@arm.com> Signed-off-by: Alexandre Rames <alexandre.rames@arm.com>
|
102cbed1e52b7c5f09458b44903fe97bb3e14d5f |
|
15-Oct-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Implement register allocator for floating point registers. Also: - Fix misuses of emitting the rex prefix in the x86_64 assembler. - Fix movaps code generation in the x86_64 assembler. Change-Id: Ib6dcf6e7c4a9c43368cfc46b02ba50f69ae69cbe
|
92a73aef279be78e3c2b04db1713076183933436 |
|
16-Oct-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Don't use assembler classes in code_generator.h. The arm64 backend uses its own assembler and does not share the same classes as the other backends. To avoid conflicts or unnecessary mappings, just don't use those classes in the shared part of the code generator. Change-Id: I9e5fa40c1021d2e83a4ef14c52cd1ccd03f2f73d
|
71175b7f19a4f6cf9cc264feafd820dbafa371fb |
|
09-Oct-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Cleanup baseline register allocator. - Use three arrays for blocking registers instead of one and computing offsets in that array. - Don't pass blocked_registers_ to methods, just use the field. Change-Id: Ib698564c31127c59b5a64c80f4262394b8394dc6
|
56b9ee6fe1d6880c5fca0e7feb28b25a1ded2e2f |
|
09-Oct-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Stop converting from Location to ManagedRegister. Now the source of truth is the Location object that knows which register (core, pair, fpu) it needs to refer to. Change-Id: I62401343d7479ecfb24b5ed161ec7829cda5a0b1
|
7fb49da8ec62e8a10ed9419ade9f32c6b1174687 |
|
06-Oct-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Add support for floats and doubles. - Follows Quick conventions. - Currently only works with baseline register allocator. Change-Id: Ie4b8e298f4f5e1cd82364da83e4344d4fc3621a3
|
3c04974a90b0e03f4b509010bff49f0b2a3da57f |
|
24-Sep-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Optimize suspend checks in optimizing compiler. - Remove the ones added during graph build (they were added for the baseline code generator). - Emit them at loop back edges after phi moves, so that the test can directly jump to the loop header. - Fix x86 and x86_64 suspend check by using cmpw instead of cmpl. Change-Id: I6fad5795a55705d86c9e1cb85bf5d63dadfafa2a
|
3bca0df855f0e575c6ee020ed016999fc8f14122 |
|
19-Sep-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Support for saving and restoring live registers in a slow path. And use it in suspend check slow paths. Change-Id: I79caf28f334c145a36180c79a6e2fceae3990c31
|
8a16d97fb8f031822b206e65f9109a071da40563 |
|
11-Sep-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Fix valgrind errors. For now just stack allocate the code generator. Will think about cleaning up the root problem later (CodeGenerator being an arena object). Change-Id: I161a6f61c5f27ea88851b446f3c1e12ee9c594d7
|
3946844c34ad965515f677084b07d663d70ad1b8 |
|
02-Sep-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Runtime support for the new stack maps for the opt compiler. Now most of the methods supported by the compiler can be optimized, instead of using the baseline. Change-Id: I80ab36a34913fa4e7dd576c7bf55af63594dc1fa
|
e3ea83811d47152c00abea24a9b420651a33b496 |
|
08-Aug-2014 |
Yevgeny Rouban <yevgeny.y.rouban@intel.com> |
ART source line debug info in OAT files OAT files have source line information enough for ART runtime needs like jump to/from interpreter and thread suspension. But this information is not enough for finer grained source level debugging and low-level profiling (VTune or perf). This patch adds to OAT files two additional sections: .debug_line - DWARF formatted Elf32 section with detailed source line information (mapping from native PC to Java source lines). In addition to the debugging symbols added using the dex2oat option --include-debug-symbols, the source line information is added to the section .debug_line. The source line info can be read by many Elf reading tools like objdump, readelf, dwarfdump, gdb, perf, VTune, ... gdb can use this debug line information in x86. In 64-bit mode the information can be used if the oat file is mapped in the lower address space (address has higher 32 bits zeroed). Relocation works. Testing: 1. art/test/run-test --host --gdb [--64] 001-HelloWorld 2. in gdb: break Main.java:19 3. in gdb: break Runtime.java:111 4. in gdb: run - stops at void java.lang.Runtime.<init>() 5. in gdb: backtrace - shows call stack down to main() 6. in gdb: continue - stops at void Main.main() (only in 32-bit mode) 7. in gdb: backtrace - shows call stack down to main() 8. objdump -W <oat-file> - addresses are from VMA range of .text section reported by objdump -h <file> 9. dwarfdump -ka <oat-file> - no errors expected Size of aosp-x86-eng boot.oat increased by 11% from 80.5Mb to 89.2Mb with two sections added .debug_line (7.2Mb) and .rel.debug (1.5Mb). Change-Id: Ib8828832686e49782a63d5529008ff4814ed9cda Signed-off-by: Yevgeny Rouban <yevgeny.y.rouban@intel.com>
|
73e80c3ae76fafdb53afe3a85306dcb491fb5b00 |
|
22-Jul-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Make unit test tell if a method is a leaf. The runtime is not initialized completely in gtests, so we cannot run code (such as explicit stack overflow checks) that look at tls values. Change-Id: I74a4449b01eb203f1b411dda700e9459878d0d55
|
f12feb8e0e857f2832545b3f28d31bad5a9d3903 |
|
17-Jul-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Stack overflow checks and NPE checks for optimizing. Change-Id: I59e97448bf29778769b79b51ee4ea43f43493d96
|
ab032bc1ff57831106fdac6a91a136293609401f |
|
15-Jul-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Fix a braino in the stack layout. Also do some refactoring to have this code be just in CodeGenerator. Change-Id: I88de109889138af8d60027973c12a64bee813cb7
|
e50383288a75244255d3ecedcc79ffe9caf774cb |
|
04-Jul-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Support fields in optimizing compiler. - Required support for temporaries, to be only used by baseline compiler. - Also fixed a few invalid assumptions around locations and instructions that don't need materialization. These instructions should not have an Out. Change-Id: Idc4a30dd95dd18015137300d36bec55fc024cf62
|
86dbb9a12119273039ce272b41c809fa548b37b6 |
|
04-Jun-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Final CL to enable register allocation on x86. This CL implements: 1) Resolution after allocation: connecting the locations allocated to an interval within a block and between blocks. 2) Handling of fixed registers: some instructions require inputs/output to be at a specific location, and the allocator needs to deal with them in a special way. 3) ParallelMoveResolver::EmitNativeCode for x86. Change-Id: I0da6bd7eb66877987148b87c3be6a983b4e3f858
|
9cf35523764d829ae0470dae2d5dd99be469c841 |
|
09-Jun-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Add x86_64 support to the optimizing compiler. Change-Id: I4462d9ae15be56c4a3dc1bd4d1c0c6548c1b94be
|
f635e63318447ca04731b265a86a573c9ed1737c |
|
14-May-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Add a compilation tracing mechanism to the new compiler. Code mostly imported from: https://android-review.googlesource.com/#/c/81653/. Change-Id: I150fe942be0fb270e03fabb19032180f7a065d13
|
804d09372cc3d80d537da1489da4a45e0e19aa5d |
|
02-May-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Build live-in, live-out and kill sets for each block. This information will be used when computing live ranges of instructions. Change-Id: I345ee833c1ccb4a8e725c7976453f6d58d350d74
|
f529d776ca9f48b115714f6c79677755ecc37d24 |
|
02-May-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Make all registers available when allocating an output register. On ARM we currently only have two register pairs available, so we need to use one already used for an input. Change-Id: I5411862310009a41e50ddab3549d3a9e9052266a
|
a7aca370a7d62ca04a1e24423d90e8020d6f1a58 |
|
28-Apr-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Setup policies for register allocation. Change-Id: I857e77530fca3e2fb872fc142a916af1b48400dc
|
c32e770f21540e4e9eda6dc7f770e745d33f1b9f |
|
24-Apr-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Add a Transform to SSA phase to the optimizing compiler. Change-Id: Ia9700756a0396d797a00b529896487d52c989329
|
b55f835d66a61e5da6fc1895ba5a0482868c9552 |
|
07-Apr-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Test control flow instruction with optimizing compiler. Add support for basic instructions to implement these tests. Change-Id: I3870bf9301599043b3511522bb49dc6364c9b4c0
|
f583e5976e1de9aa206fb8de4f91000180685066 |
|
07-Apr-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Add support for taking parameters in optimizing compiler. - Fix stack layout to mimic Quick's. - Implement some sub operations. Change-Id: I8cf75a4d29b662381a64f02c0bc61d859482fc4e
|
707c809f661554713edfacf338365adca8dfd3a3 |
|
04-Apr-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Use target-specific word instead of runtime word. Change-Id: Ia11dc3cc520a1a5c7bd017013e5699af9570ce91
|
4a34a428c6a2588e0857ef6baf88f1b73ce65958 |
|
03-Apr-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Support passing arguments to invoke-static* instructions. - Stop using the frame pointer for accessing locals. - Stop emulating a stack when doing code generation. Instead, rely on dex register model, where instructions only reference registers. Change-Id: Id51bd7d33ac430cb87a53c9f4b0c864eeb1006f9
|
6a58cb16d803c9a7b3a75ccac8be19dd9d4e520d |
|
02-Apr-2014 |
Dmitry Petrochenko <dmitry.petrochenko@intel.com> |
art: Handle x86_64 architecture equal to x86 This patch forces FE/ME to treat x86_64 exactly as x86. The x86_64 logic will be revised later, once the assembly is ready. Change-Id: I4a92477a6eeaa9a11fd710d35c602d8d6f88cbb6 Signed-off-by: Dmitry Petrochenko <dmitry.petrochenko@intel.com>
|
8ccc3f5d06fd217cdaabd37e743adab2031d3720 |
|
19-Mar-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Add support for invoke-static in optimizing compiler. Support is limited to calls without parameters and returning void. For simplicity, we currently follow the Quick ABI. Change-Id: I54805161141b7eac5959f1cae0dc138dd0b2e8a5
|
92cf83e001357329cbf41fa15a6e053fab6f4933 |
|
18-Mar-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Run Java tests with the optimizing compiler. Also fix a vector.reserve -> vector.resize braino, and build a GC map that dex2oat expects. Change-Id: I6acf2f90a4c32f90b79bf7709bf2e43931b98757
|
787c3076635cf117eb646c5a89a9014b2072fb44 |
|
17-Mar-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Plug new optimizing compiler in compilation pipeline. Also rename accessors to ART's conventions. Change-Id: I344807055b98aa4b27215704ec362191464acecc
|
bab4ed7057799a4fadc6283108ab56f389d117d4 |
|
11-Mar-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
More code generation for the optimizing compiler. - Add HReturn instruction - Generate code for locals/if/return - Setup infrastructure for register allocation. Currently emulate a stack. Change-Id: Ib28c2dba80f6c526177ed9a7b09c0689ac8122fb
|
d4dd255db1d110ceb5551f6d95ff31fb57420994 |
|
28-Feb-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Add codegen support to the optimizing compiler. Change-Id: I9aae76908ff1d6e64fb71a6718fc1426b67a5c28
|