edddaa22dce7e0dee2a74be5462e5d43815b790c |
|
26-Jun-2017 |
Nicolas Geoffray <ngeoffray@google.com> |
Fix braino when handling branches fallthrough in arm backend. bug: 62210114 Test: 657-branches (cherry picked from commit 6fda42718a348cfb758d8714e223cab7e855765b) Change-Id: I0e4f0577bcb1f3960459fe5d35473af191fc6534 (cherry picked from commit 860626e1a3e1314b7d2e828fd61fb25eebc5f081)
|
79efadfdd861584f1c47654ade975eae6c43c360 |
|
08-May-2017 |
Nicolas Geoffray <ngeoffray@google.com> |
Add runtime reasons for deopt. Currently to help investigate. Also: 1) Log when deoptimization happens (which method and what reason) 2) Trace when deoptimization happens (to make it visible in systrace) bug:37655083 Test: test-art-host test-art-target (cherry picked from commit 4e92c3ce7ef354620a785553bbada554fca83a67) Change-Id: I992398a1038ab61ea0e5106af6b6ad0a3305312e
|
d9911eeca13f609c885e0f6a5ce81af9b6340bfa |
|
27-Mar-2017 |
Andreas Gampe <agampe@google.com> |
ART: Clean up field initialization Add explicit field initialization to default value where necessary. Also clean up interpreter intrinsics header. Test: m Change-Id: I7a850ac30dcccfb523a5569fb8400b9ac892c8e5
|
1e7bb5a3fd3591121e00c9ebf71e628e450af72b |
|
17-Mar-2017 |
Anton Kirilov <anton.kirilov@linaro.org> |
ARM: Improve the code generated for HInstanceOf Test: m test-art-target-run-test-009-instanceof Test: m test-art-target-run-test-422-instanceof Test: m test-art-target-run-test-494-checker-instanceof-tests Test: m test-art-target-run-test-500-instanceof Test: m test-art-target-run-test-530-instanceof-checkcast Test: m test-art-target-run-test-603-checker-instanceof Change-Id: Ia5e1421403605659d0f53bc794acb5e5b0af0c5e
|
217b2ce15674cb3cf2373110711d74aefb6c91e4 |
|
16-Mar-2017 |
Anton Kirilov <anton.kirilov@linaro.org> |
ARM: Reduce the number of branches generated for HCondition and HSelect Test: m test-art-target-run-test-570-checker-select Change-Id: I87d2e87eb2fd30355101df07eb3754b013cedf63
|
6f64420701c1e1af3bb51bce6a9b608b5bb79fb5 |
|
27-Feb-2017 |
Anton Kirilov <anton.kirilov@linaro.org> |
ARM: Avoid branches to branches Generally speaking, this optimization applies to all code generation visitors ending with a call to Bind(), which includes intrinsics with kNoCall CallKind. However, no changes are done for slow paths (which frequently end with a branch to an exit label that is bound at the end of a visitor). Test: m test-art-target Change-Id: Ie1a0c8c54ef76b01e7f0b23962c56c29ca8984a9
|
effd5bfa62fa3065a8386b192bf60d41c320f6e4 |
|
28-Feb-2017 |
Anton Kirilov <anton.kirilov@linaro.org> |
ARM: Generate UBFX for HAnd Test: m test-art-target-run-test-538-checker-embed-constants Change-Id: I8e6af76b99543331e8ffec01bd8df3f09890708e
|
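As a rough illustration of the "Generate UBFX for HAnd" entry above: an AND with a constant of the form 2^n - 1 keeps only the low n bits, which is what an unsigned bit-field extract (UBFX) with lsb 0 and width n computes. The sketch below models this in Python. The helper names and the restriction to low-bit masks are assumptions made for illustration, not details taken from the commit.

```python
from typing import Optional

def low_bit_mask_width(mask: int) -> Optional[int]:
    """Return n if mask == 2**n - 1 for a 32-bit value, otherwise None.

    Hypothetical helper: models the check a code generator might perform
    before replacing an AND with a bit-field extract.
    """
    if mask <= 0 or mask > 0xFFFFFFFF:
        return None
    # A mask of contiguous low bits has no bit in common with mask + 1.
    if mask & (mask + 1) != 0:
        return None
    return mask.bit_length()

def ubfx(value: int, lsb: int, width: int) -> int:
    """Model of an unsigned bit-field extract on a 32-bit value."""
    return ((value & 0xFFFFFFFF) >> lsb) & ((1 << width) - 1)

def and_with_constant(value: int, mask: int) -> int:
    """Use the UBFX model when the mask allows it, else a plain AND."""
    width = low_bit_mask_width(mask)
    if width is not None:
        return ubfx(value, 0, width)
    return value & mask
```

Both paths return the same value; the point of the transformation is only which instruction would be emitted.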
2dd053d61c3971fa5b5e179e0a2b5368409c9ba3 |
|
08-Mar-2017 |
Artem Serov <artem.serov@linaro.org> |
ARM: VIXL32: Improve BoundsCheck for constant inputs. Test: mma test-art-host && mma test-art-target Change-Id: I05051c03dbd3684c674096def84020494d28364b
|
426b49c45d8088ff3114d3fbcec26db4e00c9324 |
|
08-Nov-2016 |
Donghui Bai <donghui.bai@linaro.org> |
ARM(64): Improve the code generated for HSelect Test: m test-art-target-run-test-566-checker-codegen-select Test: m test-art-target-run-test-570-checker-select Change-Id: If0140892303490701782df9a818e6d8346bf3d6c Signed-off-by: Anton Kirilov <anton.kirilov@linaro.org>
|
c52f3034b06c03632e937aff07d46c2bdcadfef5 |
|
02-Mar-2017 |
Richard Uhler <ruhler@google.com> |
Remove --include-patch-information option from dex2oat. We no longer support running patchoat on non-PIC oat files, so the included patch information is unused. Bug: 33192586 Test: m test-art-host Change-Id: I9e100c4e47dc24d91cd74226c84025e961d30f67
|
54f869ed3c7910e6eb7bade924d41570e9a4cb14 |
|
06-Mar-2017 |
Roland Levillain <rpl@google.com> |
Revert "Revert "Use the holder's gray bit in Baker read barrier slow paths (ARM, ARM64)."" This reverts commit 47b3ab2fd83aaa530b7d2c62bfc024209b8b6923. In compiler-generated code, when deciding whether to mark a heap reference or not in a read barrier, after checking whether the GC is currently marking, also check (in the slow path) whether the lock word of the reference's holder is gray, before actually marking the reference. This change is only for ARM and ARM64, as it does not benefit x86 nor x86-64. Change-Id: Ia26b07f0485e23589bfc0e65f83852f2795688c0 Test: Run ART tests in Baker read barrier configuration. Test: Boot a device in Baker read barrier configuration. Bug: 35780827 Bug: 29516974
|
ba650a4d5a0a82c6c88d6546b6111013c2ee8072 |
|
06-Mar-2017 |
Roland Levillain <rpl@google.com> |
Revert "Revert "Use the "GC is marking" information in compiler read barriers (ARM, ARM64)."" This reverts commit 35345a555bd7928582a7ffa6369b374b3ddc379d. In compiler-generated code, when deciding whether to mark a heap reference or not in a read barrier, check whether the GC is currently marking, instead of checking the gray bit in the reference's holder's lock word. This change is only for ARM and ARM64, as it does not benefit x86 nor x86-64. Change-Id: Id3d2758c600115b2f07d345442cfa87edfc2792c Test: Run ART tests in Baker read barrier configuration. Test: Boot a device in Baker read barrier configuration. Bug: 35780827 Bug: 29516974
|
dbdba3776f2a5ba6b0400883d0c6fac474ffe2a3 |
|
27-Feb-2017 |
Roland Levillain <rpl@google.com> |
Revert "Use the "GC is marking" information in compiler read barriers (ARM, ARM64)." This reverts commit 1372c9f40df1e47bf775f1466bbb96f472b6b9ed. This change (along with https://android-review.googlesource.com/#/c/342429/) creates null pointer dereferences. Bug: 35780827 Bug: 29516974 Change-Id: I2a9c4d0ad8d2ab870c2e0ddbff32152933c77abe (cherry picked from commit 35345a555bd7928582a7ffa6369b374b3ddc379d)
|
5e3e319ec1295dddd62a0caad18e688bd4ee18b9 |
|
27-Feb-2017 |
Roland Levillain <rpl@google.com> |
Revert "Use the holder's gray bit in Baker read barrier slow paths (ARM, ARM64)." This reverts commit 27b1f9cbfc1409418eee4b0e22f29f033e10b64d. This change (along with https://android-review.googlesource.com/#/c/342428/) creates null pointer dereferences. Bug: 35780827 Bug: 29516974 Change-Id: If731960a405f9b89528f3daaf235da57cabc5c11 (cherry picked from commit 47b3ab2fd83aaa530b7d2c62bfc024209b8b6923)
|
35345a555bd7928582a7ffa6369b374b3ddc379d |
|
27-Feb-2017 |
Roland Levillain <rpl@google.com> |
Revert "Use the "GC is marking" information in compiler read barriers (ARM, ARM64)." This reverts commit 1372c9f40df1e47bf775f1466bbb96f472b6b9ed. This change (along with https://android-review.googlesource.com/#/c/342429/) creates null pointer dereferences. Bug: 35780827 Bug: 29516974 Change-Id: I2a9c4d0ad8d2ab870c2e0ddbff32152933c77abe
|
47b3ab2fd83aaa530b7d2c62bfc024209b8b6923 |
|
27-Feb-2017 |
Roland Levillain <rpl@google.com> |
Revert "Use the holder's gray bit in Baker read barrier slow paths (ARM, ARM64)." This reverts commit 27b1f9cbfc1409418eee4b0e22f29f033e10b64d. This change (along with https://android-review.googlesource.com/#/c/342428/) creates null pointer dereferences. Bug: 35780827 Bug: 29516974 Change-Id: If731960a405f9b89528f3daaf235da57cabc5c11
|
27b1f9cbfc1409418eee4b0e22f29f033e10b64d |
|
17-Jan-2017 |
Roland Levillain <rpl@google.com> |
Use the holder's gray bit in Baker read barrier slow paths (ARM, ARM64). In compiler-generated code, when deciding whether to mark a heap reference or not in a read barrier, after checking whether the GC is currently marking, also check (in the slow path) whether the lock word of the reference's holder is gray, before actually marking the reference. This change is only for ARM and ARM64, as it does not benefit x86 nor x86-64. Test: Run ART tests in Baker read barrier configuration. Test: Boot a device in Baker read barrier configuration. Bug: 29516974 Change-Id: I60595a8f4987747faeaa359ad873e9758c1ded75
|
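The decision order described in the two read barrier entries (13-Jan-2017 and 17-Jan-2017) can be sketched as follows: first test whether the GC is marking, then, in the slow path only, test whether the holder's lock word is gray, and mark the reference only if both hold. This is a simplified Python model. The function and parameter names are hypothetical, and real generated code works on registers and lock words rather than booleans.

```python
from typing import Callable

def load_reference(is_gc_marking: bool,
                   holder_is_gray: bool,
                   ref: object,
                   mark: Callable[[object], object]) -> object:
    """Model of the read barrier decision for a heap reference load."""
    # Fast path: when the GC is not marking, no barrier work is needed.
    if not is_gc_marking:
        return ref
    # Slow path: only mark when the holder's lock word is gray.
    if not holder_is_gray:
        return ref
    return mark(ref)
```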
1372c9f40df1e47bf775f1466bbb96f472b6b9ed |
|
13-Jan-2017 |
Roland Levillain <rpl@google.com> |
Use the "GC is marking" information in compiler read barriers (ARM, ARM64). In compiler-generated code, when deciding whether to mark a heap reference or not in a read barrier, check whether the GC is currently marking, instead of checking the gray bit in the reference's holder's lock word. This change is only for ARM and ARM64, as it does not benefit x86 nor x86-64. Test: Run ART tests in Baker read barrier configuration. Test: Boot a device in Baker read barrier configuration. Bug: 29516974 Change-Id: Ia5d90286bb9f753f3bbcb3a6254eb166523a2ff5
|
74234daabb28a4b9c804bf8bf908e7334bd4d400 |
|
13-Jan-2017 |
Anton Kirilov <anton.kirilov@linaro.org> |
ARM: Merge data-processing instructions and shifts/(un)signed extensions This commit mirrors the work that has already been done for ARM64. Test: m test-art-target-run-test-551-checker-shifter-operand Change-Id: Iec8c1563b035f40f0e18dcffde28d91dc21922f8
|
d966ce7739bebbdce5481900a1b3220b31f3f3ad |
|
09-Feb-2017 |
Roland Levillain <rpl@google.com> |
Use entrypoint switching on x86 & x86-64 for GC root read barriers. For consistency with the ARM and ARM64 implementations, check the read barrier marking entrypoint (`Thread::Current()->pReadBarrierMarkReg ## root.reg()`) instead of `Thread::Current()->GetIsGcMarking()` to decide whether to mark a GC root. This change should have no impact on the performance or the size of the generated code. Test: m test-art-host Bug: 32638713 Change-Id: Ifd71312992fdfd6067447cccb7d95860f3771b57
|
ea4c126a0165c5a4b997986e6e01c7f975642167 |
|
06-Feb-2017 |
Vladimir Marko <vmarko@google.com> |
Change type initialization entrypoints to kSaveEverything. Also avoid the unnecessary read barriers for boot image classes with kBssEntry or kJitTableAddress (the kBssEntry and JIT work missed the `read_barrier_option` flag), fix bit-rotten non-Baker read barriers on ARM and ARM64 and fix bit-rotten ARM64 relative patcher's IsAdrpPatch() used for erratum 843419 workaround. aosp_angler-userdebug with CC: before: arm boot*.oat: 35440420 arm64 boot*.oat: 43504952 after: arm boot*.oat: 35222292 (-218128, -0.62%) arm64 boot*.oat: 43389048 (-115904, -0.26%) aosp_angler-userdebug without CC: before: arm boot*.oat: 31927412 arm64 boot*.oat: 39340512 after: arm boot*.oat: 31708736 (-218676, -0.68%) arm64 boot*.oat: 39211768 (-128744, -0.33%) Test: m test-art-host (non-CC, Baker CC, table lookup CC) Test: m test-art-target on Nexus 6P (non-CC, Baker CC, table lookup CC) Test: Nexus 6P boots (non-CC, Baker CC, table lookup CC) Bug: 30627598 Change-Id: Ida5bbce414844de9e4273e40334165d4494230d4
|
83c8e27a292e6e002fb3b3def75cf6d8653378e8 |
|
31-Jan-2017 |
Nicolas Geoffray <ngeoffray@google.com> |
Code refactoring around sharpening HLoadClass. Even if the class is not accessible through the dex cache, we can access it by other means (eg boot class, jit table). So rewrite static field access instruction builder to not bail out if a class cannot be accessed through the dex cache. bug:34966607 test: test-art-host test-art-target Change-Id: I88e4e09951a002b480eb8f271726b56f981291bd
|
d09584456559f669f5999fb1ff32aa89ebf6ef4e |
|
30-Jan-2017 |
Nicolas Geoffray <ngeoffray@google.com> |
Align allocation entrypoints implementation between arm/arm64/x86/x64. x64: - Add art_quick_alloc_initialized_rosalloc x86: - Add art_quick_alloc_initialized_rosalloc - Add art_quick_alloc_initialized{_region}_tlab - Add art_quick_alloc_array_resolved{8,16,32,64}{_region}_tlab arm32: - Add art_quick_alloc_initialized_rosalloc - Add art_quick_alloc_initialized{_region}_tlab - Add art_quick_alloc_array_resolved{8,16,32,64}{_region}_tlab arm64: - Add art_quick_alloc_initialized_rosalloc - Add art_quick_alloc_initialized{_region}_tlab - Add art_quick_alloc_array_resolved{8,16,32,64}_tlab Test: test-art-target test-art-host bug: 30933338 Change-Id: I0dd8667a2921dd0b3403bea5d05304ba5d40627f
|
5e8d5f01b0fe87a6c649bd3a9f1534228b93423d |
|
18-Oct-2016 |
Roland Levillain <rpl@google.com> |
Fix some typos in ART. Test: m build-art-host Test: m cpplint-art Change-Id: Ifc6ce3d0d645c4a8dca72dd483fc03fc05077130
|
d8c052ac0aa3382c4807add33afa32580ffeecbb |
|
02-Nov-2016 |
TatWai Chong <tatwai.chong@linaro.org> |
ART: Reference.getReferent intrinsic for arm and arm64 Test: m test-art-host Test: m test-art-target Test: export ART_HEAP_POISONING=true; m test-art-host Test: export ART_HEAP_POISONING=true; m test-art-target Bug: 32535355 Change-Id: Ie63317689dd9e03a24e701c30411f8014970173a
|
e807ff725159dabab3a3028bbb76f83ebcfa40de |
|
23-Jan-2017 |
Nicolas Geoffray <ngeoffray@google.com> |
Allow multiple HArmDexCacheArrayBase. So that even graphs with irreducible loops can use it and avoid loading methods through kDexCacheViaMethod. Test: test-art-target Change-Id: I913eece1c134ebe9ea989064e477f694b8895d8f
|
a2f526f889be06f96ea59624c9dfb1223b3839f3 |
|
19-Jan-2017 |
Mathieu Chartier <mathieuc@google.com> |
Compressed native PC for stack maps Compress native PC based on instruction alignment. This reduces the size of stack maps, boot.oat is 0.4% smaller for arm64. Test: test-art-host, test-art-target, N6P booting Change-Id: I2b70eecabda88b06fa80a85688fd992070d54278
|
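The idea behind "Compressed native PC for stack maps" can be sketched like this: since every instruction starts at a multiple of the instruction alignment, a native PC offset can be stored divided by that alignment and multiplied back when read. The alignment values and function names below are assumptions for illustration and are not quoted from the commit.

```python
# Assumed instruction alignments in bytes (illustrative values).
INSTRUCTION_ALIGNMENT = {"arm64": 4, "thumb2": 2, "x86": 1}

def compress_native_pc(pc_offset: int, isa: str) -> int:
    """Drop the low bits that are always zero for an aligned PC offset."""
    alignment = INSTRUCTION_ALIGNMENT[isa]
    if pc_offset % alignment != 0:
        raise ValueError("native PC offset is not instruction-aligned")
    return pc_offset // alignment

def decompress_native_pc(compressed: int, isa: str) -> int:
    """Recover the original byte offset from its compressed form."""
    return compressed * INSTRUCTION_ALIGNMENT[isa]
```

With a 4-byte alignment each stored offset needs two fewer bits, which is where the size reduction in a bit-packed encoding would come from.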
e761bccf9f0d884cc4d4ec104568cef968296492 |
|
19-Jan-2017 |
Nicolas Geoffray <ngeoffray@google.com> |
Revert "Revert "Load the array class in the compiler for allocations."" This reverts commit fee255039e30c1c3dfc70c426c3d176221c3cdf9. Change-Id: I02b45f9a659d872feeb35df40b42c1be9878413a
|
fee255039e30c1c3dfc70c426c3d176221c3cdf9 |
|
19-Jan-2017 |
Hiroshi Yamauchi <yamauchi@google.com> |
Revert "Load the array class in the compiler for allocations." libcore test fails. This reverts commit cc99df230feb46ba717252f002d0cc2da6828421. Change-Id: I5bac595acd2b240886062e8c1f11f9095ff6a9ed
|
cc99df230feb46ba717252f002d0cc2da6828421 |
|
18-Jan-2017 |
Nicolas Geoffray <ngeoffray@google.com> |
Load the array class in the compiler for allocations. Removing one other dependency for needing to pass the current method, and having dex_cache_resolved_types_ in ArtMethod. oat file increase: - x64: 0.25% - arm32: 0.30% - x86: 0.28% test: test-art-host, test-art-target Change-Id: Ibca4fa00d3e31954db2ccb1f65a584b8c67cb230
|
5247c08fb186a5a2ac02226827cf6b994f41a681 |
|
13-Jan-2017 |
Nicolas Geoffray <ngeoffray@google.com> |
Put the resolved class in HLoadClass. To avoid repeated lookups in sharpening/rtp/inlining. Test: test-art-host test-art-target Change-Id: I08d0da36a4bb061cdaa490ea2af3a3217a875bbe
|
1998cd02603197f2acdc0734397a6d48b2f59b80 |
|
13-Jan-2017 |
Vladimir Marko <vmarko@google.com> |
Implement HLoadClass/kBssEntry for boot image. Test: m test-art-host Test: m test-art-host with CC Test: m test-art-target on Nexus 9 Test: Nexus 9 boots. Test: Build aosp_mips64-eng Bug: 30627598 Change-Id: I168f24dedd5fb54a1e4215ecafb947ffb0dc3280
|
6bec91c7d4670905cd67440991ec76fd54d0f000 |
|
09-Jan-2017 |
Vladimir Marko <vmarko@google.com> |
Store resolved types for AOT code in .bss. Test: m test-art-host Test: m test-art-target on Nexus 9. Test: Nexus 9 boots. Test: Build aosp_mips64-eng. Bug: 30627598 Bug: 34193123 Change-Id: I8ec60a98eb488cb46ae3ea56341f5709dad4f623
|
4155998a2f5c7a252a6611e3926943e931ea280a |
|
06-Jan-2017 |
Vladimir Marko <vmarko@google.com> |
Make runtime call on main for HLoadClass/kDexCacheViaMethod. Remove dependency of the compiled code on types dex cache array in preparation for changing to a hash-based array. Test: m test-art-host Test: m test-art-target on Nexus 9 Bug: 30627598 Change-Id: I3c426ed762c12eb9eb4bb61ea9a23a0659abf0a2
|
48886c2ee655a16224870fee52dc8721a52babcf |
|
06-Jan-2017 |
Vladimir Marko <vmarko@google.com> |
Remove HLoadClass::LoadKind::kDexCachePcRelative. Test: m test-art-host Test: m test-art-target-run-test-552-checker-sharpening Bug: 30627598 Change-Id: Ic809b0f3a8ed0bd4dc7ab67aa64866f9cdff9bdb
|
ac141397dc29189ad2b2df41f8d4312246beec60 |
|
13-Jan-2017 |
Orion Hodson <oth@google.com> |
Revert "Revert "ART: Compiler support for invoke-polymorphic."" This reverts commit 0fb5af1c8287b1ec85c55c306a1c43820c38a337. This takes us back to the original change and attempts to fix the issues encountered: - Adds transition record push/pop around artInvokePolymorphic. - Changes X86/X64 relocations for MacSDK. - Implements MIPS entrypoint for art_quick_invoke_polymorphic. - Corrects size of returned reference in art_quick_invoke_polymorphic on ARM. Bug: 30550796,33191393 Test: art/test/run-test 953 Test: m test-art-run-test Change-Id: Ib6b93e00b37b9d4ab743a3470ab3d77fe857cda8
|
0d3998b5ff619364acf47bec0b541e7a49bd6fe7 |
|
12-Jan-2017 |
Nicolas Geoffray <ngeoffray@google.com> |
Revert "Revert "Make object allocation entrypoints only take a class."" This reverts commit f7aaacd97881c6924b8212c7f8fe4a4c8721ef53. Change-Id: I6756cd1e6110bb45231f62f5e388f16c044cb145
|
f7aaacd97881c6924b8212c7f8fe4a4c8721ef53 |
|
12-Jan-2017 |
Hiroshi Yamauchi <yamauchi@google.com> |
Revert "Make object allocation entrypoints only take a class." 960-default-smali64 is failing. This reverts commit 2b615ba29c4dfcf54aaf44955f2eac60f5080b2e. Change-Id: Iebb8ee5a917fa84c5f01660ce432798524d078ef
|
0fb5af1c8287b1ec85c55c306a1c43820c38a337 |
|
11-Jan-2017 |
Orion Hodson <oth@google.com> |
Revert "ART: Compiler support for invoke-polymorphic." This reverts commit 02e3092f8d98f339588e48691db77f227b48ac1e. Reasons for revert: - Breaks MIPS/MIPS64 build. - Fails under GCStress test on x64. - Different x64 build configuration doesn't like relocation. Change-Id: I512555b38165d05f8a07e8aed528f00302061001
|
02e3092f8d98f339588e48691db77f227b48ac1e |
|
01-Dec-2016 |
Orion Hodson <oth@google.com> |
ART: Compiler support for invoke-polymorphic. Adds basic support to invoke method handles in compiled code. Enables method verification for methods containing invoke-polymorphic. Adds k45cc/k45rc output to Instruction::DumpString() which was found to be missing when enabling verification. Include stack traces in test 957-methodhandle-transforms for failures so they can be easily identified. Bug: 30550796,33191393 Test: art/test/run-test 953 Test: m test-art-run-test Change-Id: Ic9a96ea24906087597d96ad8159a5bc349d06950
|
2b615ba29c4dfcf54aaf44955f2eac60f5080b2e |
|
06-Jan-2017 |
Nicolas Geoffray <ngeoffray@google.com> |
Make object allocation entrypoints only take a class. Change motivated by: - Dex cache compression: having the allocation fast path do a dex cache lookup will be too expensive. So instead, rely on the compiler having direct access to the class (either through BSS for AOT, or JIT tables for JIT). - Inlining: the entrypoints relied on the caller of the allocation to have the same dex cache as the outer method (stored at the bottom of the stack). This meant we could not inline methods from a different dex file that do allocations. By avoiding the dex cache lookup in the entrypoint, we can now remove this restriction. Code expansion on average for Docs/Gms/FB/Framework (go/lem numbers): - Around 0.8% on arm64 - Around 1% for x64, arm - Around 1.5% on x86 Test: test-art-host, test-art-target, ART_USE_READ_BARRIER=true/false Test: test-art-host, test-art-target, ART_DEFAULT_GC_TYPE=SS ART_USE_TLAB=true Change-Id: I41f3748bb4d251996aaf6a90fae4c50176f9295f
|
f0acfe7a812a332122011832074142718c278dae |
|
09-Jan-2017 |
Nicolas Geoffray <ngeoffray@google.com> |
Keep resolved String in HLoadString. For the following reasons: - Avoids needing to do a lookup again in CodeGenerator::EmitJitRoots. - Fixes races where the string was GC'ed before CodeGenerator::EmitJitRoots. - Makes it possible to do GVN on the same string but defined in different dex files. Test: test-art-host, test-art-target Change-Id: If2b5d3079f7555427b1b96ab04546b3373fcf921
|
c1a42cf3873be202c8c0ca3c4e67500b470ab075 |
|
18-Dec-2016 |
Nicolas Geoffray <ngeoffray@google.com> |
Remove soon to be obsolete call kinds for direct calls. And remove CompilerDriver::GetCodeAndMethodForDirectCall in preparation of removing non-PIC prebuild and non-PIC on-device boot image compilation. Test: test-art-host test-art-target bug:33192586 Change-Id: Ic48e3e8b9d7605dd0e66f31d458a182198ba9578
|
0f0829ba15e4ed54472fb6ebac3a19b101d03db3 |
|
13-Dec-2016 |
Vladimir Marko <vmarko@google.com> |
Remove obsolete DeduplicateDexCacheAddressLiteral(). Test: Rely on TreeHugger Bug: 30627598 Change-Id: Ia3c7a1d528f62b730d7ac1cc7b67f21d9ff06c9e
|
22384aeab988df7fa5ccdc48a668589c5f602c39 |
|
12-Dec-2016 |
Nicolas Geoffray <ngeoffray@google.com> |
Revert "Revert "Add kJitTableAddress for HLoadClass."" This reverts commit d2d5262c8370309e1f2a009f00aafc24f1cf00a0. Change-Id: I6149d5c7d5df0b0fc5cb646a802a2eea8d01ac08
|
d2d5262c8370309e1f2a009f00aafc24f1cf00a0 |
|
12-Dec-2016 |
Nicolas Geoffray <ngeoffray@google.com> |
Revert "Add kJitTableAddress for HLoadClass." One test failure after merge. This reverts commit 5b12f7973636bfea29da3956a9baa7a6bbe2b666. Change-Id: I120c49e53274471fc1c82a10d52e99c83f5f85cc
|
5b12f7973636bfea29da3956a9baa7a6bbe2b666 |
|
09-Dec-2016 |
Nicolas Geoffray <ngeoffray@google.com> |
Add kJitTableAddress for HLoadClass. This new kind loads classes from the root table associated with JIT compiled code. Also remove kDexCacheAddress, which is replaced by kJitTableAddress. test: ART_TEST_JIT=true test-art-host-jit test-art-target-jit Change-Id: Ia23029688d1a60c178bf2ffa7463927c5d5de4d0
|
063fc772b5b8aed7d769cd7cccb6ddc7619326ee |
|
02-Aug-2016 |
Mingyao Yang <mingyao@google.com> |
Class Hierarchy Analysis (CHA) The class linker now tracks whether a method has a single implementation and if so, the JIT compiler will try to devirtualize a virtual call for the method into a direct call. If the single-implementation assumption is violated due to additional class linking, compiled code that makes the assumption is invalidated. Deoptimization is triggered for compiled code live on stack. Instead of patching return pc's on stack, a CHA guard is added which checks a hidden should_deoptimize flag for deoptimization. This approach limits the number of deoptimization points. This CL does not devirtualize abstract/interface method invocation. Slides on CHA: https://docs.google.com/a/google.com/presentation/d/1Ax6cabP1vM44aLOaJU3B26n5fTE9w5YU-1CRevIDsBc/edit?usp=sharing Change-Id: I18bf716a601b6413b46312e925a6ad9e4008efa4 Test: ART_TEST_JIT=true m test-art-host/target-run-test test-art-host-gtest
|
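The mechanism in the Class Hierarchy Analysis entry above can be modeled roughly as follows: compiled code that relies on a single-implementation assumption carries a guard that checks a should-deoptimize flag, and linking a class that breaks the assumption invalidates that code and sets the flag. This Python sketch is a loose model. All class and method names are hypothetical, and the flag is attached to the compiled code object here purely to keep the example short.

```python
class Method:
    """A virtual method tracked for the single-implementation property."""
    def __init__(self, name: str) -> None:
        self.name = name
        self.has_single_implementation = True

class CompiledCode:
    """Code compiled under the assumption that a method is not overridden."""
    def __init__(self, assumed_method: Method) -> None:
        self.assumed_method = assumed_method
        self.valid = True
        self.should_deoptimize = False

    def run_guarded_call(self) -> str:
        # CHA guard: check the hidden flag before the devirtualized call.
        if self.should_deoptimize:
            return "deoptimize"
        return "direct call to " + self.assumed_method.name

class ClassLinker:
    """Tracks which compiled code depends on which assumption."""
    def __init__(self) -> None:
        self.dependents = {}

    def register(self, code: CompiledCode) -> None:
        self.dependents.setdefault(code.assumed_method, []).append(code)

    def link_overriding_class(self, method: Method) -> None:
        # A new override violates the single-implementation assumption.
        method.has_single_implementation = False
        for code in self.dependents.pop(method, []):
            code.valid = False
            code.should_deoptimize = True
```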
8a0128a5ca0784f6d2b4ca27907e8967a74bc4c5 |
|
28-Nov-2016 |
Andreas Gampe <agampe@google.com> |
ART: Add dex::StringIndex Add abstraction for uint32_t string index. Test: m test-art-host Change-Id: I917c2881702fe3df112c713f06980f2278ced7ed
|
a5b09a67034e57a6e10231dd4bd92f4cb50b824c |
|
18-Nov-2016 |
Andreas Gampe <agampe@google.com> |
ART: Add dex::TypeIndex Add abstraction for uint16_t type index. Test: m test-art-host Change-Id: I47708741c7c579cbbe59ab723c1e31c5fe71f83a
|
132d8363bf8cb043d910836672192ec8c36649b6 |
|
16-Nov-2016 |
Nicolas Geoffray <ngeoffray@google.com> |
Revert "Revert "Revert "Revert "JIT root tables."""" Test: 626-set-resolved-string, test-art-host, test-art-target Test: run-libcore-tests.sh Test: phone boots and runs This reverts commit 3395fbc20bcd20948bec8958db91b304c17cacd8. Change-Id: I104b73d093e3eb6a271d564cfdb9ab09c1c8cf24
|
6beced4c017826f7c449f12fac7fa42403657f2b |
|
16-Nov-2016 |
Mathieu Chartier <mathieuc@google.com> |
Change iftable to never be null Simplifies code generation by removing a null check. The null case is rare. Ritzperf code size: 13107624 -> 13095336 Also addressed comments from previous CL. Bug: 32577579 Test: test-art-host, run ritzperf both with CC Change-Id: I2b31e800867112869d7f0643e16c08826296979e
|
9fd8c60cdff7b28a89bb97fd90ae9d0f37cf8f8b |
|
14-Nov-2016 |
Mathieu Chartier <mathieuc@google.com> |
Pass object instead of class to instanceof entrypoint Reduces code size. Also avoid read barrier for kArrayCheck case. Bug: 32577579 Test: test-art-host, test-art-target CC Change-Id: Ia890f656fe166b2d39c522b63a8a6469404134ae
|
afbcdafde4d2c1de293c3ba1da22f579df200b3b |
|
14-Nov-2016 |
Mathieu Chartier <mathieuc@google.com> |
Clean up interface check cast Changed arm, arm64 to use fewer labels and removed forward branch in the success case. Cleaned up X86, X86_64 to remove the is_null label. Bug: 12687968 Bug: 32577579 Test: test-art-host, test-art-target CC Change-Id: Iba426dff548b2ef42198fad13efeb075f7c724a7
|
3395fbc20bcd20948bec8958db91b304c17cacd8 |
|
14-Nov-2016 |
Nicolas Geoffray <ngeoffray@google.com> |
Revert "Revert "Revert "JIT root tables.""" libcore failures: dalvikvm32 F 11-14 03:04:06 14870 14870 jit_code_cache.cc:310] Check failed: new_string != nullptr This reverts commit 75afcdd3503a8a8518e5b23d21b6e73306ce39ce. Change-Id: I5a6b6b48aa79a763d1ff1ba4d85d63811254787d
|
75afcdd3503a8a8518e5b23d21b6e73306ce39ce |
|
10-Nov-2016 |
Nicolas Geoffray <ngeoffray@google.com> |
Revert "Revert "JIT root tables."" Also contains Revert "Support kJitTableAddress in x86/arm/arm64." This reverts commit 4acd03638fcdb4e5d1666f8eec7eb3bf6d6be035. This reverts commit 997d1217830c0a18b70faeabd53c04700a87d7d9. Test: ART_USE_READ_BARRIER=true/false test-art-host test-art-target Change-Id: I77cb1e9bf8f1b4c58b72d3cf5ca31ced2aaa1ea3
|
fe814e89965ddf9a8b603863bd28259f8dd7be35 |
|
09-Nov-2016 |
Mathieu Chartier <mathieuc@google.com> |
Use entrypoint switching to reduce code size of GcRoot read barrier Set the read barrier mark register entrypoints to null when the GC is not marking. The compiler uses this to avoid needing to load the is_gc_marking boolean. Code size results on ritzperf CC: arm32: 13439400 -> 13242792 (-1.5%) arm64: 16380544 -> 16208512 (-1.05%) Implemented for arm32 and arm64. TODO: Consider implementing on x86. Bug: 32638713 Bug: 29516974 Test: test-art-host + run ritzperf Change-Id: I527ca5dc4cd43950ba43b872d0ac81e1eb5791eb
|
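The entrypoint switching described in the entry above can be sketched as follows: the per-register mark entrypoints are null while the GC is not marking, so generated code only needs to load the entrypoint and test it, rather than loading a separate is_gc_marking flag first. This Python model uses hypothetical names and a made-up register count.

```python
def mark_reference(ref):
    """Stand-in for the read barrier mark routine (hypothetical)."""
    return ("marked", ref)

class Thread:
    NUM_REGISTERS = 4  # illustrative value, not an ART constant

    def __init__(self) -> None:
        self.is_gc_marking = False
        self.mark_entrypoints = [None] * Thread.NUM_REGISTERS

    def set_is_gc_marking(self, marking: bool) -> None:
        # Switch the entrypoints together with the marking state.
        self.is_gc_marking = marking
        target = mark_reference if marking else None
        self.mark_entrypoints = [target] * Thread.NUM_REGISTERS

def load_gc_root(thread: Thread, root, reg: int):
    """Model of a GC root load: one load and one null test."""
    entrypoint = thread.mark_entrypoints[reg]
    if entrypoint is not None:
        return entrypoint(root)
    return root
```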
3af00dc3918dfaacd51fb0ef604de51dd6dc9af4 |
|
10-Nov-2016 |
Mathieu Chartier <mathieuc@google.com> |
Use enum for read barrier options in compiler Enums are just phenomenal. Also fixed a double load error in x86 interface check cast fast path. Test: test-art-host Change-Id: Iea403ce579145b6a294073f3900ad6921c1a0d53
|
aa474eb597056d21c0b21d353b9b6aa460351d0f |
|
10-Nov-2016 |
Mathieu Chartier <mathieuc@google.com> |
Avoid read barriers for inlined check cast Avoiding read barriers improves speed and reduces code size. Doing this can never result in false positives, only false negatives. These false negatives are handled correctly by rechecking in the entrypoint. Ritzperf code size for CC: arm32: 13439400->13300136 (-1.04%) arm64: 16405120->16253568 (-0.92%) Perf: TODO Bug: 29516974 Bug: 12687968 Test: test-art-host, run ritzperf both with CC Change-Id: Ie024e0b1e8ee415781fb73e8029e87e8a5318f86
|
5c44c1bb131e609d9aba6f97f933567bf77cb8ed |
|
05-Nov-2016 |
Mathieu Chartier <mathieuc@google.com> |
Add interface check cast fast path to arm, arm64, x86 Bug: 12687968 Bug: 32577579 Test: test-art-host, test-art-target CC Change-Id: Ia57099d499fa704803cc5f0135f0f53fefe39826
|
4acd03638fcdb4e5d1666f8eec7eb3bf6d6be035 |
|
09-Nov-2016 |
Nicolas Geoffray <ngeoffray@google.com> |
Revert "JIT root tables." May be the offender for jit-gcstress failure of 902. This reverts commit ac3ebc3150760425ed00abd56da48f9a6e0666bc. Change-Id: I9ea6c9236fd1729fed7d1868dd8a111172932308
|
07c919feccdf47f997842a131a802aa6b891e34a |
|
09-Nov-2016 |
Nicolas Geoffray <ngeoffray@google.com> |
Revert "Support kJitTableAddress in x86/arm/arm64." Revert this in order to revert https://android-review.googlesource.com/#/c/285781/ This reverts commit 997d1217830c0a18b70faeabd53c04700a87d7d9. Change-Id: I1888fba1c6f712cae4aec4ea4719b74a46da156c
|
997d1217830c0a18b70faeabd53c04700a87d7d9 |
|
09-Nov-2016 |
Nicolas Geoffray <ngeoffray@google.com> |
Support kJitTableAddress in x86/arm/arm64. test: test-art-host test-art-target, angler boots and runs. Change-Id: I3654ae2809d4d759db76ee1ada1c17f3a9c3b392
|
54d6a207341ad45cb5eceed71a344073ed6d4e31 |
|
09-Nov-2016 |
Vladimir Marko <vmarko@google.com> |
Fix 552-checker-sharpening for PIC test. And remove obsolete HLoadString::LoadKind::kDexCacheAddress. Test: m ART_TEST_PIC_TEST=true test-art-host Change-Id: I3e7a1a98c2c7eba5ea10954d7efcf743a807c300
|
fdaf0f45510374d3a122fdc85d68793e2431175e |
|
13-Oct-2016 |
Vladimir Marko <vmarko@google.com> |
Change string compression encoding. Encode the string compression flag as the least significant bit of the "count" field, with 0 meaning compressed and 1 meaning uncompressed. The main vdex file is a tiny bit larger (+28B for prebuilt boot images, +32 for on-device built images) and the oat file sizes change. Measured on Nexus 9, AOSP ToT, these changes are insignificant when string compression is disabled (-200B for the 32-bit boot*.oat for prebuilt boot image, -4KiB when built on the device attributable to rounding, -16B for 64-bit boot*.oat for prebuilt boot image, no change when built on device) but with string compression enabled we get significant differences: prebuilt multi-part boot image: - 32-bit boot*.oat: -28KiB - 64-bit boot*.oat: -24KiB on-device built single boot image: - 32-bit boot.oat: -32KiB - 64-bit boot.oat: -28KiB The boot image oat file overhead for string compression: prebuilt multi-part boot image: - 32-bit boot*.oat: before: ~80KiB after: ~52KiB - 64-bit boot*.oat: before: ~116KiB after: ~92KiB on-device built single boot image: - 32-bit boot.oat: before: 92KiB after: 60KiB - 64-bit boot.oat: before: 116KiB after: 92KiB The differences in the SplitStringBenchmark seem to be lost in the noise. Test: Run ART test suite on host and Nexus 9 with Optimizing. Test: Run ART test suite on host and Nexus 9 with interpreter. Test: All of the above with string compression enabled. Bug: 31040547 Change-Id: I7570c2b700f1a31004a2d3c18b1cc30046d35a74
|
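The encoding described in the "Change string compression encoding" entry above puts the compression flag in the least significant bit of the "count" field, with 0 meaning compressed and 1 meaning uncompressed. A minimal Python sketch follows. Storing the length in the remaining upper bits is an assumption made here to complete the example, and the function names are hypothetical.

```python
COMPRESSED = 0    # flag value stated in the commit message
UNCOMPRESSED = 1  # flag value stated in the commit message

def encode_count(length: int, compressed: bool) -> int:
    """Pack a string length and the compression flag into one field."""
    flag = COMPRESSED if compressed else UNCOMPRESSED
    # Assumption: the length occupies the bits above the flag bit.
    return (length << 1) | flag

def is_compressed(count: int) -> bool:
    """The least significant bit holds the flag; 0 means compressed."""
    return (count & 1) == COMPRESSED

def length_of(count: int) -> int:
    """Recover the length by shifting out the flag bit."""
    return count >> 1
```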
b99f4d6463e7cb5654af3893ed7b3113665df658 |
|
08-Nov-2016 |
Mathieu Chartier <mathieuc@google.com> |
Change check cast entrypoint to check instance of Reduces code size since we do not need to reload class before calling slow path. TODO: Delete read barriers in the check cast code since the slow path will retry with the proper read barriers if the check fails. Bug: 12687968 Bug: 29516974 Test: test-art-host + test-art-target with CC Change-Id: Ia4eb9bbe3fe2d2016e44523cf0451210828d7b88
|
ac3ebc3150760425ed00abd56da48f9a6e0666bc |
|
05-Oct-2016 |
Nicolas Geoffray <ngeoffray@google.com> |
JIT root tables. Implement root tables for the JIT. Each JIT compiled method gets a table allocated before the stack maps. The table gets visited through Runtime::SweepSystemWeaks. Implement String roots for x86_64 as an example. Test: test-art-host test-art-target Change-Id: Id3d5bc67479e08b52dd4b253e970201203a0f0d2
|
19c5419d21376dd69404736b998fbbb9da54af56 |
|
04-Nov-2016 |
Roland Levillain <rpl@google.com> |
Revert "Revert "Enable IntermediateAddress for primitive arrays with read barriers."" This reverts commit 4a3aa578eff94eb10450fae1772deb7cb8ddc6a6. The failing assertion (see b/30762467): 08-09 11:32:46.767 1654 1656 F dex2oatd: art/compiler/optimizing/register_allocation_resolver.cc:325] Check failed: interval->GetDefinedBy()->IsActualObject() IntermediateAddress@InstanceFieldGet that motivated the initial revert has been removed by a previous CL (commit 70e97462116a47ef2e582ea29a037847debcc029, https://android-review.googlesource.com/#/c/254920/). Test: ART host and target (ARM, ARM64) tests with `ART_USE_READ_BARRIER=true`. Bug: 26601270 Bug: 12687968 Change-Id: I09cae0c6c38ca403924153e9f0eb0cc3ff4540e7
|
12b58b23de974232e991c650405f929f8b0dcc9f |
|
01-Nov-2016 |
Hiroshi Yamauchi <yamauchi@google.com> |
Clean up the runtime read barrier and fix fake address dependency. - Rename GetReadBarrierPointer to GetReadBarrierState. - Change its return type to uint32_t. - Fix the runtime fake address dependency for arm/arm64 using inline asm. - Drop ReadBarrier::black_ptr_ and some brooks code. Bug: 12687968 Test: test-art with CC, Ritz EAAC, libartd boot on N9. Change-Id: I595970db825db5be2e98ee1fcbd7696d5501af55
|
00468f3b4b4741be407169a4f21054ebdcccb2b1 |
|
27-Oct-2016 |
Roland Levillain <rpl@google.com> |
Remove default argument values in GenerateGcRootFieldLoad. These values were never or rarely used. Test: mmma art (with and without `ART_USE_READ_BARRIER=true`) Bug: 12687968 Bug: 29516974 Change-Id: I5d15140ce501bf50d7a87871b1e492cee54913db
|
24a4d11cdc5975215af079dc3d658b79e9b0717e |
|
26-Oct-2016 |
Roland Levillain <rpl@google.com> |
Use CLREX in ARM/ARM64 CAS intrinsic Baker read barrier slow paths. Follow clang's implementation, which uses CLREX in compare-and-exchange operations on the failure path, i.e. when the value read by the LDREX (ARM) or LDXR (ARM64) instruction is not the expected value, in order to release the monitor. The previous implementation was perfectly correct, but this one may improve performance on some micro-architectures. This change only affects the art::arm::ReadBarrierMarkAndUpdateFieldSlowPathARM and art::arm64::ReadBarrierMarkAndUpdateFieldSlowPathARM64 slow paths. Test: make test-art-target-run-test-004-UnsafeTest Bug: 29516905 Bug: 12687968 Change-Id: I99edd1ae6489dcec4a0089bfef52736114c6cd48
|
a1aa3b1f40e496d6f8b3b305a4f956ddf2e425fc |
|
26-Oct-2016 |
Roland Levillain <rpl@google.com> |
Add support for Baker read barriers in UnsafeCASObject intrinsics. Prior to doing the compare-and-swap operation, ensure the expected reference stored in the holding object's field is in the to-space by loading it, emitting a read barrier and updating that field with a strong compare-and-set operation with relaxed memory synchronization ordering (if needed). Test: ART host and target tests and Nexus 5X boot test with Baker read barriers. Bug: 29516905 Bug: 12687968 Change-Id: I480f6a9b59547f11d0a04777406b9bfeb905bfd2
|
94ce9c2f41ea198f5fdcfc09c48b9984c95a9c61 |
|
30-Sep-2016 |
Vladimir Marko <vmarko@google.com> |
Change pResolveString entrypoint to kSaveEverything. Test: Run ART test suite including gcstress on host and Nexus 9. Test: Run ART test suite including gcstress with baker CC on host and Nexus 9. Bug: 20323084 Change-Id: I63c21a7d3be8ff7a5765b5003c85b5317635efe6
|
58a4c6198a71973ea589edebe0b3f17c72d55e29 |
|
18-Oct-2016 |
Mathieu Chartier <mathieuc@google.com> |
Delete unused blocked_register_pairs_ in code generators Legacy code for compatibility with quick? Test: test-art-host CC Change-Id: I9de261daea67dfd9bd3df89826ba9d10f135e29e
|
a7812ae7939b199392f874b24a52454bbd0c13f2 |
|
17-Oct-2016 |
Scott Wakeling <scott.wakeling@linaro.org> |
ARM: VIXL32: Pass initial ART tests with new code generator. - Implement enough codegen to pass ~70 art/tests. - When ART_USE_VIXL_ARM_BACKEND is defined: - Blacklist known-to-fail target tests - interpret-only everything except the tests themselves - Set a flag to use the VIXL based ARM backend Test: export ART_USE_VIXL_ARM_BACKEND=true && mma test-art-target && mma test-art-host Change-Id: Ic8bc095e8449f10f97fa0511284790f36c20e276
|
96eeb4e2bb21afe8783d62e06b91fd1aef682dbb |
|
12-Oct-2016 |
Nicolas Geoffray <ngeoffray@google.com> |
Update HInstruction::NeedsCurrentMethod. HLoadString and HLoadClass when sharpened may not need it anymore. Instead just rely on the HCurrentMethod being the SSA dependency of those instructions. Also save storing the current method in the stack if the graph actually doesn't need it. test: m test-art-host test-art-target Change-Id: I235d8275230637cbbd38fc0d2f9b822f6d2a9c1e
|
aad75c6d5bfab2dc8e30fc99fafe8cd2dc8b74d8 |
|
03-Oct-2016 |
Vladimir Marko <vmarko@google.com> |
Revert "Revert "Store resolved Strings for AOT code in .bss."" Fixed oat_test to keep dex files alive. Fixed mips build. Rewritten the .bss GC root visiting and added write barrier to the artResolveStringFromCode(). Test: build aosp_mips-eng Test: m ART_DEFAULT_GC_TYPE=SS test-art-target-host-gtest-oat_test Test: Run ART test suite on host and Nexus 9. Bug: 20323084 Bug: 30627598 This reverts commit 5f926055cb88089d8ca27243f35a9dfd89d981f0. Change-Id: I07fa2278d82b8eb64964c9a4b66cb93726ccda6b
|
0576575d075e97a227010b4adf74ad5c8a920bde |
|
10-Sep-2016 |
jessicahandojo <jessicahandojo@google.com> |
String Compression for ARM and ARM64 Changes on intrinsics and Code Generation on ARM and ARM64 for string compression feature. Currently the feature is off. For ARM, the sizes of boot.oat and boot.art are unchanged by this change when the feature is OFF. When the feature is ON, boot.oat increases by 0.60% and boot.art decreases by 9.38%. Likewise for ARM64, the sizes of boot.oat and boot.art are unchanged when the feature is OFF. When the feature is ON, boot.oat increases by 0.48% and boot.art decreases by 6.58%. Turn feature on: runtime/mirror/string.h (kUseStringCompression = true) runtime/asm_support.h (STRING_COMPRESSION_FEATURE 1) Test: m -j31 test-art-target All tests passed both when the mirror::kUseStringCompression is ON and OFF. Bug: 31040547 Change-Id: I24e86b99391df33ba27df747779b648c5a820649
|
5f926055cb88089d8ca27243f35a9dfd89d981f0 |
|
30-Sep-2016 |
Vladimir Marko <vmarko@google.com> |
Revert "Store resolved Strings for AOT code in .bss." There are some issues with oat_test64 on host and aosp_mips-eng. Also reverts "compiler_driver: Fix build." Bug: 20323084 Bug: 30627598 This reverts commit 63dccbbefef3014c99c22748d18befcc7bcb3b41. This reverts commit 04a44135ace10123f059373691594ae0f270a8a4. Change-Id: I568ba3e58cf103987fdd63c8a21521010a9f27c4
|
63dccbbefef3014c99c22748d18befcc7bcb3b41 |
|
21-Sep-2016 |
Vladimir Marko <vmarko@google.com> |
Store resolved Strings for AOT code in .bss. And do some related refactorings. Bug: 20323084 Bug: 30627598 Test: Run ART test suite including gcstress on host and Nexus 9. Test: Run ART test suite including gcstress with baker CC on host and Nexus 9. Test: Build aosp_mips64-eng. Change-Id: I1b12c1570fee8e5da490b47f231050142afcbd1e
|
da079bba8403733cac9bb7415b038ffd77e62403 |
|
26-Sep-2016 |
Nicolas Geoffray <ngeoffray@google.com> |
Cleanup String.<init> handling. Move everything to one place (currently well_known_classes.cc, but no strong preference) and define a macro to easily handle the list of affected methods. test: m test-art-host test: m test-art-target Change-Id: Ib8372d130d5458516a1f1ae31014afc76037fc34
|
5e4e11e171f90d9a3ea178fc8e72aac909de55d5 |
|
22-Sep-2016 |
Nicolas Geoffray <ngeoffray@google.com> |
Clean-up sharpening and compiler driver. Remove dependency on compiler driver for sharpening and dex2dex (the methods called on the compiler driver were doing unnecessary work), and remove the now unused methods in compiler driver. Also remove test that is now invalid, as sharpening always succeeds. test: m test-art-host m test-art-target Change-Id: I54e91c6839bd5b0b86182f2f43ba5d2c112ef908
|
804b03ffb9b9dc6cc3153e004c2cd38667508b13 |
|
14-Sep-2016 |
Vladimir Marko <vmarko@google.com> |
Change remaining slow path throw entrypoints to save everything. Change DivZeroCheck, BoundsCheck and explicit NullCheck slow path entrypoints to conform to kSaveEverything. On Nexus 9, AOSP ToT, the boot.oat size reduction is prebuilt multi-part boot image: - 32-bit boot.oat: -12KiB (-0.04%) - 64-bit boot.oat: -24KiB (-0.06%) on-device built single boot image: - 32-bit boot.oat: -8KiB (-0.03%) - 64-bit boot.oat: -16KiB (-0.04%) Test: Run ART test suite including gcstress on host and Nexus 9. Test: Manually disable implicit null checks and test as above. Change-Id: If82a8082ea9ae571c5d03b5e545e67fcefafb163
|
d300d8fa3cf696c459eaf05ffd374c11eb3e9d78 |
|
15-Jul-2016 |
Artem Serov <artem.serov@linaro.org> |
ARM: Use vstm/vldm for live floating point registers save/restore in SlowPathCode. Test: m test-art-target; m test-art-host Change-Id: Id22271c572bb698728444bef90d5c7487ab84b1a
|
f4d6aee7786176df65b093690686617725f08378 |
|
11-Jul-2016 |
Artem Serov <artem.serov@linaro.org> |
ARM: Use stm/ldm for live registers save/restore in SlowPathCode. When there are more than 4 registers to save/restore in the SlowPathCode, stm/ldm can save some code size. Test: m test-art-target; m test-art-host Change-Id: I2d5b44bab58b67207105302cd7d8ee3300b9040a
|
91a6516103b8bf8bb75c3a2840cbdec7521e74a7 |
|
19-Sep-2016 |
Alexandre Rames <alexandre.rames@linaro.org> |
Remove the `CanTriggerGC` side-effects on a few instructions. The side-effect was specified for these instructions as they call runtime. We now have a list of entrypoints that we know cannot trigger GC. We can avoid requiring the side-effect for those. Test: Run ART test suite on Nexus 5X and host. Change-Id: I0e0e6a4d701ce6c75aff486cb0d1bc7fe2e8dda4
|
3b7537bfc5a6b7ccb18b3970d8edf14b72464af7 |
|
13-Sep-2016 |
Vladimir Marko <vmarko@google.com> |
Revert "Revert "Use implicit null checks inside try blocks."" Fix implicit checks in try blocks to emit stack maps. Fix arm64 null exception from signal entrypoint to call the runtime handler instead of simply jumping there. On Nexus 9, AOSP ToT, the boot.oat size reduction is prebuilt multi-part boot image: - 32-bit boot.oat: -448KiB (-1.3%) - 64-bit boot.oat: -528KiB (-1.2%) on-device built single boot image: - 32-bit boot.oat: -448KiB (-1.4%) - 64-bit boot.oat: -528KiB (-1.3%) Note that the oat files no longer contain dex files which have been moved to vdex, so the percentages are not directly comparable with those reported in the original commit. Test: Run ART test suite including gc-stress on host and Nexus 9. Bug: 30212852 Bug: 31468464 This reverts commit 0719b5b9b458cb3eb9f0823f0dacdfe1a71214dd. Change-Id: If8a9da8c11adf2aad203e93b6684ce16ed776285
|
0719b5b9b458cb3eb9f0823f0dacdfe1a71214dd |
|
13-Sep-2016 |
Nicolas Geoffray <ngeoffray@google.com> |
Revert "Use implicit null checks inside try blocks." Fails gcstress tests. This reverts commit 7aa7560683626c7893011271c241b3265ded1dc3. Change-Id: I4f5c89048b9ffddbafa02f3001e329ff87058ca2
|
a60a7053cd9a25c89dedc810b8a539cad3d56b36 |
|
12-Sep-2016 |
Vladimir Marko <vmarko@google.com> |
Remove custom CheckCast slow path caller saves for Baker CC. For Baker CC, CheckCast has both a read-barrier marking slow path and a pCheckCast slow path. When the latter is known to leave the method, i.e. known to throw outside a try-block, we do not need to save live registers for retrieval for the exception delivery and since the read-barrier marking does not need to save any registers either we were setting the custom slow path caller saves to empty to avoid reserving unnecessary spill space. However, this also leads to marking live references in caller-save registers in the register mask and while the read-barrier marking entrypoint doesn't care, it causes a stack walk for the pCheckCast to try and retrieve an unsaved register. For the time being, revert to the default caller saves. This is a partial revert of https://android-review.googlesource.com/254920 Test: Run ART test suite on host and Nexus 9. Bug: 29231980 Bug: 30212852 Change-Id: I4e22125f3d8903c97506aa2e6e66bea8e8e6baef
|
7aa7560683626c7893011271c241b3265ded1dc3 |
|
07-Sep-2016 |
Vladimir Marko <vmarko@google.com> |
Use implicit null checks inside try blocks. Make implicit null check entrypoint save all registers, use platform-specific approach to still pass the fault address. Allow implicit null checks in try blocks. On Nexus 9, AOSP ToT, the boot.oat size reduction is prebuilt multi-part boot image: - 32-bit boot.oat: -452KiB (-0.7%) - 64-bit boot.oat: -482KiB (-0.7%) on-device built single boot image: - 32-bit boot.oat: -444KiB (-0.7%) - 64-bit boot.oat: -488KiB (-0.7%) Test: Run ART test suite on host and Nexus 9. Test: Build aosp_mips64-eng. Change-Id: I279f3ab57e2e2f338131c5cac45c51b673bdca19
|
e0576d15f3ad0e9316f96838af01f7cc7acf6c3c |
|
09-Sep-2016 |
Mathieu Chartier <mathieuc@google.com> |
Re-enable boot image direct string loads for read barriers Boot.oat code size with CC baker: ARM32: 70775656 -> 69817028 (-1.35%) ARM64: 80819424 -> 79417072 (-1.74%) X86 unmeasured. X86_64 unmeasured. Performance unmeasured, should be faster. Bug: 29516974 Test: test-art-host CC baker, N6P booting CC baker Change-Id: I219faaca9ed17af81d2815fb5e124120f307af83
|
d8ec6dba03e61734a398e9cd6e017dd90eec6101 |
|
31-Aug-2016 |
Christina Wadsworth <cwadsworth@google.com> |
ART: Generate path to entrypoints in VisitLoadString for arm ARM32 boot.oat with CC: 72534816 -> 71888864 (-0.9%) Move code that branches to entrypoints from GenerateSlowPaths to VisitLoadString. Since we're doing this every time, we shouldn't have it at the end with all of the slow paths. Test: N6P booting with CC, test-art-target32 on shamu Change-Id: I9c3307629015c9f6460506519339d4f275abe5a9
|
31b12e32073f458950e96d0d1b44e48508cf67e4 |
|
03-Sep-2016 |
Mathieu Chartier <mathieuc@google.com> |
Avoid read barrier for image HLoadClass Concurrent copying baker: X86_64 core-optimizing-pic.oat: 28583112 -> 27906824 (2.4% smaller) Around 0.4% of 2.4% is from re-enabling kBootImageLinkTimeAddress, kBootImageLinkTimePcRelative, and kBootImageAddress. N6P boot.oat 32: 73042140 -> 71891956 (1.57% smaller) N6P boot.oat 64: 83831608 -> 82531456 (1.55% smaller) EAAC: 1252 -> 1245 (32 samples) Bug: 29516974 Test: test-art-host CC baker, N6P booting Change-Id: I9a196cf0157058836981c43c93872e9f0c4919aa
|
9d6e1f892768b0a0cf273ca455070f462a766a06 |
|
05-Sep-2016 |
Roland Levillain <rpl@google.com> |
Do type checks in ArraySet without read barriers. This approach is valid in the case of Baker and non-Baker read barriers. Benchmarks (ARM64) score variations on Nexus 5X with CPU cores clamped at 960000 Hz (aosp_bullhead-userdebug build, medians of 10 runs for each suite): - Ritzperf - average (lower is better): -0.44% (virtually unchanged) - CaffeineMark - average (higher is better): -0.20% (virtually unchanged) - DeltaBlue (lower is better): -4.08% (slightly better) - Richards - average (lower is better): -0.57% (virtually unchanged) - SciMark2 - average (higher is better): -0.52% (virtually unchanged) Details about Ritzperf benchmarks with meaningful variations (lower is better): - GenericCalcActions.MemAllocTest: +3.02% (slightly worse) Details about Richards benchmarks with meaningful variations (lower is better): - gibbons -5.01% (better) Boot image code size variation on Nexus 5X (aosp_bullhead-userdebug build): - total ARM64 framework Oat files size change: 83127840 bytes -> 83082656 bytes (-45184 bytes, -0.05%) - total ARM framework Oat files size change: 72571872 bytes -> 72522796 bytes (-49076 bytes, -0.07%) Test: ART_USE_READ_BARRIER=true ART_HEAP_POISONING=true m test-art-host Test: ART_USE_READ_BARRIER=true ART_HEAP_POISONING=true m test-art-target Bug: 29516974 Bug: 12687968 Change-Id: I8fe130156ace87dd2e4a15d9f8b4111287e735b3
|
239d6eaff0cbb5c4c0139f7053a012758799f186 |
|
05-Sep-2016 |
Vladimir Marko <vmarko@google.com> |
Change deoptimize entrypoint to save everything. And implement FPU register retrieval from stack on x86. On Nexus 9, AOSP ToT, the boot.oat size reduction is prebuilt multi-part boot image: - 32-bit boot.oat: -20KiB (-0.03%) - 64-bit boot.oat: -45KiB (-0.06%) on-device built single boot image: - 32-bit boot.oat: -24KiB (-0.04%) - 64-bit boot.oat: -36KiB (-0.05%) Test: Run ART test suite on host and Nexus 9. Bug: 30212852 Change-Id: I5d98e2a24363136d73dfec6100ab02f8eb101911
|
70e97462116a47ef2e582ea29a037847debcc029 |
|
09-Aug-2016 |
Vladimir Marko <vmarko@google.com> |
Avoid excessive spill slots for slow paths. Reducing the frame size makes stack maps smaller as we need fewer bits for stack masks and some dex register locations may use short location kind rather than long. On Nexus 9, AOSP ToT, the boot.oat size reduction is prebuilt multi-part boot image: - 32-bit boot.oat: -416KiB (-0.6%) - 64-bit boot.oat: -635KiB (-0.9%) prebuilt multi-part boot image with read barrier: - 32-bit boot.oat: -483KiB (-0.7%) - 64-bit boot.oat: -703KiB (-0.9%) on-device built single boot image: - 32-bit boot.oat: -380KiB (-0.6%) - 64-bit boot.oat: -632KiB (-0.9%) on-device built single boot image with read barrier: - 32-bit boot.oat: -448KiB (-0.6%) - 64-bit boot.oat: -692KiB (-0.9%) The other benefit is that at runtime, threads may need fewer pages for their stacks, reducing overall memory usage. We defer the calculation of the maximum spill size from the main register allocator (linear scan or graph coloring) to the RegisterAllocationResolver and do it based on the live registers at slow path safepoints. The old notion of an artificial slow path safepoint interval is removed as it is no longer needed. Test: Run ART test suite on host and Nexus 9. Bug: 30212852 Change-Id: I40b3d114e278e2c5807982904fa49bf6642c6275
|
da8ffec70e9019fe1208ac38444a7048958fc206 |
|
09-Mar-2016 |
Serban Constantinescu <serban.constantinescu@linaro.org> |
Add entrypoint type information. For some of the runtime calls we do not need to generate stack maps. For example, the Optimizing compiler implements HRem Floating Point by calling libm's fmod(). Since this is a leaf method that does not suspend the execution, we do not need to treat the fmod() invoke as a possible suspend point and thus we do not need to create a stack map for the particular PC. For now conservatively only tag the maths runtime entrypoints with this information. Test: m test-art-target Change-Id: Iab73dcf8047d2edaa7a570113ee792e46ccbc464
|
68c981fad87720fae9c799b240141ce3c12cd5bf |
|
26-Aug-2016 |
Vladimir Marko <vmarko@google.com> |
ARM/MIPS: Avoid dead dex cache arrays base for intrinsics. Test: Run ART test suite on host and Nexus 6. Change-Id: Ie2ad70f1e3f125eae5dad53a6384d405e0311505
|
4bb30ac111d2d9d57a504597520454e05cdee3ed |
|
22-Jun-2016 |
Serban Constantinescu <serban.constantinescu@linaro.org> |
ARM: Make runtime invokes use InvokeRuntime(). This patch refactors all of the ARM Optimizing compiler runtime invokes to use InvokeRuntime(). It also fixes some misuses of RecordPcInfo(). Change-Id: I722bc2ba95e42ff69ca12c3edc09326e0de2881f
|
16d9f949698faed28435af7aa9c9ebacbfd5d1a8 |
|
25-Aug-2016 |
Roland Levillain <rpl@google.com> |
Re-enable the ArraySet fast path with Baker read barriers. Benchmarks (ARM64) score variations on Nexus 5X with CPU cores clamped at 960000 Hz (aosp_bullhead-userdebug build): - Ritzperf - average (lower is better): -0.95% (virtually unchanged) - CaffeineMark - average (higher is better): +2.50% (slightly better) - DeltaBlue (lower is better): -0.55% (virtually unchanged) - Richards - average (lower is better): +0.67% (virtually unchanged) - SciMark2 - average (higher is better): -0.10% (virtually unchanged) Details about Ritzperf benchmarks with meaningful variations (lower is better): - GenericCalcActions.MemAllocTest: -5.05% (better) Details about CaffeineMark benchmarks with meaningful variations (higher is better): - Method: +16.88% (better) Details about Richards benchmarks with meaningful variations (lower is better): - deutsch_acc_interface: +9.86% (worse) Boot image code size variation on Nexus 5X (aosp_bullhead-userdebug build): - total ARM64 framework Oat files size change: 105933472 bytes -> 106027680 bytes (+0.09%) - total ARM framework Oat files size change: 89157936 bytes -> 89239856 bytes (+0.09%) Test: ART host and target (ARM, ARM64) tests. Bug: 29516974 Bug: 29506760 Bug: 12687968 Change-Id: Ib9e9709712295e17804b8888ac10e3d518ff2e70
|
8d49fd7b1087fba274a844cbf180349c528cf912 |
|
25-Aug-2016 |
Vladimir Marko <vmarko@google.com> |
ArraySet without type check does not need read barrier. Test: Run ART test suite with ART_USE_READ_BARRIER=true on host and Nexus 9. Bug: 12687968 Change-Id: Ie04a34b2149f4fc6fe995f3e43e76986a3f6330f
|
953437bd51059801d92079295f728d0260efca31 |
|
24-Aug-2016 |
Vladimir Marko <vmarko@google.com> |
Revert "Revert "x86/x86-64: Avoid temporary for read barrier field load."" Fixed the fault handler recognizing the TEST instruction and fault address within the lock word. Added tests to 439-npe. Bug: 29966877 Bug: 12687968 Test: Tested with ART_USE_READ_BARRIER=true on host. Test: Tested with ART_USE_READ_BARRIER=true ART_HEAP_POISONING=true on host. This reverts commit ccf15bca330f9a23337b1a4b5850f7fcc6c1bf15. Change-Id: I8990def5f719c9205bf6e5fdba32027fa82bec50
|
ccf15bca330f9a23337b1a4b5850f7fcc6c1bf15 |
|
23-Aug-2016 |
Vladimir Marko <vmarko@google.com> |
Revert "x86/x86-64: Avoid temporary for read barrier field load." Fault handler does not recognize the instruction F6 /0 ib TEST r/m8, imm8 so we get crashes instead of NPEs. Bug: 29966877 Bug: 12687968 This reverts commit ccf06d8f19a37432de4a3b768747090adfbd18ec. Change-Id: Ib7db3b59f44c0d3ed5e24a20b6c6ee596a89d709
|
ccf06d8f19a37432de4a3b768747090adfbd18ec |
|
12-Aug-2016 |
Vladimir Marko <vmarko@google.com> |
x86/x86-64: Avoid temporary for read barrier field load. Add TEST instructions for memory and immediate. Use the byte version to avoid a temporary in read barrier field load. Test: Tested with ART_USE_READ_BARRIER=true on host. Test: Tested with ART_USE_READ_BARRIER=true ART_HEAP_POISONING=true on host. Bug: 29966877 Bug: 12687968 Change-Id: Ia415d3c2e1ae1ff6dff11d72bbb7d96d5deed6ee
|
0b671c0408e98824e1f92b1ee951b210c090fe7a |
|
19-Aug-2016 |
Roland Levillain <rpl@google.com> |
Add support for Baker read barriers in SystemArrayCopy intrinsics. Benchmarks (ARM64) score variations on Nexus 5X with CPU cores clamped at 960000 Hz (aosp_bullhead-userdebug build): - Ritzperf - average (lower is better): -3.03% (slightly better) - CaffeineMark - average (higher is better): +1.26% (slightly better) - DeltaBlue (lower is better): -10.50% (better) - Richards - average (lower is better): -3.36% (slightly better) - SciMark2 - average (higher is better): +0.26% (virtually unchanged) Details about Ritzperf benchmarks with meaningful variations (lower is better): - FormulaEvaluationActions.EvaluateAndApplyChanges: -13.26% (better) - FormulaEvaluationActions.EvaluateCascadingSums: -10.94% (better) - FormulaEvaluationActions.EvaluateComplexFormulas: -15.50% (better) - FormulaEvaluationActions.EvaluateFibonacci: -10.41% (better) - FormulaEvaluationActions.EvaluateLargeSums: +6.02% (worse) Boot image code size variation on Nexus 5X (aosp_bullhead-userdebug build): - total ARM64 framework Oat files size change: 107047632 bytes -> 107154128 bytes (+0.10%) - total ARM framework Oat files size change: 90932028 bytes -> 91009852 bytes (+0.09%) Test: ART host and target (ARM, ARM64) tests + Nexus 5X boot. Bug: 29516905 Bug: 29506760 Bug: 12687968 Change-Id: I85431368d09965687a0301ae2eb3c991f276ce5d
|
bf44e0e5281de91f2e38a9378b94ef8c50ad9b23 |
|
18-Aug-2016 |
Christina Wadsworth <cwadsworth@google.com> |
ART: Implement a fixed size string dex cache Previously, the string dex cache was dex_file->NumStringIds() size, and @ruhler found that only ~1% of that cache was ever getting filled. Since many of these string dex caches were previously 100,000+ indices in length, we're wasting a few hundred KB per app by storing null pointers. The intent of this project was to reduce the space the string dex cache is using, while not regressing on time that much. This is the first of a few CLs, which implements the new fixed size array and disables the compiled code so it always goes slow path. In four other CLs, I implemented a "medium path" that regresses from the previous "fast path" only a bit in assembly in the entrypoints. @vmarko will introduce new compiled code in the future so that we ultimately won't be regressing on time at all. Overall, space savings have been confirmed as on the order of 100 KB per application. A 4-5% slow down in art-opt on Golem, and no noticeable slow down in the interpreter. The opt slow down should be diminished once the new compiled code is introduced. Test: m test-art-host Bug: 20323084 Change-Id: Ic654a1fb9c1ae127dde59290bf36a23edb55ca8e
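The fixed-size cache idea described above can be sketched as follows. This is only an illustration under stated assumptions, not the code from the CL: the class name `FixedSizeStringCache`, the slot count of 1024, the modulo slot selection, and the use of `std::string` in place of a managed String reference are all hypothetical choices made for the example.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical sketch: a fixed number of slots replaces an array sized by
// the dex file's string id count. All names and sizes are assumptions.
class FixedSizeStringCache {
 public:
  static constexpr std::size_t kSize = 1024;  // assumed slot count

  // Returns the cached string for string_idx, or nullptr on a miss, in
  // which case the caller would take the slow path and resolve the string.
  const std::string* Lookup(uint32_t string_idx) const {
    const Entry& e = entries_[string_idx % kSize];
    return (e.value != nullptr && e.index == string_idx) ? e.value : nullptr;
  }

  // Stores a resolved string, evicting whatever occupied the slot before.
  void Store(uint32_t string_idx, const std::string* value) {
    Entry& e = entries_[string_idx % kSize];
    e.index = string_idx;
    e.value = value;
  }

 private:
  struct Entry {
    uint32_t index = 0;                  // string index that owns this slot
    const std::string* value = nullptr;  // nullptr means the slot is empty
  };
  std::array<Entry, kSize> entries_{};
};
```

Storing the index next to the value is what lets a lookup tell a genuine hit from a slot now owned by a different string index; the cost is that two indices mapping to the same slot evict each other.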
|
4a3aa578eff94eb10450fae1772deb7cb8ddc6a6 |
|
15-Aug-2016 |
Roland Levillain <rpl@google.com> |
Revert "Enable IntermediateAddress for primitive arrays with read barriers." This CL breaks the angler-userdebug build with `ART_USE_READ_BARRIER=true`. Test: Build angler-userdebug with `ART_USE_READ_BARRIER=true`. Bug: 30762467 Bug: 26601270 Bug: 12687968 This reverts commit 12ecf0800d465acdaa3deccd383ff8ed3428a183. Change-Id: Ia2069ac9436d2336311dd8d0f183c02e587586ae
|
7cbd27fe778f2c348136540d52b5473e28f5769d |
|
12-Aug-2016 |
Roland Levillain <rpl@google.com> |
Adjust spacing before NOLINT comments in ART. Note that neither clang-tidy nor cpplint.py complain about these style "issues", precisely because of the NOLINT comments. Test: WITH_TIDY=1 WITH_TIDY_CHECKS='-*,misc-macro-parentheses' mmma art Change-Id: Id692fd394ffbd4fe208cbbe4407b4d5e208462bb
|
12ecf0800d465acdaa3deccd383ff8ed3428a183 |
|
08-Aug-2016 |
Roland Levillain <rpl@google.com> |
Enable IntermediateAddress for primitive arrays with read barriers. Test: ART host and target (ARM, ARM64) tests. Bug: 26601270 Bug: 12687968 Change-Id: I6736ba7b1809bece1bf3cd82c69e4f42a0d3c4a7
|
59751a7375196c530fbd048e72750aa94ab90431 |
|
05-Aug-2016 |
Vladimir Marko <vmarko@google.com> |
ARM: Embed constants in add/sub-long. Test: 538-checker-embed-constants Test: Run ART test suite on Nexus 5. Change-Id: Ib9639748c74d5c56dc354a6830987b613b922654
|
952dbb19cd094b8bfb01dbb33e0878db429e499a |
|
28-Jul-2016 |
Vladimir Marko <vmarko@google.com> |
Change suspend entrypoint to save all registers. We avoid the need to save/restore registers in slow paths and get significant code size savings. On Nexus 9, AOSP: - 32-bit boot.oat: -1.4MiB (-1.9%) - 64-bit boot.oat: -2.0MiB (-2.3%) - other 32-bit oat files in dalvik-cache: -200KiB (-1.7%) - other 64-bit oat files in dalvik-cache: -2.3MiB (-2.1%) Test: Run ART test suite on host and Nexus 9 with gc stress. Bug: 30212852 Change-Id: I7015afc1e7d30341618c9200a3dc9ae277afd134
|
37dd80d701fc5f55ed5a88ce2a495bf6eeb4a321 |
|
01-Aug-2016 |
Vladimir Marko <vmarko@google.com> |
ARM: Embed 0.0 in VCMP. Test: Run ART test suite on Nexus 5. Change-Id: I5cbbd98c4d64a4d9213e27adcae929ead5099a39
|
542451cc546779f5c67840e105c51205a1b0a8fd |
|
26-Jul-2016 |
Andreas Gampe <agampe@google.com> |
ART: Convert pointer size to enum Move away from size_t to dedicated enum (class). Bug: 30373134 Bug: 30419309 Test: m test-art-host Change-Id: Id453c330f1065012e7d4f9fc24ac477cc9bb9269
|
dec8f63fdf50815f24efe1c03af64208da15f339 |
|
22-Jul-2016 |
Roland Levillain <rpl@google.com> |
Do not emit stack maps for runtime calls to ReadBarrierMarkRegX. * Boot image code size variation on Nexus 5X (aosp_bullhead-userdebug build): - total ARM64 framework Oat files size change: 115584120 bytes -> 109124728 bytes (-5.59%) - total ARM framework Oat files size change: 97387728 bytes -> 92517584 (-5.00%) Test: ART host and target (ARM, ARM64) tests. Bug: 29506760 Bug: 12687968 Change-Id: I979d9fb2b4e09f4c0c7bf33af2cd91750a67f989
|
4359e61927866c254bc2d701e3ea4c48de10b79c |
|
20-Jul-2016 |
Roland Levillain <rpl@google.com> |
Move caller-saves saving/restoring to ReadBarrierMarkRegX. Instead of saving/restoring live caller-save registers before/after the call to read barrier mark entry points ReadBarrierMarkRegX, have these entry points save/restore all the caller-save registers themselves (except register rX, which contains the return value). Also refactor the assembly code of these entry points using macros. * Boot image code size variation on Nexus 5X (aosp_bullhead-userdebug build): - total ARM64 framework Oat files size change: 119196792 bytes -> 115575920 bytes (-3.04%) - total ARM framework Oat files size change: 100435212 bytes -> 97621188 bytes (-2.80%) * Benchmarks (ARM64) score variations on Nexus 5X (aosp_bullhead-userdebug build): - RitzPerf (lower is better) - average score difference: -2.71% - CaffeineMark (higher is better) - no real difference for most tests (absolute variation lower than 1%) - better score on the "Method" benchmark: score variation 41253 -> 44891 (+8.82%) Test: ART host and target (ARM, ARM64) tests. Bug: 29506760 Bug: 12687968 Change-Id: I881bf73139a3f1c2bee9ffc6fc8c00f9a392afa6
|
328429ff48d06e2cad4ebdd3568ab06de916a10a |
|
06-Jul-2016 |
Artem Serov <artem.serov@linaro.org> |
ARM: Port instr simplification of array accesses. After changing the addressing mode for array accesses (in https://android-review.googlesource.com/248406) the 'add' instruction that calculates the base address for the array can be shared across accesses to the same array. Before https://android-review.googlesource.com/248406: add IP, r[Array], r[Index0], LSL #2 ldr r0, [IP, #12] add IP, r[Array], r[Index1], LSL #2 ldr r0, [IP, #12] Before this CL: add IP, r[Array], #12 ldr r0, [IP, r[Index0], LSL #2] add IP, r[Array], #12 ldr r0, [IP, r[Index1], LSL #2] After this CL: add IP, r[Array], #12 ldr r0, [IP, r[Index0], LSL #2] ldr r0, [IP, r[Index1], LSL #2] Link to the original optimization: https://android-review.googlesource.com/#/c/127310/ Test: Run ART test suite on Nexus 6. Change-Id: Iee26f9a0a7ca46abb90e3f60d19d22dc8dee4d8f
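The address arithmetic behind the two forms can be modelled in C++ as below. This is a sketch, not compiler code: the function names are hypothetical, the data offset of 12 and the shift by 2 (4-byte elements) are taken from the instruction sequences in the message, and `std::memcpy` stands in for the `ldr`.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Offset of the first element, matching the #12 immediate in the message.
constexpr std::size_t kDataOffset = 12;

// Old form: add IP, array, index, LSL #2 ; ldr r0, [IP, #12].
// The intermediate address depends on the index, so it cannot be reused.
inline uint32_t LoadOldForm(const uint8_t* array, std::size_t index) {
  const uint8_t* ip = array + (index << 2);
  uint32_t value;
  std::memcpy(&value, ip + kDataOffset, sizeof(value));
  return value;
}

// New form, step 1: add IP, array, #12. Independent of the index, so one
// computation can serve every access to the same array.
inline const uint8_t* DataBase(const uint8_t* array) {
  return array + kDataOffset;
}

// New form, step 2: ldr r0, [IP, index, LSL #2].
inline uint32_t LoadFromBase(const uint8_t* base, std::size_t index) {
  uint32_t value;
  std::memcpy(&value, base + (index << 2), sizeof(value));
  return value;
}
```

Both forms read the same address, `array + 12 + 4 * index`; only the second splits it so that the index-independent part is computed once per array.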
|
6740997e6934bbca27d5830a32352d82aabbd38b |
|
20-Jul-2016 |
Andreas Gampe <agampe@google.com> |
ART: Change return types of field access entrypoints Ensure that return types guarantee full-width data as the compiled code and mterp expect by using size_t and ssize_t. This fixes Clang no longer sign-/zero-extending small return types. Bug: 30232671 Test: m ART_TEST_RUN_TEST_NDEBUG=true ART_TEST_INTERPRETER=true test-art-host-run-test Change-Id: Ic505befc6c94e2dccbc8abf2b13d4c2d662e68d1
|
6c91679ad9802764c8bc45508fedbb9d59fab377 |
|
11-Jul-2016 |
Artem Serov <artem.serov@linaro.org> |
ARM: Change mem address mode for array accesses. Switch from: add IP, r[Array], r[Index], LSL #2 ldr r0, [IP, #12] To: add IP, r[Array], #12 ldr r0, [IP, r[Index], LSL #2] This is a base for the future TryExtractArrayAccessAddress optimization port to arm. Test: aosp_shamu-userdebug boots and passes "m test-art-target". Change-Id: I6ab01ba3271a8f79599ddd91a6b63cd1b37d2d67
|
465ecc86ff65ca546629630c9469deb6d2d8137e |
|
19-Jul-2016 |
Matthew Gharrity <gharrma@google.com> |
Revert "Revert "Refactor GetIMTIndex"" Originally reverted in order to revert https://android-review.googlesource.com/#/c/244190/ but can now be merged again. This reverts commit d4ceecc85a5aab2ec23ea1bd010692ba8c8aaa0c. Test: m test-art-host Change-Id: Id9205f2b77a378fc0f06088e78c66e81a49f712d
|
b3cd84a2fbd4875c605cfa5a4a362864b570f1e6 |
|
13-Jul-2016 |
Nicolas Geoffray <ngeoffray@google.com> |
Fix a bug in ClassTableGet code generation for IMTs. Introduced by: https://android-review.googlesource.com/#/c/244980/ test:566-polymorphic-inling for fixing x86 crash. Also fixes a performance regression. bug:29188168 (cherry picked from commit ff484b95b25a5181a6a8a191cbd11da501c97651) Change-Id: Iae5a63cb24017222c3fefda695a0a39673719f51
|
ff484b95b25a5181a6a8a191cbd11da501c97651 |
|
13-Jul-2016 |
Nicolas Geoffray <ngeoffray@google.com> |
Fix a bug in ClassTableGet code generation for IMTs. Introduced by: https://android-review.googlesource.com/#/c/244980/ test:566-polymorphic-inling for fixing x86 crash. Also fixes a performance regression. bug:29188168 Change-Id: Id90cb819c88e7ba3db1cb3c50c517a112ab7d784
|
02b75806a80f8b75c3d6ba2ff97c995117630f36 |
|
13-Jul-2016 |
Roland Levillain <rpl@google.com> |
Introduce more compact ReadBarrierMark slow-paths. Replace entry point ReadBarrierMark with 32 ReadBarrierMarkRegX entry points, using register number X as input and output (instead of the standard runtime calling convention) to save two moves in Baker's read barrier mark slow-path code. Test: ART host and target (ARM, ARM64) tests. Bug: 29506760 Bug: 12687968 Change-Id: I73cfb82831cf040b8b018e984163c865cc44ed87
|
194bcfea4a29db2c529de333c6a00b32608dd4e5 |
|
11-Jul-2016 |
Vladimir Marko <vmarko@google.com> |
ARM: Shorter fast-path for read barrier field load. Reduces the aosp_hammerhead-userdebug boot.oat by 2.2MiB, i.e. ~2.2%, in the ART_USE_READ_BARRIER=true configuration. Test: Tested with ART_USE_READ_BARRIER=true on Nexus 5. Bug: 29966877 Bug: 12687968 Change-Id: I4454150003e12a1aa7f0cf451627dc1ee9a495ae
|
54ff482710910929900f8348a19c5b875e519237 |
|
07-Jul-2016 |
Serban Constantinescu <serban.constantinescu@linaro.org> |
Rename kCall to kCallOnMainOnly This patch renames kCall to kCallOnMainOnly in preparation for the next patch in this series which will be adding kCallOnMainAndSlowPath. Note: With this patch there will be places where we use kCallOnMainOnly even though we call on the slow path too. The next patch in this series will fix that. Test: ART host tests. Change-Id: Iabfdb0901990d163be5d780f3bdd2fab6fa17b32
|
df2d4f22d5e89692c90b443da82fe2930518418b |
|
30-Jun-2016 |
Artem Udovichenko <artem.u@samsung.com> |
Revert "Revert "Optimize IMT"" This reverts commit 88f288e3564d79d87c0cd8bb831ec5a791ba4861. Test: Includes smali tests to exercise cts failures that led to revert. These tests check that objects that don't implement any interfaces are handled properly when interface methods are invoked on them. Bug: 29188168 (for initial CL) Bug: 29778499 (reason for revert) Change-Id: I49605d53692cbec1e2622e23ff2893fc51ed4115
|
8c5d310b8ac3bdce41cdc680fcad791c321eaec2 |
|
07-Jul-2016 |
Vladimir Marko <vmarko@google.com> |
ARM: Remove unnecessary VMOV from float/double-to-int. Test: Run standard ART test suite on Nexus 5. Change-Id: I780fd0cca68f89401d2a114e1022bed498d02979
|
a62cb9bb6cb2278cb41ab0664191623e178c6a4f |
|
30-Jun-2016 |
Artem Udovichenko <artem.u@samsung.com> |
Revert "Revert "Optimize IMT"" This reverts commit 88f288e3564d79d87c0cd8bb831ec5a791ba4861. Change-Id: I49605d53692cbec1e2622e23ff2893fc51ed4115
|
88f288e3564d79d87c0cd8bb831ec5a791ba4861 |
|
29-Jun-2016 |
Nicolas Geoffray <ngeoffray@google.com> |
Revert "Optimize IMT" Bug: 29188168 (for initial CL) Bug: 29778499 (reason for revert) This reverts commit badee9820fcf5dca5f8c46c3215ae1779ee7736e. Change-Id: I32b8463122c3521e233c34ca95c96a5078e88848
|
d4ceecc85a5aab2ec23ea1bd010692ba8c8aaa0c |
|
29-Jun-2016 |
Nicolas Geoffray <ngeoffray@google.com> |
Revert "Refactor GetIMTIndex" I need to revert this to get https://android-review.googlesource.com/#/c/244190/ to cleanly revert. Matthew, do you mind rewriting it? This reverts commit 50706437d8216e41f0fea1e413cda7891324d397. Change-Id: I5c1435f5dffb46dbb5b613b22adb88c7770304f2
|
fd43db68d204caaa0e411ca79a37af15d1c001af |
|
29-Jun-2016 |
Jeff Hao <jeffhao@google.com> |
Revert "Optimize IMT" This reverts commit 0790af1391b316c5c12b4e135be357008c060696. Bug: 29188168 (for initial CL) Bug: 29778499 (reason for revert) Change-Id: I2c3e4ec2cebdd40faec67ddb721b7acdc8e90061
|
3d31242300c3525e5c85665d9e34acf87002db63 |
|
23-Jun-2016 |
Roland Levillain <rpl@google.com> |
Re-enable most intrinsics with read barriers. Also extend sun.misc.Unsafe test coverage to exercise sun.misc.Unsafe.{get,put}{Int,Long,Object}Volatile. Bug: 26205973 Bug: 29516905 Change-Id: I4d8da7cee5c8a310c8825c1631f71e5cb2b80b30 Test: Covered by ART's run-tests.
|
bfea33585e229973f7887afbf51fe45c2ba41e91 |
|
23-Jun-2016 |
Roland Levillain <rpl@google.com> |
Fix ARM & ARM64 UnsafeGetObject intrinsics with read barriers. The implementation was incorrectly interpreting the 'offset' input as an index in a (4-byte) object reference array, whereas it is a (1-byte) offset to an object reference field within the 'base' (object) input. Bug: 29516905 Change-Id: I4da5be0193217965f25e5d141c242592dea6ffe8 Test: Covered by test/004-UnsafeTest.
|
0790af1391b316c5c12b4e135be357008c060696 |
|
13-May-2016 |
Nelli Kim <nelli.kim@samsung.com> |
Optimize IMT * Remove IMT for classes which do not implement interfaces * Remove IMT for array classes * Share same IMT Saved memory (measured on hammerhead): boot.art: Total number of classes: 3854 Number of affected classes: 1637 Saved memory: 409kB Chrome (excluding classes in boot.art): Total number of classes: 2409 Number of affected classes: 1259 Saved memory: 314kB Google Maps (excluding classes in boot.art): Total number of classes: 6988 Number of affected classes: 2574 Saved memory: 643kB Performance regression on benchmarks/InvokeInterface.java benchmark (measured timeCall10Interface) 1st launch: 9.6% 2nd launch: 6.8% Bug: 29188168 (cherry picked from commit badee9820fcf5dca5f8c46c3215ae1779ee7736e) Change-Id: If8db765e3333cb78eb9ef0d66c2fc78a5f17f497
|
87f3fcbd0db352157fc59148e94647ef21b73bce |
|
28-Apr-2016 |
Vladimir Marko <vmarko@google.com> |
Replace String.charAt() with HIR. Replace String.charAt() with HArrayLength, HBoundsCheck and HArrayGet. This allows GVN on the HArrayLength and BCE on the HBoundsCheck as well as using the infrastructure for HArrayGet, i.e. better handling of constant indexes than the old intrinsic and using the HArm64IntermediateAddress. Bug: 28330359 Change-Id: I32bf1da7eeafe82537a60416abf6ac412baa80dc
|
dbb7f5bef10138ade0fb202da1d61f562b2df649 |
|
30-Mar-2016 |
Vladimir Marko <vmarko@google.com> |
Improve HLoadClass code generation. For classes in the boot image, use either direct pointers or PC-relative addresses. For other classes, use PC-relative access to the dex cache arrays for AOT and direct address of the type's dex cache slot for JIT. For aosp_flounder-userdebug: - 32-bit boot.oat: -252KiB (-0.3%) - 64-bit boot.oat: -412KiB (-0.4%) - 32-bit dalvik cache total: -392KiB (-0.4%) - 64-bit dalvik-cache total: -2312KiB (-1.0%) (contains more files than the 32-bit dalvik cache) For aosp_flounder-userdebug forced to compile PIC: - 32-bit boot.oat: -124KiB (-0.2%) - 64-bit boot.oat: -420KiB (-0.5%) - 32-bit dalvik cache total: -136KiB (-0.1%) - 64-bit dalvik-cache total: -1136KiB (-0.5%) (contains more files than the 32-bit dalvik cache) Bug: 27950288 Change-Id: I4da991a4b7e53c63c92558b97923d18092acf139
|
50706437d8216e41f0fea1e413cda7891324d397 |
|
14-Jun-2016 |
Matthew Gharrity <gharrma@google.com> |
Refactor GetIMTIndex This allows us to more easily maintain and experiment with interface method table indexing and hashing. Change-Id: I719920fae7490dcedcda7c1c36db225c2b8b16df
|
badee9820fcf5dca5f8c46c3215ae1779ee7736e |
|
13-May-2016 |
Nelli Kim <nelli.kim@samsung.com> |
Optimize IMT * Remove IMT for classes which do not implement interfaces * Remove IMT for array classes * Share same IMT Saved memory (measured on hammerhead): boot.art: Total number of classes: 3854 Number of affected classes: 1637 Saved memory: 409kB Chrome (excluding classes in boot.art): Total number of classes: 2409 Number of affected classes: 1259 Saved memory: 314kB Google Maps (excluding classes in boot.art): Total number of classes: 6988 Number of affected classes: 2574 Saved memory: 643kB Performance regression on benchmarks/InvokeInterface.java benchmark (measured timeCall10Interface) 1st launch: 9.6% 2nd launch: 6.8% Change-Id: If07e45390014a6ee8f3c1c4ca095b43046f0871f
|
372f10e5b0b34e2bb6e2b79aeba6c441e14afd1f |
|
17-May-2016 |
Vladimir Marko <vmarko@google.com> |
Refactor handling of input records. Introduce HInstruction::GetInputRecords(), a new virtual function that returns an ArrayRef<> to all input records. Implement all other functions dealing with input records as wrappers around GetInputRecords(). Rewrite functions that previously used multiple virtual calls to deal with input records, especially in loops, to prefetch the ArrayRef<> only once for each instruction. Besides avoiding all the extra calls, this also allows the compiler (clang++) to perform additional optimizations. This speeds up the Nexus 5 boot image compilation by ~0.5s (4% of "Compile Dex File", 2% of dex2oat time) on AOSP ToT. Change-Id: Id8ebe0fb9405e38d918972a11bd724146e4ca578
|
fba39972d99701c80bf3beb7451aca508d67593c |
|
11-May-2016 |
Chih-Hung Hsieh <chh@google.com> |
Fix misc-macro-parentheses warnings. * Add parentheses to fix warnings. * Use NOLINT to suppress wrong clang-tidy warnings. Bug: 28705665 Change-Id: Icc8bc9b59583dee0ea17ab83e0ff0383b8599c3e
|
dce016eab87302f02b0bd903dd2cd86ae512df2d |
|
28-Apr-2016 |
Vladimir Marko <vmarko@google.com> |
Intrinsify String.length() and String.isEmpty() as HIR. Use HArrayLength for String.length() in anticipation of changing the String.charAt() to HBoundsCheck+HArrayGet to allow the existing BCE to seamlessly work for strings. Use HArrayLength+HEqual for String.isEmpty(). We previously relied on inlining but we now want to apply the new intrinsics even when we do not inline, i.e. when compiling debuggable (as is currently the case for boot image) or when we hit inlining limits, i.e. depth, size, or the number of accumulated dex registers. Bug: 28330359 Change-Id: Iab9d2f6d2967bdd930a72eb461f27efe8f37c103
|
ffc87076dda9878cb2cc098149bae441d38b9268 |
|
20-Apr-2016 |
Calin Juravle <calin@google.com> |
Split profile recording from jit compilation We still use ProfileInfo objects to record profile information. That gives us the flexibility to add the inline caches in the future and the convenience of the already implemented GC. If UseJIT is false and SaveProfilingInfo true, we will only record the ProfileInfo and never launch compilation tasks. Bug: 27916886 (cherry picked from commit e5de54cfab5f14ba0b8ff25d8d60901c7021943f) Change-Id: I68afc181d71447895fb12346c1806e99bcab1de2
|
e5de54cfab5f14ba0b8ff25d8d60901c7021943f |
|
20-Apr-2016 |
Calin Juravle <calin@google.com> |
Split profile recording from jit compilation We still use ProfileInfo objects to record profile information. That gives us the flexibility to add the inline caches in the future and the convenience of the already implemented GC. If UseJIT is false and SaveProfilingInfo true, we will only record the ProfileInfo and never launch compilation tasks. Bug: 27916886 Change-Id: I6e4768dc5d58f2f85f947b276b4244aa11ce3fca
|
d1ee80948144526b985afb44a0574248cf7da58a |
|
13-Apr-2016 |
Vladimir Marko <vmarko@google.com> |
Move Assemblers to the Arena. And clean up some APIs to return std::unique_ptr<> instead of raw pointers that don't communicate ownership. (cherry picked from commit 93205e395f777c1dd81d3f164cf9a4aec4bde45f) Bug: 27505766 Change-Id: I3017302307a0253d661240750298802fb0d9585e
|
93205e395f777c1dd81d3f164cf9a4aec4bde45f |
|
13-Apr-2016 |
Vladimir Marko <vmarko@google.com> |
Move Assemblers to the Arena. And clean up some APIs to return std::unique_ptr<> instead of raw pointers that don't communicate ownership. Change-Id: I3017302307a0253d661240750298802fb0d9585e
|
dee58d6bb6d567fcd0c4f39d8d690c3acaf0e432 |
|
07-Apr-2016 |
David Brazdil <dbrazdil@google.com> |
Revert "Revert "Refactor HGraphBuilder and SsaBuilder to remove HLocals"" This patch merges the instruction-building phases from HGraphBuilder and SsaBuilder into a single HInstructionBuilder class. As a result, it is not necessary to generate HLocal, HLoadLocal and HStoreLocal instructions any more, as the builder produces SSA form directly. Saves 5-15% of arena-allocated memory (see bug for more data): GMS 20.46MB => 19.26MB (-5.86%) Maps 24.12MB => 21.47MB (-10.98%) YouTube 28.60MB => 26.01MB (-9.05%) This CL fixed an issue with parsing quickened instructions. Bug: 27894376 Bug: 27998571 Bug: 27995065 Change-Id: I20dbe1bf2d0fe296377478db98cb86cba695e694
|
40ecb12f8eeb97b810e11f895278abbf7988ed4d |
|
06-Apr-2016 |
Vladimir Marko <vmarko@google.com> |
Optimizing: Fix codegens for MethodLoadKind::kDexCacheViaMethod. Use the original method index instead of the target method index because the target method may point to a different dex file. No regression test: this currently happens only if the codegen uses the kDexCacheViaMethod as a fallback for another load kind and we aim to avoid that fallback, so it would be difficult to write a reliable regression test. We could try and exploit current fallbacks for irreducible loops on x86 and arm but those fallbacks will eventually disappear anyway. Bug: 28036230 Change-Id: I4cc9e046480d3d60a7fb521f0ca6a98914625cdc
|
60328910cad396589474f8513391ba733d19390b |
|
04-Apr-2016 |
David Brazdil <dbrazdil@google.com> |
Revert "Refactor HGraphBuilder and SsaBuilder to remove HLocals" Bug: 27995065 This reverts commit e3ff7b293be2a6791fe9d135d660c0cffe4bd73f. Change-Id: I5363c7ce18f47fd422c15eed5423a345a57249d8
|
e3ff7b293be2a6791fe9d135d660c0cffe4bd73f |
|
02-Mar-2016 |
David Brazdil <dbrazdil@google.com> |
Refactor HGraphBuilder and SsaBuilder to remove HLocals This patch merges the instruction-building phases from HGraphBuilder and SsaBuilder into a single HInstructionBuilder class. As a result, it is not necessary to generate HLocal, HLoadLocal and HStoreLocal instructions any more, as the builder produces SSA form directly. Saves 5-15% of arena-allocated memory (see bug for more data): GMS 20.46MB => 19.26MB (-5.86%) Maps 24.12MB => 21.47MB (-10.98%) YouTube 28.60MB => 26.01MB (-9.05%) Bug: 27894376 Change-Id: Iefe28d40600c169c5d306fd2c77034ae19476d90
|
cac5a7e871f1f346b317894359ad06fa7bd67fba |
|
22-Feb-2016 |
Vladimir Marko <vmarko@google.com> |
Optimizing: Improve const-string code generation. For strings in the boot image, use either direct pointers or pc-relative addresses. For other strings, use PC-relative access to the dex cache arrays for AOT and direct address of the string's dex cache slot for JIT. For aosp_flounder-userdebug: - 32-bit boot.oat: -692KiB (-0.9%) - 64-bit boot.oat: -948KiB (-1.1%) - 32-bit dalvik cache total: -900KiB (-0.9%) - 64-bit dalvik cache total: -3672KiB (-1.5%) (contains more files than the 32-bit dalvik cache) For aosp_flounder-userdebug forced to compile PIC: - 32-bit boot.oat: -380KiB (-0.5%) - 64-bit boot.oat: -928KiB (-1.0%) - 32-bit dalvik cache total: -468KiB (-0.4%) - 64-bit dalvik cache total: -1928KiB (-0.8%) (contains more files than the 32-bit dalvik cache) Bug: 26884697 Change-Id: Iec7266ce67e6fedc107be78fab2e742a8dab2696
|
5b5b9319ff970979ed47d41a41283e4faeffb602 |
|
22-Mar-2016 |
Roland Levillain <rpl@google.com> |
Fix and improve shift and rotate operations. - Define maximum int and long shift & rotate distances as int32_t constants, as shift & rotate distances are 32-bit integer values. - Consider the (long, long) inputs case as invalid for static evaluation of shift & rotate operations. - Add more checks in shift & rotate operations constructors as well as in art::GraphChecker. Change-Id: I754b326c3a341c9cc567d1720b327dad6fcbf9d6
|
1a65388f1d86bb232c2e44fecb44cebe13105d2e |
|
18-Mar-2016 |
Roland Levillain <rpl@google.com> |
Clean up art::HConstant predicates. - Make the difference between arithmetic zero and zero-bit pattern unambiguous. - Introduce Boolean predicates in art::HIntConstant for when they are used as Booleans. - Introduce arithmetic positive and negative zero predicates for floating-point constants. Bug: 27639313 Change-Id: Ia04ecc6f6aa7450136028c5362ed429760c883bd
|
22c4922c6b31e154a6814c4abe9015d9ba156911 |
|
18-Mar-2016 |
Roland Levillain <rpl@google.com> |
Ensure art::HRor supports boolean, byte, short and char inputs. Also extend tests covering the IntegerRotateLeft, LongRotateLeft, IntegerRotateRight and LongRotateRight intrinsics and their translation into an art::HRor instruction. Bug: 27682579 Change-Id: I89f6ea6a7315659a172482bf09875cfb7e7422a1
|
d28f4a00933a4a3b8d5e9db73b8532924d0f989d |
|
14-Mar-2016 |
David Srbecky <dsrbecky@google.com> |
Generate native debug stackmaps before calls as well. The debugger looks up PC of the call instruction, so the runtime's stackmap is not sufficient since it is at PC after the instruction. Change-Id: I0dd06c0b52e8079ea5d064ea10beb12c93584092
|
a5c4a4060edd03eda017abebc85f24cffb083ba7 |
|
15-Mar-2016 |
Roland Levillain <rpl@google.com> |
Make art::HCompare support boolean, byte, short and char inputs. Also extend tests covering the IntegerSignum, LongSignum, IntegerCompare and LongCompare intrinsics and their translation into an art::HCompare instruction. Bug: 27629913 Change-Id: I0afc75ee6e82602b01ec348bbb36a08e8abb8bb8
|
2ae48182573da7087bffc2873730bc758ec29696 |
|
16-Mar-2016 |
Calin Juravle <calin@google.com> |
Clean up NullCheck generation and record stats about it. This removes redundant code from the generators and allows for easier stat recording. Change-Id: Iccd4368f9e9d87a6fecb863dee4e2145c97851c4
|
e5671618d19489ad0781ec0d204c7765317170cf |
|
16-Mar-2016 |
Nicolas Geoffray <ngeoffray@google.com> |
Accept boolean as an input of HDivZeroCheck. All our arithmetic operations accept it. bug:27624718 Change-Id: I1f6bb95dc77ecb3fb2fcabb35a93b31c524bfa0a
|
7fc6350f6f1ab04b52b9cd7542e0790528296cbe |
|
09-Feb-2016 |
Artem Serov <artem.serov@linaro.org> |
Integrate BitwiseNegated into shared framework. Share implementation between arm and arm64. Change-Id: I0dd12e772cb23b4c181fd0b1e2a447470b1d8702
|
a1de9188a05afdecca8cd04ecc4fefbac8b9880f |
|
25-Feb-2016 |
Vladimir Marko <vmarko@google.com> |
Optimizing: Reduce memory usage of HInstructions. Pack narrow fields and flags into a single 32-bit field. Change-Id: Ib2f7abf987caee0339018d21f0d498f8db63542d
|
4a0dad67867f389e01a5a6c0fe381d210f687c0d |
|
25-Jan-2016 |
Artem Udovichenko <artem.u@samsung.com> |
Revert "Revert "ARM/ARM64: Extend support of instruction combining."" This reverts commit 6b5afdd144d2bb3bf994240797834b5666b2cf98. Change-Id: Ic27a10f02e21109503edd64e6d73d1bb0c6a8ac6
|
9cd6d378bd573cdc14d049d32bdd22a97fa4d84a |
|
09-Feb-2016 |
David Srbecky <dsrbecky@google.com> |
Associate slow paths with the instruction that they belong to. Almost all slow paths already know the instruction they belong to, this CL just moves the knowledge to the base class as well. This is needed to be able to get the corresponding dex pc for slow path, which allows us to generate better native line numbers, which in turn fixes some native debugging stepping issues. Change-Id: I568dbe78a7cea6a43a4a71a014b3ad135782c270
|
c7098ff991bb4e00a800d315d1c36f52a9cb0149 |
|
09-Feb-2016 |
David Srbecky <dsrbecky@google.com> |
Remove HNativeDebugInfo from start of basic blocks. We do not require full environment at the start of basic block. The dex pc contained in basic block is sufficient for line mapping. Change-Id: I5ba9e5f5acbc4a783ad544769f9a73bb33e2bafa
|
b52bbde2870e5ab5d126612961dcb3da8e5236ee |
|
12-Feb-2016 |
Vladimir Marko <vmarko@google.com> |
Optimizing: Simplify consecutive type conversions. Merge two consecutive type conversions to one if the result of such merged conversion is guaranteed to be the same and remove all implicit conversions, not just conversions to the same type. Improve codegens to handle conversions from long to integral types smaller than int. This will make it easier to simplify `(byte) (x & 0xffL)` to `(byte) x` where the conversion from long to byte is done by two dex instructions, long-to-int and int-to-byte. Bug: 23965701 Change-Id: I833f193556671136ad2cd3f5b31cdfbc2d99c19d
|
6e332529c33be4d7dae5dad3609a839f4c0d3bfc |
|
02-Feb-2016 |
David Brazdil <dbrazdil@google.com> |
ART: Remove HTemporary Change-Id: I21b984224370a9ce7a4a13a9652503cfb03c5f03
|
a19616e3363276e7f2c471eb2839fb16f1d43f27 |
|
02-Feb-2016 |
Aart Bik <ajcbik@google.com> |
Implemented compare/signum intrinsics as HCompare (with all code generation for all) Rationale: At HIR level, many more optimizations are possible, while ultimately generated code can take advantage of full semantics. Change-Id: I6e2ee0311784e5e336847346f7f3c4faef4fd17e
|
a42363f79832a6e14f348514664dc6dc3edf9da2 |
|
17-Dec-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Implement first kind of polymorphic inlining. Add HClassTableGet to fetch an ArtMethod from the vtable or imt, and compare it to the only method the profiling saw. Change-Id: I76afd3689178f10e3be048aa3ac9a97c6f63295d
|
74eb1b264691c4eb399d0858015a7fc13c476ac6 |
|
14-Dec-2015 |
David Brazdil <dbrazdil@google.com> |
ART: Implement HSelect This patch adds a new HIR instruction to Optimizing. HSelect returns one of two inputs based on the outcome of a condition. This is only an initial implementation which: - defines the new instruction, - repurposes BooleanSimplifier to emit it, - extends InstructionSimplifier to statically resolve it, - updates existing code and tests accordingly. Code generators currently emit fallback if/then/else code and will be updated in follow-up CLs to use platform-specific conditional moves when possible. Change-Id: Ib61b17146487ebe6b55350c2b589f0b971dcaaee
|
b3e773eea39a156b3eacf915ba84e3af1a5c14fa |
|
26-Jan-2016 |
David Brazdil <dbrazdil@google.com> |
ART: Implement support for instruction inlining Optimizing HIR contains 'non-materialized' instructions which are emitted at their use sites rather than their defining sites. This was not properly handled by the liveness analysis which did not adjust the use positions of the inputs of such instructions. Despite the analysis being incorrect, the current use cases never produce incorrect code. This patch generalizes the concept of inlined instructions and updates liveness analysis to compute the use positions correctly. Change-Id: Id703c154b20ab861241ae5c715a150385d3ff621
|
95e7ffc28ea4d6deba356e636b16120ae49b62e2 |
|
22-Jan-2016 |
Roland Levillain <rpl@google.com> |
Improve documentation and assertions of read barrier instrumentation. For ARM, x86, x86-64 back ends. The case of the ARM64 back end is already handled in https://android-review.googlesource.com/#/c/197870/. Bug: 12687968 Change-Id: I6df1128cc100cbdb89020876e1a54de719508be3
|
6b5afdd144d2bb3bf994240797834b5666b2cf98 |
|
22-Jan-2016 |
Nicolas Geoffray <ngeoffray@google.com> |
Revert "ARM/ARM64: Extend support of instruction combining." The test fails its checker parts. This reverts commit debeb98aaa8950caf1a19df490f2ac9bf563075b. Change-Id: I49929e15950c7814da6c411ecd2b640d12de80df
|
debeb98aaa8950caf1a19df490f2ac9bf563075b |
|
11-Dec-2015 |
Ilmir Usmanov <i.usmanov@samsung.com> |
ARM/ARM64: Extend support of instruction combining. Combine multiply instructions in the following way: ARM64: MUL/NEG -> MNEG ARM32 (32-bit integers only): MUL/ADD -> MLA MUL/SUB -> MLS Change-Id: If20f2d8fb060145ab6fbceeb5a8f1a3d02e0ecdb
|
e3f43ac79e50a4693ea4d46acf5cffca64910cee |
|
19-Jan-2016 |
Roland Levillain <rpl@google.com> |
Some read barrier clean-up in Optimizing. These changes make the read barrier compiler instrumentation code more uniform among the ARM, ARM64, x86 and x86-64 back ends. Bug: 12687968 Change-Id: I6b1c0cf2bc22ed6cd6b14754136bef4a2a036ea5
|
58282f4510961317b8d5a364a6f740a78926716f |
|
14-Jan-2016 |
David Brazdil <dbrazdil@google.com> |
ART: Remove Baseline compiler We don't need Baseline any more and it hasn't been maintained for a while anyway. Let's remove it. Change-Id: I442ed26855527be2df3c79935403a25b1ee55df6
|
d6e069b16a7d4964e546daf3d340ea11756ab090 |
|
18-Jan-2016 |
Vladimir Marko <vmarko@google.com> |
Optimizing: Improve floating point comparisons on arm and arm64. Avoid the extra check for unordered inputs by using the appropriate arm/arm64 condition. Change-Id: Ib5e775a90428db7a2cf377ad9fd6a3192d670617
|
6de1938e562b0d06e462512dd806166e754035ea |
|
08-Jan-2016 |
David Brazdil <dbrazdil@google.com> |
ART: Remove incorrect HFakeString optimization Simplification of HFakeString assumes that it cannot be used until String.<init> is called which is not true and causes different behaviour between the compiler and the interpreter. This patch removes the optimization together with the HFakeString instruction. Instead, HNewInstance is generated and an empty String allocated until it is replaced with the result of the StringFactory call. This is consistent with the behaviour of the interpreter but is too conservative. A follow-up CL will attempt to optimize out the initial allocation when possible. Bug: 26457745 Bug: 26486014 Change-Id: I7139e37ed00a880715bfc234896a930fde670c44
|
15bd22849ee6a1ffb3fb3630f686c2870bdf1bbc |
|
05-Jan-2016 |
Nicolas Geoffray <ngeoffray@google.com> |
Implement irreducible loop support in optimizing. So we don't fall back to the interpreter in the presence of irreducible loops. Implications: - A loop pre-header does not necessarily dominate a loop header. - Non-constant redundant phis will be kept in loop headers, to satisfy our linear scan register allocation algorithm. - while-graph optimizations, such as gvn, licm, lse, and dce need to know when they are dealing with irreducible loops. Change-Id: I2cea8934ce0b40162d215353497c7f77d6c9137e
|
42249c3602c3d0243396ee3627ffb5906aa77c1e |
|
08-Jan-2016 |
Aart Bik <ajcbik@google.com> |
Reduce code size by sharing slow paths. Rationale: Sharing identical slow path code reduces code size. Background: Currently, slow paths with the same dex-pc, same physical register spilling code, and identical stack maps are shared (making this only useful for deopt slow paths). The newly introduced mechanism is sufficiently general to allow future improvements by e.g. allowing different dex-pc (by passing this to runtime) or even the kind of slow paths (by passing runtime addresses to the slowpath). Change-Id: I819615c47b4fd98440a241f681f93e4fc22d12e0
|
b7070a2db8b0b7eca14f01f932be305be64ded57 |
|
08-Jan-2016 |
David Srbecky <dsrbecky@google.com> |
Generate Nops to ensure that debug stack maps have distinct PC. Change-Id: I5740ec958a20d236634b66df0e675382ed5c16fc
|
68f6289fbc1b14ed814722c023b3f343c1e59a79 |
|
04-Jan-2016 |
Nicolas Geoffray <ngeoffray@google.com> |
Don't use std::abs on INT_MIN/LONG_MIN, it's undefined. bug:25494265 Change-Id: I560a3a589b92440020285f9adfdf7c9efb06217c
|
80e6709722d6c27aed399c50a11a98e0ab13a97e |
|
08-Jan-2016 |
Roland Levillain <rpl@google.com> |
Small implicit null checks refactoring in the ARM codegen. Change-Id: I7dccb02cf7ac2f7d8fd1676b03e0b394701fbe3f
|
1407ee7df9d063e4d94c488c7beb46cb2da0677e |
|
08-Jan-2016 |
Roland Levillain <rpl@google.com> |
Add a missing implicit null check in the ARM codegen. The code generated for object ArraySet on ARM used to miss an implicit null check for the array when the assigned value is `null`. This has not been an actual issue so far, as ArraySet instructions have never been using implicit null checks. Note: This CL comes without a regression test, as the code path in question is not used (yet). Change-Id: If3bc85e32802595e635513dfb83ccfcfd8f00d3d
|
c928591f5b2c544751bb3fb26dc614d3c2e67bef |
|
18-Dec-2015 |
Roland Levillain <rpl@google.com> |
ARM Baker's read barrier fast path implementation. Introduce an ARM fast path implementation in Optimizing for Baker's read barriers (for both heap reference loads and GC root loads). The marking phase of the read barrier is performed by a slow path, invoking the runtime entry point artReadBarrierMark. Other read barrier algorithms continue to use the original slow path based implementation, which has been renamed as GenerateReadBarrierSlow/GenerateReadBarrierForRootSlow. Bug: 12687968 Change-Id: Ie7ee85b1b4c0564148270cebdd3cbd4c3da51b3a
|
0cf4493166ff28518c8eafa2d0463f6e817cce75 |
|
09-Dec-2015 |
David Srbecky <dsrbecky@google.com> |
Generate more stack maps during native debugging. Generate extra stack map at the start of each java statement. The stack maps are later translated to DWARF which allows LLDB to set breakpoints and view local variables. Change-Id: If00ab875513308e4a1399d1e12e0fe8934a6f0c3
|
5f7b58ea1adfc0639dd605b65f59198d3763f801 |
|
23-Nov-2015 |
Vladimir Marko <vmarko@google.com> |
Rewrite HInstruction::Is/As<type>(). Make Is<type>() and As<type>() non-virtual for concrete instruction types, relying on GetKind(), and mark GetKind() as PURE to improve optimization opportunities. This reduces the number of relocations in libart-compiler.so's .rel.dyn section by ~4K, or ~44%, and in .data.rel.ro by ~18K, or ~65%. The file is 96KiB smaller for Nexus 5, including 8KiB reduction of the .text section. Unfortunately, the g++/clang++ __attribute__((pure)) is not strong enough to avoid duplicated virtual calls and we would need the C++ [[pure]] attribute proposed in n3744 instead. To work around this deficiency, we introduce an extra non-virtual indirection for GetKind(), so that the compiler can optimize common expressions such as instruction->IsAdd() || instruction->IsSub() or instruction->IsAdd() && instruction->AsAdd()->... which contain two virtual calls to GetKind() after inlining. Change-Id: I83787de0671a5cb9f5b0a5f4a536cef239d5b401
|
ac6ac10a0801fa6eb95e0ab0c72b2ed562210b34 |
|
17-Dec-2015 |
Vladimir Marko <vmarko@google.com> |
Optimizing/ARM: Fix CmpConstant(). CMN updates flags based on addition of its operands. Do not confuse the "N" suffix with bitwise inversion performed by MVN. Also add more special cases analogous to AddConstant() and use CmpConstant() more in code generator. Change-Id: I0d4571770a3f0fdf162e97d4bde56814098e7246
|
f3e0ee27f46aa6434b900ab33f12cd3157578234 |
|
17-Dec-2015 |
Vladimir Marko <vmarko@google.com> |
Revert "Revert "ART: Reduce the instructions generated by packed switch."" This reverts commit b4c137630fd2226ad07dfd178ab15725374220f1. The underlying issue was fixed by https://android-review.googlesource.com/188271 . Bug: 26121945 Change-Id: I58b08eb1a9f0a5c861f8cda93522af64bcf63920
|
b4c137630fd2226ad07dfd178ab15725374220f1 |
|
16-Dec-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Revert "ART: Reduce the instructions generated by packed switch." This reverts commit 59f054d98f519a3efa992b1c688eb97bdd8bbf55. bug:26121945 Change-Id: I8a5ad7ef1f1de8d44787c27528fa3f7f5c2e9cd3
|
351dddf4025f07477161209e374741f089d97cb4 |
|
11-Dec-2015 |
Vladimir Marko <vmarko@google.com> |
Optimizing: Clean up after HRor. Change-Id: I96bd7fa2e8bdccb87a3380d063dad0dd57fed9d7
|
40a04bf64e5837fa48aceaffe970c9984c94084a |
|
11-Dec-2015 |
Scott Wakeling <scott.wakeling@linaro.org> |
Replace rotate patterns and invokes with HRor IR. Replace constant and register version bitfield rotate patterns, and rotateRight/Left intrinsic invokes, with new HRor IR. Where k is constant and r is a register, with the UShr and Shl on either side of a |, +, or ^, the following patterns are replaced: x >>> #k OP x << #(reg_size - k); x >>> #k OP x << #-k; x >>> r OP x << (#reg_size - r); x >>> (#reg_size - r) OP x << r; x >>> r OP x << -r; x >>> -r OP x << r. Implemented for ARM/ARM64 & X86/X86_64. Tests changed to not be inlined to prevent optimization from folding them out. Additional tests added for constant rotate amounts. Change-Id: I5847d104c0a0348e5792be6c5072ce5090ca2c34
|
917d01680714b2295f109f8fea0aa06764a30b70 |
|
24-Nov-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Don't generate a slow path for strings in the dex cache. Change-Id: I1d258f1a89bf0ec7c7ddd134be9215d480f0b09a
|
59f054d98f519a3efa992b1c688eb97bdd8bbf55 |
|
07-Dec-2015 |
Zheng Xu <zheng.xu@linaro.org> |
ART: Reduce the instructions generated by packed switch. Implement Vladimir Marko's suggestion. The new compare/jump series reduces the number of instructions from (2*n+1) to (1.5*n+3). Generate normal compare/jump series when numEntries <= 3. Generate optimal compare/jump series when numEntries <= threshold. Generate jump tables otherwise. Change-Id: I425547b6787057c7fa84e71f17c145b63b208633
|
e523423a053af5cb55837f07ceae9ff2fd581712 |
|
02-Dec-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Revert "Revert "Don't use the compiler driver for method resolution."" This reverts commit c88ef3a10c474045a3476a02ae75d07ddd3230b7. Change-Id: I0ed88a48b313a8d28bc39fae40631123aadb13ef
|
c88ef3a10c474045a3476a02ae75d07ddd3230b7 |
|
01-Dec-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Revert "Don't use the compiler driver for method resolution." Fails 425 in debuggable mode. This reverts commit 4db0bf9c4db6a09716c3388b7d2f88d534470339. Change-Id: I346df8f75674564fc4fb241c60f23e250fc7f0a7
|
4db0bf9c4db6a09716c3388b7d2f88d534470339 |
|
23-Nov-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Don't use the compiler driver for method resolution. The compiler driver makes assumptions that don't hold for the optimizing compiler, and will for example always go to slow path for an invoke-super when there's no verified method. Also fix GenerateInvokeVirtual in the presence of intrinsics. Next change will address some of the TODOs in sharpening.cc. Change-Id: I2b0e543ee9b9bebcadb2d26de29e850c59ad58b9
|
b4536b7de576b20c74c612406c5d3132998075ef |
|
24-Nov-2015 |
Vladimir Marko <vmarko@google.com> |
Optimizing/ARM: Implement kDexCachePcRelative dispatch. Change-Id: I0fe2da50a30a3f62bec8ea01688dd1fec84b1831
|
42e372e5a34d0fef88007bc5f40dd0fc7c03b58b |
|
24-Nov-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Optimize HLoadClass when we know the class is in the cache. Change-Id: Iaa74591eed0f2eabc9ba9f9988681d9582faa320
|
888d067a67640e7d9fc349b0451dfe845acad562 |
|
23-Nov-2015 |
Roland Levillain <rpl@google.com> |
Revamp art::CheckEntrypointTypes uses. Change-Id: I6e13e594539e766ed94524ac3282cec292ba91da
|
4f6b0b551ee549af12fce75c8379f5137fe4cfad |
|
23-Nov-2015 |
Roland Levillain <rpl@google.com> |
Clean up read barrier related comments in Optimizing. Bug: 12687968 Change-Id: Idf2e371e01e10d9d32c95b150735e2c96244232e
|
729645a937eb9f04a311b3c22471dcf3ebe9bcec |
|
19-Nov-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Explicitly add HLoadClass/HClinitCheck for HNewInstance. bug:25735083 bug:25173758 Change-Id: Ie81cfa4fa9c47cc025edb291cdedd7af209a03db
|
f9d741e32c6f1629ce70eefc68d3363fa1cfd696 |
|
20-Nov-2015 |
Vladimir Marko <vmarko@google.com> |
Optimizing/ARM: Improve long shifts by 1. Implement long Shl(x,1) as LSLS+ADC, Shr(x,1) as ASR+RRX and UShr(x,1) as LSR+RRX. Remove the simplification substituting Shl(x,1) with ADD(x,x) as it interferes with some other optimizations instead of helping them. And since it didn't help 64-bit architectures anyway, codegen is the correct place for it. This is now implemented for ARM and x86, so only mips32 can be improved. Change-Id: Idd14f23292198b2260189e1497ca5411b21743b3
|
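The two-instruction sequences named in the long-shift change above can be modelled on a pair of 32-bit halves. This is only a sketch of the arithmetic, assuming the first instruction of each pair is the flag-setting form so that the second one can consume the carry; the helper names are hypothetical and nothing here is taken from the ART sources.

```cpp
#include <cstdint>

// Split and join helpers for a 64-bit value held in two 32-bit registers.
constexpr uint32_t Lo(uint64_t x) { return static_cast<uint32_t>(x); }
constexpr uint32_t Hi(uint64_t x) { return static_cast<uint32_t>(x >> 32); }
constexpr uint64_t Join(uint32_t lo, uint32_t hi) {
  return (static_cast<uint64_t>(hi) << 32) | lo;
}

// Shl(x, 1) as LSLS + ADC: shifting lo left leaves its old bit 31 in the carry,
// then ADC hi, hi, hi computes hi + hi + carry.
constexpr uint64_t Shl1(uint64_t x) {
  return Join(Lo(x) << 1, Hi(x) + Hi(x) + (Lo(x) >> 31));
}

// Shr(x, 1) as ASR + RRX: the arithmetic shift of hi keeps the sign bit and leaves
// the old bit 0 of hi in the carry; RRX shifts lo right and inserts the carry at bit 31.
constexpr uint64_t Shr1(uint64_t x) {
  return Join((Lo(x) >> 1) | ((Hi(x) & 1u) << 31),
              (Hi(x) >> 1) | (Hi(x) & 0x80000000u));
}

// UShr(x, 1) as LSR + RRX: same as above, but hi is shifted logically.
constexpr uint64_t UShr1(uint64_t x) {
  return Join((Lo(x) >> 1) | ((Hi(x) & 1u) << 31), Hi(x) >> 1);
}
```

Each helper mirrors exactly two machine instructions, which is the point of the change: a generic 64-bit shift on a 32-bit core needs more work than this special case.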
c53c0797a78a89d637e4230503cc1feb27e855a8 |
|
19-Nov-2015 |
Vladimir Marko <vmarko@google.com> |
Clean up the special input in HInvokeStaticOrDirect. Change-Id: I4042aefbdac1a8c236d00e2e7145349a64f6486b
|
3b359c71f2fb784589be113206932e76807787bb |
|
17-Nov-2015 |
Roland Levillain <rpl@google.com> |
ARM read barrier support for concurrent GC in Optimizing. This first implementation uses slow paths to instrument heap reference loads and GC root loads for the concurrent copying collector, respectively calling the artReadBarrierSlow and artReadBarrierForRootSlow runtime entry points. Notes: - This implementation does not instrument HInvokeVirtual nor HInvokeInterface instructions (for class reference loads), as the corresponding read barriers are not strictly required with the current concurrent copying collector. - Intrinsics which may eventually call (on slow path) are disabled when read barriers are enabled, as the current slow path infrastructure does not support this case. - When read barriers are enabled, the code generated for a HArraySet instruction always goes into the array set slow path for object arrays (delegating the operation to the runtime), as we are lacking a mechanism to keep a temporary register live across a runtime call (needed for the instrumentation of type checking code, which requires two successive read barriers). Bug: 12687968 Change-Id: I92e8db414d029f952c07f3d3a98069e46dfdbc2a
|
0debae7bc89eb05f7a2bf7dccd223318fad7c88d |
|
12-Nov-2015 |
David Brazdil <dbrazdil@google.com> |
ART: Refactor GenerateTestAndBranch Each code generator implements a method for generating condition evaluation and branching to arbitrary labels. This patch refactors it for better clarity but also to generate fewer jumps when the true branch is the fallthrough successor. This is preliminary work for implementing HSelect. Change-Id: Iaa545a5ecbacb761c5aa241fa69140cf6eb5952f
|
13c86fdd2238ef158594182b31040533e1c92965 |
|
11-Nov-2015 |
Vladimir Marko <vmarko@google.com> |
Optimizing: Clean up constant location handling. Locations builder should use ConstantLocation() when the code generator relies on a location to be constant. Code generator should interrogate locations, not inputs, about being const. Change-Id: Ic35bb84aa9f83e0977b151a0430aca6c88f19cf0
|
33ad10e72438f01d11ec57695fe68194007535d2 |
|
10-Nov-2015 |
Vladimir Marko <vmarko@google.com> |
Optimizing/ARM: Improve shifts of long values by a constant. Change-Id: Id66ef8cdb9e64306f2be547370b90cc100a3e086
|
b8b97695d178337736b61609220613b92f344d45 |
|
22-May-2015 |
Mark Mendell <mark.p.mendell@intel.com> |
Fix conditional jump over jmp (X86/X86-64/ARM32) Optimize the code generation for 'if' statements to jump to the 'false' block if the next block to be generated is the 'true' block. Add an X86-64 test for this case. Note that ARM64 & MIPS64 have not been updated. Change-Id: Iebb1352feb9d3bd0142d8b0621a2e3069a708ea7 Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>
|
7cffc3b0004d32faffc552c0a59286f369b21504 |
|
20-Oct-2015 |
Andreas Gampe <agampe@google.com> |
ART: Arm32 packed-switch jump tables Add jump table support to the thumb2 assembler. Jump tables are a collection of labels for the case targets, and an anchor label denoting the position of the jump. Use the jump table support to implement packed-switch support for arm32. Add tests for BindTrackedLabel and JumpTable to the thumb2 assembler test. Bug: 24092914 Change-Id: I5c84f193dfebf9e07f48678efc8bd151bb1410dd
|
dc151b2346bb8a4fdeed0c06e54c2fca21d59b5d |
|
15-Oct-2015 |
Vladimir Marko <vmarko@google.com> |
Optimizing: Determine invoke-static/-direct dispatch early. Determine the dispatch type of invoke-static/-direct in a special pass right after the type inference. This allows the inliner to pass the "needs dex cache" check and inline more. It also allows the code generator to avoid requesting a register location for the ArtMethod* for kDexCachePcRelative and direct methods. The supported dispatch check handles also situations that the CompilerDriver currently doesn't allow. The cleanup of the CompilerDriver and required changes to Quick will come in a separate change. Change-Id: I3f8e903a119949e95871d8ab0a995f4731a13a07
|
bb245d199a5240b4c520263fd2c8c10dba79eadc |
|
19-Oct-2015 |
Aart Bik <ajcbik@google.com> |
Generalize codegen and simplification of deopt. Rationale: the de-opt instruction is very similar to an if, so the existing assumption that it always has a conditional "under the hood" is very unsafe, since optimizations may have replaced conditionals with actual values; this CL generalizes handling of deopt. Change-Id: I1c6cb71fdad2af869fa4714b38417dceed676459
|
4b8f1ecd3aa5a29ec1463ff88fee9db365f257dc |
|
26-Aug-2015 |
Roland Levillain <rpl@google.com> |
Use ATTRIBUTE_UNUSED more. Use it in lieu of UNUSED(), which had some incorrect uses. Change-Id: If247dce58b72056f6eea84968e7196f0b5bef4da
|
e9f37600e98ba21308ad4f70d9d68cf6c057bdbe |
|
09-Oct-2015 |
Aart Bik <ajcbik@google.com> |
Added support for unsigned comparisons Rationale: even though not directly supported in input graph, having the ability to express unsigned comparisons in HIR is useful for all sorts of optimizations. Change-Id: I4543c96a8c1895c3d33aaf85685afbf80fe27d72
|
d2b4ca2d02c86b1ce1826fd2b35ce6c9c58c1ff1 |
|
14-Sep-2015 |
Vladimir Marko <vmarko@google.com> |
Improve Thumb2 bitwise operations. Allow embedding constants in AND, ORR, EOR. Add ORN to assembler, use BIC and ORN for AND and ORR when needed. Change-Id: I24d69ecc7ce6992b9c5eb7a313ff47a942de9661
|
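The use of BIC and ORN mentioned in the Thumb2 bitwise change above rests on two identities: an AND with a constant equals a BIC with the complemented constant, and an ORR with a constant equals an ORN with the complemented constant. A minimal sketch follows, with hypothetical helper names that model the instructions' results only; whether a given constant is encodable as an immediate is not modelled here.

```cpp
#include <cstdint>

// BIC rd, rn, #imm computes rn AND NOT imm.
constexpr uint32_t Bic(uint32_t rn, uint32_t imm) { return rn & ~imm; }

// ORN rd, rn, #imm computes rn OR NOT imm.
constexpr uint32_t Orn(uint32_t rn, uint32_t imm) { return rn | ~imm; }

// AND with a constant, expressed through BIC with the complemented constant.
constexpr uint32_t AndViaBic(uint32_t rn, uint32_t c) { return Bic(rn, ~c); }

// ORR with a constant, expressed through ORN with the complemented constant.
constexpr uint32_t OrrViaOrn(uint32_t rn, uint32_t c) { return Orn(rn, ~c); }
```

The rewrite presumably pays off when the complemented constant fits the immediate encoding and the original constant does not, so no extra register load is needed.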
5bd05a5c9492189ec28edaf6396d6a39ddf03367 |
|
13-Oct-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Implement System.arraycopy intrinsic for arm. Change-Id: I58ae1af5103e281fe59fbe022b718d6d8f293a5e
|
ec7802a102d49ab5c17495118d4fe0bcc7287beb |
|
01-Oct-2015 |
Vladimir Marko <vmarko@google.com> |
Add DCHECKs to ArenaVector and ScopedArenaVector. Implement dchecked_vector<> template that DCHECK()s element access and insert()/emplace()/erase() positions. Change the ArenaVector<> and ScopedArenaVector<> aliases to use the new template instead of std::vector<>. Remove DCHECK()s that have now become unnecessary from the Optimizing compiler. Change-Id: Ib8506bd30d223f68f52bd4476c76d9991acacadc
|
580b609cd6cfef46108156457df42254d11e72a7 |
|
06-Oct-2015 |
Calin Juravle <calin@google.com> |
Fix location summary for LoadClass Don't request a register for the current method if we're gonna call the runtime. Change-Id: I9760d15108bd95efb2a34e6eacd84b60841781d7
|
98893e146b0ff0e1fd1d7c29252f1d1e75a163f2 |
|
02-Oct-2015 |
Calin Juravle <calin@google.com> |
Add support for unresolved classes in optimizing. Change-Id: I0e299a81e560eb9cb0737ec46125dffc99333b54
|
ecf680d5e1fe6fcdd57962334a7c7865720503cc |
|
05-Oct-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Block callee save fp registers in debuggable. This is a simple but conservative implementation. We could extend it by using the registers but still saving them before a call and at method entry. bug: 21057237 Change-Id: Ia2e9e0e2efae0b01625e0f4165d0535c4bf9ba62
|
75d5b9bbd48edbe221d00dc85d25093977c6fa41 |
|
05-Oct-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Revert "Don't use floating point callee saves in debuggable." bug:24602865 bug:24605078 This reverts commit 88a95ba893fcda974d492917dd77a9b11693dbf2. Change-Id: Iba97eeab5c2ba725f66cc138f740dac337344828
|
e460d1df1f789c7c8bb97024a8efbd713ac175e9 |
|
29-Sep-2015 |
Calin Juravle <calin@google.com> |
Revert "Revert "Support unresolved fields in optimizing" The CL also changes the calling convention for 64bit static field set to use kArg2 instead of kArg1. This allows optimizing to keep the assumptions: - arm pairs are always of form (even_reg, odd_reg) - ecx_edx is not used as a register on x86. This reverts commit e6f49b47b6a4dc9c7684e4483757872cfc7ff1a1. Change-Id: I93159917565824084abc96775f31be1a4249f2f3
|
88a95ba893fcda974d492917dd77a9b11693dbf2 |
|
30-Sep-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Don't use floating point callee saves in debuggable. The runtime stubs don't save them, so GetVReg and SetVReg won't work on them. Not having callee saves will increase code size and reduce performance of fp-heavy methods. But we need to do it for proper debugging. Change-Id: I40354c29718af49b6b3adf61d724d3bb93680107
|
e0395dd58454e27fc47c0ca273913929fb658e6c |
|
25-Sep-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Optimize ArraySet for x86/x64/arm/arm64. Change-Id: I5bc8c6adf7f82f3b211f0c21067f5bb54dd0c040
|
5233f93ee336b3581ccdb993ff6342c52fec34b0 |
|
29-Sep-2015 |
Vladimir Marko <vmarko@google.com> |
Optimizing: Tag even more arena allocations. Tag previously "Misc" arena allocations with more specific allocation types. Move some native heap allocations to the arena in BCE. Bug: 23736311 Change-Id: If8ef15a8b614dc3314bdfb35caa23862c9d4d25c
|
225b6464a58ebe11c156144653f11a1c6607f4eb |
|
28-Sep-2015 |
Vladimir Marko <vmarko@google.com> |
Optimizing: Tag arena allocations in code generators. And completely remove the deprecated GrowableArray. Replace GrowableArray with ArenaVector in code generators and related classes and tag arena allocations. Label arrays use direct allocations from ArenaAllocator because Label is non-copyable and non-movable and as such cannot be really held in a container. The GrowableArray never actually constructed them, instead relying on the zero-initialized storage from the arena allocator to be correct. We now actually construct the labels. Also avoid StackMapStream::ComputeDexRegisterMapSize() being passed null references, even though unused. Change-Id: I26a46fdd406b23a3969300a67739d55528df8bf4
|
abfcf18fa2fe723bd683edcb685ed5058d9c7cf3 |
|
21-Sep-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Further refinements to checkcast/instanceof. - Use setcc when possible. - Do an exact check in the Object[] case before checking the component type. Change-Id: Ic11c60643af9b41fe4ef2beb59dfe7769bef388f
|
fe57faa2e0349418dda38e77ef1c0ac29db75f4d |
|
18-Sep-2015 |
Mark Mendell <mark.p.mendell@intel.com> |
[optimizing] Add basic PackedSwitch support Add HPackedSwitch, and generate it from the builder. Code generators convert this to a series of compare/branch tests. Better implementation in the code generators as a real jump table will follow as separate CLs. Change-Id: If14736fa4d62809b6ae95280148c55682e856911 Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>
|
85c7bab43d11180d552179c506c2ffdf34dd749c |
|
18-Sep-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Revert "Revert "Optimize code generation of check-cast and instance-of."" This reverts commit 7537437c6a2f89249a48e30effcc27d4e7c5a04f. Change-Id: If759cb08646e47b62829bebc3c5b1e2f2969cf84
|
85b62f23fc6dfffe2ddd3ddfa74611666c9ff41d |
|
09-Sep-2015 |
Andreas Gampe <agampe@google.com> |
ART: Refactor intrinsics slow-paths Refactor slow paths so that there is a default implementation for common cases (only arm64 with vixl is special). Write a generic intrinsic slow-path that can be reused for the specific architectures. Move helper functions into CodeGenerator so that they are accessible. Change-Id: Ibd788dce432601c6a9f7e6f13eab31f28dcb8550
|
7537437c6a2f89249a48e30effcc27d4e7c5a04f |
|
17-Sep-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Revert "Optimize code generation of check-cast and instance-of." Failures with libcore tests. This reverts commit 64acf303eaa2f32c0b1d8cfcbf044a822c5eec08. Change-Id: Ie6f323fcf5d86bae5c334c1352bb21f1bad60a88
|
e6f49b47b6a4dc9c7684e4483757872cfc7ff1a1 |
|
17-Sep-2015 |
Calin Juravle <calin@google.com> |
Revert "Support unresolved fields in optimizing" breaks debuggable tests. This reverts commit 23a8e35481face09183a24b9d11e505597c75ebb. Change-Id: I8e60b5c8f48525975f25d19e5e8066c1c94bd2e5
|
64acf303eaa2f32c0b1d8cfcbf044a822c5eec08 |
|
14-Sep-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Optimize code generation of check-cast and instance-of. On x86/x64/arm/arm64. Improve code size of selected apks from 0.3% to 1%, and performance of DeltaBlue by 20%. Change-Id: Ib5799f7a53443cd880a121dd7f21932ae9f5c7aa
|
23a8e35481face09183a24b9d11e505597c75ebb |
|
08-Sep-2015 |
Calin Juravle <calin@google.com> |
Support unresolved fields in optimizing Change-Id: I9941fa5fcb6ef0a7a253c7a0b479a44a0210aad4
|
175dc732c80e6f2afd83209348124df349290ba8 |
|
25-Aug-2015 |
Calin Juravle <calin@google.com> |
Support unresolved methods in Optimizing Change-Id: If2da02b50d2fa668cd58f134a005f1752e7746b1
|
fa6b93c4b69e6d7ddfa2a4ed0aff01b0608c5a3a |
|
15-Sep-2015 |
Vladimir Marko <vmarko@google.com> |
Optimizing: Tag arena allocations in HGraph. Replace GrowableArray with ArenaVector in HGraph and related classes HEnvironment, HLoopInformation, HInvoke and HPhi, and tag allocations with new arena allocation types. Change-Id: I3d79897af405b9a1a5b98bfc372e70fe0b3bc40d
|
77a48ae01bbc5b05ca009cf09e2fcb53e4c8ff23 |
|
15-Sep-2015 |
David Brazdil <dbrazdil@google.com> |
Revert "Revert "ART: Register allocation and runtime support for try/catch"" The original CL triggered b/24084144 which has been fixed by Ib72e12a018437c404e82f7ad414554c66a4c6f8c. This reverts commit 659562aaf133c41b8d90ec9216c07646f0f14362. Change-Id: Id8980436172457d0fcb276349c4405f7c4110a55
|
659562aaf133c41b8d90ec9216c07646f0f14362 |
|
14-Sep-2015 |
David Brazdil <dbrazdil@google.com> |
Revert "ART: Register allocation and runtime support for try/catch" Breaks libcore test org.apache.harmony.security.tests.java.security.KeyStorePrivateKeyEntryTest#testGetCertificateChain. Need to investigate. This reverts commit b022fa1300e6d78639b3b910af0cf85c43df44bb. Change-Id: Ib24d3a80064d963d273e557a93469c95f37b1f6f
|
b022fa1300e6d78639b3b910af0cf85c43df44bb |
|
20-Aug-2015 |
David Brazdil <dbrazdil@google.com> |
ART: Register allocation and runtime support for try/catch This patch completes a series of CLs that add support for try/catch in the Optimizing compiler. With it, Optimizing can compile all methods containing try/catch, provided they don't contain catch loops. Future work will focus on improving performance of the generated code. SsaLivenessAnalysis was updated to propagate liveness information of instructions live at catch blocks, and to keep location information on instructions which may be caught by catch phis. RegisterAllocator was extended to spill values used after catch, and to allocate spill slots for catch phis. Catch phis generated for the same vreg share a spill slot as the raw value must be the same. Location builders and slow paths were updated to reflect the fact that throwing an exception may not lead to escaping the method. Instruction code generators are forbidden from using implicit null checks in try blocks as live registers need to be saved before handing over to the runtime. CodeGenerator emits a stack map for each catch block, storing locations of catch phis. CodeInfo and StackMapStream recognize this new type of stack map and store them separate from other stack maps to avoid dex_pc conflicts. After having found the target catch block to deliver an exception to, QuickExceptionHandler looks up the dex register maps at the throwing instruction and the catch block and copies the values over to their respective locations. The runtime-support approach was selected because it allows for the best performance in the normal control-flow path, since no propagation of catch phi values is necessary until the exception is thrown. In addition, it also greatly simplifies the register allocation phase. ConstantHoisting was removed from LICMTest because it instantiated (now abstract) HConstant and was bogus anyway (constants are always in the entry block). Change-Id: Ie31038ad8e3ee0c13a5bbbbaf5f0b3e532310e4e
|
501fd635a557645ab05f893c56e1f358e21bab82 |
|
11-Sep-2015 |
Andreas Gampe <agampe@google.com> |
ART: Fix Quick-style LR vs PC core spill mask bug It's always been a bug that Quick marked PC as spilled instead of LR. The root cause was a mutation of the spill mask at frame exit, when LR is being restored into PC to return. A local should have been used to keep the actual spill mask safe and sound. This has only worked because nobody ever uses LR, even after long jumps for exception dispatch. However, single-frame deoptimization needs this to work, and I'd rather fix this than being forced to have machine-specific fixups. Also fix in optimizing, and bump the oat version. Change-Id: Ib032a533408bf464097fc96dcbfc5b6a68bf59a1
|
bfb5ba90cd6425ce49c2125a87e3b12222cc2601 |
|
01-Sep-2015 |
Andreas Gampe <agampe@google.com> |
Revert "Revert "Do a second check for testing intrinsic types."" This reverts commit a14b9fef395b94fa9a32147862c198fe7c22e3d7. When an intrinsic with invoke-type virtual is recognized, replace the instruction with a new HInvokeStaticOrDirect. Minimal update for dex-cache rework. Fix includes. Change-Id: I1c8e735a2fa7cda4419f76ca0717125ef236d332
|
05792b98980741111b4d0a24d68cff2a8e070a3a |
|
03-Aug-2015 |
Vladimir Marko <vmarko@google.com> |
ART: Move DexCache arrays to native. This CL has a companion CL in libcore/ https://android-review.googlesource.com/162985 Change-Id: Icbc9e20ad1b565e603195b12714762bb446515fa
|
73cf0fb75de2a449ce4fe329b5f1fb42eef1372f |
|
30-Jul-2015 |
Vladimir Marko <vmarko@google.com> |
ART: Add 16-bit Thumb2 ROR, NEGS and CMP for high registers. Also clean up the usage of set_cc flag. Define a SetCc enumeration that specifies whether to set or keep condition codes or whether we don't care and a 16-bit instruction should be selected if one exists. This reduces the size of Nexus 5 boot.oat by 44KiB (when compiled with Optimizing which is not the default yet). Change-Id: I047072dc197ea678bf2019c01bcb28943fa9b604
|
5a6cc49ed4f36dd11d6ec1590857b884ad8da6ab |
|
13-Aug-2015 |
Serban Constantinescu <serban.constantinescu@linaro.org> |
SlowPath: Remove the use of Locations in the SlowPath constructors. The main motivation is that using locations in the SlowPath constructors ties us to creating the SlowPaths after register allocation, since before the locations are invalid. A later patch of the series will be moving the SlowPath creation to the LocationsBuilder visitors. This will enable us to add more checking as well as consider sharing multiple SlowPaths of the same type. Change-Id: I7e96dcc2b5586d15153c942373e9281ecfe013f0 Signed-off-by: Serban Constantinescu <serban.constantinescu@linaro.org>
|
ecc4366670e12b4812ef1653f7c8d52234ca1b1f |
|
13-Aug-2015 |
Serban Constantinescu <serban.constantinescu@linaro.org> |
Add OptimizingCompilerStats to the CodeGenerator class. Just refactoring, not yet used, but will be used by the incoming patch series and future CodeGen specific stats. Change-Id: I7d20489907b82678120518a77bdab9c4cc58f937 Signed-off-by: Serban Constantinescu <serban.constantinescu@linaro.org>
|
581550137ee3a068a14224870e71aeee924a0646 |
|
19-Aug-2015 |
Vladimir Marko <vmarko@google.com> |
Revert "Revert "Optimizing: Better invoke-static/-direct dispatch."" Fixed kCallArtMethod to use correct callee location for kRecursive. This combination is used when compiling with debuggable flag set. This reverts commit b2c431e80e92eb6437788cc544cee6c88c3156df. Change-Id: Idee0f2a794199ebdf24892c60f8a5dcf057db01c
|
b2c431e80e92eb6437788cc544cee6c88c3156df |
|
19-Aug-2015 |
Vladimir Marko <vmarko@google.com> |
Revert "Optimizing: Better invoke-static/-direct dispatch." Reverting due to failing ndebug tests. This reverts commit 9b688a095afbae21112df5d495487ac5231b12d0. Change-Id: Ie4f69da6609df3b7c8443412b6cf7f5c43c2c5d9
|
9b688a095afbae21112df5d495487ac5231b12d0 |
|
06-May-2015 |
Vladimir Marko <vmarko@google.com> |
Optimizing: Better invoke-static/-direct dispatch. Add framework for different types of loading ArtMethod* and code pointer retrieval. Implement invoke-static and invoke-direct calls the same way as Quick. Document the dispatch kinds in HInvokeStaticOrDirect's new enumerations MethodLoadKind and CodePtrLocation. PC-relative loads from dex cache arrays are used only for x86-64 and arm64. The implementation for other architectures will be done in separate CLs. Change-Id: I468ca4d422dbd14748e1ba6b45289f0d31734d94
|
78e3ef6bc5f8aa149f2f8bf0c78ce854c2f910fa |
|
12-Aug-2015 |
Alexandre Rames <alexandre.rames@linaro.org> |
Add a GVN dependency 'GC' for garbage collection. This will be used by incoming architecture specific optimizations. The dependencies must be conservative. When an HInstruction is created we may not be sure whether it can trigger GC. In that case the 'ChangesGC' dependency must be set. We control at code-generation time that HInstructions that can call have the 'ChangesGC' dependency set. Change-Id: Iea6a7f430009f37a9599b0a0039207049906e45d
|
8c0676ce786f33b8f9c8eedf1ace48988c750932 |
|
03-Aug-2015 |
Serguei Katkov <serguei.i.katkov@intel.com> |
ART-Optimizing: Fix the type of HDivZeroCheck HDivZeroCheck is created while building the CFG, and at that moment its type is not completely known, so it is set to int or long. However, the SSA builder can later insert a type conversion, and the input type of HDivZeroCheck can become byte or short while the type of HDivZeroCheck itself remains the same. In reality the type of HDivZeroCheck should always be equal to that of its input parameter. To fix this inconsistency we return the input type as the type of HDivZeroCheck. Code generators are updated accordingly. Change-Id: I6a5aedc8d479cfc6328704e7ddf252bca830076b Signed-off-by: Serguei Katkov <serguei.i.katkov@intel.com>
|
8158f28b6689314213eb4dbbe14166073be71f7e |
|
07-Aug-2015 |
Alexandre Rames <alexandre.rames@linaro.org> |
Ensure coherency of call kinds for LocationSummary. The coherency is enforced with checks added in the `InvokeRuntime` helper, that we now also use on x86 and x86_64. Change-Id: I8cb92b042f25dc3c5fd390e9c61a45b477d081f4
|
cb1c0557033065f2436ee79e7fa6c19d87064801 |
|
04-Aug-2015 |
David Brazdil <dbrazdil@google.com> |
ART: Move exception clearing into own instruction Runtime delivers exceptions only to catch blocks which begin with a MOVE_EXCEPTION instruction (in DEX). In that case, the catch block is expected to clear the thread-local exception storage after having read the exception reference. This patch changes Optimizing to represent MOVE_EXCEPTION with two instructions - HLoadException and HClearException - instead of one. If the exception reference is not used, HLoadException can be safely removed, saving a memory load without breaking the runtime behaviour. Change-Id: Idad8a714467bf9d9d5fccefbc43c0bd8ae13ddba
|
2e7cd752452d02499a2f5fbd604c5427aa372f00 |
|
10-Jul-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
[optimizing] Don't rely on the verifier for String.<init>. Continue work on cutting the dependency on the verifier. Change-Id: I0f95b1eb2e10fd8f6bf54817f1202bdf6dfdb0fe
|
4fa13f65ece3b68fe3d8722d679ebab8656bbf99 |
|
06-Jul-2015 |
Roland Levillain <rpl@google.com> |
Fuse long and FP compare & condition on ARM in Optimizing. Also: - Stylistic changes in corresponding parts on the x86 and x86-64 code generators. - Update and improve the documentation of art::arm::Condition. Bug: 21120453 Change-Id: If144772046e7d21362c3c2086246cb7d011d49ce
|
4d02711ea578dbb789abb30cbaf12f9926e13d81 |
|
01-Jul-2015 |
Roland Levillain <rpl@google.com> |
Implement heap poisoning in ART's Optimizing compiler. - Instrument ARM, ARM64, x86 and x86-64 code generators. - Note: To turn heap poisoning on in Optimizing, set the environment variable `ART_HEAP_POISONING' to "true" before compiling ART. Bug: 12687968 Change-Id: Ib3120b38cf805a8a50207a314b9ccc90c8d93740
|
2bcb43111edf7bf99fe409ff3e9c76a285e54c25 |
|
01-Jul-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Use CompareAndBranchIf(Non)Zero when applicable. Now that we relocate branches, we can try to make more use of cbz/cbnz. Change-Id: I93ca64107f34eb3c43f2e7102ea90453113dad7a
|
fc6a86ab2b70781e72b807c1798b83829ca7f931 |
|
26-Jun-2015 |
David Brazdil <dbrazdil@google.com> |
Revert "Revert "ART: Implement try/catch blocks in Builder"" This patch enables the GraphBuilder to generate blocks and edges which represent the exceptional control flow when try/catch blocks are present in the code. Actual compilation is still delegated to Quick and Baseline ignores the additional code. To represent the relationship between try and catch blocks, Builder splits the edges which enter/exit a try block and links the newly created blocks to the corresponding exception handlers. This layout will later enable the SsaBuilder to correctly infer the dominators of the catch blocks and to produce the appropriate reverse post ordering. It will not, however, allow for building the complete SSA form of the catch blocks and consequently optimizing such blocks. To this end, a new TryBoundary control-flow instruction is introduced. Codegen treats it the same as a Goto but it allows for additional successors (the handlers). This reverts commit 3e18738bd338e9f8363b26bc895f38c0ec682824. Change-Id: I4f5ea961848a0b83d8db3673763861633e9bfcfb
|
3e18738bd338e9f8363b26bc895f38c0ec682824 |
|
26-Jun-2015 |
David Brazdil <dbrazdil@google.com> |
Revert "ART: Implement try/catch blocks in Builder" Causes OutOfMemory issues, need to investigate. This reverts commit 0b5c7d1994b76090afcc825e737f2b8c546da2f8. Change-Id: I263e6cc4df5f9a56ad2ce44e18932ca51d7e349f
|
0b5c7d1994b76090afcc825e737f2b8c546da2f8 |
|
11-Jun-2015 |
David Brazdil <dbrazdil@google.com> |
ART: Implement try/catch blocks in Builder This patch enables the GraphBuilder to generate blocks and edges which represent the exceptional control flow when try/catch blocks are present in the code. Actual compilation is still delegated to Quick and Baseline ignores the additional code. To represent the relationship between try and catch blocks, Builder splits the edges which enter/exit a try block and links the newly created blocks to the corresponding exception handlers. This layout will later enable the SsaBuilder to correctly infer the dominators of the catch blocks and to produce the appropriate reverse post ordering. It will not, however, allow for building the complete SSA form of the catch blocks and consequently optimizing such blocks. To this end, a new TryBoundary control-flow instruction is introduced. Codegen treats it the same as a Goto but it allows for additional successors (the handlers). Change-Id: I415b985596d5bebb7b1bb358a46e08b7b04bb53a
|
ad3359e77357cc5ce29ce529ab2ed9d0d8401da4 |
|
23-Jun-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Do not overwrite an input register in shift operations. 'second_reg' is an input register that can survive the instruction. Instead use the output register as a temporary result. bug:21667432 (cherry picked from commit a4f3581da73b83484a30ab499c4f8ad43b378dab) Change-Id: Ic1f399964911b8a9fc57352130c92b2a0a1b8e0d
|
a4f3581da73b83484a30ab499c4f8ad43b378dab |
|
23-Jun-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Do not overwrite an input register in shift operations. 'second_reg' is an input register that can survive the instruction. Instead use the output register as a temporary result. bug:21667432 Change-Id: I1a4577b0333c3fb184645023d5eae30555bbf65c
|
eb7b7399dbdb5e471b8ae00a567bf4f19edd3907 |
|
19-Jun-2015 |
Alexandre Rames <alexandre.rames@linaro.org> |
Opt compiler: Add disassembly to the '.cfg' output. This is automatically added to the '.cfg' output when using the usual `--dump-cfg` option. Change-Id: I864bfc3a8299c042e72e451cc7730ad8271e4deb
|
9931f319cf86c56c2855d800339a3410697633a6 |
|
19-Jun-2015 |
Alexandre Rames <alexandre.rames@linaro.org> |
Opt compiler: Add a description to slow paths. Change-Id: I22160d90de3fe0ab3e6a2acc440bda8daa00e0f0
|
33d6903e570daf8f3cf7c1f6ebd9a6dd22c7c23c |
|
18-Jun-2015 |
Roland Levillain <rpl@google.com> |
Replace some run-time assertions with compile-time ones in ART. Change-Id: I16c3fad45c4b98b94b7c83d071374096e81d407a
|
cf93a5cd9c978f59113d42f9f642fab5e2cc8877 |
|
16-Jun-2015 |
Vladimir Marko <vmarko@google.com> |
Revert "Revert "ART: Implement literal pool for arm, fix branch fixup."" This reverts commit fbeb4aede0ddc5b1e6a5a3a40cc6266fe8518c98. Adjust block label positions. Bad catch block labels were the reason for the revert. Change-Id: Ia6950d639d46b9da6b07f3ade63ab46d03d63310
|
fbeb4aede0ddc5b1e6a5a3a40cc6266fe8518c98 |
|
16-Jun-2015 |
Vladimir Marko <vmarko@google.com> |
Revert "ART: Implement literal pool for arm, fix branch fixup." This reverts commit f38caa68cce551fb153dff37d01db518e58ed00f. Change-Id: Id88b82cc949d288cfcdb3c401b96f884b777fc40 Reason: broke the tests.
|
f38caa68cce551fb153dff37d01db518e58ed00f |
|
29-May-2015 |
Vladimir Marko <vmarko@google.com> |
ART: Implement literal pool for arm, fix branch fixup. Change-Id: Iecc91418bb4ee1c957f42fefb737d0ee2ba960e7
|
69aa60163989c33a008115205d39732a76ecc1dc |
|
09-Jun-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Revert "Revert "Pass current method to HNewInstance and HNewArray."" Problem exposed by this change was fixed in: https://android-review.googlesource.com/#/c/154031/ This reverts commit 7b0e353b49ac3f464c662f20e20e240f0231afff. Change-Id: I680c13dc9db9ba223ab11c7af255222860b4e6d2
|
ae71a0539451a8350bdd9d46c76ddab7b763f209 |
|
09-Jun-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Fix a crash in optimizing compiler with the current method. Crash was due to overwriting the location of the current method in the slow path of an intrinsic. Change-Id: I6ca58ef5b3cea19925e60b9500aef543bc5f71ef
|
7b0e353b49ac3f464c662f20e20e240f0231afff |
|
09-Jun-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Revert "Pass current method to HNewInstance and HNewArray." 082-inline-execute fails on x86. This reverts commit e21aa42e1341d34250742abafdd83311ad9fa737. Change-Id: Ib3fd25faee2e0128001e40d3d51a74f959bc4449
|
94015b939060f5041d408d48717f22443e55b6ad |
|
04-Jun-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Revert "Revert "Use HCurrentMethod in HInvokeStaticOrDirect."" Fix was to special case baseline for x86, which does not have enough registers to allocate the current method. This reverts commit c345f141f11faad177aa9635a78088d00cf66086. Change-Id: I5997aa52f8d4df373ae5ff4d4150dac0c44c4c10
|
e21aa42e1341d34250742abafdd83311ad9fa737 |
|
08-Jun-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Pass current method to HNewInstance and HNewArray. Also remove unused CodeGenerator::LoadCurrentMethod. Change-Id: I4b8d3f2a30b8e2c76b6b329a72555483c993cb73
|
c345f141f11faad177aa9635a78088d00cf66086 |
|
04-Jun-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Revert "Use HCurrentMethod in HInvokeStaticOrDirect." Fails on baseline/x86. This reverts commit 38207af82afb6f99c687f64b15601ed20d82220a. Change-Id: Ib71018367eb7c6046965494a7e996c22af3de403
|
38207af82afb6f99c687f64b15601ed20d82220a |
|
01-Jun-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Use HCurrentMethod in HInvokeStaticOrDirect. Change-Id: I0d15244b6b44c8b10079398c55da5071a3e3af66
|
682393c4b1995c209e2cf71780b0fb9023150213 |
|
14-Apr-2015 |
Roland Levillain <rpl@google.com> |
Improve the performance of long-to-double conversions on ARM. Use a VMLA instruction instead of VADD & VMUL instructions in long-to-double conversions on ARM. This change reduces code size and improves execution times (but does not alter precision). It trades one temporary FPU register for two temporary core registers. Change-Id: I1dc35bef6c12be8f305e5b46da98c2421686b60d
|
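The long-to-double change above can be illustrated with a small sketch. It assumes the conversion is computed as high * 2^32 + low, which is the shape a single multiply-accumulate (VMLA) would cover; the function name is hypothetical and this is not the ART code itself.

```cpp
#include <cstdint>

// Hypothetical helper, not taken from ART: converts a signed 64-bit integer
// to double as (signed high word) * 2^32 + (unsigned low word).
// The product is exact in double precision, so only the final add rounds.
double LongToDoubleViaMultiplyAdd(int64_t value) {
  const uint64_t bits = static_cast<uint64_t>(value);
  const uint32_t low = static_cast<uint32_t>(bits);
  const int32_t high = static_cast<int32_t>(static_cast<uint32_t>(bits >> 32));
  const double two_pow_32 = 4294967296.0;  // 2^32
  return static_cast<double>(high) * two_pow_32 + static_cast<double>(low);
}
```

Because the multiply and the add form one accumulate step, a backend with a multiply-accumulate instruction could plausibly emit them as a single instruction.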
0d1652e1e3768b30e4d80f31d59db580312581d8 |
|
03-Jun-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Fix compilation errors with gcc. Change-Id: If88d4f639658db2d6d71f5abcad563211138fc4a
|
fd88f16100cceafbfde1b4f095f17e89444d6fa8 |
|
03-Jun-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Factorize code for common LocationSummary of HInvoke. This is one step forward, we could factorize more, but I wanted to get this out of the way first. Change-Id: I6ae411a737eebaecb64974f47af507ce0cfbae85
|
1d8199d8215a6ee7b1904edc47372d83fcdee5a3 |
|
02-Jun-2015 |
Kenny Root <kroot@google.com> |
Tidy up spelling Change-Id: I65fba9d8310ff3759322cec3345235e6472f4cfb
|
5b3ee56cd63ee9e3c70c0412d044b81ab9c94513 |
|
14-Apr-2015 |
Roland Levillain <rpl@google.com> |
Delegate long-to-float type conversions to the runtime on ARM. On ARM, translate long-to-float type conversions (from both Quick and Optimizing) as calls to the runtime routine art_l2f, instead of generating ad hoc code, so as to improve the precision of the conversions. Bug: 20413424 Change-Id: I8c414ee1c6f4ff1f32ee78f75734cfd3cf579f71
|
3d21bdf8894e780d349c481e5c9e29fe1556051c |
|
22-Apr-2015 |
Mathieu Chartier <mathieuc@google.com> |
Move mirror::ArtMethod to native Optimizing + quick tests are passing, devices boot. TODO: Test and fix bugs in mips64. Saves 16 bytes per most ArtMethod, 7.5MB reduction in system PSS. Some of the savings are from removal of virtual methods and direct methods object arrays. Bug: 19264997 (cherry picked from commit e401d146407d61eeb99f8d6176b2ac13c4df1e33) Change-Id: I622469a0cfa0e7082a2119f3d6a9491eb61e3f3d Fix some ArtMethod related bugs Added root visiting for runtime methods, not currently required since the GcRoots in these methods are null. Added missing GetInterfaceMethodIfProxy in GetMethodLine, fixes --trace run-tests 005, 044. Fixed optimizing compiler bug where we used a normal stack location instead of double on ARM64, this fixes the debuggable tests. TODO: Fix JDWP tests. Bug: 19264997 Change-Id: I7c55f69c61d1b45351fd0dc7185ffe5efad82bd3 ART: Fix casts for 64-bit pointers on 32-bit compiler. Bug: 19264997 Change-Id: Ief45cdd4bae5a43fc8bfdfa7cf744e2c57529457 Fix JDWP tests after ArtMethod change Fixes Throwable::GetStackDepth for exception event detection after internal stack trace representation change. Adds missing ArtMethod::GetInterfaceMethodIfProxy call in case of proxy method. Bug: 19264997 Change-Id: I363e293796848c3ec491c963813f62d868da44d2 Fix accidental IMT and root marking regression Was always using the conflict trampoline. Also included fix for regression in GC time caused by extra roots. Most of the regression was IMT. Fixed bug in DumpGcPerformanceInfo where we would get SIGABRT due to detached thread. EvaluateAndApplyChanges: From ~2500 -> ~1980 GC time: 8.2s -> 7.2s due to 1s less of MarkConcurrentRoots Bug: 19264997 Change-Id: I4333e80a8268c2ed1284f87f25b9f113d4f2c7e0 Fix bogus image test assert Previously we were comparing the size of the non moving space to size of the image file. Now we properly compare the size of the image space against the size of the image file. Bug: 19264997 Change-Id: I7359f1f73ae3df60c5147245935a24431c04808a [MIPS64] Fix art_quick_invoke_stub argument offsets. ArtMethod reference's size got bigger, so we need to move other args and leave enough space for ArtMethod* and 'this' pointer. This fixes mips64 boot. Bug: 19264997 Change-Id: I47198d5f39a4caab30b3b77479d5eedaad5006ab
|
62a46b2b4ac066a740fb22e58a246c18501fa909 |
|
01-Jun-2015 |
Roland Levillain <rpl@google.com> |
Use down_cast instead of reinterpret_cast in Optimizing codegens. Change-Id: Ifa23023ffaca631a4f6b5745dd7492c39521a26f
|
e401d146407d61eeb99f8d6176b2ac13c4df1e33 |
|
22-Apr-2015 |
Mathieu Chartier <mathieuc@google.com> |
Move mirror::ArtMethod to native Optimizing + quick tests are passing, devices boot. TODO: Test and fix bugs in mips64. Saves 16 bytes per most ArtMethod, 7.5MB reduction in system PSS. Some of the savings are from removal of virtual methods and direct methods object arrays. Bug: 19264997 Change-Id: I622469a0cfa0e7082a2119f3d6a9491eb61e3f3d
|
fbdaa30a448029d75422c76f29087a4e39630f4a |
|
29-May-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Use the new HCurrentMethod in HLoadString. Change-Id: I23d27e5e10736d127519eb3238ff8f25df3843a2
|
76b1e1799a713a19218de26b171b0aef48a59e98 |
|
27-May-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Add a HCurrentMethod node. This enables register allocation for the current method, so that users of it don't always load it from the stack. Currently only used by HLoadClass. Will make follow-up CLs for the other users. Change-Id: If73324d85643102faba47fabbbd2755eb258c59c
|
0d37cd0a895cedb1653cf9897d9f9058855e2aee |
|
27-May-2015 |
Roland Levillain <rpl@google.com> |
Rename VisitCondition's argument in code generators. This argument is a condition instruction, not a comparison. Change-Id: I026f799d2161df58b0c8a84600eb8fffd6f7b998
|
41b175aba41c9365a1c53b8a1afbd17129c87c14 |
|
19-May-2015 |
Vladimir Marko <vmarko@google.com> |
ART: Clean up arm64 kNumberOfXRegisters usage. Avoid undefined behavior for arm64 stemming from 1u << 32 in loops with upper bound kNumberOfXRegisters. Create iterators for enumerating bits in an integer either from high to low or from low to high and use them for <arch>Context::FillCalleeSaves() on all architectures. Refactor runtime/utils.{h,cc} by moving all bit-fiddling functions to runtime/base/bit_utils.{h,cc} (together with the new bit iterators) and all time-related functions to runtime/base/time_utils.{h,cc}. Improve test coverage and fix some corner cases for the bit-fiddling functions. Bug: 13925192 (cherry picked from commit 80afd02024d20e60b197d3adfbb43cc303cf29e0) Change-Id: I905257a21de90b5860ebe1e39563758f721eab82
|
80afd02024d20e60b197d3adfbb43cc303cf29e0 |
|
19-May-2015 |
Vladimir Marko <vmarko@google.com> |
ART: Clean up arm64 kNumberOfXRegisters usage. Avoid undefined behavior for arm64 stemming from 1u << 32 in loops with upper bound kNumberOfXRegisters. Create iterators for enumerating bits in an integer either from high to low or from low to high and use them for <arch>Context::FillCalleeSaves() on all architectures. Refactor runtime/utils.{h,cc} by moving all bit-fiddling functions to runtime/base/bit_utils.{h,cc} (together with the new bit iterators) and all time-related functions to runtime/base/time_utils.{h,cc}. Improve test coverage and fix some corner cases for the bit-fiddling functions. Bug: 13925192 Change-Id: I704884dab15b41ecf7a1c47d397ab1c3fc7ee0f7
|
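The undefined behavior mentioned above comes from shifting a 32-bit value by 32 or more. A minimal sketch of one way to enumerate set bits without ever forming such a shift follows; the helper name is hypothetical and this is not the iterator added to runtime/base/bit_utils.h.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical helper: returns the indices of the set bits in `mask`,
// lowest bit first. The mask itself is shifted right one bit at a time,
// so no expression of the form (1u << 32) is ever evaluated.
std::vector<int> SetBitsLowToHigh(uint32_t mask) {
  std::vector<int> indices;
  int index = 0;
  while (mask != 0u) {
    if ((mask & 1u) != 0u) {
      indices.push_back(index);
    }
    mask >>= 1;
    ++index;
  }
  return indices;
}
```

The loop also stops as soon as no set bits remain, so it does not depend on a register-count upper bound at all.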
d56376cce54e7df976780ecbd03228f60d276433 |
|
21-May-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Revert "Revert "Introduce a NearLabel in thumb2."" This reverts commit 1f277e3cef6c33cd35e91123978491d83338d2ad. - Fix CompareAndBranch to not use cbz/cbnz with high registers. - Add a test for CompareAndBranch with the *inc file, as the other assembler test infrastructure does not handle labels. Change-Id: If552bf1112b96caa3b9bb6c73c4b40bb90a33db7
|
1f277e3cef6c33cd35e91123978491d83338d2ad |
|
21-May-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Revert "Introduce a NearLabel in thumb2." Fails some benchmarks and libcore tests. This reverts commit db0bbab279534974dca507946c66cff2d05dc9f9. Change-Id: I5d1afef5ede87e65d61f49529027c5c2f35b17fb
|
db0bbab279534974dca507946c66cff2d05dc9f9 |
|
20-May-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Introduce a NearLabel in thumb2. This tells the assembler that the user knows the encoding can be in 16bits. Change-Id: Idf36c38beb1e07a69862c972484aeb08326a0499
|
d126ba19a2a3352fedbe43ed628ab60ccd401424 |
|
20-May-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
[Optimizing] Thumb2 assembler: use 16bits branches when we can. We cannot relocate branches, but we can at least encode branches on 16bits when the target is known. Change-Id: Icb6116ed974fc97e03622ac80d914c2c06f4cba2
|
07276db28d654594e0e86e9e467cad393f752e6e |
|
18-May-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Don't do a null test in MarkGCCard if the value cannot be null. Change-Id: I45687f6d3505178e2fc3689eac9cb6ab1b2c1e29
|
c66671076b12a0ee8b9d1ae782732cc91beacb73 |
|
15-May-2015 |
Zheng Xu <zheng.xu@arm.com> |
Opt compiler: Speedup div/rem by constants on arm32 and arm64. This patch also includes: 1. Add java test for div/rem negative constants. 2. Fix a thumb2 encoding issue where the last operand is "reg, shift #amount" in some instructions. 3. Support a simple filter in arm32 assembler test to filter out unsupported cases, such as "smull r0, r0, r1, r2". 4. Add smull arm32 assembler test. 5. Add smull/umull thumb2 test. 6. Add test for the thumb2 encoding issue which is fixed in this patch. Change-Id: I1601bc9c38f70f11909f2816fe3ec105a158951e
|
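As an illustration of how division by a constant can avoid a divide instruction, here is a sketch for the single divisor 3 using the well-known multiply-high technique. The magic constant 0x55555556 is the standard one for signed 32-bit division by 3; whether the patch uses exactly this formulation is an assumption, and the function name is hypothetical.

```cpp
#include <cstdint>

// Hypothetical helper: computes n / 3 (truncating toward zero) with a
// 32x32 -> 64 multiply, the kind of product an instruction like smull yields.
// Assumes >> on a negative int64_t is an arithmetic shift, which holds on
// mainstream compilers and is guaranteed from C++20.
int32_t DivideByThree(int32_t n) {
  const int64_t product = static_cast<int64_t>(n) * 0x55555556LL;
  int32_t quotient = static_cast<int32_t>(product >> 32);  // high word
  quotient += (n < 0) ? 1 : 0;  // correct the floor result for negative n
  return quotient;
}
```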
c74652867cd9293e86232324e5e057cd73c48e74 |
|
13-May-2015 |
David Brazdil <dbrazdil@google.com> |
ART: Refactor GraphVisualizer attribute printing This patch unifies the way GraphVisualizer prints instruction attributes in preparation of changes to the Checker syntax. Change-Id: I44e91e36c660985ddfe039a9f410fedc48b496ec
|
db216f4d49ea1561a74261c29f1264952232728a |
|
05-May-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Relax the only one back-edge restriction. The rule is in the way for better register allocation, as it creates an artificial join point between multiple paths. Change-Id: Ia4392890f95bcea56d143138f28ddce6c572ad58
|
2d27c8e338af7262dbd4aaa66127bb8fa1758b86 |
|
28-Apr-2015 |
Roland Levillain <rpl@google.com> |
Refactor InvokeDexCallingConventionVisitor in Optimizing. Change-Id: I7ede0f59d5109644887bf5d39201d4e1bf043f34
|
3e3d73349a2de81d14e2279f60ffbd9ab3f3ac28 |
|
28-Apr-2015 |
Roland Levillain <rpl@google.com> |
Have HInvoke instructions know their number of actual arguments. Add an art::HInvoke::GetNumberOfArguments routine so that art::HInvoke and its subclasses can return the number of actual arguments of the called method. Use it in code generators and intrinsics handlers. Consequently, no longer remove a clinit check as last input of a static invoke if it is still present during baseline code generation, but ensure that static invokes have no such check as last input in optimized compilations. Change-Id: Iaf9e07d1057a3b15b83d9638538c02b70211e476
|
848f70a3d73833fc1bf3032a9ff6812e429661d9 |
|
15-Jan-2014 |
Jeff Hao <jeffhao@google.com> |
Replace String CharArray with internal uint16_t array. Summary of high level changes: - Adds compiler inliner support to identify string init methods - Adds compiler support (quick & optimizing) with new invoke code path that calls method off the thread pointer - Adds thread entrypoints for all string init methods - Adds map to verifier to log when receiver of string init has been copied to other registers. used by compiler and interpreter Change-Id: I797b992a8feb566f9ad73060011ab6f51eb7ce01
|
0379f82393237798616d485ad99952e73e480e12 |
|
25-Apr-2015 |
Roland Levillain <rpl@google.com> |
Fix DCHECKs about clinit checks in Optimizing's code generators. These assertions are not true for the baseline compiler. As a temporary workaround, remove a clinit check as last input of a static invoke if it is still present at the stage of code generation. Change-Id: I5655f4a0873e2e7ee7790b6a341c18b4b7b52af1
|
4c0eb42259d790fddcd9978b66328dbb3ab65615 |
|
24-Apr-2015 |
Roland Levillain <rpl@google.com> |
Ensure inlined static calls perform clinit checks in Optimizing. Calls to static methods have implicit class initialization (clinit) checks of the method's declaring class in Optimizing. However, when such a static call is inlined, the implicit clinit check vanishes, possibly leading to an incorrect behavior. To ensure that inlining static methods does not change the behavior of a program, add explicit class initialization checks (art::HClinitCheck) as well as load class instructions (art::HLoadClass) as last input of static calls (art::HInvokeStaticOrDirect) in Optimizing's control flow graphs, when the declaring class is reachable and not known to be already initialized. Then when considering the inlining of a static method call, proceed only if the method has no implicit clinit check requirement. The added explicit clinit checks are already removed by the art::PrepareForRegisterAllocation visitor. This CL also extends this visitor to turn explicit clinit checks from static invokes into implicit ones after the inlining step, by removing the added art::HLoadClass nodes mentioned hereinbefore. Change-Id: I9ba452b8bd09ae1fdd9a3797ef556e3e7e19c651
|
5ea536aa4a6414db01beaf6f8bd8cb9adc5cfc92 |
|
20-Apr-2015 |
Vladimir Marko <vmarko@google.com> |
Remove ArtMethod* parameter from dex cache entry points. Load the ArtMethod* using an optimized stack walk instead. This reduces the size of the generated code. Three of the entry points are called only from a slow-path and the fourth (InitializeTypeAndVerifyAccess) is rare and already slow enough that the one or two extra loads (depending on whether we already have the ArtMethod* in a register) are insignificant. And as we're starting to use PC-relative addressing of the dex cache arrays (already done by Quick for the boot image), having the ArtMethod* in a register becomes less likely anyway. Change-Id: Ib19b9d204e355e13bf386662a8b158178bf8ad28
|
af88835231c2508509eb19aa2d21b92879351962 |
|
20-Apr-2015 |
Guillaume "Vermeille" Sanchez <guillaumesa@google.com> |
Remove unnecessary null checks in CheckCast and InstanceOf Change-Id: I6fd81cabd8673be360f369e6318df0de8b18b634
|
27df758e2e7baebb6e3f393f9732fd0d064420c8 |
|
17-Apr-2015 |
Calin Juravle <calin@google.com> |
[optimizing] Add memory barriers in constructors when needed If a class has final fields we must add a memory barrier before returning from constructor. This makes sure the fields are visible to other threads. Bug: 19851497 Change-Id: If8c485092fc512efb9636cd568cb0543fb27688e
|
acc0b8e3c7bad818edc9b777b89e97003b1eb4eb |
|
20-Apr-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Fix lint error. Change-Id: Id956c0e8c864a14c05d291f6b890df4877652306
|
88c13cddc3a4184908662b0f3de796565d348c76 |
|
14-Apr-2015 |
Alexandre Rames <alexandre.rames@arm.com> |
Opt compiler: Correctly require register or FPU register. Also add a check that location summaries are correctly typed with the HInstruction. Change-Id: I699762ff4e8f4e321c7db01ea005236ea1934af9
|
13b4718ecd52a674b25eac106e654d8e89872750 |
|
15-Apr-2015 |
David Brazdil <dbrazdil@google.com> |
ART: Remove DCHECKs for boolean type Since bool and int are interchangeable types, checking whether an input is kPrimBoolean can fail when replaced with 0/1 constant or a phi. This patch removes the problematic DCHECKs, adds a best-effort verification into SSAChecker but leaves the phi case empty until a suitable analysis is implemented. Change-Id: I31e8daf27dd33d2fd74049b82bed1cb7c240c8c6
|
9021825d1e73998b99c81e89c73796f6f2845471 |
|
15-Apr-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Type MoveOperands. The ParallelMoveResolver implementation needs to know if a move is for 64bits or not, to handle swaps correctly. Bug found, and test case courtesy of Serguei I. Katkov. Change-Id: I9a0917a1cfed398c07e57ad6251aea8c9b0b8506
|
66d126ea06ce3f507d86ca5f0d1f752170ac9be1 |
|
03-Apr-2015 |
David Brazdil <dbrazdil@google.com> |
ART: Implement HBooleanNot instruction Optimizations simplifying operations on boolean values (boolean simplifier, instruction simplifier) can benefit from having a special HInstruction for negating booleans in order to perform more transforms and produce faster machine code. This patch implements HBooleanNot as 'x xor 1', assuming that booleans are 1-bit integers and allowing for a single-instruction negation on all supported platforms. Change-Id: I33a2649c1821255b18a86ca68ed16416063c739f
|
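The 'x xor 1' formulation above relies on booleans being stored as the integers 0 and 1. A short sketch contrasts it with bitwise complement, which is not a boolean negation (compare the later entry in this log noting that HNot folds to ~, not !). The function names are hypothetical.

```cpp
#include <cstdint>

// Hypothetical helper: logical negation of a boolean stored as 0 or 1.
// A single XOR with 1 flips the low bit and leaves the value in {0, 1}.
int32_t BooleanNot(int32_t boolean_value) {
  return boolean_value ^ 1;
}

// Bitwise complement, shown for contrast: it flips every bit, so the
// result for a 0/1 input is -1 or -2 rather than 1 or 0.
int32_t BitwiseNot(int32_t value) {
  return ~value;
}
```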
9d8606de5e274c00242ee73ffb693bc34589f184 |
|
12-Apr-2015 |
David Srbecky <dsrbecky@google.com> |
Whitespace cleanup in DWARFReg helper functions. Change-Id: Iedc05969b05be6d93e40467ff23287faaae08fb3
|
c34dc9362b9ec624b3bdd97d36b6b2098814cd73 |
|
12-Apr-2015 |
David Srbecky <dsrbecky@google.com> |
Move 'ret' instruction generation inside GenerateFrameExit. Change-Id: I0c594d9a2356a006a5ce8dfd41d307cf7c3704ba
|
c6b4dd8980350aaf250f0185f73e9c42ec17cd57 |
|
07-Apr-2015 |
David Srbecky <dsrbecky@google.com> |
Implement CFI for Optimizing. CFI is necessary for stack unwinding in gdb, lldb, and libunwind. Change-Id: I1a3480e3a4a99f48bf7e6e63c4e83a80cfee40a2
|
65b798ea10dd716c1bb3dda029f9bf255435af72 |
|
06-Apr-2015 |
Andreas Gampe <agampe@google.com> |
ART: Enable more Clang warnings Change-Id: Ie6aba02f4223b1de02530e1515c63505f37e184c
|
d43b3ac88cd46b8815890188c9c2b9a3f1564648 |
|
01-Apr-2015 |
Mingyao Yang <mingyao@google.com> |
Revert "Revert "Deoptimization-based bce."" This reverts commit 0ba627337274ccfb8c9cb9bf23fffb1e1b9d1430. Change-Id: I1ca10d15bbb49897a0cf541ab160431ec180a006
|
b51cdb32acd8b056752375e5f01d243033ec360c |
|
30-Mar-2015 |
Andreas Gampe <agampe@google.com> |
ART: Arm32 optimizing compiler backend should honor sdiv We still support architectures that do not have sdiv. Issue: https://code.google.com/p/android/issues/detail?id=162257 Change-Id: I6d43620b7599f70a630668791a796a1703b62912
|
d75948ac93a4a317feaf136cae78823071234ba5 |
|
27-Mar-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Intrinsify String.compareTo. Change-Id: Ia540df98755ac493fe61bd63f0bd94f6d97fbb57
|
b2bd1c5f9171f35fa5b71ada42d1a9e11189428d |
|
25-Mar-2015 |
David Brazdil <dbrazdil@google.com> |
ART: Formatting and comments in BooleanSimplifier Change-Id: I9a5aa3f2aa8b0a29d7b0f1e5e247397cf8e9e379
|
fd18f5ac060365286616cce773f8702d6246e4ca |
|
11-Mar-2015 |
Guillaume "Vermeille" Sanchez <guillaumesa@google.com> |
Inline long shift code Change-Id: I2848636f892e276507d04f4313987b9f4c80686b
|
46e2a3915aa68c77426b71e95b9f3658250646b7 |
|
16-Mar-2015 |
David Brazdil <dbrazdil@google.com> |
ART: Boolean simplifier The optimization recognizes the negation pattern generated by 'javac' and replaces it with a single condition. To this end, boolean values are now consistently assumed to be represented by an integer. This is a first optimization which deletes blocks from the HGraph and does so by replacing the corresponding entries with null. Hence, existing code can continue indexing the list of blocks with the block ID, but must check for null when iterating over the list. Change-Id: I7779da69cfa925c6521938ad0bcc11bc52335583
|
0ba627337274ccfb8c9cb9bf23fffb1e1b9d1430 |
|
24-Mar-2015 |
Andreas Gampe <agampe@google.com> |
Revert "Deoptimization-based bce." This breaks compiling the core image: Error after BCE: art::SSAChecker: Instruction 219 in block 1 does not dominate use 221 in block 1. This reverts commit e295e6ec5beaea31be5d7d3c996cd8cfa2053129. Change-Id: Ieeb48797d451836ed506ccb940872f1443942e4e
|
e295e6ec5beaea31be5d7d3c996cd8cfa2053129 |
|
07-Mar-2015 |
Mingyao Yang <mingyao@google.com> |
Deoptimization-based bce. A mechanism is introduced that a runtime method can be called from code compiled with optimizing compiler to deoptimize into interpreter. This can be used to establish invariants in the managed code. If the invariant does not hold at runtime, we will deoptimize and continue execution in the interpreter. This allows us to optimize the managed code as if the invariant was proven during compile time. However, the exception will be thrown according to the semantics demanded by the spec. The invariant and optimization included in this patch are based on the length of an array. Given a set of array accesses with constant indices {c1, ..., cn}, we can optimize away all bounds checks iff 0 <= min(ci) and max(ci) < array-length. The first can be proven statically. The second can be established with a deoptimization-based invariant. This replaces n bounds checks with one invariant check (plus slow-path code). Change-Id: I8c6e34b56c85d25b91074832d13dba1db0a81569
|
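The invariant described above reduces n per-access checks to a comparison of the smallest and largest constant index against the array bounds. A minimal sketch of that condition follows; the function name is hypothetical, and the real patch performs the lower-bound part at compile time and the upper-bound part as a runtime check that deoptimizes on failure.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical helper: returns true when every constant index in `indices`
// is a valid index for an array of `array_length` elements. Only the minimum
// and maximum index need to be examined.
bool AllConstantIndicesInBounds(const std::vector<int>& indices, int array_length) {
  if (indices.empty()) {
    return true;  // No accesses, nothing to check.
  }
  const auto bounds = std::minmax_element(indices.begin(), indices.end());
  const int min_index = *bounds.first;
  const int max_index = *bounds.second;
  return min_index >= 0 && max_index < array_length;
}
```

When this condition is false, compiled code following the scheme in the commit message would hand control to the interpreter, which then throws the exception at the access the spec requires.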
f3b4aebd0f5ce6c82bfd6284919a5c5e91955124 |
|
17-Mar-2015 |
Calin Juravle <calin@google.com> |
Revert "Inline long shift code" This reverts commit 09895ebf2d98783e65930a820e9288703bb1a50b. Change-Id: I7544022d896ef4353bc2cdf4b036403ed20c956d
|
09895ebf2d98783e65930a820e9288703bb1a50b |
|
11-Mar-2015 |
Guillaume "Vermeille" Sanchez <guillaumesa@google.com> |
Inline long shift code Change-Id: I96887c295eb9a23dad4c9cc05d0a0e3ba17f674d
|
68e15009173f92fe717546a621b56413d5e9fba1 |
|
17-Mar-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
PREOPT compiles using dex2oatd so don't emit debug instructions. Change-Id: I8d2ab8d956ad0ce313928918c658d49f490ad081
|
eeefa1276e83776f08704a3db4237423b0627e20 |
|
13-Mar-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Update locations of registers after slow paths spilling. Change-Id: Id9aafcc13c1a085c17ce65d704c67b73f9de695d
|
a8ac9130b872c080299afacf5dcaab513d13ea87 |
|
13-Mar-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Refactor code in preparation of correct stack maps in slow path. Move the logic of saving/restoring live registers in slow path in the SlowPathCode method. Also add a RecordPcInfo helper to SlowPathCode, that will act as the placeholder of saving correct stack maps. Change-Id: I25c2bc7a642ef854bbc8a3eb570e5c8c8d2d030c
|
dc5ac731f6369b53b42f1cee3404f3b3384cec34 |
|
25-Feb-2015 |
Mingyao Yang <mingyao@google.com> |
Opt compiler: enhance gvn for commutative ops. Change-Id: I415b50d58b30cab4ec38077be22373eb9598ec40
|
d8ef2e991a1a65f47a26a1eb8c6b34c92b775d6b |
|
24-Feb-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
not-int can also take non-int (byte and short) instructions. So we should use the result type instead of the input type for knowing which instruction to use. Bug: 19454010 Change-Id: I88782ad27ae8c8e1b7868afede5057d26f14685a
|
b1498f67b444c897fa8f1530777ef118e05aa631 |
|
16-Feb-2015 |
Calin Juravle <calin@google.com> |
Improve type propagation with if-contexts This works by adding a new instruction (HBoundType) after each `if (a instanceof ClassA) {}` to bound the type that `a` can take in the True- dominated blocks. Change-Id: Iae6a150b353486d4509b0d9b092164675732b90c
|
d6138ef1ea13d07ae555542f8898b30d89e9ac9a |
|
18-Feb-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Ensure the graph is correctly typed. We used to be forgiving because of HIntConstant(0) also being used for null. We now create a special HNullConstant for such uses. Also, we need to run the dead phi elimination twice during ssa building to ensure the correctness. Change-Id: If479efa3680d3358800aebb1cca692fa2d94f6e5
|
ffe8a577a4c644a2c5387f1e8efe92fb0efac43f |
|
11-Feb-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Optimize double/float immediate loading on arm. Also reserve a D register for temp. Change-Id: I6584d9005b0f5685c3afcd8e9153b4c87b56aa8e
|
f7a0c4e421b5edaad5b7a15bfff687da28d0b287 |
|
10-Feb-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Improve ParallelMoveResolver to work with pairs. Change-Id: Ie2a540ffdb78f7f15d69c16a08ca2d3e794f65b9
|
2bcf9bf784a0021630d8fe63d7230d46d6891780 |
|
29-Jan-2015 |
Andreas Gampe <agampe@google.com> |
ART: Arm intrinsics for Optimizing compiler Add arm32 intrinsics to the optimizing compiler. Change-Id: If4aeedbf560862074d8ee08ca4484b666d6b9bf0
|
c0572a451944f78397619dec34a38c36c11e9d2a |
|
06-Feb-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Optimize leaf methods. Avoid suspend checks and stack changes when not needed. Change-Id: I0fdb31e8c631e99091b818874a558c9aa04b1628
|
829280cc90b7a84db42864589b4bafb4c94a79d9 |
|
28-Jan-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Finally implement Location::kNoOutputOverlap. The [i, i + 1) interval scheme we chose for representing lifetime positions is not optimal for doing this optimization. It however doesn't prevent recognizing a non-split interval during the TryAllocateFreeReg phase, and try to re-use its inputs' registers. Change-Id: I80a2823b0048d3310becfc5f5fb7b1230dfd8201
|
cb1b00aedd94785e7599f18065a0b97b314e64f6 |
|
28-Jan-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Use the non access check entrypoint when possible. Change-Id: I0b53d63141395e26816d5d2ce3fa6a297bb39b54
|
1cf95287364948689f6a1a320567acd7728e94a3 |
|
12-Dec-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Small optimization for recursive calls: avoid dex cache. Change-Id: I044757a2f06e535cdc1480c4fc8182b89635baf6
|
a0bb2bd5b6a049ad806c223f00672d1f0210db67 |
|
26-Jan-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Fix codegen_test. Native and ART do not have the same calling convention for ARM, so we need to adjust blocked and allocated registers. Change-Id: I606b2620c0e5a54bd60d6100a137c06616ad40b4
|
4dee636d21d9ce54386cdfbb824e5eb2a9c1af0d |
|
23-Jan-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Support callee-save registers on ARM. Change-Id: I7c519b7a828c9891b1141a8e51e12d6a8bc84118
|
d97dc40d186aec46bfd318b6a2026a98241d7e9c |
|
22-Jan-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Support callee save floating point registers on x64. - Share the computation of core_spill_mask and fpu_spill_mask between backends. - Remove explicit stack overflow check support: we need to adjust them and since they are not tested, they will easily bitrot. Change-Id: I0b619b8de4e1bdb169ea1ae7c6ede8df0d65837a
|
988939683c26c0b1c8808fc206add6337319509a |
|
21-Jan-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Enable core callee-save on x64. Will work on other architectures and FP support in other CLs. Change-Id: I8cef0343eedc7202d206f5217fdf0349035f0e4d
|
fa93b504b324784dd9a96e28e6e8f3f1b1ac456a |
|
21-Jan-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Do not use HNot for creating !bool. HNot folds to ~, not !. Change-Id: I681f968449a2ade7110b2f316146ad16ba5da74c
|
6c2dff8ff8e1440fa4d9e1b2ba2a44d036882801 |
|
21-Jan-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Revert "Revert "Fully support pairs in the register allocator."" This reverts commit c399fdc442db82dfda66e6c25518872ab0f1d24f. Change-Id: I19f8215c4b98f2f0827e04bf7806c3ca439794e5
|
77520bca97ec44e3758510cebd0f20e3bb4584ea |
|
12-Jan-2015 |
Calin Juravle <calin@google.com> |
Record implicit null checks at the actual invoke time. ImplicitNullChecks are recorded only for instructions directly (see NB below) preceded by NullChecks in the graph. This way we avoid recording redundant safepoints and minimize the code size increase. NB: ParallelMoves might be inserted by the register allocator between the NullChecks and their uses. These modify the environment and the correct action would be to reverse their modification. This will be addressed in a follow-up CL. Change-Id: Ie50006e5a4bd22932dcf11348f5a655d253cd898
|
c399fdc442db82dfda66e6c25518872ab0f1d24f |
|
21-Jan-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Revert "Fully support pairs in the register allocator." Libcore tests fail. This reverts commit 41aedbb684ccef76ff8373f39aba606ce4cb3194. Change-Id: I2572f120d4bbaeb7a4d4cbfd47ab00c9ea39ac6c
|
41aedbb684ccef76ff8373f39aba606ce4cb3194 |
|
14-Jan-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Fully support pairs in the register allocator. Enabled on ARM for longs and doubles. Change-Id: Id8792d08bd7ca9fb049c5db8a40ae694bafc2d8b
|
93edf73a5fecd526920fbd870068fa592376ac8a |
|
20-Jan-2015 |
Calin Juravle <calin@google.com> |
Use CompilerOptions for implicit stack overflow checks Change-Id: I52744382a7e3d2c6c11a43e027d87bf43ec4e62b
|
3747b48f7b09a9bc836397ceaacb9de0940db6fd |
|
19-Jan-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Address review comments. Comments were from: https://android-review.googlesource.com/#/c/121992. Change-Id: I8c59b30a356d606f12c50d0c8db916295a5c9e13
|
a8eef82f394f31272610d7ed80328ee465fa1a0f |
|
16-Jan-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Do not use STMP, it conflicts with the calling convention. Hard-float calling convention uses S14 and D7 for argument passing, so we cannot use them. Change-Id: I77a2d8c875677640204baebc24355051aa4175fd
|
cd6dffedf1bd8e6dfb3fb0c933551f9a90f7de3f |
|
08-Jan-2015 |
Calin Juravle <calin@google.com> |
Add implicit null checks for the optimizing compiler - for backends: arm, arm64, x86, x86_64 - fixed parameter passing for CodeGenerator - 003-omnibus-opcodes test verifies that NullPointerExceptions work as expected Change-Id: I1b302acd353342504716c9169a80706cf3aba2c8
|
42d1f5f006c8bdbcbf855c53036cd50f9c69753e |
|
16-Jan-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Do not use register pair in a parallel move. The ParallelMoveResolver does not work with pairs. Instead, decompose the pair into two individual moves. Change-Id: Ie9d3f0b078cef8dc20640c98b20bb20cc4971a7f
|
71fb52fee246b7d511f520febbd73dc7a9bbca79 |
|
30-Dec-2014 |
Andreas Gampe <agampe@google.com> |
ART: Optimizing compiler intrinsics Add intrinsics infrastructure to the optimizing compiler. Add almost all intrinsics supported by Quick to the x86-64 backend. Further intrinsics require more assembler support. Change-Id: I48de9b44c82886bb298d16e74e12a9506b8e8807
|
53f1262773516a247e7bfad50de3cd94a4dcf4df |
|
13-Jan-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Implement ParallelMoveResolver::Swap for doubles on arm. Currently reserve a global register DTMP for these operations. Change-Id: Ie88b4696af51834492fd062082335bc2e1137be2
|
af2c65c38449dfeb21b572887110c5c9a0008ca1 |
|
14-Jan-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Remove whitespace. Change-Id: I82f51cff87765a3aeeb861d2ae64978f2e762c73
|
69c15d340e7e76821bbc5d4494d4cef383774dee |
|
13-Jan-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Skip r1 on arm if first parameter is a long. Change-Id: I16d927ee0a0b55031ade4c92c0095fd74e18ed5b
|
425f239c291d435f519a1cf4bdd9ccc9a2c0c070 |
|
08-Jan-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Fix handling of long argument spanning register/memory. Comment in arm_lir.h says: * If a 64-bit argument would span the register/memory argument * boundary, it will instead be fully passed in the frame. This change implements such logic for all platforms. We still need to pass the low part in register as well because I haven't ported the jni compilers (x86 and mips) to it. Once the jni compilers are updated, we can remove the register assignment. Note that this greatly simplifies optimizing's register allocator by not having to understand a long spanning register and memory. Change-Id: I59706ca5d47269fc46e5489ac99bd6576e87e7f3
|
bdcedd301a0a417ca538b7bf7e684c60cb1dbda3 |
|
09-Jan-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Don't overwrite a register input. `addr` is a register input, which can survive the current instruction, therefore we can't overwrite it. Change-Id: I6eaa60e5f91c2b7b9b31673457d2a0d63474e587
|
840e5461a85f8908f51e7f6cd562a9129ff0e7ce |
|
07-Jan-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Implement double and float support for arm in register allocator. The basic approach is: - An instruction that needs two registers gets two intervals. - When allocating the low part, we also allocate the high part. - When splitting a low (or high) interval, we also split the high (or low) equivalent. - Allocation follows the (S/D register) requirement that low registers are always even and the high equivalent is low + 1. Change-Id: I06a5148e05a2ffc7e7555d08e871ed007b4c2797
|
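The pairing rule above (the low register is even and the high register is low + 1) can be sketched as a search over a bitmask of free registers. This is only an illustration under the assumption of a 32-register file tracked in one word; the function name is hypothetical and the real allocator works on live intervals, not a bare mask.

```cpp
#include <cstdint>

// Hypothetical helper: given a bitmask where bit i set means register i is
// free, returns the index of the low register of the first free pair
// (low, low + 1) with `low` even, or -1 if no such pair exists.
int FindFreeAlignedPair(uint32_t free_mask) {
  for (int low = 0; low + 1 < 32; low += 2) {
    const uint32_t pair_bits = 3u << low;  // bits for `low` and `low + 1`
    if ((free_mask & pair_bits) == pair_bits) {
      return low;
    }
  }
  return -1;
}
```

Two adjacent free registers such as 1 and 2 do not qualify, since the low half must be even; that is why a mask of 0x6 yields no pair while 0xE yields the pair starting at 2.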
3416601a9e9be81bb7494864287fd3602d18ef13 |
|
19-Dec-2014 |
Calin Juravle <calin@google.com> |
Look at instruction set features when generating volatiles code Change-Id: Ia882405719fdd60b63e4102af7e085f7cbe0bb2a
|
1cc7dbabd03e0a6c09d68161417a21bd6f9df371 |
|
18-Dec-2014 |
Andreas Gampe <agampe@google.com> |
ART: Reorder entrypoint argument order Shuffle the ArtMethod* referrer backwards for easier removal. Clean up ARM & MIPS assembly code. Change some macros to make future changes easier. Change-Id: Ie2862b68bd6e519438e83eecd9e1611df51d7945
|
52c489645b6e9ae33623f1ec24143cde5444906e |
|
16-Dec-2014 |
Calin Juravle <calin@google.com> |
[optimizing compiler] Add support for volatile - for backends: arm, x86, x86_64 - added necessary instructions to assemblies - clean up code gen for field set/get - fixed InstructionDataEquals for some instructions - fixed comments in compiler_enums * 003-opcode test verifies basic volatile functionality Change-Id: I144393efa312dfb2c332cb84056b00edffee338a
|
5b4b898ed8725242ee6b7229b94467c3ea3054c8 |
|
18-Dec-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Revert "Don't block quick callee saved registers for optimizing." X64 has one libcore test failing, and codegen_test on arm is failing. This reverts commit 6004796d6c630696127df2494dcd4f30d1367a34. Change-Id: I20e00431fa18e11ce4c0cb6fffa91977fa8e9b4f
|
6004796d6c630696127df2494dcd4f30d1367a34 |
|
15-Dec-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Don't block quick callee saved registers for optimizing. This change builds on: https://android-review.googlesource.com/#/c/118983/ - Also fix x86_64 assembler bug triggered by this change. - Fix (and improve) x86's backend byte register usage. - Fix a bug in baseline register allocator: a fixed out register must prevent inputs from allocating it. Change-Id: I4883862e29b4e4b6470f1823cf7eab7e7863d8ad
|
4e44c829e282b3979a73bfcba92510e64fbec209 |
|
17-Dec-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Revert "Small optimization for recursive calls: avoid dex cache." Fails on target. This reverts commit 390f59f9bec64fd81b05e796dfaeb03ab6d4cc81. Change-Id: Ic3865b8897068ba20df0fbc2bcf561faf6c290c1
|
390f59f9bec64fd81b05e796dfaeb03ab6d4cc81 |
|
12-Dec-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Small optimization for recursive calls: avoid dex cache. Change-Id: Ic4054b6c38f0a2a530ba6ef747647f86cee0b1b8
|
e53798a7e3267305f696bf658e418c92e63e0834 |
|
01-Dec-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Inlining support in optimizing. Currently only inlines simple things that don't require an environment, such as: - Returning a constant. - Returning a parameter. - Returning an arithmetic operation. Change-Id: Ie844950cb44f69e104774a3cf7a8dea66bc85661
|
d2ec87d84057174d4884ee16f652cbcfd31362e9 |
|
08-Dec-2014 |
Calin Juravle <calin@google.com> |
[optimizing compiler] Add REM_FLOAT and REM_DOUBLE - for arm, x86, x86_64 backends - reinstated fmod quick entry points for x86. This is a partial revert of bd3682eada753de52975ae2b4a712bd87dc139a6 which added inline assembly for floating point rem on x86. Note that Quick still uses the inline version. - fix rem tests for longs Change-Id: I73be19a9f2f2bcf3f718d9ca636e67bdd72b5440
|
4c0b61f506644bb6b647be05d02c5fb45b9ceb48 |
|
05-Dec-2014 |
Roland Levillain <rpl@google.com> |
Add support for double-to-int & double-to-long in optimizing. - Add support for the double-to-int and double-to-long Dex instructions in the optimizing compiler. - Add S1 to the list of ARM FPU parameter registers so that a double value can be passed as parameter during a call to the runtime through D0. - Have art::x86_64::X86_64Assembler::cvttsd2si work with 64-bit operands. - Generate x86, x86-64 and ARM (but not ARM64) code for double to int and double to long HTypeConversion nodes. - Add related tests to test/422-type-conversion. Change-Id: Ic93b9ec6630c26e940f7966a3346ad3fd5a2ab3a
|
8964e2b689d80fe546604ac8c724078645095cf1 |
|
04-Dec-2014 |
Roland Levillain <rpl@google.com> |
Add support for float-to-double & double-to-float in optimizing. Change-Id: I41b0fee5a28c83757697c8d000b7e224cf5a4534
|
624279f3c70f9904cbaf428078981b05d3b324c0 |
|
04-Dec-2014 |
Roland Levillain <rpl@google.com> |
Add support for float-to-long in the optimizing compiler. - Add support for the float-to-long Dex instruction in the optimizing compiler. - Add a Dex PC field to art::HTypeConversion to allow the x86 and ARM code generators to produce runtime calls. - Instruct art::CodeGenerator::RecordPcInfo not to record PC information for HTypeConversion instructions. - Add S0 to the list of ARM FPU parameter registers. - Have art::x86_64::X86_64Assembler::cvttss2si work with 64-bit operands. - Generate x86, x86-64 and ARM (but not ARM64) code for float to long HTypeConversion nodes. - Add related tests to test/422-type-conversion. Change-Id: I954214f0d537187883f83f7a83a1bb2dd8a21fd4
|
3f8f936aff35f29d86183d31c20597ea17e9789d |
|
02-Dec-2014 |
Roland Levillain <rpl@google.com> |
Add support for float-to-int in the optimizing compiler. - Add support for the float-to-int Dex instruction in the optimizing compiler. - Factor type conversion related lines in compiler/optimizing/builder.cc. - Generate x86, x86-64 and ARM (but not ARM64) code for float to int HTypeConversion nodes. - Add related tests to test/422-type-conversion. Change-Id: I2382dfc04bf394ed75f675148cfcf98216d65bc6
|
01fcc9ee556f98d0163cc9b524e989760826926f |
|
01-Dec-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Remove type conversion nodes converting to the same type. When optimizing, we ensure these conversions do not reach the code generators. When not optimizing, we cannot get such situations. Change-Id: I717247c957667675dc261183019c88efa3a38452
|
3bcc8ea079d867f26622defd0611d134a3b4ae49 |
|
28-Nov-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Don't use CanHoldArm in the code generator. CanHoldArm was ARM32 specific. Instead use a virtual Assembler::ShifterOperandCanHold that both thumb2 and arm32 implement. Change-Id: I33794a93caf02ee5d78d32a8471d9fd6fe4f0a00
|
6d0e483dd2e0b63e952de060738c10e2abd12ff7 |
|
27-Nov-2014 |
Roland Levillain <rpl@google.com> |
Add support for long-to-float in the optimizing compiler. - Add support for the long-to-float Dex instruction in the optimizing compiler. - Have art::x86_64::X86_64Assembler::cvtsi2ss work with 64-bit operands. - Generate x86, x86-64 and ARM (but not ARM64) code for long to float HTypeConversion nodes. - Add related tests to test/422-type-conversion. Change-Id: Ic983cbeb1ae2051add40bc519a8f00a6196166c9
|
199f336af1fc8212646fda67675df0361ece33d6 |
|
27-Nov-2014 |
Roland Levillain <rpl@google.com> |
Wrap long lines in the optimizing compiler. Change-Id: I5dee0c65e6652de574ae952b1f1dfc7355859e45
|
32b2a52aa3d6dc25c18422514c7f88757f87d33c |
|
27-Nov-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Fix Move64 by using ParallelMoves. Destination and source might overlap in a Move64, so we have to use a parallel move resolver. Change-Id: Ica6c72d91ab8e2e2ee4661b211ac1ee8f054b9ef
|
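The overlap problem named in the commit above ("Fix Move64 by using ParallelMoves") can be shown with a generic move sequentializer. This is a sketch of the general technique, not ART's parallel move resolver: `Move` and `Sequentialize` are hypothetical, registers are plain integers, destinations are assumed distinct, and `scratch` is assumed not to appear in any move.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Move {
  int src;
  int dst;
};

// Orders moves that should happen "simultaneously" so that no source is
// overwritten before it has been read. Cycles are broken through scratch.
std::vector<Move> Sequentialize(std::vector<Move> pending, int scratch) {
  std::vector<Move> out;
  while (!pending.empty()) {
    bool progressed = false;
    for (std::size_t i = 0; i < pending.size(); ++i) {
      // A move is safe to emit when no other pending move still reads its dst.
      bool dst_still_read = false;
      for (std::size_t j = 0; j < pending.size(); ++j) {
        if (j != i && pending[j].src == pending[i].dst) {
          dst_still_read = true;
          break;
        }
      }
      if (!dst_still_read) {
        out.push_back(pending[i]);
        pending.erase(pending.begin() + static_cast<std::ptrdiff_t>(i));
        progressed = true;
        break;
      }
    }
    if (!progressed) {
      // Every destination is still needed as a source: a cycle.
      // Save one source in scratch and redirect its readers.
      const int saved = pending[0].src;
      out.push_back({saved, scratch});
      for (Move& m : pending) {
        if (m.src == saved) m.src = scratch;
      }
    }
  }
  return out;
}
```

For a 64-bit move from the pair (r0, r1) to (r1, r2), the result emits r1 to r2 first and r0 to r1 second; emitting the low half first would clobber r1 before it was read.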
271ab9c916980209fbc6b26e5545d76e58471569 |
|
27-Nov-2014 |
Roland Levillain <rpl@google.com> |
Ensure opt. compiler doesn't get core & FP registers mixed up. Replace Location::As<T>() with two methods (Location::AsRegister<T>() and Location::AsFpuRegister<T>()) checking the kind of the location (register). Change-Id: I22b4abee1a124b684becd2dc1caf33652b911070
|
ddb7df25af45d7cd19ed1138e537973735cc78a5 |
|
25-Nov-2014 |
Calin Juravle <calin@google.com> |
[optimizing compiler] Add CMP{L,G}_{FLOAT,DOUBLE} Adds: - float comparison for arm, x86, x86_64 backends. - ucomis{s,d} assembly to x86 and x86_64. - vmstat assembly for thumb2 - new assembly tests Change-Id: Ie3e19d0c08b3b875cd0a4be4ee4e9c8a4a076290
|
647b9ed41cdb7cf302fd356627a3ba372419b78c |
|
27-Nov-2014 |
Roland Levillain <rpl@google.com> |
Add support for long-to-double in the optimizing compiler. - Add support for the long-to-double Dex instruction in the optimizing compiler. - Enable requests of temporary FPU (double) registers during code generation. - Fix art::x86::X86Assembler::LoadLongConstant and extend it to int64_t values. - Have art::x86_64::X86_64Assembler::cvtsi2sd work with 64-bit operands. - Generate x86, x86-64 and ARM (but not ARM64) code for long to double HTypeConversion nodes. - Add related tests to test/422-type-conversion. Change-Id: Ie73d9e5e25bd2e15f585c371e8fc2dcb83438ccd
|
91debbc3da3e3376416e4394155d9f9e355255cb |
|
26-Nov-2014 |
Calin Juravle <calin@google.com> |
Revert "[optimizing compiler] Add CMP{L,G}_{FLOAT,DOUBLE}" Fails on arm due to missing vmrs op after vcmp. I revert this instead of pushing the fix because I don't understand yet why it compiles with run-test but not with dex2oat. This reverts commit fd861249f31ab360c12dd1ffb131d50f02b0bfc6. Change-Id: Idc2d30f6a0f39ddd3596aa18a532ae90f8aaf62f
|
fd861249f31ab360c12dd1ffb131d50f02b0bfc6 |
|
25-Nov-2014 |
Calin Juravle <calin@google.com> |
[optimizing compiler] Add CMP{L,G}_{FLOAT,DOUBLE} - adds float comparison for arm, x86, x86_64 backends. - adds ucomis{s,d} assembly to x86 and x86_64. Change-Id: I232d2b6e9ecf373beb5cc63698dd97a658ff9c83
|
799f506b8d48bcceef5e6cf50f3f5eb6bcea05e1 |
|
26-Nov-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Revert "[optimizing compiler] Add CMP{L,G}_{FLOAT,DOUBLE}" Fails on x86_64 and target. This reverts commit cea28ec4b9e94ec942899acf1dbf20f8999b36b4. Change-Id: I30c1d188c7ecfe765f137a307022ede84f15482c
|
cea28ec4b9e94ec942899acf1dbf20f8999b36b4 |
|
25-Nov-2014 |
Calin Juravle <calin@google.com> |
[optimizing compiler] Add CMP{L,G}_{FLOAT,DOUBLE} - adds float comparison for arm, x86, x86_64 backends. - adds ucomis{s,d} assembly to x86 and x86_64. Change-Id: Ie91e04bfb402025073054f3803a3a569e4705caa
|
eace45873190a27302b3644c32ec82854b59d299 |
|
25-Nov-2014 |
Mathieu Chartier <mathieuc@google.com> |
Move dexCacheStrings from ArtMethod to Class Adds one load for const strings which are not direct. Saves >= 60KB of memory avg per app. Image size: -350KB. Bug: 17643507 Change-Id: I2d1a3253d9de09682be9bc6b420a29513d592cc8 (cherry picked from commit f521f423b66e952f746885dd9f6cf8ef2788955d)
|
9aec02fc5df5518c16f1e5a9b6cb198a192db973 |
|
19-Nov-2014 |
Calin Juravle <calin@google.com> |
[optimizing compiler] Add shifts Added SHL, SHR, USHR for arm, x86, x86_64. Change-Id: I971f594e270179457e6958acf1401ff7630df07e
|
86a8d7afc7f00ff0f5ea7b8aaf4d50514250a4e6 |
|
19-Nov-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Consistently use k{InstructionSet}WordSize. These constants were defined prior to k{InstructionSet}PointerSize. So use them consistently in optimizing as a first step. We can discuss whether we should remove them in a second step. Change-Id: If129de1a3bb8b65f8d9c816a8ad466815fb202e6
|
2d7210188805292e463be4bcf7a133b654d7e0ea |
|
10-Nov-2014 |
Mathieu Chartier <mathieuc@google.com> |
Change 64 bit ArtMethod fields to be pointer sized Changed the 64 bit entrypoint and gc map fields in ArtMethod to be pointer sized. This saves a large amount of memory on 32 bit systems. Reduces ArtMethod size by 16 bytes on 32 bit. Total number of ArtMethod on low memory mako: 169957 Image size: 49203 methods -> 787248 image size reduction. Zygote space size: 1070 methods -> 17120 size reduction. App methods: ~120k -> 2 MB savings. Savings per app on low memory mako: 125K+ per app (less active apps -> more image methods per app). Savings depend on how often the shared methods are on dirty pages vs shared. TODO in another CL, delete gc map field from ArtMethod since we should be able to get it from the Oat method header. Bug: 17643507 Change-Id: Ie9508f05907a9f693882d4d32a564460bf273ee8 (cherry picked from commit e832e64a7e82d7f72aedbd7d798fb929d458ee8f)
|
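The savings figures in the commit above ("Change 64 bit ArtMethod fields to be pointer sized") follow from multiplying each method count by the 16 bytes saved per ArtMethod on 32-bit. The snippet below only reproduces that arithmetic; `BytesSaved` is a hypothetical helper, and the counts are the ones quoted in the message.

```cpp
#include <cassert>
#include <cstdint>

// 16 bytes per ArtMethod on 32-bit is the figure quoted in the commit message.
const std::int64_t kBytesSavedPerMethod = 16;

std::int64_t BytesSaved(std::int64_t method_count) {
  assert(method_count >= 0);
  return method_count * kBytesSavedPerMethod;
}
```

`BytesSaved(49203)` gives the 787248-byte image reduction, `BytesSaved(1070)` gives the 17120-byte zygote reduction, and `BytesSaved(120000)` gives 1920000 bytes, which the message rounds to 2 MB.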
67555f7e9a05a9d436e034f67ae683bbf02d072d |
|
18-Nov-2014 |
Alexandre Rames <alexandre.rames@arm.com> |
Opt compiler: Add support for more IRs on arm64. Change-Id: I4b6425135d1af74912a206411288081d2516f8bf
|
e832e64a7e82d7f72aedbd7d798fb929d458ee8f |
|
10-Nov-2014 |
Mathieu Chartier <mathieuc@google.com> |
Change 64 bit ArtMethod fields to be pointer sized Changed the 64 bit entrypoint and gc map fields in ArtMethod to be pointer sized. This saves a large amount of memory on 32 bit systems. Reduces ArtMethod size by 16 bytes on 32 bit. Total number of ArtMethod on low memory mako: 169957 Image size: 49203 methods -> 787248 image size reduction. Zygote space size: 1070 methods -> 17120 size reduction. App methods: ~120k -> 2 MB savings. Savings per app on low memory mako: 125K+ per app (less active apps -> more image methods per app). Savings depend on how often the shared methods are on dirty pages vs shared. TODO in another CL, delete gc map field from ArtMethod since we should be able to get it from the Oat method header. Bug: 17643507 Change-Id: Ie9508f05907a9f693882d4d32a564460bf273ee8
|
cff137481eda0eb8dbdf9d2a303ae2bdac2c7322 |
|
17-Nov-2014 |
Roland Levillain <rpl@google.com> |
Add support for int-to-float & int-to-double in optimizing. - Add support for the int-to-float and int-to-double Dex instructions in the optimizing compiler. - Generate x86, x86-64 and ARM (but not ARM64) code for byte to float, short to float, int to float, char to float, byte to double, short to double, int to double and char to double HTypeConversion nodes. - Add related tests to test/422-type-conversion. Change-Id: I963f9d0184a5d3721af2d8f593f133d5af7aa6a3
|
bacfec30ee9f2f6fdfd190f11b105b609938efca |
|
14-Nov-2014 |
Calin Juravle <calin@google.com> |
[optimizing compiler] Add REM_INT, REM_LONG - for arm, x86, x86_64 - minor cleanup/fix in div tests Change-Id: I240874010206a5a9b3aaffbc81a885b94c248f93
|
01a8d7135c59b4a664d1e0c0e4d8db343d4118ef |
|
14-Nov-2014 |
Roland Levillain <rpl@google.com> |
Add support for int-to-short in the optimizing compiler. - Add support for the int-to-short Dex instruction in the optimizing compiler. - Generate x86, x86-64 and ARM (but not ARM64) code for byte to short, int to short and char to short HTypeConversion nodes. - Add related tests to test/422-type-conversion. Change-Id: If1829549708d9c3473efaa641f7f0bcfa6080ae9
|
af07bc121121d7bd7e8329c55dfe24782207b561 |
|
12-Nov-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Minor object store optimizations. - Avoid emitting write barrier when the value is null. - Do not do a typecheck on an arraystore when storing something that was loaded from the same array. Change-Id: I902492928692e4553b5af0fc99cce3c2186c442a
|
981e45424f52735b1c61ae0eac7e299ed313f8db |
|
14-Nov-2014 |
Roland Levillain <rpl@google.com> |
Add support for int-to-char in the optimizing compiler. - Add support for the int-to-char Dex instruction in the optimizing compiler. - Implement the ARM and Thumb-2 UBFX instructions and add tests for them. - Generate x86, x86-64 and ARM (but not ARM64) code for byte to char, short to char, int to char (and char to char!) HTypeConversion nodes. - Add related tests to test/422-type-conversion. Change-Id: I5cd4c6d86f0f6a966c059715b98db35cc8f9de76
|
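The commit above ("Add support for int-to-char in the optimizing compiler") introduces UBFX, an unsigned bitfield extract. A plain C++ model of the instruction shows why it fits int-to-char, which keeps the low 16 bits zero-extended. `Ubfx` and `IntToChar` are illustrative names, and the claim that the backend emits exactly this extract is an assumption based on the message.

```cpp
#include <cassert>
#include <cstdint>

// Models UBFX: extract `width` bits starting at bit `lsb`, zero-extended.
std::uint32_t Ubfx(std::uint32_t value, int lsb, int width) {
  assert(lsb >= 0 && width >= 1 && lsb + width <= 32);
  const std::uint32_t mask =
      (width == 32) ? 0xFFFFFFFFu : ((1u << width) - 1u);
  return (value >> lsb) & mask;
}

// Dex int-to-char: truncate to 16 bits without sign extension.
std::uint16_t IntToChar(std::int32_t value) {
  return static_cast<std::uint16_t>(
      Ubfx(static_cast<std::uint32_t>(value), 0, 16));
}
```

Because the extract zero-extends, `IntToChar(-1)` yields 0xFFFF rather than a negative value.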
51d3fc40637fc73d4156ad617cd451b844cbb75e |
|
13-Nov-2014 |
Roland Levillain <rpl@google.com> |
Add support for int-to-byte in the optimizing compiler. - Add support for the int-to-byte Dex instruction in the optimizing compiler. - Implement the ARM and Thumb-2 SBFX instructions. - Generate x86, x86-64 and ARM (but not ARM64) code for char to byte, short to byte and int to byte HTypeConversion nodes. - Add related tests to test/422-type-conversion. Change-Id: Ic8b8911b90d4b5281fad15bcee96bc3ee85dc577
|
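The commit above ("Add support for int-to-byte in the optimizing compiler") relies on SBFX, a signed bitfield extract. The model below shows the sign extension that int-to-byte needs. `Sbfx` and `IntToByte` are illustrative names, and the mapping of int-to-byte onto an extract of bits 0 to 7 is an assumption drawn from the message rather than from the generated code.

```cpp
#include <cassert>
#include <cstdint>

// Models SBFX: extract `width` bits starting at bit `lsb`, sign-extended.
std::int32_t Sbfx(std::int32_t value, int lsb, int width) {
  assert(lsb >= 0 && width >= 1 && lsb + width <= 32);
  const std::uint32_t mask =
      (width == 32) ? 0xFFFFFFFFu : ((1u << width) - 1u);
  const std::uint32_t field = (static_cast<std::uint32_t>(value) >> lsb) & mask;
  // Flip the field's top bit, then subtract it, to propagate the sign.
  const std::uint32_t sign = 1u << (width - 1);
  return static_cast<std::int32_t>((field ^ sign) - sign);
}

// Dex int-to-byte: keep the low 8 bits, sign-extended.
std::int32_t IntToByte(std::int32_t value) {
  return Sbfx(value, 0, 8);
}
```

`IntToByte(200)` yields -56, matching a Java `(byte) 200` cast.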
a21f598fd4dfdb95dc8597d3156120cc20d94c02 |
|
13-Nov-2014 |
Calin Juravle <calin@google.com> |
[optimizing compiler] Fix Move for instruction with constant output Change-Id: I15d89292dc62f8dd8643530f95ace2e8be034411
|
d6fb6cfb6f2d0d9595f55e8cc18d2753be5d9a13 |
|
11-Nov-2014 |
Calin Juravle <calin@google.com> |
[optimizing compiler] Add DIV_LONG - for backends: arm, x86, x86_64 - added cqo, idivq, testq assembly for x86_64 - small cleanups Change-Id: I762ef37880749038ed25d6014370be9a61795200
|
f97f9fbfdf7f2e23c662f21081fadee6af37809d |
|
11-Nov-2014 |
Calin Juravle <calin@google.com> |
[optimizing compiler] add HTemporary support for long and doubles Change-Id: I5247ecd71d0193050484b7632c804c9bfd20f924
|
f0e3937b87453234d0d7970b8712082062709b8d |
|
12-Nov-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Do a parallel move in BoundsCheckSlowPath. The two locations of the index and length could overlap, so we need a parallel move. Also factorize the code for doing a parallel move based on two locations. Change-Id: Iee8b3459e2eed6704d45e9a564fb2cd050741ea4
|
9574c4b5f5ef039d694ac12c97e25ca02eca83c0 |
|
12-Nov-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Implement and/or/xor in optimizing. Change-Id: I7cf6da1fd334a7177a5580931b8f174dd40b7cec
|
b7baf5c58d0e864f8c3f889357c51288aed42e61 |
|
11-Nov-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Implement monitorenter/monitorexit. Pretty simple as they just invoke the runtime. Change-Id: I5fcb2c783deac27e55e28d8b3da3e68ea4b77363
|
57a88d4ac205874dc85d22f9f6a9ca3c4c373eeb |
|
10-Nov-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Implement checkcast for optimizing. - Ended up not using HTypeCheck because of how instanceof and checkcast end up having different logic for code generation. - Fix a x86_64 assembler bug triggered by now enabling more methods to be compiled. Difficult to test today without b/18117217. Change-Id: I3022e7ae03befb1d10bea9637ad21fadc430abe0
|
946e143941d456a4ec666f7f54719c65c5aa3f5d |
|
11-Nov-2014 |
Roland Levillain <rpl@google.com> |
Revert "Revert "Add support for long-to-int in the optimizing compiler."" This reverts commit 3adfd1b4fb20ac2b0217b5d2737bfe30ad90257a. Change-Id: Iacf0c6492d49267e24f1b727dbf6379b21fd02db
|
3adfd1b4fb20ac2b0217b5d2737bfe30ad90257a |
|
11-Nov-2014 |
Roland Levillain <rpl@google.com> |
Revert "Add support for long-to-int in the optimizing compiler." This reverts commit 647b96f29cb81832e698f863884fdba06674c9de. Change-Id: I552f23585463c676acbd547521b4d3ee5c0342eb
|
647b96f29cb81832e698f863884fdba06674c9de |
|
11-Nov-2014 |
Roland Levillain <rpl@google.com> |
Add support for long-to-int in the optimizing compiler. - Add support for the long-to-int Dex instruction in the optimizing compiler. - Generate x86, x86-64 and ARM (but not ARM64) code for long-to-int HTypeConversion nodes. - Add related tests to test/422-type-conversion. - Also fix comments in test/415-optimizing-arith-neg and in test/416-optimizing-arith-not. Change-Id: I3084af30f2a495d178362ae1154dc7ceb7bf3a58
|
666c732cfa211abf44ed90120a87bf8c18138e55 |
|
10-Nov-2014 |
Roland Levillain <rpl@google.com> |
Support Java conversions from char to long in opt. compiler. These char to long conversions generate int-to-long Dex instructions. Change-Id: I6a8e71b57870cf5e8d5bc638fabce0fc7593f0b2
|
52839d17c06175e19ca4a093fb878450d1c4310d |
|
07-Nov-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Support invoke-interface in optimizing. Change-Id: Ic18d7c3d2810557231caf0571956e0c431f5d384
|
6f5c41f9e409bc4da53b5d7c385202255e391e72 |
|
06-Nov-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Implement instanceof in optimizing. - Only fast-path for now: null or same class. - Use pQuickInstanceofNonTrivial for slow path. Change-Id: Ic5196b94bef792f081f3cb4d15157058e1381e6b
|
f43083d560565aea46c602adb86423daeefe589d |
|
07-Nov-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Do not update Out after it has a valid location. Slow paths use LocationSummary to know where to move things around, and they are executed at the end of the code generation. This fix is needed for https://android-review.googlesource.com/#/c/113345/. Change-Id: Id336c6409479b1de6dc839b736a7234d08a7774a
|
52e832b1278449e62d9eb502d54d5ff18f8606ed |
|
06-Nov-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Support floats and doubles in fields. Change-Id: I19832106633405403f0461b3fe13b268abe39db3
|
de58ab2c03ff8112b07ab827c8fa38f670dfc656 |
|
05-Nov-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Implement try/catch/throw in optimizing. - We currently don't run optimizations in the presence of a try/catch. - We therefore implement Quick's mapping table. - Also fix a missing null check on array-length. Change-Id: I6917dfcb868e75c1cf6eff32b7cbb60b6cfbd68f
|
3dbcb38a8b2237b0da290ae35dc0caab3cb47b3d |
|
28-Oct-2014 |
Roland Levillain <rpl@google.com> |
Support float & double negation in the optimizing compiler. - Add support for the neg-float and neg-double Dex instructions in the optimizing compiler. - Generate x86, x86-64 and ARM (but not ARM64) code for float and double HNeg nodes. - Add related tests to test/415-optimizing-arith-neg. Change-Id: I29739a86e13dbe6f64e191641d01637c867cba6c
|
d0d4852847432368b090c184d6639e573538dccf |
|
04-Nov-2014 |
Calin Juravle <calin@google.com> |
[optimizing compiler] Add div-int and exception handling. - for backends: arm, x86, x86_64 - fixed a register allocator bug: the request for a fixed register for the first input was ignored if the output was kSameAsFirstInput - added divide by zero exception - more tests - shuffle around some code in the builder to reduce the number of lines of code for a single function. Change-Id: Id3a515e02bfbc66cd9d16cb9746f7551bdab3d42
|
44b819e8d2ac48f1f66915ec1fab36ad247eb4d9 |
|
06-Nov-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Don't use R4 for suspend check. Change-Id: I57ccfaa527e2676f21339860b28955468a87adfe
|
dff1f2812ecdaea89978c5351f0c70cdabbc0821 |
|
05-Nov-2014 |
Roland Levillain <rpl@google.com> |
Support int-to-long conversions in the optimizing compiler. - Add support for the int-to-long Dex instruction in the optimizing compiler. - Add a HTypeConversion node type for control-flow graphs. - Generate x86, x86-64 and ARM (but not ARM64) code for int-to-long HTypeConversion nodes. - Add a 64-bit "Move doubleword to quadword with sign-extension" (MOVSXD) instruction to the x86-64 assembler. - Add related tests to test/422-type-conversion. Change-Id: Ieb8ec5380f9c411857119c79aa8d0728fd10f780
|
277ccbd200ea43590dfc06a93ae184a765327ad0 |
|
04-Nov-2014 |
Andreas Gampe <agampe@google.com> |
ART: More warnings Enable -Wno-conversion-null, -Wredundant-decls and -Wshadow in general, and -Wunused-but-set-parameter for GCC builds. Change-Id: I81bbdd762213444673c65d85edae594a523836e5
|
424f676379f2f872acd1478672022f19f3240fc1 |
|
03-Nov-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Implement CONST_CLASS in optimizing compiler. Change-Id: Ia8c8dfbef87cb2f7893bfb6e178466154eec9efd
|
6a3c1fcb4ba42ad4d5d142c17a3712a6ddd3866f |
|
31-Oct-2014 |
Ian Rogers <irogers@google.com> |
Remove -Wno-unused-parameter and -Wno-sign-promo from base cflags. Fix associated errors about unused parameters and implicit sign conversions. For sign conversion this was largely in the area of enums, so add ostream operators for the affected enums and fix tools/generate-operator-out.py. Tidy arena allocation code and arena allocated data types, rather than fixing new and delete operators. Remove dead code. Change-Id: I5b433e722d2f75baacfacae4d32aef4a828bfe1b
|
b5f62b3dc5ac2731ba8ad53cdf3d9bdb14fbf86b |
|
30-Oct-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Support for CONST_STRING in optimizing compiler. Change-Id: Iab8517bdadd1d15ffbe570010f093660be7c51aa
|
0a6c459f713ff61769a02204cd736167e062bf4c |
|
30-Oct-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Fix for long parameter passed both in stack and register. Fix for long parameter passed both in stack and register on 32bits architectures. The move to hard float ABI makes it so that the register index does not necessarily match the stack index anymore. Change-Id: I26b483f68ac86d336b4a37d94c38b04917668659
|
19a19cffd197a28ae4c9c3e59eff6352fd392241 |
|
22-Oct-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Add support for static fields in optimizing compiler. Change-Id: Id2f010589e2bd6faf42c05bb33abf6816ebe9fa9
|
7c4954d429626a6ceafbf05be41bf5f840894e44 |
|
28-Oct-2014 |
Calin Juravle <calin@google.com> |
[optimizing compiler] Add division for floats and doubles backends: x86, x86_64, arm. Also: - ordered instructions based on their name. - add missing kNoOutputOverlap to add/sub/mul. Change-Id: Ie47cde3b15ac74e7a1660c67a2eed1d7871f0ad0
|
3c03503d66df3b4440f851ae7d0c4fae5e7872df |
|
28-Oct-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Follow-up CL after hard float changes. Addressing comments from Zheng Xu. Change-Id: I8c599cdfab03373e82a1b90b711005c490bc6ca0
|
705664321a5cc1418255172f92d7d7195cf60a7b |
|
24-Oct-2014 |
Roland Levillain <rpl@google.com> |
Add long bitwise not instruction in the optimizing compiler. - Add support for the not-long (long integer one's complement negation) instruction in the optimizing compiler. - Add a 64-bit NOT instruction (notq) to the x86-64 assembler. - Generate ARM, x86 and x86-64 code for long HNot nodes. - Gather not-related tests in test/416-optimizing-arith-not. Change-Id: I2d5b75e9875664d6032d04f8401b2bbb84506948
|
1ba0f596e9e4ddd778ab431237d11baa85594eba |
|
27-Oct-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Support hard float on arm in optimizing compiler. Also bump oat version, needed after latest hard float switch. Change-Id: Idf5acfb36c07e74acff00edab998419a3c6b2965
|
2e07b4f0a84a7968b4690c2b1be2e2f75cc6fa8e |
|
23-Oct-2014 |
Roland Levillain <rpl@google.com> |
Revert "Revert "Implement long negate instruction in the optimizing compiler."" This reverts commit 30ca3d847fe72cfa33e1b2473100ea2d8bea4517. Change-Id: I188ca8d460d55d3a9966bcf31e0588575afa77d2
|
30ca3d847fe72cfa33e1b2473100ea2d8bea4517 |
|
23-Oct-2014 |
Roland Levillain <rpl@google.com> |
Revert "Implement long negate instruction in the optimizing compiler." This reverts commit 66ce173a40eff4392e9949ede169ccf3108be2db.
|
66ce173a40eff4392e9949ede169ccf3108be2db |
|
23-Oct-2014 |
Roland Levillain <rpl@google.com> |
Implement long negate instruction in the optimizing compiler. - Add support for the neg-long (long integer two's complement negate) instruction in the optimizing compiler. - Add a 64-bit NEG instruction (negq) to the x86-64 assembler. - Generate ARM, x86 and x86-64 code for integer HNeg nodes. - Put neg-related tests into test/415-optimizing-arith-neg. Change-Id: I1fbe9611e134408a6b8745d1df20ab6ffa5e50f2
|
1135168a1a9e2a6493657be8c5e91d67e5f224a7 |
|
23-Oct-2014 |
Calin Juravle <calin@google.com> |
[optimizing compiler] Add float/double subtraction - for arm, x86, x86_64 - add tests - a bit of clean up Change-Id: I3761b0d908aca3e3c5d60da481fafb423ff7c9b9
|
1cc5f251df558b0e22cea5000626365eb644c727 |
|
22-Oct-2014 |
Roland Levillain <rpl@google.com> |
Implement int bit-wise not operation in the optimizing compiler. - Add support for the not-int (integer one's complement negate) instruction in the optimizing compiler. - Extend the HNot control-flow graph node type and make it inherit from HUnaryOperation. - Generate ARM, x86 and x86-64 code for integer HNot nodes. - Exercise these additions in the codegen_test gtest, as there is no direct way to assess the support of not-int from a Java source. Indeed, compiling a Java expression such as `~a' using javac and then dx generates an xor-int/lit8 Dex instruction instead of the expected not-int Dex instruction. This is probably because the Java bytecode has an `ixor' instruction, but there's no instruction directly corresponding to a bit-wise not operation. Change-Id: I223aed75c4dac5785e04d99da0d22e8d699aee2b
|
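The observation in the commit above ("Implement int bit-wise not operation in the optimizing compiler"), that `~a` reaches Dex as an xor against a literal, rests on a simple identity: xor with all ones flips every bit. The function below is only a demonstration of that identity; `NotViaXor` is a hypothetical name.

```cpp
#include <cassert>
#include <cstdint>

// Bitwise not expressed as xor with -1 (all bits set), the form the
// commit message says javac and dx produce for `~a`.
std::int32_t NotViaXor(std::int32_t a) {
  return a ^ -1;
}
```

Since -1 fits in an 8-bit literal, the xor form needs no separate constant load, which is consistent with the xor-int/lit8 encoding the message reports.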
b5bfa96ff20e86316961327dec5c859239dab6a0 |
|
21-Oct-2014 |
Calin Juravle <calin@google.com> |
Add multiplication for floats/doubles in optimizing compiler Change-Id: I61de8ce1d9e37e30db62e776979b3f22dc643894
|
a3d05a40de076aabf12ea284c67c99ff28b43dbf |
|
20-Oct-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Implement array creation related DEX instructions. Implement new-array, filled-new-array, and fill-array-data. Change-Id: I405560d66777a57d881e384265322617ac5d3ce3
|
b762d2ebf9dc604561d9915c96b377235c94960c |
|
22-Oct-2014 |
Roland Levillain <rpl@google.com> |
Various fixes related to integer negate operations. - Emit an RSB instruction for HNeg nodes in the ARM code generator instead of RSBS, as we do not need to update the condition code flags in this case. - Simply punt when trying to statically evaluate a long unary operation, instead of aborting. - Move a test case to the right place. Change-Id: I35eb8dea58ed35258d4d8df77181159c3ab07b6f
|
102cbed1e52b7c5f09458b44903fe97bb3e14d5f |
|
15-Oct-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Implement register allocator for floating point registers. Also: - Fix misuses of emitting the rex prefix in the x86_64 assembler. - Fix movaps code generation in the x86_64 assembler. Change-Id: Ib6dcf6e7c4a9c43368cfc46b02ba50f69ae69cbe
|
88cb1755e1d6acaed0f66ce65d7a2a4465053342 |
|
20-Oct-2014 |
Roland Levillain <rpl@google.com> |
Implement int negate instruction in the optimizing compiler. - Add support for the neg-int (integer two's complement negate) instruction in the optimizing compiler. - Add a HNeg node type for control-flow graphs and an intermediate HUnaryOperation base class. - Generate ARM, x86 and x86-64 code for integer HNeg nodes. Change-Id: I72fd3e1e5311a75c38a8cb665a9211a20325a42e
|
8e3964b766652a0478e8e0e303e8556c997675f1 |
|
17-Oct-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Remove the notion of dies at entry. - Instead, explicitly say that the output does not overlap. - Inputs that must be in a fixed register do die at entry, as we know they have a location that others can not take. - There is also no need to differentiate between an input move and a connecting sibling move - those can be put in the same parallel move instruction. Change-Id: I1b2b2827906601f822b59fb9d6a21d48e43bae27
|
34bacdf7eb46c0ffbf24ba7aa14a904bc9176fb2 |
|
07-Oct-2014 |
Calin Juravle <calin@google.com> |
Add multiplication for integral types This also fixes an issue where we could allocate a pair register even if one of its parts was already blocked. Change-Id: I4869175933409add2a56f1ccfb369c3d3dd3cb01
|
92a73aef279be78e3c2b04db1713076183933436 |
|
16-Oct-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Don't use assembler classes in code_generator.h. The arm64 backend uses its own assembler and does not share the same classes as the other backends. To avoid conflicts or unnecessary mappings, just don't use those classes in the shared part of the code generator. Change-Id: I9e5fa40c1021d2e83a4ef14c52cd1ccd03f2f73d
|
3a3fd0f8d3981691aa2331077a8fae5feee08dd1 |
|
10-Oct-2014 |
Roland Levillain <rpl@google.com> |
Turn constant conditional jumps into unconditional jumps. If a condition (input of an art::HIf instruction) is constant (an art::HConstant object), evaluate it at compile time and generate an unconditional branch instruction if it is true (in lieu of a conditional jump). Change-Id: I262e43ffe66d5c25dbbfa98092a41c8b3c4c75d6
|
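The decision described in the commit above ("Turn constant conditional jumps into unconditional jumps") can be reduced to a small selection function. This is a sketch, not the code generator: `BranchKind` and `SelectBranch` are hypothetical, and the handling of a constant-false condition (jumping to the false successor) is an assumption, since the message only spells out the true case.

```cpp
#include <cassert>

enum class BranchKind {
  kConditional,             // condition only known at run time
  kAlwaysToTrueSuccessor,   // constant true: unconditional branch
  kAlwaysToFalseSuccessor   // constant false: assumed behavior, see lead-in
};

BranchKind SelectBranch(bool condition_is_constant, bool constant_value) {
  if (!condition_is_constant) {
    return BranchKind::kConditional;
  }
  // The condition is evaluated at compile time, so no compare is emitted.
  return constant_value ? BranchKind::kAlwaysToTrueSuccessor
                        : BranchKind::kAlwaysToFalseSuccessor;
}
```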
71175b7f19a4f6cf9cc264feafd820dbafa371fb |
|
09-Oct-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Cleanup baseline register allocator. - Use three arrays for blocking registers instead of one and computing offsets in that array. - Don't pass blocked_registers_ to methods, just use the field. Change-Id: Ib698564c31127c59b5a64c80f4262394b8394dc6
|
fc787ecd91127b2c8458afd94e5148e2ae51a1f5 |
|
10-Oct-2014 |
Ian Rogers <irogers@google.com> |
Enable -Wimplicit-fallthrough. Falling through switch cases on a clang build must now annotate the fallthrough with the FALLTHROUGH_INTENDED macro. Bug: 17731372 Change-Id: I836451cd5f96b01d1ababdbf9eef677fe8fa8324
|
476df557fed5f0b3f32f8d11a654674bb403a8f8 |
|
09-Oct-2014 |
Roland Levillain <rpl@google.com> |
Use Is*() helpers to shorten code in the optimizing compiler. Change-Id: I79f31833bc9a0aa2918381aa3fb0b05d45f75689
|
360231a056e796c36ffe62348507e904dc9efb9b |
|
08-Oct-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Fix code generation of materialized conditions. Move the logic for knowing if a condition needs to be materialized in an optimization pass (so that the information does not change as a side effect of another optimization). Also clean-up arm and x86_64 codegen: - arm: ldr and str are for power-users when a constant is in play. We should use LoadFromOffset and StoreToOffset. - x86_64: fix misuses of movq instead of movl. Change-Id: I01a03b91803624be2281a344a13ad5efbf4f3ef3
|
56b9ee6fe1d6880c5fca0e7feb28b25a1ded2e2f |
|
09-Oct-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Stop converting from Location to ManagedRegister. Now the source of truth is the Location object that knows which register (core, pair, fpu) it needs to refer to. Change-Id: I62401343d7479ecfb24b5ed161ec7829cda5a0b1
|
7e70b002c4552347ed1af8c002a0e13f08864f20 |
|
08-Oct-2014 |
Ian Rogers <irogers@google.com> |
Header file clean up. Remove runtime.h from object.h. Move TypeStaticIf to its own header file to avoid bringing utils.h into allocator.h. Move Array::DataOffset into -inl.h as it now has a utils.h dependency. Fix include issues arising from this. Change-Id: I4605b1aa4ff5f8dc15706a0132e15df03c7c8ba0
|
01ef345767ea609417fc511e42007705c9667546 |
|
01-Oct-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Add trivial register hints to the register allocator. - Add hints for phis, same as first input, and expected registers. - Make the if instruction accept non-condition instructions. Change-Id: I34fa68393f0d0c19c68128f017b7a05be556fbe5
|
7fb49da8ec62e8a10ed9419ade9f32c6b1174687 |
|
06-Oct-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Add support for floats and doubles. - Follows Quick conventions. - Currently only works with baseline register allocator. Change-Id: Ie4b8e298f4f5e1cd82364da83e4344d4fc3621a3
|
26a25ef62a13f409f941aa39825a51b4d6f0f047 |
|
30-Sep-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Add a prepare for register allocation pass. - Currently the pass just changes the uses of checks to the actual values. - Also optimize array access, now that inputs can be constants. - And fix another bug in the register allocator revealed by this change. Change-Id: I43be0dbde9330ee5c8f9d678de11361292d8bd98
|
9ae0daa60c568f98ef0020e52366856ff314615f |
|
30-Sep-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Add support for inputs dying at entry of instructions. - Start using it in places where it makes sense. - Also improve suspend check on arm to use subs directly. Change-Id: I09ac0589f5ccb9b850ee757c76dcbcf35ee8cd01
|
5799fc0754da7ff2b50b472e05c65cd4ba32dda2 |
|
25-Sep-2014 |
Roland Levillain <rpl@google.com> |
Optimizing compiler: remove unnecessary `explicit' keywords. Change-Id: I5927fd92d53308c81e14edbd6e7d1c943bfa085b
|
3c04974a90b0e03f4b509010bff49f0b2a3da57f |
|
24-Sep-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Optimize suspend checks in optimizing compiler. - Remove the ones added during graph build (they were added for the baseline code generator). - Emit them at loop back edges after phi moves, so that the test can directly jump to the loop header. - Fix x86 and x86_64 suspend check by using cmpw instead of cmpl. Change-Id: I6fad5795a55705d86c9e1cb85bf5d63dadfafa2a
|
3bca0df855f0e575c6ee020ed016999fc8f14122 |
|
19-Sep-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Support for saving and restoring live registers in a slow path. And use it in suspend check slow paths. Change-Id: I79caf28f334c145a36180c79a6e2fceae3990c31
|
18efde5017369e005f1e8bcd3bbfb04e85053640 |
|
22-Sep-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Fix code generation with materialized conditions. Change-Id: I8630af3c13fc1950d3fa718d7488407b00898796
|
e982f0b8e809cece6f460fa2d8df25873aa69de4 |
|
13-Aug-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Implement invoke virtual in optimizing compiler. Also refactor 004 tests to make them work with both Quick and Optimizing. Change-Id: I87e275cb0ae0258fc3bb32b612140000b1d2adf8
|
fbc695f9b8e2084697e19c1355ab925f99f0d235 |
|
15-Sep-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Revert "Revert "Implement suspend checks in new compiler."" This reverts commit 7e3652c45c30c1f2f840e6088e24e2db716eaea7. Change-Id: Ib489440c34e41cba9e9e297054f9274f6e81a2d8
|
7e3652c45c30c1f2f840e6088e24e2db716eaea7 |
|
15-Sep-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Revert "Implement suspend checks in new compiler." This reverts commit 6fbce029fba3ed5da6c36017754ed408e6bcb632. Change-Id: Ia915c27873b021e658a10212e559095dfc91284e
|
6fbce029fba3ed5da6c36017754ed408e6bcb632 |
|
10-Sep-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Implement suspend checks in new compiler. For simplicity, they are currently placed on all (dex-level) back edges, and at method entry. Change-Id: I6e833e244d559dd788c69727e22fe40aff5b3435
|
3946844c34ad965515f677084b07d663d70ad1b8 |
|
02-Sep-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Runtime support for the new stack maps for the opt compiler. Now most of the methods supported by the compiler can be optimized, instead of using the baseline. Change-Id: I80ab36a34913fa4e7dd576c7bf55af63594dc1fa
|
b038ba66a166fb264ca121632f447712e0973b5b |
|
14-Aug-2014 |
Dave Allison <dallison@google.com> |
Revert "Revert "Reduce stack usage for overflow checks"" Fixes stack protection issue. Fixes mac build issue. This reverts commit 83b1940e6482b9d8feba5c492507735686650ea5. Change-Id: I7ba17252882b23a740bcda2ea94aacf398255406
|
4cf00ba324f5f6884059796a6ba41937f32e1844 |
|
14-Aug-2014 |
Dave Allison <dallison@google.com> |
Revert "Reduce stack usage for overflow checks" This reverts commit 63c051a540e6dfc806f656b88ac3a63e99395429. Change-Id: I282a048994fcd130fe73842b16c21680053c592f
|
03c9785a8a6d712775cf406c4371d0227c44148f |
|
14-Aug-2014 |
Dave Allison <dallison@google.com> |
Revert "Revert "Reduce stack usage for overflow checks"" Fixes stack protection issue. Fixes mac build issue. This reverts commit 83b1940e6482b9d8feba5c492507735686650ea5. Change-Id: I7ba17252882b23a740bcda2ea94aacf398255406
|
83b1940e6482b9d8feba5c492507735686650ea5 |
|
14-Aug-2014 |
Dave Allison <dallison@google.com> |
Revert "Reduce stack usage for overflow checks" This reverts commit 63c051a540e6dfc806f656b88ac3a63e99395429. Change-Id: I282a048994fcd130fe73842b16c21680053c592f
|
63c051a540e6dfc806f656b88ac3a63e99395429 |
|
26-Jul-2014 |
Dave Allison <dallison@google.com> |
Reduce stack usage for overflow checks This reduces the stack space reserved for overflow checks to 12K, split into an 8K gap and a 4K protected region. GC needs over 8K when running in a stack overflow situation. Also prevents signal runaway by detecting a signal inside code that resulted from a signal handler invocation. And adds a max signal count to the SignalTest to prevent it running forever. Also reduces the number of iterations for the InterfaceTest as this was taking (almost) forever with the --trace option on run-test. Bug: 15435566 Change-Id: Id4fd46f22d52d42a9eb431ca07948673e8fda694 Conflicts: compiler/optimizing/code_generator_x86_64.cc runtime/arch/x86/fault_handler_x86.cc runtime/arch/x86_64/quick_entrypoints_x86_64.S
|
648d7112609dd19c38131b3e71c37bcbbd19d11e |
|
26-Jul-2014 |
Dave Allison <dallison@google.com> |
Reduce stack usage for overflow checks This reduces the stack space reserved for overflow checks to 12K, split into an 8K gap and a 4K protected region. GC needs over 8K when running in a stack overflow situation. Also prevents signal runaway by detecting a signal inside code that resulted from a signal handler invocation. And adds a max signal count to the SignalTest to prevent it running forever. Also reduces the number of iterations for the InterfaceTest as this was taking (almost) forever with the --trace option on run-test. Bug: 15435566 Change-Id: Id4fd46f22d52d42a9eb431ca07948673e8fda694
|
3c7bb98698f77af10372cf31824d3bb115d9bf0f |
|
23-Jul-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Implement array get and array put in optimizing. Also fix a couple of assembler/disassembler issues. Change-Id: I705c8572988c1a9c4df3172b304678529636d5f6
|
f12feb8e0e857f2832545b3f28d31bad5a9d3903 |
|
17-Jul-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Stack overflow checks and NPE checks for optimizing. Change-Id: I59e97448bf29778769b79b51ee4ea43f43493d96
|
1a43dd78d054dbad8d7af9ba4829ea2f1cb70b53 |
|
17-Jul-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Add write barriers to optimizing compiler. Change-Id: I43a40954757f51d49782e70bc28f7c314d6dbe17
|
96f89a290eb67d7bf4b1636798fa28df14309cc7 |
|
11-Jul-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Add assembly operations with constants in optimizing compiler. Change-Id: I5bcc35ab50d4457186effef5592a75d7f4e5b65f
|
8d486731559ba0c5e12c27b4a507181333702b7e |
|
16-Jul-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Use the thumb2 assembler for the optimizing compiler. Change-Id: I2b058f4433504dc3299c06f5cb0b5ab12f34aa82
|
ab032bc1ff57831106fdac6a91a136293609401f |
|
15-Jul-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Fix a braino in the stack layout. Also do some refactoring to have this code be just in CodeGenerator. Change-Id: I88de109889138af8d60027973c12a64bee813cb7
|
e50383288a75244255d3ecedcc79ffe9caf774cb |
|
04-Jul-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Support fields in optimizing compiler. - Required support for temporaries, to be only used by baseline compiler. - Also fixed a few invalid assumptions around locations and instructions that don't need materialization. These instructions should not have an Out. Change-Id: Idc4a30dd95dd18015137300d36bec55fc024cf62
|
412f10cfed002ab617c78f2621d68446ca4dd8bd |
|
19-Jun-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Support longs in the register allocator for x86_64. Change-Id: I7fb6dfb761bc5cf9e5705682032855a0a70ca867
|
f61b5377068f22c0be7b2f6e62961e620408beb2 |
|
25-Jun-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Re-enable tests with the optimizing compiler. Tests run ok on my host/target. I reverted the move to using thumb2, because tests were crashing. But I could not reproduce file limits issues. Mark SignalTest as crashing for optimizing. We need to implement stack overflow checks. Change-Id: Ieda575501eaf30af7aaa2c44e71544c9c467c24f
|
20dfc797dc631bf8d655dcf123f46f13332d3074 |
|
17-Jun-2014 |
Dave Allison <dallison@google.com> |
Add some more instruction support to optimizing compiler. This adds a few more DEX instructions to the optimizing compiler's builder (constants, moves, if_xx, etc). Also: * Changes the codegen for IF_XX instructions to use a condition rather than comparing a value against 0. * Fixes some instructions in the ARM disassembler. * Fixes PushList and PopList in the thumb2 assembler. * Switches the assembler for the optimizing compiler to thumb2 rather than ARM. Change-Id: Iaafcd02243ccc5b03a054ef7a15285b84c06740f
|
e27f31a81636ad74bd3376ee39cf215941b85c0e |
|
12-Jun-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Enable the register allocator on ARM. - Also fixes a few bugs/wrong assumptions in code not hit by x86. - We need to differentiate between moves due to connecting siblings within a block, and moves due to control flow resolution. Change-Id: Idd05cf138a71c8f36f5531c473de613c0166fe38
|
86dbb9a12119273039ce272b41c809fa548b37b6 |
|
04-Jun-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Final CL to enable register allocation on x86. This CL implements: 1) Resolution after allocation: connecting the locations allocated to an interval within a block and between blocks. 2) Handling of fixed registers: some instructions require inputs/output to be at a specific location, and the allocator needs to deal with them in a special way. 3) ParallelMoveResolver::EmitNativeCode for x86. Change-Id: I0da6bd7eb66877987148b87c3be6a983b4e3f858
|
9cf35523764d829ae0470dae2d5dd99be469c841 |
|
09-Jun-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Add x86_64 support to the optimizing compiler. Change-Id: I4462d9ae15be56c4a3dc1bd4d1c0c6548c1b94be
|
31d76b42ef5165351499da3f8ee0ac147428c5ed |
|
09-Jun-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Plug code generator into liveness analysis. Also implement spill slot support. Change-Id: If5e28811e9fbbf3842a258772c633318a2f4fafc
|
ffddfdf6fec0b9d98a692e27242eecb15af5ead2 |
|
03-Jun-2014 |
Tim Murray <timmurray@google.com> |
DO NOT MERGE Merge ART from AOSP to lmp-preview-dev. Change-Id: I0f578733a4b8756fd780d4a052ad69b746f687a9
|
a7062e05e6048c7f817d784a5b94e3122e25b1ec |
|
22-May-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Add a linear scan register allocator to the optimizing compiler. This is a "by-the-book" implementation. It currently only deals with allocating registers, with no hint optimizations. The changes remaining to make it functional are: - Allocate spill slots. - Resolution and placements of Move instructions. - Connect it to the code generator. Change-Id: Ie0b2f6ba1b98da85425be721ce4afecd6b4012a4
|
4e3d23aa1523718ea1fdf3a32516d2f9d81e84fe |
|
22-May-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Import Dart's parallel move resolver. And write a few tests while at it. A parallel move resolver will be needed for performing multiple moves that are conceptually parallel, for example moves at a block exit that branches to a block with phi nodes. Change-Id: Ib95b247b4fc3f2c2fcab3b8c8d032abbd6104cd7
|
b0fa5dc7769c1e054032f39de0a3f6d6dd06f8cf |
|
29-Apr-2014 |
Ian Rogers <irogers@google.com> |
Force inlining on trivial accessors. Make volatility for GetFieldObject a template parameter. Move some trivial mirror::String routines to a -inl.h. Bug: 14285442 Change-Id: Ie23b11d4f18cb15a62c3bbb42837a8aaf6b68f92
|
a7aca370a7d62ca04a1e24423d90e8020d6f1a58 |
|
28-Apr-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Setup policies for register allocation. Change-Id: I857e77530fca3e2fb872fc142a916af1b48400dc
|
c32e770f21540e4e9eda6dc7f770e745d33f1b9f |
|
24-Apr-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Add a Transform to SSA phase to the optimizing compiler. Change-Id: Ia9700756a0396d797a00b529896487d52c989329
|
a747a392fb5f88d2ecc4c6021edf9f1f6615ba16 |
|
17-Apr-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Code cleanup in preparation for x64 backend. - Use InvokeDexCallingConventionVisitor for setting up HParameterValues - Use kVregSize instead of kX86WordSize when dealing with virtual registers. Change-Id: Ia520223010194c70a3ff0ed659077f55cec4e7d8
|
db928fcc975b431d8a78700c11bd7da21090384a |
|
16-Apr-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Simplify HInvokeStatic code generation. HPushArgument is not needed for now (but might be when we start optimizing). Also, the calling convention for the 64-bit backend will need to know more about an argument than its index. Therefore currently let HInvokeStatic set up the arguments, which is possible because arguments of a call are virtual registers and not instructions. Change-Id: I8753ed6083aa083c5180ab53b436dc8de4f1fe31
|
01bc96d007b67fdb7fe349232a83e4b354ce3d08 |
|
11-Apr-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Long support in optimizing compiler. - Add stack locations to the Location class. - Change logic of parameter passing/setup by setting the location of such instructions to the ones for the calling convention. Change-Id: I4730ad58732813dcb9c238f44f55dfc0baa18799
|
b55f835d66a61e5da6fc1895ba5a0482868c9552 |
|
07-Apr-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Test control flow instruction with optimizing compiler. Add support for basic instructions to implement these tests. Change-Id: I3870bf9301599043b3511522bb49dc6364c9b4c0
|
f583e5976e1de9aa206fb8de4f91000180685066 |
|
07-Apr-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Add support for taking parameters in optimizing compiler. - Fix stack layout to mimic Quick's. - Implement some sub operations. Change-Id: I8cf75a4d29b662381a64f02c0bc61d859482fc4e
|
707c809f661554713edfacf338365adca8dfd3a3 |
|
04-Apr-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Use target-specific word instead of runtime word. Change-Id: Ia11dc3cc520a1a5c7bd017013e5699af9570ce91
|
2e7038ac5848468740d6a419434d3dde8c585a53 |
|
03-Apr-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Add support for new-instance and invoke-direct. Change-Id: I2daed646904f7711972a7da15d88be7573426932
|
4a34a428c6a2588e0857ef6baf88f1b73ce65958 |
|
03-Apr-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Support passing arguments to invoke-static* instructions. - Stop using the frame pointer for accessing locals. - Stop emulating a stack when doing code generation. Instead, rely on dex register model, where instructions only reference registers. Change-Id: Id51bd7d33ac430cb87a53c9f4b0c864eeb1006f9
|
d8ee737fdbf380c5bb90c9270c8d1087ac23e76c |
|
28-Mar-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Add support for adding two integers in optimizing compiler. Change-Id: I5524e193cd07f2692a57c6b4f8069904471b2928
|
8ccc3f5d06fd217cdaabd37e743adab2031d3720 |
|
19-Mar-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Add support for invoke-static in optimizing compiler. Support is limited to calls without parameters and returning void. For simplicity, we currently follow the Quick ABI. Change-Id: I54805161141b7eac5959f1cae0dc138dd0b2e8a5
|
787c3076635cf117eb646c5a89a9014b2072fb44 |
|
17-Mar-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Plug new optimizing compiler in compilation pipeline. Also rename accessors to ART's conventions. Change-Id: I344807055b98aa4b27215704ec362191464acecc
|
bab4ed7057799a4fadc6283108ab56f389d117d4 |
|
11-Mar-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
More code generation for the optimizing compiler. - Add HReturn instruction - Generate code for locals/if/return - Setup infrastructure for register allocation. Currently emulate a stack. Change-Id: Ib28c2dba80f6c526177ed9a7b09c0689ac8122fb
|
3ff386aafefd5282bb76c8a50506a70a4321e698 |
|
04-Mar-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Add register support to the optimizing compiler. Also make if take an input and build the use list for instructions. Change-Id: I1938cee7dce5bd4c66b259fa2b431d2c79b3cf82
|
d4dd255db1d110ceb5551f6d95ff31fb57420994 |
|
28-Feb-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Add codegen support to the optimizing compiler. Change-Id: I9aae76908ff1d6e64fb71a6718fc1426b67a5c28
|