e6564f4efe74b2bb505a5810852141404b82a4a9 |
|
19-Mar-2018 |
David Sehr <sehr@google.com> |
Move some remaining dex utilities There were several utilities related to building/walking/testing dex files that were not in libdexfile. This change consolidates these. (cherry picked from commit 312f3b2fd0094c028a7d243b116947a35a745806) Bug: 22322814 Test: make -j 50 test-art-host Merged-In: Id76e9179d03b8ec7d67f7e0f267121f54f0ec2e0 Change-Id: Id76e9179d03b8ec7d67f7e0f267121f54f0ec2e0
|
8f4b056427a9d2321e3aa4f21ca8ffb18b3e5ae6 |
|
02-Mar-2018 |
David Sehr <sehr@google.com> |
Move most of runtime/base to libartbase/base Enforce the layering that code in runtime/base should not depend on runtime by separating it into libartbase. Some of the code in runtime/base depends on the Runtime class, so it cannot be moved yet. Also, some of the tests depend on CommonRuntimeTest, which itself needs to be factored (in a subsequent CL). Bug: 22322814 Test: make -j 50 checkbuild make -j 50 test-art-host Change-Id: I8b096c1e2542f829eb456b4b057c71421b77d7e2 Merged-In: c431b9dc4b23cc950eb313695258df5d89f53b22 (cherry picked from commit c431b9dc4b23cc950eb313695258df5d89f53b22)
|
d9e4d73b20d68aa387f5837e1535b6fc26b2859a |
|
05-Feb-2018 |
Gupta Kumar, Sanjiv <sanjiv.kumar.gupta@intel.com> |
Fix iCache misses for GetKind on x86,x86_64

GetKind() takes about 2.6% of total compilation time on x86_64. The primary reason is that the target call GetKindInternal() is often beyond the page boundary, causing frequent i-cache misses. This patch removes the virtual call to GetKindInternal() and instead stores the InstructionKind in each constructed instruction. Since we have about 121 instructions in total as of now, it takes about 7 extra bits in each instruction. dex2oat runs about 12% faster with --compiler-filter=everything on an APK of 25MB.

Test: Tested the patch by running host art tests. Rebased.
Change-Id: Ia7bbcd67180151e4565507164a718acbb6284885
Signed-off-by: Gupta Kumar, Sanjiv <sanjiv.kumar.gupta@intel.com>
|
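The GetKind() change above trades a virtual dispatch for a few bits stored in every instruction. A minimal sketch of that trade-off follows; the class names, the four-entry enum, and the 7/25 bit split are illustrative assumptions, not the actual ART declarations.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical, much-reduced stand-in for the compiler's instruction kinds.
enum class InstructionKind : uint8_t { kAdd, kSub, kLoad, kStore };

// Before: every GetKind() call dispatches through the vtable to a
// per-class override that may live on a different code page.
class VirtualInstruction {
 public:
  virtual ~VirtualInstruction() = default;
  InstructionKind GetKind() const { return GetKindInternal(); }

 private:
  virtual InstructionKind GetKindInternal() const = 0;
};

class VirtualAdd final : public VirtualInstruction {
 private:
  InstructionKind GetKindInternal() const override { return InstructionKind::kAdd; }
};

// After: the kind is recorded once at construction and read from a bit field.
class PackedInstruction {
 public:
  explicit PackedInstruction(InstructionKind kind)
      : kind_(static_cast<uint32_t>(kind)), other_flags_(0) {}

  // No virtual call: a load, a mask, and a cast.
  InstructionKind GetKind() const { return static_cast<InstructionKind>(kind_); }

 private:
  uint32_t kind_ : 7;          // 7 bits cover up to 128 kinds, enough for about 121.
  uint32_t other_flags_ : 25;  // Assumed: remaining packed state shares the word.
};
```

Both variants return the same answer for a given instruction; only the cost of asking differs.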
ea179f477465789605e0c8f57a3ec660c3d852e8 |
|
08-Feb-2018 |
Nicolas Geoffray <ngeoffray@google.com> |
Refactor method resolution in class linker. Rewrite all runtime callers of DexCache::SetResolvedMethod to call a shared method that will do the dex cache update. bug: 64759619 Test: test-art-host Test: device boots, runs Change-Id: Icc1aca121030e2864de09667bdbc793b502e3802
|
bff7a52e2c6c9e988c3ed1f12a2da0fa5fd37cfb |
|
25-Jan-2018 |
Nicolas Geoffray <ngeoffray@google.com> |
Revert "Compiler changes for bitstring based type checks." Bug: 64692057 Bug: 71853552 Bug: 26687569 This reverts commit eb0ebed72432b3c6b8c7b38f8937d7ba736f4567. Change-Id: I7daeaa077960ba41b2ed42bc47f17501621be4be
|
8c0961f9e061ee4b04c1c4ba8ad5cca13bcf884d |
|
24-Jan-2018 |
David Sehr <sehr@google.com> |
Move missed files to libdexfile Reduce the dependencies on utf and utils in preparation for separate directory. Bug: 22322814 Test: make -j 50 test-art-host make -j 50 dexdump2 dexlist Change-Id: Icdecf895dafec63ef903514eef79d459abc14925
|
eb0ebed72432b3c6b8c7b38f8937d7ba736f4567 |
|
10-Jan-2018 |
Vladimir Marko <vmarko@google.com> |
Compiler changes for bitstring based type checks.

We guard the use of this feature with a compile-time flag, set to true in this CL.

Boot image size for aosp_taimen-userdebug in AOSP master:
  - before:
    arm boot*.oat: 63604740
    arm64 boot*.oat: 74237864
  - after:
    arm boot*.oat: 63531172 (-72KiB, -0.1%)
    arm64 boot*.oat: 74135008 (-100KiB, -0.1%)

The new TypeCheckBenchmark yields the following changes using the little cores of taimen fixed at 1.4016GHz:
                                  32-bit          64-bit
  timeCheckCastLevel1ToLevel1     11.48->15.80    11.47->15.78
  timeCheckCastLevel2ToLevel1     15.08->15.79    15.08->15.79
  timeCheckCastLevel3ToLevel1     19.01->15.82    17.94->15.81
  timeCheckCastLevel9ToLevel1     42.55->15.79    42.63->15.81
  timeCheckCastLevel9ToLevel2     39.70->14.36    39.70->14.35
  timeInstanceOfLevel1ToLevel1    13.74->17.93    13.76->17.95
  timeInstanceOfLevel2ToLevel1    17.02->17.95    16.99->17.93
  timeInstanceOfLevel3ToLevel1    24.03->17.95    24.45->17.95
  timeInstanceOfLevel9ToLevel1    47.13->17.95    47.14->18.00
  timeInstanceOfLevel9ToLevel2    44.19->16.52    44.27->16.51

This suggests that the bitstring typecheck should not be used for exact type checks which would be equivalent to the "Level1ToLevel1" benchmark. Whether the implementation is a beneficial replacement for the kClassHierarchyCheck and kAbstractClassCheck on average depends on how many levels from the target class (or Object for a negative result) is a typical object's class.

Test: m test-art-host-gtest
Test: testrunner.py --host --optimizing --jit
Test: testrunner.py --host -t 670-bitstring-type-check
Test: Pixel 2 XL boots.
Test: testrunner.py --target --optimizing --jit
Test: testrunner.py --target -t 670-bitstring-type-check
Bug: 64692057
Bug: 71853552
Bug: 26687569
Change-Id: I538d7e036b5a8ae2cc3fe77662a5903d74854562
|
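The benchmark numbers above show a check whose cost does not grow with hierarchy depth. One way to get that property is to encode each class's path from the root as a bit string and test for a prefix match. The sketch below assumes a fixed 8 bits per level and at most 4 levels; the names TypeBits, MakeChild and IsSubtypeOf are hypothetical, and the real ART encoding is not reproduced here.

```cpp
#include <cassert>
#include <cstdint>

// Assumed layout: level k (1-based) occupies bits [(k-1)*8, k*8) of `path`
// and holds a non-zero sibling index. The root has depth 0 and path 0.
constexpr uint32_t kBitsPerLevel = 8;
constexpr uint32_t kMaxDepth = 4;

struct TypeBits {
  uint32_t path;
  uint32_t depth;
};

// Derive the encoding of a subclass from its parent's encoding.
TypeBits MakeChild(TypeBits parent, uint32_t sibling_index) {
  assert(parent.depth < kMaxDepth);
  assert(sibling_index >= 1 && sibling_index < (1u << kBitsPerLevel));
  return {parent.path | (sibling_index << (parent.depth * kBitsPerLevel)),
          parent.depth + 1};
}

// Mask selecting the bits used by the first `depth` levels.
uint32_t MaskFor(uint32_t depth) {
  return depth == kMaxDepth ? ~0u : ((1u << (depth * kBitsPerLevel)) - 1u);
}

// Subtype test: the target's path must be a prefix of the object's path.
// The cost is one mask and one compare, regardless of how deep either class is.
bool IsSubtypeOf(TypeBits object_class, TypeBits target) {
  return (object_class.path & MaskFor(target.depth)) == target.path;
}
```

Because sibling indices are never zero, a shallow class cannot accidentally match a deeper target, which is why an exact-type check gains nothing from this scheme.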
a8b8e9b12a9740d71cff2fa65d47825b74f72c37 |
|
09-Jan-2018 |
Aart Bik <ajcbik@google.com> |
Improve code sinking near "always throwing" method calls Rationale: With simple dex bytecode analysis, the inliner marks methods that always throw to help subsequent code sinking. This reduces overhead of non-nullable enforcing calls found in e.g. the Kotlin runtime library (1%-2% improvement on tree microbenchmark, about 5% on Denis' benchmark). Test: test-art-host test-art-target Change-Id: I45348f049721476828eb5443738021720d2857c0
|
7f4aff6705f46f411874b5ca8c4856b8ed5bfb13 |
|
21-Jun-2017 |
Artem Serov <artem.serov@linaro.org> |
ART: Implement SuperblockCloner. SuperblockCloner provides a way to clone subgraphs at a high level, without fine-grained manipulation of the IR; data flow and graph properties are resolved and adjusted automatically. The clone transformation is defined by specifying a set of basic blocks to copy and a set of rules for how to treat edges and remap their successors. With this approach, optimizations such as Branch Target Expansion, Loop Peeling and Loop Unrolling can be implemented. Test: superblock_cloner_test.cc. Change-Id: Ibeede38195376ca35f44ba9015491e50b3a5b87e
|
9e734c7ab4599d7747a05db0dc73c7b668cb6683 |
|
05-Jan-2018 |
David Sehr <sehr@google.com> |
Create dex subdirectory Move all the DexFile related source to a common subdirectory dex/ of runtime. Bug: 71361973 Test: make -j 50 test-art-host Change-Id: I59e984ed660b93e0776556308be3d653722f5223
|
217eb067308cf5aa43065377b66acbbee0f5b7c3 |
|
12-Dec-2017 |
Mingyao Yang <mingyao@google.com> |
Fix the side effects of clinit check HClinitCheck obviously does reads, so its side effects should include all reads and writes, just like HInvoke. GVN now explicitly allows clinit check to be reused, which would otherwise be disallowed based on the dependency introduced by the new side effects. Also make licm's logic cleaner and treat clinit check as a special case; otherwise licm can't hoist clinit check due to the dependency introduced by the new side effects. Test: run-test on host. Change-Id: I16886cfe557803d84d84ce68fbb185ebfc0b84dc
|
8758454d380a2b0de1f4a99e9623cfac5460ccdf |
|
12-Dec-2017 |
Vladimir Marko <vmarko@google.com> |
Clean up InstanceOf/CheckCast.

Avoid read barriers for boot image class InstanceOf. Boot image classes are non-moveable, so comparing them against from-space and to-space reference yields the same result.

Change the notion of a "fatal" type check slow path to mean that the runtime call shall not return by normal path, i.e. "fatal" now includes certainly throwing in a try-block. This avoids unnecessary code to restore registers and jump back. For boot image classes the CheckCast comparisons do not need read barriers (for the same reason as for InstanceOf), so we shall not have any false negatives and can treat the check's slow paths as final in the same cases as in non-CC configs.

Boot image size for aosp_taimen-userdebug in AOSP master:
  - before:
    arm boot*.oat: 37075460
    arm64 boot*.oat: 43431768
  - after:
    arm boot*.oat: 36894292 (-177KiB, -0.5%)
    arm64 boot*.oat: 43201256 (-225KiB, -0.5%)

Also remove some obsolete helpers from CodeGenerator.

Test: Additional test in 603-checker-instanceof.
Test: m test-art-host-gtest
Test: testrunner.py --host --optimizing
Test: Pixel 2 XL boots.
Test: testrunner.py --target --optimizing
Bug: 12687968
Change-Id: Ib1381084e46a10e70320dcc618f0502ad725f0b8
|
04366f382239f4bcf1f9c67bb1ff6975607cd8e4 |
|
14-Dec-2017 |
Nicolas Geoffray <ngeoffray@google.com> |
Revert "ART: Try to statically evaluate some conditions." CL has an unwanted 10-15% compile-time impact. This reverts commit 1de1e11ac90db9fad8916ac43d43714ccb8d978f. Change-Id: I76b45aa95bbd24dd025d2ee6cf37d77fe17b8497
|
09faaea17b75269805b4857ed3c9cd04c7273959 |
|
07-Dec-2017 |
Artem Serov <artem.serov@linaro.org> |
ART: Fix single-preheader transformation. The original implementation of "Make sure the loop has only one pre-header" assumed that the header had no phi functions, since loops with multiple preheaders may now exist only during graph building before SSA construction; all of the optimizations preserve the single-preheader invariant. This code is used by DCE; DCE was called multiple times, but after graph building the preheader transformation was never executed. However, if someone introduces an optimization which might not keep the invariant (e.g. loop peeling), the data flow adjustments must be performed. Test: loop_optimization_test.cc Test: test-art-target, test-art-host Change-Id: I88bb0aad2dd5241addef7fe9cda474a6868bf532
|
1de1e11ac90db9fad8916ac43d43714ccb8d978f |
|
20-Jul-2017 |
Artem Serov <artem.serov@linaro.org> |
ART: Try to statically evaluate some conditions.

If a condition 'cond' is evaluated in an HIf instruction, then in the successors of this HIF_BLOCK we statically know the value of the condition (TRUE in TRUE_SUCC, FALSE in FALSE_SUCC). Using that, we could replace another evaluation (use) EVAL of the same 'cond' with the TRUE value (FALSE value) if every path from the ENTRY_BLOCK to EVAL_BLOCK contains the edge HIF_BLOCK->TRUE_SUCC (HIF_BLOCK->FALSE_SUCC).

  if (cond) {
    ...
    if (cond) {
      ...
    }
    ...
    int a = cond ? 5 : 105;
    ...
  }

The patch is a prerequisite step for "Loop peeling to eliminate invariant exits"; however, it brings some value on its own with a tiny code size reduction in boot-framework.oat (-8Kb).

Test: 458-checker-instruct-simplification
Test: test-art-target, test-art-host.
Change-Id: Ifbe45097dc2b5f098176fa1a1d023ea90b76d396
|
28e012a4af2d710e5e5f824709ffd6432e4f549f |
|
07-Dec-2017 |
Vladimir Marko <vmarko@google.com> |
Determine HLoadClass/String load kind early.

This helps save memory by avoiding the allocation of HEnvironment and related objects for AOT references to boot image strings and classes (kBootImage* load kinds) and also for JIT references (kJitTableAddress).

Compiling aosp_taimen-userdebug boot image, the most memory hungry method BatteryStats.dumpLocked() needs
  - before:
    Used 55105384 bytes of arena memory...
    ...
    UseListNode    10009704
    Environment    423248
    EnvVRegs       20676560
    ...
  - after:
    Used 50559176 bytes of arena memory...
    ...
    UseListNode    8568936
    Environment    365680
    EnvVRegs       17628704
    ...

Test: m test-art-host-gtest
Test: testrunner.py --host --optimizing --jit
Bug: 34053922
Change-Id: I68e73a438e6ac8e8908e6fccf53bbeea8a64a077
|
fec85cdfa337dfb1f1c6d5bd9a940bc2d20a0edb |
|
04-Dec-2017 |
Vladimir Marko <vmarko@google.com> |
Minor cleanup in CodeGenerator::RecordPcInfo(). And remove HInvokeInterface::GetDexMethodIndex() as the base class version is identical. Test: m test-art-host-gtest Test: testrunner.py --host --optimizing Change-Id: I489bff5b3f624eec19269487529e29d58f068960
|
0259c24f5e83ab69b259245b645076b864b2e2ca |
|
04-Dec-2017 |
Vladimir Marko <vmarko@google.com> |
Fix a bug in String.charAt() simplification. Do not pass method index as a bool flag indicating that the HBoundsCheck originates from a String.charAt(). This was working only because the method index is unlikely to be 0. This bug was introduced in https://android-review.googlesource.com/321573. Test: Rely on TreeHugger. Bug: 30933338 Change-Id: I2a51e478ee145d342af8cd49f9fdec7adffd77ff
|
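The String.charAt() fix above concerns an integer silently converting to a bool parameter. The sketch below reproduces that class of bug in isolation; BoundsCheck, MakeBoundsCheck and the two caller functions are hypothetical names chosen for illustration and do not mirror the ART signatures.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a bounds-check node with a "came from charAt" flag.
struct BoundsCheck {
  uint32_t dex_pc;
  bool is_string_char_at;
};

BoundsCheck MakeBoundsCheck(uint32_t dex_pc, bool is_string_char_at = false) {
  return {dex_pc, is_string_char_at};
}

// Buggy shape: a method index lands in the bool parameter. The implicit
// uint32_t -> bool conversion yields true for every index except 0.
BoundsCheck FromCharAtBuggy(uint32_t dex_pc, uint32_t method_index) {
  return MakeBoundsCheck(dex_pc, method_index);
}

// Fixed shape: state the flag explicitly and keep the index out of it.
BoundsCheck FromCharAtFixed(uint32_t dex_pc) {
  return MakeBoundsCheck(dex_pc, /* is_string_char_at= */ true);
}
```

The buggy variant gives the intended result for any non-zero index, which is how such a mistake can survive testing.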
46721ef33e8f5cd405c291d72e3f259e3085fb5f |
|
05-Oct-2017 |
Mingyao Yang <mingyao@google.com> |
Don't merge values for exit block in LSE. This enables some additional optimizations since exit block doesn't really merge values. Test: run-test on host. Change-Id: I21ed7e0e43a3bc5d9ed2dabfad8462129b904eb7
|
cced8ba4245a061ab047a0a6882468d75d619dd9 |
|
19-Jul-2017 |
Artem Serov <artem.serov@linaro.org> |
ART: Introduce individual HInstruction cloning. Introduce API for HInstruction cloning, support it for a few instructions. add a gtest. Test: cloner_test.cc, test-art-target, test-art-host. Change-Id: I8b6299be5d04a26390d9ef13a20ce82ee5ae4afe
|
e0eb48353ddf0c1b79bfec2ba15c899a413c2c70 |
|
30-Oct-2017 |
xueliang.zhong <xueliang.zhong@linaro.org> |
Fix LSA hunt for original reference bug.

Fix a bug in LSA where it doesn't take IntermediateAddress into account when hunting for the original reference. In the following example, original reference i0 can be transformed by NullCheck, BoundType, IntermediateAddress, etc.

  i0 NewArray
  i1 HInstruction(i0)
  i2 ArrayGet(i1, index)

Test: test-art-host
Test: test-art-target
Test: load_store_analysis_test
Test: 706-checker-scheduler
Change-Id: I162dd8a86fcd31daee3517357c6af638c950b31b
|
61b922847403ac0e74b6477114c81a28ac2e01a0 |
|
11-Oct-2017 |
Vladimir Marko <vmarko@google.com> |
ART: Introduce Uint8 loads in compiled code. Some vectorization patterns are not recognized anymore. This shall be fixed later. Test: m test-art-host-gtest Test: testrunner.py --host --optimizing Test: testrunner.py --target --optimizing on Nexus 5X Test: Nexus 5X boots. Bug: 23964345 Bug: 67935418 Change-Id: I587a328d4799529949c86fa8045c6df21e3a8617
|
69d310e0317e2fce97bf8c9c133c5c2c0332e61d |
|
09-Oct-2017 |
Vladimir Marko <vmarko@google.com> |
Use ScopedArenaAllocator for building HGraph.

Memory needed to compile the two most expensive methods for aosp_angler-userdebug boot image:
  BatteryStats.dumpCheckinLocked(): 21.1MiB -> 20.2MiB
  BatteryStats.dumpLocked(): 42.0MiB -> 40.3MiB
This is because all the memory previously used by the graph builder is reused by later passes.

And finish the "arena"->"allocator" renaming; make renamed allocator pointers that are members of classes const when appropriate (and make a few more members around them const).

Test: m test-art-host-gtest
Test: testrunner.py --host
Bug: 64312607
Change-Id: Ia50aafc80c05941ae5b96984ba4f31ed4c78255e
|
e764d2e50c544c2cb98ee61a15d613161ac6bd17 |
|
05-Oct-2017 |
Vladimir Marko <vmarko@google.com> |
Use ScopedArenaAllocator for register allocation.

Memory needed to compile the two most expensive methods for aosp_angler-userdebug boot image:
  BatteryStats.dumpCheckinLocked(): 25.1MiB -> 21.1MiB
  BatteryStats.dumpLocked(): 49.6MiB -> 42.0MiB
This is because all the memory previously used by Scheduler is reused by the register allocator; the register allocator has a higher peak usage of the ArenaStack.

And continue the "arena"->"allocator" renaming.

Test: m test-art-host-gtest
Test: testrunner.py --host
Bug: 64312607
Change-Id: Idfd79a9901552b5147ec0bf591cb38120de86b01
|
ca6fff898afcb62491458ae8bcd428bfb3043da1 |
|
03-Oct-2017 |
Vladimir Marko <vmarko@google.com> |
ART: Use ScopedArenaAllocator for pass-local data. Passes using local ArenaAllocator were hiding their memory usage from the allocation counting, making it difficult to track down where memory was used. Using ScopedArenaAllocator reveals the memory usage. This changes the HGraph constructor which requires a lot of changes in tests. Refactor these tests to limit the amount of work needed the next time we change that constructor. Test: m test-art-host-gtest Test: testrunner.py --host Test: Build with kArenaAllocatorCountAllocations = true. Bug: 64312607 Change-Id: I34939e4086b500d6e827ff3ef2211d1a421ac91a
|
a290160f74ee53c0ffb51c7b3ac916d239c9556a |
|
21-Sep-2017 |
Lena Djokic <Lena.Djokic@imgtec.com> |
MIPS32R2: Share address computation

For array accesses the element address has the following structure:

  Address = CONST_OFFSET + base_addr + index << ELEM_SHIFT

The address part (index << ELEM_SHIFT) can be shared across array accesses with the same data type and index. For example, in the following loop 5 accesses can share address computation:

  void foo(int[] a, int[] b, int[] c) {
    for (i...) {
      a[i] = a[i] + 5;
      b[i] = b[i] + c[i];
    }
  }

Test: test-art-host, test-art-target
Change-Id: Id09fa782934aad4ee47669275e7e1a4d7d23b0fa
|
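The address formula in the MIPS32R2 commit above can be made concrete with plain integer arithmetic. The sketch assumes 4-byte elements and a 12-byte array header; kDataOffset, kElemShift and the function names are illustrative values and identifiers, not taken from ART.

```cpp
#include <cassert>
#include <cstdint>

constexpr uintptr_t kDataOffset = 12;  // Assumed CONST_OFFSET (array header size).
constexpr uint32_t kElemShift = 2;     // Assumed ELEM_SHIFT for 4-byte ints.

// The shareable part: index << ELEM_SHIFT.
uintptr_t ScaledIndex(uint32_t index) {
  return static_cast<uintptr_t>(index) << kElemShift;
}

// Address = CONST_OFFSET + base_addr + (index << ELEM_SHIFT).
uintptr_t ElementAddress(uintptr_t base_addr, uintptr_t scaled_index) {
  return kDataOffset + base_addr + scaled_index;
}

struct LoopAddresses {
  uintptr_t a;
  uintptr_t b;
  uintptr_t c;
};

// One loop iteration of foo(): the shift is computed once and reused for
// a[i] (load and store), b[i] (load and store) and c[i] (load).
LoopAddresses AddressesFor(uintptr_t a_base, uintptr_t b_base, uintptr_t c_base,
                           uint32_t i) {
  const uintptr_t shared = ScaledIndex(i);
  return {ElementAddress(a_base, shared), ElementAddress(b_base, shared),
          ElementAddress(c_base, shared)};
}
```

Sharing is only valid when the accesses use the same index and the same element size, which matches the condition stated in the commit message.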
c8fb211482e27ead6f015faf7e2b02225f728e99 |
|
03-Oct-2017 |
Vladimir Marko <vmarko@google.com> |
ART: Simplify And(TypeConversion<Int64>(x), Const32). Reorder the And and TypeConversion as TypeConversion<Int64>(And(x, Const32)) for 32-bit constant Const32. For example, java.io.Bits.getLong(byte[] b, int off) yields better generated code on 32-bit platforms for each of its eight "b[off + .] & 0xFFL" sequences. Also remove obsolete "doThrow" code that attempts to prevent inlining; the $noinline$ tag is now honored by the compiler. Test: Added tests to 458-checker-instruct-simplification. Test: m test-art-host-gtest Test: testrunner.py --host Change-Id: Ib6e413517daa5206764653ebb6c4687a4c68d02d
|
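The reordering in the And/TypeConversion commit above relies on sign extension commuting with bitwise AND when the constant itself fits in 32 bits. A small sketch of the two forms, with hypothetical function names, makes that checkable.

```cpp
#include <cassert>
#include <cstdint>

// Original shape: And(TypeConversion<Int64>(x), Const32).
// Both operands are widened, so the AND runs on 64-bit values.
int64_t AndAfterWidening(int32_t x, int32_t mask) {
  return static_cast<int64_t>(x) & static_cast<int64_t>(mask);
}

// Simplified shape: TypeConversion<Int64>(And(x, Const32)).
// The AND runs on 32-bit values and only the result is widened, which is
// cheaper on 32-bit targets where a 64-bit value needs a register pair.
int64_t WidenAfterAnd(int32_t x, int32_t mask) {
  return static_cast<int64_t>(x & mask);
}
```

For example, with x = -128 and mask = 0xFF both forms yield 128, the same value a "b[off] & 0xFFL" expression produces for a byte holding -128.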
d5d2f2ce627aa0f6920d7ae05197abd1a396e035 |
|
26-Sep-2017 |
Vladimir Marko <vmarko@google.com> |
ART: Introduce Uint8 compiler data type. This CL adds all the necessary codegen for the Uint8 type but does not add code transformations that use that code. Vectorization codegens are modified to use Uint8 as the packed type when appropriate. The side effects are now disconnected from the instruction's type after the graph has been built to allow changing HArrayGet/H*FieldGet/HVecLoad to use a type different from the underlying field or array. Note: HArrayGet for String.charAt() is modified to have no side effects whatsoever; Strings are immutable. Test: m test-art-host-gtest Test: testrunner.py --host --optimizing --jit Test: testrunner.py --target --optimizing on Nexus 6P Test: Nexus 6P boots. Bug: 23964345 Change-Id: If2dfffedcfb1f50db24570a1e9bd517b3f17bfd0
|
0ebe0d83138bba1996e9c8007969b5381d972b32 |
|
21-Sep-2017 |
Vladimir Marko <vmarko@google.com> |
ART: Introduce compiler data type. Replace most uses of the runtime's Primitive in compiler with a new class DataType. This prepares for introducing new types, such as Uint8, that the runtime does not need to know about. Test: m test-art-host-gtest Test: testrunner.py --host Bug: 23964345 Change-Id: Iec2ad82454eec678fffcd8279a9746b90feb9b0c
|
dbbac8f812a866b1b53f3007721f66038d208549 |
|
01-Sep-2017 |
Aart Bik <ajcbik@google.com> |
Implement Sum-of-Abs-Differences idiom recognition. Rationale: Currently just on ARM64 (x86 lacks proper support), using the SAD idiom yields great speedup on loops that compute the sum-of-abs-difference operation. Also includes some refinements around type conversions. Speedup ExoPlayerAudio (golem run): 1.3x on ARM64 1.1x on x86 Test: test-art-host test-art-target Bug: 64091002 Change-Id: Ia2b711d2bc23609a2ed50493dfe6719eedfe0130
|
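For reference, the scalar loop shape that a sum-of-abs-differences recognizer looks for can be sketched as below. This is only the source-level pattern, written with assumed int8_t inputs and a hypothetical function name; it does not show the vectorized code the compiler emits.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <vector>

// Scalar SAD: accumulate |a[i] - b[i]| into a wider accumulator.
// The narrow inputs are widened to int before subtracting, so the
// difference cannot overflow the element type.
int32_t SumOfAbsDifferences(const std::vector<int8_t>& a,
                            const std::vector<int8_t>& b) {
  const std::size_t n = std::min(a.size(), b.size());
  int32_t sum = 0;
  for (std::size_t i = 0; i < n; ++i) {
    sum += std::abs(static_cast<int>(a[i]) - static_cast<int>(b[i]));
  }
  return sum;
}
```

The mix of narrow inputs and a wide accumulator is where the type-conversion refinements mentioned in the commit come into play.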
94ec2db21332ee1dcdbbf254b99a9a999a304fe0 |
|
06-Sep-2017 |
Vladimir Marko <vmarko@google.com> |
Use mmapped boot image class table for PIC app HLoadClass. Implement new HLoadClass load kind for boot image classes referenced by PIC-compiled apps (i.e. prebuilts) that uses PC-relative load from a boot image ClassTable mmapped into the app's .bss. This reduces the size of the PIC prebuilts that reference boot image classes compared to the kBssEntry as we can completely avoid the slow path and stack map unless we need to do the class initialization check. Prebuilt services.odex for aosp_angler-userdebug (arm64): - before: 20312800 - after: 19775352 (-525KiB) Test: m test-art-host-gtest Test: testrunner.py --host Test: testrunner.py --host --pictest Test: testrunner.py --target on Nexus 6P. Test: testrunner.py --target --pictest on Nexus 6P. Test: Nexus 6P boots. Bug: 31951624 Change-Id: I13adb19a1fa7d095a72a41f09daa6101876e77a8
|
dd018df8a00e841fe38fabe38520b7d297a885c1 |
|
09-Aug-2017 |
Igor Murashkin <iam@google.com> |
optimizing: add block-scoped constructor fence merging pass

Introduce a new "Constructor Fence Redundancy Elimination" pass. The pass currently performs local optimization only, i.e. within instructions in the same basic block. All constructor fences preceding a publish (e.g. store, invoke) get merged into one instruction.

==============
OptStat#ConstructorFenceGeneratedNew:   43825
OptStat#ConstructorFenceGeneratedFinal: 17631 <+++
OptStat#ConstructorFenceRemovedLSE:     164
OptStat#ConstructorFenceRemovedPFRA:    9391
OptStat#ConstructorFenceRemovedCFRE:    16133 <---
Removes ~91.5% of the 'final' constructor fences in RitzBenchmark:
(We do not distinguish the exact reason that a fence was created, so it's possible some "new" fences were also removed.)
==============

Test: art/test/run-test --host --optimizing 476-checker-ctor-fence-redun-elim
Bug: 36656456
Change-Id: I8020217b448ad96ce9b7640aa312ae784690ad99
|
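The block-local merging described in the commit above can be illustrated on a toy instruction list. This is a rough sketch under stated assumptions: the three-way Kind enum, the Instr struct and MergeFences are hypothetical, fences are modelled only by the object ids they protect, and the merged fence is placed directly before the publishing instruction. The actual pass works on the HGraph IR and is not reproduced here.

```cpp
#include <cassert>
#include <vector>

enum class Kind { kFence, kPublish, kOther };

struct Instr {
  Kind kind;
  std::vector<int> fence_targets;  // Object ids protected by a fence; empty otherwise.
};

// Walk one basic block. Fences are collected until a publishing instruction
// (e.g. a store or an invoke) is reached, then emitted as a single fence
// covering all collected targets.
std::vector<Instr> MergeFences(const std::vector<Instr>& block) {
  std::vector<Instr> out;
  std::vector<int> pending_targets;
  bool has_pending = false;

  auto flush = [&]() {
    if (has_pending) {
      out.push_back({Kind::kFence, pending_targets});
      pending_targets.clear();
      has_pending = false;
    }
  };

  for (const Instr& instr : block) {
    if (instr.kind == Kind::kFence) {
      pending_targets.insert(pending_targets.end(), instr.fence_targets.begin(),
                             instr.fence_targets.end());
      has_pending = true;
    } else {
      if (instr.kind == Kind::kPublish) {
        flush();  // The merged fence must come before the publish.
      }
      out.push_back(instr);
    }
  }
  flush();  // Fences with no later publish in this block are kept at the end.
  return out;
}
```

Because the walk never looks past the end of the block, the sketch is local in the same sense as the pass: fences in different basic blocks are never combined.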
6cfbdbc359ec5414d3e49f70d28f8c0e65b98d63 |
|
25-Jul-2017 |
Vladimir Marko <vmarko@google.com> |
Use mmapped boot image intern table for PIC app HLoadString. Implement new HLoadString load kind for boot image strings referenced by PIC-compiled apps (i.e. prebuilts) that uses PC-relative load from a boot image InternTable mmapped into the app's .bss. This reduces the size of the PIC prebuilts that reference boot image strings compared to the kBssEntry as we can completely avoid the slow path and stack map. We separate the InternedStrings and ClassTable sections of the boot image (.art) file from the rest, aligning the start of the InternedStrings section to a page boundary. This may actually increase the size of the boot image file by a page but it also allows mprotecting() these tables as read-only. The ClassTable section is included in anticipation of a similar load kind for HLoadClass. Prebuilt services.odex for aosp_angler-userdebug (arm64): - before: 20862776 - after: 20308512 (-541KiB) Note that 92KiB savings could have been achieved by simply avoiding the read barrier, similar to the HLoadClass flag IsInBootImage(). Such a flag is now unnecessary. Test: m test-art-host-gtest Test: testrunner.py --host Test: testrunner.py --host --pictest Test: testrunner.py --target on Nexus 6P. Test: testrunner.py --target --pictest on Nexus 6P. Test: Nexus 6P boots. Bug: 31951624 Change-Id: I5f2bf1fc0bb36a8483244317cfdfa69e192ef6c5
|
0148de41a5c77c2f61252c219f1a02413c7c4a32 |
|
05-Sep-2017 |
Aart Bik <ajcbik@google.com> |
Basic SIMD reduction support. Rationale: Enables vectorization of x += .... for very basic (simple, same-type) constructs. Paves the way for more complex (narrower and/or mixed-type) constructs, which will be handled by the next CL. This is a revert of Icb5d6c805516db0a1d911c3ede9a246ccef89a22 and thus a revert^2 of I2454778dd0ef1da915c178c7274e1cf33e271d0f and thus a revert^3 of I1c1c87b6323e01442e8fbd94869ddc9e760ea1fc and thus a revert^4 of I7880c135aee3ed0a39da9ae5b468cbf80e613766 PS1-2 shows what needed to change Test: test-art-host test-art-target Bug: 64091002 Change-Id: I647889e0da0959ca405b70081b79c7d3c9bcb2e9
|
982334cef17d47ef2477d88a97203a9587a4b86f |
|
02-Sep-2017 |
Nicolas Geoffray <ngeoffray@google.com> |
Revert "Basic SIMD reduction support." Fails 530-checker-lse on arm64. Bug: 64091002, 65212948 This reverts commit cfa59b49cde265dc5329a7e6956445f9f7a75f15. Change-Id: Icb5d6c805516db0a1d911c3ede9a246ccef89a22
|
cfa59b49cde265dc5329a7e6956445f9f7a75f15 |
|
31-Aug-2017 |
Aart Bik <ajcbik@google.com> |
Basic SIMD reduction support. Rationale: Enables vectorization of x += .... for very basic (simple, same-type) constructs. Paves the way for more complex (narrower and/or mixed-type) constructs, which will be handled by the next CL. This is a revert^2 of I7880c135aee3ed0a39da9ae5b468cbf80e613766 and thus a revert of I1c1c87b6323e01442e8fbd94869ddc9e760ea1fc PS1-2 shows what needed to change, with regression tests Test: test-art-host test-art-target Bug: 64091002, 65212948 Change-Id: I2454778dd0ef1da915c178c7274e1cf33e271d0f
|
a57b4ee7b15ce6abfb5fa88c8dc8a516fe40e0d9 |
|
30-Aug-2017 |
Aart Bik <ajcbik@google.com> |
Revert "Basic SIMD reduction support." This reverts commit 9879d0eac8fe2aae19ca6a4a2a83222d6383afc2. Getting these type check failures in some builds. Need time to look at this better, so reverting for now :-( dex2oatd F 08-30 21:14:29 210122 226218 code_generator.cc:115] Check failed: CheckType(instruction->GetType(), locations->InAt(0)) PrimDouble C Change-Id: I1c1c87b6323e01442e8fbd94869ddc9e760ea1fc
|
9879d0eac8fe2aae19ca6a4a2a83222d6383afc2 |
|
15-Aug-2017 |
Aart Bik <ajcbik@google.com> |
Basic SIMD reduction support. Rationale: Enables vectorization of x += .... for very basic (simple, same-type) constructs. Paves the way for more complex (narrower and/or mixed-type) constructs, which will be handled by the next CL. Test: test-art-host test-art-target Bug: 64091002 Change-Id: I7880c135aee3ed0a39da9ae5b468cbf80e613766
|
6ef45677305048c2bf0600f1c4b98a11b2cfaffb |
|
08-Aug-2017 |
Igor Murashkin <iam@google.com> |
optimizing: Add statistics for # of constructor fences added/removed

Statistics are attributed as follows:

Added because:
* HNewInstances requires a HConstructorFence following it.
* HReturn requires a HConstructorFence (for final fields) preceding it.

Removed because:
* Optimized in Load-Store-Elimination.
* Optimized in Prepare-For-Register-Allocation.

Test: art/test.py
Bug: 36656456
Change-Id: Ic119441c5151a5a840fc6532b411340e2d68e5eb
|
16e528957869c7debb1f6758c9a364819e15ee1a |
|
14-Jul-2017 |
Mads Ager <ager@google.com> |
RFC: Generate select instruction for conditional returns.

The select generator currently only inserts select instructions if there is a diamond shape with a phi. This change extends the select generator to also deal with the pattern:

  if (condition) {
    movable instruction 0
    return value0
  } else {
    movable instruction 1
    return value1
  }

which it turns into:

  movable instruction 0
  movable instruction 1
  return select (value0, value1, condition)

Test: 592-checker-regression-bool-input
Change-Id: Iac50fb181dc2c9b7619f28977298662bc09fc0e1
|
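At the source level, the conditional-return transformation above corresponds to the rewrite sketched below. The two functions and the arithmetic standing in for the "movable instructions" are made-up examples; the point is only that both shapes compute the same result when the hoisted instructions have no side effects.

```cpp
#include <cassert>
#include <cstdint>

// Before: two return sites, each preceded by its own movable instruction.
int32_t BranchyReturn(bool condition, int32_t x) {
  if (condition) {
    const int32_t value0 = x + 1;  // movable instruction 0
    return value0;
  } else {
    const int32_t value1 = x * 2;  // movable instruction 1
    return value1;
  }
}

// After: both instructions run unconditionally, then a single select
// picks the value to return. No branch remains.
int32_t SelectReturn(bool condition, int32_t x) {
  const int32_t value0 = x + 1;  // movable instruction 0
  const int32_t value1 = x * 2;  // movable instruction 1
  return condition ? value0 : value1;
}
```

The rewrite is only safe for instructions that can be executed speculatively, which is what "movable" conveys in the commit message.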
c73ee37b76494253862ee17933acfe2b88de1a01 |
|
31-Jul-2017 |
Artem Serov <artem.serov@linaro.org> |
ART: Fix loop header's predecessors reordering in SimplifyLoops. Fix the issue where, after the loop header's predecessors are reordered in SimplifyLoops, the phi inputs are not reordered correspondingly. Test: loop_optimization_test.cc, test-art-host, test-art-target. Change-Id: I8a251a0a953d751f9bb67da58181e47d225d90e6
|
21c7e6fbcabef2f22b467e1e89f4abe1aa43e459 |
|
27-Jul-2017 |
Artem Serov <artem.serov@linaro.org> |
ART: Fix SimplifyInduction for an instruction with HEnvironment. After an instruction is removed during RemoveFromCycle its environment isn't properly cleaned: it still has input instructions present and registered (those instructions still hold records for that). Test: test-art-target, test-art-host. Change-Id: Iea315bdf735d75fe477f43671f05b40dfecc63a8
|
8cf9cb386cd9286d67e879f1ee501ec00d72a4e1 |
|
19-Jul-2017 |
Andreas Gampe <agampe@google.com> |
ART: Include cleanup

Let clang-format reorder the header includes.

Derived with:
 * .clang-format:
     BasedOnStyle: Google
     IncludeIsMainRegex: '(_test|-inl)?$'
 * Steps:
     find . -name '*.cc' -o -name '*.h' | xargs sed -i.bak -e 's/^#include/ #include/' ; git commit -a -m 'ART: Include cleanup'
     git-clang-format -style=file HEAD^
     manual inspection
     git commit -a --amend

Test: mmma art
Change-Id: Ia963a8ce3ce5f96b5e78acd587e26908c7a70d02
|
c9c310487b8730fce5edfa72e79c4188629898a3 |
|
29-Jun-2017 |
Nicolas Geoffray <ngeoffray@google.com> |
Turn a few DCHECK into CHECKs. To help diagnose b/63070152. bug: 63070152 Test: test.py Change-Id: I1ac1cf9bfe1bc15ecfa94b5b8537cd3afda6fd14
|
f57c1ae3682f95e6d7ce08ae4c241d04b09de658 |
|
28-Jun-2017 |
Nicolas Geoffray <ngeoffray@google.com> |
Prevent loop optimization in debuggable mode. bug: 33775412 Test: no scanner crash (torn on whether I should spend some time working on a smali test) Change-Id: I8b94725ce57171b592bede4bf55cd0a9626a8a10
|
0eb882bfc5d260e8014c26adfda11602065aa5d8 |
|
15-May-2017 |
Vladimir Marko <vmarko@google.com> |
Use ArtMethod* .bss entries for HInvokeStaticOrDirect. Test: m test-art-host-gtest Test: testrunner.py --host Test: testrunner.py --target Test: Nexus 6P boots. Test: Build aosp_mips64-userdebug. Bug: 30627598 Change-Id: I0e54fdd2e91e983d475b7a04d40815ba89ae3d4f
|
e7197bf7d58c705a048e13e241d7ca320502cd40 |
|
02-Jun-2017 |
Vladimir Marko <vmarko@google.com> |
Replace invoke kind kDexCacheViaMethod with kRuntimeCall. In preparation for replacing the dex cache method array with a hash-based array, get rid of one unnecessary use. This method load kind is currently used only on mips for irreducible loops and OSR, so this should have no impact on x86/x86-64/arm/arm64. Test: m test-art-host-gtest Test: testrunner.py --host Test: Repeat the above tests with manually changing kDexCachePcRelative to kRuntimeCall in sharpening.cc. (Ignore failures in 552-checker-sharpening.) Bug: 30627598 Change-Id: Ifce42645f2dcc350bbb88c2f4642e88fc5f98152
|
847e6ce98b4b822fd94c631975763845978ebaa3 |
|
02-Jun-2017 |
Vladimir Marko <vmarko@google.com> |
Rename kDexCacheViaMethod to kRuntimeCall for HLoadClass/String. The old name does not reflect the actual code anymore. Test: testrunner.py --host Change-Id: I2e13cf727bba9d901c4d3fc821bb526d38a775b8
|
ec32f6402382303608544fdac5a88067781bdec5 |
|
02-Jun-2017 |
Vladimir Marko <vmarko@google.com> |
Delay allocating environment locations. Many environments are killed before we get to the register allocation, so the early allocation of their locations was simply wasting memory. For the most expensive method of a certain app, this reduces EnvLocations with 64-bit dex2oat from 8657200 to 5339712 (-3.16MiB). Test: m test-art-host Test: testrunner.py --host Bug: 33650849 Change-Id: I70a02fc3c7ec87b54a87e989e1239dc4acfcf18b
|
82b0740f03b1a6acab4558214d3edc362e27e238 |
|
01-Mar-2017 |
Vladimir Marko <vmarko@google.com> |
Use IntrusiveForwardList<> for Env-/UsePosition. Test: m test-art-host-gtest Test: testrunner.py --host Change-Id: I2b720e2ed8f96303cf80e9daa6d5278bf0c3da2f
|
6597946d29be9108e2cc51223553d3db9290a3d9 |
|
19-May-2017 |
Vladimir Marko <vmarko@google.com> |
Use PC-relative pointer to boot image methods. In preparation for adding ArtMethod entries to the .bss section, add direct PC-relative pointers to methods so that the number of needed .bss entries for boot image is small. Test: m test-art-host-gtest Test: testrunner.py --host Test: testrunner.py --target on Nexus 6P Test: Nexus 6P boots. Test: Build aosp_mips64-userdebug Bug: 30627598 Change-Id: Ia89f5f9975b741ddac2816e1570077ba4b4c020f
|
79d8fa7c52c1810d4618c9bd1d43994be5abb53d |
|
18-Apr-2017 |
Igor Murashkin <iam@google.com> |
optimizing: Build HConstructorFence for HNewArray/HNewInstance nodes

Also fixes:
* LSE, code_sinking to keep optimizing new-instance if it did so before
* Various tests to expect constructor fences after new-instance

Sidenote: new-instance String does not get a ConstructorFence; the special StringFactory calls are assumed to be self-fencing.

Metric changes on go/lem:
* CodeSize -0.262% in ART-Compile (ARMv8)
* RunTime -0.747% for all (linux-armv8)

(No changes expected to x86, constructor fences are no-op).

The RunTime regression is temporary until art_quick_alloc_* entrypoints have their DMBs removed in a follow up CL.

Test: art/test.py
Bug: 36656456
Change-Id: I6a936a6e51c623e1c6b5b22eee5c3c72bebbed35
|
764d454d1d51448deb81f6e8d2d7d317c7f4d1b4 |
|
16-May-2017 |
Vladimir Marko <vmarko@google.com> |
Remove LoadString/Class kind kBootImageLinkTimeAddress. We no longer support non-PIC boot image compilation. Also clean up some obsolete code for method patches and make JIT correctly report itself as non-PIC. Test: testrunner.py --host Test: testrunner.py --target Bug: 33192586 Change-Id: I593289c5c1b0e88b82b86a933038be97bbb15ad2
|
e1811ed6b57a54dc8ebd327e4bd2c4422092a3a0 |
|
27-Apr-2017 |
Artem Serov <artem.serov@linaro.org> |
ARM64: Share address computation across SIMD LDRs/STRs. For array accesses the element address has the following structure: Address = CONST_OFFSET + base_addr + index << ELEM_SHIFT. Taking into account the ARM64 LDR/STR addressing modes, the address part (CONST_OFFSET + index << ELEM_SHIFT) can be shared across array accesses with the same data type and index. For example, in the following loop 5 accesses can share the address computation: void foo(int[] a, int[] b, int[] c) { for (i...) { a[i] = a[i] + 5; b[i] = b[i] + c[i]; } } Test: test-art-host, test-art-target Change-Id: I46af3b4e4a55004336672cdba3296b7622d815ca
|
79efadfdd861584f1c47654ade975eae6c43c360 |
|
08-May-2017 |
Nicolas Geoffray <ngeoffray@google.com> |
Add runtime reasons for deopt. Currently to help investigate. Also: 1) Log when deoptimization happens (which method and what reason) 2) Trace when deoptimization happens (to make it visible in systrace) bug:37655083 Test: test-art-host test-art-target (cherry picked from commit 4e92c3ce7ef354620a785553bbada554fca83a67) Change-Id: I992398a1038ab61ea0e5106af6b6ad0a3305312e
|
4e92c3ce7ef354620a785553bbada554fca83a67 |
|
08-May-2017 |
Nicolas Geoffray <ngeoffray@google.com> |
Add runtime reasons for deopt. Currently to help investigate. Also: 1) Log when deoptimization happens (which method and what reason) 2) Trace when deoptimization happens (to make it visible in systrace) bug:37655083 Test: test-art-host test-art-target Change-Id: I0c2d87b40db09e8e475cf97a7c784a034c585e97
|
d01745ef88bfd25df574a885d90a1a7785db5f5b |
|
06-Apr-2017 |
Igor Murashkin <iam@google.com> |
optimizing: constructor fence redundancy elimination - remove dmb after LSE Part one of a few upcoming CLs to optimize constructor fences. This improves load-store-elimination; all singleton objects that are not returned will have their associated constructor fence removed. If the allocation is removed, so is the fence. Even if allocation is not removed, fences can sometimes be removed. This change is enabled by tracking the "this" object associated with the constructor fence as an input. Fence inputs are considered weak; they do not keep the "this" object alive; if the instructions for "this" are all deleted, the fence can also be deleted. Bug: 36656456 Test: art/test.py --host && art/test.py --target Change-Id: I05659ab07e20d6e2ecd4be051b722726776f4ab1
|
66d691de219e840b3f84385d8bd1b7001562b0e5 |
|
07-Apr-2017 |
Vladimir Marko <vmarko@google.com> |
ARM64: Link-time generated thunks for ArrayGet Baker CC read barrier. Test: Added a test to relative_patcher_arm64 Test: m test-art-target-gtest on Nexus 6P. Test: Nexus 6P boots. Test: testrunner.py --target on Nexus 6P. Test: Nexus 6P boots with heap poisoning. Test: testrunner.py --target on Nexus 6P with heap poisoning. Bug: 29516974 Bug: 30126666 Bug: 36141117 Change-Id: Id0f23089c55cbb53b84305c11bb4b03718561ade
|
8de5916666ab5d146ac1bdac7d7748e197ae347e |
|
21-Apr-2017 |
Aart Bik <ajcbik@google.com> |
Factor vector unary/binary shared code out into superclass. Test: test-art-target, test-art-host Change-Id: I42770d9a9142f2e53d3b5bd60bd25593b2154a7c
|
f34dd206d0073fb3949be872224420a8488f551f |
|
10-Apr-2017 |
Artem Serov <artem.serov@linaro.org> |
ARM64: Support MultiplyAccumulate for SIMD. Test: test-art-host, test-art-target. Change-Id: I06af8415e15352d09d176cae828163cbe99ae7a7
|
f3e61ee363fe7f82ef56704f06d753e2034a67dd |
|
13-Apr-2017 |
Aart Bik <ajcbik@google.com> |
Implement halving add idiom (with checker tests). Rationale: First of several idioms that map to very efficient SIMD instructions. Note that the is-zero-ext and is-sign-ext are general-purpose utilities that will be widely used in the vectorizer to detect low precision idioms, so expect that code to be shared with many CLs to come. Test: test-art-host, test-art-target Change-Id: If7dc2926c72a2e4b5cea15c44ef68cf5503e9be9
|
032cacdbf32c50d3c43590600ed1e171a35fa93c |
|
06-Apr-2017 |
Igor Murashkin <iam@google.com> |
optimizing: do not illegally remove constructor barriers after inlining Remove the illegal optimization that destroyed constructor barriers after inlining invoke-super constructor calls. --- According to JLS 17.5.1, "Note that if one constructor invokes another constructor, and the invoked constructor sets a final field, the freeze for the final field takes place at the end of the invoked constructor." This means if an object is published (stored to a location potentially visible to another thread) inside of an outer constructor, all final field stores from any inner constructors must be visible to other threads. Test: art/test.py Bug: 37001605 Change-Id: I3b55f6c628ff1773dab88022a6475d50a1a6f906
|
6daebeba6ceab4e7dff5a3d65929eeac9a334004 |
|
03-Apr-2017 |
Aart Bik <ajcbik@google.com> |
Implemented ABS vectorization. Rationale: This CL adds the concept of vectorizing intrinsics to the ART vectorizer. More can follow (MIN, MAX, etc). Test: test-art-host, test-art-target (angler) Change-Id: Ieed8aa83ec64c1250ac0578570249cce338b5d36
|
f8f5a16ed7bad1e18179e38453e59c96a944de10 |
|
07-Feb-2017 |
Aart Bik <ajcbik@google.com> |
ART vectorizer. Rationale: Make SIMD great again with a retargetable and easily extendable vectorizer. Provides a full x86/x86_64 and a proof-of-concept ARM implementation. Sample improvement (without any perf tuning yet) for Linpack on x86 is about 20% to 50%. Test: test-art-host, test-art-target (angler) Bug: 34083438, 30933338 Change-Id: Ifb77a0f25f690a87cd65bf3d5e9f6be7ea71d6c1
|
dd0fc0481cdc4e04abb7b7300e76edbd2f07f011 |
|
28-Mar-2017 |
Nicolas Geoffray <ngeoffray@google.com> |
Make data dependency around HDeoptimize correct. We use HDeoptimize in a few places, but when it comes to data dependency we either: - don't have any (BCE, CHA), in which case we should make sure no code that the deoptimization guards moves before the HDeoptimize - have one on the receiver (inline cache), in which case we can update the dominated users with the HDeoptimize to get the data dependency correct. bug:35661819 bug:36371709 test: 644-checker-deopt Change-Id: I30b750f97b656dede9e10e7e43ac02c8604d7b7a (cherry picked from commit 6f8e2c9913b24f746a154dda700f609cee3095f9)
|
6f8e2c9913b24f746a154dda700f609cee3095f9 |
|
23-Mar-2017 |
Nicolas Geoffray <ngeoffray@google.com> |
Make data dependency around HDeoptimize correct. We use HDeoptimize in a few places, but when it comes to data dependency we either: - don't have any (BCE, CHA), in which case we should make sure no code that the deoptimization guards moves before the HDeoptimize - have one on the receiver (inline cache), in which case we can update the dominated users with the HDeoptimize to get the data dependency correct. bug:35661819 bug:36371709 test: 644-checker-deopt Change-Id: I4820c6710b06939e7f5a59606971693e995fb958
|
53fec08731de956fc68e6edb27e8266607b1a5f2 |
|
27-Mar-2017 |
Nicolas Geoffray <ngeoffray@google.com> |
Initialize art_method_ in HGraph. Spotted by Ivan Maidanski! Benign as HGraph is allocated on the arena, and arenas are always zero initialized. test: test-art-host Change-Id: Id8abe421e732dcf7a760f118b16b85fe1fac7c78
|
b13c65bb46544821a84ff2106d0710d77b0fb463 |
|
22-Mar-2017 |
Aart Bik <ajcbik@google.com> |
Saves full XMM state along suspend check's slow path. Rationale: Break-out CL of ART Vectorizer. We need to save 128 bits of data (the default ABI of the ART runtime only saves 64 bits). Note that this is *only* done for xmm registers that are live, so overhead is not too big. Bug: 34083438 Test: test-art-host Change-Id: Ic89988b0acb0c104634271d0c6c3e29b6596d59b
|
01b47b046b01ec68696f8ff61b5326cdd3af348e |
|
03-Feb-2017 |
Mingyao Yang <mingyao@google.com> |
Inlining a few small methods based on profiling dex2oat with perf. Test: m test-art-host Change-Id: I6313158e59592d8d132154523be9c82dda3c7eb8
|
c52f3034b06c03632e937aff07d46c2bdcadfef5 |
|
02-Mar-2017 |
Richard Uhler <ruhler@google.com> |
Remove --include-patch-information option from dex2oat. We no longer support running patchoat on non-PIC oat files, which means the included patch information is unused. Bug: 33192586 Test: m test-art-host Change-Id: I9e100c4e47dc24d91cd74226c84025e961d30f67
|
c4aa82c5b0aa921c51eaf6f6bbaff36501ea2cee |
|
06-Mar-2017 |
Nicolas Geoffray <ngeoffray@google.com> |
Invoke typed arraycopy for primitive arrays. Apps will always call the Object version of arraycopy. When we can infer the types of the passed arrays, replace the method being called to be the typed System.arraycopy one. 10% improvement on ExoPlayerBench. Test: 641-checker-arraycopy bug: 7103825 Change-Id: I872d7a6e163a4614510ef04ae582eb90ec48b5fa
|
331605a7ba842573b3876e14c933175382b923c8 |
|
01-Mar-2017 |
Nicolas Geoffray <ngeoffray@google.com> |
Revert "Revert "Intrinsify Integer.valueOf."" Fix heap poisoning. LOG INFO instead of ERROR to avoid run-test failures with --no-image. bug:30933338 Test: ART_HEAP_POISONING=true test-art-host test-art-target This reverts commit db7b44ac3ea80a722aaed12e913ebc1661a57998. Change-Id: I0b7d4f1eb11c62c9a3df8e0de0b1a5d8af760181
|
db7b44ac3ea80a722aaed12e913ebc1661a57998 |
|
28-Feb-2017 |
Nicolas Geoffray <ngeoffray@google.com> |
Revert "Intrinsify Integer.valueOf." Heap poisoning missing jit-gcstress not optimizing it. bug:30933338 This reverts commit cd0b27287843cfd904dd163056322579ab4bbf27. Change-Id: I5ece1818afbca5214babb6803f62614a649aedeb
|
cd0b27287843cfd904dd163056322579ab4bbf27 |
|
23-Feb-2017 |
Nicolas Geoffray <ngeoffray@google.com> |
Intrinsify Integer.valueOf. Improves performance of ArrayListStress and Ritz by ~10% and ~3%. Test: test-art-host test-art-target bug: 30933338 Change-Id: I639046e3a18dae50069d3a7ecb538a900bb590a1
|
69d75ffac23fe1e655b7e81f0454c2841280dc1f |
|
07-Feb-2017 |
Mingyao Yang <mingyao@google.com> |
Skip loop optimization if there is no loop in the graph. LinearizeGraph() performs quite a few allocations. Also add some comments on the possible false positives of some flags. Test: m test-art-host Change-Id: I80ef89a2dc031d601e7621d0b22060cd8c17fae3
|
74234daabb28a4b9c804bf8bf908e7334bd4d400 |
|
13-Jan-2017 |
Anton Kirilov <anton.kirilov@linaro.org> |
ARM: Merge data-processing instructions and shifts/(un)signed extensions This commit mirrors the work that has already been done for ARM64. Test: m test-art-target-run-test-551-checker-shifter-operand Change-Id: Iec8c1563b035f40f0e18dcffde28d91dc21922f8
|
fbdfa6d7485534eedbd3fb32cf572529ebddb63c |
|
03-Feb-2017 |
Nicolas Geoffray <ngeoffray@google.com> |
Revert "Revert "Inline across dex files for JIT."" bug:30933338 This reverts commit d16da8bd8106452eea82408748dc6b3fd64bcb80. Change-Id: I6a30354d6d00442cb1a542af063c7769865e369d
|
83c8e27a292e6e002fb3b3def75cf6d8653378e8 |
|
31-Jan-2017 |
Nicolas Geoffray <ngeoffray@google.com> |
Code refactoring around sharpening HLoadClass. Even if the class is not accessible through the dex cache, we can access it by other means (eg boot class, jit table). So rewrite static field access instruction builder to not bail out if a class cannot be accessed through the dex cache. bug:34966607 test: test-art-host test-art-target Change-Id: I88e4e09951a002b480eb8f271726b56f981291bd
|
d16da8bd8106452eea82408748dc6b3fd64bcb80 |
|
03-Feb-2017 |
Nicolas Geoffray <ngeoffray@google.com> |
Revert "Inline across dex files for JIT." Broke hikey build. bug:30933338 This reverts commit f290c01c61f8a2979efa74ffcd2f54c5e426a3d0. Change-Id: I3363d703c54d0f9b69197a29395cc08f60c8b2ac
|
f290c01c61f8a2979efa74ffcd2f54c5e426a3d0 |
|
28-Jan-2017 |
Nicolas Geoffray <ngeoffray@google.com> |
Inline across dex files for JIT. bug:30933338 test: ART_TEST_JIT=true test-art-host test-art-target Change-Id: I4ac708d70d90c2db4139d99a75bf4665a810c206
|
22aa54bf8469689c7c6c33f15ff4df2ffba8fa15 |
|
18-Oct-2016 |
Alexandre Rames <alexandre.rames@linaro.org> |
AArch64: Add HInstruction scheduling support. This commit adds a new `HInstructionScheduling` pass that performs basic scheduling on the `HGraph`. Currently, scheduling is performed at the block level, so no `HInstruction` ever leaves its block in this pass. The scheduling process iterates through blocks in the graph. For blocks that we can and want to schedule: 1) Build a dependency graph for instructions. It includes data dependencies (inputs/uses), but also environment dependencies and side-effect dependencies. 2) Schedule the dependency graph. This is a topological sort of the dependency graph, using heuristics to decide what node to schedule first when there are multiple candidates. Currently the heuristics only consider instruction latencies and schedule first the instructions that are on the critical path. Test: m test-art-host Test: m test-art-target Change-Id: Iec103177d4f059666d7c9626e5770531fbc5ccdc
|
5e8d5f01b0fe87a6c649bd3a9f1534228b93423d |
|
18-Oct-2016 |
Roland Levillain <rpl@google.com> |
Fix some typos in ART. Test: m build-art-host Test: m cpplint-art Change-Id: Ifc6ce3d0d645c4a8dca72dd483fc03fc05077130
|
e761bccf9f0d884cc4d4ec104568cef968296492 |
|
19-Jan-2017 |
Nicolas Geoffray <ngeoffray@google.com> |
Revert "Revert "Load the array class in the compiler for allocations."" This reverts commit fee255039e30c1c3dfc70c426c3d176221c3cdf9. Change-Id: I02b45f9a659d872feeb35df40b42c1be9878413a
|
fee255039e30c1c3dfc70c426c3d176221c3cdf9 |
|
19-Jan-2017 |
Hiroshi Yamauchi <yamauchi@google.com> |
Revert "Load the array class in the compiler for allocations." libcore test fails. This reverts commit cc99df230feb46ba717252f002d0cc2da6828421. Change-Id: I5bac595acd2b240886062e8c1f11f9095ff6a9ed
|
cc99df230feb46ba717252f002d0cc2da6828421 |
|
18-Jan-2017 |
Nicolas Geoffray <ngeoffray@google.com> |
Load the array class in the compiler for allocations. Removing one other dependency for needing to pass the current method, and having dex_cache_resolved_types_ in ArtMethod. oat file increase: - x64: 0.25% - arm32: 0.30% - x86: 0.28% test: test-art-host, test-art-target Change-Id: Ibca4fa00d3e31954db2ccb1f65a584b8c67cb230
|
5247c08fb186a5a2ac02226827cf6b994f41a681 |
|
13-Jan-2017 |
Nicolas Geoffray <ngeoffray@google.com> |
Put the resolved class in HLoadClass. To avoid repeated lookups in sharpening/rtp/inlining. Test: test-art-host test-art-target Change-Id: I08d0da36a4bb061cdaa490ea2af3a3217a875bbe
|
5d37c152f21a0807459c6f53bc25e2d84f56d259 |
|
12-Jan-2017 |
Nicolas Geoffray <ngeoffray@google.com> |
Put inlined ArtMethod pointer in stack maps. Currently done for JIT. Can be extended for AOT and inlined boot image methods. Also refactor the lookup of an inlined method at runtime to not rely on the dex cache, but look at the class loader tables. bug: 30933338 test: test-art-host, test-art-target Change-Id: I58bd4d763b82ab8ca3023742835ac388671d1794
|
6bec91c7d4670905cd67440991ec76fd54d0f000 |
|
09-Jan-2017 |
Vladimir Marko <vmarko@google.com> |
Store resolved types for AOT code in .bss. Test: m test-art-host Test: m test-art-target on Nexus 9. Test: Nexus 9 boots. Test: Build aosp_mips64-eng. Bug: 30627598 Bug: 34193123 Change-Id: I8ec60a98eb488cb46ae3ea56341f5709dad4f623
|
4155998a2f5c7a252a6611e3926943e931ea280a |
|
06-Jan-2017 |
Vladimir Marko <vmarko@google.com> |
Make runtime call on main for HLoadClass/kDexCacheViaMethod. Remove dependency of the compiled code on types dex cache array in preparation for changing to a hash-based array. Test: m test-art-host Test: m test-art-target on Nexus 9 Bug: 30627598 Change-Id: I3c426ed762c12eb9eb4bb61ea9a23a0659abf0a2
|
48886c2ee655a16224870fee52dc8721a52babcf |
|
06-Jan-2017 |
Vladimir Marko <vmarko@google.com> |
Remove HLoadClass::LoadKind::kDexCachePcRelative. Test: m test-art-host Test: m test-art-target-run-test-552-checker-sharpening Bug: 30627598 Change-Id: Ic809b0f3a8ed0bd4dc7ab67aa64866f9cdff9bdb
|
ac141397dc29189ad2b2df41f8d4312246beec60 |
|
13-Jan-2017 |
Orion Hodson <oth@google.com> |
Revert "Revert "ART: Compiler support for invoke-polymorphic."" This reverts commit 0fb5af1c8287b1ec85c55c306a1c43820c38a337. This takes us back to the original change and attempts to fix the issues encountered: - Adds transition record push/pop around artInvokePolymorphic. - Changes X86/X64 relocations for MacSDK. - Implements MIPS entrypoint for art_quick_invoke_polymorphic. - Corrects size of returned reference in art_quick_invoke_polymorphic on ARM. Bug: 30550796,33191393 Test: art/test/run-test 953 Test: m test-art-run-test Change-Id: Ib6b93e00b37b9d4ab743a3470ab3d77fe857cda8
|
6b69e0acb0e4c506ce2587e362c38e36e41e34ab |
|
11-Jan-2017 |
Aart Bik <ajcbik@google.com> |
Complete unrolling of loops with small body and trip count one. Rationale: Avoids the unnecessary loop control overhead, suspend check, and exposes more opportunities for constant folding in the resulting loop body. Fully unrolls loop in execute() of the Dhrystone benchmark (3% to 8% improvements). Test: test-art-host Change-Id: If30f38caea9e9f87a929df041dfb7ed1c227aba3
|
0d3998b5ff619364acf47bec0b541e7a49bd6fe7 |
|
12-Jan-2017 |
Nicolas Geoffray <ngeoffray@google.com> |
Revert "Revert "Make object allocation entrypoints only take a class."" This reverts commit f7aaacd97881c6924b8212c7f8fe4a4c8721ef53. Change-Id: I6756cd1e6110bb45231f62f5e388f16c044cb145
|
f7aaacd97881c6924b8212c7f8fe4a4c8721ef53 |
|
12-Jan-2017 |
Hiroshi Yamauchi <yamauchi@google.com> |
Revert "Make object allocation entrypoints only take a class." 960-default-smali64 is failing. This reverts commit 2b615ba29c4dfcf54aaf44955f2eac60f5080b2e. Change-Id: Iebb8ee5a917fa84c5f01660ce432798524d078ef
|
0fb5af1c8287b1ec85c55c306a1c43820c38a337 |
|
11-Jan-2017 |
Orion Hodson <oth@google.com> |
Revert "ART: Compiler support for invoke-polymorphic." This reverts commit 02e3092f8d98f339588e48691db77f227b48ac1e. Reasons for revert: - Breaks MIPS/MIPS64 build. - Fails under GCStress test on x64. - Different x64 build configuration doesn't like relocation. Change-Id: I512555b38165d05f8a07e8aed528f00302061001
|
02e3092f8d98f339588e48691db77f227b48ac1e |
|
01-Dec-2016 |
Orion Hodson <oth@google.com> |
ART: Compiler support for invoke-polymorphic. Adds basic support to invoke method handles in compiled code. Enables method verification for methods containing invoke-polymorphic. Adds k45cc/k45rc output to Instruction::DumpString() which was found to be missing when enabling verification. Include stack traces in test 957-methodhandle-transforms for failures so they can be easily identified. Bug: 30550796,33191393 Test: art/test/run-test 953 Test: m test-art-run-test Change-Id: Ic9a96ea24906087597d96ad8159a5bc349d06950
|
2b615ba29c4dfcf54aaf44955f2eac60f5080b2e |
|
06-Jan-2017 |
Nicolas Geoffray <ngeoffray@google.com> |
Make object allocation entrypoints only take a class. Change motivated by: - Dex cache compression: having the allocation fast path do a dex cache lookup will be too expensive. So instead, rely on the compiler having direct access to the class (either through BSS for AOT, or JIT tables for JIT). - Inlining: the entrypoints relied on the caller of the allocation to have the same dex cache as the outer method (stored at the bottom of the stack). This meant we could not inline methods from a different dex file that do allocations. By avoiding the dex cache lookup in the entrypoint, we can now remove this restriction. Code expansion on average for Docs/Gms/FB/Framework (go/lem numbers): - Around 0.8% on arm64 - Around 1% for x64, arm - Around 1.5% on x86 Test: test-art-host, test-art-target, ART_USE_READ_BARRIER=true/false Test: test-art-host, test-art-target, ART_DEFAULT_GC_TYPE=SS ART_USE_TLAB=true Change-Id: I41f3748bb4d251996aaf6a90fae4c50176f9295f
|
f0acfe7a812a332122011832074142718c278dae |
|
09-Jan-2017 |
Nicolas Geoffray <ngeoffray@google.com> |
Keep resolved String in HLoadString. For the following reasons: - Avoids needing to do a lookup again in CodeGenerator::EmitJitRoots. - Fixes races where the string was GC'ed before CodeGenerator::EmitJitRoots. - Makes it possible to do GVN on the same string but defined in different dex files. Test: test-art-host, test-art-target Change-Id: If2b5d3079f7555427b1b96ab04546b3373fcf921
|
c52b26d4fb5b1ca91f34ce4b535b764853e538f6 |
|
19-Dec-2016 |
Nicolas Geoffray <ngeoffray@google.com> |
Recognize getClass() in RTP. Also always keep around the resolved field in related HInstructions to avoid resolving it again and again. Test: test-art-host, 631-checker-get-class Change-Id: I3bc6be11f3eb175c635e746006f39865947e0669
|
4d1be4920fefe2c1f7cb40357842c6587cdcc50e |
|
06-Jan-2017 |
Vladimir Marko <vmarko@google.com> |
Remove the IsInDexCache flag from HLoadString. This flag was obsolete and always false. Test: m test-art-host Change-Id: Iabefc068908ff4f994b63e7e18a2a27c25a0919e
|
c1a42cf3873be202c8c0ca3c4e67500b470ab075 |
|
18-Dec-2016 |
Nicolas Geoffray <ngeoffray@google.com> |
Remove soon to be obsolete call kinds for direct calls. And remove CompilerDriver::GetCodeAndMethodForDirectCall in preparation of removing non-PIC prebuild and non-PIC on-device boot image compilation. Test: test-art-host test-art-target bug:33192586 Change-Id: Ic48e3e8b9d7605dd0e66f31d458a182198ba9578
|
b0b051ad6c9fab511346882650d5d689f805a980 |
|
17-Nov-2016 |
Mingyao Yang <mingyao@google.com> |
CHA guard optimization (elimination/hoisting). Test: manual by checking the dump-cfg output. Change-Id: I254e168b9a85d2d3d23e02eea7e129c1bc9ab920
|
568763405f6eb7cb78fd39272569e30fe21be85e |
|
16-Dec-2016 |
Nicolas Geoffray <ngeoffray@google.com> |
Sharpen HLoadClass from inliner. Also cleanup HLoadClass constructor. Test: ART_TEST_JIT=true m test-art-host-run-test Change-Id: I8f803b05fb8a7267d1421ca9c032e624f27efed3
|
a9dbe8333d4df5447157fe575a805003172af047 |
|
15-Dec-2016 |
Mingyao Yang <mingyao@google.com> |
Add HVariableInputSizeInstruction. Make HPhi and HInvoke subclasses of the new instruction. Test: m test-art-host-run-test Change-Id: I303c725876f1f4407b98702d92370be25193fc53
|
9186ced255f2e7402646b5b286deebb540640734 |
|
12-Dec-2016 |
Andreas Gampe <agampe@google.com> |
ART: Clean up utils.h Remove functionality provided by libbase. Move some single-use functions to their respective users. Test: m test-art-host Change-Id: I75594035fa975200d638cc29bb9f31bc6e6cb29f
|
22384aeab988df7fa5ccdc48a668589c5f602c39 |
|
12-Dec-2016 |
Nicolas Geoffray <ngeoffray@google.com> |
Revert "Revert "Add kJitTableAddress for HLoadClass."" This reverts commit d2d5262c8370309e1f2a009f00aafc24f1cf00a0. Change-Id: I6149d5c7d5df0b0fc5cb646a802a2eea8d01ac08
|
d2d5262c8370309e1f2a009f00aafc24f1cf00a0 |
|
12-Dec-2016 |
Nicolas Geoffray <ngeoffray@google.com> |
Revert "Add kJitTableAddress for HLoadClass." One test failure after merge. This reverts commit 5b12f7973636bfea29da3956a9baa7a6bbe2b666. Change-Id: I120c49e53274471fc1c82a10d52e99c83f5f85cc
|
5b12f7973636bfea29da3956a9baa7a6bbe2b666 |
|
09-Dec-2016 |
Nicolas Geoffray <ngeoffray@google.com> |
Add kJitTableAddress for HLoadClass. This new kind loads classes from the root table associated with JIT compiled code. Also remove kDexCacheAddress, which is replaced by kJitTableAddress. test: ART_TEST_JIT=true test-art-host-jit test-art-target-jit Change-Id: Ia23029688d1a60c178bf2ffa7463927c5d5de4d0
|
063fc772b5b8aed7d769cd7cccb6ddc7619326ee |
|
02-Aug-2016 |
Mingyao Yang <mingyao@google.com> |
Class Hierarchy Analysis (CHA) The class linker now tracks whether a method has a single implementation and if so, the JIT compiler will try to devirtualize a virtual call for the method into a direct call. If the single-implementation assumption is violated due to additional class linking, compiled code that makes the assumption is invalidated. Deoptimization is triggered for compiled code live on stack. Instead of patching return pc's on stack, a CHA guard is added which checks a hidden should_deoptimize flag for deoptimization. This approach limits the number of deoptimization points. This CL does not devirtualize abstract/interface method invocation. Slides on CHA: https://docs.google.com/a/google.com/presentation/d/1Ax6cabP1vM44aLOaJU3B26n5fTE9w5YU-1CRevIDsBc/edit?usp=sharing Change-Id: I18bf716a601b6413b46312e925a6ad9e4008efa4 Test: ART_TEST_JIT=true m test-art-host/target-run-test test-art-host-gtest
|
71bf7b43380eb445973f32a7f789d9670f8cc97d |
|
16-Nov-2016 |
Aart Bik <ajcbik@google.com> |
Optimizations around escape analysis. With tests. Details: (1) added new intrinsics (2) implemented optimizations: more !can be null information; more null check removals; replace return-this uses with incoming parameter; remove dead StringBuffer/Builder calls (with escape analysis) (3) Fixed exposed bug in CanBeMoved() Performance gain: This improves CaffeineString by about 360% (removes null check from first loop, eliminates second loop completely) Test: test-art-host Change-Id: Iaf16a1b9cab6a7386f43d71c6b51dd59600e81c1
|
8a0128a5ca0784f6d2b4ca27907e8967a74bc4c5 |
|
28-Nov-2016 |
Andreas Gampe <agampe@google.com> |
ART: Add dex::StringIndex Add abstraction for uint32_t string index. Test: m test-art-host Change-Id: I917c2881702fe3df112c713f06980f2278ced7ed
|
a5b09a67034e57a6e10231dd4bd92f4cb50b824c |
|
18-Nov-2016 |
Andreas Gampe <agampe@google.com> |
ART: Add dex::TypeIndex Add abstraction for uint16_t type index. Test: m test-art-host Change-Id: I47708741c7c579cbbe59ab723c1e31c5fe71f83a
|
132d8363bf8cb043d910836672192ec8c36649b6 |
|
16-Nov-2016 |
Nicolas Geoffray <ngeoffray@google.com> |
Revert "Revert "Revert "Revert "JIT root tables."""" Test: 626-set-resolved-string, test-art-host, test-art-target Test: run-libcore-tests.sh Test: phone boots and runs This reverts commit 3395fbc20bcd20948bec8958db91b304c17cacd8. Change-Id: I104b73d093e3eb6a271d564cfdb9ab09c1c8cf24
|
3395fbc20bcd20948bec8958db91b304c17cacd8 |
|
14-Nov-2016 |
Nicolas Geoffray <ngeoffray@google.com> |
Revert "Revert "Revert "JIT root tables.""" libcore failures: dalvikvm32 F 11-14 03:04:06 14870 14870 jit_code_cache.cc:310] Check failed: new_string != nullptr This reverts commit 75afcdd3503a8a8518e5b23d21b6e73306ce39ce. Change-Id: I5a6b6b48aa79a763d1ff1ba4d85d63811254787d
|
75afcdd3503a8a8518e5b23d21b6e73306ce39ce |
|
10-Nov-2016 |
Nicolas Geoffray <ngeoffray@google.com> |
Revert "Revert "JIT root tables."" Also contains Revert "Support kJitTableAddress in x86/arm/arm64." This reverts commit 4acd03638fcdb4e5d1666f8eec7eb3bf6d6be035. This reverts commit 997d1217830c0a18b70faeabd53c04700a87d7d9. Test: ART_USE_READ_BARRIER=true/false test-art-host test-art-target Change-Id: I77cb1e9bf8f1b4c58b72d3cf5ca31ced2aaa1ea3
|
c757decb04d8535fd806b9bce1c2fe5e52c228dc |
|
04-Nov-2016 |
David Sehr <sehr@google.com> |
Do not inline loops without exit edges Fixes an issue with LinearOrder after inlining a function containing a loop that has no exit edge. The failure is due to incorrect loop information being computed for blocks that are not on a path to the inlined function's return. They should not be considered part of the caller's enclosing loop, but are today. Bug: 32547653 Test: run-test --host 478-checker-inline-noreturn Change-Id: I9694a1cb861430051c801d07f7ce29752332cba5
|
ff7d89c0364f6ebd0f0798eb18ef8bd62917de6a |
|
07-Nov-2016 |
Aart Bik <ajcbik@google.com> |
Allow read side effects for removing dead instructions. Rationale: Instructions that only have the harmless read side effect may be removed when dead as well, we were too strict previously. As proof of concept, this cl also provides more accurate information on a few string related intrinsics. This removes the dead indexOf from CaffeineString (17% performance improvement, big bottleneck of the StringBuffer's toString() still remains in loop). Test: test-art-host Change-Id: Id835a8e287e13e1f09be6b46278a039b8865802e
|
4acd03638fcdb4e5d1666f8eec7eb3bf6d6be035 |
|
09-Nov-2016 |
Nicolas Geoffray <ngeoffray@google.com> |
Revert "JIT root tables." May be the offender for jit-gcstress failure of 902. This reverts commit ac3ebc3150760425ed00abd56da48f9a6e0666bc. Change-Id: I9ea6c9236fd1729fed7d1868dd8a111172932308
|
54d6a207341ad45cb5eceed71a344073ed6d4e31 |
|
09-Nov-2016 |
Vladimir Marko <vmarko@google.com> |
Fix 552-checker-sharpening for PIC test. And remove obsolete HLoadString::LoadKind::kDexCacheAddress. Test: m ART_TEST_PIC_TEST=true test-art-host Change-Id: I3e7a1a98c2c7eba5ea10954d7efcf743a807c300
|
ac3ebc3150760425ed00abd56da48f9a6e0666bc |
|
05-Oct-2016 |
Nicolas Geoffray <ngeoffray@google.com> |
JIT root tables. Implement root tables for the JIT. Each JIT compiled method gets a table allocated before the stack maps. The table gets visited through Runtime::SweepSystemWeaks. Implement String roots for x86_64 as an example. Test: test-art-host test-art-target Change-Id: Id3d5bc67479e08b52dd4b253e970201203a0f0d2
|
2767f4ba2df934fea4c50a016e2955c2cf3f6b19 |
|
29-Oct-2016 |
Aart Bik <ajcbik@google.com> |
New instruction simplifications. Extra dce pass. Allow more per block repeats. Rationale: We were missing some obvious simplifications, which left performance on the table for e.g. CaffeineLogic compiled with dx (4200us->2700us). The constant for allowing a repeat on a BB seemed very low; at the very least it should depend on the BB size. Test: test-art-host Change-Id: Ic234566e117593e12c936d556222e4cd4f928105
|
2c45bc9137c29f886e69923535aff31a74d90829 |
|
25-Oct-2016 |
Vladimir Marko <vmarko@google.com> |
Remove H[Reverse]PostOrderIterator and HInsertionOrderIterator. Use range-based loops instead, introducing helper functions ReverseRange() for iteration in reverse order in containers. When the contents of the underlying container change inside the loop, use an index-based loop that better exposes the container data modifications, compared to the old iterator interface that's hiding it which may lead to subtle bugs. Test: m test-art-host Change-Id: I2a4e6c508b854c37a697fc4b1e8423a8c92c5ea0
|
cc42be074ed15235426cdbcb34f357ead2be2caf |
|
21-Oct-2016 |
Aart Bik <ajcbik@google.com> |
Improved induction variable analysis and loop optimizations. Rationale: Rather than half-baked reconstructing cycles during loop optimizations, this CL passes the SCC computed during induction variable analysis to the loop optimizer (trading some memory for more optimizations). This further improves CaffeineLogic from 6000us down to 4200us (dx) and 2200us to 1690us (jack). Note that this is on top of prior improvements in previous CLs. Also, some narrowing type concerns are taken care of during transfer operations. Test: test-art-host Change-Id: Ice2764811a70073c5014b3a05fb51f39fd2f4c3c
|
96eeb4e2bb21afe8783d62e06b91fd1aef682dbb |
|
12-Oct-2016 |
Nicolas Geoffray <ngeoffray@google.com> |
Update HInstruction::NeedsCurrentMethod. HLoadString and HLoadClass when sharpened may not need it anymore. Instead just rely on the HCurrentMethod being the SSA dependency of those instructions. Also save storing the current method in the stack if the graph actually doesn't need it. test: m test-art-host test-art-target Change-Id: I235d8275230637cbbd38fc0d2f9b822f6d2a9c1e
|
e8a3c576301fd531d5f73a65fc8b84a63619d580 |
|
12-Oct-2016 |
Mathieu Chartier <mathieuc@google.com> |
Replace StackHandleScopeCollection with VariableSizedHandleScope VariableSizedHandleScope's internal handle scopes are not pushed directly on the thread. This means that it is safe to intermix with other types of handle scopes. Added test. Test: clean-oat-host && test-art-host Change-Id: Id2fd1155788428f394d49615d337d9134824c8f0
|
482095d3a03892b76f5b835c9e7ea4bc80638501 |
|
11-Oct-2016 |
Aart Bik <ajcbik@google.com> |
Improved and simplified loop optimizations. Rationale: Empty preheader simplification has been simplified to a much more general empty block removal optimization step. Incremental updating of induction variable analysis enables repeated elimination or simplification of induction cycles. This enabled an extra layer of optimization for e.g. Benchpress Loop (17.5us. -> 0.24us. -> 0.08us). So the original 73x speedup is now multiplied by another 3x, for a total of about 218x. Test: 618-checker-induction et al. Change-Id: I394699981481cdd5357e0531bce88cd48bd32879
|
9620230700d4b451097c2163faa70627c9d8088a |
|
05-Oct-2016 |
Aart Bik <ajcbik@google.com> |
Refactoring of graph linearization and linear order. Rationale: Ownership of graph's linear order and iterators was a bit unclear now that other phases are using it. New approach allows phases to compute their own order, while ssa_liveness is sole owner for graph (since it is not mutated afterwards). Also shortens lifetime of loop's arena. Test: test-art-host Change-Id: Ib7137d1203a1e0a12db49868f4117d48a4277f30
|
aad75c6d5bfab2dc8e30fc99fafe8cd2dc8b74d8 |
|
03-Oct-2016 |
Vladimir Marko <vmarko@google.com> |
Revert "Revert "Store resolved Strings for AOT code in .bss."" Fixed oat_test to keep dex files alive. Fixed mips build. Rewritten the .bss GC root visiting and added write barrier to the artResolveStringFromCode(). Test: build aosp_mips-eng Test: m ART_DEFAULT_GC_TYPE=SS test-art-target-host-gtest-oat_test Test: Run ART test suite on host and Nexus 9. Bug: 20323084 Bug: 30627598 This reverts commit 5f926055cb88089d8ca27243f35a9dfd89d981f0. Change-Id: I07fa2278d82b8eb64964c9a4b66cb93726ccda6b
|
281c681a0852c10f5ca99b351650b244e878aea3 |
|
26-Aug-2016 |
Aart Bik <ajcbik@google.com> |
A first implementation of a loop optimization framework. Rationale: We are planning to add more and more loop related optimizations and this framework provides the basis to do so. For starters, the framework optimizes dead induction, induction that can be replaced with a simpler closed-form, and eliminates dead loops completely (either pre-existing or as a result of induction removal). Speedup on e.g. Benchpress Loop is 73x (17.5us. -> 0.24us.) [with the potential for more exploiting outer loop too] Test: 618-checker-induction et al. Change-Id: If80a809acf943539bf6726b0030dcabd50c9babc
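
To illustrate "induction that can be replaced with a simpler closed-form", here is a minimal source-level sketch in C++. It is not ART code: the function names are hypothetical, and the real optimization works on the compiler's HIR rather than on source. It only shows why a loop whose sole effect is a linear induction can be removed once the final value is computed directly.

```cpp
#include <cassert>
#include <cstdint>

// Loop form: `sum` is a linear induction variable used only after the loop.
int64_t SumWithLoop(int32_t n) {
  int64_t sum = 0;
  for (int32_t i = 0; i < n; ++i) {
    sum += 3;  // sum == 3 * i at the top of iteration i
  }
  return sum;
}

// Closed form: the loop body runs max(n, 0) times, so the final value is
// known without iterating and the (now dead) loop can be eliminated.
int64_t SumClosedForm(int32_t n) {
  int64_t trip_count = n > 0 ? n : 0;
  assert(trip_count >= 0);
  return 3 * trip_count;
}
```

Both functions return the same value for every `n`, including non-positive values where the loop body never runs.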
|
5f926055cb88089d8ca27243f35a9dfd89d981f0 |
|
30-Sep-2016 |
Vladimir Marko <vmarko@google.com> |
Revert "Store resolved Strings for AOT code in .bss." There are some issues with oat_test64 on host and aosp_mips-eng. Also reverts "compiler_driver: Fix build." Bug: 20323084 Bug: 30627598 This reverts commit 63dccbbefef3014c99c22748d18befcc7bcb3b41. This reverts commit 04a44135ace10123f059373691594ae0f270a8a4. Change-Id: I568ba3e58cf103987fdd63c8a21521010a9f27c4
|
762869dee6e0eadab5be1c606792d6693bbabf4e |
|
15-Jul-2016 |
Nicolas Geoffray <ngeoffray@google.com> |
Simplify our intrinsic recognizer. - Use the modifiers for storing the intrinsic kind. - Delete dex_file_method_inliner and its associated map. This work was also motivated by the fact that the inline method analyzer leaks intrinsic tables, and even worse, might re-use a table from one dex file to another unrelated dex file in the presence of class unloading and the unlikely event of the dex files getting the same address. test: m test-art-host m test-art-target Change-Id: Ia653d2c72df13889dc85dd8c84997582c034ea4b
|
63dccbbefef3014c99c22748d18befcc7bcb3b41 |
|
21-Sep-2016 |
Vladimir Marko <vmarko@google.com> |
Store resolved Strings for AOT code in .bss. And do some related refactorings. Bug: 20323084 Bug: 30627598 Test: Run ART test suite including gcstress on host and Nexus 9. Test: Run ART test suite including gcstress with baker CC on host and Nexus 9. Test: Build aosp_mips64-eng. Change-Id: I1b12c1570fee8e5da490b47f231050142afcbd1e
|
da079bba8403733cac9bb7415b038ffd77e62403 |
|
26-Sep-2016 |
Nicolas Geoffray <ngeoffray@google.com> |
Cleanup String.<init> handling. Move everything to one place (currently well_known_classes.cc, but no strong preference) and define a macro to easily handle the list of affected methods. test: m test-art-host test: m test-art-target Change-Id: Ib8372d130d5458516a1f1ae31014afc76037fc34
|
5e4e11e171f90d9a3ea178fc8e72aac909de55d5 |
|
22-Sep-2016 |
Nicolas Geoffray <ngeoffray@google.com> |
Clean-up sharpening and compiler driver. Remove dependency on compiler driver for sharpening and dex2dex (the methods called on the compiler driver were doing unnecessary work), and remove the now unused methods in compiler driver. Also remove test that is now invalid, as sharpening always succeeds. test: m test-art-host m test-art-target Change-Id: I54e91c6839bd5b0b86182f2f43ba5d2c112ef908
|
91a6516103b8bf8bb75c3a2840cbdec7521e74a7 |
|
19-Sep-2016 |
Alexandre Rames <alexandre.rames@linaro.org> |
Remove the `CanTriggerGC` side-effects on a few instructions. The side-effect was specified for these instructions as they call runtime. We now have a list of entrypoints that we know cannot trigger GC. We can avoid requiring the side-effect for those. Test: Run ART test suite on Nexus 5X and host. Change-Id: I0e0e6a4d701ce6c75aff486cb0d1bc7fe2e8dda4
|
a5931185c97c7b17981a9fc5016834a0bdd9480b |
|
02-Sep-2016 |
Chih-Hung Hsieh <chh@google.com> |
Fix google-explicit-constructor warnings in art. * Add explicit keyword to conversion constructors, or NOLINT for implicit converters. Bug: 28341362 Test: build with WITH_TIDY=1 Change-Id: I1e1ee2661812944904fedadeff97b620506db47d
|
20e9db6db787e007e7032878c9899b28ec43e93f |
|
14-Sep-2016 |
Aart Bik <ajcbik@google.com> |
Make LinearizeGraph() public (and move it to nodes files) Rationale: It is strange that HLinearOrderIterator is defined (and visible) in nodes.h, but clients have no way to build this order. This CL makes the building available at the usual place. Change-Id: Ib66f2edf6dfc8edd6b429bd4bea3ac7e37440b28 Tests: m test-art
|
d9c90373d640a5e08072cf469c372e24a8c0fc35 |
|
14-Sep-2016 |
David Brazdil <dbrazdil@google.com> |
Move ArrayRef to runtime/base Will be used in upcoming CLs regarding VDEX and VerifierDeps. Test: m test-art-host Change-Id: I68e611a4a52246c2bdf45eab7c61f3212908afd4
|
96b6682d2d65f94c262590ef88bafdc70171ab8c |
|
10-Sep-2016 |
Alexey Frunze <Alexey.Frunze@imgtec.com> |
MIPS32: Implement table-based packed switch Test: booted MIPS32R2 in QEMU Test: test-art-target-run-test-optimizing (MIPS32R2) on CI20 Test: booted MIPS64 (with 2nd arch MIPS32R6) in QEMU Test: test-art-target-run-test-optimizing (MIPS32R6) in QEMU Test: test-art-host-gtest Change-Id: I2e1a65ff1ba9406b84351ba7998f853b1ce4aef9
|
31b12e32073f458950e96d0d1b44e48508cf67e4 |
|
03-Sep-2016 |
Mathieu Chartier <mathieuc@google.com> |
Avoid read barrier for image HLoadClass Concurrent copying baker: X86_64 core-optimizing-pic.oat: 28583112 -> 27906824 (2.4% smaller) Around 0.4% of 2.4% is from re-enabling kBootImageLinkTimeAddress, kBootImageLinkTimePcRelative, and kBootImageAddress. N6P boot.oat 32: 73042140 -> 71891956 (1.57% smaller) N6P boot.oat 64: 83831608 -> 82531456 (1.55% smaller) EAAC: 1252 -> 1245 (32 samples) Bug: 29516974 Test: test-art-host CC baker, N6P booting Change-Id: I9a196cf0157058836981c43c93872e9f0c4919aa
|
2c76e068cb49b6bd687510f887e2c1058678eccb |
|
31-Aug-2016 |
Scott Wakeling <scott.wakeling@linaro.org> |
Allow for testing alternative code generators in codegen_test.cc This will be used in a later patch to test a new VIXL32-based backend in parallel with the existing code_generator_arm. Test: gtest-codegen_test on host and target Change-Id: I0316da0430fa6da0a7c668315f531888d18e7eb3
|
fca16663334e5838790631d8eac95f4ffdb0cc2e |
|
14-Jul-2016 |
Serban Constantinescu <serban.constantinescu@linaro.org> |
Extend the InvokeRuntime() changes to mips. Also fix the side effects for <Static/Instance>Field<Get/Set>. Test: test-art-target Change-Id: Ia4284ccd9d0c88210eaa4458f74728c805e2e076
|
bdf7f1c3ab65ccb70f62db5ab31dba060632d458 |
|
31-Aug-2016 |
Andreas Gampe <agampe@google.com> |
ART: SHARED_REQUIRES to REQUIRES_SHARED This coincides with the actual attribute name and upstream usage. Preparation for deferring to libbase. Test: m Test: m test-art-host Change-Id: Ia8986b5dfd926ba772bf00b0a35eaf83596d8518
|
06a46c44bf1a5cba6c78c3faffc4e7ec1442b210 |
|
20-Jul-2016 |
Alexey Frunze <Alexey.Frunze@imgtec.com> |
MIPS32: Improve string and class loads Tested: - MIPS32 Android boots in QEMU - test-art-host-gtest - test-art-target-run-test-optimizing in QEMU, on CI20 - test-art-target-gtest on CI20 Change-Id: I70fd5d5267f8594c3b29d5a4ccf66b8ca8b09df3
|
26de38bb7f2122417388809f4ff88a7cb5c4af5e |
|
28-Jul-2016 |
Andreas Gampe <agampe@google.com> |
ART: Delete old compiler_enums.h Holdover from the Quick days. Move the two enums that are still used closer to the actual users (and prune no longer used cases). Test: m test-art-host Change-Id: I88aa49961a54635788cafac570ddc3125aa38262
|
328429ff48d06e2cad4ebdd3568ab06de916a10a |
|
06-Jul-2016 |
Artem Serov <artem.serov@linaro.org> |
ARM: Port instr simplification of array accesses. After changing the addressing mode for array accesses (in https://android-review.googlesource.com/248406) the 'add' instruction that calculates the base address for the array can be shared across accesses to the same array. Before https://android-review.googlesource.com/248406: add IP, r[Array], r[Index0], LSL #2 ldr r0, [IP, #12] add IP, r[Array], r[Index1], LSL #2 ldr r0, [IP, #12] Before this CL: add IP, r[Array], #12 ldr r0, [IP, r[Index0], LSL #2] add IP, r[Array], #12 ldr r0, [IP, r[Index1], LSL #2] After this CL: add IP, r[Array], #12 ldr r0, [IP, r[Index0], LSL #2] ldr r0, [IP, r[Index1], LSL #2] Link to the original optimization: https://android-review.googlesource.com/#/c/127310/ Test: Run ART test suite on Nexus 6. Change-Id: Iee26f9a0a7ca46abb90e3f60d19d22dc8dee4d8f
|
e3fb245fbdb5e91cf8a9750504df40bd629e0080 |
|
11-May-2016 |
Alexey Frunze <Alexey.Frunze@imgtec.com> |
MIPS32: Improve method invocation Improvements include: - CodeGeneratorMIPS::GenerateStaticOrDirectCall() supports: - MethodLoadKind::kDirectAddressWithFixup (via literals) - CodePtrLocation::kCallDirectWithFixup (via literals) - MethodLoadKind::kDexCachePcRelative - 32-bit literals to support the above (not ready for general-purpose applications yet because RA is not saved in leaf methods, but is clobbered on MIPS32R2 when simulating PC-relative addressing (MIPS32R6 is OK because it has PC-relative addressing with the lwpc instruction)) - shorter instruction sequences for recursive static/direct calls Tested: - test-art-host-gtest - test-art-target-gtest and test-art-target-run-test-optimizing on: - MIPS32R2 QEMU - CI20 board - MIPS32R6 (2nd arch) QEMU Change-Id: Id5b137ad32d5590487fd154c9a01d3b3e7e044ff
|
e90049140fdfb89080e5cc9b000b0c9be8c18bcd |
|
16-Jun-2016 |
Vladimir Marko <vmarko@google.com> |
Create a typedef for HInstruction::GetInputs() return type. And some other cleanup after https://android-review.googlesource.com/230742 Test: No new tests. ART test suite passed (tested on host). Change-Id: I4743bf17544d0234c6ccb46dd0c1b9aae5c93e17
|
7fe30f952f37dd1e829b8d215d023b301cb82ef9 |
|
29-Jun-2016 |
Anton Kirilov <anton.kirilov@linaro.org> |
Make the Compute() method of all HIRs static. Change-Id: Ibd7687dac907150c8e100791ed7d20d1ae18c9be
|
e8e1127da3f154fae8d2eb16a94203544a182159 |
|
28-Jun-2016 |
Nicolas Geoffray <ngeoffray@google.com> |
Do checks on the fault address when we think it's an NPE. bug:29321958 Change-Id: I28f4da56eb3e0b48721d3ac41114858bc80daadb
|
94ab38f01dc2cf3ed0c6e73e2a6b594c14758d67 |
|
21-Jun-2016 |
David Brazdil <dbrazdil@google.com> |
ART: Run RTP if invoke inputs have more specific type Moving RTP inside the graph builder introduced a regression where replacing the inner parameters with the actual arguments of the HInvoke would not build the inner graph with types more specific than the method's signature. This patch runs RTP on the inner graph again when it is detected that RTP may improve typing precision. Bug: 29595335 Change-Id: I351babc8497c83c2fba589aa51f46eaa0b7ab33c
|
87f3fcbd0db352157fc59148e94647ef21b73bce |
|
28-Apr-2016 |
Vladimir Marko <vmarko@google.com> |
Replace String.charAt() with HIR. Replace String.charAt() with HArrayLength, HBoundsCheck and HArrayGet. This allows GVN on the HArrayLength and BCE on the HBoundsCheck as well as using the infrastructure for HArrayGet, i.e. better handling of constant indexes than the old intrinsic and using the HArm64IntermediateAddress. Bug: 28330359 Change-Id: I32bf1da7eeafe82537a60416abf6ac412baa80dc
|
dbb7f5bef10138ade0fb202da1d61f562b2df649 |
|
30-Mar-2016 |
Vladimir Marko <vmarko@google.com> |
Improve HLoadClass code generation. For classes in the boot image, use either direct pointers or PC-relative addresses. For other classes, use PC-relative access to the dex cache arrays for AOT and direct address of the type's dex cache slot for JIT. For aosp_flounder-userdebug: - 32-bit boot.oat: -252KiB (-0.3%) - 64-bit boot.oat: -412KiB (-0.4%) - 32-bit dalvik cache total: -392KiB (-0.4%) - 64-bit dalvik-cache total: -2312KiB (-1.0%) (contains more files than the 32-bit dalvik cache) For aosp_flounder-userdebug forced to compile PIC: - 32-bit boot.oat: -124KiB (-0.2%) - 64-bit boot.oat: -420KiB (-0.5%) - 32-bit dalvik cache total: -136KiB (-0.1%) - 64-bit dalvik-cache total: -1136KiB (-0.5%) (contains more files than the 32-bit dalvik cache) Bug: 27950288 Change-Id: I4da991a4b7e53c63c92558b97923d18092acf139
|
d6c205eb9b04bcfa072cd5ffdd93deef167ec340 |
|
07-Jun-2016 |
David Brazdil <dbrazdil@google.com> |
ART: Remove redundant MoveInstructionBefore method Change-Id: If53d7011197cc6b9c1702a3d98ef11b59eb76f0c
|
f479d7758dbe7e8740386fbf1d73e05b0277c5e3 |
|
16-May-2016 |
Anton Shamin <anton.shamin@intel.com> |
ART: ArrayGet hoisting restriction added. Currently if we hoist ArrayGet from loop there is no guarantee that insn will be executed at runtime. Because of that we could face issues like crashes in generated code. This patch introduces restriction for ArrayGet hoisting. We say that ArrayGet execution is guaranteed at least one time if its bb dominates all exit blocks. Signed-off-by: Anton Shamin <anton.shamin@intel.com> (cherry picked from commit f89381fed12faf96c45a83a989ae2fff82c05f3b) BUG=29145171 Change-Id: Ia5664dedb1543d78a7b4038801b8372572f069f6
|
f89381fed12faf96c45a83a989ae2fff82c05f3b |
|
16-May-2016 |
Anton Shamin <anton.shamin@intel.com> |
ART: ArrayGet hoisting restriction added. Currently if we hoist ArrayGet from loop there is no guarantee that insn will be executed at runtime. Because of that we could face issues like crashes in generated code. This patch introduces restriction for ArrayGet hoisting. We say that ArrayGet execution is guaranteed at least one time if its bb dominates all exit blocks. Change-Id: I9f72c0f4c33b358341109238bea46cb5a82f490f Signed-off-by: Anton Shamin <anton.shamin@intel.com>
|
372f10e5b0b34e2bb6e2b79aeba6c441e14afd1f |
|
17-May-2016 |
Vladimir Marko <vmarko@google.com> |
Refactor handling of input records. Introduce HInstruction::GetInputRecords(), a new virtual function that returns an ArrayRef<> to all input records. Implement all other functions dealing with input records as wrappers around GetInputRecords(). Rewrite functions that previously used multiple virtual calls to deal with input records, especially in loops, to prefetch the ArrayRef<> only once for each instruction. Besides avoiding all the extra calls, this also allows the compiler (clang++) to perform additional optimizations. This speeds up the Nexus 5 boot image compilation by ~0.5s (4% of "Compile Dex File", 2% of dex2oat time) on AOSP ToT. Change-Id: Id8ebe0fb9405e38d918972a11bd724146e4ca578
|
fcb503cba92133b72e29ab48af1c7e92be91e609 |
|
18-May-2016 |
Vladimir Marko <vmarko@google.com> |
Mark concrete HIR instructions as FINAL. This allows the compiler to apply more optimizations. Change-Id: Ic7d8a457ea4e7d5853195cc4b56482703a1176d5
|
d7c2fdc939bb7efb3e7204d62e54c6a3f7d77f9b |
|
10-May-2016 |
Nicolas Geoffray <ngeoffray@google.com> |
Fix another case of live_in at irreducible loop entry. GVN was implicitly extending the liveness of an instruction across an irreducible loop. Fix this problem by clearing the value set at loop entries that contain an irreducible loop. bug:28252896 (cherry picked from commit 77ce6430af2709432b22344ed656edd8ec80581b) Change-Id: Ie0121e83b2dfe47bcd184b90a69c0194d13fce54
|
77ce6430af2709432b22344ed656edd8ec80581b |
|
10-May-2016 |
Nicolas Geoffray <ngeoffray@google.com> |
Fix another case of live_in at irreducible loop entry. GVN was implicitly extending the liveness of an instruction across an irreducible loop. Fix this problem by clearing the value set at loop entries that contain an irreducible loop. bug:28252896 Change-Id: I68823cb88dceb4c2b4545286ba54fd0c958a48b0
|
7e589feab1b35203fbb8c431213f1d2b2a4ad530 |
|
06-May-2016 |
David Brazdil <dbrazdil@google.com> |
ART: Fix dominance for irreducible loops Computation of dominance was broken in the presence of irreducible loops because the algorithm assumed that back edges are always dominated by their respective headers and a fix-point iteration is therefore unnecessary. This is not true for irreducible loops, forcing us to revisit their loop headers and all dependent blocks. This patch adds a fix-point iteration if a back edge not dominated by its header is found. Bug: 28611485 Change-Id: If84044e49d5b9c682949648033d2861628d7fe05 (cherry picked from commit 3f4a522cc39f5c651e7c718196e989bc81d8c6ef)
|
bf12e4d4209ac4e8fb98b4fd5193208adc7fe3ff |
|
05-Apr-2016 |
Vladimir Marko <vmarko@google.com> |
Optimizing: LoadString may not have any side effects. LoadString does not have any side effects if the string is known to be in the dex cache or it's a boot image string referenced directly, as specified by the string load kind. We need to clear the side effects for these cases to avoid a DCHECK() failure when such LoadString instruction ends up between a ClinitCheck and an instruction to which we want to merge that ClinitCheck. This may happen as a consequence of inlining, LICM and DCE as shown by a regression test. Bug: 27929914 (cherry picked from commit ace7a000a433ce4ecf94f30adea39c01a76fa936) Change-Id: Iaf9c63b6e58aae1e246b43ca52eea0b47a6ad565
|
dce016eab87302f02b0bd903dd2cd86ae512df2d |
|
28-Apr-2016 |
Vladimir Marko <vmarko@google.com> |
Intrinsify String.length() and String.isEmpty() as HIR. Use HArrayLength for String.length() in anticipation of changing the String.charAt() to HBoundsCheck+HArrayGet to allow the existing BCE to seamlessly work for strings. Use HArrayLength+HEqual for String.isEmpty(). We previously relied on inlining but we now want to apply the new intrinsics even when we do not inline, i.e. when compiling debuggable (as is currently the case for boot image) or when we hit inlining limits, i.e. depth, size, or the number of accumulated dex registers. Bug: 28330359 Change-Id: Iab9d2f6d2967bdd930a72eb461f27efe8f37c103
|
3f4a522cc39f5c651e7c718196e989bc81d8c6ef |
|
06-May-2016 |
David Brazdil <dbrazdil@google.com> |
ART: Fix dominance for irreducible loops Computation of dominance was broken in the presence of irreducible loops because the algorithm assumed that back edges are always dominated by their respective headers and a fix-point iteration is therefore unnecessary. This is not true for irreducible loops, forcing us to revisit their loop headers and all dependent blocks. This patch adds a fix-point iteration if a back edge not dominated by its header is found. Bug: 28611485 Change-Id: If84044e49d5b9c682949648033d2861628d7fe05
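
The need for a fix-point iteration can be seen in a small sketch. The C++ below is the textbook set-based iterative dominator computation, not ART's implementation (which, per the message above, only falls back to iteration when it finds a back edge not dominated by its header). `ComputeDominators` and `Dominates` are hypothetical names, and the bitmask representation limits the sketch to 32 blocks.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Returns, for every block b, a bitmask of the blocks that dominate b.
// preds[b] lists the predecessors of block b. Unreachable blocks keep the
// initial "all blocks" mask.
std::vector<uint32_t> ComputeDominators(
    const std::vector<std::vector<size_t>>& preds, size_t entry) {
  const size_t n = preds.size();
  assert(n > 0 && n <= 32 && entry < n);
  const uint32_t all = (n == 32) ? 0xffffffffu : ((1u << n) - 1u);
  std::vector<uint32_t> dom(n, all);
  dom[entry] = 1u << entry;

  // Repeat until nothing changes: with irreducible loops a single pass in
  // an arbitrary order is not guaranteed to reach the final answer.
  bool changed = true;
  while (changed) {
    changed = false;
    for (size_t b = 0; b < n; ++b) {
      if (b == entry) continue;
      uint32_t meet = all;
      for (size_t p : preds[b]) meet &= dom[p];  // intersect over predecessors
      const uint32_t updated = meet | (1u << b);  // a block dominates itself
      if (updated != dom[b]) {
        dom[b] = updated;
        changed = true;
      }
    }
  }
  return dom;
}

// True if block a dominates block b in the result of ComputeDominators().
bool Dominates(const std::vector<uint32_t>& dom, size_t a, size_t b) {
  return ((dom[b] >> a) & 1u) != 0;
}
```

For a graph with edges 0->1, 0->2, 1->2, 2->1 and 1->3, blocks 1 and 2 form a loop with two entries, so neither dominates the other and only block 0 dominates both.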
|
a4336d253b88f95c49891a8084579a4599785e90 |
|
19-Apr-2016 |
Vladimir Marko <vmarko@google.com> |
Use dex cache from compilation unit in RTP. Avoid calling the costly ClassLinker::FindDexCache() from reference type propagation when the dex cache from the compilation unit will do, i.e. almost always. Compiling the Nexus 5 boot image on host under perf(1) shows that the FindDexCache() hits drop from about 0.2% to almost nothing, though enabling inlining for the boot image will increase it a bit to 0.03% due to unavoidable calls from the inliner. Also clean up the ScopedObjectAccess usage a bit. Bug: 28173563 Cherry-picked the "revert-revert" (cherry picked from commit 456307a47336e3d6576ed6d8563b67573a4238d3) and squashed two subsequent fixes Fix RTP to hold mutator lock while using raw mirror pointers. (cherry picked from commit 62977ff198deb673a6990202a2fb8b993217c57c) Fix reference_type_propagation_test. (cherry picked from commit 5eed0c5d27f091c952704f652cd77c4e3833ad88) Change-Id: Ia944452d7ab26aed963832a9346df363743a419f
|
d59f3b1b7f5c1ab9f0731ff9dc60611e8d9a6ede |
|
29-Mar-2016 |
Vladimir Marko <vmarko@google.com> |
Use iterators "before" the use node in HUserRecord<>. Create a new template class IntrusiveForwardList<> that mimics std::forward_list<> except that all allocations are handled externally. This is essentially the same as boost::intrusive::slist<> but since we're not using Boost we have to reinvent the wheel. Use the new container to replace the HUseList and use the iterators to "before" use nodes in HUserRecord<> to avoid the extra pointer to the previous node which was used exclusively for removing nodes from the list. This reduces the size of the HUseListNode by 25%, 32B to 24B in 64-bit compiler, 16B to 12B in 32-bit compiler. This translates directly to overall memory savings for the 64-bit compiler but due to rounding up of the arena allocations to 8B, we do not get any improvement in the 32-bit compiler. Compiling the Nexus 5 boot image with the 64-bit dex2oat on host this CL reduces the memory used for compiling the most hungry method, BatteryStats.dumpLocked(), by ~3.3MiB: Before: MEM: used: 47829200, allocated: 48769120, lost: 939920 Number of arenas allocated: 345, Number of allocations: 815492, avg size: 58 ... UseListNode 13744640 ... After: MEM: used: 44393040, allocated: 45361248, lost: 968208 Number of arenas allocated: 319, Number of allocations: 815492, avg size: 54 ... UseListNode 10308480 ... Note that while we do not ship the 64-bit dex2oat to the device, the JIT compilation for 64-bit processes is using the 64-bit libart-compiler. Bug: 28173563 Bug: 27856014 (cherry picked from commit 46817b876ab00d6b78905b80ed12b4344c522b6c) Change-Id: Ifb2d7b357064b003244e92c0d601d81a05e56a7b
|
456307a47336e3d6576ed6d8563b67573a4238d3 |
|
19-Apr-2016 |
Vladimir Marko <vmarko@google.com> |
Revert "Revert "Use dex cache from compilation unit in RTP."" The exposed issue has been fixed by https://android-review.googlesource.com/215877 Bug:28210356 This reverts commit 34d9b04d8d0006967486c0ad1b221e7b632652af. Change-Id: I5288c923e45d9ef3190dabb89738350a1212a60d
|
46817b876ab00d6b78905b80ed12b4344c522b6c |
|
29-Mar-2016 |
Vladimir Marko <vmarko@google.com> |
Use iterators "before" the use node in HUserRecord<>. Create a new template class IntrusiveForwardList<> that mimics std::forward_list<> except that all allocations are handled externally. This is essentially the same as boost::intrusive::slist<> but since we're not using Boost we have to reinvent the wheel. Use the new container to replace the HUseList and use the iterators to "before" use nodes in HUserRecord<> to avoid the extra pointer to the previous node which was used exclusively for removing nodes from the list. This reduces the size of the HUseListNode by 25%, 32B to 24B in 64-bit compiler, 16B to 12B in 32-bit compiler. This translates directly to overall memory savings for the 64-bit compiler but due to rounding up of the arena allocations to 8B, we do not get any improvement in the 32-bit compiler. Compiling the Nexus 5 boot image with the 64-bit dex2oat on host this CL reduces the memory used for compiling the most hungry method, BatteryStats.dumpLocked(), by ~3.3MiB: Before: MEM: used: 47829200, allocated: 48769120, lost: 939920 Number of arenas allocated: 345, Number of allocations: 815492, avg size: 58 ... UseListNode 13744640 ... After: MEM: used: 44393040, allocated: 45361248, lost: 968208 Number of arenas allocated: 319, Number of allocations: 815492, avg size: 54 ... UseListNode 10308480 ... Note that while we do not ship the 64-bit dex2oat to the device, the JIT compilation for 64-bit processes is using the 64-bit libart-compiler. Bug: 28173563 Change-Id: I985eabd4816f845372d8aaa825a1489cf9569208
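
The idea of keeping a "before" position instead of a `prev` pointer can be sketched as follows. This is a simplified illustration in C++, not the actual IntrusiveForwardList<> or HUserRecord<> code: `Node`, `ForwardList`, `InsertAfter` and `EraseAfter` are hypothetical names chosen to mirror the std::forward_list<> vocabulary.

```cpp
#include <cassert>

// Hook embedded in each element; the list itself never allocates.
struct Node {
  Node* next = nullptr;
  int value = 0;
};

struct ForwardList {
  Node head;  // sentinel, plays the role of before_begin()

  Node* BeforeBegin() { return &head; }

  // Links `node` right after `before` and returns `before`, i.e. the
  // position a user record would remember in order to remove `node` later.
  Node* InsertAfter(Node* before, Node* node) {
    node->next = before->next;
    before->next = node;
    return before;
  }

  // Unlinks the node following `before` in O(1) without any `prev` pointer.
  Node* EraseAfter(Node* before) {
    Node* removed = before->next;
    assert(removed != nullptr);
    before->next = removed->next;
    removed->next = nullptr;
    return removed;
  }
};
```

One consequence of this design is that after a removal, the remembered "before" position of the removed node's successor has to be updated to the removed node's predecessor; that bookkeeping is left out of the sketch.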
|
18b36abc7cc03076fe1c399c0bb8ec8793cc6806 |
|
14-Apr-2016 |
Aart Bik <ajcbik@google.com> |
Remove the no-longer-needed F/I and D/J alias. Rationale: Now that our HIR is type clean (yeah!), we no longer have to conservatively assume F/I and D/J are aliased. This enables more accurate side effects analysis, with improvements in all clients, such as LICM. Refinement: The HIR is not completely clean between building and SSA. This refinement takes care of that, with new tests. BUG=22538329 Change-Id: Id78ff0ff4e325aeebf0022d868937cff73d3a742
|
34d9b04d8d0006967486c0ad1b221e7b632652af |
|
15-Apr-2016 |
Nicolas Geoffray <ngeoffray@google.com> |
Revert "Use dex cache from compilation unit in RTP." bug:28210356 This reverts commit 27bb86edf60e2f9ca2c1075c0c86b9e79374f1d0. Change-Id: Ib27ee90a7e4d516fd2db67a9c4e454023737841a
|
062157f4e07b525728fa58f4ec57ffe1bf15d545 |
|
02-Mar-2016 |
Mingyao Yang <mingyao@google.com> |
Enable allocation elimination as part of LSE After load-store elimination, an allocation may not be used any more and may be eliminated. Change-Id: I7fcaaefa9d6ec2c611e46119c5799293770a917c
|
27bb86edf60e2f9ca2c1075c0c86b9e79374f1d0 |
|
14-Apr-2016 |
Vladimir Marko <vmarko@google.com> |
Use dex cache from compilation unit in RTP. Avoid calling the costly ClassLinker::FindDexCache() from reference type propagation when the dex cache from the compilation unit will do, i.e. almost always. Compiling the Nexus 5 boot image on host under perf(1) shows that the FindDexCache() hits drop from about 0.2% to almost nothing, though enabling inlining for the boot image will increase it a bit to 0.03% due to unavoidable calls from the inliner. Also clean up the ScopedObjectAccess usage a bit. Change-Id: I426a5f9f5da9e64fad2ea57654240789a48d3871
|
1f7624c3bc41251ff72b1409441f541d992967c7 |
|
13-Apr-2016 |
Aart Bik <ajcbik@google.com> |
Revert "Remove the no-longer-needed F/I and D/J alias." This reverts commit 2f52064dcfe5ebce5a998d30766ca079a366c920. Reason: Arrays.sort() returns wrong result on double[] and this CL is the most likely suspect. Rolling back to buy some time for careful analysis and debugging. Change-Id: I58223c42e95c2287520eef863fbcb738b0736d4d
|
2f52064dcfe5ebce5a998d30766ca079a366c920 |
|
13-Apr-2016 |
Aart Bik <ajcbik@google.com> |
Remove the no-longer-needed F/I and D/J alias. Rationale: Now that our HIR is type clean (yeah!), we no longer have to conservatively assume F/I and D/J are aliased. This enables more accurate side effects analysis, with improvements in all clients, such as LICM. BUG=22538329 Change-Id: Iba1fb09ff063f31b5893f588aa6d0c5ab3b42f39
|
c2e8af9659db7e456b26febb1b971900057ad427 |
|
05-Apr-2016 |
David Brazdil <dbrazdil@google.com> |
ART: Speed up HGraph::PopulateIrreducibleRecursive Populating an irreducible loop can potentially traverse all possible paths through the HGraph, leading to an exponential algorithm. This patch adds a bit vector of nodes whose membership in the loop has been decided and need not be revisited again. Bug: 27856014 Change-Id: I3696f08c846e6f40e5de44cb771811bac7e3e08a
|
dee58d6bb6d567fcd0c4f39d8d690c3acaf0e432 |
|
07-Apr-2016 |
David Brazdil <dbrazdil@google.com> |
Revert "Revert "Refactor HGraphBuilder and SsaBuilder to remove HLocals"" This patch merges the instruction-building phases from HGraphBuilder and SsaBuilder into a single HInstructionBuilder class. As a result, it is not necessary to generate HLocal, HLoadLocal and HStoreLocal instructions any more, as the builder produces SSA form directly. Saves 5-15% of arena-allocated memory (see bug for more data): GMS 20.46MB => 19.26MB (-5.86%) Maps 24.12MB => 21.47MB (-10.98%) YouTube 28.60MB => 26.01MB (-9.05%) This CL fixed an issue with parsing quickened instructions. Bug: 27894376 Bug: 27998571 Bug: 27995065 Change-Id: I20dbe1bf2d0fe296377478db98cb86cba695e694
|
ace7a000a433ce4ecf94f30adea39c01a76fa936 |
|
05-Apr-2016 |
Vladimir Marko <vmarko@google.com> |
Optimizing: LoadString may not have any side effects. LoadString does not have any side effects if the string is known to be in the dex cache or it's a boot image string referenced directly, as specified by the string load kind. We need to clear the side effects for these cases to avoid a DCHECK() failure when such LoadString instruction ends up between a ClinitCheck and an instruction to which we want to merge that ClinitCheck. This may happen as a consequence of inlining, LICM and DCE as shown by a regression test. Bug: 27929914 Change-Id: I7b3bddf7d8c79ce1828a4a751f1270cf2e3d61f0
|
60328910cad396589474f8513391ba733d19390b |
|
04-Apr-2016 |
David Brazdil <dbrazdil@google.com> |
Revert "Refactor HGraphBuilder and SsaBuilder to remove HLocals" Bug: 27995065 This reverts commit e3ff7b293be2a6791fe9d135d660c0cffe4bd73f. Change-Id: I5363c7ce18f47fd422c15eed5423a345a57249d8
|
e3ff7b293be2a6791fe9d135d660c0cffe4bd73f |
|
02-Mar-2016 |
David Brazdil <dbrazdil@google.com> |
Refactor HGraphBuilder and SsaBuilder to remove HLocals This patch merges the instruction-building phases from HGraphBuilder and SsaBuilder into a single HInstructionBuilder class. As a result, it is not necessary to generate HLocal, HLoadLocal and HStoreLocal instructions any more, as the builder produces SSA form directly. Saves 5-15% of arena-allocated memory (see bug for more data): GMS 20.46MB => 19.26MB (-5.86%) Maps 24.12MB => 21.47MB (-10.98%) YouTube 28.60MB => 26.01MB (-9.05%) Bug: 27894376 Change-Id: Iefe28d40600c169c5d306fd2c77034ae19476d90
|
86ea7eeabe30c98bbe1651a51d03cb89776724e7 |
|
16-Feb-2016 |
David Brazdil <dbrazdil@google.com> |
Build dominator tree before generating HInstructions Second CL in the series of merging HGraphBuilder and SsaBuilder. This patch refactors the builders so that dominator tree can be built before any HInstructions are generated. This puts the SsaBuilder removal of HLoadLocals/HStoreLocals straight after HGraphBuilder's HInstruction generation phase. Next CL will therefore be able to merge them. This patch also adds util classes for iterating bytecode and switch tables, which allowed the code to be simplified. Bug: 27894376 Change-Id: Ic425d298b2e6e7980481ed697230b1a0b7904526
|
f355c3ff08710ac2eba3aac2aacc5e65caa06b4c |
|
30-Mar-2016 |
Roland Levillain <rpl@google.com> |
Fix Boolean to integral types conversions. Bug: 27616343 Change-Id: I050f92045bca1b8b5d6da53547cc617f17be84b1
|
cac5a7e871f1f346b317894359ad06fa7bd67fba |
|
22-Feb-2016 |
Vladimir Marko <vmarko@google.com> |
Optimizing: Improve const-string code generation. For strings in the boot image, use either direct pointers or pc-relative addresses. For other strings, use PC-relative access to the dex cache arrays for AOT and direct address of the string's dex cache slot for JIT. For aosp_flounder-userdebug: - 32-bit boot.oat: -692KiB (-0.9%) - 64-bit boot.oat: -948KiB (-1.1%) - 32-bit dalvik cache total: -900KiB (-0.9%) - 64-bit dalvik cache total: -3672KiB (-1.5%) (contains more files than the 32-bit dalvik cache) For aosp_flounder-userdebug forced to compile PIC: - 32-bit boot.oat: -380KiB (-0.5%) - 64-bit boot.oat: -928KiB (-1.0%) - 32-bit dalvik cache total: -468KiB (-0.4%) - 64-bit dalvik cache total: -1928KiB (-0.8%) (contains more files than the 32-bit dalvik cache) Bug: 26884697 Change-Id: Iec7266ce67e6fedc107be78fab2e742a8dab2696
|
5b5b9319ff970979ed47d41a41283e4faeffb602 |
|
22-Mar-2016 |
Roland Levillain <rpl@google.com> |
Fix and improve shift and rotate operations. - Define maximum int and long shift & rotate distances as int32_t constants, as shift & rotate distances are 32-bit integer values. - Consider the (long, long) inputs case as invalid for static evaluation of shift & rotate operations. - Add more checks in shift & rotate operations constructors as well as in art::GraphChecker. Change-Id: I754b326c3a341c9cc567d1720b327dad6fcbf9d6
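
As background for the maximum shift and rotate distances, here is a small C++ sketch of Java's semantics, where only the low 5 bits (int) or 6 bits (long) of the distance are used. The constant and function names below are assumptions made for illustration and are not taken from the change itself; the sketch is not ART code.

```cpp
#include <cassert>
#include <cstdint>

// Assumed names: masks selecting the bits of the distance that matter.
constexpr int32_t kMaxIntShiftDistance = 0x1f;   // int: low 5 bits
constexpr int32_t kMaxLongShiftDistance = 0x3f;  // long: low 6 bits

// Java-style `value << distance` on int.
int32_t ShlInt(int32_t value, int32_t distance) {
  const uint32_t d = static_cast<uint32_t>(distance & kMaxIntShiftDistance);
  return static_cast<int32_t>(static_cast<uint32_t>(value) << d);
}

// Java-style `value << distance` on long; the distance is still a 32-bit int.
int64_t ShlLong(int64_t value, int32_t distance) {
  const uint32_t d = static_cast<uint32_t>(distance & kMaxLongShiftDistance);
  return static_cast<int64_t>(static_cast<uint64_t>(value) << d);
}

// Rotate right on int, as in Integer.rotateRight().
int32_t RorInt(int32_t value, int32_t distance) {
  const uint32_t v = static_cast<uint32_t>(value);
  const uint32_t d = static_cast<uint32_t>(distance & kMaxIntShiftDistance);
  assert(d < 32u);
  if (d == 0u) return value;  // avoid the undefined shift by 32 below
  return static_cast<int32_t>((v >> d) | (v << (32u - d)));
}
```

Because the distance is masked, `ShlInt(1, 33)` behaves like a shift by 1, and a negative rotate distance such as -1 behaves like a distance of 31.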
|
937e6cd515bbe7ff2f255c8fcd40bf1a575a9a16 |
|
22-Mar-2016 |
Roland Levillain <rpl@google.com> |
Tighten art::HNeg type constraints on its input. Ensure art::HNeg is only passed a type having the kind of its input. For a boolean, byte, short, or char input, it means HNeg's type should be int. Bug: 27684275 Change-Id: Ic8442c62090a8ab65590754874a14a0deb7acd8d
|
f6a35de9eeefb20f6446f1b4815b4dcb0161d09c |
|
21-Mar-2016 |
Vladimir Marko <vmarko@google.com> |
Optimizing: Fix register allocator validation memory usage. Also attribute ArenaBitVector allocations to appropriate passes. This was used to track down the source of the excessive memory allocations. Bug: 27690481 Change-Id: Ib895984cb7c04e24cbc7abbd8322079bab8ab100
|
1a65388f1d86bb232c2e44fecb44cebe13105d2e |
|
18-Mar-2016 |
Roland Levillain <rpl@google.com> |
Clean up art::HConstant predicates. - Make the difference between arithmetic zero and zero-bit pattern unambiguous. - Introduce Boolean predicates in art::HIntConstant for when they are used as Booleans. - Introduce arithmetic positive and negative zero predicates for floating-point constants. Bug: 27639313 Change-Id: Ia04ecc6f6aa7450136028c5362ed429760c883bd
|
22c4922c6b31e154a6814c4abe9015d9ba156911 |
|
18-Mar-2016 |
Roland Levillain <rpl@google.com> |
Ensure art::HRor supports boolean, byte, short and char inputs. Also extend tests covering the IntegerRotateLeft, LongRotateLeft, IntegerRotateRight and LongRotateRight intrinsics and their translation into an art::HRor instruction. Bug: 27682579 Change-Id: I89f6ea6a7315659a172482bf09875cfb7e7422a1
|
a5c4a4060edd03eda017abebc85f24cffb083ba7 |
|
15-Mar-2016 |
Roland Levillain <rpl@google.com> |
Make art::HCompare support boolean, byte, short and char inputs. Also extend tests covering the IntegerSignum, LongSignum, IntegerCompare and LongCompare intrinsics and their translation into an art::HCompare instruction. Bug: 27629913 Change-Id: I0afc75ee6e82602b01ec348bbb36a08e8abb8bb8
|
949e54dec11582cb9bb96f0d68e6485dd20336f1 |
|
15-Mar-2016 |
Nicolas Geoffray <ngeoffray@google.com> |
Fix HDoubleConstant::IsZero and HFloatConstant::IsZero. bug:27639313 Change-Id: I2f30a65a07662dfce0a6d6f4ed356a8a0b3dcdef
|
1693a1f9c83a0bf5a29fa18ddc2d87e04e049233 |
|
15-Mar-2016 |
Roland Levillain <rpl@google.com> |
Make art::HCompare side effect free. All our back ends implement all comparisons without making a runtime call, so we can mark art::HCompare as a side effect free instruction unconditionally. Change-Id: I9a9e7c09156c642edb6af1fe84408f887e762f2e
|
18401b748a3180f52e42547ede22d1b184fe8c43 |
|
11-Mar-2016 |
Nicolas Geoffray <ngeoffray@google.com> |
Fix invariant in reference type propagation. Also some cleanups. Change-Id: I7f0ec7d06b4bab10dbfa230c757447d311658f93
|
7fc6350f6f1ab04b52b9cd7542e0790528296cbe |
|
09-Feb-2016 |
Artem Serov <artem.serov@linaro.org> |
Integrate BitwiseNegated into shared framework. Share implementation between arm and arm64. Change-Id: I0dd12e772cb23b4c181fd0b1e2a447470b1d8702
|
bdd7935c2adc3ad190ee87958e714a36f33cedae |
|
14-Feb-2016 |
Anton Shamin <anton.shamin@intel.com> |
Revert "Revert "Revert "Revert "Change condition to opposite if lhs is constant"""" This reverts commit d4aee949b3dd976295201b5310f13aa2df40afa1. Change-Id: I505b8c9863c310a3a708f580b00d425b750c9541
|
7c9c31ca3b94a8e0828d2d8f9747fd579ca40305 |
|
09-Mar-2016 |
Andreas Gampe <agampe@google.com> |
ART: Fix missing include The SwitchTable needs a function from an inl file. Change-Id: I624d71e0c0efc0c87150d7ef3be71e0b4506c75a
|
3f52306b259caed1c654c4b3fd5b594d5ec8d46c |
|
29-Feb-2016 |
David Brazdil <dbrazdil@google.com> |
ART: Fix overlapping instruction IDs in inliner Inliner creates the inner graph so that it generates instruction IDs higher than the outer graph. This was broken because the inliner would create instructions in the outer graph before the inner graph is inlined. The bug cannot be triggered because the offending instruction would share the same ID as the first inner HLocal, which is removed before the inner graph is inlined. The added DCHECKs reveal the hidden problem and make it safe for HLocals to be removed in the future. Change-Id: I486eb0f3987e20c50cbec0fb06332229e07fbae9
|
d65925108fc99607f3b447d8505fe6713acda55c |
|
29-Feb-2016 |
Nicolas Geoffray <ngeoffray@google.com> |
Bug fix for polymorphic inlining. The code used to wrongly propagate try/catch information to new blocks. Since it has the same logic as HGraph::InlineInto, extract the code that updates loop and try/catch information to blocks to a shared method. bug:27330865 bug:27372101 bug:27360329 (cherry picked from commit a1d8ddfaf09545f99bc326dff97ab604d4574eb6) Change-Id: Ice0373ec0a1c24d78121634a377f6f502e814cfb
|
a1d8ddfaf09545f99bc326dff97ab604d4574eb6 |
|
29-Feb-2016 |
Nicolas Geoffray <ngeoffray@google.com> |
Bug fix for polymorphic inlining. The code used to wrongly propagate try/catch information to new blocks. Since it has the same logic as HGraph::InlineInto, extract the code that updates loop and try/catch information to blocks to a shared method. bug:27330865 bug:27372101 bug:27360329 Change-Id: I4386f724d8d412bde5bcc04fda6955bc3bacf5a9
|
a1de9188a05afdecca8cd04ecc4fefbac8b9880f |
|
25-Feb-2016 |
Vladimir Marko <vmarko@google.com> |
Optimizing: Reduce memory usage of HInstructions. Pack narrow fields and flags into a single 32-bit field. Change-Id: Ib2f7abf987caee0339018d21f0d498f8db63542d
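As a rough illustration of the packing this change describes, the sketch below keeps an enum and a flag in a single 32-bit word. The `BitField` helper, the `PackedInstruction` class, and the field widths are hypothetical names chosen for this example; they are assumptions, not ART's actual declarations.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: describes a field of kSize bits starting at bit kPosition
// inside a 32-bit storage word.
template <typename T, size_t kPosition, size_t kSize>
struct BitField {
  static_assert(kSize > 0 && kSize < 32 && kPosition + kSize <= 32,
                "field must fit in 32 bits");
  static constexpr uint32_t kMask = ((1u << kSize) - 1u) << kPosition;

  // Extract the field value from the storage word.
  static constexpr T Decode(uint32_t storage) {
    return static_cast<T>((storage & kMask) >> kPosition);
  }

  // Return the storage word with this field replaced by `value`.
  static constexpr uint32_t Update(T value, uint32_t storage) {
    return (storage & ~kMask) |
           ((static_cast<uint32_t>(value) << kPosition) & kMask);
  }
};

// Illustrative value type; four enumerators fit in the 3-bit field below.
enum class Type : uint8_t { kInt, kLong, kFloat, kDouble };

// Hypothetical instruction that stores a narrow field and a flag in one word
// instead of in separate members.
class PackedInstruction {
 public:
  Type GetType() const { return TypeField::Decode(packed_); }
  void SetType(Type type) { packed_ = TypeField::Update(type, packed_); }
  bool CanThrow() const { return CanThrowFlag::Decode(packed_); }
  void SetCanThrow(bool value) { packed_ = CanThrowFlag::Update(value, packed_); }

 private:
  using TypeField = BitField<Type, 0, 3>;
  using CanThrowFlag = BitField<bool, 3, 1>;
  uint32_t packed_ = 0;
};
```

Updating one field leaves the others untouched because `Update` clears only the bits covered by that field's mask.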
|
9ff0d205fd60cba6753a91f613b198ca2d67f04d |
|
11-Jan-2016 |
Kevin Brodsky <kevin.brodsky@linaro.org> |
Optimizing: ARM64 negated bitwise operations simplification Use negated instructions on ARM64 to replace [bitwise operation + not] patterns, that is: a & ~b (BIC) a | ~b (ORN) a ^ ~b (EON) The simplification only happens if the Not is only used by the bitwise operation. It does not happen if both inputs are Not's (this should be handled by a generic simplification applying De Morgan's laws). Change-Id: I0e112b23fd8b8e10f09bfeff5994508a8ff96e9c
|
4a0dad67867f389e01a5a6c0fe381d210f687c0d |
|
25-Jan-2016 |
Artem Udovichenko <artem.u@samsung.com> |
Revert "Revert "ARM/ARM64: Extend support of instruction combining."" This reverts commit 6b5afdd144d2bb3bf994240797834b5666b2cf98. Change-Id: Ic27a10f02e21109503edd64e6d73d1bb0c6a8ac6
|
e53bd8160ad2892f33849108d3b1099992a311fd |
|
24-Feb-2016 |
Roland Levillain <rpl@google.com> |
Remove unreachable code paths in constant folding. Change-Id: I7ffb361711c87f6b1b98d172d2cfdf9b2ba65607
|
e4084a5eb46dc6b99c0e0b74bcdecccaceb28fe7 |
|
18-Feb-2016 |
Nicolas Geoffray <ngeoffray@google.com> |
Small inlining improvements. - Use the type_index in the current dex file for classes not defined in the current dex file. - Make the loading of the vtable field of a class have no side effects to enable gvn'ing it. Note that those improvements only affect the JIT, where we don't have checker support. Change-Id: I519f52bd8270f2b828f0920a1214da33cf788f41
|
916cc1d504f10a24f43b384e035fdecbe6a74b4c |
|
18-Feb-2016 |
Nicolas Geoffray <ngeoffray@google.com> |
Implement polymorphic inlining. For example, before: HInvokeVirtual After: If (receiver == Foo) { // inlined code. } else if (receiver == Bar) { // inlined code } else { // HInvokeVirtual or HDeoptimize(receiver != Baz) } Change-Id: I5ce305aef8f39f8294bf2b2bcfe60e0dddcfdbec
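The before/after shape in this message can be sketched in plain C++ as a chain of exact-type guards followed by a fallback virtual call. This is only an analogy under stated assumptions: the `Shape` hierarchy and `SidesGuarded` are hypothetical, and the class comparison is approximated with `typeid` rather than whatever check the compiler actually emits.

```cpp
#include <cassert>
#include <typeinfo>

// Hypothetical receiver hierarchy used only for this illustration.
struct Shape {
  virtual ~Shape() = default;
  virtual int Sides() const = 0;
};
struct Triangle final : Shape {
  int Sides() const override { return 3; }
};
struct Square final : Shape {
  int Sides() const override { return 4; }
};
struct Hexagon final : Shape {
  int Sides() const override { return 6; }
};

// Hand-written equivalent of a call site after polymorphic inlining:
// the types seen most often are tested first and their bodies are inlined.
int SidesGuarded(const Shape& receiver) {
  if (typeid(receiver) == typeid(Triangle)) {
    return 3;  // Inlined body of Triangle::Sides().
  } else if (typeid(receiver) == typeid(Square)) {
    return 4;  // Inlined body of Square::Sides().
  } else {
    return receiver.Sides();  // Fallback: the original virtual call.
  }
}
```

The fallback branch keeps the call correct for receiver types that were not anticipated; the message notes that the last branch may instead be a deoptimization.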
|
31dd3d60491148d345c1edae1ccd090a1b67dd2b |
|
16-Feb-2016 |
Roland Levillain <rpl@google.com> |
Extend constant folding to float and double operations. Change-Id: I2837064b2ceea587bc171fc520507f13355292c6
|
55bd749991f9a0a73f612696e1a93e739380546b |
|
16-Feb-2016 |
Nicolas Geoffray <ngeoffray@google.com> |
Refactor the inliner. In preparation for more polymorphic inlining, refactor the inliner a bit. Change-Id: Ie3fd6c1ef205f1089989c67a527e6f57ff3c8b5d
|
9779307ce8f2dd40c429abb0f0cafc1415f70648 |
|
16-Feb-2016 |
Nicolas Geoffray <ngeoffray@google.com> |
HInvokeStaticOrDirect may not have a special input. For irreducible loops, we disable the generation of HX86ComputeBaseMethodAddress, so intrinsics code should not assume it's there. bug:27149923 Change-Id: I78ba0ca7aefa4033227c77ba438b6eaca53dadd9
|
c0b601b5e4c1add5eefd45f2f4d2c376a20ba4d4 |
|
08-Feb-2016 |
David Brazdil <dbrazdil@google.com> |
ART: Implement HSelect with CSEL/FCSEL on arm64 Change-Id: I549af0cba3c5048066a2d1206b78a70b496d349e
|
badd826664896d4a9628a5a89b78016894aa414b |
|
02-Feb-2016 |
David Brazdil <dbrazdil@google.com> |
ART: Run SsaBuilder from HGraphBuilder First step towards merging the two passes, which will later result in HGraphBuilder directly producing SSA form. This CL mostly just updates tests broken by not being able to inspect the pre-SSA form. Using HLocals outside the HGraphBuilder is now deprecated. Bug: 27150508 Change-Id: I00fb6050580f409dcc5aa5b5aa3a536d6e8d759e
|
6e332529c33be4d7dae5dad3609a839f4c0d3bfc |
|
02-Feb-2016 |
David Brazdil <dbrazdil@google.com> |
ART: Remove HTemporary Change-Id: I21b984224370a9ce7a4a13a9652503cfb03c5f03
|
86503785cd6414b8692e5c83cadaa2972b6a099b |
|
11-Feb-2016 |
Roland Levillain <rpl@google.com> |
Fix x86-64 Baker's read barrier fast path for CheckCast. Use an art::x86_64::Label instead of an art::x86_64::NearLabel as end label when emitting code for a HCheckCast instruction, as the range of the latter may sometimes be too short when Baker's read barriers are enabled. Bug: 12687968 Change-Id: Ia9742dce65be7d4fb104688f3c4717b65df1fb54
|
b331febbab8e916680faba722cc84b66b84218a3 |
|
05-Feb-2016 |
Nicolas Geoffray <ngeoffray@google.com> |
Revert "Revert "Implement on-stack replacement for arm/arm64/x86/x86_64."" This reverts commit bd89a5c556324062b7d841843b039392e84cfaf4. Change-Id: I08d190431520baa7fcec8fbdb444519f25ac8d44
|
bd89a5c556324062b7d841843b039392e84cfaf4 |
|
05-Feb-2016 |
David Brazdil <dbrazdil@google.com> |
Revert "Implement on-stack replacement for arm/arm64/x86/x86_64." DCHECK whether loop headers are covered fails. This reverts commit 891bc286963892ed96134ca1adb7822737af9710. Change-Id: I0f9a90630b014b16d20ba1dfba31ce63e6648021
|
891bc286963892ed96134ca1adb7822737af9710 |
|
29-Jan-2016 |
Nicolas Geoffray <ngeoffray@google.com> |
Implement on-stack replacement for arm/arm64/x86/x86_64. High-level overview: - osr_method_threshold is used to know when to compile a method in osr mode (-> treat all loops as irreducible). - branch instructions in the compiler query whether they can jump to an osr method. - An osr entry point is found through the stack maps: if a stack map is duplicated in the CodeInfo, it is an osr entry point. Change-Id: Ifb39338cd281e2c7eccce67f4e18d46428be71e4
|
2f10a5fb8c236a6786928f0323bd312c3ee9a4cc |
|
25-Jan-2016 |
Mark P Mendell <mark.p.mendell@intel.com> |
Revert "Revert "X86: Use the constant area for more operations."" This reverts commit cf8d1bb97e193e02b430d707d3b669565fababb4. Handle the case of an intrinsic where CurrentMethod is still an input. This will be the case when there are unresolved classes in the hierarchy. Add a test case to confirm that we don't crash when handling Math.abs, which wants to add a pointer to the constant area for the bitmask to be used to remove the sign bit. Enhance 565-checker-condition-liveness to check for the case of deeply nested EmitAtUseSite chains. Change-Id: I022e8b96a32f5bf464331d0c318c56b9d0ac3c9a
|
03196cfae4e8a91ce37d257b315f78a965a79829 |
|
01-Feb-2016 |
Nicolas Geoffray <ngeoffray@google.com> |
Be on the safe side: emit an environment for runtime calls. Even if those runtime calls don't throw, they may be interrupted and be asked to dump their stack. Since dumping a stack also dumps locked Java objects, we need a DexRegisterMap at these locations to know the location of those objects. Adds 0.05% to boot image code size. bug:26168076 Change-Id: I7c3975addea9ddf3123183b07108b0701bb26fc8
|
a42363f79832a6e14f348514664dc6dc3edf9da2 |
|
17-Dec-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Implement first kind of polymorphic inlining. Add HClassTableGet to fetch an ArtMethod from the vtable or imt, and compare it to the only method the profiling saw. Change-Id: I76afd3689178f10e3be048aa3ac9a97c6f63295d
|
74eb1b264691c4eb399d0858015a7fc13c476ac6 |
|
14-Dec-2015 |
David Brazdil <dbrazdil@google.com> |
ART: Implement HSelect This patch adds a new HIR instruction to Optimizing. HSelect returns one of two inputs based on the outcome of a condition. This is only an initial implementation, which: - defines the new instruction, - repurposes BooleanSimplifier to emit it, - extends InstructionSimplifier to statically resolve it, - updates existing code and tests accordingly. Code generators currently emit fallback if/then/else code and will be updated in follow-up CLs to use platform-specific conditional moves when possible. Change-Id: Ib61b17146487ebe6b55350c2b589f0b971dcaaee
|
b3e773eea39a156b3eacf915ba84e3af1a5c14fa |
|
26-Jan-2016 |
David Brazdil <dbrazdil@google.com> |
ART: Implement support for instruction inlining Optimizing HIR contains 'non-materialized' instructions which are emitted at their use sites rather than their defining sites. This was not properly handled by the liveness analysis, which did not adjust the use positions of the inputs of such instructions. Despite the analysis being incorrect, the current use cases never produce incorrect code. This patch generalizes the concept of inlined instructions and updates the liveness analysis to compute the use positions correctly. Change-Id: Id703c154b20ab861241ae5c715a150385d3ff621
|
f39745e663f8f2634fc8858e427b77da98f8f2b4 |
|
26-Jan-2016 |
Vladimir Marko <vmarko@google.com> |
ART: Remove some unnecessary mutator lock annotations. The StackReference<> pointer held by a Handle<> can be used without holding the mutator lock. We already do that when we copy Handle<>s around. Only accessing the actual content of the pointed-to StackReference<> needs to be done while holding the mutator lock. Change-Id: I5f93bd7e277383192f1f16dff6883ecb26387414
|
cf8d1bb97e193e02b430d707d3b669565fababb4 |
|
25-Jan-2016 |
Nicolas Geoffray <ngeoffray@google.com> |
Revert "X86: Use the constant area for more operations." Hits a DCHECK: dex2oatd F 19461 20411 art/compiler/optimizing/pc_relative_fixups_x86.cc:196] Check failed: !invoke_static_or_direct->HasCurrentMethodInput() This reverts commit dc00454f0b9a134f01f79b419200f4044c2af5c6. Change-Id: Idfcacf12eb9e1dd7e68d95e880fda0f76f90e9ed
|
dc00454f0b9a134f01f79b419200f4044c2af5c6 |
|
30-Oct-2015 |
Mark Mendell <mark.p.mendell@intel.com> |
X86: Use the constant area for more operations. Allow FP HNeg to use the constant area to hold the constant to flip the sign bit. Enhance some math intrinsics to allow the use of the constant area: Abs{Float,Double}, {Min,Max}{FloatFloat,DoubleDouble}. Allow compares of floats/doubles to constants using the constant area. These eliminate almost all uses of loading constants from the stack. Change-Id: Ic4b831565825cbe9f0801b1b53c1013be7c87ae4 Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>
|
788f2f05c3e5b0e5bda247b00e34f0094585546f |
|
22-Jan-2016 |
Nicolas Geoffray <ngeoffray@google.com> |
Revert "Revert "Inline methods with loops."" Bug: 26689526 This reverts commit 451ad8d1be9a1949ea3c3e3a713a9e76198a8b2d. Change-Id: If484fe4c0744254dd7568fd5006e574d621a1855
|
d4aee949b3dd976295201b5310f13aa2df40afa1 |
|
22-Jan-2016 |
Nicolas Geoffray <ngeoffray@google.com> |
Revert "Revert "Revert "Change condition to opposite if lhs is constant""" Fails two checker tests: 458-checker-instruction-simplification 537-checker-jump-over-jump This reverts commit 884e54c8a45e49b58cb1127c8ed890f79f382601. Change-Id: I22553e4e77662736b8b453d911a2f4e601f3a27e
|
6b5afdd144d2bb3bf994240797834b5666b2cf98 |
|
22-Jan-2016 |
Nicolas Geoffray <ngeoffray@google.com> |
Revert "ARM/ARM64: Extend support of instruction combining." The test fails its checker parts. This reverts commit debeb98aaa8950caf1a19df490f2ac9bf563075b. Change-Id: I49929e15950c7814da6c411ecd2b640d12de80df
|
884e54c8a45e49b58cb1127c8ed890f79f382601 |
|
22-Jan-2016 |
Nicolas Geoffray <ngeoffray@google.com> |
Revert "Revert "Change condition to opposite if lhs is constant"" This reverts commit a05cacc11fa075246c38497c01b949745fadc54b. Change-Id: Ifdc261fd4dfb2c538017fe1d69af723aafd4afef
|
debeb98aaa8950caf1a19df490f2ac9bf563075b |
|
11-Dec-2015 |
Ilmir Usmanov <i.usmanov@samsung.com> |
ARM/ARM64: Extend support of instruction combining. Combine multiply instructions in the following way: ARM64: MUL/NEG -> MNEG ARM32 (32-bit integers only): MUL/ADD -> MLA MUL/SUB -> MLS Change-Id: If20f2d8fb060145ab6fbceeb5a8f1a3d02e0ecdb
|
a0ee77157ee6ceb72292c20bc299c1d24fe95a39 |
|
20-Jan-2016 |
Andreas Gampe <agampe@google.com> |
Revert "Inline methods with loops." This reverts commit 82fc9bb45dbf8ff728122fb7ab72d1eb7b2f4869. Loop inlining exposes issues with BCE. Bug: 26689526 (cherry picked from commit 451ad8d1be9a1949ea3c3e3a713a9e76198a8b2d) Change-Id: Ie6f260e6a224aeb7f5ed93df378b6cefba10d35f
|
451ad8d1be9a1949ea3c3e3a713a9e76198a8b2d |
|
20-Jan-2016 |
Andreas Gampe <agampe@google.com> |
Revert "Inline methods with loops." This reverts commit 82fc9bb45dbf8ff728122fb7ab72d1eb7b2f4869. Loop inlining exposes issues with BCE. Bug: 26689526 Change-Id: Id9983d7f9d3c5579d91e56e4699d4d939517b2dc
|
bc9ab1630a198efbbf730275541291321ac3d2d4 |
|
20-Jan-2016 |
David Brazdil <dbrazdil@google.com> |
ART: Cannot assume String.<init> called on NewInstance Irreducible loops create uneliminatable phis for all live vregs. This breaks the StringFactory optimization which assumes that the first input is always a NewInstance instruction. Bug: 26676472 Change-Id: Ib7dfdadbafbbfef89e1f5b1a80eb75ecf792621a
|
82fc9bb45dbf8ff728122fb7ab72d1eb7b2f4869 |
|
19-Jan-2016 |
Nicolas Geoffray <ngeoffray@google.com> |
Inline methods with loops. Compiling Gms/Fb/Framework/Docs: - Overall compilation-time increase: 2.2% - Overall code size increase: 1.1% Performance improvements: - Richards with jit: +6% - Takl: +11% Change-Id: I0a6fcf2a360e5ad193cd95b5c4fe92227ac6bd96
|
09aa147f0891ef28a95d89e8ad61c429f82ddd5b |
|
19-Jan-2016 |
Nicolas Geoffray <ngeoffray@google.com> |
Disable DCE when there are irreducible loops. Also ensure an instruction that requires an environment does have one. Change-Id: I41a8460e05ef320f872197d3be7847e7ffaa6ee8
|
6de1938e562b0d06e462512dd806166e754035ea |
|
08-Jan-2016 |
David Brazdil <dbrazdil@google.com> |
ART: Remove incorrect HFakeString optimization Simplification of HFakeString assumes that it cannot be used until String.<init> is called which is not true and causes different behaviour between the compiler and the interpreter. This patch removes the optimization together with the HFakeString instruction. Instead, HNewInstance is generated and an empty String allocated until it is replaced with the result of the StringFactory call. This is consistent with the behaviour of the interpreter but is too conservative. A follow-up CL will attempt to optimize out the initial allocation when possible. Bug: 26457745 Bug: 26486014 Change-Id: I7139e37ed00a880715bfc234896a930fde670c44
|
15bd22849ee6a1ffb3fb3630f686c2870bdf1bbc |
|
05-Jan-2016 |
Nicolas Geoffray <ngeoffray@google.com> |
Implement irreducible loop support in optimizing. So we don't fall back to the interpreter in the presence of irreducible loops. Implications: - A loop pre-header does not necessarily dominate a loop header. - Non-constant redundant phis will be kept in loop headers, to satisfy our linear scan register allocation algorithm. - whole-graph optimizations, such as gvn, licm, lse, and dce need to know when they are dealing with irreducible loops. Change-Id: I2cea8934ce0b40162d215353497c7f77d6c9137e
|
da2299c918b8b305098d2dd6ee81f012fc12a1e5 |
|
13-Jan-2016 |
Nicolas Geoffray <ngeoffray@google.com> |
Set side effects to HNullCheck and HBoundsCheck. Both can trigger GC, as they will call NullPointerException or IndexOutOfBoundsException constructors. bug:26532563 (cherry picked from commit 1af564e2d3b560fb9a076eb35ea20471aed0dc92) Change-Id: I03b71a59cc1b1526efbd8d5819bf2bf60dbdba30 (cherry picked from commit d126236d98d538a1ed749c9461d7f5973fc06e04)
|
83fd866439ec79d75db5970c06c007f958d8e703 |
|
13-Jan-2016 |
Nicolas Geoffray <ngeoffray@google.com> |
HDeoptimize can also trigger GC. bug:26532563 Change-Id: Idaa294fb500ab820c7b45e37747e96f0b455f663
|
9cf132ba612dcb6d53f3105d32ed007c698968a0 |
|
13-Jan-2016 |
Nicolas Geoffray <ngeoffray@google.com> |
HDeoptimize can also trigger GC. bug:26532563 Change-Id: Idaa294fb500ab820c7b45e37747e96f0b455f663
|
780aeced2a8ef918901d8f450864de934f79c555 |
|
13-Jan-2016 |
Alexandre Rames <alexandre.rames@linaro.org> |
Update `ValidateInvokeRuntime()` and HDivZeroCheck. Change-Id: I35beab2777a8c83bd508d56966afa1ceff9ee24f
|
1cde05849f2057b11e3a149144a1d02245d22060 |
|
13-Jan-2016 |
Nicolas Geoffray <ngeoffray@google.com> |
HDeoptimize can also trigger GC. bug:26532563 Change-Id: Idaa294fb500ab820c7b45e37747e96f0b455f663
|
d126236d98d538a1ed749c9461d7f5973fc06e04 |
|
13-Jan-2016 |
Nicolas Geoffray <ngeoffray@google.com> |
Set side effects to HNullCheck and HBoundsCheck. Both can trigger GC, as they will call NullPointerException or IndexOutOfBoundsException constructors. bug:26532563 (cherry picked from commit 1af564e2d3b560fb9a076eb35ea20471aed0dc92) Change-Id: I03b71a59cc1b1526efbd8d5819bf2bf60dbdba30
|
1af564e2d3b560fb9a076eb35ea20471aed0dc92 |
|
13-Jan-2016 |
Nicolas Geoffray <ngeoffray@google.com> |
Set side effects to HNullCheck and HBoundsCheck. Both can trigger GC, as they will call NullPointerException or IndexOutOfBoundsException constructors. bug:26532563 Change-Id: Id9e42f0450caaaf365630989e1b36e98add46c89
|
a3eca2d7300f35c66cf4b696d788a8b7ba74eb99 |
|
12-Jan-2016 |
Nicolas Geoffray <ngeoffray@google.com> |
Do not leave intermediate addresses across Java calls. bug:26472446 Change-Id: Ie4a9b5fe6f1d61a76c71eceaa2299fe55512c612
|
a05cacc11fa075246c38497c01b949745fadc54b |
|
12-Jan-2016 |
Nicolas Geoffray <ngeoffray@google.com> |
Revert "Change condition to opposite if lhs is constant" Breaks arm64 This reverts commit f9f196c55f3b25c3b09350cd8ed5d7ead31f1757. Change-Id: Ie1027a218154b8ded6c1c8f0007720f5be68780d
|
f9f196c55f3b25c3b09350cd8ed5d7ead31f1757 |
|
08-Sep-2015 |
Anton Shamin <anton.shamin@intel.com> |
Change condition to opposite if lhs is constant Swap operands if lhs is constant. Handled unsigned comparison in instruction simplifier. Fixed NaN comparison: regardless of which bias is set, the result of Equal and NotEqual operations should not depend on it. Added checker tests. Change-Id: I5a9ac25fb10f2705127a52534867cee43368ed1b Signed-off-by: Anton Shamin <anton.shamin@intel.com> Signed-off-by: Serguei Katkov <serguei.i.katkov@intel.com>
|
c928591f5b2c544751bb3fb26dc614d3c2e67bef |
|
18-Dec-2015 |
Roland Levillain <rpl@google.com> |
ARM Baker's read barrier fast path implementation. Introduce an ARM fast path implementation in Optimizing for Baker's read barriers (for both heap reference loads and GC root loads). The marking phase of the read barrier is performed by a slow path, invoking the runtime entry point artReadBarrierMark. Other read barrier algorithms continue to use the original slow path based implementation, which has been renamed as GenerateReadBarrierSlow/GenerateReadBarrierForRootSlow. Bug: 12687968 Change-Id: Ie7ee85b1b4c0564148270cebdd3cbd4c3da51b3a
|
15693bfdf9fa3ec79327a77b7e10315614d716cc |
|
16-Dec-2015 |
David Brazdil <dbrazdil@google.com> |
ART: Resolve ambiguous ArraySets Just like aget(-wide), the value operand of aput(-wide) bytecode instructions can be both int/long and float/double. This patch builds on the previous mechanism for resolving type of ArrayGets to type the values of ArraySets based on the reference type of the array. Bug: 22538329 Change-Id: Ic86abbb58de146692de04476b555010b6fcdd8b6
|
f555258861aea7df8af9c2241ab761227fd2f66a |
|
27-Dec-2015 |
David Brazdil <dbrazdil@google.com> |
ART: Create BoundType for CheckCast early ReferenceTypePropagation creates a BoundType for each CheckCast and replaces all dominated uses of the casted object with it. This does not include Phi uses on the boundary of the dominated scope, reducing typing precision. This patch creates the BoundType in Builder, causing SsaBuilder to replace uses of the object automatically. Bug: 26081304 Change-Id: I083979155cccb348071ff58cb9060a896ed7d2ac
|
4833f5a1990c76bc2be89504225fb13cca22bedf |
|
16-Dec-2015 |
David Brazdil <dbrazdil@google.com> |
ART: Refactor SsaBuilder for more precise typing info This reverts commit 68289a531484d26214e09f1eadd9833531a3bc3c. Now uses Primitive::Is64BitType instead of Primitive::ComponentSize because it was incorrectly optimized by GCC. Bug: 26208284 Bug: 24252151 Bug: 24252100 Bug: 22538329 Bug: 25786318 Change-Id: Ib39f3da2b92bc5be5d76f4240a77567d82c6bebe
|
5d75afe333f57546786686d9bee16b52f1bbe971 |
|
14-Dec-2015 |
Aart Bik <ajcbik@google.com> |
Improved side-effects/can-throw information on intrinsics. Rationale: improved side effect and exception analysis gives many more opportunities for GVN/LICM/BCE. Change-Id: I8aa9b757d77c7bd9d58271204a657c2c525195b5
|
0cf4493166ff28518c8eafa2d0463f6e817cce75 |
|
09-Dec-2015 |
David Srbecky <dsrbecky@google.com> |
Generate more stack maps during native debugging. Generate an extra stack map at the start of each Java statement. The stack maps are later translated to DWARF, which allows LLDB to set breakpoints and view local variables. Change-Id: If00ab875513308e4a1399d1e12e0fe8934a6f0c3
|
5f7b58ea1adfc0639dd605b65f59198d3763f801 |
|
23-Nov-2015 |
Vladimir Marko <vmarko@google.com> |
Rewrite HInstruction::Is/As<type>(). Make Is<type>() and As<type>() non-virtual for concrete instruction types, relying on GetKind(), and mark GetKind() as PURE to improve optimization opportunities. This reduces the number of relocations in libart-compiler.so's .rel.dyn section by ~4K, or ~44%, and in .data.rel.ro by ~18K, or ~65%. The file is 96KiB smaller for Nexus 5, including 8KiB reduction of the .text section. Unfortunately, the g++/clang++ __attribute__((pure)) is not strong enough to avoid duplicated virtual calls and we would need the C++ [[pure]] attribute proposed in n3744 instead. To work around this deficiency, we introduce an extra non-virtual indirection for GetKind(), so that the compiler can optimize common expressions such as instruction->IsAdd() || instruction->IsSub() or instruction->IsAdd() && instruction->AsAdd()->... which contain two virtual calls to GetKind() after inlining. Change-Id: I83787de0671a5cb9f5b0a5f4a536cef239d5b401
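A minimal sketch of the pattern described in this message, assuming a heavily simplified hierarchy: `Is<type>()` and `As<type>()` are non-virtual and go through a non-virtual `GetKind()` wrapper around a single virtual function. The class names and the two-entry `InstructionKind` enum are hypothetical, and the purity attribute discussed in the message is omitted.

```cpp
#include <cassert>

// Hypothetical, reduced set of instruction kinds.
enum class InstructionKind { kAdd, kSub };

class Add;

class Instruction {
 public:
  virtual ~Instruction() = default;

  // Non-virtual wrapper: the only virtual dispatch happens inside it.
  InstructionKind GetKind() const { return GetKindInternal(); }

  // Non-virtual type tests built on GetKind().
  bool IsAdd() const { return GetKind() == InstructionKind::kAdd; }
  bool IsSub() const { return GetKind() == InstructionKind::kSub; }

  // Non-virtual checked downcast; returns nullptr on a kind mismatch.
  const Add* AsAdd() const;

 private:
  virtual InstructionKind GetKindInternal() const = 0;
};

class Add final : public Instruction {
 private:
  InstructionKind GetKindInternal() const override {
    return InstructionKind::kAdd;
  }
};

class Sub final : public Instruction {
 private:
  InstructionKind GetKindInternal() const override {
    return InstructionKind::kSub;
  }
};

// Defined after Add is complete so the static_cast is valid.
inline const Add* Instruction::AsAdd() const {
  return IsAdd() ? static_cast<const Add*>(this) : nullptr;
}
```

With this layout each concrete class overrides one virtual function instead of a pair of `Is`/`As` overrides per instruction type, which is the kind of reduction in virtual table entries the message attributes its relocation savings to.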
|
68289a531484d26214e09f1eadd9833531a3bc3c |
|
16-Dec-2015 |
Alex Light <allight@google.com> |
Revert "ART: Refactor SsaBuilder for more precise typing info" This reverts commit d9510dfc32349eeb4f2145c801f7ba1d5bccfb12. Bug: 26208284 Bug: 24252151 Bug: 24252100 Bug: 22538329 Bug: 25786318 Change-Id: I5f491becdf076ff51d437d490405ec4e1586c010
|
d9510dfc32349eeb4f2145c801f7ba1d5bccfb12 |
|
05-Nov-2015 |
David Brazdil <dbrazdil@google.com> |
ART: Refactor SsaBuilder for more precise typing info This patch refactors the SsaBuilder to do the following: 1) All phis are constructed live and marked dead if not used or proved to be conflicting. 2) Primitive type propagation, now not a separate pass, identifies conflicting types and marks corresponding phis dead. 3) When compiling --debuggable, DeadPhiHandling used to revive phis which had only environmental uses but did not attempt to resolve conflicts. This pass was removed as obsolete and is now superseded by primitive type propagation (identifying conflicting phis) and SsaDeadPhiElimination (keeping phis live if debuggable + env use). 4) Resolving conflicts requires correct primitive type information on all instructions. This was not the case for ArrayGet instructions which can have ambiguous types in the bytecode. To this end, SsaBuilder now runs reference type propagation and types ArrayGets from the type of the input array. 5) With RTP being run inside the SsaBuilder, it is not necessary to run it as a separate optimization pass. Optimizations can now assume that all instructions of type kPrimNot have reference type info after SsaBuilder (with the exception of NullConstant). 6) Graph now contains a reference type to be assigned to NullConstant. All reference type instructions therefore have RTI, as now enforced by the SsaChecker. Bug: 24252151 Bug: 24252100 Bug: 22538329 Bug: 25786318 Change-Id: I7a3aee1ff66c82d64b4846611c547af17e91d260
|
40a04bf64e5837fa48aceaffe970c9984c94084a |
|
11-Dec-2015 |
Scott Wakeling <scott.wakeling@linaro.org> |
Replace rotate patterns and invokes with HRor IR. Replace constant and register version bitfield rotate patterns, and rotateRight/Left intrinsic invokes, with new HRor IR. Where k is constant and r is a register, with the UShr and Shl on either side of a |, +, or ^, the following patterns are replaced: x >>> #k OP x << #(reg_size - k) x >>> #k OP x << #-k x >>> r OP x << (#reg_size - r) x >>> (#reg_size - r) OP x << r x >>> r OP x << -r x >>> -r OP x << r Implemented for ARM/ARM64 & X86/X86_64. Tests changed to not be inlined to prevent optimization from folding them out. Additional tests added for constant rotate amounts. Change-Id: I5847d104c0a0348e5792be6c5072ce5090ca2c34
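To see why the listed shift pairs amount to a rotate, the sketch below models two of the 32-bit patterns in C++ using `|` as the combining operator. It assumes Java shift semantics, where only the low five bits of an int shift distance are used; the helper names are hypothetical. For distances strictly between 0 and 32 the two shifted halves occupy disjoint bits, so `+` and `^` give the same result as `|` in that range.

```cpp
#include <cassert>
#include <cstdint>

// Java-style `>>>` on a 32-bit value: the distance is masked to its low 5 bits.
inline uint32_t UShr32(uint32_t x, int32_t distance) {
  return x >> (static_cast<uint32_t>(distance) & 31u);
}

// Java-style `<<` on a 32-bit value, with the same distance masking.
inline uint32_t Shl32(uint32_t x, int32_t distance) {
  return x << (static_cast<uint32_t>(distance) & 31u);
}

// Reference rotate right, written to avoid an out-of-range shift when d == 0.
inline uint32_t RotateRight32(uint32_t x, int32_t distance) {
  uint32_t d = static_cast<uint32_t>(distance) & 31u;
  return d == 0 ? x : (x >> d) | (x << (32u - d));
}

// Pattern: x >>> k | x << -k
inline uint32_t RotatePatternNeg(uint32_t x, int32_t k) {
  return UShr32(x, k) | Shl32(x, -k);
}

// Pattern: x >>> k | x << (reg_size - k), with reg_size == 32
inline uint32_t RotatePatternSub(uint32_t x, int32_t k) {
  return UShr32(x, k) | Shl32(x, 32 - k);
}
```

Both pattern functions agree with `RotateRight32` because `-k` and `32 - k` are congruent modulo 32, so they select the same masked shift distance.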
|
917d01680714b2295f109f8fea0aa06764a30b70 |
|
24-Nov-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Don't generate a slow path for strings in the dex cache. Change-Id: I1d258f1a89bf0ec7c7ddd134be9215d480f0b09a
|
4b467ed97bc5886fb800209c0ee94df10163b88d |
|
20-Nov-2015 |
Mingyao Yang <mingyao@google.com> |
Simplify and rename IsLoopInvariant() test. Simplify IsLoopInvariant() test. Also rename it to IsDefinedOutOfTheLoop() so there is no ambiguity about, for example, whether an instruction after the loop counts as a loop invariant. It's up to the caller to make the interpretation. Change-Id: I999139032b0e4d815dd1e2276f2bd428cf558686
|
73be1e8f8609708f6624bb297c9628de44fd8b6f |
|
17-Sep-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Inline monomorphic calls. Change-Id: If38171c2dc7d4a4378df5d050afc4fff4499c98f
|
e523423a053af5cb55837f07ceae9ff2fd581712 |
|
02-Dec-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Revert "Revert "Don't use the compiler driver for method resolution."" This reverts commit c88ef3a10c474045a3476a02ae75d07ddd3230b7. Change-Id: I0ed88a48b313a8d28bc39fae40631123aadb13ef
|
f64242a30c6e05a8e4302a64eab4bcc28297dc9e |
|
01-Dec-2015 |
Vladimir Marko <vmarko@google.com> |
Optimizing: Add checker tests for sharpening. This is a follow-up to https://android-review.googlesource.com/184116 . Change-Id: Ib03c424fb673afc5ccce15d7d072b7572b47799a
|
c88ef3a10c474045a3476a02ae75d07ddd3230b7 |
|
01-Dec-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Revert "Don't use the compiler driver for method resolution." Fails 425 in debuggable mode. This reverts commit 4db0bf9c4db6a09716c3388b7d2f88d534470339. Change-Id: I346df8f75674564fc4fb241c60f23e250fc7f0a7
|
4db0bf9c4db6a09716c3388b7d2f88d534470339 |
|
23-Nov-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Don't use the compiler driver for method resolution. The compiler driver makes assumptions that don't hold for the optimizing compiler, and will for example always go to slow path for an invoke-super when there's no verified method. Also fix GenerateInvokeVirtual in the presence of intrinsics. Next change will address some of the TODOs in sharpening.cc. Change-Id: I2b0e543ee9b9bebcadb2d26de29e850c59ad58b9
|
fb337ea53d1e6fe68b24217a9ea95e9f544ef697 |
|
25-Nov-2015 |
Vladimir Marko <vmarko@google.com> |
Move PC-relative addressing bases to a better position. Move the platform-specific HX86ComputeBaseMethodAddress and HArmDexCacheArraysBase to the latest dominator of their uses outside any loop. This brings the base closer to the first use (previously, it was in the entry block) and relieves some pressure on the register allocator while avoiding recalculation of the base in a loop. Change-Id: I231aa81eb5b4de9af2d0167054d06b65eb18a636
|
b4536b7de576b20c74c612406c5d3132998075ef |
|
24-Nov-2015 |
Vladimir Marko <vmarko@google.com> |
Optimizing/ARM: Implement kDexCachePcRelative dispatch. Change-Id: I0fe2da50a30a3f62bec8ea01688dd1fec84b1831
|
8626b741716390a0119ffeb88b5b9fcf08e13010 |
|
25-Nov-2015 |
Alexandre Rames <alexandre.rames@linaro.org> |
ARM64: Use the shifter operands. This introduces architecture-specific instruction simplification. On ARM64 we try to merge shifts and sign-extension operations into arithmetic and logical instructions. For example for the Java code int res = a + (b << 5); we would generate lsl w3, w2, #5 add w0, w1, w3 and we now generate add w0, w1, w2, lsl #5 Change-Id: Ic03bdff44a1c12e21ddff1b0513bd32a730742b7
|
42e372e5a34d0fef88007bc5f40dd0fc7c03b58b |
|
24-Nov-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Optimize HLoadClass when we know the class is in the cache. Change-Id: Iaa74591eed0f2eabc9ba9f9988681d9582faa320
|
809d70f5b268227dbd59432dc038c74d8351be29 |
|
19-Nov-2015 |
David Brazdil <dbrazdil@google.com> |
ART: Fix wide stores in Optimizing SsaBuilder::VisitStoreLocal did not take into account the following: (a) when storing a wide value, the high vreg must be invalidated, (b) when storing into the high vreg of a wide value, the low vreg must be invalidated. Both situations cause overestimation of liveness but only (b) has implications on correctness. CodeGenerator::EmitEnvironment will skip the high vreg, causing deoptimizing and try/catch to load a wrong value for that vreg. In order to fix this bug, several changes had to be made to the SsaBuilder: (1) phis need to be initialized with a type which matches its inputs' size, (2) eagerly created loop header phis may end up being undefined because of their corresponding vregs being invalidated inside the loop; these are marked dead during input setting, (3) the entire SSA-building algorithm should never revive an undefined loop header phi. Bug: 25677992 Bug: https://code.google.com/p/android/issues/detail?id=194022 Change-Id: Id8a852e38c3f5ff1c2e608b1aafd6d5ac8311e32
|
8e1ef53e3d551f11bb424ae4f29cc1f5eabbe6bc |
|
23-Nov-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Do not change to the access check entrypoint when inlined. The allocation entrypoint that deals with access checks does not work with inlined methods. Fixes 542-unresolved-access-check in jit mode. Change-Id: I02290a8b2089fcf06e2216dabf8089920b529765
|
729645a937eb9f04a311b3c22471dcf3ebe9bcec |
|
19-Nov-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Explicitly add HLoadClass/HClinitCheck for HNewInstance. bug:25735083 bug:25173758 Change-Id: Ie81cfa4fa9c47cc025edb291cdedd7af209a03db
|
f652917de5634b30c974c81d35a72871915b352a |
|
17-Nov-2015 |
Mark Mendell <mark.p.mendell@intel.com> |
Simplify boolean condition compared to 0 CaffeineMarkRR Logic has some boolean flipping which can be helped by some simplification. Simplify non-FP (A COND_OP B) != 0 to A OPPOSITE_COND_OP B. This is better than the original code, which would use a HBooleanNot after the condition. Also simplify non-FP (A COND_OP B) == 1 to A OPPOSITE_COND_OP B. Move GetOppositeCondition to nodes.h/nodes.cc to share with Boolean Simplification, renaming it to InsertOppositeCondition, as it inserts the new HInstruction (unless it is a constant). Change-Id: I34ded7758836e375de0d6fdba9239d2d451928d0 Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>
|
418318f4d50e0cfc2d54330d7623ee030d4d727d |
|
20-Nov-2015 |
Alexandre Rames <alexandre.rames@linaro.org> |
ARM64: Add support for multiply-accumulate. Change-Id: I88dc313df520480f3fd16bbabda27f9435d25368
|
c53c0797a78a89d637e4230503cc1feb27e855a8 |
|
19-Nov-2015 |
Vladimir Marko <vmarko@google.com> |
Clean up the special input in HInvokeStaticOrDirect. Change-Id: I4042aefbdac1a8c236d00e2e7145349a64f6486b
|
fbb184a1c6df22d9302b32b55206396c8278edcf |
|
13-Nov-2015 |
Vladimir Marko <vmarko@google.com> |
Fix ClinitCheck pruning. Make sure we merge the ClinitCheck only with LoadClass and HInvokeStaticOrDirect that is a part of the very same dex instruction. This fixes incorrect stack traces from class initializers (wrong dex pcs). Rewrite the pruning to do all the ClinitCheck merging when we see the ClinitCheck, instead of merging ClinitCheck into LoadClass and then LoadClass into HInvokeStaticOrDirect. When we later see an HInvokeStaticOrDirect with an explicit check (i.e. not merged), we know that some other instruction is doing the check and the invoke doesn't need to, so we mark it as not requiring the check at all. (Previously it would have been marked as having an implicit check.) Remove the restriction on merging with inlined invoke static as this is not necessary anymore. This was a workaround for X.test(): invoke-static C.foo() [1] C.foo(): invoke-static C.bar() [2] After inlining and GVN we have X.test(): LoadClass C (from [1]) ClinitCheck C (from [1], to be merged to LoadClass) InvokeStaticOrDirect C.bar() (from [2]) and the LoadClass must not be merged into the invoke as this would cause the resolution trampoline to see an inlined frame from the not-yet-loaded class C during the stack walk and try to load the class. However, we're not allowed to load new classes at that point, so an attempt to do so leads to an assertion failure. With this CL, LoadClass is not merged when it comes from a different instruction, so we can guarantee that all inlined frames seen by the stack walk in the resolution trampoline belong to already loaded classes. Change-Id: I2b8da8d4f295355dce17141f0fab2dace126684d
|
fb8464ae5f5194dc16278e528cfcbff71498c767 |
|
02-Nov-2015 |
Mingyao Yang <mingyao@google.com> |
Revert "Revert "Enable store elimination for singleton objects."" This reverts commit 55d02cf056f993aeafebd54e7b7c68c7a48507c9, and makes the following change: Currently we leverage loop side effects to decide whether heap values are killed by the loop. Stores need to be kept if heap values may be killed by loops and the corresponding loads cannot be eliminated. A similar thing needs to be done for each predecessor when we merge predecessor heap values. To do that, the HInstanceFieldSet instruction itself is put in the heap value array instead of the value of the store instruction. The store instruction may be added to possibly_removed_stores_ first, but can later be removed from possibly_removed_stores_ when it's found out that the store needs to be kept due to merging/loop side effects. Change-Id: I4f7bb1960f7b47240873e00ff1adac46fc102a02
|
0d5a281c671444bfa75d63caf1427a8c0e6e1177 |
|
13-Nov-2015 |
Roland Levillain <rpl@google.com> |
x86/x86-64 read barrier support for concurrent GC in Optimizing. This first implementation uses slow paths to instrument heap reference loads and GC root loads for the concurrent copying collector, respectively calling the artReadBarrierSlow and artReadBarrierForRootSlow (new) runtime entry points. Notes: - This implementation does not instrument HInvokeVirtual nor HInvokeInterface instructions (for class reference loads), as the corresponding read barriers are not strictly required with the current concurrent copying collector. - Intrinsics which may eventually call (on slow path) are disabled when read barriers are enabled, as the current slow path infrastructure does not support this case. - When read barriers are enabled, the code generated for a HArraySet instruction always goes into the array set slow path for object arrays (delegating the operation to the runtime), as we are lacking a mechanism to keep a temporary register live across a runtime call (needed for the instrumentation of type checking code, which requires two successive read barriers). Bug: 12687968 Change-Id: I14cd6107233c326389120336f93955b28ffbb329
|
0f7dca4ca0be8d2f8776794d35edf8b51b5bc997 |
|
02-Nov-2015 |
Vladimir Marko <vmarko@google.com> |
Optimizing/X86: PC-relative dex cache array addressing. Add PC-relative dex cache array addressing for X86 and use it for better invoke-static/-direct dispatch. Also delay the initialization to the PC-relative base until needed. Change-Id: Ib8634d5edce4920cd70172fd13211809cf6948d1
|
cdfed3dc422d0e1a9a0a948863308e58c39d01ba |
|
26-Oct-2015 |
Calin Juravle <calin@google.com> |
Revert "Revert "Run type propagation after inliner only when needed."" This reverts commit 271743601650308c7ac5c7a3ec35025d8130a298. Change-Id: I173e27a0a4d7d54f90ca459eb48d280d1d40ab70
|
040db345c4bcc5572b9f7dafba168f78a4e99792 |
|
10-Nov-2015 |
Vladimir Marko <vmarko@google.com> |
Optimizing: Clean up Equal/NotEqual constant folding for nulls. Change-Id: I17766395092ec61df61ef0b9ae4c37fd38164a3b
|
d26a411adee1e71b3f09dd604ab9b23018037138 |
|
10-Nov-2015 |
David Brazdil <dbrazdil@google.com> |
ART: Refactor iteration over normal/exceptional successors Add helper methods on HBasicBlock which return ArrayRef with the suitable sub-array of the `successors_` list. Change-Id: I66c83bb56f2984d7550bf77c48110af4087515a8
|
9e23df5c21bed53ead79e3131b67105abc8871e4 |
|
10-Nov-2015 |
Vladimir Marko <vmarko@google.com> |
Optimizing: Improve constant folding + DCE for inlining. Run constant folding before DCE in inliner to eliminate more code that can prevent inlining. Improve the constant folding to evaluate Equals and NotEquals for null inputs. Change-Id: I876ffb903ef39484370b6c8793f0f8467a977362
|
8a7c0fe837bb00b02dfcfc678910d81d07fb2136 |
|
02-Nov-2015 |
David Brazdil <dbrazdil@google.com> |
Revert "Revert "ART: Update DCE to work with try/catch"" The previous CL failed because it did not update inputs of catch phis. Since phi input indices cannot be easily mapped back to throwing instructions, this new implementation at least removes catch phi uses of values defined in the removed blocks to preserve graph consistency. This reverts commit fb552d7061746f7a90fdd5002696e255e2e15c35. Change-Id: I63d95915d1ef50e71d3bcf0cd10aaded554035b4
|
391d01f3e8dedf3af3727bdf5d5b76ab35d52795 |
|
06-Nov-2015 |
Vladimir Marko <vmarko@google.com> |
Optimizing: Rewrite search for common dominators. Provide a utility class that can be used to quickly search for common dominators of two or more blocks. Change the algorithm to avoid memory allocations. Change-Id: Id72c975fc42377cb7622902f87c4262ea7b3cc38
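One allocation-free way to find a common dominator is to walk up the dominator tree from whichever block is deeper until both walks meet. The sketch below assumes each block caches its immediate dominator and its depth in the dominator tree. The `Block` class and both function names are hypothetical, and the utility class added by this change may work differently.

```python
# Sketch only: common dominator search over a dominator tree, no extra storage.
# `Block`, `common_dominator` and `common_dominator_of` are illustrative names.

class Block:
    def __init__(self, name, idom=None):
        self.name = name
        self.idom = idom                                  # immediate dominator, None for entry
        self.depth = 0 if idom is None else idom.depth + 1

def common_dominator(a, b):
    """Closest block that dominates both `a` and `b` (single-entry graph assumed)."""
    while a is not b:
        # Move the deeper block up; on a tie, move `a`, then `b` on the next pass.
        if a.depth >= b.depth:
            a = a.idom
        else:
            b = b.idom
    return a

def common_dominator_of(blocks):
    """Fold the pairwise search over two or more blocks."""
    it = iter(blocks)
    result = next(it)
    for block in it:
        result = common_dominator(result, block)
    return result

# Dominator tree: entry -> {left, right}, left -> inner -> leaf
entry = Block("entry")
left = Block("left", entry)
right = Block("right", entry)
inner = Block("inner", left)
leaf = Block("leaf", inner)
```

Because the fold keeps only the running result, searching across many blocks needs no temporary sets or lists beyond the input itself.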
|
b554b5a5ae3cdc66969d61be20783a8af816206e |
|
06-Nov-2015 |
Vladimir Marko <vmarko@google.com> |
Optimizing: Remove unused ArtMethod* input from HInvokeStaticOrDirect. Change-Id: Iea99fa683440673ff517e246f35fade96600f229
|
9bc436160b4af99067973affb0b1008de9a2b04c |
|
05-Nov-2015 |
David Brazdil <dbrazdil@google.com> |
ART: Fix simplification of catch blocks in the presence of dead code Simplification of catch blocks transforms the code so that catch blocks have only exceptional predecessors. However, it is invoked before trivially dead code is eliminated which breaks simple assumptions such as the fact that a catch block cannot start with move-exception if it has non-exceptional predecessors. This patch fixes the algorithm to work under these relaxed conditions. Bug: 25494450 Bug: 25492628 Change-Id: Idc8d010102a4b8b9a6cd918b98d6e11d1838db0c
|
2bd4c5c1b704be8a81d9b7a94b3e828afa2b0963 |
|
04-Nov-2015 |
David Brazdil <dbrazdil@google.com> |
Revert "ART: Implement DeadPhiHandling in PrimitiveTypePropagation" Crashes on YouTube, need to investigate This reverts commit 1749e2cfb5c5ed4d6970a09aecf898ca9cdfcb75. Change-Id: If5f133d55dcc26b8db79a670a48fbd4af7807556
|
1749e2cfb5c5ed4d6970a09aecf898ca9cdfcb75 |
|
28-Sep-2015 |
David Brazdil <dbrazdil@google.com> |
ART: Implement DeadPhiHandling in PrimitiveTypePropagation DeadPhiHandling revives non-conflicting phis with environment uses but does not properly merge types. To not duplicate code, this patch modifies PrimitiveTypePropagation to deal with conflicts and thus replaces DeadPhiHandling altogether. Bug: 24252151 Bug: 24252100 Change-Id: I198c71d1b8167fc05783a5a24aa9f1e3804acafe
|
d930929be93798d790c91dd05adf2c038508f1b0 |
|
31-Oct-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Fix inlining and lse bugs with unresolved access. bug:25414532 Change-Id: I48b6660754774ea3e8a62a74175b1aa3728e0151
|
73f1f3be46652d3f6df61b4234c366ebbf81274a |
|
28-Oct-2015 |
Aart Bik <ajcbik@google.com> |
Move loop invariant utility to more general place. Change-Id: I15ebfbf9684f0fcce9e63d078ff8dc1381fd1ca3
|
55d02cf056f993aeafebd54e7b7c68c7a48507c9 |
|
29-Oct-2015 |
Andreas Gampe <agampe@google.com> |
Revert "Enable store elimination for singleton objects." This reverts commit 7f43a3d48fc29045875d50e10bbc5d6ffc25d61e. Fails booting. Bug: 25357772 Change-Id: Ied19536f3ce8d81e76885cb6baed4853e2ed6714
|
7f43a3d48fc29045875d50e10bbc5d6ffc25d61e |
|
28-Oct-2015 |
Mingyao Yang <mingyao@google.com> |
Enable store elimination for singleton objects. Enable store elimination for singleton objects. However for finalizable object, don't eliminate stores. Also added a testcase. Change-Id: Icf991e7ded5b490f55f580ef928ece5c45e89902
|
e5d80f83ae53792bc1eebd4e33e4e99f7c031b0c |
|
16-Oct-2015 |
Mathieu Chartier <mathieuc@google.com> |
Move ArenaBitVector into the runtime Motivation is using arenas in the verifier. Bug: 10921004 Change-Id: I3c7ed369194b2309a47b12a621e897e0f2f65fcf
|
dc151b2346bb8a4fdeed0c06e54c2fca21d59b5d |
|
15-Oct-2015 |
Vladimir Marko <vmarko@google.com> |
Optimizing: Determine invoke-static/-direct dispatch early. Determine the dispatch type of invoke-static/-direct in a special pass right after the type inference. This allows the inliner to pass the "needs dex cache" check and inline more. It also allows the code generator to avoid requesting a register location for the ArtMethod* for kDexCachePcRelative and direct methods. The supported dispatch check handles also situations that the CompilerDriver currently doesn't allow. The cleanup of the CompilerDriver and required changes to Quick will come in a separate change. Change-Id: I3f8e903a119949e95871d8ab0a995f4731a13a07
|
8df69d42a9e3ccd9456ff72fac8dbd1999f98755 |
|
23-Oct-2015 |
Mingyao Yang <mingyao@google.com> |
Revert "Revert "load store elimination."" This reverts commit 8030c4100d2586fac39ed4007c61ee91d4ea4f25. Change-Id: I79558d85484be5f5d04e4a44bea7201fece440f0
|
f652cecb984c104d44a0223c3c98400ef8ed8ce2 |
|
25-Aug-2015 |
Goran Jakovljevic <Goran.Jakovljevic@imgtec.com> |
MIPS: Initial version of optimizing compiler for MIPS32 Change-Id: I370388e8d5de52c7001552b513877ef5833aa621
|
e6dbf48d7a549e58a3d798bbbdc391e4d091b432 |
|
19-Oct-2015 |
Alexandre Rames <alexandre.rames@linaro.org> |
ARM64: Instruction simplification for array accesses. HArrayGet and HArraySet with variable indexes generate two instructions on arm64, like add temp, obj, #data_offset ldr out, [temp, index LSL #shift_amount] When we have multiple accesses to the same array, the initial `add` instruction is redundant. This patch introduces the first instruction simplification in the arm64-specific instruction simplification pass. It splits HArrayGet and HArraySet using the new arm64-specific IR HIntermediateAddress. After that we run GVN again to squash the multiple occurrences of HIntermediateAddress. Change-Id: I2e3d12fbb07fed07b2cb2f3f47f99f5a032f8312
|
4b8f1ecd3aa5a29ec1463ff88fee9db365f257dc |
|
26-Aug-2015 |
Roland Levillain <rpl@google.com> |
Use ATTRIBUTE_UNUSED more. Use it in lieu of UNUSED(), which had some incorrect uses. Change-Id: If247dce58b72056f6eea84968e7196f0b5bef4da
|
8030c4100d2586fac39ed4007c61ee91d4ea4f25 |
|
15-Oct-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Revert "load store elimination." Breaks libcore tests: libcore.java.lang.ref.FinalizeTest#testWatchdogDoesNotFailForObjectsThatAreNearTheDeadline libcore.java.util.ResourceLeakageDetectorTest#testDetectsUnclosedCloseGuard org.apache.harmony.tests.java.lang.ref.ReferenceTest#test_finalizeReferenceInteraction This reverts commit 589dac7f0ce078d19aad7e35bb0195c47ddf01d2. Change-Id: I55115765c10762d5bc152d3425e4622560d8b9f4
|
589dac7f0ce078d19aad7e35bb0195c47ddf01d2 |
|
24-Aug-2015 |
Mingyao Yang <mingyao@google.com> |
load store elimination. This adds a pass to eliminate some unnecessary heap loads/stores. It first collects heap locations and then tracks values stored to those heap locations. Alias analysis is done based on offset, type, singleton, pre-existence, etc. Change-Id: I11a9d8ef20d1b2f245607eb25118e9aff9be472a
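The idea of tracking values per heap location can be sketched for a single straight-line block. Everything here is a simplified assumption rather than the pass itself: the tuple instruction shapes, the `eliminate_loads` name, and an alias rule that uses only the field name plus a caller-supplied set of singleton references (references known not to alias anything else).

```python
# Sketch only: per-block load elimination over hypothetical tuple instructions.
#   ("store", obj, field, value)
#   ("load", dst, obj, field)

def eliminate_loads(instructions, singletons=frozenset()):
    heap = {}       # (obj, field) -> value known to be at that heap location
    replaced = {}   # eliminated load destination -> value to use instead
    kept = []
    for ins in instructions:
        if ins[0] == "store":
            _, obj, field, value = ins
            obj = replaced.get(obj, obj)
            value = replaced.get(value, value)
            if heap.get((obj, field)) == value:
                continue                      # location already holds this value
            # Conservative aliasing: the same field on another reference may be
            # the same memory unless one of the two references is a singleton.
            for other, f in list(heap):
                if (f == field and other != obj
                        and other not in singletons and obj not in singletons):
                    del heap[(other, f)]
            heap[(obj, field)] = value
            kept.append(("store", obj, field, value))
        elif ins[0] == "load":
            _, dst, obj, field = ins
            obj = replaced.get(obj, obj)
            if (obj, field) in heap:
                replaced[dst] = heap[(obj, field)]   # reuse the tracked value
            else:
                heap[(obj, field)] = dst             # later loads can reuse dst
                kept.append(("load", dst, obj, field))
        else:
            kept.append(ins)
    return kept, replaced

code = [
    ("store", "a", "x", "v1"),
    ("load", "t1", "a", "x"),
    ("store", "b", "x", "v2"),
    ("load", "t2", "a", "x"),
]
kept_plain, replaced_plain = eliminate_loads(code)
kept_single, replaced_single = eliminate_loads(code, singletons={"a"})
```

Without singleton information the store through `b` invalidates what is known about `a.x`, so the second load stays. When `a` is declared a singleton, both loads can be replaced by `v1`.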
|
e9f37600e98ba21308ad4f70d9d68cf6c057bdbe |
|
09-Oct-2015 |
Aart Bik <ajcbik@google.com> |
Added support for unsigned comparisons Rationale: even though not directly supported in input graph, having the ability to express unsigned comparisons in HIR is useful for all sorts of optimizations. Change-Id: I4543c96a8c1895c3d33aaf85685afbf80fe27d72
|
805b3b56c6eb542298db33e0181f135dc9fed3d9 |
|
18-Sep-2015 |
Mark Mendell <mark.p.mendell@intel.com> |
X86 jump tables for PackedSwitch Implement X86PackedSwitch using a jump table of offsets to blocks. The X86PackedSwitch version just adds an input to address the constant area. Change-Id: Id2752a1ee79222493040c6fd0e59aee9a544b76a Bug: 21119474 Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>
|
e6e3beaf2d35d18a79f5e7b60a21e75fac9fd15d |
|
14-Oct-2015 |
Calin Juravle <calin@google.com> |
Revert "Revert "optimizing: propagate type information of arguments"" This reverts commit 89c0d32437011bbe492fe14c766cd707046ce043. Change-Id: I603a49794e155cc97410b8836c8ea425bfdc98eb
|
c05aca78fad20901ae17902a3671ccfca9071758 |
|
13-Oct-2015 |
Calin Juravle <calin@google.com> |
Revert "optimizing: propagate type information of arguments" This reverts commit 2c1ffc3a06e9ed0411e29e7dc2558b5d657ede7a. Change-Id: I3291070c373e661fa578f5a38becbb5a502baf94
|
2c1ffc3a06e9ed0411e29e7dc2558b5d657ede7a |
|
12-Oct-2015 |
Calin Juravle <calin@google.com> |
optimizing: propagate type information of arguments This helps inlining and type check elimination. e.g: void foo(ArrayList a) { int size = a.size(); // this can be inlined now. } Change-Id: I3ffeaa79d9df444aa19511c83c544cb5f9d9ab20
|
4e2a55760b231554b72ba6703a22fcc7ab1f714e |
|
07-Oct-2015 |
Calin Juravle <calin@google.com> |
Assert that referrers class should not need access check. Change-Id: Ia682befdb0dc665f74c0f96454cc007304ff2397
|
ee3cf0731d0ef0787bc2947c8e3ca432b513956b |
|
06-Oct-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Intrinsify System.arraycopy. Currently on x64, will do the other architectures in different changes. Change-Id: I15fbbadb450dd21787809759a8b14b21b1e42624
|
a9a306d4c1abd43efe75987f174f64fe9e385874 |
|
08-Oct-2015 |
Calin Juravle <calin@google.com> |
Add a clarifying comment on HLoadClass::InstructionDataEquals. Change-Id: I4c298a453f03cde9d32fe43aff86886835af16fe
|
386062d13ce20d036555a9e24b73a67b4156b5cb |
|
07-Oct-2015 |
Calin Juravle <calin@google.com> |
Make sure classes with different access checks are not GVN-ed Change-Id: I89f72fef3be35a4dd9585d97d03a3150386e0891
|
ec7802a102d49ab5c17495118d4fe0bcc7287beb |
|
01-Oct-2015 |
Vladimir Marko <vmarko@google.com> |
Add DCHECKs to ArenaVector and ScopedArenaVector. Implement dchecked_vector<> template that DCHECK()s element access and insert()/emplace()/erase() positions. Change the ArenaVector<> and ScopedArenaVector<> aliases to use the new template instead of std::vector<>. Remove DCHECK()s that have now become unnecessary from the Optimizing compiler. Change-Id: Ib8506bd30d223f68f52bd4476c76d9991acacadc
|
a83a54d7f2322060f08480f8aabac5eb07268912 |
|
02-Oct-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Add support for intrinsic optimizations. Change-Id: Ib5a4224022f9360e60c09a19ac8642270a7f3b64
|
154746b84b407cfd166b45e039b62e6a06dc3f39 |
|
06-Oct-2015 |
Calin Juravle <calin@google.com> |
Remove dex_pc's default value from top level HInstruction This clearly hints that the dex_pc is stored in the super class and doesn't need to be reimplemented in subclasses. Change-Id: Ifd4aa95190c4c89367b4dd2cc8ab0ffd263659ac
|
98893e146b0ff0e1fd1d7c29252f1d1e75a163f2 |
|
02-Oct-2015 |
Calin Juravle <calin@google.com> |
Add support for unresolved classes in optimizing. Change-Id: I0e299a81e560eb9cb0737ec46125dffc99333b54
|
e460d1df1f789c7c8bb97024a8efbd713ac175e9 |
|
29-Sep-2015 |
Calin Juravle <calin@google.com> |
Revert "Revert "Support unresolved fields in optimizing"" The CL also changes the calling convention for 64bit static field set to use kArg2 instead of kArg1. This allows optimizing to keep the assumptions: - arm pairs are always of form (even_reg, odd_reg) - ecx_edx is not used as a register on x86. This reverts commit e6f49b47b6a4dc9c7684e4483757872cfc7ff1a1. Change-Id: I93159917565824084abc96775f31be1a4249f2f3
|
9f389d4d00f34a6c76e55b183b8c3d106e314261 |
|
01-Oct-2015 |
David Brazdil <dbrazdil@google.com> |
ART: Fix a static_cast int32_t -> uint64_t bug. HConstant::GetValueAsUint64 is used by SsaChecker to verify that equivalent phis are created only for untyped constants. The test would fail because a static_cast would sign extend the value of the IntConstant. Bug: 24561315 Change-Id: I818ce6a2080994a7c4395d084c1df7fd615a246d
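The pitfall can be reproduced with plain integer arithmetic. Python integers are unbounded, so the masks below only emulate the two C++ conversions; the function names are hypothetical and the exact form of the fix in ART is an assumption, since the message does not show it.

```python
# Sketch only: emulate widening a signed 32-bit value to an unsigned 64-bit one.

MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF

def to_uint64_sign_extended(value_i32):
    """Direct signed-32 to unsigned-64 conversion: the sign bit fills the top half."""
    return value_i32 & MASK64

def to_uint64_zero_extended(value_i32):
    """Convert to the unsigned 32-bit pattern first: the top half stays zero."""
    return value_i32 & MASK32

minus_one_sign = to_uint64_sign_extended(-1)
minus_one_zero = to_uint64_zero_extended(-1)
```

Non-negative values come out the same either way; only negative constants differ, which is why a comparison of bit patterns across differently sized constants can fail for just those inputs.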
|
e0395dd58454e27fc47c0ca273913929fb658e6c |
|
25-Sep-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Optimize ArraySet for x86/x64/arm/arm64. Change-Id: I5bc8c6adf7f82f3b211f0c21067f5bb54dd0c040
|
225b6464a58ebe11c156144653f11a1c6607f4eb |
|
28-Sep-2015 |
Vladimir Marko <vmarko@google.com> |
Optimizing: Tag arena allocations in code generators. And completely remove the deprecated GrowableArray. Replace GrowableArray with ArenaVector in code generators and related classes and tag arena allocations. Label arrays use direct allocations from ArenaAllocator because Label is non-copyable and non-movable and as such cannot be really held in a container. The GrowableArray never actually constructed them, instead relying on the zero-initialized storage from the arena allocator to be correct. We now actually construct the labels. Also avoid StackMapStream::ComputeDexRegisterMapSize() being passed null references, even though unused. Change-Id: I26a46fdd406b23a3969300a67739d55528df8bf4
|
3b9f30487e160eb97f3fa8694351dc1073e2fd45 |
|
24-Sep-2015 |
Mark Mendell <mark.p.mendell@intel.com> |
Address HPackedSwitch issues raised after merge There were some stylistic comments about the merged files. Fix those. Add a test that PackedSwitch can be removed by DCE. Change-Id: Idf45833956e9b58051f942a52b06a1e416606e2e Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>
|
211c2119dc8932bdb264fae858adba6c0541ce3c |
|
24-Sep-2015 |
Vladimir Marko <vmarko@google.com> |
Optimizing: Rewrite DCE's MarkReachableBlocks(). Replace a recursive implementation with a loop using a work list to avoid stack overflow that we would presumably hit for 702-LargeBranchOffset in host debug build with -O0, once the DCE block elimination is enabled for methods containing try-catch. Bug: 24133462 Change-Id: I41288ba368722bcb5d68259c7c147552c8928099
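The recursive-to-iterative change can be illustrated with a generic reachability walk. This is a sketch, not the ART implementation: the graph is a plain dictionary of successor lists and `mark_reachable` is a hypothetical name.

```python
# Sketch only: mark reachable blocks with an explicit work list instead of
# recursion, so the depth of the graph does not consume call stack.

def mark_reachable(successors, entry):
    """Return the set of blocks reachable from `entry`."""
    visited = {entry}
    worklist = [entry]
    while worklist:
        block = worklist.pop()
        for succ in successors.get(block, ()):
            if succ not in visited:
                visited.add(succ)         # mark before pushing to avoid duplicates
                worklist.append(succ)
    return visited

# A small graph with one unreachable block ("dead").
graph = {"entry": ["a", "b"], "a": ["exit"], "b": ["exit"], "dead": ["exit"]}
reachable = mark_reachable(graph, "entry")
```

With a work list, memory use grows on the heap with the number of pending blocks, so a method made of one very long chain of blocks no longer risks overflowing a small thread stack.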
|
d7558daaa86decf5a38f4f9bcd82267ab6e3e17f |
|
22-Sep-2015 |
David Brazdil <dbrazdil@google.com> |
ART: Preserve loop headers with try/catch Algorithm for inserting HTryBoundary instructions would generate a non-natural loop when a loop header block was covered by a TryItem. This patch changes the approach to fix the issue. Bug: 23895756 Change-Id: I0e1ee6cf135cea326a96c97954907d202c9793cc
|
1f8695ca0c0d443a3d2754637ea5c9459147af55 |
|
24-Sep-2015 |
Vladimir Marko <vmarko@google.com> |
Optimizing: Rewrite HGraph::FindBackEdges(). Replace a recursive implementation with a loop using a work list to avoid stack overflow for 702-LargeBranchOffset in host debug build with -O0, 512KiB thread pool worker stack. Change-Id: Iaa91f006fa1099913aeffc9c764879bd004d56de
|
d76d1390b04a4db9ca1f74eb4873d926643d979b |
|
23-Sep-2015 |
Vladimir Marko <vmarko@google.com> |
Optimizing: Rewrite HGraph::ComputeDominanceInformation(). Replace a recursive implementation with a loop using a work list to avoid stack overflow for 702-LargeBranchOffset in host debug build with -O0. Bug: 24133462 Change-Id: I444cc85733a9212403a071ea98b9ddfb52bfc402
|
fe57faa2e0349418dda38e77ef1c0ac29db75f4d |
|
18-Sep-2015 |
Mark Mendell <mark.p.mendell@intel.com> |
[optimizing] Add basic PackedSwitch support Add HPackedSwitch, and generate it from the builder. Code generators convert this to a series of compare/branch tests. Better implementation in the code generators as a real jump table will follow as separate CLs. Change-Id: If14736fa4d62809b6ae95280148c55682e856911 Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>
|
85c7bab43d11180d552179c506c2ffdf34dd749c |
|
18-Sep-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Revert "Revert "Optimize code generation of check-cast and instance-of."" This reverts commit 7537437c6a2f89249a48e30effcc27d4e7c5a04f. Change-Id: If759cb08646e47b62829bebc3c5b1e2f2969cf84
|
7537437c6a2f89249a48e30effcc27d4e7c5a04f |
|
17-Sep-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Revert "Optimize code generation of check-cast and instance-of." Failures with libcore tests. This reverts commit 64acf303eaa2f32c0b1d8cfcbf044a822c5eec08. Change-Id: Ie6f323fcf5d86bae5c334c1352bb21f1bad60a88
|
b7d8e8cf7063fdec1cce6ebd33e33804976bd978 |
|
17-Sep-2015 |
Vladimir Marko <vmarko@google.com> |
Optimizing: Do not use range-based loop when inserting elements. When we iterate over the elements of a container and we may insert new elements into that container, it's wrong to use the range-based loop. Bug: 24133462 Change-Id: Iee35fbcf88ed3bcd6155cbeba09bd256032a16be
|
76c92ac73eeda2582caee39dd427ca035caf172b |
|
17-Sep-2015 |
Vladimir Marko <vmarko@google.com> |
Optimizing: Allow storing value objects in containers. Change-Id: Ic9c6b62e36706e571fd71c18d24d8e76ae2d5c7b
|
e6f49b47b6a4dc9c7684e4483757872cfc7ff1a1 |
|
17-Sep-2015 |
Calin Juravle <calin@google.com> |
Revert "Support unresolved fields in optimizing" breaks debuggable tests. This reverts commit 23a8e35481face09183a24b9d11e505597c75ebb. Change-Id: I8e60b5c8f48525975f25d19e5e8066c1c94bd2e5
|
64acf303eaa2f32c0b1d8cfcbf044a822c5eec08 |
|
14-Sep-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Optimize code generation of check-cast and instance-of. On x86/x64/arm/arm64. Improve code size of selected apks from 0.3% to 1%, and performance of DeltaBlue by 20%. Change-Id: Ib5799f7a53443cd880a121dd7f21932ae9f5c7aa
|
23a8e35481face09183a24b9d11e505597c75ebb |
|
08-Sep-2015 |
Calin Juravle <calin@google.com> |
Support unresolved fields in optimizing Change-Id: I9941fa5fcb6ef0a7a253c7a0b479a44a0210aad4
|
175dc732c80e6f2afd83209348124df349290ba8 |
|
25-Aug-2015 |
Calin Juravle <calin@google.com> |
Support unresolved methods in Optimizing Change-Id: If2da02b50d2fa668cd58f134a005f1752e7746b1
|
71bf8090663d02869cafafdd530976f7f2a9db7f |
|
15-Sep-2015 |
Vladimir Marko <vmarko@google.com> |
Optimizing: Tag arena allocations in SsaBuilder. Replace GrowableArray with ArenaVector in SsaBuilder and tag allocations with a new arena allocation type. Change-Id: I27312c51d7be9d2ad02a974cce93b365c65c5fc4
|
fa6b93c4b69e6d7ddfa2a4ed0aff01b0608c5a3a |
|
15-Sep-2015 |
Vladimir Marko <vmarko@google.com> |
Optimizing: Tag arena allocations in HGraph. Replace GrowableArray with ArenaVector in HGraph and related classes HEnvironment, HLoopInformation, HInvoke and HPhi, and tag allocations with new arena allocation types. Change-Id: I3d79897af405b9a1a5b98bfc372e70fe0b3bc40d
|
77a48ae01bbc5b05ca009cf09e2fcb53e4c8ff23 |
|
15-Sep-2015 |
David Brazdil <dbrazdil@google.com> |
Revert "Revert "ART: Register allocation and runtime support for try/catch"" The original CL triggered b/24084144 which has been fixed by Ib72e12a018437c404e82f7ad414554c66a4c6f8c. This reverts commit 659562aaf133c41b8d90ec9216c07646f0f14362. Change-Id: Id8980436172457d0fcb276349c4405f7c4110a55
|
659562aaf133c41b8d90ec9216c07646f0f14362 |
|
14-Sep-2015 |
David Brazdil <dbrazdil@google.com> |
Revert "ART: Register allocation and runtime support for try/catch" Breaks libcore test org.apache.harmony.security.tests.java.security.KeyStorePrivateKeyEntryTest#testGetCertificateChain. Need to investigate. This reverts commit b022fa1300e6d78639b3b910af0cf85c43df44bb. Change-Id: Ib24d3a80064d963d273e557a93469c95f37b1f6f
|
b022fa1300e6d78639b3b910af0cf85c43df44bb |
|
20-Aug-2015 |
David Brazdil <dbrazdil@google.com> |
ART: Register allocation and runtime support for try/catch This patch completes a series of CLs that add support for try/catch in the Optimizing compiler. With it, Optimizing can compile all methods containing try/catch, provided they don't contain catch loops. Future work will focus on improving performance of the generated code. SsaLivenessAnalysis was updated to propagate liveness information of instructions live at catch blocks, and to keep location information on instructions which may be caught by catch phis. RegisterAllocator was extended to spill values used after catch, and to allocate spill slots for catch phis. Catch phis generated for the same vreg share a spill slot as the raw value must be the same. Location builders and slow paths were updated to reflect the fact that throwing an exception may not lead to escaping the method. Instruction code generators are forbidden from using implicit null checks in try blocks as live registers need to be saved before handing over to the runtime. CodeGenerator emits a stack map for each catch block, storing locations of catch phis. CodeInfo and StackMapStream recognize this new type of stack map and store them separately from other stack maps to avoid dex_pc conflicts. After having found the target catch block to deliver an exception to, QuickExceptionHandler looks up the dex register maps at the throwing instruction and the catch block and copies the values over to their respective locations. The runtime-support approach was selected because it allows for the best performance in the normal control-flow path, since no propagation of catch phi values is necessary until the exception is thrown. In addition, it also greatly simplifies the register allocation phase. ConstantHoisting was removed from LICMTest because it instantiated (now abstract) HConstant and was bogus anyway (constants are always in the entry block). Change-Id: Ie31038ad8e3ee0c13a5bbbbaf5f0b3e532310e4e
|
3ecfd65143d95bd7c6cbe4f58c33af517d3761e0 |
|
07-Sep-2015 |
Yevgeny Rouban <yevgeny.y.rouban@intel.com> |
Add dex_pc to all HInstructions in builder. Optimizing compiler generates minimum debug line info that is built using the dex_pc information about suspend points. This is not enough for performance and debugging needs. This patch makes all HInstructions contain dex_pc and all allocations in the builder define this value. Change-Id: I1d14aefe075189b7b1b41b4384c3499474c19afc Signed-off-by: Yevgeny Rouban <yevgeny.y.rouban@intel.com> Signed-off-by: Serdjuk, Nikolay Y <nikolay.y.serdjuk@intel.com>
|
0616ae081e648f4b9b64b33e2624a943c5fce977 |
|
17-Apr-2015 |
Mark Mendell <mark.p.mendell@intel.com> |
[optimizing] Add support for x86 constant area Use the Quick trick of finding the address of the method by calling the next instruction and popping the return address into a register. This trick is used because of the lack of PC-relative addressing in 32 bit mode on the X86. Add a HX86ComputeBaseMethodAddress instruction to trigger generation of the method address, which is referenced by instructions needing access to the constant area. Add a HX86LoadFromConstantTable instruction that takes a HX86ComputeBaseMethodAddress and a HConstant that will be used to load the value when needed. Change Add/Sub/Mul/Div to detect a HX86LoadFromConstantTable right hand side, and generate code that directly references the constant area. Other uses will be added later. Change the inputs to HReturn and HInvoke(s), replacing the FP constants with HX86LoadFromConstantTable instead. This allows values to be loaded from the constant area into the right location. Port the X86_64 assembler constant area handling to the X86. Use the new per-backend optimization framework to do this conversion. Change-Id: I6d235a72238262e4f9ec0f3c88319a187f865932 Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>
|
6058455d486219994921b63a2d774dc9908415a2 |
|
03-Sep-2015 |
Vladimir Marko <vmarko@google.com> |
Optimizing: Tag basic block allocations with their source. Replace GrowableArray with ArenaVector in HBasicBlock and, to track the source of allocations, assign one new and two Quick's arena allocation types to these vectors. Rename kArenaAllocSuccessor to kArenaAllocSuccessors. Bug: 23736311 Change-Id: Ib52e51698890675bde61f007fe6039338cf1a025
|
736b560f2d2c89b63dc895888c671b5519afa4c8 |
|
02-Sep-2015 |
Mathieu Chartier <mathieuc@google.com> |
Reduce how often we call FindDexCache Before host boot.oat -j4 optimizing compile: real 1m17.792s user 3m26.140s sys 0m8.340s After: real 1m12.324s user 3m22.718s sys 0m8.320s Change-Id: If18e9e79e06cdf1676692e5efacb682bf93889c3
|
145acc5361deb769eed998f057bc23abaef6e116 |
|
03-Sep-2015 |
Vladimir Marko <vmarko@google.com> |
Revert "Optimizing: Tag basic block allocations with their source." Reverting so that we can have more discussion about the STL API. This reverts commit 91e11c0c840193c6822e66846020b6647de243d5. Change-Id: I187fe52f2c16b6e7c5c9d49c42921eb6c7063dba
|
91e11c0c840193c6822e66846020b6647de243d5 |
|
02-Sep-2015 |
Vladimir Marko <vmarko@google.com> |
Optimizing: Tag basic block allocations with their source. Replace GrowableArray with ArenaVector in HBasicBlock and, to track the source of allocations, assign one new and two Quick's arena allocation types to these vectors. Rename kArenaAllocSuccessor to kArenaAllocSuccessors. Bug: 23736311 Change-Id: I984aef6e615ae2380a532f5c6726af21015f43f5
|
2a7c1ef95c850abae915b3a59fbafa87e6833967 |
|
22-Jul-2015 |
Yevgeny Rouban <yevgeny.y.rouban@intel.com> |
Add more dwarf debug line info for Optimized methods. Optimizing compiler generates minimum debug line info that is built using the dex_pc information about suspend points. This is not enough for performance and debugging needs. This CL generates additional debug line information for instructions which have known dex_pc and it ensures that whole call sites are mapped (as opposed to suspend points which map only one instruction past the function call). Bug: 23157336 Change-Id: I9f2b1c2038e3560847c175b8121cf9496b8b58fa Signed-off-by: Yevgeny Rouban <yevgeny.y.rouban@intel.com>
|
f9f6441c665b5ff9004d3ed55014f46d416fb1bb |
|
02-Sep-2015 |
Vladimir Marko <vmarko@google.com> |
Optimizing: Tag Arena allocations with their source. This adds the ability to track where we allocate memory when the kArenaAllocatorCountAllocations flag is turned on. Also move some allocations from native heap to the Arena and remove some unnecessary utilities. Bug: 23736311 Change-Id: I1aaef3fd405d1de444fe9e618b1ce7ecef07ade3
|
68ad649d3918f2eed3a37209c01a7f0a0faf09f0 |
|
18-Aug-2015 |
Calin Juravle <calin@google.com> |
Refactor BuildInvoke. BuildInvoke got to be too complex and unreadable. This breaks it down into smaller pieces. Change-Id: Ibda63f69f5a1be537ae13e18a5f67c361173f4a6
|
e418dda75998e0186f7580c2c54705767c3c8f1f |
|
12-Aug-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Be more flexible on the code unit size when inlining. This change increases the maximum code unit size, and folds parameters in the inlinee in the hope of reducing the overall size of the graph. We then make sure we don't inline methods that have more than N HInstructions. Also, remove the kAccDontInline flag on ArtMethod. The compiler does not need it anymore. Change-Id: I4cd3da40e551f30ba83b8b274728b87e67f6812e
|
05f2056b4f11e0b2bac92b2655abe7030771f5dc |
|
19-Aug-2015 |
Agi Csaki <agicsaki@google.com> |
Add support to indicate whether intrinsics require a dex cache A structural change to indicate whether a given intrinsic requires access to a dex cache. I updated the needs_environment_ field to indicate whether an HInvoke needs an environment or a dex cache, and if an HInvoke represents an intrinsified method, we utilize this field to determine if the HInvoke needs a dex cache. Bug: 21481923 Change-Id: I9dd25a385e1a1397603da6c4c43f6c1aea511b32
|
bbd733e4ef277eff19bf9a6601032da081e9b68f |
|
18-Aug-2015 |
David Brazdil <dbrazdil@google.com> |
ART: Enable basic optimizations for try/catch Generating code for try/catch methods requires having run at least the instruction simplifier to remove redundant suspend checks. This patch enables the first group of optimizations when try/catch is present. Enabled optimizations: 1) IntrinsicsRecognizer Does not modify the graph, only sets HInvoke::intrinsic_. 2) ConstantFolding Does not deal with throwing instructions. 3) InstructionSimplifier May remove a throwing instruction (e.g. LoadClass in VisitCheckCast), or may turn a throwing instruction into a non-throwing one (ArraySet). Their corresponding catch phi inputs are not removed but correctness is preserved. 4) ReferenceTypePropagation Does not modify the graph, only sets type properties. Typing of LoadException from catch handler information was added. 5) DeadCodeElimination Removing individual instructions is fine (same as 3). Removal of dead blocks was disabled for try/catch. Change-Id: I2722c3229eb8aaf326391e07f522dbf5186774b8
|
581550137ee3a068a14224870e71aeee924a0646 |
|
19-Aug-2015 |
Vladimir Marko <vmarko@google.com> |
Revert "Revert "Optimizing: Better invoke-static/-direct dispatch."" Fixed kCallArtMethod to use correct callee location for kRecursive. This combination is used when compiling with debuggable flag set. This reverts commit b2c431e80e92eb6437788cc544cee6c88c3156df. Change-Id: Idee0f2a794199ebdf24892c60f8a5dcf057db01c
|
ec16f79a4d0aeff319bf52139a0c82de3080d73c |
|
19-Aug-2015 |
David Brazdil <dbrazdil@google.com> |
ART: Refactor try/catch block info, store exception type This patch replaces HBasicBlock fields storing try/catch info with a single TryCatchInformation data structure, saving memory for the majority of non-try/catch blocks. It also changes builder to store the exception type for catch blocks. Change-Id: Ib3e43f7db247e6915d67c267fc62410420e230c9
|
b2c431e80e92eb6437788cc544cee6c88c3156df |
|
19-Aug-2015 |
Vladimir Marko <vmarko@google.com> |
Revert "Optimizing: Better invoke-static/-direct dispatch." Reverting due to failing ndebug tests. This reverts commit 9b688a095afbae21112df5d495487ac5231b12d0. Change-Id: Ie4f69da6609df3b7c8443412b6cf7f5c43c2c5d9
|
9b688a095afbae21112df5d495487ac5231b12d0 |
|
06-May-2015 |
Vladimir Marko <vmarko@google.com> |
Optimizing: Better invoke-static/-direct dispatch. Add framework for different types of loading ArtMethod* and code pointer retrieval. Implement invoke-static and invoke-direct calls the same way as Quick. Document the dispatch kinds in HInvokeStaticOrDirect's new enumerations MethodLoadKind and CodePtrLocation. PC-relative loads from dex cache arrays are used only for x86-64 and arm64. The implementation for other architectures will be done in separate CLs. Change-Id: I468ca4d422dbd14748e1ba6b45289f0d31734d94
|
29fc008c9689e9036a3f5e3bd186bbfb5de3cb82 |
|
18-Aug-2015 |
David Brazdil <dbrazdil@google.com> |
ART: Revert storing of exceptional predecessors After change of the approach for try/catch register allocation, it is no longer necessary to record instructions which might throw into a catch block. Change-Id: I7ef12ed06c49a35280029810975fa2a50fe4a424
|
df3f8227badd0276177774a72f1bcb181688d954 |
|
13-Aug-2015 |
Roland Levillain <rpl@google.com> |
Adjust art::HTypeConversion's side effects for MIPS64. Also improve debugging information in art::CodeGenerator::ValidateInvokeRuntime. Change-Id: Icfcd1a5cfa5e5449a316251dc20547de6badecb5
|
efa8468c78fdd808043dfb664b56541f3f2dd0e8 |
|
13-Aug-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Small optimization improvements. - Tune CanBeNull for HBoundType. - Remove LoadClass when we know the class is loaded. - Tune CanBeNull for StringInit. Change-Id: I564ed33a506d65e991a514342bdfd1610bed0cf5
|
57b81ecbe74138992dd447251e94ed06cd5eb802 |
|
12-Aug-2015 |
agicsaki <agicsaki@google.com> |
Add support to indicate whether intrinsics require an environment A structural change to indicate whether a given intrinsic requires access to an environment. I added a field to HInvoke objects to indicate if they need an environment whose default value is true and is only updated if an intrinsic is marked as not requiring an environment. At this point there is no functional change, as all intrinsics are marked as requiring an environment. This change adds the structure for future inliner work which will allow us to inline more intrinsified calls. Change-Id: I2930e3cef7b785384bf95b95a542d34af442f3b9
|
3887c468d731420e929e6ad3acf190d5431e94fc |
|
12-Aug-2015 |
Roland Levillain <rpl@google.com> |
Remove unnecessary `explicit` qualifiers on constructors. Change-Id: Id12e392ad50f66a6e2251a68662b7959315dc567
|
78e3ef6bc5f8aa149f2f8bf0c78ce854c2f910fa |
|
12-Aug-2015 |
Alexandre Rames <alexandre.rames@linaro.org> |
Add a GVN dependency 'GC' for garbage collection. This will be used by incoming architecture specific optimizations. The dependencies must be conservative. When an HInstruction is created we may not be sure whether it can trigger GC. In that case the 'ChangesGC' dependency must be set. We control at code-generation time that HInstructions that can call have the 'ChangesGC' dependency set. Change-Id: Iea6a7f430009f37a9599b0a0039207049906e45d
|
8c0676ce786f33b8f9c8eedf1ace48988c750932 |
|
03-Aug-2015 |
Serguei Katkov <serguei.i.katkov@intel.com> |
ART-Optimizing: Fix the type of HDivZeroCheck HDivZeroCheck is created during the building CFG and at this moment its type is not known completely. So it sets the type to int or long. However, later SSA builder can insert the type conversion and type of input of HDivZeroCheck can become byte or short while the type of HDivZeroCheck remains the same. In reality the type of HDivZeroCheck should be always equal to its input parameter. To fix this inconsistency we return the type of HDivZeroCheck as its input type. Code generators are updated accordingly. Change-Id: I6a5aedc8d479cfc6328704e7ddf252bca830076b Signed-off-by: Serguei Katkov <serguei.i.katkov@intel.com>
|
9867bc722f7c41e07a95397bc08b790cd21dc758 |
|
05-Aug-2015 |
Roland Levillain <rpl@google.com> |
Have constant folding be more flexible. - Have Evaluate methods take as argument(s) and return value instances of HConstant (instead of built-in 32- or 64-bit integer values), to let the evaluated instruction choose the type of the statically evaluated node; for instance, art::HEqual::Evaluate shall return a HIntConstant node (as implementation of a Boolean constant) whatever the type of its inputs (a pair of HIntConstant or a pair of HLongConstant). - Split the evaluation job from the operation logic: the former is addressed by Evaluate methods, while the latter is done by a generic Compute method. - Address valid BinOp(int, long) and BinOp(long, int) cases. - Add a constructor to art::HIntConstant to build an integer constant from a `bool` value. Change-Id: If84b6fe8406bb94ddb1aa8b02e36628dff526db3
|
cb1c0557033065f2436ee79e7fa6c19d87064801 |
|
04-Aug-2015 |
David Brazdil <dbrazdil@google.com> |
ART: Move exception clearing into own instruction Runtime delivers exceptions only to catch blocks which begin with a MOVE_EXCEPTION instruction (in DEX). In that case, the catch block is expected to clear the thread-local exception storage after having read the exception reference. This patch changes Optimizing to represent MOVE_EXCEPTION with two instructions - HLoadException and HClearException - instead of one. If the exception reference is not used, HLoadException can be safely removed, saving a memory load without breaking the runtime behaviour. Change-Id: Idad8a714467bf9d9d5fccefbc43c0bd8ae13ddba
|
b618adebbc19e50d7b1aa2f11b84341beb3c64dc |
|
29-Jul-2015 |
David Brazdil <dbrazdil@google.com> |
ART: Store and check exceptional predecessors Future CL on register allocation for try/catch will require the knowledge of instructions which throw into a catch block. This patch stores that information with the basic block and verifies it in the graph checker. More checks on try catch also added to the graph checker and an order of exception handlers is enforced in TryBoundary successors. Change-Id: I3034c610791ea51d96724bcca97f49ec6ecf2af3
|
2e76830f0b3f23825677436c0633714402715099 |
|
28-Jul-2015 |
Calin Juravle <calin@google.com> |
Revert "Revert "Revert "Revert "Use the object class as top in reference type propagation"""" This reverts commit b734808d0c93af98ec4e3539fdb0a8c0787263b0. Change-Id: Ifd925f166761bcb9be2268ff0fc9fa3a72f00c6f
|
a5ae3c3f468ffe3a317b498d7fde1f8e9325346a |
|
28-Jul-2015 |
Calin Juravle <calin@google.com> |
Revert "Revert "Revert "Revert "Fixes and improvements in ReferenceTypePropagation"""" This reverts commit e344a8070d4549d513413c06767abf8a2c5e9709. Change-Id: I400fab0e02ce3c11376cc1f3ae9c7cf2c82ffcc1
|
e344a8070d4549d513413c06767abf8a2c5e9709 |
|
28-Jul-2015 |
Calin Juravle <calin@google.com> |
Revert "Revert "Revert "Fixes and improvements in ReferenceTypePropagation""" This reverts commit 00e3b38be4b280d6d7a7e843cd336ffbd2ba4365. Change-Id: I4dbadb2d7312a410f1c56283f063dd82156cf702
|
b734808d0c93af98ec4e3539fdb0a8c0787263b0 |
|
28-Jul-2015 |
Calin Juravle <calin@google.com> |
Revert "Revert "Revert "Use the object class as top in reference type propagation""" This reverts commit 80caa1478cf3df4eac1214d8a63a4da6f4fe622b. Change-Id: I63b51ca418b19b2bfb5ede3f8444f8fbeb8a339d
|
80caa1478cf3df4eac1214d8a63a4da6f4fe622b |
|
16-Jul-2015 |
Calin Juravle <calin@google.com> |
Revert "Revert "Use the object class as top in reference type propagation"" This reverts commit 7733bd644ac71f86d4b30a319624b23343882e53. Change-Id: I7d393a808c01c084c18d632a54e0554b4b455f2c
|
00e3b38be4b280d6d7a7e843cd336ffbd2ba4365 |
|
15-Jul-2015 |
Calin Juravle <calin@google.com> |
Revert "Revert "Fixes and improvements in ReferenceTypePropagation"" This reverts commit 9b0096ba77e7e61bc2dcbbf954831dcae54a6c27. Change-Id: I824f16e800ca32e646577d5e1e0d593887ccead1
|
90443477f9a0061581c420775ce3b7eeae7468bc |
|
17-Jul-2015 |
Mathieu Chartier <mathieuc@google.com> |
Move to newer clang annotations Also enable -Wthread-safety-negative. Changes: Switch to capabilities and negative capabilities. Future work: Use capabilities to implement uninterruptible annotations to work with AssertNoThreadSuspension. Bug: 20072211 Change-Id: I42fcbe0300d98a831c89d1eff3ecd5a7e99ebf33
|
7733bd644ac71f86d4b30a319624b23343882e53 |
|
22-Jul-2015 |
Calin Juravle <calin@google.com> |
Revert "Use the object class as top in reference type propagation" This reverts commit 3fabec7a25d151b26ba7de13615bbead0dd615a6. Change-Id: Id8614f6b6e3e0e4c9caeb9f771e4c145d9fec64f
|
9b0096ba77e7e61bc2dcbbf954831dcae54a6c27 |
|
22-Jul-2015 |
Calin Juravle <calin@google.com> |
Revert "Fixes and improvements in ReferenceTypePropagation" This reverts commit b0d5fc0ac139da4aaa1440263416b9bde05630b0. Change-Id: Iea8adfc0bd4cb7ee2b292278b8bac80a259acbd1
|
1c4ccea094965fb5ba491ace846d154f00d30055 |
|
22-Jul-2015 |
Alexandre Rames <alexandre.rames@linaro.org> |
Delete extraneous prefix `SideEffects::` in `nodes.h`. Change-Id: Ic0a8442d20323df0d9db9e6a1d26c07bd903a13d
|
3fabec7a25d151b26ba7de13615bbead0dd615a6 |
|
16-Jul-2015 |
Calin Juravle <calin@google.com> |
Use the object class as top in reference type propagation This properly types all instructions, making it safe to query the type at any time. This also moves a few functions from class.h to class-inl.h to please gcc linker when compiling for target. Change-Id: I6b7ce965c10834c994b95529ab65a548515b4406
|
b0d5fc0ac139da4aaa1440263416b9bde05630b0 |
|
15-Jul-2015 |
Calin Juravle <calin@google.com> |
Fixes and improvements in ReferenceTypePropagation - Bound object types after a CheckCast. This increases the precision of (inlining) generic operations. - Make sure that the BoundType is exact when the class is final. - Make sure that we don't duplicate BoundTypes when we run the analysis more than once. Change-Id: Ic22b610766fae101f942c0d753ddcac32ac1844a
|
34c3ba93e74d14ab832297ff590cb76c3f0f519d |
|
20-Jul-2015 |
Aart Bik <ajcbik@google.com> |
Fix broken tests. Rationale: (1) volatile field write/read need to apply to all to comply with Java memory model (2) clinit needs only the write (3) added conservative assumptions to memory barrier (nothing broke, but this seems better) Change-Id: I37787ec8f3f2c8d6166a94c57193fa4544ad3372
|
854a02b1b488327f80c544ca1119b386b8715c26 |
|
15-Jul-2015 |
Aart Bik <ajcbik@google.com> |
Improved side effect analysis (field/array write/read). Rationale: Types (int, float etc.) and access type (field vs. array) can be used to disambiguate write/read side-effects analysis. This directly improves e.g. dead code elimination and licm. Change-Id: I371f6909a3f42bda13190a03f04c4a867bde1d06
|
ffee3d33f3ea39aa6031c3d2ff29c4806c8dcc51 |
|
06-Jul-2015 |
David Brazdil <dbrazdil@google.com> |
ART: Build SSA form when try/catch is present This patch implements support for try/catch in the SsaBuilder. Values of locals are propagated from throwing sites inside try blocks to their respective catch blocks and phis ("catch phis") are created when necessary. Change-Id: I0736565c2c4ff3f9f0924b6e3a785a50023f875a
|
2e7cd752452d02499a2f5fbd604c5427aa372f00 |
|
10-Jul-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
[optimizing] Don't rely on the verifier for String.<init>. Continue work on cutting the dependency on the verifier. Change-Id: I0f95b1eb2e10fd8f6bf54817f1202bdf6dfdb0fe
|
bff7503625400b610a43678c6930354146ce5f92 |
|
08-Jul-2015 |
David Brazdil <dbrazdil@google.com> |
Revert "Revert "ART: Ignore try blocks with no throwing instructions"" The original CL broke libcore tests because monitor-exit instructions did not have any side-effects and got removed by DCE once not labelled throwing any more. This reverts commit efe374d7c25c1d48945a9198d96469de99e0c1bd. Change-Id: I624c0f91676d9baaada6f33be9d7091f68d57535
|
efe374d7c25c1d48945a9198d96469de99e0c1bd |
|
08-Jul-2015 |
David Brazdil <dbrazdil@google.com> |
Revert "ART: Ignore try blocks with no throwing instructions" Turns out monitor-exit *can* throw... Need to investigate This reverts commit 8f8ee680bec71a28d9d7b7538e8c7ca100a18184. Change-Id: I8b42690918833c917b6a7fc3ceea932b7c1a6f15
|
beba9302bec33d72beb582970bf23d056f62641f |
|
08-Jul-2015 |
Calin Juravle <calin@google.com> |
Revert "Use the object class as top in reference type propagation" failing on the build bot on some targets but not locally. needs more investigation. This reverts commit 20e6071362b84a9782b633a893c29ebde458205e. Change-Id: I6965483f569fb862f9bdb66d459b747ded54de71
|
8f8ee680bec71a28d9d7b7538e8c7ca100a18184 |
|
08-Jul-2015 |
David Brazdil <dbrazdil@google.com> |
ART: Ignore try blocks with no throwing instructions In order to avoid complex removal of redundant exceptional edges in the SSA builder, this patch modified the graph builder to consider blocks without throwing instructions as not in a try block, even if covered by a TryItem. In some corner cases, this may generate more TryBoundaries than necessary, but those can be removed once the SSA form is built. Change-Id: I158c4542b2c1964a8dd532f82e921b9cb1997e1e
|
4fa13f65ece3b68fe3d8722d679ebab8656bbf99 |
|
06-Jul-2015 |
Roland Levillain <rpl@google.com> |
Fuse long and FP compare & condition on ARM in Optimizing. Also: - Stylistic changes in corresponding parts on the x86 and x86-64 code generators. - Update and improve the documentation of art::arm::Condition. Bug: 21120453 Change-Id: If144772046e7d21362c3c2086246cb7d011d49ce
|
20e6071362b84a9782b633a893c29ebde458205e |
|
01-Jul-2015 |
Calin Juravle <calin@google.com> |
Use the object class as top in reference type propagation This properly types all instructions, making it safe to query the type at any time. Change-Id: I3ee2f0f79253cdf45b10ddab37ecb473345ca53a
|
c470193cfc522fc818eb2eaab896aef9caf0c75a |
|
10-Apr-2015 |
Mark Mendell <mark.p.mendell@intel.com> |
Fuse long and FP compare & condition on x86/x86-64 in Optimizing. This is a preliminary implementation of fusing long/float/double compares with conditions to avoid materializing the result from the compare and condition. The information from a HCompare is transferred to the HCondition if it is legal. There must be only a single use of the HCompare, the HCompare and HCondition must be in the same block, the HCondition must not need materialization. Added GetOppositeCondition() to HCondition to return the flipped condition. Bug: 21120453 Change-Id: I1f1db206e6dc336270cd71070ed3232dedc754d6 Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>
|
7d5ea03b2a7d886325b3ad97942038c2336aa855 |
|
02-Jul-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Do not create a HBoundType when the instruction is non-null. We don't need to refine the type after a null check, if the instruction is known non null or null. As a side effect, this avoids replacing HLoadClass instructions with HBoundType instructions. bug:22116987 (cherry picked from commit 3abd437507f8ba30a238a52c273c9944dcb9d5a1) Change-Id: I5e56de293554534195ade9770b7d1e4b078d685b
|
3abd437507f8ba30a238a52c273c9944dcb9d5a1 |
|
02-Jul-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Do not create a HBoundType when the instruction is non-null. We don't need to refine the type after a null check, if the instruction is known non null or null. As a side effect, this avoids replacing HLoadClass instructions with HBoundType instructions. bug:22116987 Change-Id: I565ae95db5a64faec30e026674636e398e0bf445
|
49bace1ccbec6f12b5b475ccc2ce76e0b666b500 |
|
01-Jul-2015 |
David Brazdil <dbrazdil@google.com> |
Address additional comments on try-catch CL Extra documentation of try-catch building. Change-Id: I5048c5fcb354c76fa4a60c3d8d21dd216bc9f6cd
|
56e1accf3966ae92e151567abf4561ef3f6466f4 |
|
30-Jun-2015 |
David Brazdil <dbrazdil@google.com> |
ART: Changes to try-catch in GraphBuilder This patch adds an additional case into the insertion algorithm for HTryBoundary inside HGraphBuilder in order to better handle catch blocks covered by a TryItem. Building SSA form also required to stop combining HTryBoundaries for neighbouring TryItems because it was not clear which exception handlers belong to which try block. Change-Id: Ic68bd6ef98fee784609fa593cb08dca1f00a15e0
|
a1935c4fa255b5c20f5e9b2abce6be2d0f7cb0a8 |
|
26-Jun-2015 |
Roland Levillain <rpl@google.com> |
MIPS: Initial version of optimizing compiler for MIPS64R6. (cherry picked from commit 4dda3376b71209fae07f5c3c8ac3eb4b54207aa8) (amended for mnc-dev) Bug: 21555893 Change-Id: I874dc356eee6ab061a32f8f3df5f8ac3a4ab7dcf Signed-off-by: Alexey Frunze <Alexey.Frunze@imgtec.com> Signed-off-by: Douglas Leung <douglas.leung@imgtec.com>
|
fc6a86ab2b70781e72b807c1798b83829ca7f931 |
|
26-Jun-2015 |
David Brazdil <dbrazdil@google.com> |
Revert "Revert "ART: Implement try/catch blocks in Builder"" This patch enables the GraphBuilder to generate blocks and edges which represent the exceptional control flow when try/catch blocks are present in the code. Actual compilation is still delegated to Quick and Baseline ignores the additional code. To represent the relationship between try and catch blocks, Builder splits the edges which enter/exit a try block and links the newly created blocks to the corresponding exception handlers. This layout will later enable the SsaBuilder to correctly infer the dominators of the catch blocks and to produce the appropriate reverse post ordering. It will not, however, allow for building the complete SSA form of the catch blocks and consequently optimizing such blocks. To this end, a new TryBoundary control-flow instruction is introduced. Codegen treats it the same as a Goto but it allows for additional successors (the handlers). This reverts commit 3e18738bd338e9f8363b26bc895f38c0ec682824. Change-Id: I4f5ea961848a0b83d8db3673763861633e9bfcfb
|
3e18738bd338e9f8363b26bc895f38c0ec682824 |
|
26-Jun-2015 |
David Brazdil <dbrazdil@google.com> |
Revert "ART: Implement try/catch blocks in Builder" Causes OutOfMemory issues, need to investigate. This reverts commit 0b5c7d1994b76090afcc825e737f2b8c546da2f8. Change-Id: I263e6cc4df5f9a56ad2ce44e18932ca51d7e349f
|
0b5c7d1994b76090afcc825e737f2b8c546da2f8 |
|
11-Jun-2015 |
David Brazdil <dbrazdil@google.com> |
ART: Implement try/catch blocks in Builder This patch enables the GraphBuilder to generate blocks and edges which represent the exceptional control flow when try/catch blocks are present in the code. Actual compilation is still delegated to Quick and Baseline ignores the additional code. To represent the relationship between try and catch blocks, Builder splits the edges which enter/exit a try block and links the newly created blocks to the corresponding exception handlers. This layout will later enable the SsaBuilder to correctly infer the dominators of the catch blocks and to produce the appropriate reverse post ordering. It will not, however, allow for building the complete SSA form of the catch blocks and consequently optimizing such blocks. To this end, a new TryBoundary control-flow instruction is introduced. Codegen treats it the same as a Goto but it allows for additional successors (the handlers). Change-Id: I415b985596d5bebb7b1bb358a46e08b7b04bb53a
|
18b236e5261d2b1f312e632a4d3bb2273c8bf641 |
|
24-Jun-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Recompute dominator tree after DCE. bug:22031382 (cherry picked from commit 1f82ecc6a0c9f88d03d6d1a6d95eeb8707bd06c1) Change-Id: I9a74edb185cb806045903dfe9695d9cc1a02e86b
|
69ba7b7112c2277ac225615b37e6df74c055740d |
|
23-Jun-2015 |
David Brazdil <dbrazdil@google.com> |
ART: Run GraphChecker after Builder and SsaBuilder This patch refactors the way GraphChecker is invoked, utilizing the same scoping mechanism as pass timing and graph visualizer. Therefore, GraphChecker will now run not just after instances of HOptimization but after the builders and reg alloc, too. Change-Id: I8173b98b79afa95e1fcbf3ac9630a873d7f6c1d4
|
f39e0641a6d1a6561b20f6a130d1e763788cd70b |
|
23-Jun-2015 |
Alexandre Rames <alexandre.rames@linaro.org> |
Minor fixes to mips64 for the arch-specific optimisation framework. Change-Id: I9d49ea61c732e4fc6b3393aa8778951e29ce4efe
|
1f82ecc6a0c9f88d03d6d1a6d95eeb8707bd06c1 |
|
24-Jun-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Recompute dominator tree after DCE. bug:22031382 Change-Id: Ifebe169897b76872015e3ce0ed7d0a9662f80cef
|
1e256bf257e8d97df9b2178ae8658b731ca2d662 |
|
19-Jun-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Be careful with predecessor/successor index. When we simplify the CFG, we must preserve things that were already simplified. For example, the index in the predecessor list or successor list of a block must be preserved for ensuring the first block is a loop pre header. bug:21867463 (cherry picked from commit 8b20f88b0a8d1b374dd5eaae289d19734c77b8f8) Change-Id: I2581b5a50942290da96cd9ec876f6f2573e0a6c4
|
bca381a12965a98e3727e93986dd0a195db500a0 |
|
20-May-2015 |
Mingyao Yang <mingyao@google.com> |
Fix premature deoptimization if the loop body isn't entered. Add a test between initial_ and end_ to see if the loop body is entered. If the loop body isn't entered at all, we jump to the loop header. Loop header is still executed and is going to test the condition again and loop body won't be entered. This makes sure no deoptimization is triggered if the loop body isn't even entered. Bug: 21034044 (cherry picked from commit 3584bce5b1f45e5741d3a6ca24884a36320ecb6b) Change-Id: I2b6de1f22fbc4568ca419f76382ebd87806d9694
|
8b20f88b0a8d1b374dd5eaae289d19734c77b8f8 |
|
19-Jun-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Be careful with predecessor/successor index. When we simplify the CFG, we must preserve things that were already simplified. For example, the index in the predecessor list or successor list of a block must be preserved for ensuring the first block is a loop pre header. bug:21867463 Change-Id: Ic3fcb3eb2c3fb109d8a57ee2a6b6d4d65fdb9410
|
4dda3376b71209fae07f5c3c8ac3eb4b54207aa8 |
|
02-Jun-2015 |
Alexey Frunze <Alexey.Frunze@imgtec.com> |
MIPS: Initial version of optimizing compiler for MIPS64R6. Bug: 21555893 Change-Id: I874dc356eee6ab061a32f8f3df5f8ac3a4ab7dcf Signed-off-by: Alexey Frunze <Alexey.Frunze@imgtec.com> Signed-off-by: Douglas Leung <douglas.leung@imgtec.com>
|
f78848f2ced8466b5fb2d7148d608288ee88757b |
|
17-Jun-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Don't special case HCurrentMethod in DCE. Instead, re-create the HCurrentMethod if it is needed after it has been removed. Change-Id: Id3bf15ae87b00a1d7eb35bf36d58fe96f788fba4
|
78f4fa74ae2d392ca9314b7ab25386d0e9a07cdb |
|
12-Jun-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Preserve class loading semantics in optimizing. We were being too aggressive in removing HLoadClass instructions. A HInvokeStaticOrDirect can only remove it if it's directly before it. bug: 21711097 Change-Id: Id63502d90e11da60eccfb46daca62e0d5d022119
|
3584bce5b1f45e5741d3a6ca24884a36320ecb6b |
|
20-May-2015 |
Mingyao Yang <mingyao@google.com> |
Fix premature deoptimization if the loop body isn't entered. Add a test between initial_ and end_ to see if the loop body is entered. If the loop body isn't entered at all, we jump to the loop header. Loop header is still executed and is going to test the condition again and loop body won't be entered. This makes sure no deoptimization is triggered if the loop body isn't even entered. Bug: 21034044 Change-Id: I2b6de1f22fbc4568ca419f76382ebd87806d9694
|
222862ceaeed48528020412ef4f7b1cdaecf8789 |
|
09-Jun-2015 |
Guillaume Sanchez <guillaumesa@google.com> |
Add optimizations for instanceof/checkcast. The optimizations try to statically determine the outcome of the type tests, replacing/removing the instructions when possible. This required fixing the is_exact flag for ReferenceTypePropagation. Change-Id: I6cea29b6c351d118b62060e8420333085e9383fb
|
ef20f71e16f035a39a329c8524d7e59ca6a11f04 |
|
09-Jun-2015 |
Alexandre Rames <alexandre.rames@linaro.org> |
Add boilerplate code for architecture-specific HInstructions. Change-Id: I2723cd96e5f03012c840863dd38d7b2168117db8
|
69aa60163989c33a008115205d39732a76ecc1dc |
|
09-Jun-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Revert "Revert "Pass current method to HNewInstance and HNewArray."" Problem exposed by this change was fixed in: https://android-review.googlesource.com/#/c/154031/ This reverts commit 7b0e353b49ac3f464c662f20e20e240f0231afff. Change-Id: I680c13dc9db9ba223ab11c7af255222860b4e6d2
|
7b0e353b49ac3f464c662f20e20e240f0231afff |
|
09-Jun-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Revert "Pass current method to HNewInstance and HNewArray." 082-inline-execute fails on x86. This reverts commit e21aa42e1341d34250742abafdd83311ad9fa737. Change-Id: Ib3fd25faee2e0128001e40d3d51a74f959bc4449
|
94015b939060f5041d408d48717f22443e55b6ad |
|
04-Jun-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Revert "Revert "Use HCurrentMethod in HInvokeStaticOrDirect."" Fix was to special case baseline for x86, which does not have enough registers to allocate the current method. This reverts commit c345f141f11faad177aa9635a78088d00cf66086. Change-Id: I5997aa52f8d4df373ae5ff4d4150dac0c44c4c10
|
e21aa42e1341d34250742abafdd83311ad9fa737 |
|
08-Jun-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Pass current method to HNewInstance and HNewArray. Also remove unused CodeGenerator::LoadCurrentMethod. Change-Id: I4b8d3f2a30b8e2c76b6b329a72555483c993cb73
|
c345f141f11faad177aa9635a78088d00cf66086 |
|
04-Jun-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Revert "Use HCurrentMethod in HInvokeStaticOrDirect." Fails on baseline/x86. This reverts commit 38207af82afb6f99c687f64b15601ed20d82220a. Change-Id: Ib71018367eb7c6046965494a7e996c22af3de403
|
38207af82afb6f99c687f64b15601ed20d82220a |
|
01-Jun-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Use HCurrentMethod in HInvokeStaticOrDirect. Change-Id: I0d15244b6b44c8b10079398c55da5071a3e3af66
|
3d21bdf8894e780d349c481e5c9e29fe1556051c |
|
22-Apr-2015 |
Mathieu Chartier <mathieuc@google.com> |
Move mirror::ArtMethod to native Optimizing + quick tests are passing, devices boot. TODO: Test and fix bugs in mips64. Saves 16 bytes per most ArtMethod, 7.5MB reduction in system PSS. Some of the savings are from removal of virtual methods and direct methods object arrays. Bug: 19264997 (cherry picked from commit e401d146407d61eeb99f8d6176b2ac13c4df1e33) Change-Id: I622469a0cfa0e7082a2119f3d6a9491eb61e3f3d Fix some ArtMethod related bugs Added root visiting for runtime methods, not currently required since the GcRoots in these methods are null. Added missing GetInterfaceMethodIfProxy in GetMethodLine, fixes --trace run-tests 005, 044. Fixed optimizing compiler bug where we used a normal stack location instead of double on ARM64, this fixes the debuggable tests. TODO: Fix JDWP tests. Bug: 19264997 Change-Id: I7c55f69c61d1b45351fd0dc7185ffe5efad82bd3 ART: Fix casts for 64-bit pointers on 32-bit compiler. Bug: 19264997 Change-Id: Ief45cdd4bae5a43fc8bfdfa7cf744e2c57529457 Fix JDWP tests after ArtMethod change Fixes Throwable::GetStackDepth for exception event detection after internal stack trace representation change. Adds missing ArtMethod::GetInterfaceMethodIfProxy call in case of proxy method. Bug: 19264997 Change-Id: I363e293796848c3ec491c963813f62d868da44d2 Fix accidental IMT and root marking regression Was always using the conflict trampoline. Also included fix for regression in GC time caused by extra roots. Most of the regression was IMT. Fixed bug in DumpGcPerformanceInfo where we would get SIGABRT due to detached thread. EvaluateAndApplyChanges: From ~2500 -> ~1980 GC time: 8.2s -> 7.2s due to 1s less of MarkConcurrentRoots Bug: 19264997 Change-Id: I4333e80a8268c2ed1284f87f25b9f113d4f2c7e0 Fix bogus image test assert Previously we were comparing the size of the non moving space to size of the image file. Now we properly compare the size of the image space against the size of the image file. 
Bug: 19264997 Change-Id: I7359f1f73ae3df60c5147245935a24431c04808a [MIPS64] Fix art_quick_invoke_stub argument offsets. ArtMethod reference's size got bigger, so we need to move other args and leave enough space for ArtMethod* and 'this' pointer. This fixes mips64 boot. Bug: 19264997 Change-Id: I47198d5f39a4caab30b3b77479d5eedaad5006ab
|
81014cb945bdf244ee0ade95163c77e1ff52f9ad |
|
02-Jun-2015 |
Mingyao Yang <mingyao@google.com> |
CanThrow() for HArraySet may return true. HArraySet can throw ArrayStoreException. Change-Id: Iba50dc95c822b079f0f1d024fbba7c5581a3d21b
|
e401d146407d61eeb99f8d6176b2ac13c4df1e33 |
|
22-Apr-2015 |
Mathieu Chartier <mathieuc@google.com> |
Move mirror::ArtMethod to native Optimizing + quick tests are passing, devices boot. TODO: Test and fix bugs in mips64. Saves 16 bytes per most ArtMethod, 7.5MB reduction in system PSS. Some of the savings are from removal of virtual methods and direct methods object arrays. Bug: 19264997 Change-Id: I622469a0cfa0e7082a2119f3d6a9491eb61e3f3d
|
d23eeef3492b53102eb8093524cf37e2b4c296db |
|
18-May-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Support for inlining methods that call/throw. Mostly fixes here and there to make it work. Change-Id: I1b535e895105d78b65634636d675b818551f783e
|
fbdaa30a448029d75422c76f29087a4e39630f4a |
|
29-May-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Use the new HCurrentMethod in HLoadString. Change-Id: I23d27e5e10736d127519eb3238ff8f25df3843a2
|
104fd8a3f30ddcf07831250571aa2a233cd5c04d |
|
20-May-2015 |
Guillaume "Vermeille" Sanchez <guillaumesa@google.com> |
Bring Reference Type Propagation to Instance/StaticInstanceField For this, we need the field index in FieldInfo, hence the add of the field. Change-Id: Id219bd826d8496acf3981307a8c42e2eb6ddb712
|
76b1e1799a713a19218de26b171b0aef48a59e98 |
|
27-May-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Add a HCurrentMethod node. This enables register allocation for the current method, so that users of it don't always load it from the stack. Currently only used by HLoadClass. Will make follow-up CLs for the other users. Change-Id: If73324d85643102faba47fabbbd2755eb258c59c
|
81d804a51d4fc415e1544a5a09505db049f4eda6 |
|
20-May-2015 |
Guillaume "Vermeille" Sanchez <guillaumesa@google.com> |
Bring Reference Type Propagation to NewArray Change-Id: Ieff4f38854e06b0ed4b5689ced94a4289053d80d
|
c7af85dad0dc392cfc0b373b0c1cb4b4197c89f4 |
|
26-May-2015 |
David Brazdil <dbrazdil@google.com> |
ART: Update graph's exit block field if removed Running DCE on an infinite loop will delete the exit block but the corresponding field is currently not cleared in the parent graph. This does not cause any problems at the moment as that information is only used in codegens to DCHECK that a block is not the exit block. However, it will be necessary to update the inliner once we start to inline methods with loops. With this patch, DCE will update the HGraph::exit_block_ field. DCHECK was also added to HGraph::InlineInto to make sure that the inlined graph does have an exit block. Change-Id: Ia8ddca375bbc6830cd919af6059a52cc9b73a023
|
d5111bf05fc0a9974280a80eeb43db6d5227a81e |
|
22-May-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Do not use dex_compilation_unit after inlining. It's incompatible with inlining, as inlined invokes/load class/new can be from another dex file. Change-Id: I8897b6a012942bc8e136f2bea70252d3fb3a7fa5
|
b176d7c6c8c01a50317f837a78de5da57ee84fb2 |
|
20-May-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Also encode the InvokeType in an InlineInfo. This will be needed to recover the call stack. Change-Id: I2fe10785eb1167939c8cce1862b2d7f4066e16ec
|
0ba218df92d2130295eccd2c564f8fdd2efc3a71 |
|
19-May-2015 |
Calin Juravle <calin@google.com> |
Remove unnecessary clinit checks Bug: 20852802 Change-Id: Ia6db8017ac22d45456845704a69ddffcc6917f4e
|
3cd4fc8bbb40a57d2ffde85f543c124f53237c1d |
|
14-May-2015 |
Calin Juravle <calin@google.com> |
Eliminate redundant constructor barriers when inlining. Bug: 20410297 Change-Id: I2097743d00eb795d050d390b1918e38c7f41d506
|
07276db28d654594e0e86e9e467cad393f752e6e |
|
18-May-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Don't do a null test in MarkGCCard if the value cannot be null. Change-Id: I45687f6d3505178e2fc3689eac9cb6ab1b2c1e29
|
8909bafa5d64e12eb53f3d37b984f53e7a632224 |
|
23-Apr-2015 |
Guillaume "Vermeille" Sanchez <guillaumesa@google.com> |
Mark CheckCast's and InstanceOf's input as !CanBeNull if used before in a NullCheck Change-Id: Ied0412a01922b40a3f5d89bed49707498582abc1
|
e82549b14c7def0a45461183964f7e6a34cbb70c |
|
06-May-2015 |
Mark Mendell <mark.p.mendell@intel.com> |
[optimizing] Fold HTypeConversion of constants While looking into optimizing long shifts on x86, I found that the compiler wasn't folding HTypeConversion of constants. Add simple conversions of constants, taking care of float/double values with NaNs and small/large values, ensuring Java conversion semantics. Add checker cases to see that constant folding of HTypeConversion is done. Ensure 422-type-conversion type conversion routines don't get inlined to avoid compile time folding. Change-Id: I5a4eb376b64bc4e41bf908af5875bed312efb228 Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>
|
e8ff50df01c89e1b5264a5a900cfebdde87a9b44 |
|
07-May-2015 |
David Brazdil <dbrazdil@google.com> |
ART: Rediscover loops after deleting blocks in DCE The way DCE currently updates loop information does not cover all cases. This patch removes the logic, resets loop information of live blocks to pre-SSA state and reanalyzes the affected loops. Change-Id: I0b996a70235b95a8db0de9a23a03f71db57a21b8 (cherry picked from commit a4b8c21dae70ae34aee13628632c39a675c06022)
|
a4b8c21dae70ae34aee13628632c39a675c06022 |
|
07-May-2015 |
David Brazdil <dbrazdil@google.com> |
ART: Rediscover loops after deleting blocks in DCE The way DCE currently updates loop information does not cover all cases. This patch removes the logic, resets loop information of live blocks to pre-SSA state and reanalyzes the affected loops. Change-Id: I0b996a70235b95a8db0de9a23a03f71db57a21b8
|
0a23d74dc2751440822960eab218be4cb8843647 |
|
07-May-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Add a parent environment to HEnvironment. This code has no functionality change. It adds a placeholder for chaining inlined frames. Change-Id: I5ec57335af76ee406052345b947aad98a6a4423a
|
3b55ebb22156e1f3496cd1ee4a05e03b4780e579 |
|
08-May-2015 |
Roland Levillain <rpl@google.com> |
Simplify floating-point comparisons with NaN in Optimizing. This change was suggested by Ian. Also, simplify some art::HFloatConstant and art::HDoubleConstant methods. Change-Id: I7908df23581a7f61c8ec79c290fe5f70798ac3be
|
8c0c91a845568624815df026cfdac8c42ecccdf6 |
|
07-May-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Use a growable array instead of an environment during SSA. Using an environment was convenient because it contains a growable array. But there's no need for the environment abstraction when being used as a temporary holder for values of locals. Change-Id: Idf2883fe4b8f97a31ee70b3627c1bdd23ebfff0e
|
db216f4d49ea1561a74261c29f1264952232728a |
|
05-May-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Relax the only one back-edge restriction. The rule is in the way for better register allocation, as it creates an artificial join point between multiple paths. Change-Id: Ia4392890f95bcea56d143138f28ddce6c572ad58
|
6db49a74e8402d3b6c66536ea7ec988144c05d24 |
|
28-Apr-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Update the remaining input index of phis after deleting an input. bug:20715803 bug:20690906 (cherry picked from commit 5d7b7f81ed5455893f984752c00571ef27cc97c5) Change-Id: Ie55739601b8d6fedc830d6e19d8a053392047d34
|
5d7b7f81ed5455893f984752c00571ef27cc97c5 |
|
28-Apr-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Update the remaining input index of phis after deleting an input. bug:20715803 bug:20690906 Change-Id: Iaf08f0c30d629e766be2b04815dc3e38b6e7ff35
|
2af2307f3903a75a379029c049b86f9903fc81a5 |
|
30-Apr-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Revert "GVN final fields even with side effects." This reverts commit 781733632637db98d79dfffad72bf063be3259be. Change-Id: Id7c4591f6b8190921852044b278d11627457c570
|
781733632637db98d79dfffad72bf063be3259be |
|
29-Apr-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
GVN final fields even with side effects. Two accesses of a final field can be GVN'ed even if there are side effects between them. Change-Id: I04495ae83c7858f4216b083ad1c29851954320ad
|
395086f0a9e0658a2d33eeade7121db55c1f5dc8 |
|
29-Apr-2015 |
David Brazdil <dbrazdil@google.com> |
ART: Fix loop information after dead code elimination Compilation failed when only some blocks of a loop were removed during dead code elimination. Bug: 20680703 (cherry picked from commit 69a2804c3bb48cf4fd00a66080f613a4fd96c422) Change-Id: If9988381236e4d8d8c3b508dfce1376b27c20d75
|
579026039080252878106118645ed70706f4838e |
|
21-Apr-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Add synthesize uses at back edge. This reduces the cost of linearizing the graph (hence removing the notion of back edge). Since linear scan allocates/spills registers based on next use, adding a use at a back edge ensures we do count for loop uses. Change-Id: Idaa882cb120edbdd08ca6bff142d326a8245bd14
|
69a2804c3bb48cf4fd00a66080f613a4fd96c422 |
|
29-Apr-2015 |
David Brazdil <dbrazdil@google.com> |
ART: Fix loop information after dead code elimination Compilation failed when only some blocks of a loop were removed during dead code elimination. Bug: 20680703 Change-Id: If31025169ca493f0d7f7f2788576e98d05f03394
|
3e3d73349a2de81d14e2279f60ffbd9ab3f3ac28 |
|
28-Apr-2015 |
Roland Levillain <rpl@google.com> |
Have HInvoke instructions know their number of actual arguments. Add an art::HInvoke::GetNumberOfArguments routine so that art::HInvoke and its subclasses can return the number of actual arguments of the called method. Use it in code generators and intrinsics handlers. Consequently, no longer remove a clinit check as last input of a static invoke if it is still present during baseline code generation, but ensure that static invokes have no such check as last input in optimized compilations. Change-Id: Iaf9e07d1057a3b15b83d9638538c02b70211e476
|
848f70a3d73833fc1bf3032a9ff6812e429661d9 |
|
15-Jan-2014 |
Jeff Hao <jeffhao@google.com> |
Replace String CharArray with internal uint16_t array. Summary of high level changes: - Adds compiler inliner support to identify string init methods - Adds compiler support (quick & optimizing) with new invoke code path that calls method off the thread pointer - Adds thread entrypoints for all string init methods - Adds map to verifier to log when receiver of string init has been copied to other registers. used by compiler and interpreter Change-Id: I797b992a8feb566f9ad73060011ab6f51eb7ce01
|
769c9e539da8ca80aa914cd12276aa5bd79148ee |
|
27-Apr-2015 |
David Brazdil <dbrazdil@google.com> |
ART: Simplify Ifs with BooleanNot condition If statements with negated condition can be simplified by removing the negation and swapping the true and false branches. Change-Id: I197afbc79fb7344d73b7b85d3611e7ca2519717f
|
2b1c622d5db941fe06b3ea9c1a5366358fa298c6 |
|
27-Apr-2015 |
David Brazdil <dbrazdil@google.com> |
ART: Fix removing a Phi with RemoveInstruction Boolean simplifier might attempt to remove a Phi from the Instruction list. (cherry picked from commit c7508e93fa3df3a3890f6b62550cbd5e35bdd8df) Change-Id: Ic8ad31967aa3e47c1fb1c67553d08681b6063a16
|
f213e05cef6d38166cfe0cce8f3b0a53225a1b39 |
|
27-Apr-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Add support for caching float and double constants. Change-Id: Ib5205bad1006bc5e3c9cc86bc82a6b4b1ce9bef9
|
c7508e93fa3df3a3890f6b62550cbd5e35bdd8df |
|
27-Apr-2015 |
David Brazdil <dbrazdil@google.com> |
ART: Fix removing a Phi with RemoveInstruction Boolean simplifier might attempt to remove a Phi from the Instruction list. Change-Id: I698cc616549bd88dac96395cb2e5d09b5433d157
|
2967ec6c3dad1c1dc15fc827188bd5ecfa75493b |
|
24-Apr-2015 |
Guillaume "Vermeille" Sanchez <guillaumesa@google.com> |
Add InsertInstructionAfter in HBasicBlock. Change-Id: I56e4e6edb39d1aab747877b7e517e94f0393f296
|
206d6fd6cae5ba8c4d5f0e230111fe77b9d5c0a5 |
|
14-Apr-2015 |
Mingyao Yang <mingyao@google.com> |
Deoptimization-based BCE for unknown loop bounds. For a loop like: for (int i = start; i < end; i++) { array[i] = 1; } We add the following to the loop pre-header: if (start < 0) deoptimize(); if (end > array.length) deoptimize(); Then we can eliminate the bounds check of array[i] inside the loop. We also take care of indexing with the induction variable plus some offset, like array[i - 1]/array[i + 1] inside the loop, and adjust the condition for deoptimization accordingly. Change-Id: I9e24c6b5e134ff95eff5b5605ff8f95d6546616f
|
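The pre-header checks described in the entry above can be modeled in plain C++. This is only an illustrative sketch, not ART code: `FillOnes` is a hypothetical name, and "deoptimizing" is modeled as falling back to a loop that checks every index and stops at the first out-of-range access (where the interpreter would throw).

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical model of the transformed loop. Returns true if the fast path
// (no per-iteration bounds check) ran, false if the pre-header checks failed
// and the checked fallback ran instead.
bool FillOnes(std::vector<int>& array, int start, int end) {
  // Java array lengths fit in an int; the sketch assumes the same here.
  assert(array.size() <=
         static_cast<std::size_t>(std::numeric_limits<int>::max()));
  const int length = static_cast<int>(array.size());

  // Pre-header checks, mirroring the two conditions in the commit message.
  if (start < 0 || end > length) {
    // Modeled "deoptimization": a fully checked loop stands in for the
    // interpreter and stops where an exception would be thrown.
    for (int i = start; i < end; ++i) {
      if (i < 0 || i >= length) break;
      array[static_cast<std::size_t>(i)] = 1;
    }
    return false;
  }

  // Fast path: every i in [start, end) is known to be in bounds.
  for (int i = start; i < end; ++i) {
    array[static_cast<std::size_t>(i)] = 1;
  }
  return true;
}
```

Calling `FillOnes(a, 1, 3)` on a four-element vector takes the fast path, while `FillOnes(b, 0, 5)` on a two-element vector takes the checked fallback.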
067cae2c86627d2edcf01b918ee601774bc76aeb |
|
26-Apr-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Revert "[optimizing] Replace FP divide by power of 2" Fails compiling docs. This reverts commit b0bd8915cb257cdaf46ba663c450a6543bca75af. Change-Id: I47d32525c83a73118e2163eb58c68bbb7a28bb38
|
1152c926076a760490085c4497c3f117fa8da891 |
|
24-Apr-2015 |
Mark Mendell <mark.p.mendell@intel.com> |
[optimizing] Rename HasArrayAccesses and check it Since the flag is only used to see if there is a HBoundsCheck, rename HasArrayAccesses() to HasBoundsChecks(). Add a check in graph_checker to see that the flag is set if we see a HBoundsCheck instruction. Change-Id: I10fe92897374fb247082152dd75c3611cc40ff30 Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>
|
0379f82393237798616d485ad99952e73e480e12 |
|
25-Apr-2015 |
Roland Levillain <rpl@google.com> |
Fix DCHECKs about clinit checks in Optimizing's code generators. These assertions are not true for the baseline compiler. As a temporary workaround, remove a clinit check as last input of a static invoke if it is still present at the stage of code generation. Change-Id: I5655f4a0873e2e7ee7790b6a341c18b4b7b52af1
|
4c0eb42259d790fddcd9978b66328dbb3ab65615 |
|
24-Apr-2015 |
Roland Levillain <rpl@google.com> |
Ensure inlined static calls perform clinit checks in Optimizing. Calls to static methods have implicit class initialization (clinit) checks of the method's declaring class in Optimizing. However, when such a static call is inlined, the implicit clinit check vanishes, possibly leading to incorrect behavior. To ensure that inlining static methods does not change the behavior of a program, add explicit class initialization checks (art::HClinitCheck) as well as load class instructions (art::HLoadClass) as last input of static calls (art::HInvokeStaticOrDirect) in Optimizing's control flow graphs, when the declaring class is reachable and not known to be already initialized. Then when considering the inlining of a static method call, proceed only if the method has no implicit clinit check requirement. The added explicit clinit checks are already removed by the art::PrepareForRegisterAllocation visitor. This CL also extends this visitor to turn explicit clinit checks from static invokes into implicit ones after the inlining step, by removing the added art::HLoadClass nodes mentioned above. Change-Id: I9ba452b8bd09ae1fdd9a3797ef556e3e7e19c651
|
2d7352ba5311b8f57427b91b7a891e61497373c1 |
|
20-Apr-2015 |
David Brazdil <dbrazdil@google.com> |
ART: Dead block removal Adds a new pass which finds all unreachable blocks, typically due to simplifying an if-condition to a constant, and removes them from the graph. The patch also slightly generalizes the graph-transforming operations. Change-Id: Iff7c97f1d10b52886f3cd7401689ebe1bfdbf456
|
af88835231c2508509eb19aa2d21b92879351962 |
|
20-Apr-2015 |
Guillaume "Vermeille" Sanchez <guillaumesa@google.com> |
Remove unnecessary null checks in CheckCast and InstanceOf Change-Id: I6fd81cabd8673be360f369e6318df0de8b18b634
|
edad8add1f1216850cb3f179ba6f57b0d885b016 |
|
23-Apr-2015 |
Calin Juravle <calin@google.com> |
Remove ActAsNullConstant We now properly type null constants during ssa builder so this is not needed anymore. Bug: 20322006 Change-Id: Ic060a52d4fa2d4f00755dd6427f822d368392d7b
|
2cebb24bfc3247d3e9be138a3350106737455918 |
|
22-Apr-2015 |
Mathieu Chartier <mathieuc@google.com> |
Replace NULL with nullptr Also fixed some lines that were too long, and a few other minor details. Change-Id: I6efba5fb6e03eb5d0a300fddb2a75bf8e2f175cb
|
c3d743fa2a26effcb35627d8a1338029c86e582a |
|
22-Apr-2015 |
David Brazdil <dbrazdil@google.com> |
ART: Update last_instruction when adding Phis HBasicBlock::InsertPhiAfter would not update the last_instruction pointer when adding at the end of the list. This could cause problems when iterating over phis backwards. Fortunately, we don't do that anywhere in the existing code. Change-Id: I4487265bf2cf3d3819623fafd7ce7c359bac190e
|
641547a5f18ca2ea54469cceadcfef64f132e5e0 |
|
21-Apr-2015 |
Calin Juravle <calin@google.com> |
[optimizing] Fix a bug in moving the null check to the user. When taking the decision to move a null check to the user we did not verify if the next instruction checks the same object. Change-Id: I2f4533a4bb18aa4b0b6d5e419f37dcccd60354d2
|
1ba1981ee9d28f87f594b157566d09e973fa5bce |
|
21-Apr-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Linear scan: Use FirstUse instead of FirstRegisterUse. This is in preparation for introducing synthesized uses at back edges. Change-Id: Ie28d6725d2dde982cf2137f2110daabcbab9f789
|
b0bd8915cb257cdaf46ba663c450a6543bca75af |
|
16-Apr-2015 |
Mark Mendell <mark.p.mendell@intel.com> |
[optimizing] Replace FP divide by power of 2 Replace a floating point division by a power of two with a multiplication by the reciprocal. This is guaranteed to have the exact same result as the reciprocal is exactly representable. Add routines to allow generation of float and double constants after the SSA Builder. I was unsure if float and double caches should be implemented, under the assumption that there is probably not a lot of repetition of FP values. Please let me know. Change-Id: I3a6c3847b49b4e747a7e7e8843ca32bb174b1584 Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>
|
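The condition behind the entry above (a divisor that is a power of two with an exactly representable reciprocal) can be checked as in the following sketch. `ReciprocalIfPowerOfTwo` is a hypothetical helper written for illustration, not the routine added by the commit, and it assumes IEEE-754 `float` arithmetic.

```cpp
#include <cassert>
#include <cmath>

// If `divisor` is a finite, non-zero power of two whose reciprocal is also
// finite, stores that reciprocal and returns true. In that case x / divisor
// and x * reciprocal round the same real value, so they agree.
bool ReciprocalIfPowerOfTwo(float divisor, float* reciprocal) {
  assert(reciprocal != nullptr);
  if (!std::isfinite(divisor) || divisor == 0.0f) return false;

  int exponent = 0;
  // frexp returns a mantissa in [0.5, 1); exactly 0.5 means a power of two.
  const float mantissa = std::frexp(std::fabs(divisor), &exponent);
  if (mantissa != 0.5f) return false;

  const float candidate = 1.0f / divisor;
  // Very small subnormal divisors have a reciprocal that overflows.
  if (!std::isfinite(candidate)) return false;

  *reciprocal = candidate;
  return true;
}
```

For a divisor of `4.0f` the helper yields `0.25f`, so `x / 4.0f` could be rewritten as `x * 0.25f`; a divisor of `3.0f` is rejected.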
27df758e2e7baebb6e3f393f9732fd0d064420c8 |
|
17-Apr-2015 |
Calin Juravle <calin@google.com> |
[optimizing] Add memory barriers in constructors when needed If a class has final fields we must add a memory barrier before returning from constructor. This makes sure the fields are visible to other threads. Bug: 19851497 Change-Id: If8c485092fc512efb9636cd568cb0543fb27688e
|
ad4450e5c3ffaa9566216cc6fafbf5c11186c467 |
|
17-Apr-2015 |
Zheng Xu <zheng.xu@arm.com> |
Opt compiler: Implement parallel move resolver without using swap. The algorithm of ParallelMoveResolverNoSwap() is almost the same as ParallelMoveResolverWithSwap(), except for the way we resolve the circular dependency. NoSwap() uses an additional scratch register to resolve the circular dependency. For example, (0->1) (1->2) (2->0) will be performed as (2->scratch) (1->2) (0->1) (scratch->0). On architectures without swap register support, NoSwap() can reduce the number of moves from 3x(N-1) to (N+1) when there is a circular dependency with N moves. Also, the NoSwap() algorithm does not depend on architecture register layout information, which means it can support register pairs on arm32 and X/W, D/S registers on arm64 without additional modification. Change-Id: Idf56bd5469bb78c0e339e43ab16387428a082318
|
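The scratch-register example in the entry above can be reproduced with a small sketch. The names `Move`, `kScratch`, `ResolveCycleWithScratch` and `Execute` are hypothetical and chosen for illustration; the real resolver also handles non-cyclic moves and typed operands, which are omitted here.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

constexpr int kScratch = -1;  // Hypothetical marker for the scratch register.

struct Move {
  int src;
  int dst;
};

// Sequentializes one dependency cycle of parallel moves using a scratch
// location instead of swaps. `cycle` must be ordered so that each move's
// destination is the next move's source and the last destination is the
// first source, e.g. (0->1) (1->2) (2->0).
std::vector<Move> ResolveCycleWithScratch(const std::vector<Move>& cycle) {
  assert(!cycle.empty());
  std::vector<Move> out;
  const Move& last = cycle.back();
  out.push_back({last.src, kScratch});  // Save the value about to be clobbered.
  for (std::size_t i = cycle.size() - 1; i-- > 0;) {
    out.push_back(cycle[i]);            // Remaining moves, back to front.
  }
  out.push_back({kScratch, last.dst});  // Close the cycle from the scratch.
  return out;
}

// Runs a sequential move list on a register file, to check the result.
void Execute(const std::vector<Move>& moves, std::vector<int>& regs) {
  int scratch = 0;
  for (const Move& m : moves) {
    const int value =
        (m.src == kScratch) ? scratch : regs[static_cast<std::size_t>(m.src)];
    if (m.dst == kScratch) {
      scratch = value;
    } else {
      regs[static_cast<std::size_t>(m.dst)] = value;
    }
  }
}
```

A cycle of N moves produces N+1 sequential moves, matching the count given in the commit message.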
a4f8831d6533e4fe5aed18433099e1130d95a877 |
|
16-Apr-2015 |
Calin Juravle <calin@google.com> |
Remove duplicate phis created during SSA transformation When creating equivalent phis we copy the inputs of the original phi which may be improperly typed. This will be fixed during the type propagation but as a result we may have two equivalent phis with the same type for the same dex register. This is correct but generates more code and prevents some optimizations. This CL adds another step in the SSA builder to remove the extra Phi nodes created due to equality operators. The graph checker verifies that for a given dex register no two phis have the same type. Also, replace zero int constant with null constant when we compare a reference against null. Change-Id: Id37cc11a016ea767c7e351575e003d822a9d2e60
|
f776b92a0d52bb522043812dacb9c21ac11858e2 |
|
15-Apr-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Remove dead blocks for the blocks_ array. This prevents crashing because of structurally incorrect blocks. Also we now don't need to remove its instructions. Test case courtesy of Serguei I Katkov. Change-Id: Ia3ef9580549fc3546e8cd5f346079b1f0ceb2a61
|
0d9f17de8f21a10702de1510b73e89d07b3b9bbf |
|
15-Apr-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Move the linear order to the HGraph. Bug found by Zheng Xu: SsaLivenessAnalysis being a stack allocated object, we should not refer to it in later phases of the compiler. Specifically, the code generator was using the linear order, which was stored in the liveness analysis object. Change-Id: I574641f522b7b86fc43f3914166108efc72edb3b
|
9021825d1e73998b99c81e89c73796f6f2845471 |
|
15-Apr-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Type MoveOperands. The ParallelMoveResolver implementation needs to know if a move is for 64bits or not, to handle swaps correctly. Bug found, and test case courtesy of Serguei I. Katkov. Change-Id: I9a0917a1cfed398c07e57ad6251aea8c9b0b8506
|
66d126ea06ce3f507d86ca5f0d1f752170ac9be1 |
|
03-Apr-2015 |
David Brazdil <dbrazdil@google.com> |
ART: Implement HBooleanNot instruction Optimizations simplifying operations on boolean values (boolean simplifier, instruction simplifier) can benefit from having a special HInstruction for negating booleans in order to perform more transforms and produce faster machine code. This patch implements HBooleanNot as 'x xor 1', assuming that booleans are 1-bit integers and allowing for a single-instruction negation on all supported platforms. Change-Id: I33a2649c1821255b18a86ca68ed16416063c739f
|
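The 'x xor 1' encoding mentioned in the entry above is easy to see in isolation. The function below is an illustrative sketch with a hypothetical name, assuming (as the commit message does) that a boolean is always stored as the integer 0 or 1.

```cpp
#include <cassert>
#include <cstdint>

// Negates a boolean stored as an integer that is known to be 0 or 1.
inline std::int32_t BooleanNot(std::int32_t value) {
  assert(value == 0 || value == 1);  // The trick is only valid for 0/1 inputs.
  return value ^ 1;                  // Flips the lowest bit: 0 -> 1, 1 -> 0.
}
```

Because the operation is a single XOR with a constant, it maps to one instruction on common targets, which is the property the commit message relies on.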
188d4316a880ae24aed315aa52dc503c4fcb1ec7 |
|
09-Apr-2015 |
Alexandre Rames <alexandre.rames@arm.com> |
Opt compiler: Instruction simplification for HAdd, HNeg, HNot, HSub. Under assumptions for the 'cost' of each IR (eg. neither HAdd nor HSub are faster than the other), transformations are only applied if they (locally) cannot degrade the quality of the graph. The code could be extended to look at uses of the IRs and detect more opportunities for optimisations. The optimisations in this patch do not look at other uses for their inputs. Change-Id: Ib60dab007af30f43421ef5bb55db2ec32fb8fc0c
|
3dcd58cd54a922b864494fb7fff4a7f7a8562db9 |
|
03-Apr-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Fix a bug when creating a HDeoptimization instruction. We need to copy the environment, instead of just pointing to an existing one. Otherwise, if the instruction that initially holds the environment gets removed from the graph, any update to an instruction in that environment will not be reflected in it. bug:20058506 Change-Id: I2a62476d0851ecbc3707c0da395d8502ee437422
|
0c365e674545159fd77b998081207f0685a605e5 |
|
01-Apr-2015 |
Mingyao Yang <mingyao@google.com> |
CanThrow() of HNewArray should return true. Change-Id: I9950f1c391dfeb26cf59cee769705d01d8e283d5
|
d43b3ac88cd46b8815890188c9c2b9a3f1564648 |
|
01-Apr-2015 |
Mingyao Yang <mingyao@google.com> |
Revert "Revert "Deoptimization-based bce."" This reverts commit 0ba627337274ccfb8c9cb9bf23fffb1e1b9d1430. Change-Id: I1ca10d15bbb49897a0cf541ab160431ec180a006
|
a0466e1773ec1db32c4b3d04b0416ffef5005b39 |
|
27-Mar-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
GVN HClinitCheck nodes. Change-Id: I5c79caadd57d10214a44149fda53e9e185ac7eca
|
8d5b8b295930aaa43255c4f0b74ece3ee8b43a47 |
|
24-Mar-2015 |
David Brazdil <dbrazdil@google.com> |
ART: Force constants into the entry block Optimizations such as GVN and BCE make the assumption that all constants are located in the entry block of the CFG, but not all passes adhere to this rule. This patch makes constructors of constants private and only accessible to friend classes - HGraph for int/long constants and SsaBuilder for float/double - which ensure that they are placed correctly and not duplicated. Note that the ArenaAllocatorAdapter was modified to not increment the ArenaAllocator's internal reference counter in order to allow for use of ArenaSafeMap inside an arena-allocated objects. Because their destructor is not called, the counter does not get decremented. Change-Id: I36a4fa29ae34fb905cdefd482ccbf386cff14166
|
790412959a6413a585f45fc5f77fe7106311a00c |
|
26-Mar-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Use the original invoke type when inlining. When resolving a method through the compiler driver, the code makes sure the call in the DEX bytecode matches the kind of method found, to check for IncompatibleClassChangeError. Because when we sharpen an invoke virtual, we transform the invoke kind to direct, we must not use the new kind, but the one in DEX. Change-Id: Iaf77b27b529c659ea48ffb19f46427552c9e3654
|
9437b78780f9e6ffa5797ebe82de8e8d7f3a5ed6 |
|
25-Mar-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Revert "Revert "Inline across dex files."" This reverts commit 6a816cf624ba56bf2872916d7b65b18fd9a411ef. Change-Id: I36cb524108786dd7996f2aea0443675be1f1b859
|
b2bd1c5f9171f35fa5b71ada42d1a9e11189428d |
|
25-Mar-2015 |
David Brazdil <dbrazdil@google.com> |
ART: Formatting and comments in BooleanSimplifier Change-Id: I9a5aa3f2aa8b0a29d7b0f1e5e247397cf8e9e379
|
10f56cb6b4e39ed0032e9a23b179b557463e65ad |
|
24-Mar-2015 |
David Brazdil <dbrazdil@google.com> |
ART: Fix crash in gtests SsaLivenessAnalysis was crashing after change of iteration order in 142377 because gtests do not always build reverse post order. Change-Id: If5ad5b7c52040b119c4415f0b942988049fa3c16
|
46e2a3915aa68c77426b71e95b9f3658250646b7 |
|
16-Mar-2015 |
David Brazdil <dbrazdil@google.com> |
ART: Boolean simplifier The optimization recognizes the negation pattern generated by 'javac' and replaces it with a single condition. To this end, boolean values are now consistently assumed to be represented by an integer. This is a first optimization which deletes blocks from the HGraph and does so by replacing the corresponding entries with null. Hence, existing code can continue indexing the list of blocks with the block ID, but must check for null when iterating over the list. Change-Id: I7779da69cfa925c6521938ad0bcc11bc52335583
|
6a816cf624ba56bf2872916d7b65b18fd9a411ef |
|
24-Mar-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Revert "Inline across dex files." bug: 19904089 bug: 19903495 This reverts commit 7e4c3508e4f5512650b63c41f7872a749e99aee9. Change-Id: I15df746b5f1882cce78eedde6c05c0d3b69bfa4a
|
da4d79bc9a4aeb9da7c6259ce4c9c1c3bf545eb8 |
|
24-Mar-2015 |
Roland Levillain <rpl@google.com> |
Unify ART's various implementations of bit_cast. ART had several implementations of art::bit_cast: 1. one in runtime/base/casts.h, declared as: template <class Dest, class Source> inline Dest bit_cast(const Source& source); 2. another one in runtime/utils.h, declared as: template<typename U, typename V> static inline V bit_cast(U in); 3. and a third local version, in runtime/memory_region.h, similar to the previous one: template<typename Source, typename Destination> static Destination MemoryRegion::local_bit_cast(Source in); This CL removes versions 2. and 3. and changes their callers to use 1. instead. That version was chosen over the others as: - it was the oldest one in the code base; and - its syntax was closer to the standard C++ cast operators, as it supports the following use: bit_cast<Destination>(source) since `Source' can be deduced from `source'. Change-Id: I7334fd5d55bf0b8a0c52cb33cfbae6894ff83633
|
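The retained signature, `template <class Dest, class Source> inline Dest bit_cast(const Source& source)`, could be implemented along the following lines. This is a sketch in a hypothetical `sketch` namespace using `std::memcpy`; it is not claimed to be the body found in runtime/base/casts.h, and the example value assumes a 32-bit IEEE-754 `float`.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <type_traits>

namespace sketch {

// Reinterprets the bytes of `source` as a value of type Dest.
template <class Dest, class Source>
inline Dest bit_cast(const Source& source) {
  static_assert(sizeof(Dest) == sizeof(Source), "sizes must match");
  static_assert(std::is_trivially_copyable<Dest>::value,
                "Dest must be trivially copyable");
  static_assert(std::is_trivially_copyable<Source>::value,
                "Source must be trivially copyable");
  Dest dest;
  std::memcpy(&dest, &source, sizeof(dest));
  return dest;
}

}  // namespace sketch

// Usage: `Source` is deduced from the argument, so only `Dest` is spelled
// out, as with the standard C++ cast operators.
inline void BitCastExample() {
  assert(sketch::bit_cast<std::uint32_t>(1.0f) == 0x3f800000u);
}
```

Putting `Dest` first in the template parameter list is what allows the `bit_cast<Destination>(source)` call syntax that the commit message cites as a reason for keeping this version.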
0ba627337274ccfb8c9cb9bf23fffb1e1b9d1430 |
|
24-Mar-2015 |
Andreas Gampe <agampe@google.com> |
Revert "Deoptimization-based bce." This breaks compiling the core image: Error after BCE: art::SSAChecker: Instruction 219 in block 1 does not dominate use 221 in block 1. This reverts commit e295e6ec5beaea31be5d7d3c996cd8cfa2053129. Change-Id: Ieeb48797d451836ed506ccb940872f1443942e4e
|
e295e6ec5beaea31be5d7d3c996cd8cfa2053129 |
|
07-Mar-2015 |
Mingyao Yang <mingyao@google.com> |
Deoptimization-based bce. A mechanism is introduced by which a runtime method can be called from code compiled with the optimizing compiler to deoptimize into the interpreter. This can be used to establish invariants in the managed code. If the invariant does not hold at runtime, we will deoptimize and continue execution in the interpreter. This allows optimizing the managed code as if the invariant was proven during compile time. However, the exception will be thrown according to the semantics demanded by the spec. The invariant and optimization included in this patch are based on the length of an array. Given a set of array accesses with constant indices {c1, ..., cn}, we can optimize away all bounds checks iff 0 <= min(ci) and max(ci) < array-length. The first can be proven statically. The second can be established with a deoptimization-based invariant. This replaces n bounds checks with one invariant check (plus slow-path code). Change-Id: I8c6e34b56c85d25b91074832d13dba1db0a81569
|
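The invariant stated in the entry above, 0 <= min(ci) and max(ci) < array-length, can be written as one combined check. The sketch below uses a hypothetical function name and evaluates both halves at run time for simplicity; in the scheme the commit describes, the lower bound is proven at compile time and only the upper bound becomes a deoptimization guard.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Returns true if every access array[c] with c in `indices` is in bounds,
// using only the smallest and largest constant index.
bool AllConstantAccessesInBounds(const std::vector<int>& indices,
                                 int array_length) {
  assert(!indices.empty());
  const auto min_max = std::minmax_element(indices.begin(), indices.end());
  const bool lower_ok = *min_max.first >= 0;             // provable statically
  const bool upper_ok = *min_max.second < array_length;  // the single guard
  return lower_ok && upper_ok;
}
```

With indices {0, 3, 1} and an array of length 4 the guard holds, so the three per-access bounds checks could be replaced by this one test.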
7e4c3508e4f5512650b63c41f7872a749e99aee9 |
|
18-Mar-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Inline across dex files. Change-Id: I5c2c44f5130b50f0bad21a6877a3935dc60b4a85
|
915b9d0c13bb5091875d868fbfa551d7b65d7477 |
|
11-Mar-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Tweak liveness when instructions are used in environments. Instructions remain live when debuggable, but only instructions with object types remain live when non-debuggable. Enable StackVisitor::GetThisObject for optimizing. Change-Id: Id87b2cbf33a02450059acc9993995782e5f28987
|
d335083828e2838bd360303be768e600275cedf5 |
|
12-Mar-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Run the natural loop recognizer before building SSA. SSA building checks the consistency of the graph when dealing with dead phis. Fixes continuous AOSP builds with optimizing. Change-Id: Ia9a0f0adc24a8e144e54444e090ad828b9b40040
|
b2fd7bca70b580921eebf7c45769c39d2dfd8a5a |
|
11-Mar-2015 |
Alexandre Rames <alexandre.rames@arm.com> |
Opt compiler: Basic simplification for arithmetic operations. The optimisations in this patch do not look further than the inputs of each operation. Change-Id: Iddd0ab6b360b9e7bb042db22086d51a31be85530
|
234d69d075d1608f80adb647f7935077b62b6376 |
|
09-Mar-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Revert "Revert "[optimizing] Enable x86 long support."" This reverts commit 154552e666347d41d95d7619c6ee56249ff4feca. Change-Id: Idc726551c249a888b7ff5fde8508ae50e81b2e13
|
e0fe7ae36180863e45cbb9d1e6e9c30b1b1a949c |
|
09-Mar-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Make the SSA builder honor the debuggable flag. This requires properly typing phis that are only used by environments, and discarding phis with incompatible types. The code generators do not handle these conflicting types. In the process, ensure a phi has a type that does not depend on the order of the inputs (for example (char, short) -> short), and set int for int-like types. We can refine this later. Change-Id: I60ab601d6d00b1cbf18623ee4ff1795aa28f84a1
|
154552e666347d41d95d7619c6ee56249ff4feca |
|
06-Mar-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Revert "[optimizing] Enable x86 long support." Few libcore failures. This reverts commit b4ba354cf8d22b261205494875cc014f18587b50. Change-Id: I4a28d853e730dff9b69aec9555505803cf2fcd63
|
2ed20afc6a1032e9e0cf919cb8d1b2b41e147182 |
|
06-Mar-2015 |
Alexandre Rames <alexandre.rames@arm.com> |
Opt compiler: Clean the use of `virtual` and `OVERRIDE`. Change-Id: I806ec522b979334cee8f344fc95e8660c019160a
|
b4ba354cf8d22b261205494875cc014f18587b50 |
|
05-Mar-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
[optimizing] Enable x86 long support. Change-Id: I9006972a65a1f191c45691104a960366747f9d16
|
e4335eb5bcbca6927e51c10cf0de3516d94ef599 |
|
03-Mar-2015 |
Mingyao Yang <mingyao@google.com> |
Make BCE a no-op if there is no array access. Change-Id: I8456182808c1dbaa0c0ae1b8c2e94bb17baf5f29
|
dc5ac731f6369b53b42f1cee3404f3b3384cec34 |
|
25-Feb-2015 |
Mingyao Yang <mingyao@google.com> |
Opt compiler: enhance gvn for commutative ops. Change-Id: I415b50d58b30cab4ec38077be22373eb9598ec40
|
61d544bfb812d79f5c9ddad171198836cea719db |
|
23-Feb-2015 |
Calin Juravle <calin@google.com> |
[optimizing] Add if-context sensitivity for null propagation. Change-Id: I3725b6c6a6cf44440c34a1bfb67e623531e665d6
|
1abb4191a2e56d8dbf518efcaeefb266c1acdf2b |
|
17-Feb-2015 |
David Brazdil <dbrazdil@google.com> |
Optimizing: Speed up HInstruction use removal Similarly to a previous commit on HEnvironment use removal, this patch adds links from instructions to their respective inputs' use lists for constant-time removal at the cost of doubling the size of input lists (from one pointer per entry to two). Manual testing shows that this significantly reduces the time required to transform HGraph to SSA form for some huge methods. Change-Id: I8dc3e4b0c48a50ac1481eb55c31093b99f4dc29f
|
b1498f67b444c897fa8f1530777ef118e05aa631 |
|
16-Feb-2015 |
Calin Juravle <calin@google.com> |
Improve type propagation with if-contexts This works by adding a new instruction (HBoundType) after each `if (a instanceof ClassA) {}` to bound the type that `a` can take in the True-dominated blocks. Change-Id: Iae6a150b353486d4509b0d9b092164675732b90c
|
b666f4805c8ae707ea6fd7f6c7f375e0b000dba8 |
|
18-Feb-2015 |
Mathieu Chartier <mathieuc@google.com> |
Move arenas into runtime Moved arena pool into the runtime. Motivation: Allow GC to use arena allocators, recycle arena pool for linear alloc. Bug: 19264997 Change-Id: I8ddbb6d55ee923a980b28fb656c758c5d7697c2f
|
acf735c13998ad2a175f5a17e7bfce220073279d |
|
12-Feb-2015 |
Calin Juravle <calin@google.com> |
Reference type propagation - propagate reference types between instructions - remove checked casts when possible - add StackHandleScopeCollection to manage an arbitrary number of stack handles (see comments) Change-Id: I31200067c5e7375a5ea8e2f873c4374ebdb5ee60
|
d6138ef1ea13d07ae555542f8898b30d89e9ac9a |
|
18-Feb-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Ensure the graph is correctly typed. We used to be forgiving because of HIntConstant(0) also being used for null. We now create a special HNullConstant for such uses. Also, we need to run the dead phi elimination twice during ssa building to ensure the correctness. Change-Id: If479efa3680d3358800aebb1cca692fa2d94f6e5
|
f7a0c4e421b5edaad5b7a15bfff687da28d0b287 |
|
10-Feb-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Improve ParallelMoveResolver to work with pairs. Change-Id: Ie2a540ffdb78f7f15d69c16a08ca2d3e794f65b9
|
c0572a451944f78397619dec34a38c36c11e9d2a |
|
06-Feb-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Optimize leaf methods. Avoid suspend checks and stack changes when not needed. Change-Id: I0fdb31e8c631e99091b818874a558c9aa04b1628
|
276d9daaedfbff716339f94d55e6eff98b7434c6 |
|
02-Feb-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Inline methods with multiple blocks. Change-Id: I3431af60e97fae230e0b6e98bcf0acc0ee9abf8c
|
cb1b00aedd94785e7599f18065a0b97b314e64f6 |
|
28-Jan-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Use the non access check entrypoint when possible. Change-Id: I0b53d63141395e26816d5d2ce3fa6a297bb39b54
|
82091dad38f3e5bfaf3b6984c9ab73069fb68310 |
|
26-Jan-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Implement LICM in optimizing compiler. Change-Id: I9c8afb0a58ef45e568576015473cbfd5f011c242
|
10e244f9e7f6d96a95c910a2bedef5bd3810c637 |
|
26-Jan-2015 |
Calin Juravle <calin@google.com> |
optimizing: NullCheck elimination How it works: - run a type analysis to propagate null information on instructions - during the last instruction simplifier remove null checks for which the input is known to be not null The current type analysis is actually a nullability analysis but it will be reused in follow up CLs to propagate type information: so it keeps the more convenient name. Change-Id: I54bb1d32ab24604b4d677d1ecdaf8d60a5ff5ce9
|
1cf95287364948689f6a1a320567acd7728e94a3 |
|
12-Dec-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Small optimization for recursive calls: avoid dex cache. Change-Id: I044757a2f06e535cdc1480c4fc8182b89635baf6
|
ea55b934cff1280318f5514039549799227cfa3d |
|
27-Jan-2015 |
David Brazdil <dbrazdil@google.com> |
ART: Further refactor use lists Change-Id: I9e3219575a508ca5141d851bfcaf848302480c32
|
ed59619b370ef23ffbb25d1d01f615e60a9262b6 |
|
23-Jan-2015 |
David Brazdil <dbrazdil@google.com> |
Optimizing: Speed up HEnvironment use removal Removal of use records from HEnvironment vregs involved iterating over potentially large linked lists which made compilation of huge methods very slow. This patch turns use lists into doubly-linked lists, stores pointers to the relevant nodes inside HEnvironment and subsequently turns the removals into constant-time operations. Change-Id: I0e1d4d782fd624e7b8075af75d4adf0a0634a1ee
|
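The constant-time removal described above can be sketched as follows. This is a minimal illustration, not ART's use-list implementation: `UseNode` and `UseList` are hypothetical names, and the only idea taken from the commit message is that the user keeps a pointer to its node in a doubly-linked list, so unlinking needs no traversal.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical node type; `user_id` stands in for the user (instruction or environment).
struct UseNode {
  explicit UseNode(int id) : user_id(id) {}
  int user_id;
  UseNode* prev = nullptr;
  UseNode* next = nullptr;
};

class UseList {
 public:
  // Pushes `node` at the front and returns it so the user can store the pointer.
  UseNode* AddUse(UseNode* node) {
    node->prev = nullptr;
    node->next = head_;
    if (head_ != nullptr) head_->prev = node;
    head_ = node;
    ++size_;
    return node;
  }

  // Unlinks `node` in O(1): only its two neighbours are touched.
  void RemoveUse(UseNode* node) {
    assert(size_ > 0);
    if (node->prev != nullptr) {
      node->prev->next = node->next;
    } else {
      head_ = node->next;
    }
    if (node->next != nullptr) node->next->prev = node->prev;
    node->prev = nullptr;
    node->next = nullptr;
    --size_;
  }

  std::size_t Size() const { return size_; }
  const UseNode* Head() const { return head_; }

 private:
  UseNode* head_ = nullptr;
  std::size_t size_ = 0;
};
```

The trade-off is the one the later HInstruction commit also mentions: each entry carries an extra pointer in exchange for removal that does not depend on list length.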
6c2dff8ff8e1440fa4d9e1b2ba2a44d036882801 |
|
21-Jan-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Revert "Revert "Fully support pairs in the register allocator."" This reverts commit c399fdc442db82dfda66e6c25518872ab0f1d24f. Change-Id: I19f8215c4b98f2f0827e04bf7806c3ca439794e5
|
77520bca97ec44e3758510cebd0f20e3bb4584ea |
|
12-Jan-2015 |
Calin Juravle <calin@google.com> |
Record implicit null checks at the actual invoke time. ImplicitNullChecks are recorded only for instructions directly (see NB below) preceded by NullChecks in the graph. This way we avoid recording redundant safepoints and minimize the code size increase. NB: ParallelMoves might be inserted by the register allocator between the NullChecks and their uses. These modify the environment and the correct action would be to reverse their modification. This will be addressed in a follow-up CL. Change-Id: Ie50006e5a4bd22932dcf11348f5a655d253cd898
|
c399fdc442db82dfda66e6c25518872ab0f1d24f |
|
21-Jan-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Revert "Fully support pairs in the register allocator." Libcore tests fail. This reverts commit 41aedbb684ccef76ff8373f39aba606ce4cb3194. Change-Id: I2572f120d4bbaeb7a4d4cbfd47ab00c9ea39ac6c
|
41aedbb684ccef76ff8373f39aba606ce4cb3194 |
|
14-Jan-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Fully support pairs in the register allocator. Enabled on ARM for longs and doubles. Change-Id: Id8792d08bd7ca9fb049c5db8a40ae694bafc2d8b
|
42d1f5f006c8bdbcbf855c53036cd50f9c69753e |
|
16-Jan-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Do not use register pair in a parallel move. The ParallelMoveResolver does not work with pairs. Instead, decompose the pair into two individual moves. Change-Id: Ie9d3f0b078cef8dc20640c98b20bb20cc4971a7f
|
dd8f887e81b894bc8075d8bacdb223747b6a8018 |
|
15-Jan-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Fix a bug in the register allocator. When allocating a register blocked by existing intervals, we need to split inactive intervals at the end of their lifetime hole, and not at the next intersection. Otherwise, the allocation for following intervals will not see that a register is being used by the split interval. Change-Id: I40cc79dde541c07392a7cf4c6f0b291dd1ce1819
|
71fb52fee246b7d511f520febbd73dc7a9bbca79 |
|
30-Dec-2014 |
Andreas Gampe <agampe@google.com> |
ART: Optimizing compiler intrinsics Add intrinsics infrastructure to the optimizing compiler. Add almost all intrinsics supported by Quick to the x86-64 backend. Further intrinsics require more assembler support. Change-Id: I48de9b44c82886bb298d16e74e12a9506b8e8807
|
840e5461a85f8908f51e7f6cd562a9129ff0e7ce |
|
07-Jan-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Implement double and float support for arm in register allocator. The basic approach is: - An instruction that needs two registers gets two intervals. - When allocating the low part, we also allocate the high part. - When splitting a low (or high) interval, we also split the high (or low) equivalent. - Allocation follows the (S/D register) requirement that low registers are always even and the high equivalent is low + 1. Change-Id: I06a5148e05a2ffc7e7555d08e871ed007b4c2797
|
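The even/odd pairing rule above can be sketched as a search over a bitmask of blocked registers. The bitmask representation and the name `FindFreePair` are assumptions made for illustration; the real allocator works on live intervals rather than a single mask.

```cpp
#include <cassert>
#include <cstdint>

// Returns the low register of the first free (even, even + 1) pair, or -1 if none.
// Bit i of `blocked` is set when register i is unavailable (hypothetical encoding).
int FindFreePair(uint32_t blocked, int num_registers) {
  assert(num_registers >= 0 && num_registers <= 32);
  for (int low = 0; low + 1 < num_registers; low += 2) {
    uint32_t pair_mask = (uint32_t{1} << low) | (uint32_t{1} << (low + 1));
    // Both halves must be free; a pair with one blocked half is rejected.
    if ((blocked & pair_mask) == 0) return low;
  }
  return -1;
}
```

Stepping `low` by two enforces the requirement that the low half is even, and checking both bits avoids handing out a pair whose other half is already taken.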
10c9cbe05ab860cb7d5ce82c411698a10f811aa6 |
|
19-Dec-2014 |
Calin Juravle <calin@google.com> |
Fixed CanBeMoved for field access Change-Id: I36a1f4a468f3701e0608d71f64d64049c54aec18
|
52c489645b6e9ae33623f1ec24143cde5444906e |
|
16-Dec-2014 |
Calin Juravle <calin@google.com> |
[optimizing compiler] Add support for volatile - for backends: arm, x86, x86_64 - added necessary instructions to assemblies - clean up code gen for field set/get - fixed InstructionDataEquals for some instructions - fixed comments in compiler_enums * 003-opcode test verifies basic volatile functionality Change-Id: I144393efa312dfb2c332cb84056b00edffee338a
|
4e44c829e282b3979a73bfcba92510e64fbec209 |
|
17-Dec-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Revert "Small optimization for recursive calls: avoid dex cache." Fails on target. This reverts commit 390f59f9bec64fd81b05e796dfaeb03ab6d4cc81. Change-Id: Ic3865b8897068ba20df0fbc2bcf561faf6c290c1
|
390f59f9bec64fd81b05e796dfaeb03ab6d4cc81 |
|
12-Dec-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Small optimization for recursive calls: avoid dex cache. Change-Id: Ic4054b6c38f0a2a530ba6ef747647f86cee0b1b8
|
53d9da8507a1b68f036ce8669ad3f2ae9fc3d225 |
|
04-Dec-2014 |
Jean Christophe Beyler <jean.christophe.beyler@intel.com> |
ART: Create a RemoveBlock method The RemoveDeadBlocks should be separated into a utility function to remove a single block so that it can be used as a future utility method. Change-Id: I4c67113fff24e92a66a81bc0e8edf9fbdda08cdf Signed-off-by: Jean Christophe Beyler <jean.christophe.beyler@intel.com>
|
e53798a7e3267305f696bf658e418c92e63e0834 |
|
01-Dec-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Inlining support in optimizing. Currently only inlines simple things that don't require an environment, such as: - Returning a constant. - Returning a parameter. - Returning an arithmetic operation. Change-Id: Ie844950cb44f69e104774a3cf7a8dea66bc85661
|
624279f3c70f9904cbaf428078981b05d3b324c0 |
|
04-Dec-2014 |
Roland Levillain <rpl@google.com> |
Add support for float-to-long in the optimizing compiler. - Add support for the float-to-long Dex instruction in the optimizing compiler. - Add a Dex PC field to art::HTypeConversion to allow the x86 and ARM code generators to produce runtime calls. - Instruct art::CodeGenerator::RecordPcInfo not to record PC information for HTypeConversion instructions. - Add S0 to the list of ARM FPU parameter registers. - Have art::x86_64::X86_64Assembler::cvttss2si work with 64-bit operands. - Generate x86, x86-64 and ARM (but not ARM64) code for float to long HTypeConversion nodes. - Add related tests to test/422-type-conversion. Change-Id: I954214f0d537187883f83f7a83a1bb2dd8a21fd4
|
fc600dccd7797a9a10cdd457034ea8e148ccd631 |
|
02-Dec-2014 |
Roland Levillain <rpl@google.com> |
Fix a compiler bug related to a catch-less try-finally statement. Ensure a dead basic block produced in this case is properly removed. Change-Id: I7c88e26aaa6c6378892f7c7c299494fa42312db2
|
92a6ed2014278c78b60d7ef00751f15e6727aae0 |
|
02-Dec-2014 |
Calin Juravle <calin@google.com> |
Fix new-instance node. new-instance may throw when called on: - interfaces - abstract/inaccessible/unknown classes Change-Id: Id55dbb95b906a58c946b14adad934ee0e3498c0a
|
f537012ceb6cba8a78b36a5065beb9588451a250 |
|
02-Dec-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Treat SSA transformation specially, as we may have to bail out. We forgot to bail out when we found a non-natural loop (on which our optimizations don't work). Change-Id: I11976b5af4c98f4f29267a74c74d34b5ad81e20c
|
ddb7df25af45d7cd19ed1138e537973735cc78a5 |
|
25-Nov-2014 |
Calin Juravle <calin@google.com> |
[optimizing compiler] Add CMP{L,G}_{FLOAT,DOUBLE} Adds: - float comparison for arm, x86, x86_64 backends. - ucomis{s,d} assembly to x86 and x86_64. - vmstat assembly for thumb2 - new assembly tests Change-Id: Ie3e19d0c08b3b875cd0a4be4ee4e9c8a4a076290
|
91debbc3da3e3376416e4394155d9f9e355255cb |
|
26-Nov-2014 |
Calin Juravle <calin@google.com> |
Revert "[optimizing compiler] Add CMP{L,G}_{FLOAT,DOUBLE}" Fails on arm due to missing vmrs op after vcmp. I revert this instead of pushing the fix because I don't understand yet why it compiles with run-test but not with dex2oat. This reverts commit fd861249f31ab360c12dd1ffb131d50f02b0bfc6. Change-Id: Idc2d30f6a0f39ddd3596aa18a532ae90f8aaf62f
|
fd861249f31ab360c12dd1ffb131d50f02b0bfc6 |
|
25-Nov-2014 |
Calin Juravle <calin@google.com> |
[optimizing compiler] Add CMP{L,G}_{FLOAT,DOUBLE} - adds float comparison for arm, x86, x86_64 backends. - adds ucomis{s,d} assembly to x86 and x86_64. Change-Id: I232d2b6e9ecf373beb5cc63698dd97a658ff9c83
|
799f506b8d48bcceef5e6cf50f3f5eb6bcea05e1 |
|
26-Nov-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Revert "[optimizing compiler] Add CMP{L,G}_{FLOAT,DOUBLE}" Fails on x86_64 and target. This reverts commit cea28ec4b9e94ec942899acf1dbf20f8999b36b4. Change-Id: I30c1d188c7ecfe765f137a307022ede84f15482c
|
cea28ec4b9e94ec942899acf1dbf20f8999b36b4 |
|
25-Nov-2014 |
Calin Juravle <calin@google.com> |
[optimizing compiler] Add CMP{L,G}_{FLOAT,DOUBLE} - adds float comparison for arm, x86, x86_64 backends. - adds ucomis{s,d} assembly to x86 and x86_64. Change-Id: Ie91e04bfb402025073054f3803a3a569e4705caa
|
a8eed3acbc39c71ec22dc2943e71eaa07c6507dd |
|
24-Nov-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Revert "Revert "Fix the computation of linear ordering."" PS2 fixes the obvious typos/wrong refactoring. This reverts commit e50fa5887b1342b845826197d81950e26753fc9c. Change-Id: I22f81d63a12cf01aafd61535abc2399d936d49c2
|
e50fa5887b1342b845826197d81950e26753fc9c |
|
24-Nov-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Revert "Fix the computation of linear ordering." Build is broken. This reverts commit 3054a90063d379ab8c9e5a42a7daf0d644b48b07. Change-Id: I259bc2bd6a58e30391b8176f3db5fdb5c07e4d6d
|
9aec02fc5df5518c16f1e5a9b6cb198a192db973 |
|
19-Nov-2014 |
Calin Juravle <calin@google.com> |
[optimizing compiler] Add shifts Added SHL, SHR, USHR for arm, x86, x86_64. Change-Id: I971f594e270179457e6958acf1401ff7630df07e
|
3054a90063d379ab8c9e5a42a7daf0d644b48b07 |
|
21-Nov-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Fix the computation of linear ordering. The register allocator makes assumptions on the order, and we ended up not computing the right one. The algorithm worked fine when the loop header is the block branching to the exit, but in the presence of breaks or do/while, it was incorrect. Change-Id: Iad0a89872cd3f7b7a8b2bdf560f0d03493f93ba5
|
bacfec30ee9f2f6fdfd190f11b105b609938efca |
|
14-Nov-2014 |
Calin Juravle <calin@google.com> |
[optimizing compiler] Add REM_INT, REM_LONG - for arm, x86, x86_64 - minor cleanup/fix in div tests Change-Id: I240874010206a5a9b3aaffbc81a885b94c248f93
|
af07bc121121d7bd7e8329c55dfe24782207b561 |
|
12-Nov-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Minor object store optimizations. - Avoid emitting write barrier when the value is null. - Do not do a typecheck on an arraystore when storing something that was loaded from the same array. Change-Id: I902492928692e4553b5af0fc99cce3c2186c442a
|
d6fb6cfb6f2d0d9595f55e8cc18d2753be5d9a13 |
|
11-Nov-2014 |
Calin Juravle <calin@google.com> |
[optimizing compiler] Add DIV_LONG - for backends: arm, x86, x86_64 - added cqo, idivq, testq assembly for x86_64 - small cleanups Change-Id: I762ef37880749038ed25d6014370be9a61795200
|
f97f9fbfdf7f2e23c662f21081fadee6af37809d |
|
11-Nov-2014 |
Calin Juravle <calin@google.com> |
[optimizing compiler] add HTemporary support for long and doubles Change-Id: I5247ecd71d0193050484b7632c804c9bfd20f924
|
9574c4b5f5ef039d694ac12c97e25ca02eca83c0 |
|
12-Nov-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Implement and/or/xor in optimizing. Change-Id: I7cf6da1fd334a7177a5580931b8f174dd40b7cec
|
b7baf5c58d0e864f8c3f889357c51288aed42e61 |
|
11-Nov-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Implement monitorenter/monitorexit. Pretty simple as they just invoke the runtime. Change-Id: I5fcb2c783deac27e55e28d8b3da3e68ea4b77363
|
57a88d4ac205874dc85d22f9f6a9ca3c4c373eeb |
|
10-Nov-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Implement checkcast for optimizing. - Ended up not using HTypeCheck because of how instanceof and checkcast end up having different logic for code generation. - Fix an x86_64 assembler bug triggered by now enabling more methods to be compiled. Difficult to test today without b/18117217. Change-Id: I3022e7ae03befb1d10bea9637ad21fadc430abe0
|
421e9f9088b51e9680a3dfcae6965fc1854d3ee4 |
|
11-Nov-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Remove HTemporary when building the SSA graph. - They are useless afterwards. If we keep them around, they can crash the dump of the graph, where they always assume a previous instruction. - In the call to HTemporary::GetType, check that the previous instruction exists. Change-Id: Ie7bf44d05cb61e3654a69725c1980925580dd3a6
|
52839d17c06175e19ca4a093fb878450d1c4310d |
|
07-Nov-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Support invoke-interface in optimizing. Change-Id: Ic18d7c3d2810557231caf0571956e0c431f5d384
|
6f5c41f9e409bc4da53b5d7c385202255e391e72 |
|
06-Nov-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Implement instanceof in optimizing. - Only fast-path for now: null or same class. - Use pQuickInstanceofNonTrivial for slow path. Change-Id: Ic5196b94bef792f081f3cb4d15157058e1381e6b
|
f43083d560565aea46c602adb86423daeefe589d |
|
07-Nov-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Do not update Out after it has a valid location. Slow paths use LocationSummary to know where to move things around, and they are executed at the end of the code generation. This fix is needed for https://android-review.googlesource.com/#/c/113345/. Change-Id: Id336c6409479b1de6dc839b736a7234d08a7774a
|
de58ab2c03ff8112b07ab827c8fa38f670dfc656 |
|
05-Nov-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Implement try/catch/throw in optimizing. - We currently don't run optimizations in the presence of a try/catch. - We therefore implement Quick's mapping table. - Also fix a missing null check on array-length. Change-Id: I6917dfcb868e75c1cf6eff32b7cbb60b6cfbd68f
|
cd2de0c1c7f1051a2f7bdb0e827dd6057f3bafcd |
|
06-Nov-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Fix failures after div support. - We need to special case divide by -1 because of x86. - Disable div test on arm64, which does not support div yet. Change-Id: I07e137cb555a958b02a6c4070f296503b7e30bae
|
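The commit above does not spell out why -1 is special. A likely reason is that the x86 `idiv` instruction faults when the quotient overflows, which happens for the minimum integer divided by -1, whereas Java defines that result as the dividend itself. The sketch below shows the special case in portable C++; the helper names are hypothetical and the zero check is assumed to have been done earlier, as the div-int commit adds a separate divide-by-zero exception.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical helper mirroring Java's int division; divisor must be non-zero.
int32_t JavaStyleDiv(int32_t dividend, int32_t divisor) {
  assert(divisor != 0);
  if (divisor == -1) {
    // Negate in unsigned arithmetic so the minimum value wraps to itself
    // instead of overflowing.
    return static_cast<int32_t>(uint32_t{0} - static_cast<uint32_t>(dividend));
  }
  return dividend / divisor;
}

// Hypothetical helper mirroring Java's int remainder; x % -1 is always 0.
int32_t JavaStyleRem(int32_t dividend, int32_t divisor) {
  assert(divisor != 0);
  return divisor == -1 ? 0 : dividend % divisor;
}
```

Handling the divisor -1 before reaching the division keeps the overflowing case away from the hardware divide entirely.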
d0d4852847432368b090c184d6639e573538dccf |
|
04-Nov-2014 |
Calin Juravle <calin@google.com> |
[optimizing compiler] Add div-int and exception handling. - for backends: arm, x86, x86_64 - fixed a register allocator bug: the request for a fixed register for the first input was ignored if the output was kSameAsFirstInput - added divide by zero exception - more tests - shuffle around some code in the builder to reduce the number of lines of code for a single function. Change-Id: Id3a515e02bfbc66cd9d16cb9746f7551bdab3d42
|
ed9b1958371952f5cdcc040bec8997da462edba7 |
|
06-Nov-2014 |
Roland Levillain <rpl@google.com> |
Fix ART build issues. - Use ATTRIBUTE_UNUSED to avoid a warning about an unused argument in compiler/optimizing/nodes.h instead of simply commenting it out. - Disable run test 002-sleep on ARM64. Change-Id: I96911904289b73611e0fc168e7b597a9a2df8141
|
dff1f2812ecdaea89978c5351f0c70cdabbc0821 |
|
05-Nov-2014 |
Roland Levillain <rpl@google.com> |
Support int-to-long conversions in the optimizing compiler. - Add support for the int-to-long Dex instruction in the optimizing compiler. - Add a HTypeConversion node type for control-flow graphs. - Generate x86, x86-64 and ARM (but not ARM64) code for int-to-long HTypeConversion nodes. - Add a 64-bit "Move doubleword to quadword with sign-extension" (MOVSXD) instruction to the x86-64 assembler. - Add related tests to test/422-type-conversion. Change-Id: Ieb8ec5380f9c411857119c79aa8d0728fd10f780
|
424f676379f2f872acd1478672022f19f3240fc1 |
|
03-Nov-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Implement CONST_CLASS in optimizing compiler. Change-Id: Ia8c8dfbef87cb2f7893bfb6e178466154eec9efd
|
6a3c1fcb4ba42ad4d5d142c17a3712a6ddd3866f |
|
31-Oct-2014 |
Ian Rogers <irogers@google.com> |
Remove -Wno-unused-parameter and -Wno-sign-promo from base cflags. Fix associated errors about unused parameters and implicit sign conversions. For sign conversion this was largely in the area of enums, so add ostream operators for the affected enums and fix tools/generate-operator-out.py. Tidy arena allocation code and arena allocated data types, rather than fixing new and delete operators. Remove dead code. Change-Id: I5b433e722d2f75baacfacae4d32aef4a828bfe1b
|
b5f62b3dc5ac2731ba8ad53cdf3d9bdb14fbf86b |
|
30-Oct-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Support for CONST_STRING in optimizing compiler. Change-Id: Iab8517bdadd1d15ffbe570010f093660be7c51aa
|
19a19cffd197a28ae4c9c3e59eff6352fd392241 |
|
22-Oct-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Add support for static fields in optimizing compiler. Change-Id: Id2f010589e2bd6faf42c05bb33abf6816ebe9fa9
|
7c4954d429626a6ceafbf05be41bf5f840894e44 |
|
28-Oct-2014 |
Calin Juravle <calin@google.com> |
[optimizing compiler] Add division for floats and doubles backends: x86, x86_64, arm. Also: - ordered instructions based on their name. - add missing kNoOutputOverlap to add/sub/mul. Change-Id: Ie47cde3b15ac74e7a1660c67a2eed1d7871f0ad0
|
1cc5f251df558b0e22cea5000626365eb644c727 |
|
22-Oct-2014 |
Roland Levillain <rpl@google.com> |
Implement int bit-wise not operation in the optimizing compiler. - Add support for the not-int (integer one's complement negate) instruction in the optimizing compiler. - Extend the HNot control-flow graph node type and make it inherit from HUnaryOperation. - Generate ARM, x86 and x86-64 code for integer HNot nodes. - Exercise these additions in the codegen_test gtest, as there is no direct way to assess the support of not-int from a Java source. Indeed, compiling a Java expression such as `~a' using javac and then dx generates an xor-int/lit8 Dex instruction instead of the expected not-int Dex instruction. This is probably because the Java bytecode has an `ixor' instruction, but there's no instruction directly corresponding to a bit-wise not operation. Change-Id: I223aed75c4dac5785e04d99da0d22e8d699aee2b
|
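The xor-int/lit8 observation above rests on a simple identity: a bit-wise not is the same as an exclusive-or with -1 (all bits set). A tiny sketch, with a hypothetical function name:

```cpp
#include <cassert>
#include <cstdint>

// Computes ~a the way the xor-based encoding does: a XOR 0xFFFFFFFF.
int32_t NotViaXor(int32_t a) {
  return a ^ -1;
}
```

Because both forms produce identical results, a Java source test cannot tell which Dex instruction was emitted, which is consistent with the commit exercising not-int through a gtest instead.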
cf7f19135f0e273f7b0136315633c2abfc715343 |
|
23-Oct-2014 |
Ian Rogers <irogers@google.com> |
C++11 related clean-up of DISALLOW_.. Move DISALLOW_COPY_AND_ASSIGN to delete functions. By not having declarations with no definitions, this prompts better warning messages, so deal with these by correcting the code. Add a DISALLOW_ALLOCATION and use for ValueObject and mirror::Object. Make X86 assembly operand types ValueObjects to fix compilation errors. Tidy the use of iostream and ostream. Avoid making cutils a dependency via mutex-inl.h for tests that link against libart. Push tracing dependencies into appropriate files and mutex.cc. x86 32-bit host symbols size is increased for libarttest, avoid copying this in run-test 115 by using symlinks and remove this test's higher than normal ulimit. Fix the RunningOnValgrind test in RosAllocSpace to not use GetHeap as it returns NULL when the heap is under construction by Runtime. Change-Id: Ia246f7ac0c11f73072b30d70566a196e9b78472b
|
a3d05a40de076aabf12ea284c67c99ff28b43dbf |
|
20-Oct-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Implement array creation related DEX instructions. Implement new-array, filled-new-array, and fill-array-data. Change-Id: I405560d66777a57d881e384265322617ac5d3ce3
|
102cbed1e52b7c5f09458b44903fe97bb3e14d5f |
|
15-Oct-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Implement register allocator for floating point registers. Also: - Fix misuses of emitting the rex prefix in the x86_64 assembler. - Fix movaps code generation in the x86_64 assembler. Change-Id: Ib6dcf6e7c4a9c43368cfc46b02ba50f69ae69cbe
|
9240d6a2baa9ed1e18ee08744b461fe49a1ee269 |
|
20-Oct-2014 |
Roland Levillain <rpl@google.com> |
Constant folding on unary operations in the optimizing compiler. Change-Id: I4b77afa2a89f5ad2eedd4d6c0c6c382585419349
|
88cb1755e1d6acaed0f66ce65d7a2a4465053342 |
|
20-Oct-2014 |
Roland Levillain <rpl@google.com> |
Implement int negate instruction in the optimizing compiler. - Add support for the neg-int (integer two's complement negate) instruction in the optimizing compiler. - Add a HNeg node type for control-flow graphs and an intermediate HUnaryOperation base class. - Generate ARM, x86 and x86-64 code for integer HNeg nodes. Change-Id: I72fd3e1e5311a75c38a8cb665a9211a20325a42e
|
6c82d40eb142771086f5531998de2273ba5cc08c |
|
13-Oct-2014 |
Roland Levillain <rpl@google.com> |
Have HInstruction::StrictlyDominates compute strict dominance. Change-Id: I3a4fa133268615fb4ce54a0bcb43e0c2458cc865
|
34bacdf7eb46c0ffbf24ba7aa14a904bc9176fb2 |
|
07-Oct-2014 |
Calin Juravle <calin@google.com> |
Add multiplication for integral types This also fixes an issue where we could allocate a pair register even if one of its parts was already blocked. Change-Id: I4869175933409add2a56f1ccfb369c3d3dd3cb01
|
633021e6ff6b9a57a374a994e74cfd69275ce100 |
|
01-Oct-2014 |
Roland Levillain <rpl@google.com> |
Implement default traversals in CFG & SSA graph checkers. - Check CFG graphs using an insertion order traversal. - Check SSA form graphs using a reverse post-order traversal. Change-Id: Ib9062599bdbf3c17b9f213b743274b2d71a9fa90
|
e161a2a60c0325793f04be42a0f05228955ecfdd |
|
03-Oct-2014 |
Roland Levillain <rpl@google.com> |
Do not remove NullChecks & BoundsChecks in HDeadCodeElimination. Removing a NullCheck or a BoundsCheck instruction may change the behavior of a program. Change-Id: Ib2c9beff0cc98c382210e7cc88b1fa9af3c61887
|
0279ebb3efd653e6bb255470c99d26949c7bcd95 |
|
09-Oct-2014 |
Ian Rogers <irogers@google.com> |
Tidy ELF builder. Don't do "if (ptr)". Use const. Use DISALLOW_COPY_AND_ASSIGN. Avoid public member variables. Move ValueObject to base and use in ELF builder. Tidy VectorOutputStream to not use non-const reference arguments. Change-Id: I2c727c3fc61769c3726de7cfb68b2d6eb4477e53
|
360231a056e796c36ffe62348507e904dc9efb9b |
|
08-Oct-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Fix code generation of materialized conditions. Move the logic for knowing if a condition needs to be materialized in an optimization pass (so that the information does not change as a side effect of another optimization). Also clean-up arm and x86_64 codegen: - arm: ldr and str are for power-users when a constant is in play. We should use LoadFromOffset and StoreToOffset. - x86_64: fix misuses of movq instead of movl. Change-Id: I01a03b91803624be2281a344a13ad5efbf4f3ef3
|
93445689c714e53cabf347da4321ecf3023e926c |
|
06-Oct-2014 |
Roland Levillain <rpl@google.com> |
Fix and improve static evaluation of constant expressions. - Fix the definition of art::HSub::Evaluate. - Qualify Evaluate methods as OVERRIDE. - Evaluate comparisons in a deterministic way: if a comparison is true, always return 1 (instead of letting the compiler return any non-null value). - Better exercise static evaluation of constant expressions in compiler/optimizing/constant_propagation_test.cc. Change-Id: I13d0862e5f4eba1275016fb8c3c17e9aff54408b
|
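The deterministic evaluation described above can be sketched as a small folding function. The `Op` enum and `FoldConstant` are hypothetical names, not ART's `Evaluate` methods; the sketch assumes Java-style wrap-around for int arithmetic and returns exactly 1 or 0 for comparisons.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical operation set for illustration.
enum class Op { kAdd, kSub, kLessThan, kEqual };

// Folds a binary operation over two int constants.
int32_t FoldConstant(Op op, int32_t x, int32_t y) {
  // Unsigned arithmetic gives wrap-around without signed-overflow undefined behavior.
  uint32_t ux = static_cast<uint32_t>(x);
  uint32_t uy = static_cast<uint32_t>(y);
  switch (op) {
    case Op::kAdd:      return static_cast<int32_t>(ux + uy);
    case Op::kSub:      return static_cast<int32_t>(ux - uy);
    // Comparisons always yield exactly 1 or 0, never an arbitrary non-zero value.
    case Op::kLessThan: return x < y ? 1 : 0;
    case Op::kEqual:    return x == y ? 1 : 0;
  }
  return 0;  // Not reached for valid Op values.
}
```

Pinning true to 1 means two folded comparisons that are both true compare equal afterwards, which matters once later passes treat the folded values as ordinary constants.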
740475d5f45b8caa2c3c6fc51e657ecf4f3547e5 |
|
29-Sep-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Fix a bug in the insertion of parallel move. To make sure we do not connect interval siblings in the same parallel move, I added a new field in MoveOperands that tells which instruction this move is for. A parallel move should not contain moves for the same instructions. The checks revealed a bug when connecting siblings, where we would choose the wrong parallel move. Change-Id: I70f27ec120886745c187071453c78da4c47c1dd2
|
9ebc72c99e6b703bda611d7c918c9cf3dfb43e55 |
|
25-Sep-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Make suspend checks not have side effects. Also adjust gtests. Change-Id: I5e1a3e53115812b45ec7f4b6f50ba468fa7ac6b1
|
5799fc0754da7ff2b50b472e05c65cd4ba32dda2 |
|
25-Sep-2014 |
Roland Levillain <rpl@google.com> |
Optimizing compiler: remove unnecessary `explicit' keywords. Change-Id: I5927fd92d53308c81e14edbd6e7d1c943bfa085b
|
3c04974a90b0e03f4b509010bff49f0b2a3da57f |
|
24-Sep-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Optimize suspend checks in optimizing compiler. - Remove the ones added during graph build (they were added for the baseline code generator). - Emit them at loop back edges after phi moves, so that the test can directly jump to the loop header. - Fix x86 and x86_64 suspend check by using cmpw instead of cmpl. Change-Id: I6fad5795a55705d86c9e1cb85bf5d63dadfafa2a
|
6b46923ff0197c95f1e7ea0bc730961df6725cc9 |
|
25-Sep-2014 |
Roland Levillain <rpl@google.com> |
Optimizing compiler: check inputs & uses definitions in CFG. Ensure each input and each use of an instruction is defined in a block of the control-flow graph. Change-Id: If4a83b02825230329b0b4fd84255dcb7c3219684
|
18efde5017369e005f1e8bcd3bbfb04e85053640 |
|
22-Sep-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Fix code generation with materialized conditions. Change-Id: I8630af3c13fc1950d3fa718d7488407b00898796
|
6b879ddc0959df1cec871f0d41f11cce35a11716 |
|
22-Sep-2014 |
Roland Levillain <rpl@google.com> |
Add loop- and phi-related checks in the optimizing compiler. - Ensure the pre-header block is first in the list of predecessors of a loop header. - Ensure the loop header has only two predecessors and that only the second one is the back edge. - Ensure there is only one back edge per loop. - Ensure the first input of a phi is not itself. - Ensure the number of phi inputs is the same as the number of its predecessors. - Ensure phi input at index I either comes from the Ith predecessor or from a block that dominates this predecessor. Change-Id: I4db5c68cfbc9b74d2d03125753d0143ece625378
|
724c96326dea6ec33287a0076279c136abb0208a |
|
22-Sep-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Also remove environment links to removed instructions. Change-Id: I505163fb8683269c7d3fe21b34df92337d244552
|
d31cf3d55a0847c018c4eaa2b349b8eea509de64 |
|
08-Sep-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
First optimization in new compiler: simple GVN. Change-Id: Ibe0efa4e84fd020a53ded310a92e0b4363f91b12
|
c83d441a722f0afb510c9cd0e69e09d65652143c |
|
18-Sep-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Fix a lint error and update a test after the phi fix. Change-Id: I6e9ab2a8300c2493a8d3e93ab4ced3d7c9552fc5
|
556c3d193134f6461f3e1fe17c032b087c5931a0 |
|
18-Sep-2014 |
Roland Levillain <rpl@google.com> |
Initiate a constant propagation pass in the optimizing compiler. - Perform constant folding on int and long additions and subtractions in the optimizing compiler. - Apply constant folding to conditions and comparisons. Change-Id: Ic88783a3c975fda777c74c531e257fa777be42eb
|
b09aacb495dce2cb3e8469f056fdc2636ae393e6 |
|
17-Sep-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Small fixes to get the boot image compiled in debug mode. Change-Id: Id697737a1bcfb87f407d707e2ddd4f50a77caf26
|
72bceff11a98cc1ecdb64a6fae16c521f99ec6a7 |
|
15-Sep-2014 |
Roland Levillain <rpl@google.com> |
Initiate a dead code elimination pass in the optimizing compiler. Change-Id: Ie9db5d8e2c2c30e34145a0f7d2386b8ec58cfc4e
|
ccc07a9579c554443cd03a306ca9b4f943fd2a93 |
|
16-Sep-2014 |
Roland Levillain <rpl@google.com> |
Add CFG and SSA form checkers in the optimizing compiler. Checks performed on control-flow graphs: - Ensure that the predecessors and successors of a basic block are consistent within a control-flow graph. - Ensure basic blocks end with a branch instruction. - Detect phi functions listed in non-phi instruction lists and vice versa. - Ensure a block's instructions (and phi functions) are associated with this very block. Checks performed on SSA form graphs: - Ensure an instruction dominates all its uses. - Ensure there are no critical edges. Change-Id: I1c12b4a61ecf608682152c897980ababa7eca847
|
604c6e4764edb2fd244e9f47626868cda5644a7a |
|
17-Sep-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Ensure the first predecessor of a loop is the pre header. Note that the check in ssa_phi_elimination.cc was very defensive: it does not affect the outcome of the algorithm whether the loop phi takes itself as the first input. It makes things consistent to always have the pre header as first input. Change-Id: Ic86248c1f38af67f7432782f6deefae1f4bf1ab6
|
e982f0b8e809cece6f460fa2d8df25873aa69de4 |
|
13-Aug-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Implement invoke virtual in optimizing compiler. Also refactor 004 tests to make them work with both Quick and Optimizing. Change-Id: I87e275cb0ae0258fc3bb32b612140000b1d2adf8
|
fbc695f9b8e2084697e19c1355ab925f99f0d235 |
|
15-Sep-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Revert "Revert "Implement suspend checks in new compiler."" This reverts commit 7e3652c45c30c1f2f840e6088e24e2db716eaea7. Change-Id: Ib489440c34e41cba9e9e297054f9274f6e81a2d8
|
7e3652c45c30c1f2f840e6088e24e2db716eaea7 |
|
15-Sep-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Revert "Implement suspend checks in new compiler." This reverts commit 6fbce029fba3ed5da6c36017754ed408e6bcb632. Change-Id: Ia915c27873b021e658a10212e559095dfc91284e
|
6fbce029fba3ed5da6c36017754ed408e6bcb632 |
|
10-Sep-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Implement suspend checks in new compiler. For simplicity, they are currently placed on all (dex-level) back edges, and at method entry. Change-Id: I6e833e244d559dd788c69727e22fe40aff5b3435
|
065bf77b43c39da315b974ea08a5ed25e9049681 |
|
03-Sep-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Add (simple) side effects flags and equality methods on nodes. This is in preparation for doing GVN and LICM. Change-Id: I43050ff846755f9387a62b893d548ecdb54e7e95
|
3946844c34ad965515f677084b07d663d70ad1b8 |
|
02-Sep-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Runtime support for the new stack maps for the opt compiler. Now most of the methods supported by the compiler can be optimized, instead of using the baseline. Change-Id: I80ab36a34913fa4e7dd576c7bf55af63594dc1fa
|
3ac17fcce8773388512ce72cb491b202872ca1c1 |
|
07-Aug-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Fix SsaDeadPhiElimination in the presence of dependent phis. This fixes the problem of having a dead loop phi taking as back-edge input a phi that also has this loop phi as input. Walking backwards does not solve the problem because the loop phi will be visited last. Most of the time, dex removes dead locals like this. Change-Id: I797198cf9c15f8faa6585cca157810e23aaa4940
|
3c7bb98698f77af10372cf31824d3bb115d9bf0f |
|
23-Jul-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Implement array get and array put in optimizing. Also fix a couple of assembler/disassembler issues. Change-Id: I705c8572988c1a9c4df3172b304678529636d5f6
|
96f89a290eb67d7bf4b1636798fa28df14309cc7 |
|
11-Jul-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Add assembly operations with constants in optimizing compiler. Change-Id: I5bcc35ab50d4457186effef5592a75d7f4e5b65f
|
e63db27db913f1a88e2095a1ee8239b2bb9124e8 |
|
16-Jul-2014 |
Ian Rogers <irogers@google.com> |
Break apart header files. Create libart-gtest for common runtime and compiler gtest routines. Rename CompilerCallbacksImpl that is quick compiler specific. Rename trace clock source constants to not use the overloaded profiler term. Change-Id: I4aac4bdc7e7850c68335f81e59a390133b54e933
|
ab032bc1ff57831106fdac6a91a136293609401f |
|
15-Jul-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Fix a braino in the stack layout. Also do some refactoring to have this code be just in CodeGenerator. Change-Id: I88de109889138af8d60027973c12a64bee813cb7
|
e50383288a75244255d3ecedcc79ffe9caf774cb |
|
04-Jul-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Support fields in optimizing compiler. - Required support for temporaries, to be only used by baseline compiler. - Also fixed a few invalid assumptions around locations and instructions that don't need materialization. These instructions should not have an Out. Change-Id: Idc4a30dd95dd18015137300d36bec55fc024cf62
|
7dc206a53a42a658f52d5cb0b7e79b47da370c9b |
|
11-Jul-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Add two phi pruning phases. Change-Id: Ic4f05e3df96970d78a6938b27cdf9b58ef3849b9
|
412f10cfed002ab617c78f2621d68446ca4dd8bd |
|
19-Jun-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Support longs in the register allocator for x86_64. Change-Id: I7fb6dfb761bc5cf9e5705682032855a0a70ca867
|
20dfc797dc631bf8d655dcf123f46f13332d3074 |
|
17-Jun-2014 |
Dave Allison <dallison@google.com> |
Add some more instruction support to optimizing compiler. This adds a few more DEX instructions to the optimizing compiler's builder (constants, moves, if_xx, etc). Also: * Changes the codegen for IF_XX instructions to use a condition rather than comparing a value against 0. * Fixes some instructions in the ARM disassembler. * Fixes PushList and PopList in the thumb2 assembler. * Switches the assembler for the optimizing compiler to thumb2 rather than ARM. Change-Id: Iaafcd02243ccc5b03a054ef7a15285b84c06740f
|
86dbb9a12119273039ce272b41c809fa548b37b6 |
|
04-Jun-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Final CL to enable register allocation on x86. This CL implements: 1) Resolution after allocation: connecting the locations allocated to an interval within a block and between blocks. 2) Handling of fixed registers: some instructions require inputs/output to be at a specific location, and the allocator needs to deal with them in a special way. 3) ParallelMoveResolver::EmitNativeCode for x86. Change-Id: I0da6bd7eb66877987148b87c3be6a983b4e3f858
|
ec7e4727e99aa1416398ac5a684f5024817a25c7 |
|
06-Jun-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Fix some bugs in graph construction/simplification methods. Also fix a braino during SSA construction. The code should not have been commented out. Added a test to cover what the code intends. Change-Id: Ia00ae79dcf75eb0d412f07649d73e7f94dbfb6f0
|
ffddfdf6fec0b9d98a692e27242eecb15af5ead2 |
|
03-Jun-2014 |
Tim Murray <timmurray@google.com> |
DO NOT MERGE Merge ART from AOSP to lmp-preview-dev. Change-Id: I0f578733a4b8756fd780d4a052ad69b746f687a9
|
a7062e05e6048c7f817d784a5b94e3122e25b1ec |
|
22-May-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Add a linear scan register allocator to the optimizing compiler. This is a "by-the-book" implementation. It currently only deals with allocating registers, with no hint optimizations. The changes remaining to make it functional are: - Allocate spill slots. - Resolution and placements of Move instructions. - Connect it to the code generator. Change-Id: Ie0b2f6ba1b98da85425be721ce4afecd6b4012a4
|
4e3d23aa1523718ea1fdf3a32516d2f9d81e84fe |
|
22-May-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Import Dart's parallel move resolver. And write a few tests while at it. A parallel move resolver will be needed for performing multiple moves that are conceptually parallel, for example moves at a block exit that branches to a block with phi nodes. Change-Id: Ib95b247b4fc3f2c2fcab3b8c8d032abbd6104cd7
|
ddb311fdeca82ca628fed694c4702f463b5c4927 |
|
16-May-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Build live ranges in preparation for register allocation. Change-Id: I7ae24afaa4e49276136bf34f4ba7d62db7f28c01
|
0d3f578909d0d1ea072ca68d78301b6fb7a44451 |
|
14-May-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Linearize the graph before creating live ranges. Change-Id: I02eb5671e3304ab062286131745c1366448aff58
|
f635e63318447ca04731b265a86a573c9ed1737c |
|
14-May-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Add a compilation tracing mechanism to the new compiler. Code mostly imported from: https://android-review.googlesource.com/#/c/81653/. Change-Id: I150fe942be0fb270e03fabb19032180f7a065d13
|
622d9c31febd950255b36a48b47e1f630197c5fe |
|
12-May-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Add loop recognition and CFG simplifications in new compiler. We do three simplifications: - Split critical edges, for code generation from SSA (new). - Ensure one back edge per loop, to simplify loop recognition (new). - Ensure only one pre header for a loop, to simplify SSA creation (existing). Change-Id: I9bfccd4b236a00486a261078627b091c8a68be33
|
804d09372cc3d80d537da1489da4a45e0e19aa5d |
|
02-May-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Build live-in, live-out and kill sets for each block. This information will be used when computing live ranges of instructions. Change-Id: I345ee833c1ccb4a8e725c7976453f6d58d350d74
|
c32e770f21540e4e9eda6dc7f770e745d33f1b9f |
|
24-Apr-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Add a Transform to SSA phase to the optimizing compiler. Change-Id: Ia9700756a0396d797a00b529896487d52c989329
|
db928fcc975b431d8a78700c11bd7da21090384a |
|
16-Apr-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Simplify HInvokeStatic code generation. HPushArgument is not needed for now (but might be when we start optimizing). Also, the calling convention for 64-bit backends will require knowing more about an argument than its index. Therefore, for now let HInvokeStatic set up the arguments, which is possible because the arguments of a call are virtual registers and not instructions. Change-Id: I8753ed6083aa083c5180ab53b436dc8de4f1fe31
|
01bc96d007b67fdb7fe349232a83e4b354ce3d08 |
|
11-Apr-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Long support in optimizing compiler. - Add stack locations to the Location class. - Change logic of parameter passing/setup by setting the locations of such instructions to the ones dictated by the calling convention. Change-Id: I4730ad58732813dcb9c238f44f55dfc0baa18799
|
b55f835d66a61e5da6fc1895ba5a0482868c9552 |
|
07-Apr-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Test control flow instruction with optimizing compiler. Add support for basic instructions to implement these tests. Change-Id: I3870bf9301599043b3511522bb49dc6364c9b4c0
|
f583e5976e1de9aa206fb8de4f91000180685066 |
|
07-Apr-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Add support for taking parameters in optimizing compiler. - Fix stack layout to mimic Quick's. - Implement some sub operations. Change-Id: I8cf75a4d29b662381a64f02c0bc61d859482fc4e
|
2e7038ac5848468740d6a419434d3dde8c585a53 |
|
03-Apr-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Add support for new-instance and invoke-direct. Change-Id: I2daed646904f7711972a7da15d88be7573426932
|
4a34a428c6a2588e0857ef6baf88f1b73ce65958 |
|
03-Apr-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Support passing arguments to invoke-static* instructions. - Stop using the frame pointer for accessing locals. - Stop emulating a stack when doing code generation. Instead, rely on dex register model, where instructions only reference registers. Change-Id: Id51bd7d33ac430cb87a53c9f4b0c864eeb1006f9
|
d8ee737fdbf380c5bb90c9270c8d1087ac23e76c |
|
28-Mar-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Add support for adding two integers in optimizing compiler. Change-Id: I5524e193cd07f2692a57c6b4f8069904471b2928
|
8ccc3f5d06fd217cdaabd37e743adab2031d3720 |
|
19-Mar-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Add support for invoke-static in optimizing compiler. Support is limited to calls without parameters and returning void. For simplicity, we currently follow the Quick ABI. Change-Id: I54805161141b7eac5959f1cae0dc138dd0b2e8a5
|
787c3076635cf117eb646c5a89a9014b2072fb44 |
|
17-Mar-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Plug new optimizing compiler in compilation pipeline. Also rename accessors to ART's conventions. Change-Id: I344807055b98aa4b27215704ec362191464acecc
|
bab4ed7057799a4fadc6283108ab56f389d117d4 |
|
11-Mar-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
More code generation for the optimizing compiler. - Add HReturn instruction - Generate code for locals/if/return - Set up infrastructure for register allocation. Currently emulate a stack. Change-Id: Ib28c2dba80f6c526177ed9a7b09c0689ac8122fb
|
3ff386aafefd5282bb76c8a50506a70a4321e698 |
|
04-Mar-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Add register support to the optimizing compiler. Also make the if instruction take an input and build the use list for instructions. Change-Id: I1938cee7dce5bd4c66b259fa2b431d2c79b3cf82
|
d4dd255db1d110ceb5551f6d95ff31fb57420994 |
|
28-Feb-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Add codegen support to the optimizing compiler. Change-Id: I9aae76908ff1d6e64fb71a6718fc1426b67a5c28
|
0e33643519b68a343a7466dcaba12b8567777cc3 |
|
26-Feb-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Move arena_bit_vector.h/cc to compiler/utils. Also move MIR's BasicBlock related code from arena_bit_vector.h to bit_vector_block_iterator.cc. Change-Id: I85c224b387d31cf57a1ef1f1a36eaadf22f1c85d
|
be9a92aa804c0d210f80966b74ef8ed3987f335a |
|
25-Feb-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Add conditional branches, and build dominator tree. Change-Id: I4b151a07b72692961235a1419b54b6b45cf54e63
|
818f2107e6d2d9e80faac8ae8c92faffa83cbd11 |
|
18-Feb-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Re-apply: Initial check-in of an optimizing compiler. The classes and the names are very much inspired by V8/Dart. It currently only supports the RETURN_VOID dex instruction, and there is a pretty printer to check if the building of the graph is correct. Change-Id: I28e125dfee86ae6ec9b3fec6aa1859523b92a893
|
1af0c0b88a956813eb0ad282664cedc391e2938f |
|
19-Feb-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Revert "Initial check-in of an optimizing compiler." g++ warnings turned into errors. This reverts commit 68a5fefa90f03fdf5a238ac85c9439c6b03eae96. Change-Id: I09bb95d9cc13764ca8a266c41af04801a34b9fd0
|
68a5fefa90f03fdf5a238ac85c9439c6b03eae96 |
|
18-Feb-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Initial check-in of an optimizing compiler. The classes and the names are very much inspired by V8/Dart. It currently only supports the RETURN_VOID dex instruction, and there is a pretty printer to check if the building of the graph is correct. Change-Id: Id5ef1b317ab997010d4e3888e456c26bef1ab9c0
|