2aaa4b5532d30c4e65d8892b556400bb61f9dc8c |
|
17-Sep-2015 |
Vladimir Marko <vmarko@google.com> |
Optimizing: Tag more arena allocations. Replace GrowableArray with ArenaVector and tag arena allocations with new allocation types. As part of this, make the register allocator a bit more efficient, doing bulk insert/erase. Some loops are now O(n) instead of O(n^2). Change-Id: Ifac0871ffb34b121cc0447801a2d07eefd308c14
|
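The bulk-erase point in the commit above can be sketched as follows. This is only an illustration using std::vector rather than ART's ArenaVector; the `Interval` struct, its `dead` flag, and both function names are hypothetical and not taken from the ART sources.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a live interval; not ART's LiveInterval.
struct Interval {
  int id;
  bool dead;
};

// Worst case O(n^2): every single erase shifts all later elements left.
void EraseOneByOne(std::vector<Interval>& v) {
  std::size_t i = 0;
  while (i < v.size()) {
    if (v[i].dead) {
      // Do not advance i: the next element has moved into slot i.
      v.erase(v.begin() + static_cast<std::ptrdiff_t>(i));
    } else {
      ++i;
    }
  }
}

// O(n): compact the survivors in one pass, then erase the tail in bulk.
void EraseInBulk(std::vector<Interval>& v) {
  auto new_end = std::remove_if(v.begin(), v.end(),
                                [](const Interval& it) { return it.dead; });
  v.erase(new_end, v.end());
}
```

Both functions leave the same survivors in the same order; only the number of element moves differs.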
77a48ae01bbc5b05ca009cf09e2fcb53e4c8ff23 |
|
15-Sep-2015 |
David Brazdil <dbrazdil@google.com> |
Revert "Revert "ART: Register allocation and runtime support for try/catch"" The original CL triggered b/24084144 which has been fixed by Ib72e12a018437c404e82f7ad414554c66a4c6f8c. This reverts commit 659562aaf133c41b8d90ec9216c07646f0f14362. Change-Id: Id8980436172457d0fcb276349c4405f7c4110a55
|
659562aaf133c41b8d90ec9216c07646f0f14362 |
|
14-Sep-2015 |
David Brazdil <dbrazdil@google.com> |
Revert "ART: Register allocation and runtime support for try/catch" Breaks libcore test org.apache.harmony.security.tests.java.security.KeyStorePrivateKeyEntryTest#testGetCertificateChain. Need to investigate. This reverts commit b022fa1300e6d78639b3b910af0cf85c43df44bb. Change-Id: Ib24d3a80064d963d273e557a93469c95f37b1f6f
|
b022fa1300e6d78639b3b910af0cf85c43df44bb |
|
20-Aug-2015 |
David Brazdil <dbrazdil@google.com> |
ART: Register allocation and runtime support for try/catch This patch completes a series of CLs that add support for try/catch in the Optimizing compiler. With it, Optimizing can compile all methods containing try/catch, provided they don't contain catch loops. Future work will focus on improving performance of the generated code. SsaLivenessAnalysis was updated to propagate liveness information of instructions live at catch blocks, and to keep location information on instructions which may be caught by catch phis. RegisterAllocator was extended to spill values used after catch, and to allocate spill slots for catch phis. Catch phis generated for the same vreg share a spill slot as the raw value must be the same. Location builders and slow paths were updated to reflect the fact that throwing an exception may not lead to escaping the method. Instruction code generators are forbidden from using implicit null checks in try blocks as live registers need to be saved before handing over to the runtime. CodeGenerator emits a stack map for each catch block, storing locations of catch phis. CodeInfo and StackMapStream recognize this new type of stack map and store them separately from other stack maps to avoid dex_pc conflicts. After having found the target catch block to deliver an exception to, QuickExceptionHandler looks up the dex register maps at the throwing instruction and the catch block and copies the values over to their respective locations. The runtime-support approach was selected because it allows for the best performance in the normal control-flow path, since no propagation of catch phi values is necessary until the exception is thrown. In addition, it greatly simplifies the register allocation phase. ConstantHoisting was removed from LICMTest because it instantiated (now abstract) HConstant and was bogus anyway (constants are always in the entry block). Change-Id: Ie31038ad8e3ee0c13a5bbbbaf5f0b3e532310e4e
|
41b175aba41c9365a1c53b8a1afbd17129c87c14 |
|
19-May-2015 |
Vladimir Marko <vmarko@google.com> |
ART: Clean up arm64 kNumberOfXRegisters usage. Avoid undefined behavior for arm64 stemming from 1u << 32 in loops with upper bound kNumberOfXRegisters. Create iterators for enumerating bits in an integer either from high to low or from low to high and use them for <arch>Context::FillCalleeSaves() on all architectures. Refactor runtime/utils.{h,cc} by moving all bit-fiddling functions to runtime/base/bit_utils.{h,cc} (together with the new bit iterators) and all time-related functions to runtime/base/time_utils.{h,cc}. Improve test coverage and fix some corner cases for the bit-fiddling functions. Bug: 13925192 (cherry picked from commit 80afd02024d20e60b197d3adfbb43cc303cf29e0) Change-Id: I905257a21de90b5860ebe1e39563758f721eab82
|
80afd02024d20e60b197d3adfbb43cc303cf29e0 |
|
19-May-2015 |
Vladimir Marko <vmarko@google.com> |
ART: Clean up arm64 kNumberOfXRegisters usage. Avoid undefined behavior for arm64 stemming from 1u << 32 in loops with upper bound kNumberOfXRegisters. Create iterators for enumerating bits in an integer either from high to low or from low to high and use them for <arch>Context::FillCalleeSaves() on all architectures. Refactor runtime/utils.{h,cc} by moving all bit-fiddling functions to runtime/base/bit_utils.{h,cc} (together with the new bit iterators) and all time-related functions to runtime/base/time_utils.{h,cc}. Improve test coverage and fix some corner cases for the bit-fiddling functions. Bug: 13925192 Change-Id: I704884dab15b41ecf7a1c47d397ab1c3fc7ee0f7
|
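The undefined behavior mentioned in the two commits above comes from shifting a 32-bit value by 32 or more, which can happen when a loop tests `1u << i` and its upper bound exceeds 32. A minimal sketch of a low-to-high bit enumeration that never forms such a shift is shown below; `SetBitsLowToHigh` is a hypothetical helper written for illustration and is not the iterator added to runtime/base/bit_utils.h.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical helper: returns the indices of the set bits in `mask`,
// from the lowest bit to the highest. It shifts the mask right by one
// per step, so no shift count ever reaches the width of the type.
std::vector<int> SetBitsLowToHigh(std::uint32_t mask) {
  std::vector<int> indices;
  int index = 0;
  while (mask != 0u) {
    if ((mask & 1u) != 0u) {
      indices.push_back(index);
    }
    mask >>= 1;
    ++index;
  }
  return indices;
}
```

The loop also stops as soon as no set bits remain, so the caller does not need to supply a register count as an upper bound.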
8826f67ad53099021f6442364348fa66729288d7 |
|
17-Apr-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Callee/caller save logic in register allocator. Prevent intervals that do not span a 'will-call' safepoint from allocating a callee-save register when caller-saves are available. Change-Id: I6e613ab54b087f433bbc433aa62847fbca423377
|
8cbab3c4de3328b576454ce702d7748f56c44346 |
|
23-Apr-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Linear scan: split at better positions. - Split at block entry to piggyback on control flow resolution. - Split at the loop header, if the split position is within a loop. Change-Id: I718299a58c02ee02a1b22bda589607c69a35f0e8
|
5b168deeae2c5a8a566ce5c140741f0e2227af21 |
|
27-Mar-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Fix user-build on fugu. Calling Delete on an array shifts the elements, so when iterating over inactives and removing entries we need to decrement the index for the found interval, but also for its potential other half. The code previously did not decrement for the other half. Change-Id: Idcb1533643c11a37ed4f459fe88aaef208a4bfd6
|
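The index bookkeeping described in the commit above can be illustrated with a small sketch. It assumes a std::vector in place of the array type used at the time, and the `Interval` struct, its fields, and `RemoveExpired` are hypothetical names chosen for the example, not ART code.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for an inactive interval.
struct Interval {
  int id;
  int other_half_id;  // -1 when the interval has no other half
  bool expired;
};

// Removes every expired interval together with its other half.
void RemoveExpired(std::vector<Interval>& inactive) {
  std::size_t i = 0;
  while (i < inactive.size()) {
    if (!inactive[i].expired) {
      ++i;
      continue;
    }
    const int other = inactive[i].other_half_id;
    inactive.erase(inactive.begin() + static_cast<std::ptrdiff_t>(i));
    // After the erase, i already indexes the element that followed.
    if (other < 0) {
      continue;
    }
    for (std::size_t j = 0; j < inactive.size(); ++j) {
      if (inactive[j].id == other) {
        inactive.erase(inactive.begin() + static_cast<std::ptrdiff_t>(j));
        // The other half sat before i, so everything shifted once more.
        if (j < i) {
          --i;
        }
        break;
      }
    }
  }
}
```

Without the `--i` adjustment, an expired interval sitting right after the removed pair would be skipped whenever the other half precedes the current position.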
234d69d075d1608f80adb647f7935077b62b6376 |
|
09-Mar-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Revert "Revert "[optimizing] Enable x86 long support."" This reverts commit 154552e666347d41d95d7619c6ee56249ff4feca. Change-Id: Idc726551c249a888b7ff5fde8508ae50e81b2e13
|
154552e666347d41d95d7619c6ee56249ff4feca |
|
06-Mar-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Revert "[optimizing] Enable x86 long support." A few libcore failures. This reverts commit b4ba354cf8d22b261205494875cc014f18587b50. Change-Id: I4a28d853e730dff9b69aec9555505803cf2fcd63
|
b4ba354cf8d22b261205494875cc014f18587b50 |
|
05-Mar-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
[optimizing] Enable x86 long support. Change-Id: I9006972a65a1f191c45691104a960366747f9d16
|
7c3952f423b8213083d60596a5f0bf4237ca3f7b |
|
20-Feb-2015 |
Andreas Gampe <agampe@google.com> |
ART: Add -Wunused Until the global CFLAGS are fixed, add Wunused. Fix declarations in the optimizing compiler. Change-Id: Ic4553f08e809dc54f3d82af57ac592622c98e000
|
776b3184ee04092b11edc781cdb81e8ed60601e3 |
|
23-Feb-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Each primitive kind now spills to different locations. Having different slots depending on the types greatly simplifies the parallel move resolver. It also avoids doing FPU <-> Core register swaps, and forcing backends to implement such a swap. Change-Id: Ide9f0452e7ccf9efb8adddbcc246d44b937b253c
|
6c2dff8ff8e1440fa4d9e1b2ba2a44d036882801 |
|
21-Jan-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Revert "Revert "Fully support pairs in the register allocator."" This reverts commit c399fdc442db82dfda66e6c25518872ab0f1d24f. Change-Id: I19f8215c4b98f2f0827e04bf7806c3ca439794e5
|
c399fdc442db82dfda66e6c25518872ab0f1d24f |
|
21-Jan-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Revert "Fully support pairs in the register allocator." Libcore tests fail. This reverts commit 41aedbb684ccef76ff8373f39aba606ce4cb3194. Change-Id: I2572f120d4bbaeb7a4d4cbfd47ab00c9ea39ac6c
|
41aedbb684ccef76ff8373f39aba606ce4cb3194 |
|
14-Jan-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Fully support pairs in the register allocator. Enabled on ARM for longs and doubles. Change-Id: Id8792d08bd7ca9fb049c5db8a40ae694bafc2d8b
|
dd8f887e81b894bc8075d8bacdb223747b6a8018 |
|
15-Jan-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Fix a bug in the register allocator. When allocating a register blocked by existing intervals, we need to split inactive intervals at the end of their lifetime hole, and not at the next intersection. Otherwise, the allocation for following intervals will not see that a register is being used by the split interval. Change-Id: I40cc79dde541c07392a7cf4c6f0b291dd1ce1819
|
f85a9ca9859ad843dc03d3a2b600afbaf2e9bbdd |
|
13-Jan-2015 |
Mark Mendell <mark.p.mendell@intel.com> |
[optimizing compiler] Compute live spill size The current stack frame calculation assumes that each live register to be saved/restored has the word size of the machine. This fails for X86, where a double in an XMM register takes up 8 bytes. Change the calculation to keep track of the number of core registers and number of fp registers to handle this distinction. This is slightly pessimal, as the registers may not be active at the same time, but the only way to handle this would be to allocate both classes of registers simultaneously, or remember all the active intervals, matching them up and compute the size of each safepoint interval. Change-Id: If7860aa319b625c214775347728cdf49a56946eb Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>
|
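The per-class calculation described in the commit above amounts to the following arithmetic. This is a sketch only: `LiveSpillSize` and its parameters are hypothetical, and the byte sizes in the usage note are the 32-bit x86 case the message mentions (4-byte core registers, 8-byte doubles in XMM registers).

```cpp
#include <cstddef>

// Hypothetical helper: bytes needed to save live registers when the two
// register classes are counted separately. The counts are the maximum
// number of live registers of each class, so the result can be an
// over-estimate when the two maxima occur at different safepoints.
std::size_t LiveSpillSize(std::size_t max_live_core,
                          std::size_t max_live_fp,
                          std::size_t core_bytes,
                          std::size_t fp_bytes) {
  return max_live_core * core_bytes + max_live_fp * fp_bytes;
}
```

With three live core registers and two live XMM registers, `LiveSpillSize(3, 2, 4, 8)` gives 28 bytes, whereas assuming one machine word per register would give only 20.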
840e5461a85f8908f51e7f6cd562a9129ff0e7ce |
|
07-Jan-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Implement double and float support for arm in register allocator. The basic approach is: - An instruction that needs two registers gets two intervals. - When allocating the low part, we also allocate the high part. - When splitting a low (or high) interval, we also split the high (or low) equivalent. - Allocation follows the (S/D register) requirement that low registers are always even and the high equivalent is low + 1. Change-Id: I06a5148e05a2ffc7e7555d08e871ed007b4c2797
|
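The pairing rule in the commit above (the low register is even and the high register is low + 1) can be sketched as a search over a free-register table. `FindFreePair` and the `is_free` vector are hypothetical names for illustration; the real allocator works on live intervals rather than a simple boolean table.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: returns the index of the low register of the first
// free, even-aligned pair (low, low + 1), or -1 if no such pair exists.
int FindFreePair(const std::vector<bool>& is_free) {
  for (std::size_t low = 0; low + 1 < is_free.size(); low += 2) {
    if (is_free[low] && is_free[low + 1]) {
      return static_cast<int>(low);
    }
  }
  return -1;
}
```

Stepping `low` by two enforces the alignment: registers 1 and 2 may both be free, but they are never returned as a pair.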
3e69f16ae3fddfd24f4f0e29deb106d564ab296c |
|
10-Dec-2014 |
Alexandre Rames <alexandre.rames@arm.com> |
Opt compiler: Add arm64 support for register allocation. Change-Id: Idc6e84eee66170de4a9c0a5844c3da038c083aa7
|
296bd60423e0630d8152b99fb7afb20fbff5a18a |
|
07-Oct-2014 |
Mingyao Yang <mingyao@google.com> |
Some improvement to reg alloc. Change-Id: If579a37791278500a7e5bc763f144c241f261920
|
102cbed1e52b7c5f09458b44903fe97bb3e14d5f |
|
15-Oct-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Implement register allocator for floating point registers. Also: - Fix misuses of emitting the rex prefix in the x86_64 assembler. - Fix movaps code generation in the x86_64 assembler. Change-Id: Ib6dcf6e7c4a9c43368cfc46b02ba50f69ae69cbe
|
677cd61ad05d993c4d3b22656675874f06d6aabc |
|
15-Oct-2014 |
Ian Rogers <irogers@google.com> |
Make ART compile with GCC -O0 again. Tidy up InstructionSetFeatures so that it has a type hierarchy dependent on architecture. Add to instruction_set_test to warn when InstructionSetFeatures don't agree with ones from system properties, AT_HWCAP and /proc/cpuinfo. Clean-up class linker entry point logic to not return entry points but to test whether the passed code is the particular entrypoint. This works around image trampolines that replicate entrypoints. Bug: 17993736 (cherry picked from commit 6f3dbbadf4ce66982eb3d400e0a74cb73eb034f3) Change-Id: I3e7595f437db4828072589d475a5453b7f31003e
|
6f3dbbadf4ce66982eb3d400e0a74cb73eb034f3 |
|
15-Oct-2014 |
Ian Rogers <irogers@google.com> |
Make ART compile with GCC -O0 again. Tidy up InstructionSetFeatures so that it has a type hierarchy dependent on architecture. Add to instruction_set_test to warn when InstructionSetFeatures don't agree with ones from system properties, AT_HWCAP and /proc/cpuinfo. Clean-up class linker entry point logic to not return entry points but to test whether the passed code is the particular entrypoint. This works around image trampolines that replicate entrypoints. Bug: 17993736 Change-Id: I5f4b49e88c3b02a79f9bee04f83395146ed7be23
|
740475d5f45b8caa2c3c6fc51e657ecf4f3547e5 |
|
29-Sep-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Fix a bug in the insertion of parallel move. To make sure we do not connect interval siblings in the same parallel move, I added a new field in MoveOperands that tells which instruction this move is for. A parallel move should not contain moves for the same instruction. The checks revealed a bug when connecting siblings, where we would choose the wrong parallel move. Change-Id: I70f27ec120886745c187071453c78da4c47c1dd2
|
3c04974a90b0e03f4b509010bff49f0b2a3da57f |
|
24-Sep-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Optimize suspend checks in optimizing compiler. - Remove the ones added during graph build (they were added for the baseline code generator). - Emit them at loop back edges after phi moves, so that the test can directly jump to the loop header. - Fix x86 and x86_64 suspend check by using cmpw instead of cmpl. Change-Id: I6fad5795a55705d86c9e1cb85bf5d63dadfafa2a
|
3bca0df855f0e575c6ee020ed016999fc8f14122 |
|
19-Sep-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Support for saving and restoring live registers in a slow path. And use it in suspend check slow paths. Change-Id: I79caf28f334c145a36180c79a6e2fceae3990c31
|
aac0f39a3501a7f7dd04b2342c2a16961969f139 |
|
16-Sep-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Fix a bug in the register allocator. We need to take the live interval that starts first to know until when a register is free, instead of using the live interval that is last in the inactive list. Change-Id: I2c9f87481ff1b4fc7b9948db7559b8d3b11d84ce
|
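The fix in the commit above can be read as "take the minimum start, not the last entry". A simplified sketch follows; `InactiveInterval`, `FreeUntil`, and the position type are hypothetical, and the sketch ignores the intersection test with the current interval that a full linear scan allocator would perform.

```cpp
#include <algorithm>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical stand-in: an inactive interval holding `reg` that becomes
// live again at position `next_start`.
struct InactiveInterval {
  int reg;
  std::size_t next_start;
};

// Returns the position until which `reg` is free: the earliest next start
// among the inactive intervals assigned to it, regardless of their order
// in the list. A register no interval holds is free "forever" (max value).
std::size_t FreeUntil(int reg, const std::vector<InactiveInterval>& inactive) {
  std::size_t free_until = std::numeric_limits<std::size_t>::max();
  for (const InactiveInterval& interval : inactive) {
    if (interval.reg == reg) {
      free_until = std::min(free_until, interval.next_start);
    }
  }
  return free_until;
}
```

Using only the last matching entry in the list would report a later position whenever an earlier entry starts first, which is the situation the commit message describes.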
3946844c34ad965515f677084b07d663d70ad1b8 |
|
02-Sep-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Runtime support for the new stack maps for the opt compiler. Now most of the methods supported by the compiler can be optimized, instead of using the baseline. Change-Id: I80ab36a34913fa4e7dd576c7bf55af63594dc1fa
|
93bedb7a96c8e6f9b6caa66689bf4f3c520bc234 |
|
18-Jul-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
We can also run the linear scan register allocator on thumb. Change-Id: I5d21b5cbcdd93ff36342111de4ebcaab172034dd
|
e63db27db913f1a88e2095a1ee8239b2bb9124e8 |
|
16-Jul-2014 |
Ian Rogers <irogers@google.com> |
Break apart header files. Create libart-gtest for common runtime and compiler gtest routines. Rename CompilerCallbacksImpl that is quick compiler specific. Rename trace clock source constants to not use the overloaded profiler term. Change-Id: I4aac4bdc7e7850c68335f81e59a390133b54e933
|
412f10cfed002ab617c78f2621d68446ca4dd8bd |
|
19-Jun-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Support longs in the register allocator for x86_64. Change-Id: I7fb6dfb761bc5cf9e5705682032855a0a70ca867
|
e27f31a81636ad74bd3376ee39cf215941b85c0e |
|
12-Jun-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Enable the register allocator on ARM. - Also fixes a few bugs/wrong assumptions in code not hit by x86. - We need to differentiate between moves due to connecting siblings within a block, and moves due to control flow resolution. Change-Id: Idd05cf138a71c8f36f5531c473de613c0166fe38
|
86dbb9a12119273039ce272b41c809fa548b37b6 |
|
04-Jun-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Final CL to enable register allocation on x86. This CL implements: 1) Resolution after allocation: connecting the locations allocated to an interval within a block and between blocks. 2) Handling of fixed registers: some instructions require inputs/output to be at a specific location, and the allocator needs to deal with them in a special way. 3) ParallelMoveResolver::EmitNativeCode for x86. Change-Id: I0da6bd7eb66877987148b87c3be6a983b4e3f858
|
ecb2f9ba57b08ceac4204ddd6a0a88a0524f8741 |
|
13-Jun-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Enable the register allocator on x86_64. Also fix an x86_64 assembler bug for movl. Change-Id: I8d17c68cd35ddd1d8df159f2d6173a013a7c3347
|
31d76b42ef5165351499da3f8ee0ac147428c5ed |
|
09-Jun-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Plug code generator into liveness analysis. Also implement spill slot support. Change-Id: If5e28811e9fbbf3842a258772c633318a2f4fafc
|
ffddfdf6fec0b9d98a692e27242eecb15af5ead2 |
|
03-Jun-2014 |
Tim Murray <timmurray@google.com> |
DO NOT MERGE Merge ART from AOSP to lmp-preview-dev. Change-Id: I0f578733a4b8756fd780d4a052ad69b746f687a9
|
a7062e05e6048c7f817d784a5b94e3122e25b1ec |
|
22-May-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Add a linear scan register allocator to the optimizing compiler. This is a "by-the-book" implementation. It currently only deals with allocating registers, with no hint optimizations. The changes remaining to make it functional are: - Allocate spill slots. - Resolution and placement of Move instructions. - Connect it to the code generator. Change-Id: Ie0b2f6ba1b98da85425be721ce4afecd6b4012a4
|