History log of /art/compiler/optimizing/register_allocator.h
Revision Date Author Comments
2aaa4b5532d30c4e65d8892b556400bb61f9dc8c 17-Sep-2015 Vladimir Marko <vmarko@google.com> Optimizing: Tag more arena allocations.

Replace GrowableArray with ArenaVector and tag arena
allocations with new allocation types.

As part of this, make the register allocator a bit more
efficient, doing bulk insert/erase. Some loops are now
O(n) instead of O(n^2).
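
The move from O(n^2) to O(n) through bulk erase can be illustrated with a minimal sketch. It uses std::vector rather than ArenaVector, and the Interval struct and RemoveExpired function are hypothetical names chosen for illustration, not code from this change.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical stand-in for a live interval covering [start, end).
struct Interval {
  int start;
  int end;
};

// Removes every interval that ends at or before `position` in one pass.
// Erasing the elements one at a time would shift the tail of the vector
// on each call, which is quadratic in the worst case.
void RemoveExpired(std::vector<Interval>* active, int position) {
  active->erase(
      std::remove_if(active->begin(), active->end(),
                     [position](const Interval& interval) {
                       return interval.end <= position;
                     }),
      active->end());
}
```

std::remove_if compacts the surviving elements in a single sweep, so the vector tail is moved once instead of once per removed element.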

Change-Id: Ifac0871ffb34b121cc0447801a2d07eefd308c14
77a48ae01bbc5b05ca009cf09e2fcb53e4c8ff23 15-Sep-2015 David Brazdil <dbrazdil@google.com> Revert "Revert "ART: Register allocation and runtime support for try/catch""

The original CL triggered b/24084144 which has been fixed
by Ib72e12a018437c404e82f7ad414554c66a4c6f8c.

This reverts commit 659562aaf133c41b8d90ec9216c07646f0f14362.

Change-Id: Id8980436172457d0fcb276349c4405f7c4110a55
659562aaf133c41b8d90ec9216c07646f0f14362 14-Sep-2015 David Brazdil <dbrazdil@google.com> Revert "ART: Register allocation and runtime support for try/catch"

Breaks libcore test org.apache.harmony.security.tests.java.security.KeyStorePrivateKeyEntryTest#testGetCertificateChain. Need to investigate.

This reverts commit b022fa1300e6d78639b3b910af0cf85c43df44bb.

Change-Id: Ib24d3a80064d963d273e557a93469c95f37b1f6f
b022fa1300e6d78639b3b910af0cf85c43df44bb 20-Aug-2015 David Brazdil <dbrazdil@google.com> ART: Register allocation and runtime support for try/catch

This patch completes a series of CLs that add support for try/catch
in the Optimizing compiler. With it, Optimizing can compile all
methods containing try/catch, provided they don't contain catch loops.
Future work will focus on improving performance of the generated code.

SsaLivenessAnalysis was updated to propagate liveness information of
instructions live at catch blocks, and to keep location information on
instructions which may be caught by catch phis.

RegisterAllocator was extended to spill values used after catch, and
to allocate spill slots for catch phis. Catch phis generated for the
same vreg share a spill slot as the raw value must be the same.
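
The slot sharing described above might be sketched as a map from vreg number to spill slot. The class and method names below are assumptions made for illustration and do not come from the RegisterAllocator source.

```cpp
#include <cstddef>
#include <unordered_map>

// Hypothetical helper: hands out one spill slot per vreg, so that every
// catch phi created for the same vreg receives the same slot.
class CatchPhiSpillSlots {
 public:
  std::size_t SlotFor(int vreg) {
    auto it = slots_.find(vreg);
    if (it != slots_.end()) {
      return it->second;  // A slot was already assigned to this vreg.
    }
    std::size_t slot = slots_.size();  // Next unused slot index.
    slots_.emplace(vreg, slot);
    return slot;
  }

 private:
  std::unordered_map<int, std::size_t> slots_;
};
```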

Location builders and slow paths were updated to reflect the fact that
throwing an exception may not lead to escaping the method.

Instruction code generators are forbidden from using implicit null
checks in try blocks as live registers need to be saved before handing
over to the runtime.

CodeGenerator emits a stack map for each catch block, storing locations
of catch phis. CodeInfo and StackMapStream recognize this new type of
stack map and store them separately from other stack maps to avoid dex_pc
conflicts.

After having found the target catch block to deliver an exception to,
QuickExceptionHandler looks up the dex register maps at the throwing
instruction and the catch block and copies the values over to their
respective locations.

The runtime-support approach was selected because it allows for the
best performance in the normal control-flow path, since no propagation
of catch phi values is necessary until the exception is thrown. In
addition, it also greatly simplifies the register allocation phase.

ConstantHoisting was removed from LICMTest because it instantiated
(now abstract) HConstant and was bogus anyway (constants are always in
the entry block).

Change-Id: Ie31038ad8e3ee0c13a5bbbbaf5f0b3e532310e4e
41b175aba41c9365a1c53b8a1afbd17129c87c14 19-May-2015 Vladimir Marko <vmarko@google.com> ART: Clean up arm64 kNumberOfXRegisters usage.

Avoid undefined behavior for arm64 stemming from 1u << 32 in
loops with upper bound kNumberOfXRegisters.

Create iterators for enumerating bits in an integer either
from high to low or from low to high and use them for
<arch>Context::FillCalleeSaves() on all architectures.
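
A minimal sketch of the idea, under stated assumptions: a loop that tests `mask & (1u << i)` evaluates `1u << 32` as soon as its upper bound exceeds 31, which is undefined for a 32-bit operand. Iterating over the set bits themselves avoids the problem. SetBitsLowToHigh is a hypothetical name and is not the iterator added to bit_utils.h.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical helper: returns the indexes of the set bits in `mask`,
// lowest bit first. The mask is shifted right by one per step, so the
// shift amount never reaches the width of the type.
std::vector<int> SetBitsLowToHigh(std::uint32_t mask) {
  std::vector<int> bits;
  for (int index = 0; mask != 0u; ++index, mask >>= 1) {
    if ((mask & 1u) != 0u) {
      bits.push_back(index);
    }
  }
  return bits;
}
```

The loop terminates when no set bits remain, so it is independent of any register-count constant.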

Refactor runtime/utils.{h,cc} by moving all bit-fiddling
functions to runtime/base/bit_utils.{h,cc} (together with
the new bit iterators) and all time-related functions to
runtime/base/time_utils.{h,cc}. Improve test coverage and
fix some corner cases for the bit-fiddling functions.

Bug: 13925192

(cherry picked from commit 80afd02024d20e60b197d3adfbb43cc303cf29e0)

Change-Id: I905257a21de90b5860ebe1e39563758f721eab82
80afd02024d20e60b197d3adfbb43cc303cf29e0 19-May-2015 Vladimir Marko <vmarko@google.com> ART: Clean up arm64 kNumberOfXRegisters usage.

Avoid undefined behavior for arm64 stemming from 1u << 32 in
loops with upper bound kNumberOfXRegisters.

Create iterators for enumerating bits in an integer either
from high to low or from low to high and use them for
<arch>Context::FillCalleeSaves() on all architectures.

Refactor runtime/utils.{h,cc} by moving all bit-fiddling
functions to runtime/base/bit_utils.{h,cc} (together with
the new bit iterators) and all time-related functions to
runtime/base/time_utils.{h,cc}. Improve test coverage and
fix some corner cases for the bit-fiddling functions.

Bug: 13925192
Change-Id: I704884dab15b41ecf7a1c47d397ab1c3fc7ee0f7
8826f67ad53099021f6442364348fa66729288d7 17-Apr-2015 Nicolas Geoffray <ngeoffray@google.com> Callee/caller save logic in register allocator.

Prevent intervals that do not span a 'will-call' safepoint
from allocating a callee-save register when caller-saves
are available.
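
One way to picture this policy is the sketch below. PickRegister and its parameters are hypothetical, the two vectors are assumed to have the same length, and the behavior for intervals that do span a call (take the first free register) is an assumption rather than something stated in this change.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical register picker. Returns the chosen register index, or -1
// if no register is free.
int PickRegister(const std::vector<bool>& is_free,
                 const std::vector<bool>& is_callee_save,
                 bool spans_call) {
  if (!spans_call) {
    // No call is crossed, so a caller-save register costs nothing extra.
    for (std::size_t reg = 0; reg < is_free.size(); ++reg) {
      if (is_free[reg] && !is_callee_save[reg]) {
        return static_cast<int>(reg);
      }
    }
  }
  // Fall back to any free register (assumed behavior, see above).
  for (std::size_t reg = 0; reg < is_free.size(); ++reg) {
    if (is_free[reg]) {
      return static_cast<int>(reg);
    }
  }
  return -1;
}
```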

Change-Id: I6e613ab54b087f433bbc433aa62847fbca423377
8cbab3c4de3328b576454ce702d7748f56c44346 23-Apr-2015 Nicolas Geoffray <ngeoffray@google.com> Linear scan: split at better positions.

- Split at block entry to piggyback on control flow resolution.
- Split at the loop header, if the split position is within a loop.

Change-Id: I718299a58c02ee02a1b22bda589607c69a35f0e8
5b168deeae2c5a8a566ce5c140741f0e2227af21 27-Mar-2015 Nicolas Geoffray <ngeoffray@google.com> Fix user-build on fugu.

Calling Delete on an array shifts the elements, so when iterating
over inactives and removing entries we need to decrement for
the found interval, but also its potential other half. The code
used to not decrement for the other half.
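
The underlying pitfall can be shown with a simplified sketch that uses std::vector<int> instead of the allocator's interval list. EraseMatching is a hypothetical name. The index stays in place after each erase, which has the same effect as decrementing it once per removed element.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: removes every element equal to `target` while
// iterating by index.
std::vector<int> EraseMatching(std::vector<int> values, int target) {
  std::size_t i = 0;
  while (i < values.size()) {
    if (values[i] == target) {
      // erase() shifts the following elements down by one, so the next
      // candidate now sits at index i. Do not advance.
      values.erase(values.begin() + static_cast<std::ptrdiff_t>(i));
    } else {
      ++i;
    }
  }
  return values;
}
```

Advancing the index after an erase would skip the element that moved into the vacated position, which is the kind of miss the commit message describes.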

Change-Id: Idcb1533643c11a37ed4f459fe88aaef208a4bfd6
234d69d075d1608f80adb647f7935077b62b6376 09-Mar-2015 Nicolas Geoffray <ngeoffray@google.com> Revert "Revert "[optimizing] Enable x86 long support.""

This reverts commit 154552e666347d41d95d7619c6ee56249ff4feca.

Change-Id: Idc726551c249a888b7ff5fde8508ae50e81b2e13
154552e666347d41d95d7619c6ee56249ff4feca 06-Mar-2015 Nicolas Geoffray <ngeoffray@google.com> Revert "[optimizing] Enable x86 long support."

A few libcore failures.

This reverts commit b4ba354cf8d22b261205494875cc014f18587b50.

Change-Id: I4a28d853e730dff9b69aec9555505803cf2fcd63
b4ba354cf8d22b261205494875cc014f18587b50 05-Mar-2015 Nicolas Geoffray <ngeoffray@google.com> [optimizing] Enable x86 long support.

Change-Id: I9006972a65a1f191c45691104a960366747f9d16
7c3952f423b8213083d60596a5f0bf4237ca3f7b 20-Feb-2015 Andreas Gampe <agampe@google.com> ART: Add -Wunused

Until the global CFLAGS are fixed, add Wunused. Fix declarations
in the optimizing compiler.

Change-Id: Ic4553f08e809dc54f3d82af57ac592622c98e000
776b3184ee04092b11edc781cdb81e8ed60601e3 23-Feb-2015 Nicolas Geoffray <ngeoffray@google.com> Each primitive kind now spills to different locations.

Having different slots depending on the types greatly simplifies
the parallel move resolver. It also avoids doing FPU <-> Core
register swaps, and forcing backends to implement such a swap.

Change-Id: Ide9f0452e7ccf9efb8adddbcc246d44b937b253c
6c2dff8ff8e1440fa4d9e1b2ba2a44d036882801 21-Jan-2015 Nicolas Geoffray <ngeoffray@google.com> Revert "Revert "Fully support pairs in the register allocator.""

This reverts commit c399fdc442db82dfda66e6c25518872ab0f1d24f.

Change-Id: I19f8215c4b98f2f0827e04bf7806c3ca439794e5
c399fdc442db82dfda66e6c25518872ab0f1d24f 21-Jan-2015 Nicolas Geoffray <ngeoffray@google.com> Revert "Fully support pairs in the register allocator."

Libcore tests fail.

This reverts commit 41aedbb684ccef76ff8373f39aba606ce4cb3194.

Change-Id: I2572f120d4bbaeb7a4d4cbfd47ab00c9ea39ac6c
41aedbb684ccef76ff8373f39aba606ce4cb3194 14-Jan-2015 Nicolas Geoffray <ngeoffray@google.com> Fully support pairs in the register allocator.

Enabled on ARM for longs and doubles.

Change-Id: Id8792d08bd7ca9fb049c5db8a40ae694bafc2d8b
dd8f887e81b894bc8075d8bacdb223747b6a8018 15-Jan-2015 Nicolas Geoffray <ngeoffray@google.com> Fix a bug in the register allocator.

When allocating a register blocked by existing intervals,
we need to split inactive intervals at the end of their
lifetime hole, and not at the next intersection. Otherwise,
the allocation for following intervals will not see
that a register is being used by the split interval.

Change-Id: I40cc79dde541c07392a7cf4c6f0b291dd1ce1819
f85a9ca9859ad843dc03d3a2b600afbaf2e9bbdd 13-Jan-2015 Mark Mendell <mark.p.mendell@intel.com> [optimizing compiler] Compute live spill size

The current stack frame calculation assumes that each live register to
be saved/restored has the word size of the machine. This fails for X86,
where a double in an XMM register takes up 8 bytes. Change the
calculation to keep track of the number of core registers and number of
fp registers to handle this distinction.
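
The difference between the two calculations can be sketched as follows. The function names are hypothetical, and the register counts and sizes in the usage note are illustrative values, not measurements.

```cpp
#include <cstddef>

// Old assumption: every saved register occupies one machine word.
std::size_t SpillSizeAssumingWords(std::size_t core_regs,
                                   std::size_t fp_regs,
                                   std::size_t word_size) {
  return (core_regs + fp_regs) * word_size;
}

// Per-class calculation: core and fp registers are counted separately
// and each class contributes its own slot size.
std::size_t SpillSizeByClass(std::size_t core_regs,
                             std::size_t fp_regs,
                             std::size_t core_slot_size,
                             std::size_t fp_slot_size) {
  return core_regs * core_slot_size + fp_regs * fp_slot_size;
}
```

For example, with 3 core registers of 4 bytes and 2 fp registers of 8 bytes, the word-based formula reserves 20 bytes while the per-class formula reserves 28.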

This is slightly pessimal, as the registers may not be active at the
same time, but the only way to handle this would be to allocate both
classes of registers simultaneously, or remember all the active
intervals, matching them up and compute the size of each safepoint
interval.

Change-Id: If7860aa319b625c214775347728cdf49a56946eb
Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>
840e5461a85f8908f51e7f6cd562a9129ff0e7ce 07-Jan-2015 Nicolas Geoffray <ngeoffray@google.com> Implement double and float support for arm in register allocator.

The basic approach is:
- An instruction that needs two registers gets two intervals.
- When allocating the low part, we also allocate the high part.
- When splitting a low (or high) interval, we also split the high
(or low) equivalent.
- Allocation follows the (S/D register) requirement that low
registers are always even and the high equivalent is low + 1.
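
The even/odd pairing rule in the last item might look like this in a minimal sketch. FindFreeAlignedPair is a hypothetical name, and the `blocked` vector stands in for whatever bookkeeping the allocator actually uses.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: returns the lowest even register index `low` such
// that both `low` and `low + 1` are unblocked, or -1 if no such pair exists.
int FindFreeAlignedPair(const std::vector<bool>& blocked) {
  for (std::size_t low = 0; low + 1 < blocked.size(); low += 2) {
    if (!blocked[low] && !blocked[low + 1]) {
      return static_cast<int>(low);
    }
  }
  return -1;
}
```

Stepping by two keeps the low half even, and the high half is always low + 1.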

Change-Id: I06a5148e05a2ffc7e7555d08e871ed007b4c2797
3e69f16ae3fddfd24f4f0e29deb106d564ab296c 10-Dec-2014 Alexandre Rames <alexandre.rames@arm.com> Opt compiler: Add arm64 support for register allocation.

Change-Id: Idc6e84eee66170de4a9c0a5844c3da038c083aa7
296bd60423e0630d8152b99fb7afb20fbff5a18a 07-Oct-2014 Mingyao Yang <mingyao@google.com> Some improvement to reg alloc.

Change-Id: If579a37791278500a7e5bc763f144c241f261920
102cbed1e52b7c5f09458b44903fe97bb3e14d5f 15-Oct-2014 Nicolas Geoffray <ngeoffray@google.com> Implement register allocator for floating point registers.

Also:
- Fix misuses of emitting the rex prefix in the x86_64 assembler.
- Fix movaps code generation in the x86_64 assembler.

Change-Id: Ib6dcf6e7c4a9c43368cfc46b02ba50f69ae69cbe
677cd61ad05d993c4d3b22656675874f06d6aabc 15-Oct-2014 Ian Rogers <irogers@google.com> Make ART compile with GCC -O0 again.

Tidy up InstructionSetFeatures so that it has a type hierarchy dependent on
architecture.
Add to instruction_set_test to warn when InstructionSetFeatures don't agree
with ones from system properties, AT_HWCAP and /proc/cpuinfo.
Clean-up class linker entry point logic to not return entry points but to
test whether the passed code is the particular entrypoint. This works around
image trampolines that replicate entrypoints.
Bug: 17993736

(cherry picked from commit 6f3dbbadf4ce66982eb3d400e0a74cb73eb034f3)

Change-Id: I3e7595f437db4828072589d475a5453b7f31003e
6f3dbbadf4ce66982eb3d400e0a74cb73eb034f3 15-Oct-2014 Ian Rogers <irogers@google.com> Make ART compile with GCC -O0 again.

Tidy up InstructionSetFeatures so that it has a type hierarchy dependent on
architecture.
Add to instruction_set_test to warn when InstructionSetFeatures don't agree
with ones from system properties, AT_HWCAP and /proc/cpuinfo.
Clean-up class linker entry point logic to not return entry points but to
test whether the passed code is the particular entrypoint. This works around
image trampolines that replicate entrypoints.
Bug: 17993736

Change-Id: I5f4b49e88c3b02a79f9bee04f83395146ed7be23
740475d5f45b8caa2c3c6fc51e657ecf4f3547e5 29-Sep-2014 Nicolas Geoffray <ngeoffray@google.com> Fix a bug in the insertion of parallel move.

To make sure we do not connect interval siblings in the
same parallel move, I added a new field in MoveOperands
that tells which instruction this move is for.
A parallel move should not contain moves for the same instruction.

The checks revealed a bug when connecting siblings, where
we would choose the wrong parallel move.

Change-Id: I70f27ec120886745c187071453c78da4c47c1dd2
3c04974a90b0e03f4b509010bff49f0b2a3da57f 24-Sep-2014 Nicolas Geoffray <ngeoffray@google.com> Optimize suspend checks in optimizing compiler.

- Remove the ones added during graph build (they were added
for the baseline code generator).
- Emit them at loop back edges after phi moves, so that the test
can directly jump to the loop header.
- Fix x86 and x86_64 suspend check by using cmpw instead of cmpl.

Change-Id: I6fad5795a55705d86c9e1cb85bf5d63dadfafa2a
3bca0df855f0e575c6ee020ed016999fc8f14122 19-Sep-2014 Nicolas Geoffray <ngeoffray@google.com> Support for saving and restoring live registers in a slow path.

And use it in suspend check slow paths.

Change-Id: I79caf28f334c145a36180c79a6e2fceae3990c31
aac0f39a3501a7f7dd04b2342c2a16961969f139 16-Sep-2014 Nicolas Geoffray <ngeoffray@google.com> Fix a bug in the register allocator.

We need to take the live interval that starts first to know
until when a register is free, instead of using the live interval
that is last in the inactive list.
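
A simplified sketch of the corrected rule: the position until which a register is free is the earliest next start among the inactive intervals holding it, regardless of their order in the list. The InactiveUse struct and FreeUntil function are hypothetical and much simpler than the allocator's own data structures.

```cpp
#include <algorithm>
#include <limits>
#include <vector>

// Hypothetical record: an inactive interval assigned to `reg` that
// becomes live again at `next_start`.
struct InactiveUse {
  int reg;
  int next_start;
};

// Returns the earliest position at which `reg` is needed again, or the
// maximum int value if no inactive interval uses it.
int FreeUntil(const std::vector<InactiveUse>& inactive, int reg) {
  int free_until = std::numeric_limits<int>::max();
  for (const InactiveUse& use : inactive) {
    if (use.reg == reg) {
      free_until = std::min(free_until, use.next_start);
    }
  }
  return free_until;
}
```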

Change-Id: I2c9f87481ff1b4fc7b9948db7559b8d3b11d84ce
3946844c34ad965515f677084b07d663d70ad1b8 02-Sep-2014 Nicolas Geoffray <ngeoffray@google.com> Runtime support for the new stack maps for the opt compiler.

Now most of the methods supported by the compiler can be optimized,
instead of using the baseline.

Change-Id: I80ab36a34913fa4e7dd576c7bf55af63594dc1fa
93bedb7a96c8e6f9b6caa66689bf4f3c520bc234 18-Jul-2014 Nicolas Geoffray <ngeoffray@google.com> We can also run the linear scan register allocator on thumb.

Change-Id: I5d21b5cbcdd93ff36342111de4ebcaab172034dd
e63db27db913f1a88e2095a1ee8239b2bb9124e8 16-Jul-2014 Ian Rogers <irogers@google.com> Break apart header files.

Create libart-gtest for common runtime and compiler gtest routines.
Rename CompilerCallbacksImpl that is quick compiler specific.
Rename trace clock source constants to not use the overloaded profiler term.

Change-Id: I4aac4bdc7e7850c68335f81e59a390133b54e933
412f10cfed002ab617c78f2621d68446ca4dd8bd 19-Jun-2014 Nicolas Geoffray <ngeoffray@google.com> Support longs in the register allocator for x86_64.

Change-Id: I7fb6dfb761bc5cf9e5705682032855a0a70ca867
e27f31a81636ad74bd3376ee39cf215941b85c0e 12-Jun-2014 Nicolas Geoffray <ngeoffray@google.com> Enable the register allocator on ARM.

- Also fixes a few bugs/wrong assumptions in code not hit by x86.
- We need to differentiate between moves due to connecting siblings within
a block, and moves due to control flow resolution.

Change-Id: Idd05cf138a71c8f36f5531c473de613c0166fe38
86dbb9a12119273039ce272b41c809fa548b37b6 04-Jun-2014 Nicolas Geoffray <ngeoffray@google.com> Final CL to enable register allocation on x86.

This CL implements:
1) Resolution after allocation: connecting the locations
allocated to an interval within a block and between blocks.
2) Handling of fixed registers: some instructions require
inputs/output to be at a specific location, and the allocator
needs to deal with them in a special way.
3) ParallelMoveResolver::EmitNativeCode for x86.

Change-Id: I0da6bd7eb66877987148b87c3be6a983b4e3f858
ecb2f9ba57b08ceac4204ddd6a0a88a0524f8741 13-Jun-2014 Nicolas Geoffray <ngeoffray@google.com> Enable the register allocator on x86_64.

Also fix an x86_64 assembler bug for movl.

Change-Id: I8d17c68cd35ddd1d8df159f2d6173a013a7c3347
31d76b42ef5165351499da3f8ee0ac147428c5ed 09-Jun-2014 Nicolas Geoffray <ngeoffray@google.com> Plug code generator into liveness analysis.

Also implement spill slot support.

Change-Id: If5e28811e9fbbf3842a258772c633318a2f4fafc
ffddfdf6fec0b9d98a692e27242eecb15af5ead2 03-Jun-2014 Tim Murray <timmurray@google.com> DO NOT MERGE

Merge ART from AOSP to lmp-preview-dev.

Change-Id: I0f578733a4b8756fd780d4a052ad69b746f687a9
a7062e05e6048c7f817d784a5b94e3122e25b1ec 22-May-2014 Nicolas Geoffray <ngeoffray@google.com> Add a linear scan register allocator to the optimizing compiler.

This is a "by-the-book" implementation. It currently only deals
with allocating registers, with no hint optimizations.

The changes remaining to make it functional are:
- Allocate spill slots.
- Resolution and placements of Move instructions.
- Connect it to the code generator.
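
For readers unfamiliar with the technique, a textbook-style linear scan can be sketched in a few lines. This is a generic illustration, not the ART implementation: LiveInterval and LinearScan are hypothetical names, intervals are half-open [start, end), and an interval that finds no free register is simply left with reg == -1 rather than being split or given a spill slot.

```cpp
#include <algorithm>
#include <vector>

struct LiveInterval {
  int start;
  int end;
  int reg;  // Assigned register, or -1 if none was available.
};

void LinearScan(std::vector<LiveInterval>* intervals, int num_regs) {
  std::sort(intervals->begin(), intervals->end(),
            [](const LiveInterval& a, const LiveInterval& b) {
              return a.start < b.start;
            });
  std::vector<int> free_regs;
  for (int reg = num_regs - 1; reg >= 0; --reg) {
    free_regs.push_back(reg);  // Lowest register ends up at the back.
  }
  std::vector<LiveInterval*> active;
  for (LiveInterval& current : *intervals) {
    current.reg = -1;
    // Expire intervals that ended before the current one starts.
    std::vector<LiveInterval*> still_active;
    for (LiveInterval* other : active) {
      if (other->end <= current.start) {
        free_regs.push_back(other->reg);
      } else {
        still_active.push_back(other);
      }
    }
    active.swap(still_active);
    if (!free_regs.empty()) {
      current.reg = free_regs.back();
      free_regs.pop_back();
      active.push_back(&current);
    }
  }
}
```

The vector is not resized after sorting, so the pointers kept in `active` remain valid for the duration of the scan.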

Change-Id: Ie0b2f6ba1b98da85425be721ce4afecd6b4012a4