History log of /art/compiler/optimizing/code_generator.cc
Revision Date Author Comments
a1935c4fa255b5c20f5e9b2abce6be2d0f7cb0a8 26-Jun-2015 Roland Levillain <rpl@google.com> MIPS: Initial version of optimizing compiler for MIPS64R6.

(cherry picked from commit 4dda3376b71209fae07f5c3c8ac3eb4b54207aa8)
(amended for mnc-dev)

Bug: 21555893
Change-Id: I874dc356eee6ab061a32f8f3df5f8ac3a4ab7dcf
Signed-off-by: Alexey Frunze <Alexey.Frunze@imgtec.com>
Signed-off-by: Douglas Leung <douglas.leung@imgtec.com>
dd3c7d2d6124ceb346b4ed9aa7115f75fc6d3f9f 18-Jun-2015 David Brazdil <dbrazdil@google.com> ART: Remove old DCHECK that trips Baseline

Codegen verified that the entry block always falls through to the next
block. While this is the case with Optimizing, it doesn't hold for
Baseline but it doesn't need to since codegen handles it fine.

Bug:21913514
Change-Id: I751ef227e6cf103af3e7fc35fca4b01c663385a1
(cherry picked from commit 015c7e63604c038e866d7af3850c557403cddc8b)
3d21bdf8894e780d349c481e5c9e29fe1556051c 22-Apr-2015 Mathieu Chartier <mathieuc@google.com> Move mirror::ArtMethod to native

Optimizing + quick tests are passing, devices boot.

TODO: Test and fix bugs in mips64.

Saves 16 bytes per most ArtMethod, 7.5MB reduction in system PSS.
Some of the savings are from removal of virtual methods and direct
methods object arrays.

Bug: 19264997

(cherry picked from commit e401d146407d61eeb99f8d6176b2ac13c4df1e33)

Change-Id: I622469a0cfa0e7082a2119f3d6a9491eb61e3f3d

Fix some ArtMethod related bugs

Added root visiting for runtime methods, not currently required
since the GcRoots in these methods are null.

Added missing GetInterfaceMethodIfProxy in GetMethodLine, fixes
--trace run-tests 005, 044.

Fixed optimizing compiler bug where we used a normal stack location
instead of double on ARM64, this fixes the debuggable tests.

TODO: Fix JDWP tests.

Bug: 19264997

Change-Id: I7c55f69c61d1b45351fd0dc7185ffe5efad82bd3

ART: Fix casts for 64-bit pointers on 32-bit compiler.

Bug: 19264997
Change-Id: Ief45cdd4bae5a43fc8bfdfa7cf744e2c57529457

Fix JDWP tests after ArtMethod change

Fixes Throwable::GetStackDepth for exception event detection after
internal stack trace representation change.

Adds missing ArtMethod::GetInterfaceMethodIfProxy call in case of
proxy method.

Bug: 19264997
Change-Id: I363e293796848c3ec491c963813f62d868da44d2

Fix accidental IMT and root marking regression

Was always using the conflict trampoline. Also included fix for
regression in GC time caused by extra roots. Most of the regression
was IMT.

Fixed bug in DumpGcPerformanceInfo where we would get SIGABRT due to
detached thread.

EvaluateAndApplyChanges:
From ~2500 -> ~1980
GC time: 8.2s -> 7.2s due to 1s less of MarkConcurrentRoots

Bug: 19264997
Change-Id: I4333e80a8268c2ed1284f87f25b9f113d4f2c7e0

Fix bogus image test assert

Previously we were comparing the size of the non moving space to
size of the image file.

Now we properly compare the size of the image space against the size
of the image file.

Bug: 19264997
Change-Id: I7359f1f73ae3df60c5147245935a24431c04808a

[MIPS64] Fix art_quick_invoke_stub argument offsets.

ArtMethod reference's size got bigger, so we need to move other args
and leave enough space for ArtMethod* and 'this' pointer.

This fixes mips64 boot.

Bug: 19264997
Change-Id: I47198d5f39a4caab30b3b77479d5eedaad5006ab
0a23d74dc2751440822960eab218be4cb8843647 07-May-2015 Nicolas Geoffray <ngeoffray@google.com> Add a parent environment to HEnvironment.

This code has no functionality change. It adds a placeholder
for chaining inlined frames.

Change-Id: I5ec57335af76ee406052345b947aad98a6a4423a
3e3d73349a2de81d14e2279f60ffbd9ab3f3ac28 28-Apr-2015 Roland Levillain <rpl@google.com> Have HInvoke instructions know their number of actual arguments.

Add an art::HInvoke::GetNumberOfArguments routine so that
art::HInvoke and its subclasses can return the number of
actual arguments of the called method. Use it in code
generators and intrinsics handlers.

Consequently, no longer remove a clinit check as last input
of a static invoke if it is still present during baseline
code generation, but ensure that static invokes have no such
check as last input in optimized compilations.

Change-Id: Iaf9e07d1057a3b15b83d9638538c02b70211e476
4f46ac5179967dda5966f2dcecf2cf08977951ef 23-Apr-2015 Calin Juravle <calin@google.com> Cleanup and improve stack map stream

- transform AddStackMapEntry into BeginStackMapEntry/EndStackMapEntry.
This allows for nicer code and less assumptions when searching for equal
dex register maps.
- store the components sizes and their start positions as fields to
avoid re-computation.
- store the current stack map entry as a field to avoid the copy
semantic when updating its value in the stack maps array.
- remove redundant methods and fix visibility for the remaining ones.

Change-Id: Ica2d2969d7e15993bdbf8bc41d9df083cddafd24
641547a5f18ca2ea54469cceadcfef64f132e5e0 21-Apr-2015 Calin Juravle <calin@google.com> [optimizing] Fix a bug in moving the null check to the user.

When taking the decision to move a null check to the user we did not
verify if the next instruction checks the same object.

Change-Id: I2f4533a4bb18aa4b0b6d5e419f37dcccd60354d2
88c13cddc3a4184908662b0f3de796565d348c76 14-Apr-2015 Alexandre Rames <alexandre.rames@arm.com> Opt compiler: Correctly require register or FPU register.

Also add a check that location summary are correctly typed
with the HInstruction.

Change-Id: I699762ff4e8f4e321c7db01ea005236ea1934af9
9021825d1e73998b99c81e89c73796f6f2845471 15-Apr-2015 Nicolas Geoffray <ngeoffray@google.com> Type MoveOperands.

The ParallelMoveResolver implementation needs to know if a move
is for 64bits or not, to handle swaps correctly.

Bug found, and test case courtesy of Serguei I. Katkov.

Change-Id: I9a0917a1cfed398c07e57ad6251aea8c9b0b8506
c6b4dd8980350aaf250f0185f73e9c42ec17cd57 07-Apr-2015 David Srbecky <dsrbecky@google.com> Implement CFI for Optimizing.

CFI is necessary for stack unwinding in gdb, lldb, and libunwind.

Change-Id: I1a3480e3a4a99f48bf7e6e63c4e83a80cfee40a2
65b798ea10dd716c1bb3dda029f9bf255435af72 06-Apr-2015 Andreas Gampe <agampe@google.com> ART: Enable more Clang warnings

Change-Id: Ie6aba02f4223b1de02530e1515c63505f37e184c
fb8d279bc011b31d0765dc7ca59afea324fd0d0c 01-Apr-2015 Mark Mendell <mark.p.mendell@intel.com> [optimizing] Implement x86/x86_64 math intrinsics

Implement floor/ceil/round/RoundFloat on x86 and x86_64.
Implement RoundDouble on x86_64.

Add support for roundss and roundsd on both architectures. Support them
in the disassembler as well.

Add the instruction set features for x86, as the 'round' instruction is
only supported if SSE4.1 is supported.

Fix the tests to handle the addition of passing the instruction set
features to x86 and x86_64.

Add assembler tests for roundsd and roundss to x86_64 assembler tests.

Change-Id: I9742d5930befb0bbc23f3d6c83ce0183ed9fe04f
Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>
46e2a3915aa68c77426b71e95b9f3658250646b7 16-Mar-2015 David Brazdil <dbrazdil@google.com> ART: Boolean simplifier

The optimization recognizes the negation pattern generated by 'javac'
and replaces it with a single condition. To this end, boolean values
are now consistently assumed to be represented by an integer.

This is a first optimization which deletes blocks from the HGraph and
does so by replacing the corresponding entries with null. Hence,
existing code can continue indexing the list of blocks with the block
ID, but must check for null when iterating over the list.

Change-Id: I7779da69cfa925c6521938ad0bcc11bc52335583
da4d79bc9a4aeb9da7c6259ce4c9c1c3bf545eb8 24-Mar-2015 Roland Levillain <rpl@google.com> Unify ART's various implementations of bit_cast.

ART had several implementations of art::bit_cast:

1. one in runtime/base/casts.h, declared as:

template <class Dest, class Source>
inline Dest bit_cast(const Source& source);

2. another one in runtime/utils.h, declared as:

template<typename U, typename V>
static inline V bit_cast(U in);

3. and a third local version, in runtime/memory_region.h,
similar to the previous one:

template<typename Source, typename Destination>
static Destination MemoryRegion::local_bit_cast(Source in);

This CL removes versions 2. and 3. and changes their callers
to use 1. instead. That version was chosen over the others
as:
- it was the oldest one in the code base; and
- its syntax was closer to the standard C++ cast operators,
as it supports the following use:

bit_cast<Destination>(source)

since `Source' can be deduced from `source'.

Change-Id: I7334fd5d55bf0b8a0c52cb33cfbae6894ff83633
eeefa1276e83776f08704a3db4237423b0627e20 13-Mar-2015 Nicolas Geoffray <ngeoffray@google.com> Update locations of registers after slow paths spilling.

Change-Id: Id9aafcc13c1a085c17ce65d704c67b73f9de695d
fead4e4f397455aa31905b2982d4d861126ab89d 13-Mar-2015 Nicolas Geoffray <ngeoffray@google.com> [optimizing] Don't record None locations in the stack maps.

- moved environment recording from code generator to stack map stream
- added creation/loading factory methods for the DexRegisterMap (hides
internal details)
- added new tests

Change-Id: Ic8b6d044f0d8255c6759c19a41df332ef37876fe
a8ac9130b872c080299afacf5dcaab513d13ea87 13-Mar-2015 Nicolas Geoffray <ngeoffray@google.com> Refactor code in preparation of correct stack maps in slow path.

Move the logic of saving/restoring live registers in slow path
in the SlowPathCode method. Also add a RecordPcInfo helper to
SlowPathCode, that will act as the placeholder of saving correct
stack maps.

Change-Id: I25c2bc7a642ef854bbc8a3eb570e5c8c8d2d030c
a4d120c88e79eece333e66eec64c4e909d770e3e 13-Mar-2015 Nicolas Geoffray <ngeoffray@google.com> Fix build breakage.

Change-Id: I86959eca5d8f5458ff75c78776b0af9db9c26800
915b9d0c13bb5091875d868fbfa551d7b65d7477 11-Mar-2015 Nicolas Geoffray <ngeoffray@google.com> Tweak liveness when instructions are used in environments.

Instructions remain live when debuggable, but only instructions
with object types remain live when non-debuggable.

Enable StackVisitor::GetThisObject for optimizing.

Change-Id: Id87b2cbf33a02450059acc9993995782e5f28987
a2d8ec6876325e89e5d82f5dbeca59f96ced3ec1 12-Mar-2015 Roland Levillain <rpl@google.com> Compress the Dex register maps built by the optimizing compiler.

- Replace the current list-based (fixed-size) Dex register
encoding in stack maps emitted by the optimizing compiler
with another list-based variable-size Dex register
encoding compressing short locations on 1 byte (3 bits for
the location kind, 5 bits for the value); other (large)
values remain encoded on 5 bytes.
- In addition, use slot offsets instead of byte offsets to
encode the location of Dex registers placed in stack
slots at small offsets, as it enables more values to use
the short (1-byte wide) encoding instead of the large
(5-byte wide) one.
- Rename art::DexRegisterMap::LocationKind as
art::DexRegisterLocation::Kind, turn it into a
strongly-typed enum based on a uint8_t, and extend it to
support new kinds (kInStackLargeOffset and
kConstantLargeValue).
- Move art::DexRegisterEntry from
compiler/optimizing/stack_map_stream.h to
runtime/stack_map.h and rename it as
art::DexRegisterLocation.
- Adjust art::StackMapStream,
art::CodeGenerator::RecordPcInfo,
art::CheckReferenceMapVisitor::CheckOptimizedMethod,
art::StackVisitor::GetVRegFromOptimizedCode, and
art::StackVisitor::SetVRegFromOptimizedCode.
- Implement unaligned memory accesses in art::MemoryRegion.
- Use them to manipulate data in Dex register maps.
- Adjust oatdump to support the new Dex register encoding.
- Update compiler/optimizing/stack_map_test.cc.

Change-Id: Icefaa2e2b36b3c80bb1b882fe7ea2f77ba85c505
5f8741860d465410bfed495dbb5f794590d338da 04-Mar-2015 Mark Mendell <mark.p.mendell@intel.com> [optimizing] Use callee-save registers for x86

Add ESI, EDI, EBP to available registers for non-baseline mode. Ensure
that they aren't used when byte addressible registers are needed.

Change-Id: Ie7130d4084c2ae9cfcd1e47c26eb3e5dcac1ebd6
Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>
579885a26d761f5ba9550f2a1cd7f0f598c2e1e3 22-Feb-2015 Serban Constantinescu <serban.constantinescu@arm.com> Opt Compiler: ARM64: Enable explicit memory barriers over acquire/release

Implement remaining explicit memory barrier code paths and temporarily
enable the use of explicit memory barriers for testing.

This CL also enables the use of instruction set features in the ARM64
backend. kUseAcquireRelease has been replaced with PreferAcquireRelease(),
which for now is statically set to false (prefer explicit memory barriers).

Please note that we still prefer acquire-release for the ARM64 Optimizing
Compiler, but we would like to exercise the explicit memory barrier code
path too.

Change-Id: I84e047ecd43b6fbefc5b82cf532e3f5c59076458
Signed-off-by: Serban Constantinescu <serban.constantinescu@arm.com>
d6138ef1ea13d07ae555542f8898b30d89e9ac9a 18-Feb-2015 Nicolas Geoffray <ngeoffray@google.com> Ensure the graph is correctly typed.

We used to be forgiving because of HIntConstant(0) also being
used for null. We now create a special HNullConstant for such uses.

Also, we need to run the dead phi elimination twice during ssa
building to ensure the correctness.

Change-Id: If479efa3680d3358800aebb1cca692fa2d94f6e5
442b46a087c389a91a0b51547ac9205058432364 18-Feb-2015 Roland Levillain <rpl@google.com> Display optimizing compiler's CodeInfo objects in oatdump.

A few elements are not displayed yet (stack mask, inline info) though.

Change-Id: I5e51a801c580169abc5d1ef43ad581aadc110754
dc23d8318db08cb42e20f1d16dbc416798951a8b 16-Feb-2015 Nicolas Geoffray <ngeoffray@google.com> Avoid generating jmp +0.

When a block branches to a non-following block, but blocks
in-between do branch to it, we can avoid doing the branch.

Change-Id: I9b343f662a4efc718cd4b58168f93162a24e1219
c0572a451944f78397619dec34a38c36c11e9d2a 06-Feb-2015 Nicolas Geoffray <ngeoffray@google.com> Optimize leaf methods.

Avoid suspend checks and stack changes when not needed.

Change-Id: I0fdb31e8c631e99091b818874a558c9aa04b1628
829280cc90b7a84db42864589b4bafb4c94a79d9 28-Jan-2015 Nicolas Geoffray <ngeoffray@google.com> Finally implement Location::kNoOutputOverlap.

The [i, i + 1) interval scheme we chose for representing
lifetime positions is not optimal for doing this optimization.
It however doesn't prevent recognizing a non-split interval
during the TryAllocateFreeReg phase, and try to re-use
its inputs' registers.

Change-Id: I80a2823b0048d3310becfc5f5fb7b1230dfd8201
4c204bafbc8d596894f8cb8ec696f5be1c6f12d8 03-Feb-2015 Nicolas Geoffray <ngeoffray@google.com> Use a different block order when not compiling baseline.

Use the linearized order instead, as it puts blocks logically
next to each other in a better way. Also, it does not contain
dead blocks.

Change-Id: Ie65b56041a093c8155e6c1e06351cb36a4053505
4dee636d21d9ce54386cdfbb824e5eb2a9c1af0d 23-Jan-2015 Nicolas Geoffray <ngeoffray@google.com> Support callee-save registers on ARM.

Change-Id: I7c519b7a828c9891b1141a8e51e12d6a8bc84118
d97dc40d186aec46bfd318b6a2026a98241d7e9c 22-Jan-2015 Nicolas Geoffray <ngeoffray@google.com> Support callee save floating point registers on x64.

- Share the computation of core_spill_mask and fpu_spill_mask
between backends.
- Remove explicit stack overflow check support: we need to adjust
them and since they are not tested, they will easily bitrot.

Change-Id: I0b619b8de4e1bdb169ea1ae7c6ede8df0d65837a
988939683c26c0b1c8808fc206add6337319509a 21-Jan-2015 Nicolas Geoffray <ngeoffray@google.com> Enable core callee-save on x64.

Will work on other architectures and FP support in other CLs.

Change-Id: I8cef0343eedc7202d206f5217fdf0349035f0e4d
0ada95d8de4b04b5f201b4b7e9c3c2fd2cc321ae 04-Dec-2014 Jean Christophe Beyler <jean.christophe.beyler@intel.com> ART: Replace NULL to nullptr in the optimizing compiler

Replace macro NULL to the nullptr variation for C++.

Change-Id: Ib6e48dd4bb3c254343383011b67372622578ca76
Signed-off-by: Jean Christophe Beyler <jean.christophe.beyler@intel.com>
6c2dff8ff8e1440fa4d9e1b2ba2a44d036882801 21-Jan-2015 Nicolas Geoffray <ngeoffray@google.com> Revert "Revert "Fully support pairs in the register allocator.""

This reverts commit c399fdc442db82dfda66e6c25518872ab0f1d24f.

Change-Id: I19f8215c4b98f2f0827e04bf7806c3ca439794e5
77520bca97ec44e3758510cebd0f20e3bb4584ea 12-Jan-2015 Calin Juravle <calin@google.com> Record implicit null checks at the actual invoke time.

ImplicitNullChecks are recorded only for instructions directly (see NB
below) preceeded by NullChecks in the graph. This way we avoid recording
redundant safepoints and minimize the code size increase.

NB: ParallalelMoves might be inserted by the register allocator between
the NullChecks and their uses. These modify the environment and the
correct action would be to reverse their modification. This will be
addressed in a follow-up CL.

Change-Id: Ie50006e5a4bd22932dcf11348f5a655d253cd898
c399fdc442db82dfda66e6c25518872ab0f1d24f 21-Jan-2015 Nicolas Geoffray <ngeoffray@google.com> Revert "Fully support pairs in the register allocator."

Libcore tests fail.

This reverts commit 41aedbb684ccef76ff8373f39aba606ce4cb3194.

Change-Id: I2572f120d4bbaeb7a4d4cbfd47ab00c9ea39ac6c
41aedbb684ccef76ff8373f39aba606ce4cb3194 14-Jan-2015 Nicolas Geoffray <ngeoffray@google.com> Fully support pairs in the register allocator.

Enabled on ARM for longs and doubles.

Change-Id: Id8792d08bd7ca9fb049c5db8a40ae694bafc2d8b
cd6dffedf1bd8e6dfb3fb0c933551f9a90f7de3f 08-Jan-2015 Calin Juravle <calin@google.com> Add implicit null checks for the optimizing compiler

- for backends: arm, arm64, x86, x86_64
- fixed parameter passing for CodeGenerator
- 003-omnibus-opcodes test verifies that NullPointerExceptions work as
expected

Change-Id: I1b302acd353342504716c9169a80706cf3aba2c8
42d1f5f006c8bdbcbf855c53036cd50f9c69753e 16-Jan-2015 Nicolas Geoffray <ngeoffray@google.com> Do not use register pair in a parallel move.

The ParallelMoveResolver does not work with pairs. Instead,
decompose the pair into two individual moves.

Change-Id: Ie9d3f0b078cef8dc20640c98b20bb20cc4971a7f
f85a9ca9859ad843dc03d3a2b600afbaf2e9bbdd 13-Jan-2015 Mark Mendell <mark.p.mendell@intel.com> [optimizing compiler] Compute live spill size

The current stack frame calculation assumes that each live register to
be saved/restored has the word size of the machine. This fails for X86,
where a double in an XMM register takes up 8 bytes. Change the
calculation to keep track of the number of core registers and number of
fp registers to handle this distinction.

This is slightly pessimal, as the registers may not be active at the
same time, but the only way to handle this would be to allocate both
classes of registers simultaneously, or remember all the active
intervals, matching them up and compute the size of each safepoint
interval.

Change-Id: If7860aa319b625c214775347728cdf49a56946eb
Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>
12df9ebf72255544b0147c81b1dca6644a29764e 09-Jan-2015 Nicolas Geoffray <ngeoffray@google.com> Move code around in OptimizingCompiler::Compile to reduce stack space.

Also fix an (intentional) memory leak, by allocating the CodeGenerator
on the heap instead of the arena: they construct an Assembler object
that requires destruction.

BUG:18787334

Change-Id: I8cf0667cb70ce5b14d4ac334bd4487a562635f1b
840e5461a85f8908f51e7f6cd562a9129ff0e7ce 07-Jan-2015 Nicolas Geoffray <ngeoffray@google.com> Implement double and float support for arm in register allocator.

The basic approach is:
- An instruction that needs two registers gets two intervals.
- When allocating the low part, we also allocate the high part.
- When splitting a low (or high) interval, we also split the high
(or low) equivalent.
- Allocation follows the (S/D register) requirement that low
registers are always even and the high equivalent is low + 1.

Change-Id: I06a5148e05a2ffc7e7555d08e871ed007b4c2797
3416601a9e9be81bb7494864287fd3602d18ef13 19-Dec-2014 Calin Juravle <calin@google.com> Look at instruction set features when generating volatiles code

Change-Id: Ia882405719fdd60b63e4102af7e085f7cbe0bb2a
e21dc3db191df04c100620965bee4617b3b24397 09-Dec-2014 Andreas Gampe <agampe@google.com> ART: Swap-space in the compiler

Introduce a swap-space and corresponding allocator to transparently
switch native allocations to memory backed by a file.

Bug: 18596910

(cherry picked from commit 62746d8d9c4400e4764f162b22bfb1a32be287a9)

Change-Id: I131448f3907115054a592af73db86d2b9257ea33
5b4b898ed8725242ee6b7229b94467c3ea3054c8 18-Dec-2014 Nicolas Geoffray <ngeoffray@google.com> Revert "Don't block quick callee saved registers for optimizing."

X64 has one libcore test failing, and codegen_test on
arm is failing.

This reverts commit 6004796d6c630696127df2494dcd4f30d1367a34.

Change-Id: I20e00431fa18e11ce4c0cb6fffa91977fa8e9b4f
6004796d6c630696127df2494dcd4f30d1367a34 15-Dec-2014 Nicolas Geoffray <ngeoffray@google.com> Don't block quick callee saved registers for optimizing.

This change builds on:
https://android-review.googlesource.com/#/c/118983/

- Also fix x86_64 assembler bug triggered by this change.
- Fix (and improve) x86's backend byte register usage.
- Fix a bug in baseline register allocator: a fixed
out register must prevent inputs from allocating it.

Change-Id: I4883862e29b4e4b6470f1823cf7eab7e7863d8ad
e53798a7e3267305f696bf658e418c92e63e0834 01-Dec-2014 Nicolas Geoffray <ngeoffray@google.com> Inlining support in optimizing.

Currently only inlines simple things that don't require an
environment, such as:
- Returning a constant.
- Returning a parameter.
- Returning an arithmetic operation.

Change-Id: Ie844950cb44f69e104774a3cf7a8dea66bc85661
d2ec87d84057174d4884ee16f652cbcfd31362e9 08-Dec-2014 Calin Juravle <calin@google.com> [optimizing compiler] Add REM_FLOAT and REM_DOUBLE

- for arm, x86, x86_64 backends
- reinstated fmod quick entry points for x86. This is a partial revert
of bd3682eada753de52975ae2b4a712bd87dc139a6 which added inline assembly
for floting point rem on x86. Note that Quick still uses the inline
version.
- fix rem tests for longs

Change-Id: I73be19a9f2f2bcf3f718d9ca636e67bdd72b5440
624279f3c70f9904cbaf428078981b05d3b324c0 04-Dec-2014 Roland Levillain <rpl@google.com> Add support for float-to-long in the optimizing compiler.

- Add support for the float-to-long Dex instruction in the
optimizing compiler.
- Add a Dex PC field to art::HTypeConversion to allow the
x86 and ARM code generators to produce runtime calls.
- Instruct art::CodeGenerator::RecordPcInfo not to record
PC information for HTypeConversion instructions.
- Add S0 to the list of ARM FPU parameter registers.
- Have art::x86_64::X86_64Assembler::cvttss2si work with
64-bit operands.
- Generate x86, x86-64 and ARM (but not ARM64) code for
float to long HTypeConversion nodes.
- Add related tests to test/422-type-conversion.

Change-Id: I954214f0d537187883f83f7a83a1bb2dd8a21fd4
32f5b4d2c8c9b52e9522941c159577b21752d0fa 25-Nov-2014 Serban Constantinescu <serban.constantinescu@arm.com> Vixl: Update the VIXL interface to VIXL 1.7 and enable VIXL debug.

This patch updates the interface to VIXL 1.7 and enables the debug version of
VIXL when ART is built in debug mode.

Change-Id: I443fb941bec3cffefba7038f93bb972e6b7d8db5
Signed-off-by: Serban Constantinescu <serban.constantinescu@arm.com>
647b9ed41cdb7cf302fd356627a3ba372419b78c 27-Nov-2014 Roland Levillain <rpl@google.com> Add support for long-to-double in the optimizing compiler.

- Add support for the long-to-double Dex instruction in the
optimizing compiler.
- Enable requests of temporary FPU (double) registers during
code generation.
- Fix art::x86::X86Assembler::LoadLongConstant and extend
it to int64_t values.
- Have art::x86_64::X86_64Assembler::cvtsi2sd work with
64-bit operands.
- Generate x86, x86-64 and ARM (but not ARM64) code for
long to double HTypeConversion nodes.
- Add related tests to test/422-type-conversion.

Change-Id: Ie73d9e5e25bd2e15f585c371e8fc2dcb83438ccd
87d03761f35ad6cbe0bffbf1ec739875a471da6d 19-Nov-2014 Nicolas Geoffray <ngeoffray@google.com> Fix safepoint bug when computing live registers.

Change-Id: I8f28dd287c0e04223c49dea6a323058c1b210913
f97f9fbfdf7f2e23c662f21081fadee6af37809d 11-Nov-2014 Calin Juravle <calin@google.com> [optimizing compiler] add HTemporary support for long and doubles

Change-Id: I5247ecd71d0193050484b7632c804c9bfd20f924
f0e3937b87453234d0d7970b8712082062709b8d 12-Nov-2014 Nicolas Geoffray <ngeoffray@google.com> Do a parallel move in BoundsCheckSlowPath.

The two locations of the index and length could overlap,
so we need a parallel move. Also factorize the code for
doing a parallel move based on two locations.

Change-Id: Iee8b3459e2eed6704d45e9a564fb2cd050741ea4
52839d17c06175e19ca4a093fb878450d1c4310d 07-Nov-2014 Nicolas Geoffray <ngeoffray@google.com> Support invoke-interface in optimizing.

Change-Id: Ic18d7c3d2810557231caf0571956e0c431f5d384
f43083d560565aea46c602adb86423daeefe589d 07-Nov-2014 Nicolas Geoffray <ngeoffray@google.com> Do not update Out after it has a valid location.

Slow paths use LocationSummary to know where to move
things around, and they are executed at the end of the
code generation.

This fix is needed for https://android-review.googlesource.com/#/c/113345/.

Change-Id: Id336c6409479b1de6dc839b736a7234d08a7774a
de58ab2c03ff8112b07ab827c8fa38f670dfc656 05-Nov-2014 Nicolas Geoffray <ngeoffray@google.com> Implement try/catch/throw in optimizing.

- We currently don't run optimizations in the presence of a try/catch.
- We therefore implement Quick's mapping table.
- Also fix a missing null check on array-length.

Change-Id: I6917dfcb868e75c1cf6eff32b7cbb60b6cfbd68f
19a19cffd197a28ae4c9c3e59eff6352fd392241 22-Oct-2014 Nicolas Geoffray <ngeoffray@google.com> Add support for static fields in optimizing compiler.

Change-Id: Id2f010589e2bd6faf42c05bb33abf6816ebe9fa9
3c03503d66df3b4440f851ae7d0c4fae5e7872df 28-Oct-2014 Nicolas Geoffray <ngeoffray@google.com> Follow-up CL after hard float changes.

Addressing comments from Zheng Xu.

Change-Id: I8c599cdfab03373e82a1b90b711005c490bc6ca0
1ba0f596e9e4ddd778ab431237d11baa85594eba 27-Oct-2014 Nicolas Geoffray <ngeoffray@google.com> Support hard float on arm in optimizing compiler.

Also bump oat version, needed after latest hard float switch.

Change-Id: Idf5acfb36c07e74acff00edab998419a3c6b2965
5319defdf502fc4569316473846b83180ec08035 23-Oct-2014 Alexandre Rames <alexandre.rames@arm.com> ART: optimizing compiler: initial support for ARM64.

The ARM64 port uses VIXL for code generation, to which it defers work
like label binding and branch resolving, register type coherency
checking, and immediate values handling.

Change-Id: I0a44508c0c991f472a63e67b3469cdd878fe1a68
Signed-off-by: Serban Constantinescu <serban.constantinescu@arm.com>
Signed-off-by: Alexandre Rames <alexandre.rames@arm.com>
102cbed1e52b7c5f09458b44903fe97bb3e14d5f 15-Oct-2014 Nicolas Geoffray <ngeoffray@google.com> Implement register allocator for floating point registers.

Also:
- Fix misuses of emitting the rex prefix in the x86_64 assembler.
- Fix movaps code generation in the x86_64 assembler.

Change-Id: Ib6dcf6e7c4a9c43368cfc46b02ba50f69ae69cbe
92a73aef279be78e3c2b04db1713076183933436 16-Oct-2014 Nicolas Geoffray <ngeoffray@google.com> Don't use assembler classes in code_generator.h.

The arm64 backend uses its own assembler and does not share
the same classes as the other backends. To avoid conflicts
or unnecessary mappings, just don't use those classes in the
shared part of the code generator.

Change-Id: I9e5fa40c1021d2e83a4ef14c52cd1ccd03f2f73d
71175b7f19a4f6cf9cc264feafd820dbafa371fb 09-Oct-2014 Nicolas Geoffray <ngeoffray@google.com> Cleanup baseline register allocator.

- Use three arrays for blocking regsters instead of
one and computing offsets in that array.]
- Don't pass blocked_registers_ to methods, just use the field.

Change-Id: Ib698564c31127c59b5a64c80f4262394b8394dc6
56b9ee6fe1d6880c5fca0e7feb28b25a1ded2e2f 09-Oct-2014 Nicolas Geoffray <ngeoffray@google.com> Stop converting from Location to ManagedRegister.

Now the source of truth is the Location object that knows
which register (core, pair, fpu) it needs to refer to.

Change-Id: I62401343d7479ecfb24b5ed161ec7829cda5a0b1
7fb49da8ec62e8a10ed9419ade9f32c6b1174687 06-Oct-2014 Nicolas Geoffray <ngeoffray@google.com> Add support for floats and doubles.

- Follows Quick conventions.
- Currently only works with baseline register allocator.

Change-Id: Ie4b8e298f4f5e1cd82364da83e4344d4fc3621a3
3c04974a90b0e03f4b509010bff49f0b2a3da57f 24-Sep-2014 Nicolas Geoffray <ngeoffray@google.com> Optimize suspend checks in optimizing compiler.

- Remove the ones added during graph build (they were added
for the baseline code generator).
- Emit them at loop back edges after phi moves, so that the test
can directly jump to the loop header.
- Fix x86 and x86_64 suspend check by using cmpw instead of cmpl.

Change-Id: I6fad5795a55705d86c9e1cb85bf5d63dadfafa2a
3bca0df855f0e575c6ee020ed016999fc8f14122 19-Sep-2014 Nicolas Geoffray <ngeoffray@google.com> Support for saving and restoring live registers in a slow path.

And use it in suspend check slow paths.

Change-Id: I79caf28f334c145a36180c79a6e2fceae3990c31
8a16d97fb8f031822b206e65f9109a071da40563 11-Sep-2014 Nicolas Geoffray <ngeoffray@google.com> Fix valgrind errors.

For now just stack allocate the code generator. Will think
about cleaning up the root problem later (CodeGenerator being an
arena object).

Change-Id: I161a6f61c5f27ea88851b446f3c1e12ee9c594d7
3946844c34ad965515f677084b07d663d70ad1b8 02-Sep-2014 Nicolas Geoffray <ngeoffray@google.com> Runtime support for the new stack maps for the opt compiler.

Now most of the methods supported by the compiler can be optimized,
instead of using the baseline.

Change-Id: I80ab36a34913fa4e7dd576c7bf55af63594dc1fa
e3ea83811d47152c00abea24a9b420651a33b496 08-Aug-2014 Yevgeny Rouban <yevgeny.y.rouban@intel.com> ART source line debug info in OAT files

OAT files have source line information enough for ART runtime needs like
jump to/from interpreter and thread suspension. But this information
is not enough for finer grained source level debugging and low-level
profiling (VTune or perf).

This patch adds to OAT files two additional sections:
.debug_line - DWARF formatted Elf32 section with detailed source line
information (mapping from native PC to Java source lines).

In addition to the debugging symbols added using the dex2oat option
--include-debug-symbols, the source line information is added to
the section .debug_line.

The source line info can be read by many Elf reading tools like objdump,
readelf, dwarfdump, gdb, perf, VTune, ...

gdb can use this debug line information in x86. In 64-bit mode
the information can be used if the oat file is mapped in the lower
address space (address has higher 32 bits zeroed). Relocation works.

Testing:
1. art/test/run-test --host --gdb [--64] 001-HelloWorld
2. in gdb: break Main.java:19
3. in gdb: break Runtime.java:111
4. in gdb: run - stops at void java.lang.Runtime.<init>()
5. in gdb: backtrace - shows call stack down to main()
6. in gdb: continue - stops at void Main.main() (only in 32-bit mode)
7. in gdb: backtrace - shows call stack down to main()
8. objdump -W <oat-file> - addresses are from VMA range of .text
section reported by objdump -h <file>
9. dwarfdump -ka <oat-file> - no errors expected

Size of aosp-x86-eng boot.oat increased by 11% from 80.5Mb to 89.2Mb
with two sections added .debug_line (7.2Mb) and .rel.debug (1.5Mb).

Change-Id: Ib8828832686e49782a63d5529008ff4814ed9cda
Signed-off-by: Yevgeny Rouban <yevgeny.y.rouban@intel.com>
73e80c3ae76fafdb53afe3a85306dcb491fb5b00 22-Jul-2014 Nicolas Geoffray <ngeoffray@google.com> Make unit test tell if a method is a leaf.

The runtime is not initialized completely in gtests, so we
cannot run code (such as explicit stack overflow checks) that
look at tls values.

Change-Id: I74a4449b01eb203f1b411dda700e9459878d0d55
f12feb8e0e857f2832545b3f28d31bad5a9d3903 17-Jul-2014 Nicolas Geoffray <ngeoffray@google.com> Stack overflow checks and NPE checks for optimizing.

Change-Id: I59e97448bf29778769b79b51ee4ea43f43493d96
ab032bc1ff57831106fdac6a91a136293609401f 15-Jul-2014 Nicolas Geoffray <ngeoffray@google.com> Fix a braino in the stack layout.

Also do some refactoring to have this code be just in CodeGenerator.

Change-Id: I88de109889138af8d60027973c12a64bee813cb7
e50383288a75244255d3ecedcc79ffe9caf774cb 04-Jul-2014 Nicolas Geoffray <ngeoffray@google.com> Support fields in optimizing compiler.

- Required support for temporaries, to be only used by baseline compiler.
- Also fixed a few invalid assumptions around locations and instructions
that don't need materialization. These instructions should not have an Out.

Change-Id: Idc4a30dd95dd18015137300d36bec55fc024cf62
86dbb9a12119273039ce272b41c809fa548b37b6 04-Jun-2014 Nicolas Geoffray <ngeoffray@google.com> Final CL to enable register allocation on x86.

This CL implements:
1) Resolution after allocation: connecting the locations
allocated to an interval within a block and between blocks.
2) Handling of fixed registers: some instructions require
inputs/output to be at a specific location, and the allocator
needs to deal with them in a special way.
3) ParallelMoveResolver::EmitNativeCode for x86.

Change-Id: I0da6bd7eb66877987148b87c3be6a983b4e3f858
9cf35523764d829ae0470dae2d5dd99be469c841 09-Jun-2014 Nicolas Geoffray <ngeoffray@google.com> Add x86_64 support to the optimizing compiler.

Change-Id: I4462d9ae15be56c4a3dc1bd4d1c0c6548c1b94be
f635e63318447ca04731b265a86a573c9ed1737c 14-May-2014 Nicolas Geoffray <ngeoffray@google.com> Add a compilation tracing mechanism to the new compiler.

Code mostly imported from: https://android-review.googlesource.com/#/c/81653/.

Change-Id: I150fe942be0fb270e03fabb19032180f7a065d13
804d09372cc3d80d537da1489da4a45e0e19aa5d 02-May-2014 Nicolas Geoffray <ngeoffray@google.com> Build live-in, live-out and kill sets for each block.

This information will be used when computing live ranges of
instructions.

Change-Id: I345ee833c1ccb4a8e725c7976453f6d58d350d74
f529d776ca9f48b115714f6c79677755ecc37d24 02-May-2014 Nicolas Geoffray <ngeoffray@google.com> Make all registers available when allocating an output register.

On ARM we currently only have two register pairs available, so we
need to use one already used for an input.

Change-Id: I5411862310009a41e50ddab3549d3a9e9052266a
a7aca370a7d62ca04a1e24423d90e8020d6f1a58 28-Apr-2014 Nicolas Geoffray <ngeoffray@google.com> Setup policies for register allocation.

Change-Id: I857e77530fca3e2fb872fc142a916af1b48400dc
c32e770f21540e4e9eda6dc7f770e745d33f1b9f 24-Apr-2014 Nicolas Geoffray <ngeoffray@google.com> Add a Transform to SSA phase to the optimizing compiler.

Change-Id: Ia9700756a0396d797a00b529896487d52c989329
b55f835d66a61e5da6fc1895ba5a0482868c9552 07-Apr-2014 Nicolas Geoffray <ngeoffray@google.com> Test control flow instruction with optimizing compiler.

Add support for basic instructions to implement these tests.

Change-Id: I3870bf9301599043b3511522bb49dc6364c9b4c0
f583e5976e1de9aa206fb8de4f91000180685066 07-Apr-2014 Nicolas Geoffray <ngeoffray@google.com> Add support for taking parameters in optimizing compiler.

- Fix stack layout to mimic Quick's.
- Implement some sub operations.

Change-Id: I8cf75a4d29b662381a64f02c0bc61d859482fc4e
707c809f661554713edfacf338365adca8dfd3a3 04-Apr-2014 Nicolas Geoffray <ngeoffray@google.com> Use target-specific word instead of runtime word.

Change-Id: Ia11dc3cc520a1a5c7bd017013e5699af9570ce91
4a34a428c6a2588e0857ef6baf88f1b73ce65958 03-Apr-2014 Nicolas Geoffray <ngeoffray@google.com> Support passing arguments to invoke-static* instructions.

- Stop using the frame pointer for accessing locals.
- Stop emulating a stack when doing code generation. Instead,
rely on dex register model, where instructions only reference
registers.

Change-Id: Id51bd7d33ac430cb87a53c9f4b0c864eeb1006f9
6a58cb16d803c9a7b3a75ccac8be19dd9d4e520d 02-Apr-2014 Dmitry Petrochenko <dmitry.petrochenko@intel.com> art: Handle x86_64 architecture equal to x86

This patch forces FE/ME to treat x86_64 as x86 exactly.
The x86_64 logic will be revised later when assembly will be ready.

Change-Id: I4a92477a6eeaa9a11fd710d35c602d8d6f88cbb6
Signed-off-by: Dmitry Petrochenko <dmitry.petrochenko@intel.com>
8ccc3f5d06fd217cdaabd37e743adab2031d3720 19-Mar-2014 Nicolas Geoffray <ngeoffray@google.com> Add support for invoke-static in optimizing compiler.

Support is limited to calls without parameters and returning
void. For simplicity, we currently follow the Quick ABI.

Change-Id: I54805161141b7eac5959f1cae0dc138dd0b2e8a5
92cf83e001357329cbf41fa15a6e053fab6f4933 18-Mar-2014 Nicolas Geoffray <ngeoffray@google.com> Run Java tests with the optimizing compiler.

Also fix a vector.reserve -> vector.resize braino, and build
a GC map that dex2oat expects.

Change-Id: I6acf2f90a4c32f90b79bf7709bf2e43931b98757
787c3076635cf117eb646c5a89a9014b2072fb44 17-Mar-2014 Nicolas Geoffray <ngeoffray@google.com> Plug new optimizing compiler in compilation pipeline.

Also rename accessors to ART's conventions.

Change-Id: I344807055b98aa4b27215704ec362191464acecc
bab4ed7057799a4fadc6283108ab56f389d117d4 11-Mar-2014 Nicolas Geoffray <ngeoffray@google.com> More code generation for the optimizing compiler.

- Add HReturn instruction
- Generate code for locals/if/return
- Setup infrastructure for register allocation. Currently
emulate a stack.

Change-Id: Ib28c2dba80f6c526177ed9a7b09c0689ac8122fb
d4dd255db1d110ceb5551f6d95ff31fb57420994 28-Feb-2014 Nicolas Geoffray <ngeoffray@google.com> Add codegen support to the optimizing compiler.

Change-Id: I9aae76908ff1d6e64fb71a6718fc1426b67a5c28