History log of /art/compiler/optimizing/ssa_liveness_analysis.cc
Revision Date Author Comments
2dd7b672ea0afd7ea4448b43d24829e9886de3af 07-Dec-2017 Aart Bik <ajcbik@google.com> Fixed spilling bug (visible on ARM64): missed SIMD type.

Test: test-art-host test-art-target
Change-Id: I6f321446f54943e02f250732ec9da729f633c3a9
e764d2e50c544c2cb98ee61a15d613161ac6bd17 05-Oct-2017 Vladimir Marko <vmarko@google.com> Use ScopedArenaAllocator for register allocation.

Memory needed to compile the two most expensive methods for
aosp_angler-userdebug boot image:
BatteryStats.dumpCheckinLocked() : 25.1MiB -> 21.1MiB
BatteryStats.dumpLocked(): 49.6MiB -> 42.0MiB
This is because all the memory previously used by Scheduler
is reused by the register allocator; the register allocator
has a higher peak usage of the ArenaStack.

And continue the "arena"->"allocator" renaming.

Test: m test-art-host-gtest
Test: testrunner.py --host
Bug: 64312607
Change-Id: Idfd79a9901552b5147ec0bf591cb38120de86b01
ca6fff898afcb62491458ae8bcd428bfb3043da1 03-Oct-2017 Vladimir Marko <vmarko@google.com> ART: Use ScopedArenaAllocator for pass-local data.

Passes using local ArenaAllocator were hiding their memory
usage from the allocation counting, making it difficult to
track down where memory was used. Using ScopedArenaAllocator
reveals the memory usage.

This changes the HGraph constructor which requires a lot of
changes in tests. Refactor these tests to limit the amount
of work needed the next time we change that constructor.

Test: m test-art-host-gtest
Test: testrunner.py --host
Test: Build with kArenaAllocatorCountAllocations = true.
Bug: 64312607
Change-Id: I34939e4086b500d6e827ff3ef2211d1a421ac91a
0ebe0d83138bba1996e9c8007969b5381d972b32 21-Sep-2017 Vladimir Marko <vmarko@google.com> ART: Introduce compiler data type.

Replace most uses of the runtime's Primitive in compiler
with a new class DataType. This prepares for introducing
new types, such as Uint8, that the runtime does not need
to know about.

Test: m test-art-host-gtest
Test: testrunner.py --host
Bug: 23964345
Change-Id: Iec2ad82454eec678fffcd8279a9746b90feb9b0c
5e3afa950f05bca470ef6b92460940f37831c27f 20-Sep-2017 Aart Bik <ajcbik@google.com> Ensure extract is seen as having scalar result.

Rationale:
Extracting from a vector yields a scalar, yet
our parallel mover and one DCHECK did not account
for that fact (note that moving towards a vector
type system will prevent such errors).

Regression test for this is part of the SAD CL.

Test: test-art-host test-art-target

Bug: 64091002
Change-Id: Id154edd1a069c54e7d8da069c368dea0a8f973f4
c9c310487b8730fce5edfa72e79c4188629898a3 29-Jun-2017 Nicolas Geoffray <ngeoffray@google.com> Turn a few DCHECK into CHECKs.

To help diagnose b/63070152.

bug: 63070152
Test: test.py
Change-Id: I1ac1cf9bfe1bc15ecfa94b5b8537cd3afda6fd14
82b0740f03b1a6acab4558214d3edc362e27e238 01-Mar-2017 Vladimir Marko <vmarko@google.com> Use IntrusiveForwardList<> for Env-/UsePosition.

Test: m test-art-host-gtest
Test: testrunner.py --host
Change-Id: I2b720e2ed8f96303cf80e9daa6d5278bf0c3da2f
f8f5a16ed7bad1e18179e38453e59c96a944de10 07-Feb-2017 Aart Bik <ajcbik@google.com> ART vectorizer.

Rationale:
Make SIMD great again with a retargetable and easily extendable vectorizer.

Provides a full x86/x86_64 and a proof-of-concept ARM implementation. Sample
improvement (without any perf tuning yet) for Linpack on x86 is about 20% to 50%.

Test: test-art-host, test-art-target (angler)
Bug: 34083438, 30933338

Change-Id: Ifb77a0f25f690a87cd65bf3d5e9f6be7ea71d6c1
5576f3741c58cb8b5fb2f68f3b3a9415efe05f4f 24-Mar-2017 Aart Bik <ajcbik@google.com> Implement a SIMD spilling slot.

Rationale:
The last ART vectorizer break-out CL \O/
This ensures spilling on x86 and x86_64 is correct.
Also, it paves the way to wider SIMD on ARM and MIPS.

Test: test-art-host
Bug: 34083438

Change-Id: I5b27d18c2045f3ab70b64c335423b3ff2a507ac2
cc89525c13894247cb82a1973617da6cba286f0c 21-Mar-2017 Aart Bik <ajcbik@google.com> Change 1/2 spill slots to more general number of spill slots.

Rationale:
This prepares requesting a different number of spill slots
during SIMD vectorization.

Bug: 34083438
Test: test-art-host, test-art-host-gtest-register_allocator_test
Change-Id: I6d22966ba483deec72b5eea5061c403c12b2ada7
2c45bc9137c29f886e69923535aff31a74d90829 25-Oct-2016 Vladimir Marko <vmarko@google.com> Remove H[Reverse]PostOrderIterator and HInsertionOrderIterator.

Use range-based loops instead, introducing helper functions
ReverseRange() for iteration in reverse order in containers.
When the contents of the underlying container change inside
the loop, use an index-based loop that better exposes the
container data modifications, compared to the old iterator
interface that's hiding it which may lead to subtle bugs.

Test: m test-art-host
Change-Id: I2a4e6c508b854c37a697fc4b1e8423a8c92c5ea0
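
The two loop styles described in this change can be sketched as follows. This is a minimal illustration, not the ART implementation: ReversedView, CollectReversed and AppendHalvesOfEvenValues are hypothetical names, and the real ReverseRange() may have a different signature.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical view type; ART's actual ReverseRange() may be implemented
// differently.
template <typename Container>
class ReversedView {
 public:
  explicit ReversedView(const Container& container) : container_(container) {}
  auto begin() const { return container_.rbegin(); }
  auto end() const { return container_.rend(); }

 private:
  const Container& container_;
};

template <typename Container>
ReversedView<Container> ReverseRange(const Container& container) {
  return ReversedView<Container>(container);
}

// Range-based loop over a container that is not modified while iterating.
inline std::vector<int> CollectReversed(const std::vector<int>& blocks) {
  std::vector<int> out;
  for (int id : ReverseRange(blocks)) {
    out.push_back(id);
  }
  assert(out.size() == blocks.size());
  return out;
}

// Index-based loop for a container that grows during iteration: the size is
// re-read on every pass, so appended elements are visited as well.
inline void AppendHalvesOfEvenValues(std::vector<int>& worklist) {
  for (std::size_t i = 0; i < worklist.size(); ++i) {
    int value = worklist[i];  // Copy first: push_back may reallocate.
    if (value != 0 && value % 2 == 0) {
      worklist.push_back(value / 2);
    }
  }
}
```

The index-based form makes the mutation visible at the loop head, which is the point the commit message makes about the old iterator interface.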
9620230700d4b451097c2163faa70627c9d8088a 05-Oct-2016 Aart Bik <ajcbik@google.com> Refactoring of graph linearization and linear order.

Rationale:
Ownership of graph's linear order and iterators was
a bit unclear now that other phases are using it.
New approach allows phases to compute their own
order, while ssa_liveness is sole owner for graph
(since it is not mutated afterwards).

Also shortens lifetime of loop's arena.

Test: test-art-host
Change-Id: Ib7137d1203a1e0a12db49868f4117d48a4277f30
20e9db6db787e007e7032878c9899b28ec43e93f 14-Sep-2016 Aart Bik <ajcbik@google.com> Make LinearizeGraph() public (and move it to nodes files)

Rationale:
It is strange that HLinearOrderIterator is defined (and visible)
in nodes.h, but clients have no way to build this order. This CL
makes the building available at the usual place.

Change-Id: Ib66f2edf6dfc8edd6b429bd4bea3ac7e37440b28
Tests: m test-art
30f766688006813ce90f42160c4b31112e90da60 02-Sep-2016 David Brazdil <dbrazdil@google.com> Cache result of an expensive DCHECK

LiveInterval::AddBackEdgeUses tests whether linear order is well
formed on debug builds. This is expensive and can significantly hinder
compilation times when many back edge uses are added.

This patch moves the IsLinearOrderWellFormed test at the end of
linear order generation.

Bug: 31163119
Change-Id: Ic4fe66bee2055f4b2cb065d9451ad5f21ba00676
d9ffd0dd7266f6a5e76f29d98dbe1a04f64cbb9b 22-Jun-2016 Matthew Gharrity <gharrma@google.com> Implement a graph coloring register allocator

Test: m test-art-host

Change-Id: I8c0d77f339ab02b33588a54b96ecce5c8322cfce
e90049140fdfb89080e5cc9b000b0c9be8c18bcd 16-Jun-2016 Vladimir Marko <vmarko@google.com> Create a typedef for HInstruction::GetInputs() return type.

And some other cleanup after
https://android-review.googlesource.com/230742

Test: No new tests. ART test suite passed (tested on host).
Change-Id: I4743bf17544d0234c6ccb46dd0c1b9aae5c93e17
372f10e5b0b34e2bb6e2b79aeba6c441e14afd1f 17-May-2016 Vladimir Marko <vmarko@google.com> Refactor handling of input records.

Introduce HInstruction::GetInputRecords(), a new virtual
function that returns an ArrayRef<> to all input records.
Implement all other functions dealing with input records as
wrappers around GetInputRecords(). Rewrite functions that
previously used multiple virtual calls to deal with input
records, especially in loops, to prefetch the ArrayRef<>
only once for each instruction. Besides avoiding all the
extra calls, this also allows the compiler (clang++) to
perform additional optimizations.

This speeds up the Nexus 5 boot image compilation by ~0.5s
(4% of "Compile Dex File", 2% of dex2oat time) on AOSP ToT.

Change-Id: Id8ebe0fb9405e38d918972a11bd724146e4ca578
d7c2fdc939bb7efb3e7204d62e54c6a3f7d77f9b 10-May-2016 Nicolas Geoffray <ngeoffray@google.com> Fix another case of live_in at irreducible loop entry.

GVN was implicitly extending the liveness of an instruction across
an irreducible loop.

Fix this problem by clearing the value set at loop entries that contain
an irreducible loop.

bug:28252896

(cherry picked from commit 77ce6430af2709432b22344ed656edd8ec80581b)

Change-Id: Ie0121e83b2dfe47bcd184b90a69c0194d13fce54
77ce6430af2709432b22344ed656edd8ec80581b 10-May-2016 Nicolas Geoffray <ngeoffray@google.com> Fix another case of live_in at irreducible loop entry.

GVN was implicitly extending the liveness of an instruction across
an irreducible loop.

Fix this problem by clearing the value set at loop entries that contain
an irreducible loop.

bug:28252896
Change-Id: I68823cb88dceb4c2b4545286ba54fd0c958a48b0
d59f3b1b7f5c1ab9f0731ff9dc60611e8d9a6ede 29-Mar-2016 Vladimir Marko <vmarko@google.com> Use iterators "before" the use node in HUserRecord<>.

Create a new template class IntrusiveForwardList<> that
mimics std::forward_list<> except that all allocations
are handled externally. This is essentially the same as
boost::intrusive::slist<> but since we're not using Boost
we have to reinvent the wheel.

Use the new container to replace the HUseList and use the
iterators to "before" use nodes in HUserRecord<> to avoid
the extra pointer to the previous node which was used
exclusively for removing nodes from the list. This reduces
the size of the HUseListNode by 25%, 32B to 24B in 64-bit
compiler, 16B to 12B in 32-bit compiler. This translates
directly to overall memory savings for the 64-bit compiler
but due to rounding up of the arena allocations to 8B, we
do not get any improvement in the 32-bit compiler.

Compiling the Nexus 5 boot image with the 64-bit dex2oat
on host this CL reduces the memory used for compiling the
most hungry method, BatteryStats.dumpLocked(), by ~3.3MiB:

Before:
MEM: used: 47829200, allocated: 48769120, lost: 939920
Number of arenas allocated: 345,
Number of allocations: 815492, avg size: 58
...
UseListNode 13744640
...
After:
MEM: used: 44393040, allocated: 45361248, lost: 968208
Number of arenas allocated: 319,
Number of allocations: 815492, avg size: 54
...
UseListNode 10308480
...

Note that while we do not ship the 64-bit dex2oat to the
device, the JIT compilation for 64-bit processes is using
the 64-bit libart-compiler.

Bug: 28173563
Bug: 27856014

(cherry picked from commit 46817b876ab00d6b78905b80ed12b4344c522b6c)

Change-Id: Ifb2d7b357064b003244e92c0d601d81a05e56a7b
46817b876ab00d6b78905b80ed12b4344c522b6c 29-Mar-2016 Vladimir Marko <vmarko@google.com> Use iterators "before" the use node in HUserRecord<>.

Create a new template class IntrusiveForwardList<> that
mimics std::forward_list<> except that all allocations
are handled externally. This is essentially the same as
boost::intrusive::slist<> but since we're not using Boost
we have to reinvent the wheel.

Use the new container to replace the HUseList and use the
iterators to "before" use nodes in HUserRecord<> to avoid
the extra pointer to the previous node which was used
exclusively for removing nodes from the list. This reduces
the size of the HUseListNode by 25%, 32B to 24B in 64-bit
compiler, 16B to 12B in 32-bit compiler. This translates
directly to overall memory savings for the 64-bit compiler
but due to rounding up of the arena allocations to 8B, we
do not get any improvement in the 32-bit compiler.

Compiling the Nexus 5 boot image with the 64-bit dex2oat
on host this CL reduces the memory used for compiling the
most hungry method, BatteryStats.dumpLocked(), by ~3.3MiB:

Before:
MEM: used: 47829200, allocated: 48769120, lost: 939920
Number of arenas allocated: 345,
Number of allocations: 815492, avg size: 58
...
UseListNode 13744640
...
After:
MEM: used: 44393040, allocated: 45361248, lost: 968208
Number of arenas allocated: 319,
Number of allocations: 815492, avg size: 54
...
UseListNode 10308480
...

Note that while we do not ship the 64-bit dex2oat to the
device, the JIT compilation for 64-bit processes is using
the 64-bit libart-compiler.

Bug: 28173563
Change-Id: I985eabd4816f845372d8aaa825a1489cf9569208
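
The "before" iterator idea can be illustrated with std::forward_list, which is not intrusive but has the same erase_after() shape. This is a sketch under stated assumptions: the UseList class and the map standing in for HUserRecord<> are hypothetical, and user ids are assumed to be unique.

```cpp
#include <cassert>
#include <forward_list>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for a use list: each user remembers the iterator to
// the node *before* its own node, so removal needs no "previous" pointer.
class UseList {
 public:
  using Iter = std::forward_list<int>::iterator;

  void AddUse(int user) {
    uses_.push_front(user);
    Iter new_node = uses_.begin();
    Iter old_front = std::next(new_node);
    if (old_front != uses_.end()) {
      // The old front node is now preceded by the new node.
      before_[*old_front] = new_node;
    }
    before_[user] = uses_.before_begin();
  }

  void RemoveUse(int user) {
    auto record = before_.find(user);
    assert(record != before_.end());
    Iter before = record->second;
    Iter next = uses_.erase_after(before);  // Constant time.
    before_.erase(record);
    if (next != uses_.end()) {
      // The follower of the removed node inherits its "before" iterator.
      before_[*next] = before;
    }
  }

  std::vector<int> Snapshot() const {
    return std::vector<int>(uses_.begin(), uses_.end());
  }

 private:
  std::forward_list<int> uses_;
  std::unordered_map<int, Iter> before_;  // Assumes unique user ids.
};
```

The cost of dropping the previous pointer is the fix-up on insertion and removal, where a neighbour's stored iterator has to be updated.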
3563c44464ca55b2106373b35110e5ecaae38abf 18-Apr-2016 Vladimir Marko <vmarko@google.com> Fix inlining loops in OSR mode.

When compiling a method in OSR mode and the method does not
contain a loop (arguably, a very odd case) but we inline
another method with a loop and then the final DCE re-runs
the loop identification, the inlined loop would previously
be marked as irreducible. However, the SSA liveness analysis
expects irreducible loops to have extra loop Phis which were
already eliminated from the loop before the inner graph was
inlined to the outer graph, so we would fail a DCHECK().

We fix this by not marking inlined loops as irreducible when
compiling in OSR mode.

Bug: 28210356

(cherry picked from commit fd66c50d64c38e40bafde83b4872e27bbff7546d)

Change-Id: I149273b766d1c713c571baad6033c5f70e6dd960
fd66c50d64c38e40bafde83b4872e27bbff7546d 18-Apr-2016 Vladimir Marko <vmarko@google.com> Fix inlining loops in OSR mode.

When compiling a method in OSR mode and the method does not
contain a loop (arguably, a very odd case) but we inline
another method with a loop and then the final DCE re-runs
the loop identification, the inlined loop would previously
be marked as irreducible. However, the SSA liveness analysis
expects irreducible loops to have extra loop Phis which were
already eliminated from the loop before the inner graph was
inlined to the outer graph, so we would fail a DCHECK().

We fix this by not marking inlined loops as irreducible when
compiling in OSR mode.

Bug: 28210356
Change-Id: If10057ed883333c62a878ed2ae3fe01bb5280e33
badd826664896d4a9628a5a89b78016894aa414b 02-Feb-2016 David Brazdil <dbrazdil@google.com> ART: Run SsaBuilder from HGraphBuilder

First step towards merging the two passes, which will later result in
HGraphBuilder directly producing SSA form. This CL mostly just updates
tests broken by not being able to inspect the pre-SSA form.

Using HLocals outside the HGraphBuilder is now deprecated.

Bug: 27150508
Change-Id: I00fb6050580f409dcc5aa5b5aa3a536d6e8d759e
674f519fe00ae07e0db90c4374f785bb418ae332 02-Feb-2016 David Brazdil <dbrazdil@google.com> ART: Enable multi-level instruction inlining

Change-Id: I4b4c927d7b1598dc197793c25185fb079dec7fe1
b3e773eea39a156b3eacf915ba84e3af1a5c14fa 26-Jan-2016 David Brazdil <dbrazdil@google.com> ART: Implement support for instruction inlining

Optimizing HIR contains 'non-materialized' instructions which are
emitted at their use sites rather than their defining sites. This
was not properly handled by the liveness analysis which did not
adjust the use positions of the inputs of such instructions.
Despite the analysis being incorrect, the current use cases never
produce incorrect code.

This patch generalizes the concept of inlined instructions and
updates liveness analysis to compute the use positions correctly.

Change-Id: Id703c154b20ab861241ae5c715a150385d3ff621
15bd22849ee6a1ffb3fb3630f686c2870bdf1bbc 05-Jan-2016 Nicolas Geoffray <ngeoffray@google.com> Implement irreducible loop support in optimizing.

So we don't fall back to the interpreter in the presence of
irreducible loops.

Implications:
- A loop pre-header does not necessarily dominate a loop header.
- Non-constant redundant phis will be kept in loop headers, to
satisfy our linear scan register allocation algorithm.
- whole-graph optimizations, such as gvn, licm, lse, and dce
need to know when they are dealing with irreducible loops.

Change-Id: I2cea8934ce0b40162d215353497c7f77d6c9137e
ec7802a102d49ab5c17495118d4fe0bcc7287beb 01-Oct-2015 Vladimir Marko <vmarko@google.com> Add DCHECKs to ArenaVector and ScopedArenaVector.

Implement dchecked_vector<> template that DCHECK()s element
access and insert()/emplace()/erase() positions. Change the
ArenaVector<> and ScopedArenaVector<> aliases to use the new
template instead of std::vector<>. Remove DCHECK()s that
have now become unnecessary from the Optimizing compiler.

Change-Id: Ib8506bd30d223f68f52bd4476c76d9991acacadc
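
A checked vector of this kind could look roughly like the sketch below. It is not the ART dchecked_vector<>: CheckedVector is a hypothetical name, assert() stands in for DCHECK(), and only a few members are wrapped.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical wrapper: forwards to std::vector<> but asserts that element
// accesses and erase positions are in range (active in debug builds only).
template <typename T>
class CheckedVector : private std::vector<T> {
  using Base = std::vector<T>;

 public:
  using typename Base::const_iterator;
  using typename Base::iterator;
  using Base::Base;
  using Base::begin;
  using Base::end;
  using Base::push_back;
  using Base::size;

  T& operator[](std::size_t index) {
    assert(index < size());
    return Base::operator[](index);
  }

  const T& operator[](std::size_t index) const {
    assert(index < size());
    return Base::operator[](index);
  }

  iterator erase(const_iterator position) {
    assert(Base::cbegin() <= position && position < Base::cend());
    return Base::erase(position);
  }
};
```

With the checks inside the container, call sites no longer need their own bounds assertions, which matches the cleanup described above.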
2aaa4b5532d30c4e65d8892b556400bb61f9dc8c 17-Sep-2015 Vladimir Marko <vmarko@google.com> Optimizing: Tag more arena allocations.

Replace GrowableArray with ArenaVector and tag arena
allocations with new allocation types.

As part of this, make the register allocator a bit more
efficient, doing bulk insert/erase. Some loops are now
O(n) instead of O(n^2).

Change-Id: Ifac0871ffb34b121cc0447801a2d07eefd308c14
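
One common way to turn repeated single-element erases (quadratic in total) into a linear bulk erase is the erase-remove idiom. The sketch below shows that idiom only; BulkErase and DropNegative are hypothetical names, and the change above may have used a different technique.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Removes every element matching `is_dead` in a single pass. Each surviving
// element is moved at most once, so the whole operation is O(n).
template <typename T, typename Predicate>
void BulkErase(std::vector<T>& items, Predicate is_dead) {
  items.erase(std::remove_if(items.begin(), items.end(), is_dead),
              items.end());
}

// Example use: drop all negative entries from a list of positions.
inline void DropNegative(std::vector<int>& positions) {
  auto is_negative = [](int value) { return value < 0; };
  BulkErase(positions, is_negative);
  assert(std::none_of(positions.begin(), positions.end(), is_negative));
}
```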
fa6b93c4b69e6d7ddfa2a4ed0aff01b0608c5a3a 15-Sep-2015 Vladimir Marko <vmarko@google.com> Optimizing: Tag arena allocations in HGraph.

Replace GrowableArray with ArenaVector in HGraph and related
classes HEnvironment, HLoopInformation, HInvoke and HPhi,
and tag allocations with new arena allocation types.

Change-Id: I3d79897af405b9a1a5b98bfc372e70fe0b3bc40d
77a48ae01bbc5b05ca009cf09e2fcb53e4c8ff23 15-Sep-2015 David Brazdil <dbrazdil@google.com> Revert "Revert "ART: Register allocation and runtime support for try/catch""

The original CL triggered b/24084144 which has been fixed
by Ib72e12a018437c404e82f7ad414554c66a4c6f8c.

This reverts commit 659562aaf133c41b8d90ec9216c07646f0f14362.

Change-Id: Id8980436172457d0fcb276349c4405f7c4110a55
659562aaf133c41b8d90ec9216c07646f0f14362 14-Sep-2015 David Brazdil <dbrazdil@google.com> Revert "ART: Register allocation and runtime support for try/catch"

Breaks libcore test org.apache.harmony.security.tests.java.security.KeyStorePrivateKeyEntryTest#testGetCertificateChain. Need to investigate.

This reverts commit b022fa1300e6d78639b3b910af0cf85c43df44bb.

Change-Id: Ib24d3a80064d963d273e557a93469c95f37b1f6f
b022fa1300e6d78639b3b910af0cf85c43df44bb 20-Aug-2015 David Brazdil <dbrazdil@google.com> ART: Register allocation and runtime support for try/catch

This patch completes a series of CLs that add support for try/catch
in the Optimizing compiler. With it, Optimizing can compile all
methods containing try/catch, provided they don't contain catch loops.
Future work will focus on improving performance of the generated code.

SsaLivenessAnalysis was updated to propagate liveness information of
instructions live at catch blocks, and to keep location information on
instructions which may be caught by catch phis.

RegisterAllocator was extended to spill values used after catch, and
to allocate spill slots for catch phis. Catch phis generated for the
same vreg share a spill slot as the raw value must be the same.

Location builders and slow paths were updated to reflect the fact that
throwing an exception may not lead to escaping the method.

Instruction code generators are forbidden from using implicit null
checks in try blocks as live registers need to be saved before handing
over to the runtime.

CodeGenerator emits a stack map for each catch block, storing locations
of catch phis. CodeInfo and StackMapStream recognize this new type of
stack map and store them separate from other stack maps to avoid dex_pc
conflicts.

After having found the target catch block to deliver an exception to,
QuickExceptionHandler looks up the dex register maps at the throwing
instruction and the catch block and copies the values over to their
respective locations.

The runtime-support approach was selected because it allows for the
best performance in the normal control-flow path, since no propagation
of catch phi values is necessary until the exception is thrown. In
addition, it also greatly simplifies the register allocation phase.

ConstantHoisting was removed from LICMTest because it instantiated
(now abstract) HConstant and was bogus anyway (constants are always in
the entry block).

Change-Id: Ie31038ad8e3ee0c13a5bbbbaf5f0b3e532310e4e
6058455d486219994921b63a2d774dc9908415a2 03-Sep-2015 Vladimir Marko <vmarko@google.com> Optimizing: Tag basic block allocations with their source.

Replace GrowableArray with ArenaVector in HBasicBlock and,
to track the source of allocations, assign one new and two
Quick's arena allocation types to these vectors. Rename
kArenaAllocSuccessor to kArenaAllocSuccessors.

Bug: 23736311
Change-Id: Ib52e51698890675bde61f007fe6039338cf1a025
145acc5361deb769eed998f057bc23abaef6e116 03-Sep-2015 Vladimir Marko <vmarko@google.com> Revert "Optimizing: Tag basic block allocations with their source."

Reverting so that we can have more discussion about the STL API.

This reverts commit 91e11c0c840193c6822e66846020b6647de243d5.

Change-Id: I187fe52f2c16b6e7c5c9d49c42921eb6c7063dba
91e11c0c840193c6822e66846020b6647de243d5 02-Sep-2015 Vladimir Marko <vmarko@google.com> Optimizing: Tag basic block allocations with their source.

Replace GrowableArray with ArenaVector in HBasicBlock and,
to track the source of allocations, assign one new and two
Quick's arena allocation types to these vectors. Rename
kArenaAllocSuccessor to kArenaAllocSuccessors.

Bug: 23736311
Change-Id: I984aef6e615ae2380a532f5c6726af21015f43f5
681652d8e8a33bc07c5c082a71aea13d0f15e0a0 23-Jul-2015 Mingyao Yang <mingyao@google.com> HDeoptimize should hold values live in env.

Values that are not live in compiled code anymore may still be needed in
the interpreter, due to code motion, etc.

(cherry-picked from commit 718493c6c3c8e380663cb8a94e57ce160a6c473f)

Bug: 22665511
Change-Id: I8b85833c5c462f8fe36f86d6026a51b07563995a
718493c6c3c8e380663cb8a94e57ce160a6c473f 23-Jul-2015 Mingyao Yang <mingyao@google.com> HDeoptimize should hold values live in env.

Values that are not live in compiled code anymore may still be needed in
the interpreter, due to code motion, etc.

Bug: 22665511
Change-Id: I8b85833c5c462f8fe36f86d6026a51b07563995a
94015b939060f5041d408d48717f22443e55b6ad 04-Jun-2015 Nicolas Geoffray <ngeoffray@google.com> Revert "Revert "Use HCurrentMethod in HInvokeStaticOrDirect.""

Fix was to special case baseline for x86, which does not have enough
registers to allocate the current method.

This reverts commit c345f141f11faad177aa9635a78088d00cf66086.

Change-Id: I5997aa52f8d4df373ae5ff4d4150dac0c44c4c10
c345f141f11faad177aa9635a78088d00cf66086 04-Jun-2015 Nicolas Geoffray <ngeoffray@google.com> Revert "Use HCurrentMethod in HInvokeStaticOrDirect."

Fails on baseline/x86.

This reverts commit 38207af82afb6f99c687f64b15601ed20d82220a.

Change-Id: Ib71018367eb7c6046965494a7e996c22af3de403
38207af82afb6f99c687f64b15601ed20d82220a 01-Jun-2015 Nicolas Geoffray <ngeoffray@google.com> Use HCurrentMethod in HInvokeStaticOrDirect.

Change-Id: I0d15244b6b44c8b10079398c55da5071a3e3af66
8272688499c2232355db34d94057983fd436173d 01-Jun-2015 Nicolas Geoffray <ngeoffray@google.com> Tweak one hint and one split in the linear scan.

- Return a hinted register if it is available. Otherwise
another move will be necessary.
- Use SplitBetween instead of raw split when a register
is not fully available. This will find the best split
position.

Change-Id: Ie464e536204ab556eb09345fe6426621eb86e5ac
0a23d74dc2751440822960eab218be4cb8843647 07-May-2015 Nicolas Geoffray <ngeoffray@google.com> Add a parent environment to HEnvironment.

This code has no functionality change. It adds a placeholder
for chaining inlined frames.

Change-Id: I5ec57335af76ee406052345b947aad98a6a4423a
db216f4d49ea1561a74261c29f1264952232728a 05-May-2015 Nicolas Geoffray <ngeoffray@google.com> Relax the only one back-edge restriction.

The rule is in the way for better register allocation, as
it creates an artificial join point between multiple paths.

Change-Id: Ia4392890f95bcea56d143138f28ddce6c572ad58
fbda5f3e1378f07ae202f62da625ee43a063a052 29-Apr-2015 Nicolas Geoffray <ngeoffray@google.com> Find better split positions in the register allocator.

In a standard if/else control flow graph, this avoids
doing a move in one branch if the other branch decided
to move an interval.

This also needs a new register hint kind: the location the
interval had at the predecessor block.

Change-Id: I18b78264587b4d693540fbb5e014d12df2add3e2
579026039080252878106118645ed70706f4838e 21-Apr-2015 Nicolas Geoffray <ngeoffray@google.com> Add synthesized uses at back edges.

This reduces the cost of linearizing the graph (hence removing
the notion of back edge). Since linear scan allocates/spills registers
based on next use, adding a use at a back edge ensures we do account
for loop uses.

Change-Id: Idaa882cb120edbdd08ca6bff142d326a8245bd14
4ed947a58de87d19d0609be773207c905ccb0f7f 27-Apr-2015 Nicolas Geoffray <ngeoffray@google.com> Dissociate uses with environment uses.

They are most of the time in the way when iterating. They
also complicate the logic of (future) back edge uses.

Change-Id: I152595d9913073fe901b267ca623fa0fe7432484
241a486267bdb59b32fe4c8db370eb936068fb39 16-Apr-2015 David Brazdil <dbrazdil@google.com> ART: Replace expensive calls to Covers in reg alloc

LiveInterval::Covers is implemented as a linear-time search over
liveness ranges and can therefore be rather expensive and should be
avoided unless necessary. This patch replaces calls to Covers when
searching for a sibling with the cheaper IsDefinedAt call.

Change-Id: I93fc73529c15a518335f4cbdc3a0def52d9501e5
0d9f17de8f21a10702de1510b73e89d07b3b9bbf 15-Apr-2015 Nicolas Geoffray <ngeoffray@google.com> Move the linear order to the HGraph.

Bug found by Zheng Xu: SsaLivenessAnalysis being a stack allocated
object, we should not refer to it in later phases of the compiler.
Specifically, the code generator was using the linear order, which
was stored in the liveness analysis object.

Change-Id: I574641f522b7b86fc43f3914166108efc72edb3b
d8126bef62df7f40f2e6abc74004f52e664daf45 27-Mar-2015 Nicolas Geoffray <ngeoffray@google.com> Fix locations at environment uses.

We were too aggressive in not recording environment uses
when the instruction was not of type object. We have to
record the use to the use list of an interval, but it should
not affect the live ranges of that interval.

Change-Id: Id16fb7cc06f14083766d408a345837793583b6ea
f01d34445953e6b9c9b13de1dd32a5c0ee5abab5 27-Mar-2015 Nicolas Geoffray <ngeoffray@google.com> Implement a proper solution for temps.

We used to play some trickery when updating locations of temps. This
change creates a proper use of the temp, and uses it for updating
its location.

Change-Id: I53e9447b87a55137a3a79841db21ad3864854825
46e2a3915aa68c77426b71e95b9f3658250646b7 16-Mar-2015 David Brazdil <dbrazdil@google.com> ART: Boolean simplifier

The optimization recognizes the negation pattern generated by 'javac'
and replaces it with a single condition. To this end, boolean values
are now consistently assumed to be represented by an integer.

This is a first optimization which deletes blocks from the HGraph and
does so by replacing the corresponding entries with null. Hence,
existing code can continue indexing the list of blocks with the block
ID, but must check for null when iterating over the list.

Change-Id: I7779da69cfa925c6521938ad0bcc11bc52335583
915b9d0c13bb5091875d868fbfa551d7b65d7477 11-Mar-2015 Nicolas Geoffray <ngeoffray@google.com> Tweak liveness when instructions are used in environments.

Instructions remain live when debuggable, but only instructions
with object types remain live when non-debuggable.

Enable StackVisitor::GetThisObject for optimizing.

Change-Id: Id87b2cbf33a02450059acc9993995782e5f28987
5b8e6a594b827f7dc88b2e3d895e08f5b3f22446 25-Feb-2015 David Brazdil <dbrazdil@google.com> ART: Cache last returned range in LiveInterval::Covers

Optimizing spends ~10% of compilation time in the register allocator.
One of the frequently called methods is LiveInterval::Covers which
has linear complexity w.r.t. the number of gaps in liveness intervals.
This patch leverages the fact that the register allocator calls Covers
with non-decreasing position values and caches the last returned
result to start the iteration closer to the result the next time the
method is invoked. Stats from compiling the framework show that this
optimization reduces the average number of iterations needed to find
the result by 40%.

Change-Id: I4dd26b900879d5e1d03818ebc1e117cc6a53053c
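
The caching idea can be sketched as follows. This is an illustration under stated assumptions, not the LiveInterval code: the Range and Interval types are hypothetical, ranges are half-open [start, end), sorted and disjoint, and the fallback for a decreasing position is an addition made here for safety.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

struct Range {
  std::size_t start;  // Inclusive.
  std::size_t end;    // Exclusive.
};

// Hypothetical interval with sorted, disjoint ranges.
class Interval {
 public:
  explicit Interval(std::vector<Range> ranges) : ranges_(std::move(ranges)) {}

  bool Covers(std::size_t position) {
    // Every range before last_index_ ends at or before last_position_, so it
    // can be skipped as long as queries do not move backwards.
    std::size_t index = (position >= last_position_) ? last_index_ : 0;
    while (index < ranges_.size() && ranges_[index].end <= position) {
      ++index;
    }
    last_index_ = index;
    last_position_ = position;
    return index < ranges_.size() && ranges_[index].start <= position;
  }

 private:
  std::vector<Range> ranges_;
  std::size_t last_index_ = 0;
  std::size_t last_position_ = 0;
};
```

When the caller queries non-decreasing positions, each range is stepped over at most once across the whole sequence of calls instead of once per call.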
da02afe615191a19eae9a039786c4c4fc20dbfff 11-Feb-2015 Nicolas Geoffray <ngeoffray@google.com> Support hints for register pairs.

Change-Id: Ia49dc5bf3e9a2bd481425bfe7fbeea9feb66c8e6
c0572a451944f78397619dec34a38c36c11e9d2a 06-Feb-2015 Nicolas Geoffray <ngeoffray@google.com> Optimize leaf methods.

Avoid suspend checks and stack changes when not needed.

Change-Id: I0fdb31e8c631e99091b818874a558c9aa04b1628
ed59619b370ef23ffbb25d1d01f615e60a9262b6 23-Jan-2015 David Brazdil <dbrazdil@google.com> Optimizing: Speed up HEnvironment use removal

Removal of use records from HEnvironment vregs involved iterating over
potentially large linked lists which made compilation of huge methods
very slow. This patch turns use lists into doubly-linked lists, stores
pointers to the relevant nodes inside HEnvironment and subsequently
turns the removals into constant-time operations.

Change-Id: I0e1d4d782fd624e7b8075af75d4adf0a0634a1ee
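
The effect of storing a pointer to the list node can be illustrated with std::list, whose iterators stay valid until the element they refer to is erased. The EnvUses class is a hypothetical stand-in, not the HEnvironment code.

```cpp
#include <cassert>
#include <cstddef>
#include <list>

// Hypothetical use list: Add() hands back the node handle, and Remove()
// unlinks that node directly instead of searching the list for it.
class EnvUses {
 public:
  using Node = std::list<int>::iterator;

  Node Add(int user) { return uses_.insert(uses_.end(), user); }

  void Remove(Node node) {
    assert(!uses_.empty());
    uses_.erase(node);  // Constant time in a doubly-linked list.
  }

  std::size_t size() const { return uses_.size(); }
  int front() const { return uses_.front(); }

 private:
  std::list<int> uses_;
};
```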
840e5461a85f8908f51e7f6cd562a9129ff0e7ce 07-Jan-2015 Nicolas Geoffray <ngeoffray@google.com> Implement double and float support for arm in register allocator.

The basic approach is:
- An instruction that needs two registers gets two intervals.
- When allocating the low part, we also allocate the high part.
- When splitting a low (or high) interval, we also split the high
(or low) equivalent.
- Allocation follows the (S/D register) requirement that low
registers are always even and the high equivalent is low + 1.

Change-Id: I06a5148e05a2ffc7e7555d08e871ed007b4c2797
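
The pairing rule in the last bullet (low register even, high register equal to low + 1) can be sketched as a search over a free-register bitmap. FindFreePair and HighOf are hypothetical helpers written for illustration; the actual allocator logic is not shown in this log.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Returns the high half of a pair given its (even) low register.
inline int HighOf(int low) {
  assert(low >= 0 && low % 2 == 0);
  return low + 1;
}

// Returns the low register of the first free aligned pair, or -1 if no such
// pair exists. Only even indices are considered as low registers.
inline int FindFreePair(const std::vector<bool>& is_free) {
  for (std::size_t low = 0; low + 1 < is_free.size(); low += 2) {
    if (is_free[low] && is_free[low + 1]) {
      return static_cast<int>(low);
    }
  }
  return -1;
}
```

Note that two adjacent free registers such as 1 and 2 do not form a valid pair under this rule, because the low half must be even.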
a8eed3acbc39c71ec22dc2943e71eaa07c6507dd 24-Nov-2014 Nicolas Geoffray <ngeoffray@google.com> Revert "Revert "Fix the computation of linear ordering.""

PS2 fixes the obvious typos/wrong refactoring.

This reverts commit e50fa5887b1342b845826197d81950e26753fc9c.

Change-Id: I22f81d63a12cf01aafd61535abc2399d936d49c2
e50fa5887b1342b845826197d81950e26753fc9c 24-Nov-2014 Nicolas Geoffray <ngeoffray@google.com> Revert "Fix the computation of linear ordering."

Build is broken.

This reverts commit 3054a90063d379ab8c9e5a42a7daf0d644b48b07.

Change-Id: I259bc2bd6a58e30391b8176f3db5fdb5c07e4d6d
3054a90063d379ab8c9e5a42a7daf0d644b48b07 21-Nov-2014 Nicolas Geoffray <ngeoffray@google.com> Fix the computation of linear ordering.

The register allocator makes assumptions on the order, and
we ended up not computing the right one. The algorithm worked
fine when the loop header is the block branching to the exit,
but in the presence of breaks or do/while, it was incorrect.

Change-Id: Iad0a89872cd3f7b7a8b2bdf560f0d03493f93ba5
277ccbd200ea43590dfc06a93ae184a765327ad0 04-Nov-2014 Andreas Gampe <agampe@google.com> ART: More warnings

Enable -Wno-conversion-null, -Wredundant-decls and -Wshadow in general,
and -Wunused-but-set-parameter for GCC builds.

Change-Id: I81bbdd762213444673c65d85edae594a523836e5
296bd60423e0630d8152b99fb7afb20fbff5a18a 07-Oct-2014 Mingyao Yang <mingyao@google.com> Some improvement to reg alloc.

Change-Id: If579a37791278500a7e5bc763f144c241f261920
102cbed1e52b7c5f09458b44903fe97bb3e14d5f 15-Oct-2014 Nicolas Geoffray <ngeoffray@google.com> Implement register allocator for floating point registers.

Also:
- Fix misuses of emitting the rex prefix in the x86_64 assembler.
- Fix movaps code generation in the x86_64 assembler.

Change-Id: Ib6dcf6e7c4a9c43368cfc46b02ba50f69ae69cbe
56b9ee6fe1d6880c5fca0e7feb28b25a1ded2e2f 09-Oct-2014 Nicolas Geoffray <ngeoffray@google.com> Stop converting from Location to ManagedRegister.

Now the source of truth is the Location object that knows
which register (core, pair, fpu) it needs to refer to.

Change-Id: I62401343d7479ecfb24b5ed161ec7829cda5a0b1
01ef345767ea609417fc511e42007705c9667546 01-Oct-2014 Nicolas Geoffray <ngeoffray@google.com> Add trivial register hints to the register allocator.

- Add hints for phis, same as first input, and expected registers.
- Make the if instruction accept non-condition instructions.

Change-Id: I34fa68393f0d0c19c68128f017b7a05be556fbe5
8ddb00ca935733f5d3b07816e5bb33d6cabe6ec4 29-Sep-2014 Nicolas Geoffray <ngeoffray@google.com> Improve detection of lifetime holes.

The check concluding that the next use was in a successor
was too conservative: two blocks following each other
in terms of liveness are not necessarily predecessor/successor.

Change-Id: Ideec98046c812aa5fb63781141b5fde24c706d6d
8a16d97fb8f031822b206e65f9109a071da40563 11-Sep-2014 Nicolas Geoffray <ngeoffray@google.com> Fix valgrind errors.

For now just stack allocate the code generator. Will think
about cleaning up the root problem later (CodeGenerator being an
arena object).

Change-Id: I161a6f61c5f27ea88851b446f3c1e12ee9c594d7
e77493c7217efdd1a0ecef521a6845a13da0305b 21-Aug-2014 Ian Rogers <irogers@google.com> Make common BitVector operations inline-able.

Change-Id: Ie25de4fae56c6712539f04172c42e3eff57df7ca
e50383288a75244255d3ecedcc79ffe9caf774cb 04-Jul-2014 Nicolas Geoffray <ngeoffray@google.com> Support fields in optimizing compiler.

- Required support for temporaries, to be only used by baseline compiler.
- Also fixed a few invalid assumptions around locations and instructions
that don't need materialization. These instructions should not have an Out.

Change-Id: Idc4a30dd95dd18015137300d36bec55fc024cf62
31d76b42ef5165351499da3f8ee0ac147428c5ed 09-Jun-2014 Nicolas Geoffray <ngeoffray@google.com> Plug code generator into liveness analysis.

Also implement spill slot support.

Change-Id: If5e28811e9fbbf3842a258772c633318a2f4fafc
ec7e4727e99aa1416398ac5a684f5024817a25c7 06-Jun-2014 Nicolas Geoffray <ngeoffray@google.com> Fix some bugs in graph construction/simplification methods.

Also fix a braino during SSA construction. The code should
not have been commented out. Added a test to cover what the code
intends.

Change-Id: Ia00ae79dcf75eb0d412f07649d73e7f94dbfb6f0
ffddfdf6fec0b9d98a692e27242eecb15af5ead2 03-Jun-2014 Tim Murray <timmurray@google.com> DO NOT MERGE

Merge ART from AOSP to lmp-preview-dev.

Change-Id: I0f578733a4b8756fd780d4a052ad69b746f687a9
a7062e05e6048c7f817d784a5b94e3122e25b1ec 22-May-2014 Nicolas Geoffray <ngeoffray@google.com> Add a linear scan register allocator to the optimizing compiler.

This is a "by-the-book" implementation. It currently only deals
with allocating registers, with no hint optimizations.

The changes remaining to make it functional are:
- Allocate spill slots.
- Resolution and placements of Move instructions.
- Connect it to the code generator.

Change-Id: Ie0b2f6ba1b98da85425be721ce4afecd6b4012a4
a5b8fde2d2bc3167078694fad417fddfe442a6fd 23-May-2014 Vladimir Marko <vmarko@google.com> Rewrite BitVector index iterator.

The BitVector::Iterator was not iterating over the bits but
rather over indexes of the set bits. Therefore, we rename it
to IndexIterator and provide a BitVector::Indexes() to get
a container-style interface with begin() and end() for range
based for loops.

Also, simplify InsertPhiNodes where the tmp_blocks isn't
needed since the phi_nodes and input_blocks cannot lose any
blocks in subsequent iterations, so we can do the Union()
directly in those bit vectors and we need to repeat the loop
only if we have new input_blocks, rather than on phi_nodes
change. And move the temporary bit vectors to scoped arena.

Change-Id: I6cb87a2f60724eeef67c6aaa34b36ed5acde6d43
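
The container-style interface described in the commit above (iterating over the indexes of set bits with begin() and end()) can be illustrated with a minimal sketch. TinyBitVector, FindSetBitFrom and CollectIndexes are hypothetical names made up for this example; only IndexIterator and Indexes() come from the commit message, and this is not ART's actual BitVector code.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a bit vector; NOT ART's BitVector.
class TinyBitVector {
 public:
  explicit TinyBitVector(std::size_t num_bits)
      : words_((num_bits + 31) / 32, 0u) {}

  // Caller must pass index < SizeInBits().
  void SetBit(std::size_t index) { words_[index / 32] |= (1u << (index % 32)); }
  bool IsBitSet(std::size_t index) const {
    return ((words_[index / 32] >> (index % 32)) & 1u) != 0;
  }
  std::size_t SizeInBits() const { return words_.size() * 32; }

  // Yields the index of each set bit, not the bit values themselves.
  class IndexIterator {
   public:
    IndexIterator(const TinyBitVector* bv, std::size_t start)
        : bv_(bv), index_(bv->FindSetBitFrom(start)) {}
    std::size_t operator*() const { return index_; }
    IndexIterator& operator++() {
      index_ = bv_->FindSetBitFrom(index_ + 1);
      return *this;
    }
    bool operator!=(const IndexIterator& other) const {
      return index_ != other.index_;
    }

   private:
    const TinyBitVector* bv_;
    std::size_t index_;
  };

  // Lightweight range object so that range-based for loops work.
  class IndexContainer {
   public:
    explicit IndexContainer(const TinyBitVector* bv) : bv_(bv) {}
    IndexIterator begin() const { return IndexIterator(bv_, 0); }
    IndexIterator end() const { return IndexIterator(bv_, bv_->SizeInBits()); }

   private:
    const TinyBitVector* bv_;
  };

  IndexContainer Indexes() const { return IndexContainer(this); }

 private:
  // Returns the first set bit at or after `start`, or SizeInBits() if none.
  std::size_t FindSetBitFrom(std::size_t start) const {
    for (std::size_t i = start; i < SizeInBits(); ++i) {
      if (IsBitSet(i)) return i;
    }
    return SizeInBits();
  }

  std::vector<std::uint32_t> words_;
};

// Usage: gather the indexes of all set bits with a range-based for loop.
std::vector<std::size_t> CollectIndexes(const TinyBitVector& bv) {
  std::vector<std::size_t> result;
  for (std::size_t index : bv.Indexes()) {
    result.push_back(index);
  }
  return result;
}
```

The linear scan in FindSetBitFrom keeps the sketch short; a production implementation would likely skip whole zero words instead.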
ddb311fdeca82ca628fed694c4702f463b5c4927 16-May-2014 Nicolas Geoffray <ngeoffray@google.com> Build live ranges in preparation for register allocation.

Change-Id: I7ae24afaa4e49276136bf34f4ba7d62db7f28c01
0d3f578909d0d1ea072ca68d78301b6fb7a44451 14-May-2014 Nicolas Geoffray <ngeoffray@google.com> Linearize the graph before creating live ranges.

Change-Id: I02eb5671e3304ab062286131745c1366448aff58
f635e63318447ca04731b265a86a573c9ed1737c 14-May-2014 Nicolas Geoffray <ngeoffray@google.com> Add a compilation tracing mechanism to the new compiler.

Code mostly imported from: https://android-review.googlesource.com/#/c/81653/.

Change-Id: I150fe942be0fb270e03fabb19032180f7a065d13
622d9c31febd950255b36a48b47e1f630197c5fe 12-May-2014 Nicolas Geoffray <ngeoffray@google.com> Add loop recognition and CFG simplifications in new compiler.

We do three simplifications:
- Split critical edges, for code generation from SSA (new).
- Ensure one back edge per loop, to simplify loop recognition (new).
- Ensure only one pre header for a loop, to simplify SSA creation (existing).

Change-Id: I9bfccd4b236a00486a261078627b091c8a68be33
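
As an illustration of the first simplification listed in the commit above, here is a minimal sketch of critical edge splitting. It assumes the usual definition (an edge is critical when its source has several successors and its target has several predecessors) and a hypothetical adjacency-list CFG; the names Cfg and SplitCriticalEdges are made up for this example and do not reflect ART's HGraph API.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical CFG: successors[b] lists the successor block ids of block b.
using Cfg = std::vector<std::vector<std::size_t>>;

// Inserts an empty block on every critical edge and returns how many
// edges were split. New blocks are appended at the end of the CFG.
std::size_t SplitCriticalEdges(Cfg& successors) {
  // Count predecessors of each block.
  std::vector<std::size_t> num_preds(successors.size(), 0);
  for (const auto& succs : successors) {
    for (std::size_t target : succs) {
      ++num_preds[target];
    }
  }

  std::size_t num_split = 0;
  const std::size_t original_size = successors.size();
  for (std::size_t block = 0; block < original_size; ++block) {
    if (successors[block].size() <= 1) continue;  // Source has one way out.
    for (std::size_t i = 0; i < successors[block].size(); ++i) {
      const std::size_t target = successors[block][i];
      if (num_preds[target] <= 1) continue;  // Target has one way in.
      // Redirect block -> target through a fresh block. The target keeps
      // the same number of predecessors, so num_preds stays valid.
      const std::size_t new_block = successors.size();
      successors.push_back(std::vector<std::size_t>{target});
      successors[block][i] = new_block;
      ++num_split;
    }
  }
  return num_split;
}
```

After splitting, moves needed when leaving SSA form have a block of their own to live in, which is the motivation the commit gives ("for code generation from SSA").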
804d09372cc3d80d537da1489da4a45e0e19aa5d 02-May-2014 Nicolas Geoffray <ngeoffray@google.com> Build live-in, live-out and kill sets for each block.

This information will be used when computing live ranges of
instructions.

Change-Id: I345ee833c1ccb4a8e725c7976453f6d58d350d74
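
The per-block sets named in the commit above can be illustrated with a textbook backward dataflow sketch. This is an assumption about the general technique, not a copy of the code in ssa_liveness_analysis.cc: BlockInfo, upward_exposed_uses and ComputeLiveness are hypothetical names, values are plain integers, and phi handling is ignored.

```cpp
#include <cstddef>
#include <set>
#include <utility>
#include <vector>

// Hypothetical per-block record; values are identified by plain ints.
struct BlockInfo {
  std::vector<std::size_t> successors;
  std::set<int> upward_exposed_uses;  // Read in the block before any definition.
  std::set<int> kill;                 // Defined in the block.
  std::set<int> live_in;
  std::set<int> live_out;
};

// Iterates to a fixed point over:
//   live_out(b) = union of live_in(s) for each successor s of b
//   live_in(b)  = uses(b) union (live_out(b) minus kill(b))
void ComputeLiveness(std::vector<BlockInfo>& blocks) {
  bool changed = true;
  while (changed) {
    changed = false;
    // Visit blocks in reverse index order; for a roughly topological
    // numbering this lets information flow backwards quickly.
    for (std::size_t i = blocks.size(); i-- > 0;) {
      BlockInfo& block = blocks[i];
      std::set<int> live_out;
      for (std::size_t s : block.successors) {
        live_out.insert(blocks[s].live_in.begin(), blocks[s].live_in.end());
      }
      std::set<int> live_in = block.upward_exposed_uses;
      for (int value : live_out) {
        if (block.kill.count(value) == 0) {
          live_in.insert(value);
        }
      }
      if (live_in != block.live_in || live_out != block.live_out) {
        block.live_in = std::move(live_in);
        block.live_out = std::move(live_out);
        changed = true;
      }
    }
  }
}
```

std::set keeps the sketch readable; a compiler would more likely use bit vectors indexed by SSA value so that the unions and differences become word-wise operations.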