History log of /art/compiler/optimizing/instruction_simplifier.cc
Revision Date Author Comments
d59f3b1b7f5c1ab9f0731ff9dc60611e8d9a6ede 29-Mar-2016 Vladimir Marko <vmarko@google.com> Use iterators "before" the use node in HUserRecord<>.

Create a new template class IntrusiveForwardList<> that
mimics std::forward_list<> except that all allocations
are handled externally. This is essentially the same as
boost::intrusive::slist<> but since we're not using Boost
we have to reinvent the wheel.

Use the new container to replace the HUseList and store
iterators pointing "before" the use nodes in HUserRecord<> to avoid
the extra pointer to the previous node which was used
exclusively for removing nodes from the list. This reduces
the size of the HUseListNode by 25%, 32B to 24B in 64-bit
compiler, 16B to 12B in 32-bit compiler. This translates
directly to overall memory savings for the 64-bit compiler
but due to rounding up of the arena allocations to 8B, we
do not get any improvement in the 32-bit compiler.

Compiling the Nexus 5 boot image with the 64-bit dex2oat
on host this CL reduces the memory used for compiling the
most hungry method, BatteryStats.dumpLocked(), by ~3.3MiB:

Before:
MEM: used: 47829200, allocated: 48769120, lost: 939920
Number of arenas allocated: 345,
Number of allocations: 815492, avg size: 58
...
UseListNode 13744640
...
After:
MEM: used: 44393040, allocated: 45361248, lost: 968208
Number of arenas allocated: 319,
Number of allocations: 815492, avg size: 54
...
UseListNode 10308480
...

Note that while we do not ship the 64-bit dex2oat to the
device, the JIT compilation for 64-bit processes is using
the 64-bit libart-compiler.

Bug: 28173563
Bug: 27856014

(cherry picked from commit 46817b876ab00d6b78905b80ed12b4344c522b6c)

Change-Id: Ifb2d7b357064b003244e92c0d601d81a05e56a7b
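
The "before" iterator idea described in the commit above can be sketched as follows. This is only an illustration, not ART's IntrusiveForwardList<>: the names Node, List, PushFront and EraseAfter are hypothetical. The point it shows is that in a singly linked list a pointer to the node before an element is enough to unlink that element in constant time, so the nodes need no back pointer.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical node: the link lives inside the element (intrusive) and
// allocation is left entirely to the caller.
struct Node {
  int value = 0;
  Node* next = nullptr;
};

// Hypothetical singly linked list with a sentinel "before begin" node.
class List {
 public:
  Node* BeforeBegin() { return &head_; }

  // Links `node` in at the front; its "before" position is the sentinel.
  void PushFront(Node* node) {
    node->next = head_.next;
    head_.next = node;
  }

  // Unlinks the node that follows `before` in O(1) without a back pointer.
  Node* EraseAfter(Node* before) {
    Node* removed = before->next;
    assert(removed != nullptr);
    before->next = removed->next;
    removed->next = nullptr;
    return removed;
  }

  size_t Size() const {
    size_t count = 0;
    for (const Node* p = head_.next; p != nullptr; p = p->next) {
      ++count;
    }
    return count;
  }

 private:
  Node head_;  // Sentinel; its `value` is unused.
};
```

A user of such a list that caches "before" pointers has to refresh the cached pointer of whichever node follows an insertion or removal point, since that node's predecessor changes.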
f355c3ff08710ac2eba3aac2aacc5e65caa06b4c 30-Mar-2016 Roland Levillain <rpl@google.com> Fix Boolean to integral types conversions.

Bug: 27616343
Change-Id: I050f92045bca1b8b5d6da53547cc617f17be84b1
d96a246e5b103bfc167acaa6315bd8abca9de493 23-Mar-2016 Vladimir Marko <vmarko@google.com> Optimizing: Do not insert suspend checks on back-edges.

Rely on HGraph::SimplifyLoop() to insert suspend checks.

CodeGenerator's CheckLoopEntriesCanBeUsedForOsr() checks the
dex pcs of suspend checks against branch targets to verify
that we always have an appropriate point for OSR transition.
However, the HSuspendChecks that were added by HGraphBuilder
to support the recently removed "baseline" interfered with
this in a specific case, namely an infinite loop where the
back-branch jumps to a nop. In that case, the HSuspendCheck
added by HGraphBuilder had a dex pc different from the block
and the branch target but its presence would stop the
HGraph::SimplifyLoop() from adding a new HSuspendCheck with
the correct dex pc.

Bug: 27623547
Change-Id: I83566a260210bc05aea0c44509a39bb490aa7003
5b5b9319ff970979ed47d41a41283e4faeffb602 22-Mar-2016 Roland Levillain <rpl@google.com> Fix and improve shift and rotate operations.

- Define maximum int and long shift & rotate distances as
int32_t constants, as shift & rotate distances are 32-bit
integer values.
- Consider the (long, long) inputs case as invalid for
static evaluation of shift & rotate operations.
- Add more checks in shift & rotate operations constructors
as well as in art::GraphChecker.

Change-Id: I754b326c3a341c9cc567d1720b327dad6fcbf9d6
937e6cd515bbe7ff2f255c8fcd40bf1a575a9a16 22-Mar-2016 Roland Levillain <rpl@google.com> Tighten art::HNeg type constraints on its input.

Ensure art::HNeg is only passed a type having the kind of
its input. For a boolean, byte, short, or char input, it
means HNeg's type should be int.

Bug: 27684275
Change-Id: Ic8442c62090a8ab65590754874a14a0deb7acd8d
1a65388f1d86bb232c2e44fecb44cebe13105d2e 18-Mar-2016 Roland Levillain <rpl@google.com> Clean up art::HConstant predicates.

- Make the difference between arithmetic zero and zero-bit
pattern unambiguous.
- Introduce Boolean predicates in art::HIntConstant for when
they are used as Booleans.
- Introduce arithmetic positive and negative zero predicates
for floating-point constants.

Bug: 27639313
Change-Id: Ia04ecc6f6aa7450136028c5362ed429760c883bd
22c4922c6b31e154a6814c4abe9015d9ba156911 18-Mar-2016 Roland Levillain <rpl@google.com> Ensure art::HRor supports boolean, byte, short and char inputs.

Also extend tests covering the IntegerRotateLeft,
LongRotateLeft, IntegerRotateRight and LongRotateRight
intrinsics and their translation into an art::HRor
instruction.

Bug: 27682579
Change-Id: I89f6ea6a7315659a172482bf09875cfb7e7422a1
a5c4a4060edd03eda017abebc85f24cffb083ba7 15-Mar-2016 Roland Levillain <rpl@google.com> Make art::HCompare support boolean, byte, short and char inputs.

Also extend tests covering the IntegerSignum, LongSignum,
IntegerCompare and LongCompare intrinsics and their
translation into an art::HCompare instruction.

Bug: 27629913
Change-Id: I0afc75ee6e82602b01ec348bbb36a08e8abb8bb8
6915898b28cea6c9836ca1be6814d87e89cc6d76 16-Mar-2016 Calin Juravle <calin@google.com> Improve compiler stats

- report the max size of arena alloc
- report how many virtual or interface invokes were inlined

Change-Id: I82f154a8e25b5e3890181a1aa11346cdc3f93e37
5b1805357b80d780d6afc9e2c70c6544c7ac7e2f 15-Mar-2016 Vladimir Marko <vmarko@google.com> ART: Fix shift simplification, x >>> 64.

Fix braino in
https://android-review.googlesource.com/208199

Bug: 27638111
Change-Id: I8f12008af8bba943664c8a9eac3f2d2f7c820e79
164306e779de522efba7df637618a8eeed9e37ac 15-Mar-2016 Vladimir Marko <vmarko@google.com> Optimizing: Improve shift simplification, x >>> 64.

Simplify shifts by a multiple of bit size, not just 0.
ARM codegen does not expect to see such shifts and it
is guarding against them with a DCHECK().

Bug: 27638111
Change-Id: I3ae8383d7edefa0facd375ce511e7a226d5468a1
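
Why a shift by a multiple of the bit size can be dropped follows from how Java and dex shifts treat the distance: only its low 5 bits (int) or low 6 bits (long) are used. The sketch below models that masking in C++; the helper names are hypothetical and this is not the simplifier's code.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of Java/dex `>>>`: the distance is masked to the low
// 5 bits for int and to the low 6 bits for long before shifting.
constexpr uint32_t UShrInt(uint32_t x, int32_t distance) {
  return x >> (distance & 31);
}

constexpr uint64_t UShrLong(uint64_t x, int32_t distance) {
  return x >> (distance & 63);
}

// A distance that is a multiple of the bit size behaves like a distance of 0.
inline void CheckShiftByMultipleOfBitSize() {
  const uint64_t x = 0x8000000000000001ULL;
  assert(UShrLong(x, 64) == x);
  assert(UShrLong(x, 128) == x);
  assert(UShrInt(0xF0u, 32) == 0xF0u);
}
```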
24bd89559c177af9e342f0d5a64a0a2855dfb887 15-Mar-2016 Vladimir Marko <vmarko@google.com> Optimizing: Prevent potential valgrind error.

This CL preemptively extends the workaround from
https://android-review.googlesource.com/208230
to an almost identical bit of code.

Bug: 27651442
Change-Id: I7683d42b46b16f2293916defc6ef1d871dc9af6c
a65ed3045ec2df95a30994752b3fb0576f479354 14-Mar-2016 Vladimir Marko <vmarko@google.com> Optimizing: Fix valgrind error in image_test64.

Bug: 27651442
Change-Id: Id9b80c6015dbc3b82966766ca4ad010be770f116
625090fe9bf47d8d735c9a66cbf491de3a3e3765 14-Mar-2016 Vladimir Marko <vmarko@google.com> Optimizing: Fix TypeConversion(And(x, const)) simplification.

Avoid introducing implicit conversions when simplifying the
expression TypeConversion(And(x, const)). Previously, when
we dropped the And, we could end up with a TypeConversion to
the same type which should be eliminated on subsequent pass
of the block's instructions; however, a subsequent dependent
TypeConversion in the same block would be processed earlier
and we would unexpectedly see its input as the conversion to
the same type, failing a DCHECK().

Bug: 27626509
Change-Id: I5874a9ceafbf635cf3391beea807ede8468ab5c3
bdd7935c2adc3ad190ee87958e714a36f33cedae 14-Feb-2016 Anton Shamin <anton.shamin@intel.com> Revert "Revert "Revert "Revert "Change condition to opposite if lhs is constant""""

This reverts commit d4aee949b3dd976295201b5310f13aa2df40afa1.

Change-Id: I505b8c9863c310a3a708f580b00d425b750c9541
1193259cb37c9763a111825aa04718a409d07145 08-Mar-2016 Aart Bik <ajcbik@google.com> Implement the 1.8 unsafe memory fences directly in HIR.

Rationale:
More efficient since it exposes full semantics to
all operations on the graph and allows for proper
code generation for all architectures.

bug=26264765

Change-Id: Ic435886cf0645927a101a8502f0623fa573989ff
2a6aad9d388bd29bff04aeec3eb9429d436d1873 25-Feb-2016 Aart Bik <ajcbik@google.com> Implement fp to bits methods as intrinsics.

Rationale:
Better optimization, better performance.

Results on libcore benchmark:

Most gain is from moving the invariant call out of the loop
after we detect everything is a side-effect free intrinsic.
But generated code in general case is much cleaner too.

Before:
timeFloatToIntBits() in 181 ms.
timeFloatToRawIntBits() in 35 ms.
timeDoubleToLongBits() in 208 ms.
timeDoubleToRawLongBits() in 35 ms.

After:
timeFloatToIntBits() in 36 ms.
timeFloatToRawIntBits() in 35 ms.
timeDoubleToLongBits() in 35 ms.
timeDoubleToRawLongBits() in 34 ms.

bug=11548336

Change-Id: I6e001bd3708e800bd75a82b8950fb3a0fc01766e
8ffc1fa556eb92f50a0bd3d5eab56435fff206f6 18-Feb-2016 Aart Bik <ajcbik@google.com> Set bias on != comparison for isNaN.

Change-Id: I83969ecf7252b5e001bdd501c4ca31e7d0608854
75a38b24801bd4d27c95acef969930f626dd11da 17-Feb-2016 Aart Bik <ajcbik@google.com> Implement isNaN intrinsic through HIR equivalent.

Rationale:
Efficient implementation on all platforms.
Subject to better compiler optimizations.

Change-Id: Ie8876bf5943cbe1138491a25d32ee9fee554043c
8428bd376e660df2ffceee72f797d1cfc6c66433 12-Feb-2016 Vladimir Marko <vmarko@google.com> Optimizing: Remove unnecessary And before TypeConversion.

For example `(byte) (x & 0xff)` doesn't need the `& 0xff`.

Bug: 23965701
Change-Id: I5fc8419491aff2cdc7074451e74e873b5f582d41
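
The claim that `(byte) (x & 0xff)` does not need the mask can be checked with a small model of the int-to-byte conversion. This is a sketch under the assumption that the conversion keeps the low 8 bits and sign-extends them; IntToByte is a hypothetical helper, not an ART function.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of int-to-byte: keep the low 8 bits, then sign-extend.
constexpr int32_t IntToByte(int32_t x) {
  return static_cast<int32_t>((static_cast<uint32_t>(x) & 0xffu) ^ 0x80u) - 0x80;
}

// The conversion only looks at the low 8 bits, so masking them first with
// 0xff cannot change the result.
inline void CheckMaskBeforeByteConversion() {
  const int32_t samples[] = {0, 1, 0x7f, 0x80, 0xff, 0x1234,
                             -1, -129, INT32_MIN, INT32_MAX};
  for (int32_t x : samples) {
    assert(IntToByte(x & 0xff) == IntToByte(x));
  }
}
```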
b52bbde2870e5ab5d126612961dcb3da8e5236ee 12-Feb-2016 Vladimir Marko <vmarko@google.com> Optimizing: Simplify consecutive type conversions.

Merge two consecutive type conversions to one if the result
of such merged conversion is guaranteed to be the same and
remove all implicit conversions, not just conversions to the
same type. Improve codegens to handle conversions from long
to integral types smaller than int.

This will make it easier to simplify `(byte) (x & 0xffL)` to
`(byte) x` where the conversion from long to byte is done by
two dex instructions, long-to-int and int-to-byte.

Bug: 23965701
Change-Id: I833f193556671136ad2cd3f5b31cdfbc2d99c19d
a19616e3363276e7f2c471eb2839fb16f1d43f27 02-Feb-2016 Aart Bik <ajcbik@google.com> Implemented compare/signum intrinsics as HCompare
(with all code generation for all)

Rationale:
At HIR level, many more optimizations are possible, while ultimately
generated code can take advantage of full semantics.

Change-Id: I6e2ee0311784e5e336847346f7f3c4faef4fd17e
9f98025ba5541641cfa9abb7b9cf30332d91fad1 05-Feb-2016 Alexandre Rames <alexandre.rames@linaro.org> Extend De Morgan factorisation to `HBooleanNot`.

Change-Id: I81aa92277fa136d675e7ef01be8e4acdbd3d3b7c
ca0e3a0c9f1fd5902dc40043b061d2f9b79ec098 03-Feb-2016 Alexandre Rames <alexandre.rames@linaro.org> Revert "Revert "Optimizing: double-negated bitwise operations simplifications""

This reverts commit 737c0a99dfbba306ec1f50e2adf66b5d97805af6 with fixes.

In the original patch, the new instruction could be inserted before
one of its inputs. A regression test is also added.

Change-Id: Ie49a17ac90ff048355d9cc944b468cd1b1914424
74eb1b264691c4eb399d0858015a7fc13c476ac6 14-Dec-2015 David Brazdil <dbrazdil@google.com> ART: Implement HSelect

This patch adds a new HIR instruction to Optimizing. HSelect returns
one of two inputs based on the outcome of a condition.

This is only an initial implementation, which:
- defines the new instruction,
- repurposes BooleanSimplifier to emit it,
- extends InstructionSimplifier to statically resolve it,
- updates existing code and tests accordingly.

Code generators currently emit fallback if/then/else code and will be
updated in follow-up CLs to use platform-specific conditional moves
when possible.

Change-Id: Ib61b17146487ebe6b55350c2b589f0b971dcaaee
737c0a99dfbba306ec1f50e2adf66b5d97805af6 25-Jan-2016 Nicolas Geoffray <ngeoffray@google.com> Revert "Optimizing: double-negated bitwise operations simplifications"

Fails compiling the Wallet.apk with:

dex2oatd F 40736 41007 art/compiler/optimizing/optimizing_compiler.cc:194] Error after instruction_simplifier: art::SSAChecker: Instruction Add:59 in block 4 does not dominate use Or:153 in block 4.

This reverts commit 96798493170521691d709be50dd2102ead47b083.

Change-Id: Ia4b02e62e6133aa104f5db12ba82d5561b6fc090
96798493170521691d709be50dd2102ead47b083 15-Jan-2016 Kevin Brodsky <kevin.brodsky@linaro.org> Optimizing: double-negated bitwise operations simplifications

Generic instruction simplifications applying to bitwise operations when
both inputs are Not's. And and Or are handled by De Morgan's laws,
removing one instruction:
~a & ~b -> ~(a | b)
~a | ~b -> ~(a & b)
Xor is handled by this trivial relation, removing two instructions:
~a ^ ~b = a ^ b

The simplifications only happen when neither Not is used by other
instructions.

Change-Id: I5d5187af2f625c475c3e49466af6bc3e87595f8f
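
The three identities used by the commit above can be verified exhaustively on 8-bit values. The following is just a self-check written for illustration (the function name is hypothetical); it says nothing about how the simplifier itself is implemented.

```cpp
#include <cassert>
#include <cstdint>

// Checks ~a & ~b == ~(a | b), ~a | ~b == ~(a & b) and ~a ^ ~b == a ^ b
// for every pair of 8-bit values.
inline void CheckDoubleNegatedBitwiseOps() {
  for (uint32_t i = 0; i < 256; ++i) {
    for (uint32_t j = 0; j < 256; ++j) {
      const uint8_t a = static_cast<uint8_t>(i);
      const uint8_t b = static_cast<uint8_t>(j);
      const uint8_t not_a = static_cast<uint8_t>(~a);
      const uint8_t not_b = static_cast<uint8_t>(~b);
      assert(static_cast<uint8_t>(not_a & not_b) == static_cast<uint8_t>(~(a | b)));
      assert(static_cast<uint8_t>(not_a | not_b) == static_cast<uint8_t>(~(a & b)));
      assert(static_cast<uint8_t>(not_a ^ not_b) == static_cast<uint8_t>(a ^ b));
    }
  }
}
```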
d4aee949b3dd976295201b5310f13aa2df40afa1 22-Jan-2016 Nicolas Geoffray <ngeoffray@google.com> Revert "Revert "Revert "Change condition to opposite if lhs is constant"""

Fails two checker tests:

458-checker-instruction-simplification
537-checker-jump-over-jump

This reverts commit 884e54c8a45e49b58cb1127c8ed890f79f382601.

Change-Id: I22553e4e77662736b8b453d911a2f4e601f3a27e
884e54c8a45e49b58cb1127c8ed890f79f382601 22-Jan-2016 Nicolas Geoffray <ngeoffray@google.com> Revert "Revert "Change condition to opposite if lhs is constant""

This reverts commit a05cacc11fa075246c38497c01b949745fadc54b.

Change-Id: Ifdc261fd4dfb2c538017fe1d69af723aafd4afef
6de1938e562b0d06e462512dd806166e754035ea 08-Jan-2016 David Brazdil <dbrazdil@google.com> ART: Remove incorrect HFakeString optimization

Simplification of HFakeString assumes that it cannot be used until
String.<init> is called which is not true and causes different
behaviour between the compiler and the interpreter. This patch
removes the optimization together with the HFakeString instruction.

Instead, HNewInstance is generated and an empty String allocated
until it is replaced with the result of the StringFactory call. This
is consistent with the behaviour of the interpreter but is too
conservative. A follow-up CL will attempt to optimize out the initial
allocation when possible.

Bug: 26457745
Bug: 26486014

Change-Id: I7139e37ed00a880715bfc234896a930fde670c44
a05cacc11fa075246c38497c01b949745fadc54b 12-Jan-2016 Nicolas Geoffray <ngeoffray@google.com> Revert "Change condition to opposite if lhs is constant"

Breaks arm64

This reverts commit f9f196c55f3b25c3b09350cd8ed5d7ead31f1757.

Change-Id: Ie1027a218154b8ded6c1c8f0007720f5be68780d
f9f196c55f3b25c3b09350cd8ed5d7ead31f1757 08-Sep-2015 Anton Shamin <anton.shamin@intel.com> Change condition to opposite if lhs is constant

Swap operands if lhs is constant. Handled unsigned comparison
in instruction simplifier. Fixed NaN comparison: no matter what
bias is set, the result of Equal and NotEqual operations should not
depend on it. Added checker tests.

Change-Id: I5a9ac25fb10f2705127a52534867cee43368ed1b
Signed-off-by: Anton Shamin <anton.shamin@intel.com>
Signed-off-by: Serguei Katkov <serguei.i.katkov@intel.com>
92d9060c0cdff7c726549a9d9494e5655404bed7 19-Dec-2015 Alexey Frunze <Alexey.Frunze@imgtec.com> MIPS: Implement HRor

This also fixes differentiation between the SRL and ROTR
instructions in the disassembler.

Change-Id: Ie19697f8d6ea8fa4e338adde3e3cf8e4a0383eae
299a93993fb8f3efbf0465cf674d80c3bcfdc66c 09-Dec-2015 Alexey Frunze <Alexey.Frunze@imgtec.com> MIPS64: Fuse long and FP compare & condition in Optimizing.

Bug: 25559148

Change-Id: I2d14ac75460a76848c71c08cffff6d7a18f5f580
cd7b0ee296b0462961c63e51d99c9c323e2690df 04-Dec-2015 Alexey Frunze <Alexey.Frunze@imgtec.com> MIPS32: Fuse long and FP compare & condition in Optimizing.

This also does a minor clean-up in the assembler and
its test.

Bug: 25559148
Change-Id: I9bad3c500b592a09013b56745f70752eb284a842
351dddf4025f07477161209e374741f089d97cb4 11-Dec-2015 Vladimir Marko <vmarko@google.com> Optimizing: Clean up after HRor.

Change-Id: I96bd7fa2e8bdccb87a3380d063dad0dd57fed9d7
40a04bf64e5837fa48aceaffe970c9984c94084a 11-Dec-2015 Scott Wakeling <scott.wakeling@linaro.org> Replace rotate patterns and invokes with HRor IR.

Replace constant and register version bitfield rotate patterns, and
rotateRight/Left intrinsic invokes, with new HRor IR.

Where k is constant and r is a register, with the UShr and Shl on
either side of a |, +, or ^, the following patterns are replaced:

x >>> #k OP x << #(reg_size - k)
x >>> #k OP x << #-k

x >>> r OP x << (#reg_size - r)
x >>> (#reg_size - r) OP x << r

x >>> r OP x << -r
x >>> -r OP x << r

Implemented for ARM/ARM64 & X86/X86_64.

Tests changed to not be inlined to prevent optimization from folding
them out. Additional tests added for constant rotate amounts.

Change-Id: I5847d104c0a0348e5792be6c5072ce5090ca2c34
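
The equivalence between the shift patterns listed above and a rotate can be illustrated for 32-bit values combined with `|`. The sketch assumes Java-style masking of shift distances; UShr, Shl, Ror32 and the two Pattern helpers are hypothetical names, and the `+` and `^` variants mentioned in the commit are not modelled here.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical models of Java/dex int shifts: the distance is masked to 5 bits.
constexpr uint32_t UShr(uint32_t x, int32_t d) { return x >> (d & 31); }
constexpr uint32_t Shl(uint32_t x, int32_t d) { return x << (d & 31); }

// Reference rotate right by d (mod 32).
constexpr uint32_t Ror32(uint32_t x, int32_t d) {
  return (x >> (d & 31)) | (x << ((32 - (d & 31)) & 31));
}

// x >>> k | x << (reg_size - k)
constexpr uint32_t PatternSizeMinusK(uint32_t x, int32_t k) {
  return UShr(x, k) | Shl(x, 32 - k);
}

// x >>> k | x << -k
constexpr uint32_t PatternNegK(uint32_t x, int32_t k) {
  return UShr(x, k) | Shl(x, -k);
}

inline void CheckRotatePatterns() {
  const uint32_t samples[] = {0u, 1u, 0x12345678u, 0x80000000u, 0xFFFFFFFFu};
  for (uint32_t x : samples) {
    for (int32_t k = 0; k < 64; ++k) {
      assert(PatternSizeMinusK(x, k) == Ror32(x, k));
      assert(PatternNegK(x, k) == Ror32(x, k));
    }
  }
}
```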
f652917de5634b30c974c81d35a72871915b352a 17-Nov-2015 Mark Mendell <mark.p.mendell@intel.com> Simplify boolean condition compared to 0

CaffeineMarkRR Logic has some boolean flipping which can be helped by
some simplification.

Simplify non-FP (A COND_OP B) != 0 to A OPPOSITE_COND_OP B.
This is better than the original code, which would use a HBooleanNot
after the condition.

Also simplify non-FP (A COND_OP B) == 1 to A OPPOSITE_COND_OP B.

Move GetOppositeCondition to nodes.h/nodes.cc to share with Boolean
Simplification, renaming it to InsertOppositeCondition, as it inserts
the new HInstruction (unless it is a constant).

Change-Id: I34ded7758836e375de0d6fdba9239d2d451928d0
Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>
f9d741e32c6f1629ce70eefc68d3363fa1cfd696 20-Nov-2015 Vladimir Marko <vmarko@google.com> Optimizing/ARM: Improve long shifts by 1.

Implement long
Shl(x,1) as LSLS+ADC,
Shr(x,1) as ASR+RRX and
UShr(x,1) as LSR+RRX.

Remove the simplification substituting Shl(x,1) with
ADD(x,x) as it interferes with some other optimizations
instead of helping them. And since it didn't help 64-bit
architectures anyway, codegen is the correct place for it.
This is now implemented for ARM and x86, so only mips32 can
be improved.

Change-Id: Idd14f23292198b2260189e1497ca5411b21743b3
38db785600757a832423e076b3cf0af3bee942d8 20-Nov-2015 Alexandre Rames <alexandre.rames@linaro.org> Opt compiler: More strength reduction for multiplications.

We transform code looking like

MUL dst, src, (2^n + 1)

into

SHL tmp, src, n
ADD dst, src, tmp

and code looking like

MUL dst, src, (2^n - 1)

into

SHL tmp, src, n
SUB dst, tmp, src

Change-Id: Ia620ab68758caa70a01530b88cd65dd0444376d7
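
The two rewrites above rely on src * (2^n + 1) = (src << n) + src and src * (2^n - 1) = (src << n) - src. A minimal sketch, using unsigned 32-bit arithmetic so that wrap-around is well defined in C++ (the function names are hypothetical):

```cpp
#include <cassert>
#include <cstdint>

// MUL dst, src, (2^n + 1)  ==>  SHL tmp, src, n ; ADD dst, src, tmp
constexpr uint32_t MulByPow2Plus1(uint32_t src, unsigned n) {
  return src + (src << n);
}

// MUL dst, src, (2^n - 1)  ==>  SHL tmp, src, n ; SUB dst, tmp, src
constexpr uint32_t MulByPow2Minus1(uint32_t src, unsigned n) {
  return (src << n) - src;
}

inline void CheckMulStrengthReduction() {
  const uint32_t samples[] = {0u, 1u, 7u, 0x7FFFFFFFu, 0xFFFFFFFFu};
  for (uint32_t src : samples) {
    for (unsigned n = 1; n < 31; ++n) {
      assert(MulByPow2Plus1(src, n) == src * ((1u << n) + 1u));
      assert(MulByPow2Minus1(src, n) == src * ((1u << n) - 1u));
    }
  }
}
```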
f652cecb984c104d44a0223c3c98400ef8ed8ce2 25-Aug-2015 Goran Jakovljevic <Goran.Jakovljevic@imgtec.com> MIPS: Initial version of optimizing compiler for MIPS32

Change-Id: I370388e8d5de52c7001552b513877ef5833aa621
bb245d199a5240b4c520263fd2c8c10dba79eadc 19-Oct-2015 Aart Bik <ajcbik@google.com> Generalize codegen and simplification of deopt.

Rationale: the de-opt instruction is very similar to an if,
so the existing assumption that it always has a
conditional "under the hood" is very unsafe, since
optimizations may have replaced conditionals with
actual values; this CL generalizes handling of deopt.

Change-Id: I1c6cb71fdad2af869fa4714b38417dceed676459
e9f37600e98ba21308ad4f70d9d68cf6c057bdbe 09-Oct-2015 Aart Bik <ajcbik@google.com> Added support for unsigned comparisons

Rationale: even though not directly supported in input graph,
having the ability to express unsigned comparisons
in HIR is useful for all sorts of optimizations.

Change-Id: I4543c96a8c1895c3d33aaf85685afbf80fe27d72
ee3cf0731d0ef0787bc2947c8e3ca432b513956b 06-Oct-2015 Nicolas Geoffray <ngeoffray@google.com> Intrinsify System.arraycopy.

Currently on x64, will do the other architectures in
different changes.

Change-Id: I15fbbadb450dd21787809759a8b14b21b1e42624
e53fb5582f8f6ece5d0ce3b9c0d5b1cdb654b254 07-Oct-2015 Calin Juravle <calin@google.com> Don't remove type checks if we need to perform an access check.

Change-Id: I9b9e07c7524e96ece8dc089c8379631c2f9e3320
a83a54d7f2322060f08480f8aabac5eb07268912 02-Oct-2015 Nicolas Geoffray <ngeoffray@google.com> Add support for intrinsic optimizations.

Change-Id: Ib5a4224022f9360e60c09a19ac8642270a7f3b64
98893e146b0ff0e1fd1d7c29252f1d1e75a163f2 02-Oct-2015 Calin Juravle <calin@google.com> Add support for unresolved classes in optimizing.

Change-Id: I0e299a81e560eb9cb0737ec46125dffc99333b54
e0395dd58454e27fc47c0ca273913929fb658e6c 25-Sep-2015 Nicolas Geoffray <ngeoffray@google.com> Optimize ArraySet for x86/x64/arm/arm64.

Change-Id: I5bc8c6adf7f82f3b211f0c21067f5bb54dd0c040
452c1b60120aee0883c3339b363f820b8d69c299 25-Sep-2015 Vladimir Marko <vmarko@google.com> Optimizing: Simplify UShr+And, Shr+And.

Eliminate And from UShr+And if the And-mask contains all the
bits that can be non-zero after UShr. Transform Shr+And to
UShr if the And-mask precisely clears the shifted-in sign
bits.

This prepares for detecting the Rotate pattern, i.e.
(x << N) | (x >>> (SIZE - N))
in code that unnecessarily masks the UShr, for example
(x << 1) | ((x >>> 31) & 1) ,
or uses Shr, for example
(x << 8) | ((x >> 24) & 0xff) .

Change-Id: I684c4b752547d9b1057d0d4c4d44550bb1a3ffb4
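
The two examples from the commit above can be checked directly. In this sketch the arithmetic shift is modelled on unsigned values to stay portable; LogicalShr and ArithmeticShr are hypothetical helper names.

```cpp
#include <cassert>
#include <cstdint>

// Logical right shift (UShr): zeros are shifted in.
constexpr uint32_t LogicalShr(uint32_t x, unsigned n) { return x >> n; }

// Arithmetic right shift (Shr) modelled on unsigned values: copies of the
// sign bit are shifted in.
constexpr uint32_t ArithmeticShr(uint32_t x, unsigned n) {
  return (x >> n) | ((x & 0x80000000u) != 0u ? ~(0xFFFFFFFFu >> n) : 0u);
}

inline void CheckShiftAndMask() {
  const uint32_t samples[] = {0u, 1u, 0x80000001u, 0x12345678u, 0xFFFFFFFFu};
  for (uint32_t x : samples) {
    // After x >>> 31 only bit 0 can be set, so `& 1` is redundant.
    assert((LogicalShr(x, 31) & 1u) == LogicalShr(x, 31));
    // 0xff clears exactly the 24 shifted-in sign bits, so Shr becomes UShr.
    assert((ArithmeticShr(x, 24) & 0xffu) == LogicalShr(x, 24));
  }
}
```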
6e7455e90411c77088af5fcbf828219842bd2182 28-Sep-2015 Nicolas Geoffray <ngeoffray@google.com> Use dominance information for null optimization in write barrier.

Change-Id: I8b57dafcd321c9afa1bbfc6a0674cbea15cbf10c
aae9e66a727756bc965121a60ffcef89ed370e6c 21-Aug-2015 Serdjuk, Nikolay Y <nikolay.y.serdjuk@intel.com> ART: Fix the simplifier for NEGATE add/sub

Instruction simplifier for negate add/sub should not proceed
with floats because that might cause incorrect behavior
with signed zero.

Change-Id: I4970694a2b265a3577cde34fee9cd3a437358c0f
efa8468c78fdd808043dfb664b56541f3f2dd0e8 13-Aug-2015 Nicolas Geoffray <ngeoffray@google.com> Small optimization improvements.

- Tune CanBeNull for HBoundType.
- Remove LoadClass when we know the class is loaded.
- Tune CanBeNull for StringInit.

Change-Id: I564ed33a506d65e991a514342bdfd1610bed0cf5
f2ea71cdb3ee4f5198bc0298aa8be1f9e945ee1c 05-Aug-2015 Serguei Katkov <serguei.i.katkov@intel.com> ART: Fix the simplifier for add/sub

Instruction simplifier for add/sub should not proceed with floats
because that might cause incorrect behavior with signed zero.

Bug: 23001681

Signed-off-by: Serguei Katkov <serguei.i.katkov@intel.com>

(cherry picked from commit 115b53f609e74672fa93eea1845bb17340d5112a)

Change-Id: I9928724c4158b3961e32e376b9203fe01ba2e442
115b53f609e74672fa93eea1845bb17340d5112a 05-Aug-2015 Serguei Katkov <serguei.i.katkov@intel.com> ART: Fix the simplifier for add/sub

Instruction simplifier for add/sub should not proceed with floats
because that might cause incorrect behavior with signed zero.

Change-Id: If0c9bf3931bcbf96b0814f8605a86997aea37145
Signed-off-by: Serguei Katkov <serguei.i.katkov@intel.com>
2e76830f0b3f23825677436c0633714402715099 28-Jul-2015 Calin Juravle <calin@google.com> Revert "Revert "Revert "Revert "Use the object class as top in reference type propagation""""

This reverts commit b734808d0c93af98ec4e3539fdb0a8c0787263b0.

Change-Id: Ifd925f166761bcb9be2268ff0fc9fa3a72f00c6f
b734808d0c93af98ec4e3539fdb0a8c0787263b0 28-Jul-2015 Calin Juravle <calin@google.com> Revert "Revert "Revert "Use the object class as top in reference type propagation"""

This reverts commit 80caa1478cf3df4eac1214d8a63a4da6f4fe622b.

Change-Id: I63b51ca418b19b2bfb5ede3f8444f8fbeb8a339d
80caa1478cf3df4eac1214d8a63a4da6f4fe622b 16-Jul-2015 Calin Juravle <calin@google.com> Revert "Revert "Use the object class as top in reference type propagation""

This reverts commit 7733bd644ac71f86d4b30a319624b23343882e53.

Change-Id: I7d393a808c01c084c18d632a54e0554b4b455f2c
7733bd644ac71f86d4b30a319624b23343882e53 22-Jul-2015 Calin Juravle <calin@google.com> Revert "Use the object class as top in reference type propagation"

This reverts commit 3fabec7a25d151b26ba7de13615bbead0dd615a6.

Change-Id: Id8614f6b6e3e0e4c9caeb9f771e4c145d9fec64f
3fabec7a25d151b26ba7de13615bbead0dd615a6 16-Jul-2015 Calin Juravle <calin@google.com> Use the object class as top in reference type propagation

This properly types all instructions, making it safe to query the type
at any time.

This also moves a few functions from class.h to class-inl.h to please
gcc linker when compiling for target.

Change-Id: I6b7ce965c10834c994b95529ab65a548515b4406
7f63c52c8e94ed1340b7a1d04b046ff12819d2bc 13-Jul-2015 Roland Levillain <rpl@google.com> Revert "Revert "Fuse long and FP compare & condition on ARM64 in Optimizing.""

This reverts commit bed50d2430e02a3d6b94972e8ab4873d7b3b8be0.

Bug: 21120453
Change-Id: I5e4aab2703966d9324ebde25bd8b83056fdb10ed
2e7cd752452d02499a2f5fbd604c5427aa372f00 10-Jul-2015 Nicolas Geoffray <ngeoffray@google.com> [optimizing] Don't rely on the verifier for String.<init>.

Continue work on cutting the dependency on the verifier.

Change-Id: I0f95b1eb2e10fd8f6bf54817f1202bdf6dfdb0fe
bed50d2430e02a3d6b94972e8ab4873d7b3b8be0 10-Jul-2015 Roland Levillain <rpl@google.com> Revert "Fuse long and FP compare & condition on ARM64 in Optimizing."

This reverts commit 5cfe61f27ed9203498169355bb95db756486d292.

Change-Id: I9879e76e7f8315cace05700e3b571a6a4749bf1a
5cfe61f27ed9203498169355bb95db756486d292 10-Jul-2015 Roland Levillain <rpl@google.com> Fuse long and FP compare & condition on ARM64 in Optimizing.

Bug: 21120453
Change-Id: I701e808600fb5ba9ff4d0f5e19e4ce22b1d34b29
4fa13f65ece3b68fe3d8722d679ebab8656bbf99 06-Jul-2015 Roland Levillain <rpl@google.com> Fuse long and FP compare & condition on ARM in Optimizing.

Also:
- Stylistic changes in corresponding parts on the x86 and
x86-64 code generators.
- Update and improve the documentation of
art::arm::Condition.

Bug: 21120453
Change-Id: If144772046e7d21362c3c2086246cb7d011d49ce
c470193cfc522fc818eb2eaab896aef9caf0c75a 10-Apr-2015 Mark Mendell <mark.p.mendell@intel.com> Fuse long and FP compare & condition on x86/x86-64 in Optimizing.

This is a preliminary implementation of fusing long/float/double
compares with conditions to avoid materializing the result from the
compare and condition.

The information from a HCompare is transferred to the HCondition if it
is legal. There must be only a single use of the HCompare, the HCompare
and HCondition must be in the same block, the HCondition must not need
materialization.

Added GetOppositeCondition() to HCondition to return the flipped
condition.

Bug: 21120453
Change-Id: I1f1db206e6dc336270cd71070ed3232dedc754d6
Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>
9fdb31e12023d94c710a766a54d8a57c91a196f9 01-Jul-2015 Nicolas Geoffray <ngeoffray@google.com> Do not do a type check when setting null to an array.

Change-Id: I7387d45aea697d4a3de273335647220a815a992b
0bc614dfaff593d77eb698c279044db44bad4a4b 19-Jun-2015 Nicolas Geoffray <ngeoffray@google.com> Do not expect 0 or 1 only when comparing a boolean.

bug:21866529

(cherry picked from commit 3c4ab80c102ff1bfc0e74d4abddbf5454bf4008d)

Change-Id: Ibdc0d4a9730bfc6e7307282276f084dae5ac55c1
1e9ec053008fca7eb713815716c69375c37b399c 22-Jun-2015 David Brazdil <dbrazdil@google.com> ART: Simplify (Not)Equal bool vs. int to true/false

Optimizations on the HGraph may produce comparisons of bool and ints.
Instruction simplifier will simplify these only for 0/1 int constants.
Since the range of bool is known, comparison against all other int
constants can always be determined statically.

Change-Id: I502651b7a08edf71ee0b2589069f00def6aacf66
3c4ab80c102ff1bfc0e74d4abddbf5454bf4008d 19-Jun-2015 Nicolas Geoffray <ngeoffray@google.com> Do not expect 0 or 1 only when comparing a boolean.

bug:21866529
Change-Id: I81ffba609a357010bd86073eb979024fc668ed20
7cb499b1af1575c854860b0d6a103c4a2a59a569 17-Jun-2015 Nicolas Geoffray <ngeoffray@google.com> Fix bug in optimizing around instanceof.

We were too aggressive when removing instanceof. We should
not remove it when one of the two static types is an
interface.

Change-Id: I1fd80915b99b094f7b4393e7adb2b160201b30d5
222862ceaeed48528020412ef4f7b1cdaecf8789 09-Jun-2015 Guillaume Sanchez <guillaumesa@google.com> Add optimizations for instanceof/checkcast.

The optimizations try to statically determine the outcome of the
type tests, replacing/removing the instructions when possible.

This required fixing the is_exact flag for ReferenceTypePropagation.

Change-Id: I6cea29b6c351d118b62060e8420333085e9383fb
07276db28d654594e0e86e9e467cad393f752e6e 18-May-2015 Nicolas Geoffray <ngeoffray@google.com> Don't do a null test in MarkGCCard if the value cannot be null.

Change-Id: I45687f6d3505178e2fc3689eac9cb6ab1b2c1e29
8909bafa5d64e12eb53f3d37b984f53e7a632224 23-Apr-2015 Guillaume "Vermeille" Sanchez <guillaumesa@google.com> Mark CheckCast's and InstanceOf's input as !CanBeNull if used before in a NullCheck

Change-Id: Ied0412a01922b40a3f5d89bed49707498582abc1
ba56d060116d6e145be348fa575314654c6b0572 06-May-2015 Mark Mendell <mark.p.mendell@intel.com> [optimizing] Improve 32 bit long shift by 1.

Also change FOO << 1 to FOO+FOO in the instruction simplifier. This is
an architecture independent simplification, which helps 'long << 1' for
32 bit architectures.

Generate an add/adc for long << 1 in x86, in case something is generated
after the simplifier.

Add test cases for the simplification.

Change-Id: I0d512331ef13cc4ccf10c80f11c370a10ed02294
Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>
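
How an add/adc pair implements `long << 1` on a 32-bit architecture can be sketched by splitting the value into two 32-bit halves. This is an illustration only; the function name is hypothetical and the code is not taken from the x86 code generator.

```cpp
#include <cassert>
#include <cstdint>

// Computes x << 1 as x + x on two 32-bit halves: ADD on the low half,
// then ADC (add with carry) on the high half.
inline uint64_t ShlBy1ViaAddAdc(uint64_t x) {
  const uint32_t lo = static_cast<uint32_t>(x);
  const uint32_t hi = static_cast<uint32_t>(x >> 32);
  const uint32_t new_lo = lo + lo;              // ADD, may carry out.
  const uint32_t carry = (new_lo < lo) ? 1u : 0u;
  const uint32_t new_hi = hi + hi + carry;      // ADC.
  return (static_cast<uint64_t>(new_hi) << 32) | new_lo;
}

inline void CheckShlBy1() {
  const uint64_t samples[] = {0ULL, 1ULL, 0x00000000FFFFFFFFULL,
                              0x8000000080000000ULL, 0xFFFFFFFFFFFFFFFFULL};
  for (uint64_t x : samples) {
    assert(ShlBy1ViaAddAdc(x) == (x << 1));
  }
}
```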
0d22184ec9e5b1e958c031ac92c7f053de3a13a2 27-Apr-2015 Nicolas Geoffray <ngeoffray@google.com> Revert "Revert "[optimizing] Replace FP divide by power of 2""

This reverts commit 067cae2c86627d2edcf01b918ee601774bc76aeb.

Change-Id: Iaaa8772500ea7d3dce6ae0829dc0dc3bbc9c14ca
067cae2c86627d2edcf01b918ee601774bc76aeb 26-Apr-2015 Nicolas Geoffray <ngeoffray@google.com> Revert "[optimizing] Replace FP divide by power of 2"

Fails compiling docs.

This reverts commit b0bd8915cb257cdaf46ba663c450a6543bca75af.

Change-Id: I47d32525c83a73118e2163eb58c68bbb7a28bb38
af88835231c2508509eb19aa2d21b92879351962 20-Apr-2015 Guillaume "Vermeille" Sanchez <guillaumesa@google.com> Remove unnecessary null checks in CheckCast and InstanceOf

Change-Id: I6fd81cabd8673be360f369e6318df0de8b18b634
538491967d1514a263e99d78379d743fcc896eef 20-Apr-2015 Serguei Katkov <serguei.i.katkov@intel.com> Mul simplification should expect zero operand

It is possible that zero constant can appear due to
simplification of other instructions, so we cannot expect
zero handling from constant optimizations.

Change-Id: I084126fd0c106ac2683c4f10a451960d9807f4f6
Signed-off-by: Serguei Katkov <serguei.i.katkov@intel.com>
943136fd031f4fe18f6e8a956c72171d7cf78da0 22-Apr-2015 Calin Juravle <calin@google.com> Remove warning when we do too many simplifications

This happens pretty often and pollutes the logs.

Change-Id: I074783d3cf3519a5186d2dd81c821d97071302e7
b0bd8915cb257cdaf46ba663c450a6543bca75af 16-Apr-2015 Mark Mendell <mark.p.mendell@intel.com> [optimizing] Replace FP divide by power of 2

Replace a floating point division by a power of two by a multiplication
by the reciprocal. This is guaranteed to have the exact same result as
it is exactly representable.

Add routines to allow generation of float and double constants after the
SSA Builder. I was unsure if float and double caches should be
implemented, and assumed that there is probably not a lot of
repetition of FP values. Please let me know.

Change-Id: I3a6c3847b49b4e747a7e7e8843ca32bb174b1584
Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>
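
The reasoning behind the division rewrite above is that x / 2^n and x * 2^-n denote the same real number, and both operations are correctly rounded, so the results agree bit for bit as long as the reciprocal itself is representable. A small check for a divisor of 8, assuming IEEE 754 single precision and default (non fast-math) compilation; the helper names are hypothetical:

```cpp
#include <cassert>
#include <cstring>

// Compares two floats by bit pattern, so that +0.0f and -0.0f differ.
inline bool SameBits(float a, float b) {
  return std::memcmp(&a, &b, sizeof(float)) == 0;
}

// Dividing by 8 and multiplying by 1/8 give bit-identical results.
inline void CheckDivideByPowerOfTwo() {
  const float samples[] = {0.0f, -0.0f, 1.0f, 3.0f, 0.1f, -7.5f, 1e-38f, 3.4e38f};
  for (float x : samples) {
    assert(SameBits(x / 8.0f, x * 0.125f));
  }
}
```

The same does not hold in general for divisors such as 3, whose reciprocal has to be rounded.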
339dfc209ad93482269eea1386e79973abc313cf 19-Apr-2015 Serguei Katkov <serguei.i.katkov@intel.com> Incorrect transformation of (sub,neg) to (sub) for fp

A pair (sub,neg) should not be transformed to (sub) for
floating point operations, otherwise we can lose the sign of
zero for instructions like this:
- (A - B) != B - A if B == A

Change-Id: I4d612612d4dc0a067fac5721ad206f74168bcd36
Signed-off-by: Serguei Katkov <serguei.i.katkov@intel.com>
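
The signed-zero problem named in the commit above can be reproduced in a few lines. With A == B, `-(A - B)` is -0.0 while `B - A` is +0.0; the two compare equal with `==`, but their sign bits differ. This sketch assumes IEEE 754 doubles and default (non fast-math) compilation, and the function names are hypothetical.

```cpp
#include <cassert>
#include <cmath>

inline double NegOfSub(double a, double b) { return -(a - b); }
inline double SwappedSub(double a, double b) { return b - a; }

inline void CheckSignedZero() {
  // Equal inputs: same magnitude (zero) but opposite sign bits.
  assert(std::signbit(NegOfSub(1.0, 1.0)));
  assert(!std::signbit(SwappedSub(1.0, 1.0)));
  // Different inputs: the two forms agree, including the sign.
  assert(NegOfSub(3.0, 1.0) == SwappedSub(3.0, 1.0));
  assert(std::signbit(NegOfSub(3.0, 1.0)) == std::signbit(SwappedSub(3.0, 1.0)));
}
```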
0d13fee6f4330cc9cb100c43135490a34c11d7a5 17-Apr-2015 David Brazdil <dbrazdil@google.com> ART: Simplify more bool operations

Now that we have the HBooleanNot instruction, the instruction
simplifier can optimize out more conditions comparing a boolean
against a constant, as well as sequences of Boolean negations.

Change-Id: I7f634f6428a3984dd97b27b3d6362491346f1ff6
188d4316a880ae24aed315aa52dc503c4fcb1ec7 09-Apr-2015 Alexandre Rames <alexandre.rames@arm.com> Opt compiler: Instruction simplification for HAdd, HNeg, HNot, HSub.

Under assumptions for the 'cost' of each IR (e.g. neither HAdd nor HSub
are faster than the other), transformations are only applied if they
(locally) cannot degrade the quality of the graph. The code could be
extended to look at uses of the IRs and detect more opportunities for
optimisations. The optimisations in this patch do not look at other
uses for their inputs.

Change-Id: Ib60dab007af30f43421ef5bb55db2ec32fb8fc0c
8d5b8b295930aaa43255c4f0b74ece3ee8b43a47 24-Mar-2015 David Brazdil <dbrazdil@google.com> ART: Force constants into the entry block

Optimizations such as GVN and BCE make the assumption that all
constants are located in the entry block of the CFG, but not all
passes adhere to this rule.

This patch makes constructors of constants private and only accessible
to friend classes - HGraph for int/long constants and SsaBuilder for
float/double - which ensure that they are placed correctly and not
duplicated.

Note that the ArenaAllocatorAdapter was modified to not increment
the ArenaAllocator's internal reference counter in order to allow
for use of ArenaSafeMap inside arena-allocated objects. Because
their destructor is not called, the counter does not get decremented.

Change-Id: I36a4fa29ae34fb905cdefd482ccbf386cff14166
b2fd7bca70b580921eebf7c45769c39d2dfd8a5a 11-Mar-2015 Alexandre Rames <alexandre.rames@arm.com> Opt compiler: Basic simplification for arithmetic operations.

The optimisations in this patch do not look further than the
inputs of each operation.

Change-Id: Iddd0ab6b360b9e7bb042db22086d51a31be85530
acf735c13998ad2a175f5a17e7bfce220073279d 12-Feb-2015 Calin Juravle <calin@google.com> Reference type propagation

- propagate reference types between instructions
- remove checked casts when possible
- add StackHandleScopeCollection to manage an arbitrary number of stack
handles (see comments)

Change-Id: I31200067c5e7375a5ea8e2f873c4374ebdb5ee60
0304e182adee81be32c744fd3c0d28add29974ff 31-Jan-2015 Mingyao Yang <mingyao@google.com> Improve bce so that more bounds checks can be eliminated.

For a pattern like "int[] array = new int[size+1]", we record this range
for size:
[-1, array.length-1]
This can eliminate more bounds checks.

Also simplify overflow/underflow handling and make it more solid.

Enhance instruction simplifier such that if array is a result of
NewArray with a constant size, replace array.length with that constant.

Plan to move all bce gtests to checker in another change.

Change-Id: Ibe7cc7940b68fb6465dc3e0ff3ebdb0fd6487aa9
10e244f9e7f6d96a95c910a2bedef5bd3810c637 26-Jan-2015 Calin Juravle <calin@google.com> optimizing: NullCheck elimination

How it works:
- run a type analysis to propagate null information on instructions
- during the last instruction simplifier pass, remove null checks for
which the input is known to be not null

The current type analysis is actually a nullability analysis, but it will
be reused in follow-up CLs to propagate type information, so it keeps
the more convenient name.

Change-Id: I54bb1d32ab24604b4d677d1ecdaf8d60a5ff5ce9
fa93b504b324784dd9a96e28e6e8f3f1b1ac456a 21-Jan-2015 Nicolas Geoffray <ngeoffray@google.com> Do not use HNot for creating !bool.

HNot folds to ~, not !.

Change-Id: I681f968449a2ade7110b2f316146ad16ba5da74c
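
The distinction in the commit above, between bit-wise not (~) and Boolean not (!), can be shown on a 0/1 Boolean value. This is a minimal sketch with hypothetical function names; it assumes Booleans are represented as the integers 0 and 1.

```cpp
#include <cassert>
#include <cstdint>

// What a bit-wise not folds to: flips every bit of the value.
int32_t BitwiseNot(int32_t v) { return ~v; }

// Boolean negation of a 0/1 value: flips only the lowest bit.
int32_t BooleanNot(int32_t v) {
  assert(v == 0 || v == 1);  // only meaningful for 0/1 inputs
  return v ^ 1;
}
```

With these definitions BitwiseNot(1) is -2 rather than 0, so a bit-wise not cannot stand in for the negation of a Boolean.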
01fcc9ee556f98d0163cc9b524e989760826926f 01-Dec-2014 Nicolas Geoffray <ngeoffray@google.com> Remove type conversion nodes converting to the same type.

When optimizing, we ensure these conversions do not reach the
code generators. When not optimizing, we cannot get such situations.

Change-Id: I717247c957667675dc261183019c88efa3a38452
5e6916cea259897baaca019c5c7a5d05746306ed 18-Nov-2014 Nicolas Geoffray <ngeoffray@google.com> Use HOptimization abstraction for running optimizations.

Move existing optimizations to it.

Change-Id: I3b43f9997faf4ed8875162e3a3abdf99375478dd
af07bc121121d7bd7e8329c55dfe24782207b561 12-Nov-2014 Nicolas Geoffray <ngeoffray@google.com> Minor object store optimizations.

- Avoid emitting write barrier when the value is null.
- Do not do a typecheck on an arraystore when storing something that
was loaded from the same array.

Change-Id: I902492928692e4553b5af0fc99cce3c2186c442a
1cc5f251df558b0e22cea5000626365eb644c727 22-Oct-2014 Roland Levillain <rpl@google.com> Implement int bit-wise not operation in the optimizing compiler.

- Add support for the not-int (integer one's complement
negate) instruction in the optimizing compiler.
- Extend the HNot control-flow graph node type and make it
inherit from HUnaryOperation.
- Generate ARM, x86 and x86-64 code for integer HNot nodes.
- Exercise these additions in the codegen_test gtest, as there
is no direct way to assess the support of not-int from a
Java source. Indeed, compiling a Java expression such as
`~a' using javac and then dx generates an xor-int/lit8 Dex
instruction instead of the expected not-int Dex instruction.
This is probably because the Java bytecode has an `ixor'
instruction, but there is no instruction directly
corresponding to a bit-wise not operation.

Change-Id: I223aed75c4dac5785e04d99da0d22e8d699aee2b
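
The equivalence behind the xor-int/lit8 encoding mentioned in the commit above can be checked with a small sketch. The function names are hypothetical, and the use of -1 as the xor literal is an assumption based on the usual encoding of `~a' as an exclusive-or with all bits set.

```cpp
#include <cassert>
#include <cstdint>

// Semantics of a dedicated bit-wise not on a 32-bit integer.
int32_t NotInt(int32_t a) { return ~a; }

// The same result expressed as an exclusive-or with all bits set (-1).
int32_t XorWithMinusOne(int32_t a) { return a ^ -1; }

// Illustrative check that both forms agree on a few sample values.
void CheckNotMatchesXor() {
  const int32_t samples[] = {0, 1, -1, 42, INT32_MAX, INT32_MIN};
  for (int32_t a : samples) {
    assert(NotInt(a) == XorWithMinusOne(a));
  }
}
```

Because both forms compute the same value, a test written in Java source never produces the dedicated not instruction, which is consistent with the commit exercising it from a gtest instead.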
01ef345767ea609417fc511e42007705c9667546 01-Oct-2014 Nicolas Geoffray <ngeoffray@google.com> Add trivial register hints to the register allocator.

- Add hints for phis, same as first input, and expected registers.
- Make the if instruction accept non-condition instructions.

Change-Id: I34fa68393f0d0c19c68128f017b7a05be556fbe5
3c04974a90b0e03f4b509010bff49f0b2a3da57f 24-Sep-2014 Nicolas Geoffray <ngeoffray@google.com> Optimize suspend checks in optimizing compiler.

- Remove the ones added during graph build (they were added
for the baseline code generator).
- Emit them at loop back edges after phi moves, so that the test
can directly jump to the loop header.
- Fix x86 and x86_64 suspend check by using cmpw instead of cmpl.

Change-Id: I6fad5795a55705d86c9e1cb85bf5d63dadfafa2a