History log of /art/compiler/dex/quick/arm/arm_lir.h
Revision Date Author Comments
8dea81ca9c0201ceaa88086b927a5838a06a3e69 06-Jun-2014 Vladimir Marko <vmarko@google.com> Rewrite use/def masks to support 128 bits.

Reduce LIR memory usage by holding masks by pointers in the
LIR rather than directly and using pre-defined const masks
for the common cases, allocating very few on the arena.

Change-Id: I0f6d27ef6867acd157184c8c74f9612cebfe6c16
04f4d8abe45d6e79eca983e057de76aea24b7df9 30-May-2014 Wei Jin <wejin@google.com> Add an optimization for removing redundant suspend tests in ART

This CL:
(1) eliminates redundant suspend checks (dominated by another check),

(2) removes the special treatment of the R4 register, which got
reset on every native call, possibly yielding long execution
sequences without any suspend checks, and

(3) fixes the absence of suspend checks in leaf methods.

(2) and (3) increase the frequency of suspend checks, which improves
the performance of GC and the accuracy of profile data. To
compensate for the increased number of checks, we implemented an
optimization that leverages dominance information to remove
redundant suspend checks on back edges. Based on the results of
running the Caffeine benchmark on Nexus 7, the patch performs
roughly 30% more useful suspend checks, spreading them much more
evenly along the execution trace, while incurring less than 1%
overhead. For flexibility consideration, this CL defines two flags
to control the enabling of optimizations. The original
implementation is the default.

Change-Id: I31e81a5b3c53030444dbe0434157274c9ab8640f
Signed-off-by: Wei Jin <wejin@google.com>
091cc408e9dc87e60fb64c61e186bea568fc3d3a 31-Mar-2014 buzbee <buzbee@google.com> Quick compiler: allocate doubles as doubles

Significant refactoring of register handling to unify usage across
all targets & 32/64 backends.

Reworked RegStorage encoding to allow expanded use of
x86 xmm registers; removed vector registers as a separate
register type. Reworked RegisterInfo to describe aliased
physical registers. Eliminated quite a bit of target-specific code
and generalized common code.

Use of RegStorage instead of int for registers now propagated down
to the NewLIRx() level. In future CLs, the NewLIRx() routines will
be replaced with versions that are explicit about what kind of
operand they expect (RegStorage, displacement, etc.). The goal
is to eventually use RegStorage all the way to the assembly phase.

TBD: MIPS needs verification.
TBD: Re-enable liveness tracking.

Change-Id: I388c006d5fa9b3ea72db4e37a19ce257f2a15964
d6ed642458c8820e1beca72f3d7b5f0be4a4b64b 10-Apr-2014 Dave Allison <dallison@google.com> Revert "Revert "Revert "Use trampolines for calls to helpers"""

This reverts commit f9487c039efb4112616d438593a2ab02792e0304.

Change-Id: Id48a4aae4ecce73db468587967968a3f7618b700
f9487c039efb4112616d438593a2ab02792e0304 09-Apr-2014 Dave Allison <dallison@google.com> Revert "Revert "Use trampolines for calls to helpers""

This reverts commit 081f73e888b3c246cf7635db37b7f1105cf1a2ff.

Change-Id: Ibd777f8ce73cf8ed6c4cb81d50bf6437ac28cb61

Conflicts:
compiler/dex/quick/mir_to_lir.h
081f73e888b3c246cf7635db37b7f1105cf1a2ff 07-Apr-2014 Dave Allison <dallison@google.com> Revert "Use trampolines for calls to helpers"

This reverts commit 754ddad084ccb610d0cf486f6131bdc69bae5bc6.

Change-Id: Icd979adee1d8d781b40a5e75daf3719444cb72e8
754ddad084ccb610d0cf486f6131bdc69bae5bc6 19-Feb-2014 Dave Allison <dallison@google.com> Use trampolines for calls to helpers

This is an ARM specific optimization to the compiler
that uses trampoline islands to make calls to runtime
helper functions. The intention is to reduce the size
of the generated code (by 2 bytes per call) without
affecting performance.

By default this is on when generating an OAT file. It is
off when compiling to memory.

To switch this off in dex2oat, use the command line option:
--no-helper-trampolines

Enhances disassembler to print the trampoline entry on the
BL instruction like this:

0xb6a850c0: f7ffff9e bl -196 (0xb6a85000) ; pTestSuspend

Bug: 12607709
Change-Id: I9202bdb7cf21252ad807bd48701f1f6ce8e3d0fe
c777e0de83cdffdb2e240d439c5595a4836553e8 03-Apr-2014 Vladimir Marko <vmarko@google.com> Disassemble Thumb2 shifts and more VFP instructions.

Disassemble Thumb2 instructions LSL, LSR, ASR, ROR and VFP
instructions VABS, VADD, VSUB, VMOV, VMUL, VNMUL, VDIV.

Clean up disassembly of VCMP, VCMPE, VNEG and VSQRT. These
could have been erroneously used for other insns (VSQRT for
VMOV was encountered) and one VSQRT branch was unreachable.

Remove duplicate VMOV opcodes from compiler.

Change-Id: I160a1e3e4b6eabb6a5101ce348ffd49c0573257d
2700f7e1edbcd2518f4978e4cd0e05a4149f91b6 07-Mar-2014 buzbee <buzbee@google.com> Continuing register cleanup

Ready for review.

Continue the process of using RegStorage rather than
ints to hold register value in the top layers of codegen.
Given the huge number of changes in this CL, I've attempted
to minimize the number of actual logic changes. With this
CL, the use of ints for registers has largely been eliminated
except in the lowest utility levels. "Wide" utility routines
have been updated to take a single RegStorage rather than
a pair of ints representing low and high registers.

Upcoming CLs will be smaller and more targeted. My expectations:
o Allocate float double registers as a single double rather than
a pair of float single registers.
o Refactor to push code which assumes long and double Dalvik
values are held in a pair of register to the target dependent
layer.
o Clean-up of the xxx_mir.h files to reduce the amount of #defines
for registers. May also do a register renumbering to bring all
of our targets' register naming more consistent. Possibly
introduce a target-independent float/non-float test at the
RegStorage level.

Change-Id: I646de7392bdec94595dd2c6f76e0f1c4331096ff
e19649a91702234f9aa9941d76da447a1e0dcc2a 27-Feb-2014 Zheng Xu <zheng.xu@arm.com> ARM: Remove duplicated instructions; add vcvt, vmla, vmls disassembler.

Remove kThumb2VcvtID in the assembler which was duplicated.
Add vcvt, vmla, vmls in the disassembler.

Change-Id: I14cc39375c922c9917274d8dcfcb515e888fdf26
00e1ec6581b5b7b46ca4c314c2854e9caa647dd2 28-Feb-2014 Bill Buzbee <buzbee@android.com> Revert "Revert "Rework Quick compiler's register handling""

This reverts commit 86ec520fc8b696ed6f164d7b756009ecd6e4aace.

Ready. Fixed the original type, plus some mechanical changes
for rebasing.

Still needs additional testing, but the problem with the original
CL appears to have been a typo in the definition of the x86
double return template RegLocation.

Change-Id: I828c721f91d9b2546ef008c6ea81f40756305891
dbb8c49d540edd2a39076093163c7218f03aa502 28-Feb-2014 Vladimir Marko <vmarko@google.com> Remove non-existent ARM insn kThumb2SubsRRI12.

For kOpSub/kOpAdd, prefer modified immediate encodings
because they set flags.

Change-Id: I41dcd2d43ba1e62120c99eaf9106edc61c41e157
86ec520fc8b696ed6f164d7b756009ecd6e4aace 26-Feb-2014 Bill Buzbee <buzbee@android.com> Revert "Rework Quick compiler's register handling"

This reverts commit 2c1ed456dcdb027d097825dd98dbe48c71599b6c.

Change-Id: If88d69ba88e0af0b407ff2240566d7e4545d8a99
2c1ed456dcdb027d097825dd98dbe48c71599b6c 20-Feb-2014 buzbee <buzbee@google.com> Rework Quick compiler's register handling

For historical reasons, the Quick backend found it convenient
to consider all 64-bit Dalvik values held in registers
to be contained in a pair of 32-bit registers. Though this
worked well for ARM (with double-precision registers also
treated as a pair of 32-bit single-precision registers) it doesn't
play well with other targets. And, it is somewhat problematic
for 64-bit architectures.

This is the first of several CLs that will rework the way the
Quick backend deals with physical registers. The goal is to
eliminate the "64-bit value backed with 32-bit register pair"
requirement from the target-indendent portions of the backend
and support 64-bit registers throughout.

The key RegLocation struct, which describes the location of
Dalvik virtual register & register pairs, previously contained
fields for high and low physical registers. The low_reg and
high_reg fields are being replaced with a new type: RegStorage.
There will be a single instance of RegStorage for each RegLocation.
Note that RegStorage does not increase the space used. It is
16 bits wide, the same as the sum of the 8-bit low_reg and
high_reg fields.

At a target-independent level, it will describe whether the physical
register storage associated with the Dalvik value is a single 32
bit, single 64 bit, pair of 32 bit or vector. The actual register
number encoding is left to the target-dependent code layer.

Because physical register handling is pervasive throughout the
backend, this restructuring necessarily involves large CLs with
lots of changes. I'm going to roll these out in stages, and
attempt to segregate the CLs with largely mechanical changes from
those which restructure or rework the logic.

This CL is of the mechanical change variety - it replaces low_reg
and high_reg from RegLocation and introduces RegStorage. It also
includes a lot of new code (such as many calls to GetReg())
that should go away in upcoming CLs.

The tentative plan for the subsequent CLs is:

o Rework standard register utilities such as AllocReg() and
FreeReg() to use RegStorage instead of ints.
o Rework the target-independent GenXXX, OpXXX, LoadValue,
StoreValue, etc. routines to take RegStorage rather than
int register encodings.
o Take advantage of the vector representation and eliminate
the current vector field in RegLocation.
o Replace the "wide" variants of codegen utilities that take
low_reg/high_reg pairs with versions that use RegStorage.
o Add 64-bit register target independent codegen utilities
where possible, and where not virtualize with 32-bit general
register and 64-bit general register variants in the target
dependent layer.
o Expand/rework the LIR def/use flags to allow for more registers
(currently, we lose out on 16 MIPS floating point regs as
well as ARM's D16..D31 for lack of space in the masks).
o [Possibly] move the float/non-float determination of a register
from the target-dependent encoding to RegStorage. In other
words, replace IsFpReg(register_encoding_bits).

At the end of the day, all code in the target independent layer
should be using RegStorage, as should much of the target dependent
layer. Ideally, we won't be using the physical register number
encoding extracted from RegStorage (i.e. GetReg()) until the
NewLIRx() layer.

Change-Id: Idc5c741478f720bdd1d7123b94e4288be5ce52cb
d61ba4ba6fcde666adb5d5c81b1c32f0534fb2c8 13-Jan-2014 Bill Buzbee <buzbee@android.com> Revert "Revert "Better support for x86 XMM registers""

This reverts commit 8ff67e3338952c70ccf3b609559bf8cc0f379cfd.

Fix applied to loc.fp usage.

Change-Id: I1eb3005392544fcf30c595923ed25bcee2dc4859
8ff67e3338952c70ccf3b609559bf8cc0f379cfd 11-Jan-2014 Bill Buzbee <buzbee@android.com> Revert "Better support for x86 XMM registers"

The invalid usage of loc.fp must be corrected before this change can be submitted.

This reverts commit 766a5e5940b469ab40e52770862c81cfec1d835b.

Change-Id: I1173a9bf829da89cccd9c2898f5e11164987a22b
766a5e5940b469ab40e52770862c81cfec1d835b 10-Jan-2014 Mark Mendell <mark.p.mendell@intel.com> Better support for x86 XMM registers

Currently, ART Quick mode assumes that a double FP register is composed
of two single consecutive FP registers. This is true for ARM and MIPS,
but not x86. This means that only half of the 8 XMM registers are
available for use by x86 doubles.

This patch breaks the assumption that a wide FP RegisterLocation must be
a paired set of FP registers. This is done by making some routines in
common code virtual and overriding them in the X86Mir2Lir class. For
these wide fp locations, the high register is set to the same value as
the low register, in order to minimize changes to common code. In a
couple of places, the common code checks for this case.

The changes are also supposed to allow the possibility of using the XMM
registers for vector operations,but that support is still WIP.

Change-Id: Ic6ef24ea764991c6f4d9fb88d483a619f5a468cb
Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>
ef6a776af2b4b8607d5f91add0ed0e8497100e31 20-Dec-2013 Ian Rogers <irogers@google.com> Inline codegen for long-to-double on ARM.

Change-Id: I4fc443c1b942a2231d680fc2c7a1530c86104584
3e5af82ae1a2cd69b7b045ac008ac3b394d17f41 21-Nov-2013 Vladimir Marko <vmarko@google.com> Intrinsic Unsafe.CompareAndSwapLong() for ARM.

(cherry picked from cb53fcd79b1a5ce608208ec454b5c19f64aaba37)

Change-Id: Iadd3cc8b4ed390670463b80f8efd579ce6ece226
332b7aa6220124dc638b9f7e59611c376473f128 18-Nov-2013 Vladimir Marko <vmarko@google.com> Improve Thumb2 instructions' use of constant operands.

Rename instructions using modified immediate to use suffix
I8M. Many were using I8 which may lead to confusion with
Thumb I8 instructions and some were using other suffixes.

Add and use CmnRI8M, increase constant range of AddRRI12 and
SubRRI12 and use BicRRI8M for applicable kOpAnd constants.
In particular, this should marginaly improve Math.abs(float)
and Math.abs(double) by converting x & 0x7fffffff to BIC.

Bug: 11579369

Change-Id: I0f17a9eb80752d2625730a60555152cdffed50ba
7020278bce98a0735dc6abcbd33bdf1ed2634f1d 23-Oct-2013 Dave Allison <dallison@google.com> Support hardware divide instruction

Bug: 11299025

Uses sdiv for division and a combo of sdiv, mul and sub for modulus.
Only does this on processors that are capable of the sdiv instruction, as determined
by the build system.

Also provides a command line arg --instruction-set-features= to allow cross compilation.
Makefile adds the --instruction-set-features= arg to build-time dex2oat runs and defaults
it to something obtained from the target architecture.

Provides a GetInstructionSetFeatures() function on CompilerDriver that can be
queried for various features. The only feature supported right now is hasDivideInstruction().

Also adds a few more instructions to the ARM disassembler

b/11535253 is an addition to this CL to be done later.

Change-Id: Ia8aaf801fd94bc71e476902749cf20f74eba9f68
a8b4caf7526b6b66a8ae0826bd52c39c66e3c714 24-Oct-2013 Vladimir Marko <vmarko@google.com> Add byte swap instructions for ARM and x86.

Change-Id: I03fdd61ffc811ae521141f532b3e04dda566c77d
b48819db07f9a0992a72173380c24249d7fc648a 15-Sep-2013 buzbee <buzbee@google.com> Compile-time tuning: assembly phase

Not as much compile-time gain from reworking the assembly phase as I'd
hoped, but still worthwhile. Should see ~2% improvement thanks to
the assembly rework. On the other hand, expect some huge gains for some
application thanks to better detection of large machine-generated init
methods. Thinkfree shows a 25% improvement.

The major assembly change was to establish thread the LIR nodes that
require fixup into a fixup chain. Only those are processed during the
final assembly pass(es). This doesn't help for methods which only
require a single pass to assemble, but does speed up the larger methods
which required multiple assembly passes.

Also replaced the block_map_ basic block lookup table (which contained
space for a BasicBlock* for each dex instruction unit) with a block id
map - cutting its space requirements by half in a 32-bit pointer
environment.

Changes:
o Reduce size of LIR struct by 12.5% (one of the big memory users)
o Repurpose the use/def portion of the LIR after optimization complete.
o Encode instruction bits to LIR
o Thread LIR nodes requiring pc fixup
o Change follow-on assembly passes to only consider fixup LIRs
o Switch on pc-rel fixup kind
o Fast-path for small methods - single pass assembly
o Avoid using cb[n]z for null checks (almost always exceed displacement)
o Improve detection of large initialization methods.
o Rework def/use flag setup.
o Remove a sequential search from FindBlock using lookup table of 16-bit
block ids rather than full block pointers.
o Eliminate pcRelFixup and use fixup kind instead.
o Add check for 16-bit overflow on dex offset.

Change-Id: I4c6615f83fed46f84629ad6cfe4237205a9562b4
7934ac288acfb2552bb0b06ec1f61e5820d924a4 26-Jul-2013 Brian Carlstrom <bdc@google.com> Fix cpplint whitespace/comments issues

Change-Id: Iae286862c85fb8fd8901eae1204cd6d271d69496
b1eba213afaf7fa6445de863ddc9680ab99762ea 18-Jul-2013 Brian Carlstrom <bdc@google.com> Fix cpplint whitespace/comma issues

Change-Id: I456fc8d80371d6dfc07e6d109b7f478c25602b65
fc0e3219edc9a5bf81b166e82fd5db2796eb6a0d 17-Jul-2013 Brian Carlstrom <bdc@google.com> Fix multiple inclusion guards to match new pathnames

Change-Id: Id7735be1d75bc315733b1773fba45c1deb8ace43
7940e44f4517de5e2634a7e07d58d0fb26160513 12-Jul-2013 Brian Carlstrom <bdc@google.com> Create separate Android.mk for main build targets

The runtime, compiler, dex2oat, and oatdump now are in seperate trees
to prevent dependency creep. They can now be individually built
without rebuilding the rest of the art projects. dalvikvm and jdwpspy
were already this way. Builds in the art directory should behave as
before, building everything including tests.

Change-Id: Ic6b1151e5ed0f823c3dd301afd2b13eb2d8feb81