History log of /art/compiler/optimizing/common_arm64.h
Revision Date Author Comments
d59f3b1b7f5c1ab9f0731ff9dc60611e8d9a6ede 29-Mar-2016 Vladimir Marko <vmarko@google.com> Use iterators "before" the use node in HUserRecord<>.

Create a new template class IntrusiveForwardList<> that
mimicks std::forward_list<> except that all allocations
are handled externally. This is essentially the same as
boost::intrusive::slist<> but since we're not using Boost
we have to reinvent the wheel.

Use the new container to replace the HUseList and use the
iterators to "before" use nodes in HUserRecord<> to avoid
the extra pointer to the previous node which was used
exclusively for removing nodes from the list. This reduces
the size of the HUseListNode by 25%, 32B to 24B in 64-bit
compiler, 16B to 12B in 32-bit compiler. This translates
directly to overall memory savings for the 64-bit compiler
but due to rounding up of the arena allocations to 8B, we
do not get any improvement in the 32-bit compiler.

Compiling the Nexus 5 boot image with the 64-bit dex2oat
on host this CL reduces the memory used for compiling the
most hungry method, BatteryStats.dumpLocked(), by ~3.3MiB:

Before:
MEM: used: 47829200, allocated: 48769120, lost: 939920
Number of arenas allocated: 345,
Number of allocations: 815492, avg size: 58
...
UseListNode 13744640
...
After:
MEM: used: 44393040, allocated: 45361248, lost: 968208
Number of arenas allocated: 319,
Number of allocations: 815492, avg size: 54
...
UseListNode 10308480
...

Note that while we do not ship the 64-bit dex2oat to the
device, the JIT compilation for 64-bit processes is using
the 64-bit libart-compiler.

Bug: 28173563
Bug: 27856014

(cherry picked from commit 46817b876ab00d6b78905b80ed12b4344c522b6c)

Change-Id: Ifb2d7b357064b003244e92c0d601d81a05e56a7b
3a448e4af9595392c1a2308f59c084842c955e3e 01-Apr-2016 Roland Levillain <rpl@google.com> Improve debugging in art/compiler/optimizing/common_arm64.h.

Change-Id: I44ff2cb64c1fd45390ed4a6517af2488fdbdaf41
22c4922c6b31e154a6814c4abe9015d9ba156911 18-Mar-2016 Roland Levillain <rpl@google.com> Ensure art::HRor support boolean, byte, short and char inputs.

Also extend tests covering the IntegerRotateLeft,
LongRotateLeft, IntegerRotateRight and LongRotateRight
intrinsics and their translation into an art::HRor
instruction.

Bug: 27682579
Change-Id: I89f6ea6a7315659a172482bf09875cfb7e7422a1
40a04bf64e5837fa48aceaffe970c9984c94084a 11-Dec-2015 Scott Wakeling <scott.wakeling@linaro.org> Replace rotate patterns and invokes with HRor IR.

Replace constant and register version bitfield rotate patterns, and
rotateRight/Left intrinsic invokes, with new HRor IR.

Where k is constant and r is a register, with the UShr and Shl on
either side of a |, +, or ^, the following patterns are replaced:

x >>> #k OP x << #(reg_size - k)
x >>> #k OP x << #-k

x >>> r OP x << (#reg_size - r)
x >>> (#reg_size - r) OP x << r

x >>> r OP x << -r
x >>> -r OP x << r

Implemented for ARM/ARM64 & X86/X86_64.

Tests changed to not be inlined to prevent optimization from folding
them out. Additional tests added for constant rotate amounts.

Change-Id: I5847d104c0a0348e5792be6c5072ce5090ca2c34
8626b741716390a0119ffeb88b5b9fcf08e13010 25-Nov-2015 Alexandre Rames <alexandre.rames@linaro.org> ARM64: Use the shifter operands.

This introduces architecture-specific instruction simplification.
On ARM64 we try to merge shifts and sign-extension operations into
arithmetic and logical instructions.

For example for the Java code

int res = a + (b << 5);

we would generate

lsl w3, w2, #5
add w0, w1, w3

and we now generate

add w0, w1, w2, lsl #5

Change-Id: Ic03bdff44a1c12e21ddff1b0513bd32a730742b7
e6dbf48d7a549e58a3d798bbbdc391e4d091b432 19-Oct-2015 Alexandre Rames <alexandre.rames@linaro.org> ARM64: Instruction simplification for array accesses.

HArrayGet and HArraySet with variable indexes generate two
instructions on arm64, like

add temp, obj, #data_offset
ldr out, [temp, index LSL #shift_amount]

When we have multiple accesses to the same array, the initial `add`
instruction is redundant.

This patch introduces the first instruction simplification in the
arm64-specific instruction simplification pass. It splits HArrayGet
and HArraySet using the new arm64-specific IR HIntermediateAddress.
After that we run GVN again to squash the multiple occurrences of
HIntermediateAddress.

Change-Id: I2e3d12fbb07fed07b2cb2f3f47f99f5a032f8312
b69fbfb5e43e404270e63b7a35dc5645b29b759c 16-Oct-2015 Alexandre Rames <alexandre.rames@linaro.org> ARM64: Better recognition of constants encodable as immediates.

When the right-hand side input is a constant, VIXL will automatically
switch between add and sub (or between similar pairs of instructions).

Change-Id: Icf05237b8653c409618f44e45049df87baf0f4c6
82000b0cf9bb32fc55cdb125bf37c884d44a8671 07-Jul-2015 Alexandre Rames <alexandre.rames@linaro.org> Improve code generation for ARM64 VisitArrayGet/Set.

We prefer the code sequence
add temp, obj, #offset
ldr out, [temp, index LSL #shift_amount]
to
add temp, obj, index LSL #shift_amount
ldr out, [temp, #offset]

Change-Id: I98f51a1b5a5ecd84c677d6dbd4c4bfc0f157f5e2
da40309f61f98c16d7d58e4c34cc0f5eef626f93 24-Apr-2015 Zheng Xu <zheng.xu@arm.com> Opt compiler: ARM64: Use ldp/stp on arm64 for slow paths.

It should be a bit faster than load/store single registers and reduce
the code size.

Change-Id: I67b8302adf6174b7bb728f7c2afd2c237e34ffde
760d8efd535764e54500bf65a944ed3f2a54c123 28-Mar-2015 Serban Constantinescu <serban.constantinescu@arm.com> Opt Compiler: ARM64 goodness

This patch:
* Switches on PreferAcquireRelease() (used to decide if load/store
volatile should use acquire release-semantics or explicit memory
barriers). Note that for ARMv8 CPUs we should always prefer this
(as proved by synthetic benchmarks on A53, A57 and Denver).

* Enables the use of constants for HBoundsCheck

Change-Id: I42524451772c05a1c74af73e97a59a95f49ba6d4
Signed-off-by: Serban Constantinescu <serban.constantinescu@arm.com>
82e52ce8364e3e1c644d0d3b3b4f61364bf7089a 26-Mar-2015 Serban Constantinescu <serban.constantinescu@arm.com> ARM64: Update to VIXL 1.9.

Update VIXL's interface to VIXL 1.9.

Change-Id: Iebae947539cbad65488b7195aaf01de284b71cbb
Signed-off-by: Serban Constantinescu <serban.constantinescu@arm.com>
2d35d9d7490ef3880ee366ccbf8f6e791f398c47 22-Feb-2015 Serban Constantinescu <serban.constantinescu@arm.com> Opt Compiler: Materialise constants that cannot be encoded

The VIXL MacroAssembler deals gracefully with any immediate. However
when the constant has multiple uses and cannot be encoded in the
instruction's immediate field we are better off using a register for
the constant and thus sharing the constant generation between multiple
uses.

Eg:
var += #Const; // #Const cannot be encoded.
var += #Const;

Before: After:
mov wip0, #Const mov w4, #Const
add w0, w0, wip0 add w0, w0, w4
mov wip0, #Const add w0, w0, w4
add w0, w0, wip0

Change-Id: Ied8577c879845777e52867aced16b2b45e06ac6c
Signed-off-by: Serban Constantinescu <serban.constantinescu@arm.com>
3ce57abd8fe50a0a772d14e033a9e7c34beff6cb 12-Mar-2015 Nicolas Geoffray <ngeoffray@google.com> Revert "Opt Compiler: Materialise constants that cannot be encoded"

Fails building the core image.

This reverts commit 758c2f65805564e0c51cccaacf8307e52a9e312b.

Change-Id: Ic3ebd8a08a3d17a513d820035b430f6de4125866
758c2f65805564e0c51cccaacf8307e52a9e312b 22-Feb-2015 Serban Constantinescu <serban.constantinescu@arm.com> Opt Compiler: Materialise constants that cannot be encoded

The VIXL MacroAssembler deals gracefully with any immediate. However
when the constant has multiple uses and cannot be encoded in the
instruction's immediate field we are better off using a register for
the constant and thus sharing the constant generation between multiple
uses.

Eg:
var += #Const; // #Const cannot be encoded.
var += #Const;

Before: After:
mov wip0, #Const mov w4, #Const
add w0, w0, wip0 add w0, w0, w4
mov wip0, #Const add w0, w0, w4
add w0, w0, wip0

Change-Id: I8d1f620872d1241cf582fb4f3b45b5091b790146
Signed-off-by: Serban Constantinescu <serban.constantinescu@arm.com>
de0eb6f59853f08d94fe42088d959b88f8448123 04-Mar-2015 Nicolas Geoffray <ngeoffray@google.com> Fix arm64 build.

Change-Id: Ib6babc1c6e8f2e78badc93cfcf89950e53f71bbb
542361f6e9ff05e3ca1f56c94c88bc3efeddd9c4 29-Jan-2015 Alexandre Rames <alexandre.rames@arm.com> Introduce primitive type helpers.

Change-Id: I81e909a185787f109c0afafa27b4335050a0dcdf
878d58cbaf6b17a9e3dcab790754527f3ebc69e5 16-Jan-2015 Andreas Gampe <agampe@google.com> ART: Arm64 optimizing compiler intrinsics

Implement most intrinsics for the optimizing compiler for Arm64.

Change-Id: Idb459be09f0524cb9aeab7a5c7fccb1c6b65a707