History log of /art/compiler/optimizing/intrinsics_x86_64.cc
Revision Date Author Comments
ce4b1329ca903d6b98734a27a46b54bb9cfd6d5b 31-Jul-2015 Pavel Vyssotski <pavel.n.vyssotski@intel.com> ART: x86_64 RoundDouble/Float intrinsics should initialize out value.

The x86_64 RoundDouble intrinsic should initialize the output register
for the "inPlusPointFive >= maxLong" case as expected, and the same
applies to the RoundFloat intrinsic. Also fixed the out register type
in CreateSSE41FPToIntLocations, which provoked a DCHECK failure.
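
As a hedged illustration of the intended semantics (plain C++ mirroring
Math.round, not the emitted assembly; NaN and negative-overflow handling
omitted), the point of the fix is that the overflow branch must define
the result too:

    #include <cmath>
    #include <cstdint>
    #include <limits>

    int64_t RoundDouble(double in) {
      double in_plus_point_five = in + 0.5;
      double max_long =
          static_cast<double>(std::numeric_limits<int64_t>::max());
      if (in_plus_point_five >= max_long) {
        // This is the path that previously left the out value undefined.
        return std::numeric_limits<int64_t>::max();
      }
      return static_cast<int64_t>(std::floor(in_plus_point_five));
    }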

Signed-off-by: Pavel Vyssotski <pavel.n.vyssotski@intel.com>

(cherry picked from commit 9ca257196b46fd7629bce0b338580e571e4113a8)

Bug: 22973442
Change-Id: If974e79d33311587d0b541a01ca8a4c9c11b9468
3d21bdf8894e780d349c481e5c9e29fe1556051c 22-Apr-2015 Mathieu Chartier <mathieuc@google.com> Move mirror::ArtMethod to native

Optimizing + quick tests are passing, devices boot.

TODO: Test and fix bugs in mips64.

Saves 16 bytes for most ArtMethods; 7.5MB reduction in system PSS.
Some of the savings come from the removal of the virtual-method and
direct-method object arrays.

Bug: 19264997

(cherry picked from commit e401d146407d61eeb99f8d6176b2ac13c4df1e33)

Change-Id: I622469a0cfa0e7082a2119f3d6a9491eb61e3f3d

Fix some ArtMethod-related bugs

Added root visiting for runtime methods; this is not currently
required, since the GcRoots in these methods are null.

Added the missing GetInterfaceMethodIfProxy in GetMethodLine; this
fixes --trace run-tests 005 and 044.

Fixed an optimizing compiler bug where we used a normal stack location
instead of a double stack location on ARM64; this fixes the
debuggable tests.

TODO: Fix JDWP tests.

Bug: 19264997

Change-Id: I7c55f69c61d1b45351fd0dc7185ffe5efad82bd3

ART: Fix casts for 64-bit pointers on 32-bit compiler.

Bug: 19264997
Change-Id: Ief45cdd4bae5a43fc8bfdfa7cf744e2c57529457

Fix JDWP tests after ArtMethod change

Fixes Throwable::GetStackDepth for exception event detection after the
internal stack trace representation change.

Adds a missing ArtMethod::GetInterfaceMethodIfProxy call in the case of
a proxy method.

Bug: 19264997
Change-Id: I363e293796848c3ec491c963813f62d868da44d2

Fix accidental IMT and root marking regression

The IMT was always using the conflict trampoline. Also included a fix
for a regression in GC time caused by extra roots; most of the
regression was from the IMT.

Fixed a bug in DumpGcPerformanceInfo where we would get a SIGABRT due
to a detached thread.

EvaluateAndApplyChanges:
From ~2500 -> ~1980
GC time: 8.2s -> 7.2s due to 1s less of MarkConcurrentRoots

Bug: 19264997
Change-Id: I4333e80a8268c2ed1284f87f25b9f113d4f2c7e0

Fix bogus image test assert

Previously we were comparing the size of the non-moving space to the
size of the image file.

Now we properly compare the size of the image space against the size
of the image file.

Bug: 19264997
Change-Id: I7359f1f73ae3df60c5147245935a24431c04808a

[MIPS64] Fix art_quick_invoke_stub argument offsets.

The ArtMethod reference's size got bigger, so we need to move the
other args and leave enough space for the ArtMethod* and the 'this'
pointer.

This fixes mips64 boot.

Bug: 19264997
Change-Id: I47198d5f39a4caab30b3b77479d5eedaad5006ab
21030dd59b1e350f6f43de39e3c4ce0886ff539c 07-May-2015 Andreas Gampe <agampe@google.com> ART: x86 indexOf intrinsics for the optimizing compiler

Add intrinsic implementations for indexOf to the optimizing compiler.
These are mostly ported from Quick. Add instruction support to the
assemblers where necessary.

Change-Id: Ife90ed0245532a5c436a26fe84715dc357f353c8
92e83bf8c0b2df8c977ffbc527989631d94b1819 07-May-2015 Mark Mendell <mark.p.mendell@intel.com> [optimizing] Tune some x86_64 moves

Generate moves of constant FP values by loading from the constant
table.

Use 'movl' to load a 64-bit register for positive 32-bit values, saving
a byte in the generated code by taking advantage of the implicit
zero extension.

Change a couple of xorq(reg, reg) instructions to xorl to (potentially)
save a byte of code per xor.
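
A minimal sketch of the size-based choice (the helper name and call
site are assumed; the style follows ART's x86_64 assembler): a 32-bit
write implicitly zero-extends on x86-64, so the shorter encoding is
safe for non-negative values that fit in 32 bits.

    // Hypothetical helper; X86_64Assembler, CpuRegister, and Immediate
    // are ART's x86_64 assembler types.
    void LoadImmediate64(X86_64Assembler* assembler,
                         CpuRegister reg, int64_t value) {
      if (value == 0) {
        assembler->xorl(reg, reg);  // clears all 64 bits, shortest form
      } else if (value > 0 && value <= INT64_C(0xFFFFFFFF)) {
        // movl writes the low 32 bits and zero-extends the rest, saving
        // the REX.W prefix byte that movq would need.
        assembler->movl(reg, Immediate(static_cast<int32_t>(value)));
      } else {
        assembler->movq(reg, Immediate(value));
      }
    }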

Change-Id: I5b2a807f0d3b29294fd4e7b8ef6d654491fa0b01
Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>
ec525fc30848189051b888da53ba051bc0878b78 28-Apr-2015 Roland Levillain <rpl@google.com> Factor MoveArguments methods in Optimizing's intrinsics handlers.

Also add a precondition, similar to the one present in code
generators, regarding the elimination of explicit clinit checks from
static invokes in non-baseline compilations.

Change-Id: I26f4dcb5d02824d7556f90b4b0c85b08b737fa53
2d27c8e338af7262dbd4aaa66127bb8fa1758b86 28-Apr-2015 Roland Levillain <rpl@google.com> Refactor InvokeDexCallingConventionVisitor in Optimizing.

Change-Id: I7ede0f59d5109644887bf5d39201d4e1bf043f34
3e3d73349a2de81d14e2279f60ffbd9ab3f3ac28 28-Apr-2015 Roland Levillain <rpl@google.com> Have HInvoke instructions know their number of actual arguments.

Add an art::HInvoke::GetNumberOfArguments routine so that
art::HInvoke and its subclasses can return the number of
actual arguments of the called method. Use it in code
generators and intrinsics handlers.

Consequently, no longer remove a clinit check as the last input
of a static invoke if it is still present during baseline
code generation, but ensure that static invokes have no such
check as their last input in optimized compilations.
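
A simplified sketch of the distinction this introduces (reduced from
the description above, not the full class):

    #include <cstdint>

    // HInvoke tracks its actual argument count separately from its
    // input count, so a trailing clinit check input on a static invoke
    // is not mistaken for an argument.
    class HInvoke /* : public HInstruction */ {
     public:
      uint32_t GetNumberOfArguments() const { return number_of_arguments_; }
     protected:
      uint32_t number_of_arguments_ = 0u;
    };

    // Typical caller pattern in code generators / intrinsics handlers:
    //   for (uint32_t i = 0; i < invoke->GetNumberOfArguments(); ++i) {
    //     Process(invoke->InputAt(i));
    //   }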

Change-Id: Iaf9e07d1057a3b15b83d9638538c02b70211e476
848f70a3d73833fc1bf3032a9ff6812e429661d9 15-Jan-2014 Jeff Hao <jeffhao@google.com> Replace String CharArray with internal uint16_t array.

Summary of high-level changes:
- Adds compiler inliner support to identify string init methods
- Adds compiler support (quick & optimizing) with a new invoke code
  path that calls the method off the thread pointer
- Adds thread entrypoints for all string init methods
- Adds a map to the verifier to log when the receiver of a string init
  has been copied to other registers. Used by the compiler and
  interpreter

Change-Id: I797b992a8feb566f9ad73060011ab6f51eb7ce01
4c0eb42259d790fddcd9978b66328dbb3ab65615 24-Apr-2015 Roland Levillain <rpl@google.com> Ensure inlined static calls perform clinit checks in Optimizing.

Calls to static methods have implicit class initialization
(clinit) checks of the method's declaring class in
Optimizing. However, when such a static call is inlined,
the implicit clinit check vanishes, possibly leading to an
incorrect behavior.

To ensure that inlining static methods does not change the
behavior of a program, add explicit class initialization
checks (art::HClinitCheck) as well as load class
instructions (art::HLoadClass) as last input of static
calls (art::HInvokeStaticOrDirect) in Optimizing's control
flow graphs, when the declaring class is reachable and not
known to be already initialized. Then when considering the
inlining of a static method call, proceed only if the method
has no implicit clinit check requirement.

The added explicit clinit checks are already removed by the
art::PrepareForRegisterAllocation visitor. This CL also
extends this visitor to turn explicit clinit checks from
static invokes into implicit ones after the inlining step,
by removing the added art::HLoadClass nodes mentioned
above.
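
A hedged sketch of the resulting inlining guard (names reconstructed
from the description above, not quoted from the ART sources):

    // Inline a static call only when its clinit requirement is explicit
    // (kept alive by the HClinitCheck input) or absent; an implicit
    // check would be silently dropped by inlining.
    bool CanInlineStaticCall(const HInvokeStaticOrDirect* invoke) {
      return invoke->GetClinitCheckRequirement() !=
             HInvokeStaticOrDirect::ClinitCheckRequirement::kImplicit;
    }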

Change-Id: I9ba452b8bd09ae1fdd9a3797ef556e3e7e19c651
641547a5f18ca2ea54469cceadcfef64f132e5e0 21-Apr-2015 Calin Juravle <calin@google.com> [optimizing] Fix a bug in moving the null check to the user.

When deciding to move a null check to its user, we did not verify
that the next instruction checks the same object.
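
A hedged sketch of the missing guard (assumed names):

    // Fold the null check into the following instruction only when that
    // instruction actually dereferences the same object.
    bool CanMoveNullCheckToUser(HNullCheck* null_check,
                                HInstruction* next) {
      return next->CanDoImplicitNullCheck() &&
             next->InputAt(0) == null_check->InputAt(0);
    }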

Change-Id: I2f4533a4bb18aa4b0b6d5e419f37dcccd60354d2
40741f394b2737e503f2c08be0ae9dd490fb106b 21-Apr-2015 Mark Mendell <mark.p.mendell@intel.com> [optimizing] Use more X86_64 addressing modes

Allow constant and memory addresses as operands to more X86_64
instructions.

Add memory formats to X86_64 instructions to match.

Fix a bug in cmpq(CpuRegister, const Address&).

Allow mov <addr>,immediate (instruction 0xC7) to be a valid faulting
instruction.

Change-Id: I5b8a409444426633920cd08e09f687a7afc88a39
Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>
9021825d1e73998b99c81e89c73796f6f2845471 15-Apr-2015 Nicolas Geoffray <ngeoffray@google.com> Type MoveOperands.

The ParallelMoveResolver implementation needs to know whether a move
is 64 bits wide or not, to handle swaps correctly.

Bug found, and test case courtesy of Serguei I. Katkov.

Change-Id: I9a0917a1cfed398c07e57ad6251aea8c9b0b8506
39dcf55a56da746e04f477f89e7b00ba1de03880 10-Apr-2015 Mark Mendell <mark.p.mendell@intel.com> [optimizing] Address x86_64 RIP patch comments

Nicolas had some comments after the patch
https://android-review.googlesource.com/#/c/144100 merged. Fix the
problems that he found.

Change-Id: I40e8a4273997860db7511dc8f1986281b72bead2
Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>
f55c3e0825cdfc4c5a27730031177d1a0198ec5a 27-Mar-2015 Mark Mendell <mark.p.mendell@intel.com> [optimizing] Add RIP support for x86_64

Support a constant area addressed using RIP on x86_64. Use it for FP
operations to avoid loading constants into a CPU register and then
moving them to an XMM register.
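
Illustratively (Address::RIP and movsd follow ART's x86_64 assembler
API; the fixup creation for the constant area entry is elided), one
RIP-relative load replaces the two-instruction round trip through a
CPU register:

    //   before:  movq $0x400921fb54442d18, %rax  // raw bits of a double
    //            movq %rax, %xmm0
    //   after:   movsd entry(%rip), %xmm0
    void LoadFPConstant(X86_64Assembler* assembler, XmmRegister dst,
                        AssemblerFixup* entry) {
      assembler->movsd(dst, Address::RIP(entry));
    }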

Change-Id: I58421759ef2a8475538876c20e696ec787015a72
Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>
58d25fd052e999a24734b0cf856a1563e3d1b2d0 03-Apr-2015 Mark Mendell <mark.p.mendell@intel.com> [optimizing] Implement more x86/x86_64 intrinsics

Implement the CAS, bit-reverse, and byte-reverse intrinsics that were
missing from the x86 and x86_64 implementations.

Add assembler tests and a compareAndSwapLong test.

Change-Id: Iabb2ff46036645df0a91f640288ef06090a64ee3
Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>
fb8d279bc011b31d0765dc7ca59afea324fd0d0c 01-Apr-2015 Mark Mendell <mark.p.mendell@intel.com> [optimizing] Implement x86/x86_64 math intrinsics

Implement floor/ceil/round/RoundFloat on x86 and x86_64.
Implement RoundDouble on x86_64.

Add support for roundss and roundsd on both architectures. Support them
in the disassembler as well.

Add the instruction set features for x86, as the 'round' instruction
is only available when SSE4.1 is supported.
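
The gate in the locations builder looks roughly like this (a sketch
reconstructed from the description above, not a verbatim quote):

    void IntrinsicLocationsBuilderX86_64::VisitMathCeil(HInvoke* invoke) {
      // roundsd is SSE4.1-only; if the feature is missing, no locations
      // are created and the invoke stays a plain runtime call.
      if (codegen_->GetInstructionSetFeatures().HasSSE4_1()) {
        CreateSSE41FPToFPLocations(arena_, invoke);
      }
    }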

Fix the tests to handle passing the instruction set features to x86
and x86_64.

Add assembler tests for roundsd and roundss to x86_64 assembler tests.

Change-Id: I9742d5930befb0bbc23f3d6c83ce0183ed9fe04f
Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>
3e90a96f403cbc353731e6687fe12a088f996cee 27-Mar-2015 Razvan A Lupusoru <razvan.a.lupusoru@intel.com> [optimizing] Do not inline intrinsics

The intrinsics generally have specialized code, and the code for them
may be faster than what can be achieved with inlining. Thus the
inliner should skip intrinsics.

At the same time, easy methods are not worth intrinsifying, e.g.
String length and isEmpty. Those can be handled by the inliner with no
problem and can actually lead to better code, since the call is not
kept around through all of the optimizations.
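
A minimal sketch of the resulting inliner early-out (the exact check
is assumed; Intrinsics::kNone is the optimizing compiler's "no
intrinsic" marker):

    // Keep the call: the intrinsics pass will expand it with
    // specialized code that is expected to beat inlined bytecode.
    if (invoke_instruction->GetIntrinsic() != Intrinsics::kNone) {
      return false;  // do not inline
    }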

Change-Id: Iab38e6c33f79efa54d845d4871cf26fa9b235ab0
Signed-off-by: Razvan A Lupusoru <razvan.a.lupusoru@intel.com>
512e04d1ea7fb33e3992715fe55be8a834d4a79c 27-Mar-2015 Nicolas Geoffray <ngeoffray@google.com> Fix typos spotted by Andreas.

Change-Id: I564b4bc5995d91f4c6c4e4f2427ed7c279cb8740
d75948ac93a4a317feaf136cae78823071234ba5 27-Mar-2015 Nicolas Geoffray <ngeoffray@google.com> Intrinsify String.compareTo.

Change-Id: Ia540df98755ac493fe61bd63f0bd94f6d97fbb57
a8ac9130b872c080299afacf5dcaab513d13ea87 13-Mar-2015 Nicolas Geoffray <ngeoffray@google.com> Refactor code in preparation of correct stack maps in slow path.

Move the logic for saving/restoring live registers in a slow path
into the SlowPathCode method. Also add a RecordPcInfo helper to
SlowPathCode that will act as the placeholder for saving correct
stack maps.

Change-Id: I25c2bc7a642ef854bbc8a3eb570e5c8c8d2d030c
878d58cbaf6b17a9e3dcab790754527f3ebc69e5 16-Jan-2015 Andreas Gampe <agampe@google.com> ART: Arm64 optimizing compiler intrinsics

Implement most intrinsics for the optimizing compiler for Arm64.

Change-Id: Idb459be09f0524cb9aeab7a5c7fccb1c6b65a707
42d1f5f006c8bdbcbf855c53036cd50f9c69753e 16-Jan-2015 Nicolas Geoffray <ngeoffray@google.com> Do not use register pair in a parallel move.

The ParallelMoveResolver does not work with pairs. Instead,
decompose the pair into two individual moves.

Change-Id: Ie9d3f0b078cef8dc20640c98b20bb20cc4971a7f
71fb52fee246b7d511f520febbd73dc7a9bbca79 30-Dec-2014 Andreas Gampe <agampe@google.com> ART: Optimizing compiler intrinsics

Add intrinsics infrastructure to the optimizing compiler.

Add almost all intrinsics supported by Quick to the x86-64 backend.
Further intrinsics require more assembler support.

Change-Id: I48de9b44c82886bb298d16e74e12a9506b8e8807