History log of /art/compiler/optimizing/intrinsics_arm64.cc
Revision Date Author Comments
b198b013ae7bd2da85e007414fc028cd51a13883 07-Jul-2016 Nicolas Geoffray <ngeoffray@google.com> Fix System.arraycopy when doing same array copying.

At compile time, if constant source < constant destination, and we don't
know if the arrays are the same, then we must emit code that checks
if the two arrays are the same. If so, we jump to the slow path.
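
The same-array hazard can be illustrated in plain Java. This is only a sketch, not ART code: `naiveForwardCopy` is a hypothetical helper standing in for a fast path that copies elements from low to high indices without checking for overlap.

```java
import java.util.Arrays;

// Illustrative only: shows why an overlapping copy within one array
// (source position < destination position) cannot use a plain forward loop.
final class OverlapCopy {
    // Hypothetical stand-in for a forward-copying fast path.
    static void naiveForwardCopy(int[] src, int srcPos, int[] dst, int dstPos, int len) {
        for (int i = 0; i < len; i++) {
            dst[dstPos + i] = src[srcPos + i];
        }
    }

    public static void main(String[] args) {
        int[] a = {1, 2, 3, 4, 5};
        int[] b = {1, 2, 3, 4, 5};
        System.arraycopy(a, 0, a, 1, 4);  // behaves as if copied through a temporary
        naiveForwardCopy(b, 0, b, 1, 4);  // overwrites elements before reading them
        System.out.println(Arrays.toString(a)); // [1, 1, 2, 3, 4]
        System.out.println(Arrays.toString(b)); // [1, 1, 1, 1, 1]
    }
}
```

When the two arrays are distinct objects, the forward loop is safe, which is why the check is only needed when the compiler cannot prove the arrays differ.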

test:610-arraycopy

bug:30030084

(cherry picked from commit 9f65db89353c46f6b189656f7f55a99054e5cfce)

Change-Id:Ida67993d472b0ba4056d9c21c68f6e5239421f7d
ebea3d2cce6aa34216502bb6b83d155d4c92e4ff 12-Apr-2016 Roland Levillain <rpl@google.com> Small changes in ARM and x86-64 SystemArrayCopy intrinsics.

Have these intrinsics share a more uniform style with the
ARM64 SystemArrayCopy intrinsic.

Also make some changes/improvements in:
- art::IntrinsicOptimizations
- art::arm64::GenSystemArrayCopyAddresses

Change-Id: Ieeb224795229580f8e5f7219c586d04786d8c705
fa3912edfac60a9f0a9b95a5862c7361b403fcc2 01-Apr-2016 Roland Levillain <rpl@google.com> Fix BitCount intrinsics assertions.

Bug: 27852035
Change-Id: Iba43039aadd9ba288b476d53cc2306a58356465f
c2ec9ad9a0008a95b29185e6a971244ef108c766 10-Mar-2016 donghui.bai <donghui.bai@linaro.org> Implement ARM64 support for SystemArrayCopy()

Change-Id: Iaaf6a9e8c1bea17fa9fde768869bcd4442ebc7d9
f969a209c30e3af636342d2fb7851d82a2529bf7 09-Mar-2016 Roland Levillain <rpl@google.com> Fix and enable java.lang.StringFactory intrinsics.

The following intrinsics were not considered by the
intrinsics recognizer:
- StringNewStringFromBytes
- StringNewStringFromChars
- StringNewStringFromString
This CL enables them and adds tests for them.

This CL also:
- Fixes the locations of the ARM64 & MIPS64
StringNewStringFromString intrinsics.
- Fixes the definitions of the FOUR_ARG_DOWNCALL macros on
ARM and x86, which are used to implement the
art_quick_alloc_string_from_bytes* runtime entry points.
- Fixes PC info (stack maps) recording in the
StringNewStringFromBytes, StringNewStringFromChars and
StringNewStringFromString ARM, ARM64 & MIPS64 intrinsics.

Bug: 27425743
Change-Id: I38c00d3f0b2e6b64f7d3fe9146743493bef9e45c
1193259cb37c9763a111825aa04718a409d07145 08-Mar-2016 Aart Bik <ajcbik@google.com> Implement the 1.8 unsafe memory fences directly in HIR.

Rationale:
More efficient since it exposes full semantics to
all operations on the graph and allows for proper
code generation for all architectures.

bug=26264765

Change-Id: Ic435886cf0645927a101a8502f0623fa573989ff
0e54c0160c84894696c05af6cad9eae3690f9496 04-Mar-2016 Aart Bik <ajcbik@google.com> Unsafe: Recognize intrinsics for 1.8 java.util.concurrent
With unit test.

Rationale:
Recognizing the 1.8 methods as intrinsics is the first step
towards providing efficient implementation on all architectures.
Where not implemented (everywhere for now), the methods fall back
to the JNI native or reference implementation.

NOTE: needs iam's CL first!

bug=26264765

Change-Id: Ife65e81689821a16cbcdd2bb2d35641c6de6aeb6
d3d0da5148063fef921613f9557520860496f2f8 29-Feb-2016 Scott Wakeling <scott.wakeling@linaro.org> ARM64: Implement SystemArrayCopyChar intrinsic.

Change-Id: I33f559139a38ddf20cacb8c997e38fa7663a4066
457413a6d7b657ca5e4567b7be2d9c300c6cbb5b 04-Mar-2016 Nicolas Geoffray <ngeoffray@google.com> Fix lint issue.

Change-Id: I549cc641510a7f941d85f3a5f38127bc6701a0a3
49924c970536bc570b84e3bf0d525fa9f56debde 03-Mar-2016 xueliang.zhong <xueliang.zhong@linaro.org> Integer.bitCount and Long.bitCount intrinsics for ARM64

Change-Id: If6180acc90239e52e5d33901b65e194d1ca7e248
2f9fcc999fab4ba6cd86c30e664325b47b9618e5 02-Mar-2016 Aart Bik <ajcbik@google.com> Simplified intrinsic macro mechanism.

Rationale:
Reduces boiler-plate code in all intrinsics code generators.
Also, the newly introduced "unreachable" macro provides a
static verifier that we do not have unreachable and thus
redundant code in the generators. In fact, this change
exposes that the MIPS32 and MIPS64 rotation intrinsics
(IntegerRotateRight, LongRotateRight, IntegerRotateLeft,
LongRotateLeft) are unreachable, since they are handled
as HIR constructs for all architectures. Thus the code
can be removed.

Change-Id: I0309799a0db580232137ded72bb8a7bbd45440a8
42ad288254e660ad091d03fad8c8fbad1d34ec89 29-Feb-2016 Roland Levillain <rpl@google.com> Fix the signature of the IndexOf entry point.

The IndexOf entry point was declared as taking four
arguments (void*, uint32_t, uint32_t, uint32_t) whereas all
actual implementations use three arguments (void*, uint32_t,
uint32_t). As that fourth argument is not documented, drop
it from the intrinsic declaration to have it match the
implementations.

Change-Id: I65d747033192025ccd2b9a5e8f8ed05b77a21941
cc3839c15555a2751e13980638fc40e4d3da633e 29-Feb-2016 Roland Levillain <rpl@google.com> Improve documentation about StringFactory.newStringFromChars.

Make it clear that the native method requires its third
argument to be non-null, and therefore that the intrinsics
do not need a null check for it.

Bug: 27378573
Change-Id: Id2f78ceb0f7674f1066bc3f216b738358ca25542
2a6aad9d388bd29bff04aeec3eb9429d436d1873 25-Feb-2016 Aart Bik <ajcbik@google.com> Implement fp to bits methods as intrinsics.

Rationale:
Better optimization, better performance.

Results on libcore benchmark:

Most gain is from moving the invariant call out of the loop
after we detect everything is a side-effect free intrinsic.
But generated code in general case is much cleaner too.

Before:
timeFloatToIntBits() in 181 ms.
timeFloatToRawIntBits() in 35 ms.
timeDoubleToLongBits() in 208 ms.
timeDoubleToRawLongBits() in 35 ms.

After:
timeFloatToIntBits() in 36 ms.
timeFloatToRawIntBits() in 35 ms.
timeDoubleToLongBits() in 35 ms.
timeDoubleToRawLongBits() in 34 ms.
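
The hoisting effect mentioned above can be sketched by hand in Java. This is an illustration under assumptions, not the benchmark itself: `sumInLoop` and `sumHoisted` are hypothetical names, and the second method shows roughly what the compiler may do once it knows the call has no side effects.

```java
// Illustrative only: hand-written equivalent of hoisting a loop-invariant,
// side-effect-free call out of a loop.
final class HoistSketch {
    static int sumInLoop(float f, int iterations) {
        int sum = 0;
        for (int i = 0; i < iterations; i++) {
            sum += Float.floatToIntBits(f);  // same result on every iteration
        }
        return sum;
    }

    static int sumHoisted(float f, int iterations) {
        int bits = Float.floatToIntBits(f);  // computed once, before the loop
        int sum = 0;
        for (int i = 0; i < iterations; i++) {
            sum += bits;
        }
        return sum;
    }

    public static void main(String[] args) {
        // Both variants compute the same value; only the amount of work differs.
        System.out.println(sumInLoop(1.5f, 1000) == sumHoisted(1.5f, 1000)); // true
    }
}
```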

bug=11548336

Change-Id: I6e001bd3708e800bd75a82b8950fb3a0fc01766e
25abd6c0c9f5a6abebcdeeb6f4373e85eedcfb6b 19-Jan-2016 Tim Zhang <tim.zhang@linaro.org> Optimizing: Add ARM and ARM64 intrinsics support for StringGetCharsNoCheck

This change refers to x86 implementation of StringGetCharsNoCheck and
arm implementation of SystemArrayCopy.

Change-Id: I1cb86854a2a8fa8736af7726b8efacd00d416f6f
9cd6d378bd573cdc14d049d32bdd22a97fa4d84a 09-Feb-2016 David Srbecky <dsrbecky@google.com> Associate slow paths with the instruction that they belong to.

Almost all slow paths already know the instruction they belong to,
this CL just moves the knowledge to the base class as well.

This is needed to be able to get the corresponding dex pc for a
slow path, which allows us to generate better native line numbers,
which in turn fixes some native debugging stepping issues.

Change-Id: I568dbe78a7cea6a43a4a71a014b3ad135782c270
75a38b24801bd4d27c95acef969930f626dd11da 17-Feb-2016 Aart Bik <ajcbik@google.com> Implement isNaN intrinsic through HIR equivalent.

Rationale:
Efficient implementation on all platforms.
Subject to better compiler optimizations.

Change-Id: Ie8876bf5943cbe1138491a25d32ee9fee554043c
a19616e3363276e7f2c471eb2839fb16f1d43f27 02-Feb-2016 Aart Bik <ajcbik@google.com> Implemented compare/signum intrinsics as HCompare
(with all code generation for all)

Rationale:
At HIR level, many more optimizations are possible, while ultimately
generated code can take advantage of full semantics.

Change-Id: I6e2ee0311784e5e336847346f7f3c4faef4fd17e
02fc24ea55aa71a352e64d6878ee3bace6050da1 20-Jan-2016 Anton Kirilov <anton.kirilov@linaro.org> ARM64: Add direct calls to math intrinsics

This change mirrors the work that has already been done for x86 and
x86_64. The following functions are affected: cos, sin, acos, asin,
atan, atan2, cbrt, cosh, exp, expm1, hypot, log, log10, nextafter,
sinh, tan, tanh.

Change-Id: I0f381bd2c1c4273b243c045107110fed551c6124
Signed-off-by: Anton Kirilov <anton.kirilov@linaro.org>
bb24bd0245aefa2f808e4656be01b990abecd80b 29-Jan-2016 Aart Bik <ajcbik@google.com> Implemented signum() on ARM64.

Change-Id: Ib805e62341f6c5e4fcc35c73d12e217fbae948ce
7b56502c52271c52ef0232ccd47e96badfe5dba6 28-Jan-2016 Aart Bik <ajcbik@google.com> Implement compare() on ARM64.

Change-Id: I6b5982aeb7401cd90fc37431a72bdd2b7f3e322b
4a6a67ca93289b232a620bdf8bf30ff8b7b0b428 27-Jan-2016 Serban Constantinescu <serban.constantinescu@linaro.org> Remove unused DMB code paths in the ARM64 Optimizing Compiler

Currently all ARM64 CPUs will be using the acquire-release code paths.
This patch removes the instruction set feature PreferAcquireRelease()
as well as all the unused DMB code paths.

Change-Id: I61c320d6d685f96c9e260f25eac3593907793830
Signed-off-by: Serban Constantinescu <serban.constantinescu@linaro.org>
2e50ecb95ddf645595491438cf35e79b705ee366 27-Jan-2016 Roland Levillain <rpl@google.com> Fix ARM & ARM64 UnsafeCASObject intrinsic with heap poisoning.

Ensure aliasing does not make us poison/unpoison the same
value twice.

Also extend testing of Unsafe.compareAndSwap* routines.

Bug: 26204023
Bug: 12687968
Change-Id: I29d7e5dd2a969845e054798f77837d20e3c18483
59c9454b92c2096a30a2bbdffb64edf33dbdd916 25-Jan-2016 Aart Bik <ajcbik@google.com> Recognize common utilities as intrinsics.

Rationale:
Recognizing these method calls as intrinsics already has
major advantages (compiler knows about no-side-effects/no-throw
properties). Next step is, of course, to implement these
with native instructions on each architecture.

Change-Id: I06fd12973238caec00d67b31b195d7f8807a538e
44015868a5ed9f6915d510ade42e84949b719e3a 22-Jan-2016 Roland Levillain <rpl@google.com> Revert "Revert "ARM64 Baker's read barrier fast path implementation.""

This reverts commit 28a2ff0bd6c30549f3f6465d8316f5707b1d072f.

Bug: 12687968
Change-Id: I6e25c70f303368629cdb1084f1d7039261cbb79a
28a2ff0bd6c30549f3f6465d8316f5707b1d072f 21-Jan-2016 Mathieu Chartier <mathieuc@google.com> Revert "ARM64 Baker's read barrier fast path implementation."

This reverts commit c8f1df9965ca7f97ba9e6289f8c7a717765a59a9.

This breaks master.

Change-Id: Ic07f602af8732e2835bd11f65e3b9e766d3349c7
3f67e692860d281858485d48a4f1f81b907f1444 15-Jan-2016 Aart Bik <ajcbik@google.com> Implemented BitCount as an intrinsic. With unit test.

Rationale:
Recognizing this important operation as an intrinsic has
various advantages:
(1) having the no-side-effects/no-throw allows for
much more GVN/LICM/BCE.
(2) Some architectures, like x86_64, provide direct
support for this operation.

Performance improvements on X86_64:
CheckersEvalBench (32-bit bitboard): 27,210KNS -> 36,798KNS = + 35%
ReversiEvalBench (64-bit bitboard): 52,562KNS -> 89,086KNS = + 69%

Change-Id: I65d549b0469b7909b12c6611cdc34a8640a5751f
c8f1df9965ca7f97ba9e6289f8c7a717765a59a9 20-Jan-2016 Roland Levillain <rpl@google.com> ARM64 Baker's read barrier fast path implementation.

Introduce an ARM64 fast path implementation in Optimizing
for Baker's read barriers (for both heap reference loads and
GC root loads). The marking phase of the read barrier is
performed by a slow path, invoking the runtime entry point
artReadBarrierMark.

Other read barrier algorithms continue to use the original
slow path based implementation, which has been renamed as
GenerateReadBarrierSlow/GenerateReadBarrierForRootSlow.

Bug: 12687968
Bug: 26601270
Change-Id: I60da15249b58a8ee1a065ed9be2c4e438ee17150
4bedb3845ac33c95cb779987abd4e76a88b19989 12-Jan-2016 Roland Levillain <rpl@google.com> Fix memory fences in the ARM64 UnsafeCas intrinsics.

Also add some comments for the ARM UnsafeCas intrinsics.

Change-Id: Ic6e4f2c37e468db4582ac8709496a80f3c1f9a6b
e6d0d8de85f79c8702ee722a04cd89ee7e89aeb7 28-Dec-2015 Andreas Gampe <agampe@google.com> ART: Disable Math.round intrinsics

The move to OpenJDK means that Android has caught up with the
definition change of Math.round. Disable intrinsics.

Bug: 26327751
Change-Id: I00dc6cfca12bd7c95e56a4ab76ffee707d3822dc
095b1df3d20e806ed7ad8c545b03866c1561d1f6 28-Dec-2015 Andreas Gampe <agampe@google.com> Revert "Make Math.round consistent on arm64."

This reverts commit 40041c9a38e3961d8675d117517719458a115520.

Needs to be generalized to all platforms.

Bug: 26327751
Change-Id: Iae8f1c8846d120d8e3e99b6eb87f3760bf793ec5
40041c9a38e3961d8675d117517719458a115520 27-Dec-2015 Nicolas Geoffray <ngeoffray@google.com> Make Math.round consistent on arm64.

OpenJDK seems to have a different rounding implementation than
libcore. Temporarily disable the intrinsic.

Test that fails:
Assert.assertEquals(StrictMath.round(0.49999999999999994d), 1l);
Assert.assertEquals(Math.round(0.49999999999999994d), 1l);
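
Why this particular input is sensitive to the rounding definition can be sketched as follows. The commit does not say which definition each library uses; `floorPlusHalf` is a hypothetical helper that only shows what the "add one half and take the floor" formulation yields for this value.

```java
// Illustrative only: the add-half-then-floor formulation of rounding,
// applied to the value used in the failing test.
final class RoundSketch {
    static long floorPlusHalf(double x) {
        return (long) Math.floor(x + 0.5d);
    }

    public static void main(String[] args) {
        double x = 0.49999999999999994d;       // largest double below 0.5
        System.out.println(x + 0.5d == 1.0d);  // true: the sum rounds up to exactly 1.0
        System.out.println(floorPlusHalf(x));  // 1
        // The result of Math.round depends on the library's definition of round.
        System.out.println(Math.round(x));
    }
}
```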

bug:26327751

Change-Id: Iad2fb847e4a553b8c1f5031f772c81e7e4db9f4c
391b866ce55b8e78b1f9a6b98321d837256e8d66 18-Dec-2015 Roland Levillain <rpl@google.com> Disable the UnsafeCASObject intrinsic with read barriers.

The current implementations of the UnsafeCASObject
intrinsics are missing a read barrier. Temporarily disable
them when read barriers are enabled.

Also re-enable the jsr166.LinkedTransferQueueTest tests that
were failing on the concurrent collector configuration, as
the UnsafeCASObject JNI implementation now correctly
implements the read barrier which was missing.

Bug: 25883050
Bug: 26205973
Change-Id: Iaf5d515532949662d0ac6702c9452a00aa0a23e6
40a04bf64e5837fa48aceaffe970c9984c94084a 11-Dec-2015 Scott Wakeling <scott.wakeling@linaro.org> Replace rotate patterns and invokes with HRor IR.

Replace constant and register version bitfield rotate patterns, and
rotateRight/Left intrinsic invokes, with new HRor IR.

Where k is constant and r is a register, with the UShr and Shl on
either side of a |, +, or ^, the following patterns are replaced:

x >>> #k OP x << #(reg_size - k)
x >>> #k OP x << #-k

x >>> r OP x << (#reg_size - r)
x >>> (#reg_size - r) OP x << r

x >>> r OP x << -r
x >>> -r OP x << r
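
A few of these patterns can be checked against `Integer.rotateRight` and `Integer.rotateLeft` in Java, where `int` shift distances are taken modulo 32. This is a sketch with hypothetical helper names, not the simplifier code. Note that `+` binds tighter than the shift operators in Java, so the shifts are parenthesized.

```java
// Illustrative only: rotate idioms written with shifts, compared with the
// library rotate methods for 32-bit values.
final class RotatePatterns {
    static int orNeg(int x, int k)    { return (x >>> k) | (x << -k); }
    static int addSub(int x, int k)   { return (x >>> k) + (x << (32 - k)); }
    static int xorNeg(int x, int k)   { return (x >>> k) ^ (x << -k); }
    static int leftForm(int x, int r) { return (x >>> -r) | (x << r); }

    public static void main(String[] args) {
        int[] samples = {0, 1, -1, 0x80000001, 0x12345678};
        for (int x : samples) {
            for (int k = 1; k < 32; k++) {
                int expected = Integer.rotateRight(x, k);
                if (orNeg(x, k) != expected || addSub(x, k) != expected
                        || xorNeg(x, k) != expected
                        || leftForm(x, k) != Integer.rotateLeft(x, k)) {
                    throw new AssertionError("mismatch for x=" + x + ", k=" + k);
                }
            }
        }
        System.out.println("all patterns match for k in 1..31");
    }
}
```

For distances 1 to 31 the two shifted halves occupy disjoint bits, so `|`, `+` and `^` agree. With a distance of 0 both halves equal `x`, and only the `|` form still returns `x`.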

Implemented for ARM/ARM64 & X86/X86_64.

Tests changed to not be inlined to prevent optimization from folding
them out. Additional tests added for constant rotate amounts.

Change-Id: I5847d104c0a0348e5792be6c5072ce5090ca2c34
a4f1220c1518074db18ca1044e9201492975750b 06-Aug-2015 Mark Mendell <mark.p.mendell@intel.com> Optimizing: Add direct calls to math intrinsics

Support the double forms of:
cos, sin, acos, asin, atan, atan2, cbrt, cosh, exp, expm1,
hypot, log, log10, nextAfter, sinh, tan, tanh

Add these entries to the vector addressed off the thread pointer. Call
the libc routines directly, which means that we have to implement the
native ABI, not the ART one. For x86_64, that includes saving XMM12-15
as the native ABI considers them caller-save, while the ART ABI
considers them callee-save. We save them by marking them as used by the
call to the math function. For x86, this is not an issue, as all the XMM
registers are caller-save.

Other architectures will call Java as before until they are ready to
implement the new intrinsics.

Bump the OAT version since we are incompatible with old boot.oat files.

Change-Id: Ic6332c3555c09393a17d1ad4daf62932488722fb
Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>
bf84a3d2aa29c0975b4ac0f6f983d56724b2cc57 04-Dec-2015 Roland Levillain <rpl@google.com> Annotate Boolean literals more uniformly in Optimizing's intrinsics.

Change-Id: Ida40309b4bc170a18b4e5db552b77f021a7b89df
22ccc3a93d32fa6991535eaebb17daf5abaf4ebf 24-Nov-2015 Roland Levillain <rpl@google.com> ARM64 read barrier support for concurrent GC in Optimizing.

This first implementation uses slow paths to instrument heap
reference loads and GC root loads for the concurrent copying
collector, respectively calling the artReadBarrierSlow and
artReadBarrierForRootSlow runtime entry points.

Notes:
- This implementation does not instrument HInvokeVirtual
nor HInvokeInterface instructions (for class reference
loads), as the corresponding read barriers are not strictly
required with the current concurrent copying collector.
- Intrinsics which may eventually call (on slow path) are
disabled when read barriers are enabled, as the current
slow path infrastructure does not support this case.
- When read barriers are enabled, the code generated for a
HArraySet instruction always goes into the array set slow
path for object arrays (delegating the operation to the
runtime), as we are lacking a mechanism to keep a
temporary register live across a runtime call (needed for
the instrumentation of type checking code, which requires
two successive read barriers).

Bug: 12687968
Change-Id: Icfb74f67bf23ae80e7723ee6a0c9ff34ba325d48
985ff70d3dbd954f75749fb7109a71fa0e9d8838 23-Oct-2015 Roland Levillain <rpl@google.com> Disable the ARM & ARM64 UnsafeCASObject intrinsic with heap poisoning.

The current heap poisoning instrumentation of this intrinsic
does not always work properly when heap poisoning is
enabled, hence this quick fix to let the build & test
infrastructure turn green again.

Bug: 12687968
Change-Id: I546a392a61e429cd13209261f806d0aed8d1cd86
ee3cf0731d0ef0787bc2947c8e3ca432b513956b 06-Oct-2015 Nicolas Geoffray <ngeoffray@google.com> Intrinsify System.arraycopy.

Currently on x64, will do the other architectures in
different changes.

Change-Id: I15fbbadb450dd21787809759a8b14b21b1e42624
9ee23f4273efed8d6378f6ad8e63c65e30a17139 23-Jul-2015 Scott Wakeling <scott.wakeling@linaro.org> ARM/ARM64: Intrinsics - numberOfTrailingZeros, rotateLeft, rotateRight

Change-Id: I2a07c279756ee804fb7c129416bdc4a3962e93ed
bfb5ba90cd6425ce49c2125a87e3b12222cc2601 01-Sep-2015 Andreas Gampe <agampe@google.com> Revert "Revert "Do a second check for testing intrinsic types.""

This reverts commit a14b9fef395b94fa9a32147862c198fe7c22e3d7.

When an intrinsic with invoke-type virtual is recognized, replace
the instruction with a new HInvokeStaticOrDirect.

Minimal update for dex-cache rework. Fix includes.

Change-Id: I1c8e735a2fa7cda4419f76ca0717125ef236d332
4ab02352db4051d590b793f34d166a0b5c633c4a 12-Aug-2015 Serban Constantinescu <serban.constantinescu@linaro.org> Use CodeGenerator::RecordPcInfo instead of SlowPathCode::RecordPcInfo.

Part of a clean-up and refactoring series. SlowPathCode::RecordPcInfo
is currently just a wrapper around CodeGenerator::RecordPcInfo.

Change-Id: Iffabef4ef37c365051130bf98a6aa6dc0a0fb254
Signed-off-by: Serban Constantinescu <serban.constantinescu@linaro.org>
ea34b40783064ef73fb555a9cfd04c35cff624d8 14-Aug-2015 Agi Csaki <agicsaki@google.com> Optimizing String.Equals as an intrinsic (ARM64)

The fifth implementation of String.Equals. I added an intrinsic
in ARM64 which is similar to the original java implementation
of String.equals: an instanceof check, null check, length check, and
reference equality check followed by a loop comparing strings four
characters at a time starting at the beginning of the string.
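
The sequence of checks described above can be sketched in Java. This is an approximation, not the ARM64 code: `equalsSketch` and `pack4` are hypothetical names, the order of the early checks is an assumption, and packing four 16-bit characters into a `long` merely stands in for a single 64-bit load and compare.

```java
// Illustrative only: early-exit checks followed by a loop that compares
// four 16-bit characters per step, then a tail loop for the remainder.
final class StringEqualsSketch {
    static boolean equalsSketch(String self, Object other) {
        if (other == null) return false;               // null check
        if (self == other) return true;                // reference equality
        if (!(other instanceof String)) return false;  // instanceof check
        String that = (String) other;
        int n = self.length();
        if (n != that.length()) return false;          // length check
        int i = 0;
        for (; i + 4 <= n; i += 4) {                   // four characters at a time
            if (pack4(self, i) != pack4(that, i)) return false;
        }
        for (; i < n; i++) {                           // remaining 0 to 3 characters
            if (self.charAt(i) != that.charAt(i)) return false;
        }
        return true;
    }

    // Packs four consecutive chars into one 64-bit value.
    static long pack4(String s, int i) {
        return ((long) s.charAt(i) << 48) | ((long) s.charAt(i + 1) << 32)
                | ((long) s.charAt(i + 2) << 16) | (long) s.charAt(i + 3);
    }

    public static void main(String[] args) {
        System.out.println(equalsSketch("abcdefghi", new String("abcdefghi"))); // true
        System.out.println(equalsSketch("abcdefghi", "abcdefghX"));             // false
        System.out.println(equalsSketch("abc", Integer.valueOf(3)));            // false
    }
}
```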

Interesting Benchmarking Values:

64 Bit Nexus 9:
Intrinsic Short (1-5 Character) Strings: 40 ns
Original Short (1-5 Character) Strings: 80 ns
Intrinsic Very Long (1000+ Character) Strings: 1556 ns
Original Very Long (1000+ Character) Strings: 4554 ns
Intrinsic Non-String Argument: 15 ns
Original Non-String Argument: 62 ns

Bug: 21481923
Change-Id: If37b399614c2250f52ac709a3b50c356419ca88a
7da072feb160079734331e994ea52760cb2a3243 13-Aug-2015 agicsaki <agicsaki@google.com> Structure for String.Equals intrinsic

Added structure for implementing String.Equals intrinsics. There is no
functional change at this point: the intrinsic is marked as unimplemented
for all instruction sets and compilers.

Bug: 21481923
Change-Id: Ic2a1e22a113ff6091581126f12e926478c011340
611d3395e9efc0ab8dbfa4a197fa022fbd8c7204 10-Jul-2015 Scott Wakeling <scott.wakeling@linaro.org> ARM/ARM64: Implement numberOfLeadingZeros intrinsic.

Change-Id: I4042fb7a0b75140475dcfca23e8f79d310f5333b
aabdf8ad2e8d3de953dff5c7591e7b3df4d4f60b 03-Aug-2015 Roland Levillain <rpl@google.com> Revert "Optimizing String.Equals as an intrinsic (x86)"

Reverted as it breaks the compilation of boot.{oat,art} on x86 (although this CL may not be the culprit, as the issue seems to come from Optimizing's register allocator).

This reverts commit 8ab7bd6c8b10ad58758c33a1dc9326212bd200e9.

Change-Id: If7c8b6258d1e690f4d2a06bcc82c92563ac6cdef
8ab7bd6c8b10ad58758c33a1dc9326212bd200e9 27-Jul-2015 agicsaki <agicsaki@google.com> Optimizing String.Equals as an intrinsic (x86)

The third implementation of String.Equals. I added an intrinsic
in x86 which is similar to the original java implementation of
String.equals: an instanceof check, null check, length check, and
reference equality check followed by a loop comparing strings
character by character.

Interesting Benchmarking Values:

Optimizing Compiler on Nexus Player
Intrinsic 15-30 Character Strings: 177 ns
Original 15-30 Character Strings: 275 ns
Intrinsic Null Argument: 59 ns
Original Null Argument: 137 ns
Intrinsic 100-1000 Character Strings: 1812 ns
Original 100-1000 Character Strings: 6334 ns

Bug: 21481923
Change-Id: Ia386e19b9dbfe0dac688b20ec93d8f90f67af47e
4d02711ea578dbb789abb30cbaf12f9926e13d81 01-Jul-2015 Roland Levillain <rpl@google.com> Implement heap poisoning in ART's Optimizing compiler.

- Instrument ARM, ARM64, x86 and x86-64 code generators.
- Note: To turn heap poisoning on in Optimizing, set the
environment variable `ART_HEAP_POISONING' to "true"
before compiling ART.

Bug: 12687968
Change-Id: Ib3120b38cf805a8a50207a314b9ccc90c8d93740
9931f319cf86c56c2855d800339a3410697633a6 19-Jun-2015 Alexandre Rames <alexandre.rames@linaro.org> Opt compiler: Add a description to slow paths.

Change-Id: I22160d90de3fe0ab3e6a2acc440bda8daa00e0f0
94015b939060f5041d408d48717f22443e55b6ad 04-Jun-2015 Nicolas Geoffray <ngeoffray@google.com> Revert "Revert "Use HCurrentMethod in HInvokeStaticOrDirect.""

Fix was to special case baseline for x86, which does not have enough
registers to allocate the current method.

This reverts commit c345f141f11faad177aa9635a78088d00cf66086.

Change-Id: I5997aa52f8d4df373ae5ff4d4150dac0c44c4c10
c345f141f11faad177aa9635a78088d00cf66086 04-Jun-2015 Nicolas Geoffray <ngeoffray@google.com> Revert "Use HCurrentMethod in HInvokeStaticOrDirect."

Fails on baseline/x86.

This reverts commit 38207af82afb6f99c687f64b15601ed20d82220a.

Change-Id: Ib71018367eb7c6046965494a7e996c22af3de403
38207af82afb6f99c687f64b15601ed20d82220a 01-Jun-2015 Nicolas Geoffray <ngeoffray@google.com> Use HCurrentMethod in HInvokeStaticOrDirect.

Change-Id: I0d15244b6b44c8b10079398c55da5071a3e3af66
3d21bdf8894e780d349c481e5c9e29fe1556051c 22-Apr-2015 Mathieu Chartier <mathieuc@google.com> Move mirror::ArtMethod to native

Optimizing + quick tests are passing, devices boot.

TODO: Test and fix bugs in mips64.

Saves 16 bytes per most ArtMethod, 7.5MB reduction in system PSS.
Some of the savings are from removal of virtual methods and direct
methods object arrays.

Bug: 19264997

(cherry picked from commit e401d146407d61eeb99f8d6176b2ac13c4df1e33)

Change-Id: I622469a0cfa0e7082a2119f3d6a9491eb61e3f3d

Fix some ArtMethod related bugs

Added root visiting for runtime methods, not currently required
since the GcRoots in these methods are null.

Added missing GetInterfaceMethodIfProxy in GetMethodLine, fixes
--trace run-tests 005, 044.

Fixed optimizing compiler bug where we used a normal stack location
instead of double on ARM64, this fixes the debuggable tests.

TODO: Fix JDWP tests.

Bug: 19264997

Change-Id: I7c55f69c61d1b45351fd0dc7185ffe5efad82bd3

ART: Fix casts for 64-bit pointers on 32-bit compiler.

Bug: 19264997
Change-Id: Ief45cdd4bae5a43fc8bfdfa7cf744e2c57529457

Fix JDWP tests after ArtMethod change

Fixes Throwable::GetStackDepth for exception event detection after
internal stack trace representation change.

Adds missing ArtMethod::GetInterfaceMethodIfProxy call in case of
proxy method.

Bug: 19264997
Change-Id: I363e293796848c3ec491c963813f62d868da44d2

Fix accidental IMT and root marking regression

Was always using the conflict trampoline. Also included fix for
regression in GC time caused by extra roots. Most of the regression
was IMT.

Fixed bug in DumpGcPerformanceInfo where we would get SIGABRT due to
detached thread.

EvaluateAndApplyChanges:
From ~2500 -> ~1980
GC time: 8.2s -> 7.2s due to 1s less of MarkConcurrentRoots

Bug: 19264997
Change-Id: I4333e80a8268c2ed1284f87f25b9f113d4f2c7e0

Fix bogus image test assert

Previously we were comparing the size of the non moving space to
size of the image file.

Now we properly compare the size of the image space against the size
of the image file.

Bug: 19264997
Change-Id: I7359f1f73ae3df60c5147245935a24431c04808a

[MIPS64] Fix art_quick_invoke_stub argument offsets.

ArtMethod reference's size got bigger, so we need to move other args
and leave enough space for ArtMethod* and 'this' pointer.

This fixes mips64 boot.

Bug: 19264997
Change-Id: I47198d5f39a4caab30b3b77479d5eedaad5006ab
e401d146407d61eeb99f8d6176b2ac13c4df1e33 22-Apr-2015 Mathieu Chartier <mathieuc@google.com> Move mirror::ArtMethod to native

Optimizing + quick tests are passing, devices boot.

TODO: Test and fix bugs in mips64.

Saves 16 bytes per most ArtMethod, 7.5MB reduction in system PSS.
Some of the savings are from removal of virtual methods and direct
methods object arrays.

Bug: 19264997
Change-Id: I622469a0cfa0e7082a2119f3d6a9491eb61e3f3d
07276db28d654594e0e86e9e467cad393f752e6e 18-May-2015 Nicolas Geoffray <ngeoffray@google.com> Don't do a null test in MarkGCCard if the value cannot be null.

Change-Id: I45687f6d3505178e2fc3689eac9cb6ab1b2c1e29
ce7d005c1ba0716423d44861d2d0f58f142ff06a 08-May-2015 Andreas Gampe <agampe@google.com> ART: arm indexOf intrinsics for the optimizing compiler

Add intrinsics implementations for indexOf in the optimizing
compiler. These are mostly ported from Quick.

Bug: 20889065

(cherry picked from commit ba6fdbcb764d5a8972f5ff2d7147e4d78226b347)

Change-Id: I18ee849d41187a381f99529669e6f97040aaacf6
ba6fdbcb764d5a8972f5ff2d7147e4d78226b347 08-May-2015 Andreas Gampe <agampe@google.com> ART: arm indexOf intrinsics for the optimizing compiler

Add intrinsics implementations for indexOf in the optimizing
compiler. These are mostly ported from Quick.

Bug: 20889065
Change-Id: I18ee849d41187a381f99529669e6f97040aaacf6
ec525fc30848189051b888da53ba051bc0878b78 28-Apr-2015 Roland Levillain <rpl@google.com> Factor MoveArguments methods in Optimizing's intrinsics handlers.

Also add a precondition similar to the one present in code
generators, regarding static invoke related explicit clinit
check elimination in non-baseline compilations.

Change-Id: I26f4dcb5d02824d7556f90b4b0c85b08b737fa53
2d27c8e338af7262dbd4aaa66127bb8fa1758b86 28-Apr-2015 Roland Levillain <rpl@google.com> Refactor InvokeDexCallingConventionVisitor in Optimizing.

Change-Id: I7ede0f59d5109644887bf5d39201d4e1bf043f34
3e3d73349a2de81d14e2279f60ffbd9ab3f3ac28 28-Apr-2015 Roland Levillain <rpl@google.com> Have HInvoke instructions know their number of actual arguments.

Add an art::HInvoke::GetNumberOfArguments routine so that
art::HInvoke and its subclasses can return the number of
actual arguments of the called method. Use it in code
generators and intrinsics handlers.

Consequently, no longer remove a clinit check as last input
of a static invoke if it is still present during baseline
code generation, but ensure that static invokes have no such
check as last input in optimized compilations.

Change-Id: Iaf9e07d1057a3b15b83d9638538c02b70211e476
848f70a3d73833fc1bf3032a9ff6812e429661d9 15-Jan-2014 Jeff Hao <jeffhao@google.com> Replace String CharArray with internal uint16_t array.

Summary of high level changes:
- Adds compiler inliner support to identify string init methods
- Adds compiler support (quick & optimizing) with new invoke code path
that calls method off the thread pointer
- Adds thread entrypoints for all string init methods
- Adds map to verifier to log when receiver of string init has been
copied to other registers. Used by compiler and interpreter

Change-Id: I797b992a8feb566f9ad73060011ab6f51eb7ce01
4c0eb42259d790fddcd9978b66328dbb3ab65615 24-Apr-2015 Roland Levillain <rpl@google.com> Ensure inlined static calls perform clinit checks in Optimizing.

Calls to static methods have implicit class initialization
(clinit) checks of the method's declaring class in
Optimizing. However, when such a static call is inlined,
the implicit clinit check vanishes, possibly leading to an
incorrect behavior.

To ensure that inlining static methods does not change the
behavior of a program, add explicit class initialization
checks (art::HClinitCheck) as well as load class
instructions (art::HLoadClass) as last input of static
calls (art::HInvokeStaticOrDirect) in Optimizing's control
flow graphs, when the declaring class is reachable and not
known to be already initialized. Then when considering the
inlining of a static method call, proceed only if the method
has no implicit clinit check requirement.
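
The language-level behavior that inlining must preserve can be sketched in Java. This is an illustration with hypothetical class names, not compiler code: calling a static method triggers initialization of its declaring class, and that side effect has to remain observable even if the call body is inlined.

```java
// Illustrative only: a static call initializes the declaring class first,
// so the static initializer's side effect must survive any inlining.
final class ClinitDemo {
    static final StringBuilder LOG = new StringBuilder();

    static final class Lazy {
        static { LOG.append("Lazy initialized;"); }  // runs once, on first use
        static int answer() { return 42; }
    }

    static int callAnswer() {
        return Lazy.answer();  // first call triggers Lazy's static initializer
    }

    public static void main(String[] args) {
        System.out.println("before: \"" + LOG + "\"");
        int value = callAnswer();
        System.out.println("after: \"" + LOG + "\" value=" + value);
    }
}
```

If the body of `answer()` were substituted at the call site without a class initialization check, the log entry could be missing, which is the kind of incorrect behavior the commit guards against.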

The added explicit clinit checks are already removed by the
art::PrepareForRegisterAllocation visitor. This CL also
extends this visitor to turn explicit clinit checks from
static invokes into implicit ones after the inlining step,
by removing the added art::HLoadClass nodes mentioned
hereinbefore.

Change-Id: I9ba452b8bd09ae1fdd9a3797ef556e3e7e19c651
641547a5f18ca2ea54469cceadcfef64f132e5e0 21-Apr-2015 Calin Juravle <calin@google.com> [optimizing] Fix a bug in moving the null check to the user.

When taking the decision to move a null check to the user we did not
verify if the next instruction checks the same object.

Change-Id: I2f4533a4bb18aa4b0b6d5e419f37dcccd60354d2
9021825d1e73998b99c81e89c73796f6f2845471 15-Apr-2015 Nicolas Geoffray <ngeoffray@google.com> Type MoveOperands.

The ParallelMoveResolver implementation needs to know if a move
is for 64bits or not, to handle swaps correctly.

Bug found, and test case courtesy of Serguei I. Katkov.

Change-Id: I9a0917a1cfed398c07e57ad6251aea8c9b0b8506
82e52ce8364e3e1c644d0d3b3b4f61364bf7089a 26-Mar-2015 Serban Constantinescu <serban.constantinescu@arm.com> ARM64: Update to VIXL 1.9.

Update VIXL's interface to VIXL 1.9.

Change-Id: Iebae947539cbad65488b7195aaf01de284b71cbb
Signed-off-by: Serban Constantinescu <serban.constantinescu@arm.com>
3e90a96f403cbc353731e6687fe12a088f996cee 27-Mar-2015 Razvan A Lupusoru <razvan.a.lupusoru@intel.com> [optimizing] Do not inline intrinsics

The intrinsics generally have specialized code and the code for them
may be faster than what can be achieved with inlining. Thus inliner
should skip intrinsics.

At the same time, easy methods are not worth intrinsifying: i.e. String
length and isEmpty. Those can be handled by inliner with no problem
and can actually lead to better code since call is not kept around
through all of the optimizations.

Change-Id: Iab38e6c33f79efa54d845d4871cf26fa9b235ab0
Signed-off-by: Razvan A Lupusoru <razvan.a.lupusoru@intel.com>
512e04d1ea7fb33e3992715fe55be8a834d4a79c 27-Mar-2015 Nicolas Geoffray <ngeoffray@google.com> Fix typos spotted by Andreas.

Change-Id: I564b4bc5995d91f4c6c4e4f2427ed7c279cb8740
d75948ac93a4a317feaf136cae78823071234ba5 27-Mar-2015 Nicolas Geoffray <ngeoffray@google.com> Intrinsify String.compareTo.

Change-Id: Ia540df98755ac493fe61bd63f0bd94f6d97fbb57
a8ac9130b872c080299afacf5dcaab513d13ea87 13-Mar-2015 Nicolas Geoffray <ngeoffray@google.com> Refactor code in preparation of correct stack maps in slow path.

Move the logic of saving/restoring live registers in slow path
in the SlowPathCode method. Also add a RecordPcInfo helper to
SlowPathCode, that will act as the placeholder of saving correct
stack maps.

Change-Id: I25c2bc7a642ef854bbc8a3eb570e5c8c8d2d030c
579885a26d761f5ba9550f2a1cd7f0f598c2e1e3 22-Feb-2015 Serban Constantinescu <serban.constantinescu@arm.com> Opt Compiler: ARM64: Enable explicit memory barriers over acquire/release

Implement remaining explicit memory barrier code paths and temporarily
enable the use of explicit memory barriers for testing.

This CL also enables the use of instruction set features in the ARM64
backend. kUseAcquireRelease has been replaced with PreferAcquireRelease(),
which for now is statically set to false (prefer explicit memory barriers).

Please note that we still prefer acquire-release for the ARM64 Optimizing
Compiler, but we would like to exercise the explicit memory barrier code
path too.

Change-Id: I84e047ecd43b6fbefc5b82cf532e3f5c59076458
Signed-off-by: Serban Constantinescu <serban.constantinescu@arm.com>
2bcf9bf784a0021630d8fe63d7230d46d6891780 29-Jan-2015 Andreas Gampe <agampe@google.com> ART: Arm intrinsics for Optimizing compiler

Add arm32 intrinsics to the optimizing compiler.

Change-Id: If4aeedbf560862074d8ee08ca4484b666d6b9bf0
82f344970ad65538d341706b02eeaa94508474b8 04-Feb-2015 Nicolas Geoffray <ngeoffray@google.com> Fix a bug in combination of intrinsics and kNoOutputOverlap.

In case we need to go in the slow path for an intrinsic call,
we can't have the output be the same as the input: the current
liveness analysis considers the input to be live at the point of the call.

Change-Id: I5cbdc7f50dd06b4fefcbd3c213274fa645bd3fa0
542361f6e9ff05e3ca1f56c94c88bc3efeddd9c4 29-Jan-2015 Alexandre Rames <alexandre.rames@arm.com> Introduce primitive type helpers.

Change-Id: I81e909a185787f109c0afafa27b4335050a0dcdf
878d58cbaf6b17a9e3dcab790754527f3ebc69e5 16-Jan-2015 Andreas Gampe <agampe@google.com> ART: Arm64 optimizing compiler intrinsics

Implement most intrinsics for the optimizing compiler for Arm64.

Change-Id: Idb459be09f0524cb9aeab7a5c7fccb1c6b65a707