History log of /art/runtime/arch/x86_64/quick_entrypoints_x86_64.S
Revision Date Author Comments
fc6898769ae1ef91ec3e41c0a273401213cb82cd 26-Apr-2016 Andreas Gampe <agampe@google.com> ART: Log all monitor operations to systrace

Add a VLOG option ("-verbose:systrace-locks") to log all monitor
operations to systrace. This requires non-fastpath thread
entrypoints, and ATRACE tags for locking and unlocking.

Do a bit of cleanup to the entrypoint initialization to share
common setup.

Bug: 28423466
Change-Id: Ie67e4aa946ec15f8fcf8cb7134c5d3cff0119ab3
59028d90d51a800bcea8be354d77d7be924da3a0 29-Mar-2016 Goran Jakovljevic <Goran.Jakovljevic@imgtec.com> MIPS: Improving art_quick_imt_conflict_trampoline

This is fixing stub_test for MIPS32 and MIPS64. This is follow up
change for Ie74d1c77cf73d451a1142bdc5e3683f9f84bb4e7.

Change-Id: I3c53ef690aff49d7cf9ad3de3aaed9a3d2e1c6b9
796d63050a18f263b93ea34951a61deaecab3422 13-Mar-2016 Nicolas Geoffray <ngeoffray@google.com> Add an ImtConflictTable to better resolve IMT conflicts.

- Attach an ImtConflictTable to the conflict runtime ArtMethod.
- Initially 0, a new one will be created at the first hit of
the conflict method.
- If the assembly code does not find a target method in the table,
we will create a new one again, copying the data from the previous
table and adding the new mapping.

Implemented for arm/arm64/x86/x64.

bug:27556801
bug:24769046

Change-Id: Ie74d1c77cf73d451a1142bdc5e3683f9f84bb4e7
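
The lookup-then-copy behavior described in this commit could be sketched roughly as follows. This is only an illustration in C++, not the runtime's code (the real fast path lives in assembly). `Method`, `ConflictTable`, `ConflictMethod`, and `Resolve` are hypothetical names, and the `slow_path_target` parameter stands in for whatever the runtime's own resolution would return on a miss.

```cpp
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-in for an ArtMethod; only pointer identity matters here.
struct Method {
  int id;
};

// Hypothetical conflict table: a list of (interface method, target) pairs
// that is never modified after construction.
class ConflictTable {
 public:
  // Builds a new table by copying `old` (if any) and appending one mapping.
  ConflictTable(const ConflictTable* old, const Method* iface, const Method* target) {
    if (old != nullptr) {
      entries_ = old->entries_;
    }
    entries_.emplace_back(iface, target);
  }

  // Linear search, mirroring what a small assembly loop could do.
  const Method* Lookup(const Method* iface) const {
    for (const auto& entry : entries_) {
      if (entry.first == iface) {
        return entry.second;
      }
    }
    return nullptr;
  }

  std::size_t Size() const { return entries_.size(); }

 private:
  std::vector<std::pair<const Method*, const Method*>> entries_;
};

// Hypothetical conflict method; it starts without a table ("initially 0").
struct ConflictMethod {
  std::unique_ptr<ConflictTable> table;
};

// Fast path: look the interface method up in the current table.
// Miss: create a new table holding the old entries plus the new mapping.
const Method* Resolve(ConflictMethod& cm, const Method* iface,
                      const Method* slow_path_target) {
  if (cm.table != nullptr) {
    if (const Method* hit = cm.table->Lookup(iface)) {
      return hit;
    }
  }
  cm.table = std::make_unique<ConflictTable>(cm.table.get(), iface, slow_path_target);
  return slow_path_target;
}
```

Replacing the whole table instead of appending in place is one way to keep a reader of the old table from ever seeing a half-written entry; whether the runtime relies on that property is not stated in the commit message.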
10d4c08c0ea9df0a85a11e1c77974df24078c0ec 24-Feb-2016 Hiroshi Yamauchi <yamauchi@google.com> Assembly region TLAB allocation fast path for arm.

This is for the CC collector.

Share the common fast path code with the tlab fast path code.

Speedup (on N5):
BinaryTrees: 2291 -> 902 ms (-60%)
MemAllocTest: 2137 -> 1845 ms (-14%)

Bug: 9986565
Bug: 12687968

Change-Id: Ica63094ec2f85eaa4fd04d202a20090399275d85
d9994f069dfeaa32ba929ca78816b5b83e2a4134 11-Feb-2016 Nicolas Geoffray <ngeoffray@google.com> Re-enable OSR.

Fixes two bugs:
- Dealing with proxy methods, which the compiler and code cache
do not handle.
- Dealing with phi types that may have been speculatively optimized
but do not hold once jumping to the compiled code.

Change-Id: I7dcd9976ef7b12128fff95d2b7ed3e69cc42e90a
b331febbab8e916680faba722cc84b66b84218a3 05-Feb-2016 Nicolas Geoffray <ngeoffray@google.com> Revert "Revert "Implement on-stack replacement for arm/arm64/x86/x86_64.""

This reverts commit bd89a5c556324062b7d841843b039392e84cfaf4.

Change-Id: I08d190431520baa7fcec8fbdb444519f25ac8d44
bd89a5c556324062b7d841843b039392e84cfaf4 05-Feb-2016 David Brazdil <dbrazdil@google.com> Revert "Implement on-stack replacement for arm/arm64/x86/x86_64."

DCHECK whether loop headers are covered fails.

This reverts commit 891bc286963892ed96134ca1adb7822737af9710.

Change-Id: I0f9a90630b014b16d20ba1dfba31ce63e6648021
891bc286963892ed96134ca1adb7822737af9710 29-Jan-2016 Nicolas Geoffray <ngeoffray@google.com> Implement on-stack replacement for arm/arm64/x86/x86_64.

High-level overview:
- osr_method_threshold is used to know when to compile a method
in osr mode (-> treat all loops as irreducible).
- branch instructions in the compiler query whether they can
jump to an osr method.
- An osr entry point is found through the stack maps: if a stack
map is duplicated in the CodeInfo, it is an osr entry point.

Change-Id: Ifb39338cd281e2c7eccce67f4e18d46428be71e4
7c1559a06041c9c299d5ab514d54b2102f204a84 15-Dec-2015 Roland Levillain <rpl@google.com> x86 Baker's read barrier fast path implementation.

Introduce an x86 fast path implementation in Optimizing for
Baker's read barriers (for both heap reference loads and GC
root loads). The marking phase of the read barrier is
performed by a slow path, invoking a new runtime entry point
(artReadBarrierMark).

Other read barrier algorithms continue to use the original
slow path based implementation, which has been renamed as
GenerateReadBarrierSlow/GenerateReadBarrierForRootSlow.

Bug: 12687968
Change-Id: Ie610c4befc19ff22378a8cba38b422dcacb54320
a7a4759946d9f11c88dc108b2b6a9518ce9c1e18 24-Nov-2015 Nicolas Geoffray <ngeoffray@google.com> Revert "lambda: Add support for invoke-interface for boxed innate lambdas"

955-lambda is flaky

Bug: 24618608
Bug: 25107649

This reverts commit 457e874459ae638145cab6d572e34d48480e39d2.

(cherry picked from commit 3a0909248e04b22c3981cbf617bc2502ed5b6380)

Change-Id: I24884344d21d7a4262e53e3f5dba57032687ddb7
3a0909248e04b22c3981cbf617bc2502ed5b6380 24-Nov-2015 Nicolas Geoffray <ngeoffray@google.com> Revert "lambda: Add support for invoke-interface for boxed innate lambdas"

955-lambda is flaky

Bug: 24618608
Bug: 25107649

This reverts commit 457e874459ae638145cab6d572e34d48480e39d2.

Change-Id: I24884344d21d7a4262e53e3f5dba57032687ddb7
457e874459ae638145cab6d572e34d48480e39d2 23-Oct-2015 Igor Murashkin <iam@google.com> lambda: Add support for invoke-interface for boxed innate lambdas

Lambda closures created with the 'create-lambda' instruction
(termed "innate lambdas") can be turned into an object with 'box-lambda'.

This CL enables support for those kinds of lambdas to work with
'invoke-interface' by generating a proxy class for the lambda.

Note: MIPS32/64 support not included.

Bug: 24618608
Bug: 25107649
Change-Id: Ic8f1bb66ebeaed4097e758a50becf1cff6ccaefb
0d5a281c671444bfa75d63caf1427a8c0e6e1177 13-Nov-2015 Roland Levillain <rpl@google.com> x86/x86-64 read barrier support for concurrent GC in Optimizing.

This first implementation uses slow paths to instrument heap
reference loads and GC root loads for the concurrent copying
collector, respectively calling the artReadBarrierSlow and
artReadBarrierForRootSlow (new) runtime entry points.

Notes:
- This implementation does not instrument HInvokeVirtual
nor HInvokeInterface instructions (for class reference
loads), as the corresponding read barriers are not strictly
required with the current concurrent copying collector.
- Intrinsics which may eventually call (on slow path) are
disabled when read barriers are enabled, as the current
slow path infrastructure does not support this case.
- When read barriers are enabled, the code generated for a
HArraySet instruction always goes into the array set slow
path for object arrays (delegating the operation to the
runtime), as we are lacking a mechanism to keep a
temporary register live across a runtime call (needed for
the instrumentation of type checking code, which requires
two successive read barriers).

Bug: 12687968
Change-Id: I14cd6107233c326389120336f93955b28ffbb329
86c3f4805a8656d52bff8fc3f48fc27daf1e6c67 29-Oct-2015 Hiroshi Yamauchi <yamauchi@google.com> Rosalloc fast path in assembly for x86_64.

Measurements (host, ms)
BinaryTrees: 324 -> 360 (+11%)
BinaryTrees with 64 MB alloc stack + 1 GB heap:
299 -> 275 (-8%)
MemAllocTest: 414 -> 368 (-11%)

Interestingly, BinaryTrees gets slower with default settings due to more
blocking (gc-for-alloc) collections. It seems that because allocations are
faster, the allocation stack size and the heap size become the
bottleneck (note both an allocation stack overflow as well as heap
exhaustion cause gc-for-alloc collections). With a larger allocation
stack and heap where no blocking collections are observed, BinaryTrees
gets faster.

Bug: 9986565
Change-Id: I642b9fecd0a583cc133998c2f3932de815c4a757
dc412b6f49a65774b7af654f65cbff619cb7d85a 15-Oct-2015 Hiroshi Yamauchi <yamauchi@google.com> Revert "Revert "Implement rosalloc fast path in assembly for 32 bit arm.""

With a heap poisoning fix.

This reverts commit cf91c7d973f3b2f491abc61d47c141782c96d46e.

Bug: 9986565
Change-Id: Ia72edbde65ef6119e1931a77cc4c595a0b80ce31
cf91c7d973f3b2f491abc61d47c141782c96d46e 15-Oct-2015 Nicolas Geoffray <ngeoffray@google.com> Revert "Implement rosalloc fast path in assembly for 32 bit arm."

Tentative. Looks like heap poisoning breaks with this change.

bug: 9986565

This reverts commit e6316940db61faead36f9642cce137d41fc8f606.

Change-Id: I5c63758221464fe319315f40ae79c656048faed0
e6316940db61faead36f9642cce137d41fc8f606 08-Oct-2015 Hiroshi Yamauchi <yamauchi@google.com> Implement rosalloc fast path in assembly for 32 bit arm.

Measurements (N5, ms)
BinaryTrees: 1702 -> 987 (-42%)
MemAllocTest: 2480 -> 2270 (-8%)

Bug: 9986565

Change-Id: I460af3626ad724078463d27cf74a94b7ff7468c5
4adeab196d160f70b4865fb8be048ddd2ac7ab82 03-Oct-2015 Hiroshi Yamauchi <yamauchi@google.com> Refactor the alloc entry point generation code.

Move the x86/x86-64 specific alloc entrypoint generation code to a macro
GENERATE_ALLOC_ENTRYPOINTS_FOR_EACH_ALLOCATOR in a common file to remove
duplication.

This will make it easier to selectively add more hand-written assembly
allocation fast path code.

Rename RETURN_IF_RESULT_IS_NON_ZERO to
RETURN_IF_RESULT_IS_NON_ZERO_OR_DELIVER in the x86/x86_64 files to match
the other architectures.

Bug: 9986565
Change-Id: I56f33b790f94db68891db8a2f42e9231d1770eef
e460d1df1f789c7c8bb97024a8efbd713ac175e9 29-Sep-2015 Calin Juravle <calin@google.com> Revert "Revert "Support unresolved fields in optimizing""

The CL also changes the calling convention for 64bit static field set
to use kArg2 instead of kArg1. This allows optimizing to keep
the assumptions:
- arm pairs are always of form (even_reg, odd_reg)
- ecx_edx is not used as a register on x86.

This reverts commit e6f49b47b6a4dc9c7684e4483757872cfc7ff1a1.

Change-Id: I93159917565824084abc96775f31be1a4249f2f3
639bdd13993644a267f177f8f5936496bda65e2b 03-Jun-2015 Andreas Gampe <agampe@google.com> ART: Single-frame deopt

Add deoptimization of a single frame. Works by removing the managed
code frame and jumping into the quick-to-interpreter bridge, and
the bridge understanding a stored ShadowFrame.

We need a separate fixup pass. For x86, we leave the return address
on the stack so we don't need to push it there.

Bug: 21611912
Change-Id: I06625685ced8b054244f8685ab50b238a705b9d2
e7049cab51769605798a380897de9f200be45cde 04-Sep-2015 Mathieu Chartier <mathieuc@google.com> Fix art_quick_alloc_object_tlab

Was not updated for the new dex cache arrays.

Change-Id: I47b14fbaa071428abf87b18a045009a1c04d2376
05792b98980741111b4d0a24d68cff2a8e070a3a 03-Aug-2015 Vladimir Marko <vmarko@google.com> ART: Move DexCache arrays to native.

This CL has a companion CL in libcore/
https://android-review.googlesource.com/162985

Change-Id: Icbc9e20ad1b565e603195b12714762bb446515fa
0747466fca310eedea5fc49e37d54f240a0b3c0f 25-Aug-2015 Sebastien Hertz <shertz@google.com> Revert "Revert "Fix deoptimization with pending exception""

This reverts commit 6e2d5747d00697a25251d25dd33b953e54709507.

Fixes the deoptimization path from compiled code (generated by the
Optimizing compiler) by adding wrapper artDeoptimizeFromCompiledCode.
This wrapper, called through the matching assembler stub
art_quick_deoptimize_from_compiled_code, pushes the deoptimization
context just before deoptimizing the stack.

Bug: 23371176
Bug: 19944235
Change-Id: Ia7082656998aebdd0157438f7e6504c120e10d3e
6306921722283d2b0f8aac01883ad83215d6e864 22-Aug-2015 Man Cao <manc@google.com> Add a missing read barrier in entrypoint stub

Also refactored some comments.

Change-Id: I5c50f487bf9d71f1be5f6c8814bf039993fc1267
1aee900d5a0b3a8d78725a7551356bda0d8554e1 15-Jul-2015 Man Cao <manc@google.com> Add read barrier support to the entrypoints.

Also remove "THIS_LOAD_REQUIRES_READ_BARRIER" since reading
an ArtMethod* no longer needs read barrier.

stub_test should also work with read barriers now.

Change-Id: I3fba18042de2f867a18dbdc38519986212bd9769
71cef231c39da9d911ad2a1976adcd7e664b5b17 23-Jul-2015 Man Cao <manc@google.com> Fix alignments in quick_entrypoints_x86_64.S

Places calling artIsAssignableFromCode() were not 16-byte aligned.

Change-Id: I86ff4f73a942ede09c0206e76614eb826dd896c2
4360be281a5f938e1762a1e3ec3b8a949ba05ff3 15-Jul-2015 Andreas Gampe <agampe@google.com> ART: Remove some of the Mac craziness

We rely on new-enough Clang/LLVM builds nowadays. The integrated
assembler supports named parameters. Throw away most of the
old duplication (effectively cutting support for older Clang
versions). The only required duplications are:

1) Clang's assembler does not support .altmacro. However, the Clang
preprocessor works differently from the GCC preprocessor
and does not give us trouble with inserted spaces.

2) On the Mac, symbols are prefixed with an underscore.

This should help to avoid breaking the Mac build when changing
the assembly code, and prepare for a complete Clang-only build
for x86 and x86-64. Switching to the integrated assembler for
the host build may be done in a follow-up CL.

Bug: 17443165
Change-Id: I1a077d4b612abc2b1b851c1bdabb5008a52e5aa6
55978b87877f4774af213b405c9492dc08549912 15-Jul-2015 Andreas Gampe <agampe@google.com> ART: Fix mac build

Fix Clang assembler bugs introduced in commit
3031c8da0c5009183f770b005c245f9bf2a4d01b.

Change-Id: I460c7c1b8f4380244925d248b90c88239540527a
3031c8da0c5009183f770b005c245f9bf2a4d01b 14-Jul-2015 Andreas Gampe <agampe@google.com> ART: Remove art_quick_invoke_interface_trampoline

The function has only been used by the IMT conflict resolution
trampoline for a while. Merge the two, which saves a branch.

Change-Id: I2f8c9204adf839ddc5459cc04e70d98f858110a1
e7d876adcfc1977800264ab7540aa488c1568b48 28-Jun-2015 Mathieu Chartier <mathieuc@google.com> ART: Fix CFI annotation for art_quick_aput_obj

Fix the CFI state after an early return.

Bug: 22014525

(cherry picked from commit 2738639bcd30b908d825725169b7497ed047debb)

Change-Id: I56b9ba8cf8c47d70a642f064e59c7e04a476dd2f
2738639bcd30b908d825725169b7497ed047debb 28-Jun-2015 Mathieu Chartier <mathieuc@google.com> ART: Fix CFI annotation for art_quick_aput_obj

Fix the CFI state after an early return.

Bug: 22014525
Change-Id: I56b9ba8cf8c47d70a642f064e59c7e04a476dd2f
6b90d42316e0370c789dddb5dda48d7403ea378f 27-Jun-2015 Andreas Gampe <agampe@google.com> ART: Fix CFI annotation in arm64, x86 and x86-64 assembly

To be able to unwind in the exception case, the state needs to be
reset to before the jump.

Bug: 22014525
Change-Id: Ic60400b5bf0efcb713c24df1728623d072f344ab
dfc5db6a6deea37c217e29e810e757945dae8586 18-Jun-2015 Mathieu Chartier <mathieuc@google.com> Fix moving GC bugs in proxy stub for X86/X86_64

Needed to restore the refs.

(cherry picked from commit 9346ff0cfad6344d0bf4eaa69362dbe1987ac054)

Bug: 21907554
Change-Id: I562906dff07dcaa78dfb39646ba9ab35a5f56c6c
9346ff0cfad6344d0bf4eaa69362dbe1987ac054 18-Jun-2015 Mathieu Chartier <mathieuc@google.com> Fix moving GC bugs in proxy stub for X86/X86_64

Needed to restore the refs.

Bug: 21907554
Change-Id: I562906dff07dcaa78dfb39646ba9ab35a5f56c6c
bfa5eb6e8d15ea73a36f8df449630f285a91e995 30-May-2015 Hiroshi Yamauchi <yamauchi@google.com> Add heap poisoning support to the entrypoints.

In preparation for full compiler/managed-code support.

Enable stub_test with heap poisoning.

Bug: 12687968
Change-Id: I79fc54ce6386c0a1eb9621759bb4cc23bc393a75
3d21bdf8894e780d349c481e5c9e29fe1556051c 22-Apr-2015 Mathieu Chartier <mathieuc@google.com> Move mirror::ArtMethod to native

Optimizing + quick tests are passing, devices boot.

TODO: Test and fix bugs in mips64.

Saves 16 bytes for most ArtMethods, 7.5MB reduction in system PSS.
Some of the savings are from removal of virtual methods and direct
methods object arrays.

Bug: 19264997

(cherry picked from commit e401d146407d61eeb99f8d6176b2ac13c4df1e33)

Change-Id: I622469a0cfa0e7082a2119f3d6a9491eb61e3f3d

Fix some ArtMethod related bugs

Added root visiting for runtime methods, not currently required
since the GcRoots in these methods are null.

Added missing GetInterfaceMethodIfProxy in GetMethodLine, fixes
--trace run-tests 005, 044.

Fixed optimizing compiler bug where we used a normal stack location
instead of double on ARM64, this fixes the debuggable tests.

TODO: Fix JDWP tests.

Bug: 19264997

Change-Id: I7c55f69c61d1b45351fd0dc7185ffe5efad82bd3

ART: Fix casts for 64-bit pointers on 32-bit compiler.

Bug: 19264997
Change-Id: Ief45cdd4bae5a43fc8bfdfa7cf744e2c57529457

Fix JDWP tests after ArtMethod change

Fixes Throwable::GetStackDepth for exception event detection after
internal stack trace representation change.

Adds missing ArtMethod::GetInterfaceMethodIfProxy call in case of
proxy method.

Bug: 19264997
Change-Id: I363e293796848c3ec491c963813f62d868da44d2

Fix accidental IMT and root marking regression

Was always using the conflict trampoline. Also included fix for
regression in GC time caused by extra roots. Most of the regression
was IMT.

Fixed bug in DumpGcPerformanceInfo where we would get SIGABRT due to
detached thread.

EvaluateAndApplyChanges:
From ~2500 -> ~1980
GC time: 8.2s -> 7.2s due to 1s less of MarkConcurrentRoots

Bug: 19264997
Change-Id: I4333e80a8268c2ed1284f87f25b9f113d4f2c7e0

Fix bogus image test assert

Previously we were comparing the size of the non moving space to
size of the image file.

Now we properly compare the size of the image space against the size
of the image file.

Bug: 19264997
Change-Id: I7359f1f73ae3df60c5147245935a24431c04808a

[MIPS64] Fix art_quick_invoke_stub argument offsets.

ArtMethod reference's size got bigger, so we need to move other args
and leave enough space for ArtMethod* and 'this' pointer.

This fixes mips64 boot.

Bug: 19264997
Change-Id: I47198d5f39a4caab30b3b77479d5eedaad5006ab
e401d146407d61eeb99f8d6176b2ac13c4df1e33 22-Apr-2015 Mathieu Chartier <mathieuc@google.com> Move mirror::ArtMethod to native

Optimizing + quick tests are passing, devices boot.

TODO: Test and fix bugs in mips64.

Saves 16 bytes for most ArtMethods, 7.5MB reduction in system PSS.
Some of the savings are from removal of virtual methods and direct
methods object arrays.

Bug: 19264997
Change-Id: I622469a0cfa0e7082a2119f3d6a9491eb61e3f3d
8ea18d0f066f63fa4e5d154f14327468bf288e2b 26-May-2015 Nicolas Geoffray <ngeoffray@google.com> Pass the dex method index directly to interface trampoline.

This avoids computing the dex pc and re-finding the method
index again. I have kept the code for kDebugBuild.

Change-Id: Icd60e0deade755e32b54021c0875b1af592b8c3e
7ea6a170486d81b127e69673cd1020c4db628c93 19-May-2015 Nicolas Geoffray <ngeoffray@google.com> Don't hardcode the location of the caller.

This is to avoid shooting ourselves in the foot when
dealing with inlined frames. Instead, use common methods
for fetching the caller and its dex pc.

Change-Id: I3467a7b50cf163022d332e80356f0aab747de252
848f70a3d73833fc1bf3032a9ff6812e429661d9 15-Jan-2014 Jeff Hao <jeffhao@google.com> Replace String CharArray with internal uint16_t array.

Summary of high level changes:
- Adds compiler inliner support to identify string init methods
- Adds compiler support (quick & optimizing) with new invoke code path
that calls method off the thread pointer
- Adds thread entrypoints for all string init methods
- Adds map to verifier to log when receiver of string init has been
copied to other registers. Used by the compiler and interpreter.

Change-Id: I797b992a8feb566f9ad73060011ab6f51eb7ce01
5ea536aa4a6414db01beaf6f8bd8cb9adc5cfc92 20-Apr-2015 Vladimir Marko <vmarko@google.com> Remove ArtMethod* parameter from dex cache entry points.

Load the ArtMethod* using an optimized stack walk instead.
This reduces the size of the generated code.

Three of the entry points are called only from a slow-path
and the fourth (InitializeTypeAndVerifyAccess) is rare and
already slow enough that the one or two extra loads
(depending on whether we already have the ArtMethod* in a
register) are insignificant. And as we're starting to use
PC-relative addressing of the dex cache arrays (already
done by Quick for the boot image), having the ArtMethod* in
a register becomes less likely anyway.

Change-Id: Ib19b9d204e355e13bf386662a8b158178bf8ad28
2cebb24bfc3247d3e9be138a3350106737455918 22-Apr-2015 Mathieu Chartier <mathieuc@google.com> Replace NULL with nullptr

Also fixed some lines that were too long, and a few other minor
details.

Change-Id: I6efba5fb6e03eb5d0a300fddb2a75bf8e2f175cb
bb87e0f1a52de656bc77cb01cb887e51a0e5198b 03-Apr-2015 Mathieu Chartier <mathieuc@google.com> Refactor and improve GC root handling

Changed GcRoot to use compressed references. Changed root visiting to
use virtual functions instead of function pointers. Changed root visiting
interface to be an array of roots instead of a single root at a time.
Added buffered root marking helper to avoid dispatch overhead.

Root marking seems a bit faster on EvaluateAndApplyChanges due to batch
marking. Pause times unaffected.

Mips64 is untested but might work, maybe.

Before:
MarkConcurrentRoots: Sum: 67.678ms 99% C.I. 2us-664.999us Avg: 161.138us Max: 671us

After:
MarkConcurrentRoots: Sum: 54.806ms 99% C.I. 2us-499.986us Avg: 136.333us Max: 602us

Bug: 19264997

Change-Id: I0a71ebb5928f205b9b3f7945b25db6489d5657ca
d43b3ac88cd46b8815890188c9c2b9a3f1564648 01-Apr-2015 Mingyao Yang <mingyao@google.com> Revert "Revert "Deoptimization-based bce.""

This reverts commit 0ba627337274ccfb8c9cb9bf23fffb1e1b9d1430.

Change-Id: I1ca10d15bbb49897a0cf541ab160431ec180a006
0ba627337274ccfb8c9cb9bf23fffb1e1b9d1430 24-Mar-2015 Andreas Gampe <agampe@google.com> Revert "Deoptimization-based bce."

This breaks compiling the core image:

Error after BCE: art::SSAChecker: Instruction 219 in block 1 does not dominate use 221 in block 1.

This reverts commit e295e6ec5beaea31be5d7d3c996cd8cfa2053129.

Change-Id: Ieeb48797d451836ed506ccb940872f1443942e4e
e295e6ec5beaea31be5d7d3c996cd8cfa2053129 07-Mar-2015 Mingyao Yang <mingyao@google.com> Deoptimization-based bce.

Introduce a mechanism by which code compiled with the optimizing
compiler can call a runtime method to deoptimize into the
interpreter. This can be used to establish invariants in the managed code.
If the invariant does not hold at runtime, we will deoptimize and continue
execution in the interpreter. This allows optimizing the managed code as
if the invariant had been proven at compile time. However, the exception
will still be thrown according to the semantics demanded by the spec.

The invariant and optimization included in this patch are based on the
length of an array. Given a set of array accesses with constant indices
{c1, ..., cn}, we can optimize away all bounds checks iff 0 <= min(ci) and
max(ci) < array-length. The first can be proven statically. The second can be
established with a deoptimization-based invariant. This replaces n bounds
checks with one invariant check (plus slow-path code).

Change-Id: I8c6e34b56c85d25b91074832d13dba1db0a81569
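
The single-invariant idea above can be illustrated with a small C++ sketch. It is not the compiler's implementation: `InvariantHolds`, `SumUnchecked`, `SumChecked`, and `SumAtConstantIndices` are hypothetical names, and the fully checked fallback merely plays the role of "continue in the interpreter".

```cpp
#include <algorithm>
#include <cstddef>
#include <stdexcept>
#include <vector>

// One check covering every constant index: 0 <= min(ci) and max(ci) < length.
// The first half depends only on the constants (statically provable); the
// second half needs the array length, which is only known at run time.
bool InvariantHolds(const std::vector<int>& constant_indices, int array_length) {
  if (constant_indices.empty()) {
    return true;
  }
  const auto bounds =
      std::minmax_element(constant_indices.begin(), constant_indices.end());
  return *bounds.first >= 0 && *bounds.second < array_length;
}

// "Optimized" path: no per-access bounds checks.
int SumUnchecked(const std::vector<int>& array, const std::vector<int>& indices) {
  int sum = 0;
  for (int i : indices) {
    sum += array[static_cast<std::size_t>(i)];
  }
  return sum;
}

// Fallback path standing in for the interpreter: every access is checked and
// the first bad access throws, so the exception semantics are preserved.
int SumChecked(const std::vector<int>& array, const std::vector<int>& indices) {
  int sum = 0;
  for (int i : indices) {
    sum += array.at(static_cast<std::size_t>(i));
  }
  return sum;
}

// n bounds checks are replaced by one invariant check plus a slow path.
int SumAtConstantIndices(const std::vector<int>& array,
                         const std::vector<int>& indices) {
  if (InvariantHolds(indices, static_cast<int>(array.size()))) {
    return SumUnchecked(array, indices);
  }
  return SumChecked(array, indices);
}
```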
3d900a7400701be1fb6a1e7fa192bab8aeec5467 21-Mar-2015 Hiroshi Yamauchi <yamauchi@google.com> Fix the mac build.

Use SYMBOL around a function name in the assembly code.

Change-Id: I624361ff15a00288c834bd90d1b7783138802ea7
e01a520fe0010f8abd344b5ed7120787d7ed1d71 19-Mar-2015 Hiroshi Yamauchi <yamauchi@google.com> Assembly TLAB allocation fast path for x86_64.

TODO: resolved/initialized cases, other architectures.

Bug: 9986565
Change-Id: If6df3449a3b2f5074d11babdda0fd2791fd54946
20e7d600cdcba4b1ab2f4e01e14903d641fbc073 12-Mar-2015 Sebastien Hertz <shertz@google.com> Fix art_quick_instrumentation_exit stub for x86_64

Restores callee-saved registers.

Bug: 19708384
Change-Id: I1cb47b1cc616af613816c4ee041bdfc975bf9f20
e15ea086439b41a805d164d2beb07b4ba96aaa97 10-Feb-2015 Hiroshi Yamauchi <yamauchi@google.com> Reserve bits in the lock word for read barriers.

This prepares for the CC collector to use the standard object header
model by storing the read barrier state in the lock word.

Bug: 19355854
Bug: 12687968
Change-Id: Ia7585662dd2cebf0479a3e74f734afe5059fb70f
126d65952a03b3e44d5021208673c01920a982a4 03-Mar-2015 Nicolas Geoffray <ngeoffray@google.com> Fix generic JNI stubs to not discard the Java native frame.

Change-Id: Ic856b442fdde5ce91673fc5856eb0dfc84c75d28
2cd334ae2d4287216523882f0d298cf3901b7ab1 09-Jan-2015 Hiroshi Yamauchi <yamauchi@google.com> More of the concurrent copying collector.

Bug: 12687968
Change-Id: I62f70274d47df6d6cab714df95c518b750ce3105
24f2dfae084b2382c053f5d688fd6bb26cb8a328 15-Jan-2015 Mark Mendell <mark.p.mendell@intel.com> [optimizing compiler] Implement inline x86 FP '%'

Replace the calls to fmod/fmodf by inline code as is done in the Quick
compiler.

Remove the quick fmod/fmodf runtime entries, as they are no longer in
use.

64 bit code generator Move() routine needed to be enhanced to handle
constants, as Location::Any() allows them to be generated.

Change-Id: I6b6a42f6faeed4b0b3c940453e487daf5b25d184
Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>
4808846b2a8647a448aaa05d561a4f60b190196b 12-Dec-2014 Nicolas Geoffray <ngeoffray@google.com> Save all registers in native to Java stubs.

This will make things more friendly when experimenting with the
number of callee saves in optimizing.

Change-Id: Iefd9a2da329a420eb69fc2fa9e91c06bbda30cdb
d2ec87d84057174d4884ee16f652cbcfd31362e9 08-Dec-2014 Calin Juravle <calin@google.com> [optimizing compiler] Add REM_FLOAT and REM_DOUBLE

- for arm, x86, x86_64 backends
- reinstated fmod quick entry points for x86. This is a partial revert
of bd3682eada753de52975ae2b4a712bd87dc139a6 which added inline assembly
for floating point rem on x86. Note that Quick still uses the inline
version.
- fix rem tests for longs

Change-Id: I73be19a9f2f2bcf3f718d9ca636e67bdd72b5440
2d7210188805292e463be4bcf7a133b654d7e0ea 10-Nov-2014 Mathieu Chartier <mathieuc@google.com> Change 64 bit ArtMethod fields to be pointer sized

Changed the 64 bit entrypoint and gc map fields in ArtMethod to be
pointer sized. This saves a large amount of memory on 32 bit systems.
Reduces ArtMethod size by 16 bytes on 32 bit.

Total number of ArtMethod on low memory mako: 169957
Image size: 49203 methods -> 787248 image size reduction.
Zygote space size: 1070 methods -> 17120 size reduction.
App methods: ~120k -> 2 MB savings.

Savings per app on low memory mako: 125K+ per app
(fewer active apps -> more image methods per app).

Savings depend on how often the shared methods are on dirty pages vs
shared.

TODO in another CL, delete gc map field from ArtMethod since we
should be able to get it from the Oat method header.

Bug: 17643507

Change-Id: Ie9508f05907a9f693882d4d32a564460bf273ee8

(cherry picked from commit e832e64a7e82d7f72aedbd7d798fb929d458ee8f)
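
The size reductions quoted above follow from multiplying each method count by the 16 bytes saved per ArtMethod on 32 bit. A minimal C++ check of that arithmetic, with `SavedBytes` as a hypothetical helper name:

```cpp
#include <cstdint>

// Bytes saved per ArtMethod on 32 bit, as stated in the commit message.
constexpr std::uint64_t kBytesSavedPerMethod = 16;

// Total bytes saved for a given number of methods.
constexpr std::uint64_t SavedBytes(std::uint64_t method_count) {
  return method_count * kBytesSavedPerMethod;
}

// Figures from the commit message: image and zygote space reductions.
static_assert(SavedBytes(49203) == 787248, "image size reduction");
static_assert(SavedBytes(1070) == 17120, "zygote space size reduction");
```

For the app methods, roughly 120k methods times 16 bytes gives about 1.9 million bytes, which matches the "2 MB savings" figure after rounding.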
e832e64a7e82d7f72aedbd7d798fb929d458ee8f 10-Nov-2014 Mathieu Chartier <mathieuc@google.com> Change 64 bit ArtMethod fields to be pointer sized

Changed the 64 bit entrypoint and gc map fields in ArtMethod to be
pointer sized. This saves a large amount of memory on 32 bit systems.
Reduces ArtMethod size by 16 bytes on 32 bit.

Total number of ArtMethod on low memory mako: 169957
Image size: 49203 methods -> 787248 image size reduction.
Zygote space size: 1070 methods -> 17120 size reduction.
App methods: ~120k -> 2 MB savings.

Savings per app on low memory mako: 125K+ per app
(fewer active apps -> more image methods per app).

Savings depend on how often the shared methods are on dirty pages vs
shared.

TODO in another CL, delete gc map field from ArtMethod since we
should be able to get it from the Oat method header.

Bug: 17643507

Change-Id: Ie9508f05907a9f693882d4d32a564460bf273ee8
32b12f8ae491e1acfeaee334e9a30c6c0a232072 17-Nov-2014 Sebastien Hertz <shertz@google.com> Fix art_quick_instrumentation_entry stub for x86/x86_64

Fixes bad stack offset for x86 where we read the return pc from an
incorrect location.

Fixes bad register for x86_64. The return pc is the 4th argument of
the called C function. It must be passed in rcx instead of r8 (which
is used for 5th argument).

Bug: 18170596
Change-Id: Idb521d2f6da415448fa61acf8b7d21076822830f
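
The rcx/r8 mix-up is easier to see next to the integer argument order of the System V AMD64 calling convention, which x86_64 Linux and Android use. The sketch below only encodes that order; `RegisterForArgument` is a hypothetical helper, not something from the ART sources.

```cpp
#include <cstddef>
#include <string>

// Integer/pointer argument registers, in order, under the System V AMD64 ABI.
constexpr const char* kIntegerArgRegisters[] = {"rdi", "rsi", "rdx",
                                                "rcx", "r8",  "r9"};

// Returns the register holding the n-th (1-based) integer argument, or an
// empty string when the position is invalid or the argument goes on the stack.
std::string RegisterForArgument(std::size_t position) {
  constexpr std::size_t kCount =
      sizeof(kIntegerArgRegisters) / sizeof(kIntegerArgRegisters[0]);
  if (position == 0 || position > kCount) {
    return std::string();
  }
  return kIntegerArgRegisters[position - 1];
}
```

With this table, the 4th argument (the return pc in the fix above) maps to rcx and the 5th to r8.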
1d8cdbc5202378a5f1a4b3a1fba610675ed4dcd5 23-Sep-2014 Ian Rogers <irogers@google.com> Refactor quick entrypoints

Remove FinishCalleeSaveFrameSetup.
Assembly routines write down anchor into TLS as well as placing runtime
method in callee save frame.
Simplify artSet64InstanceFromCode by not computing the referrer from the
stack in the C++ code.
Move assembly offset tests next to constant declaration and tidy arch_test.

Change-Id: Iededeebc05e54a1e2bb7bb3572b8ba012cffa1c8
677cd61ad05d993c4d3b22656675874f06d6aabc 15-Oct-2014 Ian Rogers <irogers@google.com> Make ART compile with GCC -O0 again.

Tidy up InstructionSetFeatures so that it has a type hierarchy dependent on
architecture.
Add to instruction_set_test to warn when InstructionSetFeatures don't agree
with ones from system properties, AT_HWCAP and /proc/cpuinfo.
Clean-up class linker entry point logic to not return entry points but to
test whether the passed code is the particular entrypoint. This works around
image trampolines that replicate entrypoints.
Bug: 17993736

(cherry picked from commit 6f3dbbadf4ce66982eb3d400e0a74cb73eb034f3)

Change-Id: I3e7595f437db4828072589d475a5453b7f31003e
6f3dbbadf4ce66982eb3d400e0a74cb73eb034f3 15-Oct-2014 Ian Rogers <irogers@google.com> Make ART compile with GCC -O0 again.

Tidy up InstructionSetFeatures so that it has a type hierarchy dependent on
architecture.
Add to instruction_set_test to warn when InstructionSetFeatures don't agree
with ones from system properties, AT_HWCAP and /proc/cpuinfo.
Clean-up class linker entry point logic to not return entry points but to
test whether the passed code is the particular entrypoint. This works around
image trampolines that replicate entrypoints.
Bug: 17993736

Change-Id: I5f4b49e88c3b02a79f9bee04f83395146ed7be23
832336b3c9eb892045a8de1bb12c9361112ca3c5 09-Oct-2014 Ian Rogers <irogers@google.com> Don't copy fill array data to quick literal pool.

Currently quick copies the fill array data from the dex file to the literal
pool. It then has to go through hoops to pass this PC relative address down
to out-of-line code. Instead, pass the offset of the table to the out-of-line
code and use the CodeItem data associated with the ArtMethod. This reduces
the size of oat code while greatly simplifying it.
Unify the FillArrayData implementation in quick, portable and the interpreters.

Change-Id: I9c6971cf46285fbf197856627368c0185fdc98ca
8ce6b9040747054b444a7fa706503cd257801936 26-Aug-2014 Dave Allison <dallison@google.com> Handle nested signals

This allows for signals to be raised inside the ART signal handler.
This can occur when the JavaStackTraceHandler attempts to generate
a stack trace and something goes wrong.

It also fixes an issue where the fault manager was not being
correctly shut down inside the signal chaining code. In this
case the signal handler was not restored to the original.

Bug: 17006816
Bug: 17133266

(cherry picked from commit fabe91e0d558936ac26b98d2b4ee1af08f58831d)

Change-Id: I10730ef52d5d8d34610a5293253b3be6caf4829e
fabe91e0d558936ac26b98d2b4ee1af08f58831d 26-Aug-2014 Dave Allison <dallison@google.com> Handle nested signals

This allows for signals to be raised inside the ART signal handler.
This can occur when the JavaStackTraceHandler attempts to generate
a stack trace and something goes wrong.

It also fixes an issue where the fault manager was not being
correctly shut down inside the signal chaining code. In this
case the signal handler was not restored to the original.

Bug: 17006816
Bug: 17133266
Change-Id: I9c25bf4f6921e6a107aefbdf47d2c0db9f41508f
37f05ef45e0393de812d51261dc293240c17294d 17-Jul-2014 Fred Shih <ffred@google.com> Reduced memory usage of primitive fields smaller than 4-bytes

Reduced memory used by byte and boolean fields from 4 bytes down to a
single byte and shorts and chars down to two bytes. Fields are now
arranged as Reference followed by decreasing component sizes, with
fields shuffled forward as needed.

Bug: 8135266
Change-Id: I65eaf31ed27e5bd5ba0c7d4606454b720b074752
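
One way to picture "references first, then decreasing sizes, with fields shuffled forward" is the C++ sketch below. It is an assumption-laden illustration rather than the class linker's algorithm: `Field`, `Gap`, and `LayOutFields` are hypothetical names, field sizes are assumed to be powers of two, and the starting offset is an arbitrary parameter.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical field description; `offset` is filled in by LayOutFields.
struct Field {
  std::string name;
  std::size_t size;  // Assumed to be a power of two (1, 2, 4, or 8).
  bool is_reference;
  std::size_t offset;
};

// Unused bytes left behind when a larger field had to be aligned.
struct Gap {
  std::size_t start;
  std::size_t size;
};

// Orders fields (references first, then decreasing size), assigns offsets,
// and lets smaller fields move forward into alignment gaps.
// Returns the offset just past the last placed field.
std::size_t LayOutFields(std::vector<Field>& fields, std::size_t first_offset) {
  std::stable_sort(fields.begin(), fields.end(),
                   [](const Field& a, const Field& b) {
                     if (a.is_reference != b.is_reference) {
                       return a.is_reference;
                     }
                     return a.size > b.size;
                   });
  std::vector<Gap> gaps;
  std::size_t end = first_offset;
  for (Field& field : fields) {
    bool placed = false;
    for (Gap& gap : gaps) {
      // Reuse a gap only if the field fits and stays naturally aligned.
      if (gap.size >= field.size && gap.start % field.size == 0) {
        field.offset = gap.start;
        gap.start += field.size;
        gap.size -= field.size;
        placed = true;
        break;
      }
    }
    if (placed) {
      continue;
    }
    // Round `end` up to the field's natural alignment, remembering any hole.
    const std::size_t aligned = (end + field.size - 1) / field.size * field.size;
    if (aligned > end) {
      gaps.push_back({end, aligned - end});
    }
    field.offset = aligned;
    end = aligned + field.size;
  }
  return end;
}
```

In this sketch a byte or boolean occupies one byte and a short or char two bytes, which is the saving the commit describes relative to giving every such field a 4-byte slot.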
29b3841ad8c1c18ee7ddd2d8cab85806b3d62eaa 13-Aug-2014 Andreas Gampe <agampe@google.com> ART: Set default visibility to protected

Set default visibility of symbols to protected. This allows the
linker to optimize internal calls and helps avoid plt calls.

Make almost all assembly stubs hidden, as -fvisibility does not
seem to apply to them. Change the assembly tests accordingly. This
also allows cleaning up previous hacks to avoid plt calls.

Bug: 16974467

(cherry picked from commit 235e77bd9f19e4faefda109be40f8744f3a66f40)

Change-Id: I9030dcf6116251f434f94a2b08e56e12085af652
b038ba66a166fb264ca121632f447712e0973b5b 14-Aug-2014 Dave Allison <dallison@google.com> Revert "Revert "Reduce stack usage for overflow checks""

Fixes stack protection issue.
Fixes mac build issue.

This reverts commit 83b1940e6482b9d8feba5c492507735686650ea5.

Change-Id: I7ba17252882b23a740bcda2ea94aacf398255406
4cf00ba324f5f6884059796a6ba41937f32e1844 14-Aug-2014 Dave Allison <dallison@google.com> Revert "Reduce stack usage for overflow checks"

This reverts commit 63c051a540e6dfc806f656b88ac3a63e99395429.

Change-Id: I282a048994fcd130fe73842b16c21680053c592f
03c9785a8a6d712775cf406c4371d0227c44148f 14-Aug-2014 Dave Allison <dallison@google.com> Revert "Revert "Reduce stack usage for overflow checks""

Fixes stack protection issue.
Fixes mac build issue.

This reverts commit 83b1940e6482b9d8feba5c492507735686650ea5.

Change-Id: I7ba17252882b23a740bcda2ea94aacf398255406
83b1940e6482b9d8feba5c492507735686650ea5 14-Aug-2014 Dave Allison <dallison@google.com> Revert "Reduce stack usage for overflow checks"

This reverts commit 63c051a540e6dfc806f656b88ac3a63e99395429.

Change-Id: I282a048994fcd130fe73842b16c21680053c592f
235e77bd9f19e4faefda109be40f8744f3a66f40 13-Aug-2014 Andreas Gampe <agampe@google.com> ART: Set default visibility to protected

Set default visibility of symbols to protected. This allows the
linker to optimize internal calls and helps avoid plt calls.

Make almost all assembly stubs hidden, as -fvisibility does not
seem to apply to them. Change the assembly tests accordingly. This
also allows cleaning up previous hacks to avoid plt calls.

Bug: 16974467
Change-Id: I9030dcf6116251f434f94a2b08e56e12085af652
63c051a540e6dfc806f656b88ac3a63e99395429 26-Jul-2014 Dave Allison <dallison@google.com> Reduce stack usage for overflow checks

This reduces the stack space reserved for overflow checks to 12K, split
into an 8K gap and a 4K protected region. GC needs over 8K when running
in a stack overflow situation.

Also prevents signal runaway by detecting a signal inside code that
resulted from a signal handler invocation. And adds a max signal count to
the SignalTest to prevent it running forever.

Also reduces the number of iterations for the InterfaceTest as this was
taking (almost) forever with the --trace option on run-test.

Bug: 15435566

Change-Id: Id4fd46f22d52d42a9eb431ca07948673e8fda694

Conflicts:
compiler/optimizing/code_generator_x86_64.cc
runtime/arch/x86/fault_handler_x86.cc
runtime/arch/x86_64/quick_entrypoints_x86_64.S
b0f05b9654eb005bc8c8e15f615a7f5a312f640c 17-Jul-2014 Dave Allison <dallison@google.com> Add implicit checks for x86_64 architecture.

This combines the x86 and x86_64 fault handlers into one. It also
merges in the change to the entrypoints for X86_64.

Replaces generic instruction length calculator with one that only
works with the specific instructions we use.

Bug: 16256184

Change-Id: I1e8ab5ad43f46060de9597615b423c89a836035c
Signed-off-by: Chao-ying Fu <chao-ying.fu@intel.com>
648d7112609dd19c38131b3e71c37bcbbd19d11e 26-Jul-2014 Dave Allison <dallison@google.com> Reduce stack usage for overflow checks

This reduces the stack space reserved for overflow checks to 12K, split
into an 8K gap and a 4K protected region. GC needs over 8K when running
in a stack overflow situation.

Also prevents signal runaway by detecting a signal inside code that
resulted from a signal handler invocation. And adds a max signal count to
the SignalTest to prevent it running forever.

Also reduces the number of iterations for the InterfaceTest as this was
taking (almost) forever with the --trace option on run-test.

Bug: 15435566

Change-Id: Id4fd46f22d52d42a9eb431ca07948673e8fda694
85fa796277d23e6bf1679cbd0da0019b03d8066b 10-Aug-2014 Dan Albert <danalbert@google.com> Fix more of the Mac build.

Change-Id: I0fa52ef73e86318bb68de2c69bbed81a00bfc3e0
dfd3b47813c14c5f1607cbe7b10a28b1b2f29cbc 17-Jul-2014 Dave Allison <dallison@google.com> Add implicit checks for x86_64 architecture.

This combines the x86 and x86_64 fault handlers into one. It also
merges in the change to the entrypoints for X86_64.

Replaces generic instruction length calculator with one that only
works with the specific instructions we use.

Bug: 16256184

Change-Id: I1e8ab5ad43f46060de9597615b423c89a836035c
Signed-off-by: Chao-ying Fu <chao-ying.fu@intel.com>
f5881ed68a05b371e7578966470ff3801b180578 23-Jul-2014 Andreas Gampe <agampe@google.com> ART: Fix x86_64 instrumentation_exit, also movsd -> movq

Change movd/movsd to movq.

Bug: 16386215

(cherry picked from commit fea29016a79f39ac12a4ba4ebdcbc86800c03427)

Change-Id: Icca71ca2aeeb2917aff46043051d6046f04395d4
fea29016a79f39ac12a4ba4ebdcbc86800c03427 23-Jul-2014 Andreas Gampe <agampe@google.com> ART: Fix x86_64 instrumentation_exit, also movsd -> movq

Change movd/movsd to movq.

Bug: 16386215
Change-Id: Icca71ca2aeeb2917aff46043051d6046f04395d4
ab088118d33caafb00815ab72ac0fd7374169f64 14-Jul-2014 Hiroshi Yamauchi <yamauchi@google.com> Add read barriers for the roots in Runtime.

Bug: 12687968
Change-Id: If26518a8251702cfe4d5cd7d1f50e80e342704cf
ae91207f068d3fac6c5a116bc9117f1f83201c6a 11-Jul-2014 Christopher Ferris <cferris@google.com> Fix mac build.

Change-Id: I34a330ee038c7216eb3c4bcecbff2eb0cfa08589
e9343344d9bd268a05d1eae1ce80a3278ec19c89 11-Jul-2014 Dave Allison <dallison@google.com> Fix mac build

Fixes x86 fault handler, sigchain and quick_entrypoints for x86_64.

Bug: 16215218
Change-Id: I5e58660ea815042968444e6352c57a5f53314cfd
c380191f3048db2a3796d65db8e5d5a5e7b08c65 08-Jul-2014 Serguei Katkov <serguei.i.katkov@intel.com> x86_64: Enable fp-reg promotion

Patch makes 4 registers, XMM12-15, available for promotion of
fp virtual registers.

Change-Id: I3f89ad07fc8ae98b70f550eada09be7b693ffb67
Signed-off-by: Serguei Katkov <serguei.i.katkov@intel.com>
Signed-off-by: Chao-ying Fu <chao-ying.fu@intel.com>
bc8a28896af5b99f0a42028f98bf0c74eb8047c9 11-Jul-2014 Christopher Ferris <cferris@google.com> Revert "Fix mac build"

This reverts commit e9343344d9bd268a05d1eae1ce80a3278ec19c89.

Change-Id: I43d1717af9c3b1237dcacec66f55a4e4b8e1f0fe
c200a4abeca91e19969f5b35543f17f812ba32b9 17-Jun-2014 Andreas Gampe <agampe@google.com> ART: Rework Generic JNI, add ARM version

Refactors and optimizes Generic JNI. This version uses TwoWordReturn
to avoid writing to / loading from the bottom of the alloca.

Change-Id: I3287007c976f79c9fd32d3b3a43f2d1371bf4cd3
c3ccc1039e0bbc0744f958cb8719cf96bce5b853 25-Jun-2014 Ian Rogers <irogers@google.com> Fix the Mac build on x86-64.

Change-Id: I4ed3783a96d844de0b0a295df26d0a48c02a3726
d3703d82a0afc28a4ea0cb0f6d88e9f8adc23e43 09-Jun-2014 Mark Mendell <mark.p.mendell@intel.com> X86_64: Pass 'hidden method index' in EAX

Method* is in EDI, and EAX isn't an argument register, so EAX is free
to hold the hidden method index.

Change-Id: I793a54d00a4593e140f97144419d849b53bfdf44
Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>
d58342caa97108ba413bad467c285c0377f138f5 05-Jun-2014 Andreas Gampe <agampe@google.com> ART: Add instrumentation stubs for ARM64 and X86-64

Adds instrumentation stubs necessary for debugger support.

Refactors MethodAndCode to a top-level TwoWordReturn. A function
having a return type of TwoWordReturn will return its two-word
content, either 2x32b or 2x64b, in two registers according to
the architecture's ABI.

Bug: 15443938
Change-Id: Id7e1fbd4ad8eb6f29e23d48903c76f77b28d981a
7c748c17109e35f316c3b1916dbe02d9c77e355c 06-Jun-2014 Serguei Katkov <serguei.i.katkov@intel.com> x86_64: Fix stubs after 4-byte method handler

The method handler is 4 bytes now and should be handled accordingly.

Change-Id: Ie373235f961eabfd33266bd89fbf8169a3714a03
Signed-off-by: Serguei Katkov <serguei.i.katkov@intel.com>
f208ae986d4b145978ded4c240501609a85997a5 29-May-2014 Dmitry Petrochenko <dmitry.petrochenko@intel.com> x86_64: Fix art_quick_aput_obj

The ebx register is not a scratch register on x86_64, so using it leads to
its corruption (seen on art/test 201). Replace ebx with ecx.

Change-Id: I7f5eeba47688ada5afba82a9303fa736f823d77e
Signed-off-by: Dmitry Petrochenko <dmitry.petrochenko@intel.com>
Signed-off-by: Serguei Katkov <serguei.i.katkov@intel.com>
ffddfdf6fec0b9d98a692e27242eecb15af5ead2 03-Jun-2014 Tim Murray <timmurray@google.com> DO NOT MERGE

Merge ART from AOSP to lmp-preview-dev.

Change-Id: I0f578733a4b8756fd780d4a052ad69b746f687a9
cf4035a4c41ccfcc3e89a0cee25f5218a11b0705 29-May-2014 Andreas Gampe <agampe@google.com> ART: Use StackReference in Quick Stack Frame

The method reference at the bottom of a quick frame is a stack
reference and not a native pointer. This is important for 64b
architectures, where the notions do not coincide.

Change key methods to have StackReference<mirror::ArtMethod>*
parameter instead of mirror::ArtMethod**. Make changes to
invoke stubs for 64b archs, change the frame setup for JNI code
(both generic JNI and compilers), tie up loose ends.

Tested on x86 and x86-64 with host tests. On x86-64, tests succeed
with jni compiler activated. x86-64 QCG was not tested.

Tested on ARM32 with device tests.

Fix ARM64 not saving x19 (used for wSUSPEND) on upcalls.

Tested on ARM64 in interpreter-only + generic-jni mode.

Fix ARM64 JNI Compiler to work with the CL.

Tested on ARM64 in interpreter-only + jni compiler.

Change-Id: I77931a0cbadd04d163b3eb8d6f6a6f8740578f13
1d4d7bdafd0c3d4df7bf8e907b08db9669db7023 23-May-2014 Alexei Zavjalov <alexei.zavjalov@intel.com> ART: refactor x86/x86-64 entrypoints

This patch:

- removes unused stubs in x86/64 runtimes (art_quick_l2d, art_quick_l2f
and art_quick_idivmod)
- replaces art_quick_fmod, art_quick_fmodf and art_quick_is_assignable
entrypoints in x86-64 with direct calls
- removes art_quick_indexof stub in x86-64

Change-Id: I6141c5c73b0b449fa3b866068b101e0be211b93e
Signed-off-by: Alexei Zavjalov <alexei.zavjalov@intel.com>
51f763506c7bb34420242e88e1631550f94d6417 21-May-2014 Andreas Gampe <agampe@google.com> ART: Add INVOKE_TRAMPOLINE and imt_conflict stub to 64b architectures

"Generalize" the return type notion of the interface helpers.

Includes a simple test for imt_conflict. The other interface
trampolines are as of yet untested.

Change-Id: I30fc75f5103766d57628ff22bcbac7c7f81037e3
421c53742610c053543f8c84e04d5e0c5185d68c 14-May-2014 Mathieu Chartier <mathieuc@google.com> Address comments from HandleScope change.

For:
https://android-review.googlesource.com/#/c/93793

Change-Id: I020d22a1508bf4f1770e6806d70e4fbb9a0fa0ab
eb8167a4f4d27fce0530f6724ab8032610cd146b 08-May-2014 Mathieu Chartier <mathieuc@google.com> Add Handle/HandleScope and delete SirtRef.

Delete SirtRef and replace it with Handle. Handles are value types
which wrap around StackReference*.

Renamed StackIndirectReferenceTable to HandleScope.

Added a scoped handle wrapper which wraps around an Object** and
restores it in its destructor.

Renamed Handle::get -> Get.

Bug: 8473721

Change-Id: Idbfebd4f35af629f0f43931b7c5184b334822c7a
78150c726559f0fe0828bcd4f320ba5c9c3e7cb0 05-May-2014 Dmitry Petrochenko <dmitry.petrochenko@intel.com> x86_64: Fix issues in entrypoints

Minor fixes, also avoiding duplicate restore
in art_quick_resolution_trampoline (084-class-init issue)

Change-Id: I9991accb286c3ea231054d5eeb6eefc229df80f6
Signed-off-by: Dmitry Petrochenko <dmitry.petrochenko@intel.com>
4fc046e78efbc98541388cdda986b5d8a2b951ad 07-May-2014 Andreas Gampe <agampe@google.com> ART: Add lock and unlock stubs for ARM64, fix for X86-64

Basic translation of ARM stubs using dmb memory barrier.

Fix placement of dmb in unlock_object of ARM and ARM64.

Update lock and unlock tests in stub_test to force fat locks.

Fix X86-64 unlock stub.

Change-Id: Ie2e4328d9631e06843115888644e75fde8b319ee
9d4e5e2c83682d12061338a5ce74f3119696ca57 06-May-2014 Andreas Gampe <agampe@google.com> ART: Clean field entrypoints for X86-64

The structure is highly regular. Introduce a handful of macros.

Change-Id: I84873be35987d3670491bee1417c7a2c09b233d0
8d07e2d37d1eed21c8ea7e87f63cd34fd2f9f833 05-May-2014 Alexei Zavjalov <alexei.zavjalov@intel.com> Implement field entrypoints for x86-64

This adds some stubs related to field access.

Change-Id: Ie950624fb2d00475cb2018c9c3d0a12ebd892b12
Signed-off-by: Alexei Zavjalov <alexei.zavjalov@intel.com>
80c7934b90c9f568c667ca52afbb807932ebf2f3 02-May-2014 Alexei Zavjalov <alexei.zavjalov@intel.com> Implement object lock and unlock entrypoints for x86-64

This patch adds an implementation of art_quick_lock_object and
art_quick_unlock_object stubs for x86-64 and enables their
testing in stub_test.

Change-Id: Ia373c9b0ebc7ebb959968464cf55607afd5384b0
Signed-off-by: Alexei Zavjalov <alexei.zavjalov@intel.com>
315ccab0d190af0482d9e656c66a184d87cab050 01-May-2014 Alexei Zavjalov <alexei.zavjalov@intel.com> Implement art_quick_string_compareto entrypoint for x86-64

This adds an implementation of the art_quick_string_compareto
entrypoint for the x86-64 platform.

Add a test to stub_test, enabled for arm, x86 and x86-64.

Change-Id: I71b318b03d4c8920ccb3723b59c43542e219bf47
Signed-off-by: Alexei Zavjalov <alexei.zavjalov@intel.com>
f4e910badbaa5aa48efbf697715f4815f0ebc7e4 30-Apr-2014 Andreas Gampe <agampe@google.com> Implement art_quick_aput_object stubs for X86-64 and ARM64

Implement the aput_object stubs for 64b architectures and enable
their testing in stub_test.

Fix missing @PLT for x86.

Add automatic _local labels in function definitions in x86-64 so we
can make local jumps (instead of PLT hoops).

Change-Id: I614b88fd5966acd8a564b87c47d4c50ee605320c
00c1e6d5fa6c2c20f25c38591b9780114bf7ddbf 26-Apr-2014 Andreas Gampe <agampe@google.com> Add ARM64 & X86_64 Assembly, plus tests

This adds assembly code or removes UNTESTED annotation from
TWO_ARG_DOWNCALL and THREE_ARG_DOWNCALL macros and supporting code,
generating working allocation stubs.

Some object and array allocation tests are added to the stub_test.

Change-Id: I5e93b7543c1e6dbd33b0d4cf564c7cbd963e74ef
5c1e4352614d61fed6868567e58b96682828cb4d 22-Apr-2014 Andreas Gampe <agampe@google.com> Add "arch_test" gtest for assembly stub constants, add some ARM64 assembly code

Add a test that (1) checks all callee-save method frame sizes for
all architectures, (2) checks thread offsets for the runtime
architecture and (3) checks callee-save method offsets for the
runtime architecture.

The "asm_support_XXX.h" files now only contain definitions that are
common between all architectures. Architecture-specific definitions
(i.e., special registers names) have been pushed into the corresponding
.S file. This change was required to be able to undefine definitions
in the test, so that multiple tests can be written in one file.

Test (1) above is in a sense two-stage. The arch_test gtest compares
constants (if it finds them) against the frame size as reported by
the ArtMethods created by the Runtime. This works for all architectures
as we can provide the instruction-set to CreateCalleeSaveMethod. The
second stage of the "test" consists of preprocessor tests with "#error" in the
case that the constants are not the expected value.

Optimally I'd like to change that to an actual runtime test exercising
the assembly code, which would also allow checking whether the right
registers are stored.

Also added missing assembly code for ARM64 for the callee-save macros.

Also fix X86_64 compilation for Clang 3.5.

Change-Id: I018e6433dffd3d31ba3bfcd75661653f4c7b6552
525cde2dd7cc7ad4212765ad3975cf260a934d3e 23-Apr-2014 Andreas Gampe <agampe@google.com> Add a GTest for Assembly Stubs, Add some ARM64 and X86-64 Stubs

This GTest adds some runtime testing for the stubs that does not
rely on the compiler. This should allow adding or updating the stubs
and testing them, especially on architectures without a working compiler.

This test is a bit dangerous: if it doesn't know how to handle an
architecture, it will only log a warning. This is so that testing
does not break at the moment. The warning is forced to stdout, too,
so that it is always visible.

Add art_quick_check_cast to ARM64 and X86-64. Add art_quick_memcpy
to X86-64. The latter should be removed in a good compiler, as it is
practically only overhead. Add minor CFI information in ARM.

Change-Id: Ia9c6d0f4035eb1527c12b5f6067dece59e25528d
47d00c0a893b13b69e4bac1836e10cc3e1812d41 17-Apr-2014 Ian Rogers <irogers@google.com> Add untested x86-64 downcall and exception assembly.

Change-Id: Ic555f9f5af8c3a2110a92e55772ff6c0128e5c19
dd7624d2b9e599d57762d12031b10b89defc9807 15-Mar-2014 Ian Rogers <irogers@google.com> Allow mixing of thread offsets between 32 and 64bit architectures.

Begin a fuller implementation of x86-64 REX prefixes.
Doesn't implement 64bit thread offset support for the JNI compiler.

Change-Id: If9af2f08a1833c21ddb4b4077f9b03add1a05147
fca82208f7128fcda09b6a4743199308332558a2 21-Mar-2014 Dmitry Petrochenko <dmitry.petrochenko@intel.com> x86_64: JNI compiler

Passed all tests from jni_compiler_test and art/test on host with jni_compiler.
Incoming argument spill is enabled, entry_spills refactored. Now each entry spill
contains data type size (4 or 8) and offset which should be used for spill.
Assembler REX support implemented in opcodes used in JNI compiler.
Please note, JNI compiler is not enabled by default yet (see compiler_driver.cc:1875).

Change-Id: I5fd19cca72122b197aec07c3708b1e80c324be44
Signed-off-by: Dmitry Petrochenko <dmitry.petrochenko@intel.com>
36fea8dd490ab6439f391b8cd7f366c59f026fd2 10-Mar-2014 Andreas Gampe <agampe@google.com> Fixing structure of native frame for Generic JNI

This changes the layout of the callee-save frame used in generic
JNI to be consistent with the JNI compiler, that is, the SIRT is
inline (above the method reference). Now the location of the
"this" object is consistent.

Change-Id: Ibad0882680712cb640b4c70ada0229ef7cf4e62c
b7dabf5215f807bda713dbc120eb63f038dbaeeb 12-Mar-2014 Ian Rogers <irogers@google.com> Implement proxy support for x86-64.

Change-Id: I91490a38347fdee413a191412f07b883654d4229
1a5706611bffa5d6ed6843ee5e320f504590e097 12-Mar-2014 Ian Rogers <irogers@google.com> A few 64bit fixes.

Change-Id: I1fe189d638b9cb5127b897da6cecdad6902db930
e0dcd46314d07eeb332edea292f5110178e4e3d2 09-Mar-2014 Ian Rogers <irogers@google.com> JNI down call fixes.

Ensure SIRT isn't accessed via quick callee save frame.
Some tidying of code.

Change-Id: I8fec3e89aa6d2e86789c60a07550db2e92478ca7
c147b00f86c28f5275c99c8ce515499c90c01e31 07-Mar-2014 Andreas Gampe <agampe@google.com> Release unused parts of a JNI frame before calling native code

Two-pass process for setting up the JNI frame so we can put Sirt
and native call stack as close together as possible.

Change-Id: I827167a55fafc4eba7d4eaf14a35fc69fd5f85ce
44d6ff197b340b5ac2a4094db148b39c366317dd 07-Mar-2014 Ian Rogers <irogers@google.com> Fix issues with clang and BUILD_HOST_64bit.

Change-Id: Id954d0c1144de6eaf89a4d27d205e3bf6ccb655f
befbd5731ecca08f08780ee28a913d08ffb14656 06-Mar-2014 Ian Rogers <irogers@google.com> Fix host architecture for 64bit.

Also, hack x86 assembler for use as a x86-64 trampoline compiler's assembler.
Implement missing x86-64 quick resolution trampoline.
Add x86-64 to the quick elf writer.

Change-Id: I08216c67014a83492ada12898ab8000218ba7bb4
bf6b92a158053c98b15f4393abb3b86344ec9a20 06-Mar-2014 Andreas Gampe <agampe@google.com> Generic JNI implementation for x86_64

Starting implementation for generic JNI on x86_64. Frames are of
large static size (>4K) right now, should be compacted later. Passes
the whole of jni_compiler_test.

Change-Id: I88ac3e13a534afe7568d62a1ef97cb766e8260e4
2da882315a61072664f7ce3c212307342e907207 27-Feb-2014 Andreas Gampe <agampe@google.com> Initial changes towards Generic JNI option

Some initial changes that lead to an UNIMPLEMENTED. Works
by not compiling for JNI right now and tracking native methods
which have neither quick nor portable code. Uses new trampoline.

Change-Id: I5448654044eb2717752fd7359f4ef8bd5c17be6e
936b37f3a7f224d990a36b2ec66782a4462180d6 14-Feb-2014 Ian Rogers <irogers@google.com> Upcall support for x86-64.

Sufficient to pass jni_internal_test.

Change-Id: Ia0d9b8241ab8450e04765b9c32eb6dc8fc1a8733
0177e53ea521ad58b70c305700dab32f1ac773b7 12-Feb-2014 Ian Rogers <irogers@google.com> Work in the direction of hard float quick ABIs.

Pass a shorty to ArtMethod::Invoke so that register setup can use it.
Document x86-64 ABI.
Add extra debug output for when JNI native method registration fails, namely a
dump of the Class and its dex file's location.
Add hack to get testing of OatMethod's without GC maps working in 64bit.

Change-Id: Ic06b68e18eac33637df2caf5e7e775ff95ae70f3
ef7d42fca18c16fbaf103822ad16f23246e2905d 06-Jan-2014 Ian Rogers <irogers@google.com> Object model changes to support 64bit.

Modify mirror objects so that references between them use an ObjectReference
value type rather than an Object* so that functionality to compress larger
references can be captured in the ObjectReference implementation.
ObjectReferences are 32bit and all other aspects of object layout remain as
they are currently.

Expand fields in objects holding pointers so they can hold 64bit pointers. It is
expected the size of these will come down by improving where we hold compiler
meta-data.
Stub out x86_64 architecture specific runtime implementation.
Modify OutputStream so that reads and writes are of unsigned quantities.
Make the use of portable or quick code more explicit.
Templatize AtomicInteger to support more than just int32_t as a type.
Add missing annotalysis information on the mutator lock, and fix issues
relating to it.
Refactor and share implementations for array copy between System and uses
elsewhere in the runtime.
Fix numerous 64bit build issues.

Change-Id: I1a5694c251a42c9eff71084dfdd4b51fff716822