History log of /art/compiler/dex/quick/x86/call_x86.cc
Revision Date Author Comments
03c9785a8a6d712775cf406c4371d0227c44148f 14-Aug-2014 Dave Allison <dallison@google.com> Revert "Revert "Reduce stack usage for overflow checks""

Fixes stack protection issue.
Fixes mac build issue.

This reverts commit 83b1940e6482b9d8feba5c492507735686650ea5.

Change-Id: I7ba17252882b23a740bcda2ea94aacf398255406
ab9a0dbf3b63d517da5278b8298e6cd316e09f68 14-Aug-2014 Dave Allison <dallison@google.com> Revert "Reduce stack usage for overflow checks"

This reverts commit 63c051a540e6dfc806f656b88ac3a63e99395429.

Change-Id: I282a048994fcd130fe73842b16c21680053c592f
63c051a540e6dfc806f656b88ac3a63e99395429 26-Jul-2014 Dave Allison <dallison@google.com> Reduce stack usage for overflow checks

This reduces the stack space reserved for overflow checks to 12K, split
into an 8K gap and a 4K protected region. GC needs over 8K when running
in a stack overflow situation.

Also prevents signal runaway by detecting a signal inside code that
resulted from a signal handler invokation. And adds a max signal count to
the SignalTest to prevent it running forever.

Also reduces the number of iterations for the InterfaceTest as this was
taking (almost) forever with the --trace option on run-test.

Bug: 15435566

Change-Id: Id4fd46f22d52d42a9eb431ca07948673e8fda694

Conflicts:
compiler/optimizing/code_generator_x86_64.cc
runtime/arch/x86/fault_handler_x86.cc
runtime/arch/x86_64/quick_entrypoints_x86_64.S
83b1940e6482b9d8feba5c492507735686650ea5 14-Aug-2014 Dave Allison <dallison@google.com> Revert "Reduce stack usage for overflow checks"

This reverts commit 63c051a540e6dfc806f656b88ac3a63e99395429.

Change-Id: I282a048994fcd130fe73842b16c21680053c592f
8c18c2aaedb171f9b03ec49c94b0e33449dc411b 06-Aug-2014 Andreas Gampe <agampe@google.com> ART: Generate chained compare-and-branch for short switches

Refactor Mir2Lir to generate chained compare-and-branch sequences
for short switches on all architectures.

Bug: 16241558

(cherry picked from commit 48971b3242e5126bcd800cc9c68df64596b43d13)

Change-Id: I0bb3071b8676523e90e0258e9b0e3fd69c1237f4
984305917bf57b3f8d92965e4715a0370cc5bcfb 28-Jul-2014 Andreas Gampe <agampe@google.com> ART: Rework quick entrypoint code in Mir2Lir, cleanup

To reduce the complexity of calling trampolines in generic code,
introduce an enumeration for entrypoints. Introduce a header that lists
the entrypoint enum and exposes a templatized method that translates an
enum value to the corresponding thread offset value.

Call helpers are rewritten to have an enum parameter instead of the
thread offset. Also rewrite LoadHelper and GenConversionCall this way.
It is now LoadHelper's duty to select the right thread offset size.

Introduce InvokeTrampoline virtual method to Mir2Lir. This allows to
further simplify the call helpers, as well as make OpThreadMem specific
to X86 only (removed from Mir2Lir).

Make GenInlinedCharAt virtual, move a copy to X86 backend, and simplify
both copies. Remove LoadBaseIndexedDisp and OpRegMem from Mir2Lir, as they
are now specific to X86 only.

Remove StoreBaseIndexedDisp from Mir2Lir, as it was only ever used in the
X86 backend.

Remove OpTlsCmp from Mir2Lir, as it was only ever used in the X86 backend.

Remove OpLea from Mir2Lir, as it was only ever defined in the X86 backend.

Remove GenImmedCheck from Mir2Lir as it was neither used nor implemented.

Change-Id: If0a6182288c5d57653e3979bf547840a4c47626e
147eb41b53729ec8d5c188d1cac90964a51afb8a 11-Jul-2014 Dave Allison <dallison@google.com> Revert "Revert "Revert "Revert "Add implicit null and stack checks for x86""""

This reverts commit 0025a86411145eb7cd4971f9234fc21c7b4aced1.

Bug: 16256184
Change-Id: Ie0760a0c293aa3b62e2885398a8c512b7a946a73

Conflicts:
compiler/dex/quick/arm64/target_arm64.cc
compiler/image_test.cc
runtime/fault_handler.cc
69dfe51b684dd9d510dbcb63295fe180f998efde 11-Jul-2014 Dave Allison <dallison@google.com> Revert "Revert "Revert "Revert "Add implicit null and stack checks for x86""""

This reverts commit 0025a86411145eb7cd4971f9234fc21c7b4aced1.

Bug: 16256184
Change-Id: Ie0760a0c293aa3b62e2885398a8c512b7a946a73
ccc60264229ac96d798528d2cb7dbbdd0deca993 05-Jul-2014 Andreas Gampe <agampe@google.com> ART: Rework TargetReg(symbolic_reg, wide)

Make the standard implementation in Mir2Lir and the specialized one
in the x86 backend return a pair when wide = "true". Introduce
WideKind enumeration to improve code readability. Simplify generic
code based on this implementation.

Change-Id: I670d45aa2572eedfdc77ac763e6486c83f8e26b4
7fb36ded9cd5b1d254b63b3091f35c1e6471b90e 10-Jul-2014 Dave Allison <dallison@google.com> Revert "Revert "Add implicit null and stack checks for x86""

Fixes x86_64 cross compile issue. Removes command line options
and property to set implicit checks - this is hard coded now.

This reverts commit 3d14eb620716e92c21c4d2c2d11a95be53319791.

Change-Id: I5404473b5aaf1a9c68b7181f5952cb174d93a90d
c380191f3048db2a3796d65db8e5d5a5e7b08c65 08-Jul-2014 Serguei Katkov <serguei.i.katkov@intel.com> x86_64: Enable fp-reg promotion

Patch introduces 4 register XMM12-15 available for promotion of
fp virtual registers.

Change-Id: I3f89ad07fc8ae98b70f550eada09be7b693ffb67
Signed-off-by: Serguei Katkov <serguei.i.katkov@intel.com>
Signed-off-by: Chao-ying Fu <chao-ying.fu@intel.com>
0025a86411145eb7cd4971f9234fc21c7b4aced1 11-Jul-2014 Nicolas Geoffray <ngeoffray@google.com> Revert "Revert "Revert "Add implicit null and stack checks for x86"""

Broke the build.

This reverts commit 7fb36ded9cd5b1d254b63b3091f35c1e6471b90e.

Change-Id: I9df0e7446ff0913a0e1276a558b2ccf6c8f4c949
34e826ccc80dc1cf7c4c045de6b7f8360d504ccf 29-May-2014 Dave Allison <dallison@google.com> Add implicit null and stack checks for x86

This adds compiler and runtime changes for x86
implicit checks. 32 bit only.

Both host and target are supported.
By default, on the host, the implicit checks are null pointer and
stack overflow. Suspend is implemented but not switched on.

Change-Id: I88a609e98d6bf32f283eaa4e6ec8bbf8dc1df78a
3d14eb620716e92c21c4d2c2d11a95be53319791 10-Jul-2014 Dave Allison <dallison@google.com> Revert "Add implicit null and stack checks for x86"

It breaks cross compilation with x86_64.

This reverts commit 34e826ccc80dc1cf7c4c045de6b7f8360d504ccf.

Change-Id: I34ba07821fc0a022fda33a7ae21850957bbec5e7
407a9d2847161b843966a443b71760b1280bd396 04-Jul-2014 Serguei Katkov <serguei.i.katkov@intel.com> Clean-up call_x86.cc

Also adds some DCHECKs and fixes for the bugs found by them.

Change-Id: I455bbfe2c6018590cf491880cd9273edbe39c4c7
Signed-off-by: Serguei Katkov <serguei.i.katkov@intel.com>
a77ee5103532abb197f492c14a9e6fb437054e2a 02-Jul-2014 Chao-ying Fu <chao-ying.fu@intel.com> x86_64: TargetReg update for x86

Also includes changes in common code. Elimination of use of TargetReg
with one parameter and direct access to special target registers.

Change-Id: Ied2c1f87d4d1e4345248afe74bca40487a46a371
Signed-off-by: Serguei Katkov <serguei.i.katkov@intel.com>
Signed-off-by: Chao-ying Fu <chao-ying.fu@intel.com>
dd64450b37776f68b9bfc47f8d9a88bc72c95727 01-Jul-2014 Elena Sayapina <elena.v.sayapina@intel.com> x86_64: Unify 64-bit check in x86 compiler

Update x86-specific Gen64Bit() check with the CompilationUnit target64 field
which is set using unified Is64BitInstructionSet(InstructionSet) check.

Change-Id: Ic00ac863ed19e4543d7ea878d6c6c76d0bd85ce8
Signed-off-by: Elena Sayapina <elena.v.sayapina@intel.com>
de68676b24f61a55adc0b22fe828f036a5925c41 24-Jun-2014 Andreas Gampe <agampe@google.com> Revert "ART: Split out more cases of Load/StoreRef, volatile as parameter"

This reverts commit 2689fbad6b5ec1ae8f8c8791a80c6fd3cf24144d.

Breaks the build.

Change-Id: I9faad4e9a83b32f5f38b2ef95d6f9a33345efa33
3c12c512faf6837844d5465b23b9410889e5eb11 24-Jun-2014 Andreas Gampe <agampe@google.com> Revert "Revert "ART: Split out more cases of Load/StoreRef, volatile as parameter""

This reverts commit de68676b24f61a55adc0b22fe828f036a5925c41.

Fixes an API comment, and differentiates between inserting and appending.

Change-Id: I0e9a21bb1d25766e3cbd802d8b48633ae251a6bf
2689fbad6b5ec1ae8f8c8791a80c6fd3cf24144d 23-Jun-2014 Andreas Gampe <agampe@google.com> ART: Split out more cases of Load/StoreRef, volatile as parameter

Splits out more cases of ref registers being loaded or stored. For
code clarity, adds volatile as a flag parameter instead of a separate
method.

On ARM64, continue cleanup. Add flags to print/fatal on size mismatches.

Change-Id: I30ed88433a6b4ff5399aefffe44c14a5e6f4ca4e
7cd26f355ba83be75b72ed628ed5ee84a3245c4f 19-Jun-2014 Andreas Gampe <agampe@google.com> ART: Target-dependent stack overflow, less check elision

Refactor the separate stack overflow reserved sizes from thread.h
into instruction_set.h and make sure they're used in the compiler.

Refactor the decision on when to elide stack overflow checks:
especially with large interpreter stack frames, it is not a good
idea to elide checks when the frame size is even close to the
reserved size. Currently enforce checks when the frame size is
>= 2KB, but make sure that frame sizes 1KB and below will elide
the checks (number from experience).

Bug: 15728765
Change-Id: I016bfd3d8218170cbccbd123ed5e2203db167c06
33ae5583bdd69847a7316ab38a8fa8ccd63093ef 12-Jun-2014 buzbee <buzbee@google.com> Arm64 hard-float

Basic enabling of hard-float for Arm64. In future CLs we'll
consolidate the various targets - there is a lot of overlap.

Compilation remains turned off in this CL, but I expect
to enable a subset shortly. With compilation fully enabled
(including the EXPERIMENTAL opcodes with the exception of
REM and THROW), we get the following run-test results:

003-omnibus-opcode failures:
Classes.checkCast
Classes.arrayInstance
UnresTest2
Haven't gone deep, but these appear to be related to throw/catch and/or
stacktrace.

For REM, the generated code looks reasonable to me - my guess is that
we've got something wrong on the transition to the runtime. Haven't
looked deeper yet, though.

The bulk of the other failure also appear to be related to transitioning
to the runtime system, or handling try/catch.

run-test status:
Status with optimizations disabled, REM_FLOAT/DOUBLE and THROW disabled:
succeeded tests: 94
failed tests: 22
failed: 003-omnibus-opcodes
failed: 004-annotations
failed: 009-instanceof2
failed: 024-illegal-access
failed: 025-access-controller
failed: 031-class-attributes
failed: 044-proxy
failed: 045-reflect-array
failed: 046-reflect
failed: 058-enum-order
failed: 062-character-encodings
failed: 063-process-manager
failed: 064-field-access
failed: 068-classloader
failed: 071-dexfile
failed: 083-compiler-regressions
failed: 084-class-init
failed: 086-null-super
failed: 087-gc-after-link
failed: 100-reflect2
failed: 107-int-math2
failed: 201-built-in-exception-detail-messages

Change-Id: Ib66209285cad8998d77a14781de300af02a96b15
e0ccdc0dd166136cd43e5f54201179a4496d33e8 07-Jun-2014 Chao-ying Fu <chao-ying.fu@intel.com> x86_64: Add long bytecode supports (1/2)

This patch includes switch enabling and GenFillArray,
assembler changes, updates of regalloc behavior for 64-bit,
usage in basic utility operations, loading constants,
and update for memory operations.

Change-Id: I6d8aa35a75c5fd01d69c38a770c3398d0188cc8a
Signed-off-by: Chao-ying Fu <chao-ying.fu@intel.com>
Signed-off-by: Serguei Katkov <serguei.i.katkov@intel.com>
Signed-off-by: Dmitry Petrochenko <dmitry.petrochenko@intel.com>
Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>
576ca0cd692c0b6ae70e776de91015b8ff000a08 07-Jun-2014 Ian Rogers <irogers@google.com> Reduce header files including header files.

Main focus is getting heap.h out of runtime.h.

Change-Id: I8d13dce8512816db2820a27b24f5866cc871a04b
a0cd2d701f29e0bc6275f1b13c0edfd4ec391879 01-Jun-2014 buzbee <buzbee@google.com> Quick compiler: reference cleanup

For 32-bit targets, object references are 32 bits wide both in
Dalvik virtual registers and in core physical registers. Because of
this, object references and non-floating point values were both
handled as if they had the same register class (kCoreReg).

However, for 64-bit systems, references are 32 bits in Dalvik vregs, but
64 bits in physical registers. Although the same underlying physical
core registers will still be used for object reference and non-float
values, different register class views will be used to represent them.
For example, an object reference in arm64 might be held in x3 at some
point, while the same underlying physical register, w3, would be used
to hold a 32-bit int.

This CL breaks apart the handling of object reference and non-float values
to allow the proper register class (or register view) to be used. A
new register class, kRefReg, is introduced which will map to a 32-bit
core register on 32-bit targets, and 64-bit core registers on 64-bit
targets. From this point on, object references should be allocated
registers in the kRefReg class rather than kCoreReg.

Change-Id: I6166827daa8a0ea3af326940d56a6a14874f5810
9ee801f5308aa3c62ae3bedae2658612762ffb91 12-May-2014 Dmitry Petrochenko <dmitry.petrochenko@intel.com> Add x86_64 code generation support

Utilizes r0..r7 in register allocator, implements spill/unsill
core regs as well as operations with stack pointer.

Change-Id: I973d5a1acb9aa735f6832df3d440185d9e896c67
Signed-off-by: Dmitry Petrochenko <dmitry.petrochenko@intel.com>
2f244e9faccfcca68af3c5484c397a01a1c3a342 08-May-2014 Andreas Gampe <agampe@google.com> ART: Add more ThreadOffset in Mir2Lir and backends

This duplicates all methods with ThreadOffset parameters, so that
both ThreadOffset<4> and ThreadOffset<8> can be handled. Dynamic
checks against the compilation unit's instruction set determine
which pointer size to use and therefore which methods to call.

Methods with unsupported pointer sizes should fatally fail, as
this indicates an issue during method selection.

Change-Id: Ifdb445b3732d3dc5e6a220db57374a55e91e1bf6
091cc408e9dc87e60fb64c61e186bea568fc3d3a 31-Mar-2014 buzbee <buzbee@google.com> Quick compiler: allocate doubles as doubles

Significant refactoring of register handling to unify usage across
all targets & 32/64 backends.

Reworked RegStorage encoding to allow expanded use of
x86 xmm registers; removed vector registers as a separate
register type. Reworked RegisterInfo to describe aliased
physical registers. Eliminated quite a bit of target-specific code
and generalized common code.

Use of RegStorage instead of int for registers now propagated down
to the NewLIRx() level. In future CLs, the NewLIRx() routines will
be replaced with versions that are explicit about what kind of
operand they expect (RegStorage, displacement, etc.). The goal
is to eventually use RegStorage all the way to the assembly phase.

TBD: MIPS needs verification.
TBD: Re-enable liveness tracking.

Change-Id: I388c006d5fa9b3ea72db4e37a19ce257f2a15964
6ffcfa04ebb2660e238742a6000f5ccebdd5df15 25-Apr-2014 Mingyao Yang <mingyao@google.com> Rewrite suspend test check with LIRSlowPath.

Change-Id: I2dc17d079655586bfc588349c7a04afc2c6879af
695d13a82d6dd801aaa57a22a9d4b3f6db0d0fdb 19-Apr-2014 buzbee <buzbee@google.com> Update load/store utilities for 64-bit backends

This CL replaces the typical use of LoadWord/StoreWord
utilities (which, in practice, were 32-bit load/store) in
favor of a new set that make the size explicit. We now have:

LoadWordDisp/StoreWordDisp:
32 or 64 depending on target. Load or store the natural
word size. Expect this to be used infrequently - generally
when we know we're dealing with a native pointer or flushed
register not holding a Dalvik value (Dalvik values will flush
to home location sizes based on Dalvik, rather than the target).

Load32Disp/Store32Disp:
Load or store 32 bits, regardless of target.

Load64Disp/Store64Disp:
Load or store 64 bits, regardless of target.

LoadRefDisp:
Load a 32-bit compressed reference, and expand it to the
natural word size in the target register.

StoreRefDisp:
Compress a reference held in a register of the natural word
size and store it as a 32-bit compressed reference.

Change-Id: I50fcbc8684476abd9527777ee7c152c61ba41c6f
3a74d15ccc9a902874473ac9632e568b19b91b1c 22-Apr-2014 Mingyao Yang <mingyao@google.com> Delete throw launchpads.

Bug: 13170824

Change-Id: I9d5834f5a66f5eb00f2ac80774e8c27dea99949e
d6ed642458c8820e1beca72f3d7b5f0be4a4b64b 10-Apr-2014 Dave Allison <dallison@google.com> Revert "Revert "Revert "Use trampolines for calls to helpers"""

This reverts commit f9487c039efb4112616d438593a2ab02792e0304.

Change-Id: Id48a4aae4ecce73db468587967968a3f7618b700
f9487c039efb4112616d438593a2ab02792e0304 09-Apr-2014 Dave Allison <dallison@google.com> Revert "Revert "Use trampolines for calls to helpers""

This reverts commit 081f73e888b3c246cf7635db37b7f1105cf1a2ff.

Change-Id: Ibd777f8ce73cf8ed6c4cb81d50bf6437ac28cb61

Conflicts:
compiler/dex/quick/mir_to_lir.h
081f73e888b3c246cf7635db37b7f1105cf1a2ff 07-Apr-2014 Dave Allison <dallison@google.com> Revert "Use trampolines for calls to helpers"

This reverts commit 754ddad084ccb610d0cf486f6131bdc69bae5bc6.

Change-Id: Icd979adee1d8d781b40a5e75daf3719444cb72e8
754ddad084ccb610d0cf486f6131bdc69bae5bc6 19-Feb-2014 Dave Allison <dallison@google.com> Use trampolines for calls to helpers

This is an ARM specific optimization to the compiler
that uses trampoline islands to make calls to runtime
helper functions. The intention is to reduce the size
of the generated code (by 2 bytes per call) without
affecting performance.

By default this is on when generating an OAT file. It is
off when compiling to memory.

To switch this off in dex2oat, use the command line option:
--no-helper-trampolines

Enhances disassembler to print the trampoline entry on the
BL instruction like this:

0xb6a850c0: f7ffff9e bl -196 (0xb6a85000) ; pTestSuspend

Bug: 12607709
Change-Id: I9202bdb7cf21252ad807bd48701f1f6ce8e3d0fe
dd7624d2b9e599d57762d12031b10b89defc9807 15-Mar-2014 Ian Rogers <irogers@google.com> Allow mixing of thread offsets between 32 and 64bit architectures.

Begin a more full implementation x86-64 REX prefixes.
Doesn't implement 64bit thread offset support for the JNI compiler.

Change-Id: If9af2f08a1833c21ddb4b4077f9b03add1a05147
2700f7e1edbcd2518f4978e4cd0e05a4149f91b6 07-Mar-2014 buzbee <buzbee@google.com> Continuing register cleanup

Ready for review.

Continue the process of using RegStorage rather than
ints to hold register value in the top layers of codegen.
Given the huge number of changes in this CL, I've attempted
to minimize the number of actual logic changes. With this
CL, the use of ints for registers has largely been eliminated
except in the lowest utility levels. "Wide" utility routines
have been updated to take a single RegStorage rather than
a pair of ints representing low and high registers.

Upcoming CLs will be smaller and more targeted. My expectations:
o Allocate float double registers as a single double rather than
a pair of float single registers.
o Refactor to push code which assumes long and double Dalvik
values are held in a pair of register to the target dependent
layer.
o Clean-up of the xxx_mir.h files to reduce the amount of #defines
for registers. May also do a register renumbering to bring all
of our targets' register naming more consistent. Possibly
introduce a target-independent float/non-float test at the
RegStorage level.

Change-Id: I646de7392bdec94595dd2c6f76e0f1c4331096ff
0d507d1e0441e6bd6f3affca3a60774ea920f317 19-Mar-2014 Mathieu Chartier <mathieuc@google.com> Optimize stack overflow handling.

We now subtract the frame size from the stack pointer for methods
which have a frame smaller than a certain size. Also changed code to
use slow paths instead of launchpads.

Delete kStackOverflow launchpad since it is no longer needed.

ARM optimizations:
One less move per stack overflow check (without fault handler for
stack overflows). Use ldr pc instead of ldr r12, b r12.
Code size (boot.oat):
Before: 58405348
After: 57803236

TODO: X86 doesn't have the case for large frames. This could case an
incoming signal to go past the end of the stack (unlikely however).

Change-Id: Ie3a5635cd6fb09de27960e1f8cee45bfae38fb33
60d7a65f7fb60f502160a2e479e86014c7787553 14-Mar-2014 Brian Carlstrom <bdc@google.com> Fix stack overflow for mutual recursion.

There was an error where we would have a pc that was in the method
which generated the stack overflow. This didn't work however
because the stack overflow check was before we stored the method in
the stack. The result was that the stack overflow handler had a PC
which wasnt necessarily in the method at the top of the stack. This
is now fixed by always restoring the link register before branching
to the throw entrypoint.

Slight code size regression on ARM/Mips (unmeasured). Regression on ARM
is 4 bytes of code per stack overflow check. Some of this regression is
mitigated by having one less GC safepoint.

Also adds test case for StackOverflowError issue (from bdc).

Tests passing: ARM, X86, Mips
Phone booting: ARM

Bug: https://code.google.com/p/android/issues/detail?id=66411
Bug: 12967914
Change-Id: I96fe667799458b58d1f86671e051968f7be78d5d

(cherry-picked from c0f96d03a1855fda7d94332331b94860404874dd)
c0f96d03a1855fda7d94332331b94860404874dd 14-Mar-2014 Brian Carlstrom <bdc@google.com> Fix stack overflow for mutual recursion.

There was an error where we would have a pc that was in the method
which generated the stack overflow. This didn't work however
because the stack overflow check was before we stored the method in
the stack. The result was that the stack overflow handler had a PC
which wasnt necessarily in the method at the top of the stack. This
is now fixed by always restoring the link register before branching
to the throw entrypoint.

Slight code size regression on ARM/Mips (unmeasured). Regression on ARM
is 4 bytes of code per stack overflow check. Some of this regression is
mitigated by having one less GC safepoint.

Also adds test case for StackOverflowError issue (from bdc).

Tests passing: ARM, X86, Mips
Phone booting: ARM

Bug: https://code.google.com/p/android/issues/detail?id=66411
Bug: 12967914
Change-Id: I96fe667799458b58d1f86671e051968f7be78d5d
83cc7ae96d4176533dd0391a1591d321b0a87f4f 12-Feb-2014 Vladimir Marko <vmarko@google.com> Create a scoped arena allocator and use that for LVN.

This saves more than 0.5s of boot.oat compilation time
on Nexus 5.

TODO: Move other stuff to the scoped allocator. This CL
alone increases the peak memory allocation. By reusing
the memory for other parts of the compilation we should
reduce this overhead.

Change-Id: Ifbc00aab4f3afd0000da818dfe68b96713824a08
00e1ec6581b5b7b46ca4c314c2854e9caa647dd2 28-Feb-2014 Bill Buzbee <buzbee@android.com> Revert "Revert "Rework Quick compiler's register handling""

This reverts commit 86ec520fc8b696ed6f164d7b756009ecd6e4aace.

Ready. Fixed the original type, plus some mechanical changes
for rebasing.

Still needs additional testing, but the problem with the original
CL appears to have been a typo in the definition of the x86
double return template RegLocation.

Change-Id: I828c721f91d9b2546ef008c6ea81f40756305891
ae9fd93c39a341e2dffe15c61cc7d9e841fa92c4 11-Feb-2014 Mark Mendell <mark.p.mendell@intel.com> Tell GDB about Quick ART generated code

This is actually a lot of work. To do this, we need:
.debug_info
.debug_abbrev
.debug_frame
.debug_str

These are generated into the OAT file by OatWriter and ElfWriterQuick.

Since the Quick ART runtime doesn't use dlopen to load the OAT files,
GDB can't find this information. Use the alternate GDB JIT interface,
which can be invoked at runtime. To use this interface, an ELF image
needs to be built in memory. Read the information from the OAT file,
fixup the addresses to point to the real locations, add a symbol table
to hold the .text symbol, and then let GDB know about the information,
which will be read from the runtime address space.

This is quite primitive now, and could be cleaned up considerably. It
probably needs symbol table entries for the methods, and descriptions of
parameters and return types.

Currently only supported for X86.

This defaults to enabled for debug builds. Added dexoat --gen-gdb-info
and --no-gen-gdb-info flags to override.

Change-Id: I4d18b2370f6dfaa00c8cc1925f10717be3bd1a62
Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>
86ec520fc8b696ed6f164d7b756009ecd6e4aace 26-Feb-2014 Bill Buzbee <buzbee@android.com> Revert "Rework Quick compiler's register handling"

This reverts commit 2c1ed456dcdb027d097825dd98dbe48c71599b6c.

Change-Id: If88d69ba88e0af0b407ff2240566d7e4545d8a99
2c1ed456dcdb027d097825dd98dbe48c71599b6c 20-Feb-2014 buzbee <buzbee@google.com> Rework Quick compiler's register handling

For historical reasons, the Quick backend found it convenient
to consider all 64-bit Dalvik values held in registers
to be contained in a pair of 32-bit registers. Though this
worked well for ARM (with double-precision registers also
treated as a pair of 32-bit single-precision registers) it doesn't
play well with other targets. And, it is somewhat problematic
for 64-bit architectures.

This is the first of several CLs that will rework the way the
Quick backend deals with physical registers. The goal is to
eliminate the "64-bit value backed with 32-bit register pair"
requirement from the target-indendent portions of the backend
and support 64-bit registers throughout.

The key RegLocation struct, which describes the location of
Dalvik virtual register & register pairs, previously contained
fields for high and low physical registers. The low_reg and
high_reg fields are being replaced with a new type: RegStorage.
There will be a single instance of RegStorage for each RegLocation.
Note that RegStorage does not increase the space used. It is
16 bits wide, the same as the sum of the 8-bit low_reg and
high_reg fields.

At a target-independent level, it will describe whether the physical
register storage associated with the Dalvik value is a single 32
bit, single 64 bit, pair of 32 bit or vector. The actual register
number encoding is left to the target-dependent code layer.

Because physical register handling is pervasive throughout the
backend, this restructuring necessarily involves large CLs with
lots of changes. I'm going to roll these out in stages, and
attempt to segregate the CLs with largely mechanical changes from
those which restructure or rework the logic.

This CL is of the mechanical change variety - it replaces low_reg
and high_reg from RegLocation and introduces RegStorage. It also
includes a lot of new code (such as many calls to GetReg())
that should go away in upcoming CLs.

The tentative plan for the subsequent CLs is:

o Rework standard register utilities such as AllocReg() and
FreeReg() to use RegStorage instead of ints.
o Rework the target-independent GenXXX, OpXXX, LoadValue,
StoreValue, etc. routines to take RegStorage rather than
int register encodings.
o Take advantage of the vector representation and eliminate
the current vector field in RegLocation.
o Replace the "wide" variants of codegen utilities that take
low_reg/high_reg pairs with versions that use RegStorage.
o Add 64-bit register target independent codegen utilities
where possible, and where not virtualize with 32-bit general
register and 64-bit general register variants in the target
dependent layer.
o Expand/rework the LIR def/use flags to allow for more registers
(currently, we lose out on 16 MIPS floating point regs as
well as ARM's D16..D31 for lack of space in the masks).
o [Possibly] move the float/non-float determination of a register
from the target-dependent encoding to RegStorage. In other
words, replace IsFpReg(register_encoding_bits).

At the end of the day, all code in the target independent layer
should be using RegStorage, as should much of the target dependent
layer. Ideally, we won't be using the physical register number
encoding extracted from RegStorage (i.e. GetReg()) until the
NewLIRx() layer.

Change-Id: Idc5c741478f720bdd1d7123b94e4288be5ce52cb
3bc01748ef1c3e43361bdf520947a9d656658bf8 06-Feb-2014 Razvan A Lupusoru <razvan.a.lupusoru@intel.com> GenSpecialCase support for x86

Moved GenSpecialCase from being ARM specific to common code to allow
it to be used by x86 quick as well.

Change-Id: I728733e8f4c4da99af6091ef77e5c76ae0fee850
Signed-off-by: Razvan A Lupusoru <razvan.a.lupusoru@intel.com>
55d0eac918321e0525f6e6491f36a80977e0d416 06-Feb-2014 Mark Mendell <mark.p.mendell@intel.com> Support Direct Method/Type access for X86

Thumb generates code to optimize calls to methods within core.oat.
Implement this for X86 as well, but take advantage of mov with 32 bit
immediate and call relative with 32 bit immediate.

Fix some incorrect return locations for long inlines.

Change-Id: I1907bdfc7574f3d0aa76c7fad13dc537acdf1ed3
Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>
67c39c4aefca23cb136157b889c09ee200b3dec6 01-Feb-2014 Mark Mendell <mark.p.mendell@intel.com> Support Literal pools for x86

They are being used to store double constants, which are very
expensive to generate into XMM registers. Uses the 'Compiler
Temporary' support just added. The MIR instructions are scanned for
a reference to a double constant, a packed switch or a FillArray.
These all need the address of the start of the method, since 32
bit x86 doesn't have a PC-relative addressing mode.

If needed, a compiler temporary is allocated, and the address of
the base of the method is calculated, and stored. Later uses can
just refer to the saved value.

Trickiness comes when generating the load from the literal area,
as the offset is unknown before final assembler. Assume a 32 bit
displacement is needed, and fix this if it wasn't necessary.

Use LoadValue to load the 'base of method' pointer. Fix an incorrect
test in GetRegLocation.

Change-Id: I53ffaa725dabc370e9820c4e0e78664ede3563e6
Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>
5816ed48bc339c983b40dc493e96b97821ce7966 27-Nov-2013 Vladimir Marko <vmarko@google.com> Detect special methods at the end of verification.

This moves special method handling to method inliner
and prepares for eventual inlining of these methods.

Change-Id: I51c51b940fb7bc714e33135cd61be69467861352
0d82948094d9a198e01aa95f64012bdedd5b6fc9 12-Oct-2013 buzbee <buzbee@google.com> 64-bit prep

Preparation for 64-bit roll.
o Eliminated storing pointers in 32-bit int slots in LIR.
o General size reductions of common structures to reduce impact
of doubled pointer sizes:
- BasicBlock struct was 72 bytes, now is 48.
- MIR struct was 72 bytes, now is 64.
- RegLocation was 12 bytes, now is 8.
o Generally replaced uses of BasicBlock* pointers with 16-bit Ids.
o Replaced several doubly-linked lists with singly-linked to save
one stored pointer per node.
o We had quite a few uses of uintptr_t's that were a holdover from
the JIT (which used pointers to mapped dex & actual code cache
addresses rather than trace-relative offsets). Replaced those with
uint32_t's.
o Clean up handling of embedded data for switch tables and array data.
o Miscellaneous cleanup.

I anticipate one or two additional CLs to reduce the size of MIR and LIR
structs.

Change-Id: I58e426d3f8e5efe64c1146b2823453da99451230
d9c4fc94fa618617f94e1de9af5f034549100753 02-Oct-2013 Ian Rogers <irogers@google.com> Inflate contended lock word by suspending owner.

Bug 6961405.
Don't inflate monitors for Notify and NotifyAll.
Tidy lock word, handle recursive lock case alongside unlocked case and move
assembly out of line (except for ARM quick). Also handle null in out-of-line
assembly as the test is quick and the enter/exit code is already a safepoint.
To gain ownership of a monitor on behalf of another thread, monitor contenders
must not hold the monitor_lock_, so they wait on a condition variable.
Reduce size of per mutex contention log.
Be consistent in calling thin lock thread ids just thread ids.
Fix potential thread death races caused by the use of FindThreadByThreadId,
make it invariant that returned threads are either self or suspended now.

Code size reduction on ARM boot.oat 0.2%.
Old nexus 7 speedup 0.25%, new nexus 7 speedup 1.4%, nexus 10 speedup 2.24%,
nexus 4 speedup 2.09% on DeltaBlue.

Change-Id: Id52558b914f160d9c8578fdd7fc8199a9598576a
f6c4b3ba3825de1dbb3e747a68b809c6cc8eb4db 25-Aug-2013 Mathieu Chartier <mathieuc@google.com> New arena memory allocator.

Before we were creating arenas for each method. The issue with doing this
is that we needed to memset each memory allocation. This can be improved
if you start out with arenas that contain all zeroed memory and recycle
them for each method. When you give memory back to the arena pool you do
a single memset to zero out all of the memory that you used.

Always inlined the fast path of the allocation code.

Removed the "zero" parameter since the new arena allocator always returns
zeroed memory.

Host dex2oat time on target oat apks (2 samples each).
Before:
real 1m11.958s
user 4m34.020s
sys 1m28.570s

After:
real 1m9.690s
user 4m17.670s
sys 1m23.960s

Target device dex2oat samples (Mako, Thinkfree.apk):
Without new arena allocator:
0m26.47s real 0m54.60s user 0m25.85s system
0m25.91s real 0m54.39s user 0m26.69s system
0m26.61s real 0m53.77s user 0m27.35s system
0m26.33s real 0m54.90s user 0m25.30s system
0m26.34s real 0m53.94s user 0m27.23s system

With new arena allocator:
0m25.02s real 0m54.46s user 0m19.94s system
0m25.17s real 0m55.06s user 0m20.72s system
0m24.85s real 0m55.14s user 0m19.30s system
0m24.59s real 0m54.02s user 0m20.07s system
0m25.06s real 0m55.00s user 0m20.42s system

Correctness of Thinkfree.apk.oat verified by diffing both of the oat files.

Change-Id: I5ff7b85ffe86c57d3434294ca7a621a695bf57a9
468532ea115657709bc32ee498e701a4c71762d4 05-Aug-2013 Ian Rogers <irogers@google.com> Entry point clean up.

Create set of entry points needed for image methods to avoid fix-up at load time:
- interpreter - bridge to interpreter, bridge to compiled code
- jni - dlsym lookup
- quick - resolution and bridge to interpreter
- portable - resolution and bridge to interpreter

Fix JNI work around to use JNI work around argument rewriting code that'd been
accidentally disabled.
Remove abstact method error stub, use interpreter bridge instead.
Consolidate trampoline (previously stub) generation in generic helper.
Simplify trampolines to jump directly into assembly code, keeps stack crawlable.
Dex: replace use of int with ThreadOffset for values that are thread offsets.
Tidy entry point routines between interpreter, jni, quick and portable.

Change-Id: I52a7c2bbb1b7e0ff8a3c3100b774212309d0828e
(cherry picked from commit 848871b4d8481229c32e0d048a9856e5a9a17ef9)
848871b4d8481229c32e0d048a9856e5a9a17ef9 05-Aug-2013 Ian Rogers <irogers@google.com> Entry point clean up.

Create set of entry points needed for image methods to avoid fix-up at load time:
- interpreter - bridge to interpreter, bridge to compiled code
- jni - dlsym lookup
- quick - resolution and bridge to interpreter
- portable - resolution and bridge to interpreter

Fix JNI work around to use JNI work around argument rewriting code that'd been
accidentally disabled.
Remove abstact method error stub, use interpreter bridge instead.
Consolidate trampoline (previously stub) generation in generic helper.
Simplify trampolines to jump directly into assembly code, keeps stack crawlable.
Dex: replace use of int with ThreadOffset for values that are thread offsets.
Tidy entry point routines between interpreter, jni, quick and portable.

Change-Id: I52a7c2bbb1b7e0ff8a3c3100b774212309d0828e
834b394ee759ed31c5371d8093d7cd8cd90014a8 31-Jul-2013 Brian Carlstrom <bdc@google.com> Merge remote-tracking branch 'goog/dalvik-dev' into merge-art-to-dalvik-dev

Change-Id: I323e9e8c29c3e39d50d9aba93121b26266c52a46
7655f29fabc0a12765de828914a18314382e5a35 29-Jul-2013 Ian Rogers <irogers@google.com> Portable refactorings.

Separate quick from portable entrypoints.
Move architectural dependencies into arch.

Change-Id: I9adbc0a9782e2959fdc3308215f01e3107632b7c
7934ac288acfb2552bb0b06ec1f61e5820d924a4 26-Jul-2013 Brian Carlstrom <bdc@google.com> Fix cpplint whitespace/comments issues

Change-Id: Iae286862c85fb8fd8901eae1204cd6d271d69496
2ce745c06271d5223d57dbf08117b20d5b60694a 18-Jul-2013 Brian Carlstrom <bdc@google.com> Fix cpplint whitespace/braces issues

Change-Id: Ide80939faf8e8690d8842dde8133902ac725ed1a
7940e44f4517de5e2634a7e07d58d0fb26160513 12-Jul-2013 Brian Carlstrom <bdc@google.com> Create separate Android.mk for main build targets

The runtime, compiler, dex2oat, and oatdump now are in seperate trees
to prevent dependency creep. They can now be individually built
without rebuilding the rest of the art projects. dalvikvm and jdwpspy
were already this way. Builds in the art directory should behave as
before, building everything including tests.

Change-Id: Ic6b1151e5ed0f823c3dd301afd2b13eb2d8feb81