History log of /art/compiler/utils/x86_64/assembler_x86_64.cc
Revision Date Author Comments
d1ee80948144526b985afb44a0574248cf7da58a 13-Apr-2016 Vladimir Marko <vmarko@google.com> Move Assemblers to the Arena.

And clean up some APIs to return std::unique_ptr<> instead
of raw pointers that don't communicate ownership.

(cherry picked from commit 93205e395f777c1dd81d3f164cf9a4aec4bde45f)

Bug: 27505766
Change-Id: I3017302307a0253d661240750298802fb0d9585e
abdac47c3c471d034a5b81aec35bf4201ba86a88 12-Feb-2016 Mark Mendell <mark.p.mendell@intel.com> Add X86/X86_64 support for CMOV from memory.

Add support for the memory form of CMOV. Add tests.

Change-Id: Ib9f5dbd3031c7e235ee3f2afdb7db75eed46277a
Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>
3f67e692860d281858485d48a4f1f81b907f1444 15-Jan-2016 Aart Bik <ajcbik@google.com> Implemented BitCount as an intrinsic. With unit test.

Rationale:
Recognizing this important operation as an intrinsic has
various advantages:
(1) having the no-side-effects/no-throw allows for
much more GVN/LICM/BCE.
(2) Some architectures, like x86_64, provide direct
support for this operation.

Performance improvements on X86_64:
CheckersEvalBench (32-bit bitboard): 27,210KNS -> 36,798KNS = + 35%
ReversiEvalBench (64-bit bitboard): 52,562KNS -> 89,086KNS = + 69%

Change-Id: I65d549b0469b7909b12c6611cdc34a8640a5751f
6ce017304099d1df97ffa016ce0efce79c67f344 30-Dec-2015 Nicolas Geoffray <ngeoffray@google.com> On x64, cmpl can never take a int64 immediate.

Fix a wrong type widening in x64 code generator and add
CHECKs in the assembler.

Change-Id: Id35f5d47c6cf78ed07e73ab783db09712d3c437f
9c86b485bc6169eadf846dd5f7cdf0958fe1eb23 18-Sep-2015 Mark Mendell <mark.p.mendell@intel.com> X86_64 jump tables for PackedSwitch

Implement PackedSwitch using a jump table of offsets to blocks.

Bug: 24092914
Bug: 21119474
Change-Id: I83430086c03ef728d30d79b4022607e9245ef98f
Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>
46fe0650be6a69f63b54c0967194350c6a145557 18-Sep-2015 Nicolas Geoffray <ngeoffray@google.com> Fix x64's cmpw.

Change-Id: If700f2994990864c8b34aa52eb7a767153a1f917
bcee092d7b0cbb7181d428115ad98d25ce844061 16-Sep-2015 Mark Mendell <mark.p.mendell@intel.com> Add X86 bsf and rotate instructions

These are for use in new intrinsics. Bsf (Bit Scan Forward) is used in
{Long,Integer}NumberOfTrailingZeros and the rotates are used in
{Long,Integer}Rotate{Left,Right}.

Change-Id: Icb599d7e1eec4e4ea9e5b4f0b1654c7b8d4de678
Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>
73f455ecb76d063846a82735eb80596ceee8cee3 21-Aug-2015 Mark Mendell <mark.p.mendell@intel.com> X86: Assembler support for near labels

The optimizing compiler uses 32 bit relative jumps for all forward
jumps, just in case the offset is too large to fit in one byte. Some of
the generated code knows that the jumps will in fact fit.

Add a 'NearLabel' class to the x86 and x86_64 assemblers. This will be
used to generate known short forward branches.

Add jecxz/jrcxz instructions, which only handle a short offset. They
will be used for intrinsics.

Add tests for the new instructions and NearLabel.

Change-Id: I11177f36394d35d63b32364b0e6289ee6d97de46
Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>
8ae3ffb29489a127f2a6242c33845dac8d50e508 13-Aug-2015 Mark Mendell <mark.p.mendell@intel.com> Add 'bsr' instruction to x86 and x86_64

Add support for 'bsr' instruction. Add tests.

Change-Id: I1cd8b30d7f3f5ee7fbeef8124cc6a31bf8ce59d5
Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>
b9c4bbee9364a9351376fd1fec9604e7c84778d8 01-Jul-2015 Mark Mendell <mark.p.mendell@intel.com> Add rep movsw to x86 and x86_64 instructions.

Add 'REP MOVSW' as a supported instruction for x86 32 and 64 bit.

Added tests.

Change-Id: I1c615ac1e7fa46c48983c90f791b92be0375c8b8
Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>
3fd0e6a86430bf060c7eb391c1378394c4a2c574 04-Aug-2015 agicsaki <agicsaki@google.com> Added repe_cmpsq instruction to x86_64 assembler

Change-Id: I9085694fd3313581b2775a8267ccda58fec19a1a
970abfb65530b700f3a0cc8b90b131df5420cec3 31-Jul-2015 agicsaki <agicsaki@google.com> Added repe_cmpsl instruction to x86, x86_64 assemblers

Support for this instruction has already been added to the disassembler
in commit 124b392d.

Change-Id: I6e8401a7b814618758427f5cc6b4992e265f937c
7a08fb53bd13c74dec92256bef22a37250db1373 15-Jul-2015 Mark Mendell <mark.p.mendell@intel.com> Optimizing: Add Non Temporal Move support for x86

Add moves that don't pollute the data cache. These can be used for
assigning large data structures.

Change-Id: I14d91ba6264f5ce2f128033d65d59b2536426643
Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>
4a2aa4af61e653a89f88d776dcdc55f6c7ca05f2 27-Jul-2015 Mark Mendell <mark.p.mendell@intel.com> Optimizing: Use more X86 3 operand multiplies

The X86_64 code generator generated 3 operand multiplies for long
multiplication only. Add support for 3 operand multiplication for
int as well for both X86 and X86_64.

Note that the RHS operand must be a 32 bit constant, and that it is
possible for the constant to end up in a register (!) due to a previous
use by another instruction. Handle this case by checking the operand,
otherwise the first input might not be the same as the output, due to
the use of Any().

Also allow stack operands for multiplication.

Change-Id: I8f3d14cc01e9a91210f418258aa18065ee87979d
Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>
71311f868e2579fa5d40b24e620198734119d1a0 27-Jul-2015 agicsaki <agicsaki@google.com> Added repe_cmpsw instruction to x86, x86_64 assemblers

Change-Id: I7634959eebb64d607f47497db320d5c2afdef16b
4d02711ea578dbb789abb30cbaf12f9926e13d81 01-Jul-2015 Roland Levillain <rpl@google.com> Implement heap poisoning in ART's Optimizing compiler.

- Instrument ARM, ARM64, x86 and x86-64 code generators.
- Note: To turn heap poisoning on in Optimizing, set the
environment variable `ART_HEAP_POISONING' to "true"
before compiling ART.

Bug: 12687968
Change-Id: Ib3120b38cf805a8a50207a314b9ccc90c8d93740
3d21bdf8894e780d349c481e5c9e29fe1556051c 22-Apr-2015 Mathieu Chartier <mathieuc@google.com> Move mirror::ArtMethod to native

Optimizing + quick tests are passing, devices boot.

TODO: Test and fix bugs in mips64.

Saves 16 bytes per most ArtMethod, 7.5MB reduction in system PSS.
Some of the savings are from removal of virtual methods and direct
methods object arrays.

Bug: 19264997

(cherry picked from commit e401d146407d61eeb99f8d6176b2ac13c4df1e33)

Change-Id: I622469a0cfa0e7082a2119f3d6a9491eb61e3f3d

Fix some ArtMethod related bugs

Added root visiting for runtime methods, not currently required
since the GcRoots in these methods are null.

Added missing GetInterfaceMethodIfProxy in GetMethodLine, fixes
--trace run-tests 005, 044.

Fixed optimizing compiler bug where we used a normal stack location
instead of double on ARM64, this fixes the debuggable tests.

TODO: Fix JDWP tests.

Bug: 19264997

Change-Id: I7c55f69c61d1b45351fd0dc7185ffe5efad82bd3

ART: Fix casts for 64-bit pointers on 32-bit compiler.

Bug: 19264997
Change-Id: Ief45cdd4bae5a43fc8bfdfa7cf744e2c57529457

Fix JDWP tests after ArtMethod change

Fixes Throwable::GetStackDepth for exception event detection after
internal stack trace representation change.

Adds missing ArtMethod::GetInterfaceMethodIfProxy call in case of
proxy method.

Bug: 19264997
Change-Id: I363e293796848c3ec491c963813f62d868da44d2

Fix accidental IMT and root marking regression

Was always using the conflict trampoline. Also included fix for
regression in GC time caused by extra roots. Most of the regression
was IMT.

Fixed bug in DumpGcPerformanceInfo where we would get SIGABRT due to
detached thread.

EvaluateAndApplyChanges:
From ~2500 -> ~1980
GC time: 8.2s -> 7.2s due to 1s less of MarkConcurrentRoots

Bug: 19264997
Change-Id: I4333e80a8268c2ed1284f87f25b9f113d4f2c7e0

Fix bogus image test assert

Previously we were comparing the size of the non moving space to
size of the image file.

Now we properly compare the size of the image space against the size
of the image file.

Bug: 19264997
Change-Id: I7359f1f73ae3df60c5147245935a24431c04808a

[MIPS64] Fix art_quick_invoke_stub argument offsets.

ArtMethod reference's size got bigger, so we need to move other args
and leave enough space for ArtMethod* and 'this' pointer.

This fixes mips64 boot.

Bug: 19264997
Change-Id: I47198d5f39a4caab30b3b77479d5eedaad5006ab
e401d146407d61eeb99f8d6176b2ac13c4df1e33 22-Apr-2015 Mathieu Chartier <mathieuc@google.com> Move mirror::ArtMethod to native

Optimizing + quick tests are passing, devices boot.

TODO: Test and fix bugs in mips64.

Saves 16 bytes per most ArtMethod, 7.5MB reduction in system PSS.
Some of the savings are from removal of virtual methods and direct
methods object arrays.

Bug: 19264997
Change-Id: I622469a0cfa0e7082a2119f3d6a9491eb61e3f3d
21030dd59b1e350f6f43de39e3c4ce0886ff539c 07-May-2015 Andreas Gampe <agampe@google.com> ART: x86 indexOf intrinsics for the optimizing compiler

Add intrinsics implementations for indexOf in the optimizing
compiler. These are mostly ported from Quick. Add instruction
support to assemblers where necessary.

Change-Id: Ife90ed0245532a5c436a26fe84715dc357f353c8
2cebb24bfc3247d3e9be138a3350106737455918 22-Apr-2015 Mathieu Chartier <mathieuc@google.com> Replace NULL with nullptr

Also fixed some lines that were too long, and a few other minor
details.

Change-Id: I6efba5fb6e03eb5d0a300fddb2a75bf8e2f175cb
7fd8b59ab9fcd896a95883ce7be781d74e849d60 22-Apr-2015 Mark Mendell <mark.p.mendell@intel.com> Fix X86_64 assembler REX instructions

A couple of instructions don't pass the 'Address' to EmitRex64. This
will cause the incorrect register number to be assembled if the register
is >= 8.

This may cause bad code to be generated in some cases.

Change-Id: I2907ae8b7629ee95d542e3fab429318994a78938
Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>
40741f394b2737e503f2c08be0ae9dd490fb106b 21-Apr-2015 Mark Mendell <mark.p.mendell@intel.com> [optimizing] Use more X86_64 addressing modes

Allow constant and memory addresses to more X86_64 instructions.

Add memory formats to X86_64 instructions to match.

Fix a bug in cmpq(CpuRegister, const Address&).

Allow mov <addr>,immediate (instruction 0xC7) to be a valid faulting
instruction.

Change-Id: I5b8a409444426633920cd08e09f687a7afc88a39
Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>
0a18601f141d864a26d4b74ff5613e69ae411483 13-Apr-2015 Roland Levillain <rpl@google.com> Exercise the x86 and x86-64 FILD and FISTP instructions.

- Ensure the double- and quadword x87 (FPU) instructions for
integer loading (resp. fildl and fildll) are properly
generated by the x86 and x86-64 generators (resp.
X86Assembler::filds/X86_64Assembler::filds and
X86Assembler::fildl/X86_64Assembler::fildl).
- Ensure the double- and quadword x87 (FPU) instructions for
integer storing & popping (resp. filstpl and fistpll) are
properly generated by the x86 and x86-64 generators (resp.
X86Assembler::fistps/X86_64Assembler::fistps and
X86Assembler::fistpl/X86_64Assembler::fistpl).

These instructions can be used in the implementation of the
long-to-float and long-to-double Dex type conversions.

Change-Id: Iade52a9aee326d189d77d3dbd352a2b5dab52e46
386ce406f150645158d6067c4e0a36565aefc44f 13-Apr-2015 Nicolas Geoffray <ngeoffray@google.com> Revert "Optimizing: Fix long-to-fp conversion on x86."

Test fails on arm.

This reverts commit 2d45b4df3838d9c0e5a213305ccd1d7009e01437.

Change-Id: Id2864917b52f7ffba459680303a2d15b34f16a4e
2d45b4df3838d9c0e5a213305ccd1d7009e01437 07-Apr-2015 Serguei Katkov <serguei.i.katkov@intel.com> Optimizing: Fix long-to-fp conversion on x86.

long-to-fp conversion implemented using SSE loses the precision.
The test is included. CL uses FPU to provide the correct result.

Change-Id: I8eaf3c46819a8cb52642a7e7d7c4e3e0edbc88db
Signed-off-by: Serguei Katkov <serguei.i.katkov@intel.com>
39dcf55a56da746e04f477f89e7b00ba1de03880 10-Apr-2015 Mark Mendell <mark.p.mendell@intel.com> [optimizing] Address x86_64 RIP patch comments

Nicolas had some comments after the patch
https://android-review.googlesource.com/#/c/144100 had merged. Fix the
problems that he found.

Change-Id: I40e8a4273997860db7511dc8f1986281b72bead2
Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>
f55c3e0825cdfc4c5a27730031177d1a0198ec5a 27-Mar-2015 Mark Mendell <mark.p.mendell@intel.com> [optimizing] Add RIP support for x86_64

Support a constant area addressed using RIP on x86_64. Use it for FP
operations to avoid loading constants into a CPU register and moving
to a XMM register.

Change-Id: I58421759ef2a8475538876c20e696ec787015a72
Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>
0f88e87085b7cf6544dadff3f555773966a6853e 30-Mar-2015 Guillaume Sanchez <guillaumesa@google.com> Speedup div/rem by constants on x86 and x86_64

This is done using the algorithms in Hacker's Delight chapter 10.

Change-Id: I7bacefe10067569769ed31a1f7834f796fb41119
d23840d3ed900c6072d71e6599b3568b68de6b7c 08-Apr-2015 Chao-ying Fu <chao-ying.fu@intel.com> x86_64: Fix the rex prefix for movzxb, movsxb, movb

This patch sets the rex prefix for the source byte register of
movzxb, movsxb, and movb that has the destination memory operand,
when the register is SPL, BPL, SIL, DIL.

This patch adds tests for movzxb and movsxb via Repeatrb(),
and adds the tertiary and quaternary register views for word and
byte registers on x86_64.
TODO: Support tests with memory operands.

Change-Id: I0c5c727f3dd4a75af039b87f7e57d0741e689038
Signed-off-by: Chao-ying Fu <chao-ying.fu@intel.com>
dd97393aca1a3ff2abec4dc4f78d7724300971bc 07-Apr-2015 David Srbecky <dsrbecky@google.com> Implement CFI for JNI.

CFI is necessary for stack unwinding in gdb, lldb, and libunwind.

Change-Id: I37eb7973f99a6975034cf0e699e138c3a9aba10f
8c57831b2b07185ee1986b9af68a351e1ca584c3 07-Apr-2015 David Srbecky <dsrbecky@google.com> Remove the old CFI infrastructure.

Change-Id: I12a17a8a1c39ffccaa499c328ebac36e4d74dc4e
58d25fd052e999a24734b0cf856a1563e3d1b2d0 03-Apr-2015 Mark Mendell <mark.p.mendell@intel.com> [optimizing] Implement more x86/x86_64 intrinsics

Implement CAS and bit reverse and byte reverse intrinsics that were
missing from x86 and x86_64 implementations.

Add assembler tests and compareAndSwapLong test.

Change-Id: Iabb2ff46036645df0a91f640288ef06090a64ee3
Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>
fb8d279bc011b31d0765dc7ca59afea324fd0d0c 01-Apr-2015 Mark Mendell <mark.p.mendell@intel.com> [optimizing] Implement x86/x86_64 math intrinsics

Implement floor/ceil/round/RoundFloat on x86 and x86_64.
Implement RoundDouble on x86_64.

Add support for roundss and roundsd on both architectures. Support them
in the disassembler as well.

Add the instruction set features for x86, as the 'round' instruction is
only supported if SSE4.1 is supported.

Fix the tests to handle the addition of passing the instruction set
features to x86 and x86_64.

Add assembler tests for roundsd and roundss to x86_64 assembler tests.

Change-Id: I9742d5930befb0bbc23f3d6c83ce0183ed9fe04f
Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>
3f6c7f61855172d3d9b7a9221baba76136088e7c 13-Mar-2015 Mark Mendell <mark.p.mendell@intel.com> [optimizing] Improve x86, x86_64 code

Tweak the generated code to allow more use of constants and other small
changes
- Use test vs. compare to 0
- EmitMove of 0.0 should use xorps
- VisitCompare kPrimLong can use constants
- cmp/add/sub/mul on x86_64 can use constants if in int32_t range
- long bit operations on x86 examine long constant high/low to optimize
- Use 3 operand imulq if constant is in int32_t range

Change-Id: I2dd4010fdffa129fe00905b0020590fe95f3f926
Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>
ab1eb0d1d047e3478ebb891e5259d2f1d1dd78bd 14-Feb-2015 Andreas Gampe <agampe@google.com> ART: Templatize IsInt & IsUint

Ensure that things are used correctly.

Change-Id: I76f082b32dcee28bbfb4c519daa401ac595873b3
748f140d5f0631780dbeecb033c1416faf78930d 27-Jan-2015 Nicolas Geoffray <ngeoffray@google.com> x64 goodness.

- Use test instead of cmp when comparing against 0.
- Make it possible to use lea for add.
- Use xor instead of mov when loading 0.

Change-Id: Ide95c4e2d9b773e952412892f2df6869600c324e
1cf95287364948689f6a1a320567acd7728e94a3 12-Dec-2014 Nicolas Geoffray <ngeoffray@google.com> Small optimization for recursive calls: avoid dex cache.

Change-Id: I044757a2f06e535cdc1480c4fc8182b89635baf6
24f2dfae084b2382c053f5d688fd6bb26cb8a328 15-Jan-2015 Mark Mendell <mark.p.mendell@intel.com> [optimizing compiler] Implement inline x86 FP '%'

Replace the calls to fmod/fmodf by inline code as is done in the Quick
compiler.

Remove the quick fmod/fmodf runtime entries, as they are no longer in
use.

64 bit code generator Move() routine needed to be enhanced to handle
constants, as Location::Any() allows them to be generated.

Change-Id: I6b6a42f6faeed4b0b3c940453e487daf5b25d184
Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>
cd6dffedf1bd8e6dfb3fb0c933551f9a90f7de3f 08-Jan-2015 Calin Juravle <calin@google.com> Add implicit null checks for the optimizing compiler

- for backends: arm, arm64, x86, x86_64
- fixed parameter passing for CodeGenerator
- 003-omnibus-opcodes test verifies that NullPointerExceptions work as
expected

Change-Id: I1b302acd353342504716c9169a80706cf3aba2c8
71fb52fee246b7d511f520febbd73dc7a9bbca79 30-Dec-2014 Andreas Gampe <agampe@google.com> ART: Optimizing compiler intrinsics

Add intrinsics infrastructure to the optimizing compiler.

Add almost all intrinsics supported by Quick to the x86-64 backend.
Further intrinsics require more assembler support.

Change-Id: I48de9b44c82886bb298d16e74e12a9506b8e8807
140c2c7ca3794210a5376f1b942e12d8b7795faf 06-Jan-2015 Andreas Gampe <agampe@google.com> ART: Remove unused parts of x86 assemblers

These functions are neither used nor functional.

Change-Id: Ib6d0761388a45662ad9448ceb2c539c6f0b77f23
4c0b61f506644bb6b647be05d02c5fb45b9ceb48 05-Dec-2014 Roland Levillain <rpl@google.com> Add support for double-to-int & double-to-long in optimizing.

- Add support for the double-to-int and double-to-long Dex
instructions in the optimizing compiler.
- Add S1 to the list of ARM FPU parameter registers so that
a double value can be passed as parameter during a call
to the runtime through D0.
- Have art::x86_64::X86_64Assembler::cvttsd2si work with
64-bit operands.
- Generate x86, x86-64 and ARM (but not ARM64) code for
double to int and double to long HTypeConversion nodes.
- Add related tests to test/422-type-conversion.

Change-Id: Ic93b9ec6630c26e940f7966a3346ad3fd5a2ab3a
624279f3c70f9904cbaf428078981b05d3b324c0 04-Dec-2014 Roland Levillain <rpl@google.com> Add support for float-to-long in the optimizing compiler.

- Add support for the float-to-long Dex instruction in the
optimizing compiler.
- Add a Dex PC field to art::HTypeConversion to allow the
x86 and ARM code generators to produce runtime calls.
- Instruct art::CodeGenerator::RecordPcInfo not to record
PC information for HTypeConversion instructions.
- Add S0 to the list of ARM FPU parameter registers.
- Have art::x86_64::X86_64Assembler::cvttss2si work with
64-bit operands.
- Generate x86, x86-64 and ARM (but not ARM64) code for
float to long HTypeConversion nodes.
- Add related tests to test/422-type-conversion.

Change-Id: I954214f0d537187883f83f7a83a1bb2dd8a21fd4
6d0e483dd2e0b63e952de060738c10e2abd12ff7 27-Nov-2014 Roland Levillain <rpl@google.com> Add support for long-to-float in the optimizing compiler.

- Add support for the long-to-float Dex instruction in the
optimizing compiler.
- Have art::x86_64::X86_64Assembler::cvtsi2ss work with
64-bit operands.
- Generate x86, x86-64 and ARM (but not ARM64) code for
long to float HTypeConversion nodes.
- Add related tests to test/422-type-conversion.

Change-Id: Ic983cbeb1ae2051add40bc519a8f00a6196166c9
ddb7df25af45d7cd19ed1138e537973735cc78a5 25-Nov-2014 Calin Juravle <calin@google.com> [optimizing compiler] Add CMP{L,G}_{FLOAT,DOUBLE}

Adds:
- float comparison for arm, x86, x86_64 backends.
- ucomis{s,d} assembly to x86 and x86_64.
- vmstat assebmly for thumb2
- new assembly tests

Change-Id: Ie3e19d0c08b3b875cd0a4be4ee4e9c8a4a076290
647b9ed41cdb7cf302fd356627a3ba372419b78c 27-Nov-2014 Roland Levillain <rpl@google.com> Add support for long-to-double in the optimizing compiler.

- Add support for the long-to-double Dex instruction in the
optimizing compiler.
- Enable requests of temporary FPU (double) registers during
code generation.
- Fix art::x86::X86Assembler::LoadLongConstant and extend
it to int64_t values.
- Have art::x86_64::X86_64Assembler::cvtsi2sd work with
64-bit operands.
- Generate x86, x86-64 and ARM (but not ARM64) code for
long to double HTypeConversion nodes.
- Add related tests to test/422-type-conversion.

Change-Id: Ie73d9e5e25bd2e15f585c371e8fc2dcb83438ccd
91debbc3da3e3376416e4394155d9f9e355255cb 26-Nov-2014 Calin Juravle <calin@google.com> Revert "[optimizing compiler] Add CMP{L,G}_{FLOAT,DOUBLE}"

Fails on arm due to missing vmrs op after vcmp. I revert this instead of pushing the fix because I don't understand yet why it compiles with run-test but not with dex2oat.

This reverts commit fd861249f31ab360c12dd1ffb131d50f02b0bfc6.

Change-Id: Idc2d30f6a0f39ddd3596aa18a532ae90f8aaf62f
fd861249f31ab360c12dd1ffb131d50f02b0bfc6 25-Nov-2014 Calin Juravle <calin@google.com> [optimizing compiler] Add CMP{L,G}_{FLOAT,DOUBLE}

- adds float comparison for arm, x86, x86_64 backends.
- adds ucomis{s,d} assembly to x86 and x86_64.

Change-Id: I232d2b6e9ecf373beb5cc63698dd97a658ff9c83
799f506b8d48bcceef5e6cf50f3f5eb6bcea05e1 26-Nov-2014 Nicolas Geoffray <ngeoffray@google.com> Revert "[optimizing compiler] Add CMP{L,G}_{FLOAT,DOUBLE}"

Fails on x86_64 and target.

This reverts commit cea28ec4b9e94ec942899acf1dbf20f8999b36b4.

Change-Id: I30c1d188c7ecfe765f137a307022ede84f15482c
cea28ec4b9e94ec942899acf1dbf20f8999b36b4 25-Nov-2014 Calin Juravle <calin@google.com> [optimizing compiler] Add CMP{L,G}_{FLOAT,DOUBLE}

- adds float comparison for arm, x86, x86_64 backends.
- adds ucomis{s,d} assembly to x86 and x86_64.

Change-Id: Ie91e04bfb402025073054f3803a3a569e4705caa
9aec02fc5df5518c16f1e5a9b6cb198a192db973 19-Nov-2014 Calin Juravle <calin@google.com> [optimizing compiler] Add shifts

Added SHL, SHR, USHR for arm, x86, x86_64.

Change-Id: I971f594e270179457e6958acf1401ff7630df07e
8366ca0d7ba3b80a2d5be65ba436446cc32440bd 17-Nov-2014 Elliott Hughes <enh@google.com> Fix the last users of TARGET_CPU_SMP.

Everyone else assumes SMP.

Change-Id: I7ff7faef46fbec6c67d6e446812d599e473cba39
851df20225593b10e698a760ac3cd5243620700b 12-Nov-2014 Andreas Gampe <agampe@google.com> ART: Multiview assembler_test, fix x86-64 assembler

Expose "secondary" names for registers so it is possible to test
32b views for 64b architectures.

Add floating-point register testing.

Refactor assembler_test for better code reuse (and simpler adding
of combination drivers).

Fix movss, movsd (MR instead of RM encoding), xchgl, xchgq,
both versions of EmitGenericShift.

Tighten imull(Reg,Imm), imulq(Reg,Imm), xchgl and xchgq encoding.

Clarify cv*** variants with a comment.

Add tests for movl, addl, imull, imuli, mull, subl, cmpqi, cmpl,
xorq (regs), xorl, movss, movsd, addss, addsd, subss, subsd, mulss,
mulsd, divss, divsd, cvtsi2ss, cvtsi2sd, cvtss2si, cvtss2sd, cvtsd2si,
cvttss2si, cvttsd2si, cvtsd2ss, cvtdq2pd, comiss, comisd, sqrtss,
sqrtsd, xorps, xorpd, fincstp, fsin, fcos, fptan, xchgl (disabled,
see code comment), xchgq, testl, andl, andq, orl, orq, shll, shrl,
sarl, negq, negl, notq, notl, enter and leave, call, ret, and jmp,
and make some older ones more exhaustive.

Follow-up TODOs:
1) Support memory (Address).
2) Support tertiary and quaternary register views.

Bug: 18117217
Change-Id: I1d583a3bec552e3cc7c315925e1e006f393ab687
d6fb6cfb6f2d0d9595f55e8cc18d2753be5d9a13 11-Nov-2014 Calin Juravle <calin@google.com> [optimizing compiler] Add DIV_LONG

- for backends: arm, x86, x86_64
- added cqo, idivq, testq assembly for x64_64
- small cleanups

Change-Id: I762ef37880749038ed25d6014370be9a61795200
9574c4b5f5ef039d694ac12c97e25ca02eca83c0 12-Nov-2014 Nicolas Geoffray <ngeoffray@google.com> Implement and/or/xor in optimizing.

Change-Id: I7cf6da1fd334a7177a5580931b8f174dd40b7cec
57a88d4ac205874dc85d22f9f6a9ca3c4c373eeb 10-Nov-2014 Nicolas Geoffray <ngeoffray@google.com> Implement checkcast for optimizing.

- Ended up not using HTypeCheck because of how
instanceof and checkcast end up having different logic
for code generation.

- Fix a x86_64 assembler bug triggered by now enabling
more methods to be compiled. Difficult to test today
without b/18117217.

Change-Id: I3022e7ae03befb1d10bea9637ad21fadc430abe0
946e143941d456a4ec666f7f54719c65c5aa3f5d 11-Nov-2014 Roland Levillain <rpl@google.com> Revert "Revert "Add support for long-to-int in the optimizing compiler.""

This reverts commit 3adfd1b4fb20ac2b0217b5d2737bfe30ad90257a.

Change-Id: Iacf0c6492d49267e24f1b727dbf6379b21fd02db
dff1f2812ecdaea89978c5351f0c70cdabbc0821 05-Nov-2014 Roland Levillain <rpl@google.com> Support int-to-long conversions in the optimizing compiler.

- Add support for the int-to-float Dex instruction in the
optimizing compiler.
- Add a HTypeConversion node type for control-flow graphs.
- Generate x86, x86-64 and ARM (but not ARM64) code for
int-to-float HTypeConversion nodes.
- Add a 64-bit "Move doubleword to quadword with
sign-extension" (MOVSXD) instruction to the x86-64
assembler.
- Add related tests to test/422-type-conversion.

Change-Id: Ieb8ec5380f9c411857119c79aa8d0728fd10f780
705664321a5cc1418255172f92d7d7195cf60a7b 24-Oct-2014 Roland Levillain <rpl@google.com> Add long bitwise not instruction in the optimizing compiler.

- Add support for the not-long (long integer one's
complement negation) instruction in the optimizing
compiler.
- Add a 64-bit NOT instruction (notq) to the x86-64
assembler.
- Generate ARM, x86 and x86-64 code for long HNot nodes.
- Gather not-related tests in test/416-optimizing-arith-not.

Change-Id: I2d5b75e9875664d6032d04f8401b2bbb84506948
b5de00f1c8f53e6552f1778702673c6274a98bb3 24-Oct-2014 Nicolas Geoffray <ngeoffray@google.com> Fix encoding of imul in x86_64 assembler.

Change-Id: I5b97f5698ed8ec9d0759d0e1eba8be29119c16c5
2e07b4f0a84a7968b4690c2b1be2e2f75cc6fa8e 23-Oct-2014 Roland Levillain <rpl@google.com> Revert "Revert "Implement long negate instruction in the optimizing compiler.""

This reverts commit 30ca3d847fe72cfa33e1b2473100ea2d8bea4517.

Change-Id: I188ca8d460d55d3a9966bcf31e0588575afa77d2
30ca3d847fe72cfa33e1b2473100ea2d8bea4517 23-Oct-2014 Roland Levillain <rpl@google.com> Revert "Implement long negate instruction in the optimizing compiler."

This reverts commit 66ce173a40eff4392e9949ede169ccf3108be2db.
66ce173a40eff4392e9949ede169ccf3108be2db 23-Oct-2014 Roland Levillain <rpl@google.com> Implement long negate instruction in the optimizing compiler.

- Add support for the neg-long (long integer two's
complement negate) instruction in the optimizing compiler.
- Add a 64-bit NEG instruction (negq) to the x86-64
assembler.
- Generate ARM, x86 and x86-64 code for integer HNeg nodes.
- Put neg-related tests into test/415-optimizing-arith-neg.

Change-Id: I1fbe9611e134408a6b8745d1df20ab6ffa5e50f2
102cbed1e52b7c5f09458b44903fe97bb3e14d5f 15-Oct-2014 Nicolas Geoffray <ngeoffray@google.com> Implement register allocator for floating point registers.

Also:
- Fix misuses of emitting the rex prefix in the x86_64 assembler.
- Fix movaps code generation in the x86_64 assembler.

Change-Id: Ib6dcf6e7c4a9c43368cfc46b02ba50f69ae69cbe
34bacdf7eb46c0ffbf24ba7aa14a904bc9176fb2 07-Oct-2014 Calin Juravle <calin@google.com> Add multiplication for integral types

This also fixes an issue where we could allocate a pair register even if
one of its parts was already blocked.

Change-Id: I4869175933409add2a56f1ccfb369c3d3dd3cb01
13735955f39b3b304c37d2b2840663c131262c18 08-Oct-2014 Ian Rogers <irogers@google.com> stdint types all the way!

Change-Id: I4e4ef3a2002fc59ebd9097087f150eaf3f2a7e08
7fb49da8ec62e8a10ed9419ade9f32c6b1174687 06-Oct-2014 Nicolas Geoffray <ngeoffray@google.com> Add support for floats and doubles.

- Follows Quick conventions.
- Currently only works with baseline register allocator.

Change-Id: Ie4b8e298f4f5e1cd82364da83e4344d4fc3621a3
b6e7206ad7a426adda9cfd649a4ef969607d79d6 07-Oct-2014 Nicolas Geoffray <ngeoffray@google.com> Fix movw on x86/x86_64 to accept any 16bits immediate.

Change-Id: I282eece0cd497431f207cec61852b4585ed3655c
26a25ef62a13f409f941aa39825a51b4d6f0f047 30-Sep-2014 Nicolas Geoffray <ngeoffray@google.com> Add a prepare for register allocation pass.

- Currently the pass just changes the uses of checks to the
actual values.
- Also optimize array access, now that inputs can be constants.
- And fix another bug in the register allocator reveiled by
this change.

Change-Id: I43be0dbde9330ee5c8f9d678de11361292d8bd98
f889267adadd62c92d1d3726764598946a961c10 30-Sep-2014 Hiroshi Yamauchi <yamauchi@google.com> Fix x86_64 assembler LoadRef to use movl.

As references are 32-bit, we should use movl instead movq.

Change-Id: Iffefbb9d86d5f40375f73994fd481f9bd28499b2
b88f0b16dbaff09a140d2a62b66eca2736ff514b 26-Sep-2014 Hiroshi Yamauchi <yamauchi@google.com> Get heap poisoning working in 64-bit.

This adds the reference negate code in arm64 and x86_64 that's used by
the jni compiler.

Bug: 12687968
Bug: 8367515
Change-Id: I28a44bcead1ee613866645620b4eaf54fad6a3aa
3c04974a90b0e03f4b509010bff49f0b2a3da57f 24-Sep-2014 Nicolas Geoffray <ngeoffray@google.com> Optimize suspend checks in optimizing compiler.

- Remove the ones added during graph build (they were added
for the baseline code generator).
- Emit them at loop back edges after phi moves, so that the test
can directly jump to the loop header.
- Fix x86 and x86_64 suspend check by using cmpw instead of cmpl.

Change-Id: I6fad5795a55705d86c9e1cb85bf5d63dadfafa2a
e3ea83811d47152c00abea24a9b420651a33b496 08-Aug-2014 Yevgeny Rouban <yevgeny.y.rouban@intel.com> ART source line debug info in OAT files

OAT files have source line information enough for ART runtime needs like
jump to/from interpreter and thread suspension. But this information
is not enough for finer grained source level debugging and low-level
profiling (VTune or perf).

This patch adds to OAT files two additional sections:
.debug_line - DWARF formatted Elf32 section with detailed source line
information (mapping from native PC to Java source lines).

In addition to the debugging symbols added using the dex2oat option
--include-debug-symbols, the source line information is added to
the section .debug_line.

The source line info can be read by many Elf reading tools like objdump,
readelf, dwarfdump, gdb, perf, VTune, ...

gdb can use this debug line information in x86. In 64-bit mode
the information can be used if the oat file is mapped in the lower
address space (address has higher 32 bits zeroed). Relocation works.

Testing:
1. art/test/run-test --host --gdb [--64] 001-HelloWorld
2. in gdb: break Main.java:19
3. in gdb: break Runtime.java:111
4. in gdb: run - stops at void java.lang.Runtime.<init>()
5. in gdb: backtrace - shows call stack down to main()
6. in gdb: continue - stops at void Main.main() (only in 32-bit mode)
7. in gdb: backtrace - shows call stack down to main()
8. objdump -W <oat-file> - addresses are from VMA range of .text
section reported by objdump -h <file>
9. dwarfdump -ka <oat-file> - no errors expected

Size of aosp-x86-eng boot.oat increased by 11% from 80.5Mb to 89.2Mb
with two sections added .debug_line (7.2Mb) and .rel.debug (1.5Mb).

Change-Id: Ib8828832686e49782a63d5529008ff4814ed9cda
Signed-off-by: Yevgeny Rouban <yevgeny.y.rouban@intel.com>
547cdfd21ee21e4ab9ca8692d6ef47c62ee7ea52 05-Aug-2014 Tong Shen <endlessroad@google.com> Emit CFI for x86 & x86_64 JNI compiler.

Now for host-side x86 & x86_64 ART, we are able to get complete stacktrace with even mixed C/C++ & Java stack frames.

Testing:
1. art/test/run-test --host --gdb [--64] --no-relocate 005
2. In gdb, run 'b art::Class_classForName' which is implementation of a Java native method, then 'r'
3. In gdb, run 'bt'. You should see stack frames down to main()

Change-Id: I2d17e9aa0f6d42d374b5362a15ea35a2fce96302
e4ded41ae2648973d5fed8c6bafaebf917ea7d17 05-Aug-2014 Nicolas Geoffray <ngeoffray@google.com> Fix movw in x86_64 assembler.

Change-Id: Ibceb03fd57adea09643aa77a9399be196fa14709
f12feb8e0e857f2832545b3f28d31bad5a9d3903 17-Jul-2014 Nicolas Geoffray <ngeoffray@google.com> Stack overflow checks and NPE checks for optimizing.

Change-Id: I59e97448bf29778769b79b51ee4ea43f43493d96
1a43dd78d054dbad8d7af9ba4829ea2f1cb70b53 17-Jul-2014 Nicolas Geoffray <ngeoffray@google.com> Add write barriers to optimizing compiler.

Change-Id: I43a40954757f51d49782e70bc28f7c314d6dbe17
96f89a290eb67d7bf4b1636798fa28df14309cc7 11-Jul-2014 Nicolas Geoffray <ngeoffray@google.com> Add assembly operations with constants in optimizing compiler.

Change-Id: I5bcc35ab50d4457186effef5592a75d7f4e5b65f
c380191f3048db2a3796d65db8e5d5a5e7b08c65 08-Jul-2014 Serguei Katkov <serguei.i.katkov@intel.com> x86_64: Enable fp-reg promotion

Patch introduces 4 register XMM12-15 available for promotion of
fp virtual registers.

Change-Id: I3f89ad07fc8ae98b70f550eada09be7b693ffb67
Signed-off-by: Serguei Katkov <serguei.i.katkov@intel.com>
Signed-off-by: Chao-ying Fu <chao-ying.fu@intel.com>
412f10cfed002ab617c78f2621d68446ca4dd8bd 19-Jun-2014 Nicolas Geoffray <ngeoffray@google.com> Support longs in the register allocator for x86_64.

Change-Id: I7fb6dfb761bc5cf9e5705682032855a0a70ca867
5408b6ba5d73ac0890683ebd7ddb4151a8ac2721 04-Jun-2014 avignate <aleksey.v.ignatenko@intel.com> x86_64: Fix issue in JNI compiler

This patch fixed 64 bit conversion issue in Immediate.
The issue is inside type conversion of Immediate:
explicit Immediate(int64_t value) : value_(value) {}.
In case of the following example we'll have unexpected value in Immediate:
size_t t = 1;
Immediate(-t) will contain value 4294967295 because by conversion
rules -t is first transformed to unsigned and then transformed
to 64bit (size64_t). The issue can be fixed by using long value
as a parameter of Immediate constructor.

Added tests for BuildFrame, RemoveFrame, IncreaseFrameSize and
DecreaseFrameSize to assembler_x86_64_test.

Change-Id: I0652bac83e4266fd4153bc6a4e9d3aae7cc4cb6f
Signed-off-by: avignate <aleksey.v.ignatenko@intel.com>
Signed-off-by: Dmitry Petrochenko <dmitry.petrochenko@intel.com>
ecb2f9ba57b08ceac4204ddd6a0a88a0524f8741 13-Jun-2014 Nicolas Geoffray <ngeoffray@google.com> Enable the register allocator on x86_64.

Also fix an x86_64 assembler bug for movl.

Change-Id: I8d17c68cd35ddd1d8df159f2d6173a013a7c3347
ffddfdf6fec0b9d98a692e27242eecb15af5ead2 03-Jun-2014 Tim Murray <timmurray@google.com> DO NOT MERGE

Merge ART from AOSP to lmp-preview-dev.

Change-Id: I0f578733a4b8756fd780d4a052ad69b746f687a9
cf4035a4c41ccfcc3e89a0cee25f5218a11b0705 29-May-2014 Andreas Gampe <agampe@google.com> ART: Use StackReference in Quick Stack Frame

The method reference at the bottom of a quick frame is a stack
reference and not a native pointer. This is important for 64b
architectures, where the notions do not coincide.

Change key methods to have StackReference<mirror::ArtMethod>*
parameter instead of mirror::ArtMethod**. Make changes to
invoke stubs for 64b archs, change the frame setup for JNI code
(both generic JNI and compilers), tie up loose ends.

Tested on x86 and x86-64 with host tests. On x86-64, tests succeed
with jni compiler activated. x86-64 QCG was not tested.

Tested on ARM32 with device tests.

Fix ARM64 not saving x19 (used for wSUSPEND) on upcalls.

Tested on ARM64 in interpreter-only + generic-jni mode.

Fix ARM64 JNI Compiler to work with the CL.

Tested on ARM64 in interpreter-only + jni compiler.

Change-Id: I77931a0cbadd04d163b3eb8d6f6a6f8740578f13
eb8167a4f4d27fce0530f6724ab8032610cd146b 08-May-2014 Mathieu Chartier <mathieuc@google.com> Add Handle/HandleScope and delete SirtRef.

Delete SirtRef and replaced it with Handle. Handles are value types
which wrap around StackReference*.

Renamed StackIndirectReferenceTable to HandleScope.

Added a scoped handle wrapper which wraps around an Object** and
restores it in its destructor.

Renamed Handle::get -> Get.

Bug: 8473721

Change-Id: Idbfebd4f35af629f0f43931b7c5184b334822c7a
5a4fa82ab42af6e728a60e3261963aa243c3e2cd 01-Apr-2014 Andreas Gampe <agampe@google.com> x86_64 Assembler Test Infrastructure, fix x86_64 assembler

Some infrastructure to do real assembler testing. Need to extend to
other assemblers, and a lot more tests.

Fix some of the cases of the x86_64 assembler.

Change-Id: I15b5f3a094af469130db68a95a66602cf30d8fc4
fba52f1b4bf753790c1d98265c4b0fabb54c7536 15-Apr-2014 Vladimir Kostyukov <vladimir.kostyukov@intel.com> ART: Fixes an issue with REX prefix for instructions with no ModRM byte

There are instructions (such as push, pop, mov) in the x86 ISA
that encode first operands in their opcodes (opcode + reg).
In order to enable an extended 64bit registers (R9-R15) a special
prefix REX.B should be emitted before such instructions.

This patch fixes the issue when REX.R prefix was emitted before
instructions with no MorRM byte. So, the REX-prefix was simply
ignored by CPU for those instructions whose operands are encoded
in their opcodes.

This patch makes the jni_compiler_test passed with JNI compiler
enabled for x86_64 target.

Change-Id: Ib84da1cf9f8ff96bd7afd4e0fc53078f3231f8ec
Signed-off-by: Vladimir Kostyukov <vladimir.kostyukov@intel.com>
790a6b7312979513710c366b411ba6791ddf78c2 01-Apr-2014 Ian Rogers <irogers@google.com> Calling convention support for cross 64/32 compilation.

Add REX support for x86-64 operands.

Change-Id: I093ae26fb8c111d54b8c72166f054984564c04c6
dd7624d2b9e599d57762d12031b10b89defc9807 15-Mar-2014 Ian Rogers <irogers@google.com> Allow mixing of thread offsets between 32 and 64bit architectures.

Begin a more full implementation x86-64 REX prefixes.
Doesn't implement 64bit thread offset support for the JNI compiler.

Change-Id: If9af2f08a1833c21ddb4b4077f9b03add1a05147
fca82208f7128fcda09b6a4743199308332558a2 21-Mar-2014 Dmitry Petrochenko <dmitry.petrochenko@intel.com> x86_64: JNI compiler

Passed all tests from jni_compiler_test and art/test on host with jni_copiler.
Incoming argument spill is enabled, entry_spills refactored. Now each entry spill
contains data type size (4 or 8) and offset which should be used for spill.
Assembler REX support implemented in opcodes used in JNI compiler.
Please note, JNI compiler is not enabled by default yet (see compiler_driver.cc:1875).

Change-Id: I5fd19cca72122b197aec07c3708b1e80c324be44
Signed-off-by: Dmitry Petrochenko <dmitry.petrochenko@intel.com>