3332db8345de39eb5067d99987fcae140184672b |
|
12-Aug-2017 |
Aart Bik <ajcbik@google.com> |
Bunch of SIMD for x86 and x86_64 Rationale: Few instructions needed to implement SIMD reductions. Test: assembler_x86_[64_]test Bug: 64091002 Change-Id: I785acfc6c8c4ad4f290ddeab32da9b767f944e24
|
c8e93c736c149ce41be073dd24324fb08afb9ae4 |
|
10-May-2017 |
Aart Bik <ajcbik@ajcbik2.mtv.corp.google.com> |
Min/max SIMDization support. Rationale: The more vectorized, the better! Test: test-art-target, test-art-host Change-Id: I758becca5beaa5b97fab2ab70f2e00cb53458703
|
8939c6474a34eb6d642db8fecb8b3a5c3194e464 |
|
03-Apr-2017 |
Aart Bik <ajcbik@google.com> |
SIMD pcmpgtb,w,d,q for x86/x86_64 Rationale: Enables fast compare gt. Test: assembler_x86[_64]_test Change-Id: I0a069649480529f3fec2c2b100e2aaaa2cd79820
|
67d3fd77d1572e46f537dea2fd4ded3ecfd7c202 |
|
01-Apr-2017 |
Aart Bik <ajcbik@google.com> |
SIMD pavgb,w for x86/x86_64 Rationale: Break-out CL of ART Vectorizer. Enables fast halving add with rounding Bug: 34083438 Test: assembler_x86[_64]_test Change-Id: I09173376b803d671a6b05a33e630f45f778cea52
|
149fb784740a48d9a7ffdcaa9aabbbfcaa9acb98 |
|
23-Mar-2017 |
Aart Bik <ajcbik@google.com> |
Properly disassemble cmpeq for x86/x86_64 Rationale: Break-out CL of ART Vectorizer. Bug: 34083438 Test: test-art-host Change-Id: I4027033cbe48a19c426326fc307fe4437b143d61
|
3c89d4234589816fb7dafb5215543f2cf023ce6c |
|
17-Feb-2017 |
Vladimir Marko <vmarko@google.com> |
x86/string compression: Use TESTB instead of TESTL in String.charAt(). And fix disassembly of the now unused TESTL. Test: testrunner.py --host with string compression enabled. Test: Manual inspection of dump-oat output. Bug: 35433135 Bug: 31040547 Change-Id: I36c955bc1f2243954ecc315266a2f3fce5d87693
|
68555e952eea58023fa403951b1491496acf0f4b |
|
13-Feb-2017 |
Aart Bik <ajcbik@google.com> |
Added a few integral SIMD extensions for x86/x86_64 (SSE). Rationale: ART vectorizer needs SIMD for integer operations too. Test: assembler_x86[_64]_test Bug: 34083438 Change-Id: Id6fec558c617d38cb643839eafcd10e59dcd6e0a
|
bda1d606f2d31086874b68edd9254e3817d8049c |
|
30-Aug-2016 |
Andreas Gampe <agampe@google.com> |
ART: Detach libart-disassembler from libart Some more intrusive changes than I would have liked, as long as ART logging is different from libbase logging. Fix up some includes. Bug: 15436106 Bug: 31338270 Test: m test-art-host Change-Id: I9fbe4b85b2d74e079a4981f3aec9af63b163a461
|
372f3a374681ef11f003460e14249adb7bc8313d |
|
19-Aug-2016 |
Andreas Gampe <agampe@google.com> |
ART: Add thread offset printing hook to disassembler To prepare separation of disassembler from libart, add a function hook to the disassembler options for thread offset name printing. Bug: 15436106 Change-Id: I9e9b7e565ae923952c64026f675ac527b560f51b
|
542451cc546779f5c67840e105c51205a1b0a8fd |
|
26-Jul-2016 |
Andreas Gampe <agampe@google.com> |
ART: Convert pointer size to enum Move away from size_t to dedicated enum (class). Bug: 30373134 Bug: 30419309 Test: m test-art-host Change-Id: Id453c330f1065012e7d4f9fc24ac477cc9bb9269
|
33dd909468e377aaa8f0ec27fc4b3cb4d8481119 |
|
02-Aug-2016 |
Aart Bik <ajcbik@google.com> |
Fixed bug in disassembly of roundss/roundsd Rationale: These instructions should be marked as load, so that, using Intel syntax, destination (xmm0) appears at left hand side, as in roundss xmm0, xmm1 and not the other way around. First I suspected a bug in the encoding (hence the test) and even the register allocator, but since the code behaved correctly, only disassembly was really wrong. Test: disassembler_x86_test (but nothing for actual disassembly) BUG=26327751 Change-Id: I060ef57f4d5a64cdc04b97ae8a799d1c0d22da05
|
3f67e692860d281858485d48a4f1f81b907f1444 |
|
15-Jan-2016 |
Aart Bik <ajcbik@google.com> |
Implemented BitCount as an intrinsic. With unit test. Rationale: Recognizing this important operation as an intrinsic has various advantages: (1) having the no-side-effects/no-throw allows for much more GVN/LICM/BCE. (2) Some architectures, like x86_64, provide direct support for this operation. Performance improvements on X86_64: CheckersEvalBench (32-bit bitboard): 27,210KNS -> 36,798KNS = + 35% ReversiEvalBench (64-bit bitboard): 52,562KNS -> 89,086KNS = + 69% Change-Id: I65d549b0469b7909b12c6611cdc34a8640a5751f
|
4414822df8483d499fbac02563ebe8c7fc000563 |
|
14-Sep-2015 |
Serdjuk, Nikolay Y <nikolay.y.serdjuk@intel.com> |
ART: disassembler_x86 doesn't recognize NOPs There are some variations of NOPs which are possible on x86. Change-Id: I6aab3bc98682e521532cc746f3a371d9c5d98ee8
|
bcee092d7b0cbb7181d428115ad98d25ce844061 |
|
16-Sep-2015 |
Mark Mendell <mark.p.mendell@intel.com> |
Add X86 bsf and rotate instructions These are for use in new intrinsics. Bsf (Bit Scan Forward) is used in {Long,Integer}NumberOfTrailingZeros and the rotates are used in {Long,Integer}Rotate{Left,Right}. Change-Id: Icb599d7e1eec4e4ea9e5b4f0b1654c7b8d4de678 Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>
|
8ae3ffb29489a127f2a6242c33845dac8d50e508 |
|
13-Aug-2015 |
Mark Mendell <mark.p.mendell@intel.com> |
Add 'bsr' instruction to x86 and x86_64 Add support for 'bsr' instruction. Add tests. Change-Id: I1cd8b30d7f3f5ee7fbeef8124cc6a31bf8ce59d5 Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>
|
b9c4bbee9364a9351376fd1fec9604e7c84778d8 |
|
01-Jul-2015 |
Mark Mendell <mark.p.mendell@intel.com> |
Add rep movsw to x86 and x86_64 instructions. Add 'REP MOVSW' as a supported instruction for x86 32 and 64 bit. Added tests. Change-Id: I1c615ac1e7fa46c48983c90f791b92be0375c8b8 Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>
|
124b392d35595f5a8e31e6a9dbefcff5b3ef5760 |
|
30-Jul-2015 |
agicsaki <agicsaki@google.com> |
Added disassembler support for repe_cmpsw instruction in x86, x86_64 Also included support for repe_cmpsl instruction. This is a follow up to commit 71311f868e2 which added support for repe_cmpsw in the x86 and x86_64 assemblers. Change-Id: I2beac05a57341539acf96cdf77062facd031a864
|
dd17bc3806e800d3b82d5cb27e85ccc1c4e2ee1d |
|
27-Apr-2015 |
nikolay serdjuk <nikolay.y.serdjuk@intel.com> |
Fix for incorrect encode and parse of PEXTRW instruction The instruction PEXTRW encoded by sequence 66 0F 3A 15 was incorrectly encoded in compiler table and incorrectly parsed by disassembler. Signed-off-by: nikolay serdjuk <nikolay.y.serdjuk@intel.com> (cherry picked from commit e0705f51fdc71e9670a29f8c3a47168f50724b35) Change-Id: I7f051e23789aa3745d6eb854c97f80c475748b74
|
e0705f51fdc71e9670a29f8c3a47168f50724b35 |
|
27-Apr-2015 |
nikolay serdjuk <nikolay.y.serdjuk@intel.com> |
Fix for incorrect encode and parse of PEXTRW instruction The instruction PEXTRW encoded by sequence 66 0F 3A 15 was incorrectly encoded in compiler table and incorrectly parsed by disassembler. Change-Id: Ib4d4db923cb15a76e74f13f6b5514cb0d1cbe164 Signed-off-by: nikolay serdjuk <nikolay.y.serdjuk@intel.com>
|
bd4e6a828fc4aefea7d34a1bbedb81c560c60b6b |
|
27-Mar-2015 |
nikolay serdjuk <nikolay.y.serdjuk@intel.com> |
Fix for incorrect parse of PEXTRW instruction The instruction PEXTRW encoded by sequence 66 0F C5 has form: PEXTRW reg, xmm, imm8. Its reg is encoded in the REG part and xmm is encoded in the R/M part of ModR/M byte. Since the order is opposite to the PEXTRB and PEXTRD, we have to set 'load' to true and leave 'store' as false. Change-Id: I32c42ea005eec29f7bf969f275c36ffa0a95fa6d
|
fb8d279bc011b31d0765dc7ca59afea324fd0d0c |
|
01-Apr-2015 |
Mark Mendell <mark.p.mendell@intel.com> |
[optimizing] Implement x86/x86_64 math intrinsics Implement floor/ceil/round/RoundFloat on x86 and x86_64. Implement RoundDouble on x86_64. Add support for roundss and roundsd on both architectures. Support them in the disassembler as well. Add the instruction set features for x86, as the 'round' instruction is only supported if SSE4.1 is supported. Fix the tests to handle the addition of passing the instruction set features to x86 and x86_64. Add assembler tests for roundsd and roundss to x86_64 assembler tests. Change-Id: I9742d5930befb0bbc23f3d6c83ce0183ed9fe04f Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>
|
031b00dc87cca699f02ce4206a9ecd99d59090dd |
|
27-Jan-2015 |
Andreas Gampe <agampe@google.com> |
ART: Fix x86 disassembler Index 4 in SIB is valid when given Rex.x, where it denotes r12 and not the invalid rsp. Bug: 19149560 Change-Id: I1a74bcbb1ccf3686e45a3df5d852a86444f9d850
|
6a0b920512b72542b3f1a3d232fba7ded45ea455 |
|
16-Dec-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Fix crash in x86 disassembler. Probably a typo from last refactoring. Change-Id: I086a87120ca0f0dfddbe803573b0e0f79cc6d945
|
8683038c1f59bea790d8c7691e40eed7f6250e4a |
|
13-Dec-2014 |
Andreas Gampe <agampe@google.com> |
ART: Do not inline elf writer debug symbols Using Clang, this pushes the frame size of the caller across our limit. Thus forbid inlining. The function is only called once per compile, impact is insignificant. Bug: 18738594 Change-Id: I19c3f1168a5104ab508a8dbf9f2a8c035cb97e3c
|
e5eb7060dbacfd7c768692a8fcc4a6017d0bd1cc |
|
13-Dec-2014 |
Andreas Gampe <agampe@google.com> |
ART: Break up x86 disassembler main function The function leads to large stack frames with Clang. Break out some parts and use four char* variables for opcode. Bug: 18733806 Change-Id: I8bf6da6c763175d7081c4231fa5d3b6809316220
|
677c12fe1939cad5795e7c9f4738941508c4d56f |
|
08-Nov-2014 |
Ian Rogers <irogers@google.com> |
Tidy x86 disassembler Change-Id: I2f0a2851a15f5a099a5bc0249e3ea0616cdcd94e
|
2c4257be8191c5eefde744e8965fcefc80a0a97d |
|
24-Oct-2014 |
Ian Rogers <irogers@google.com> |
Tidy logging code not using UNIMPLEMENTED. Change-Id: I7a79c1671a6ff8b2040887133b3e0925ef9a3cfe
|
cf7f19135f0e273f7b0136315633c2abfc715343 |
|
23-Oct-2014 |
Ian Rogers <irogers@google.com> |
C++11 related clean-up of DISALLOW_.. Move DISALLOW_COPY_AND_ASSIGN to delete functions. By not having declarations with no definitions this prompts better warning messages so deal with these by correcting the code. Add a DISALLOW_ALLOCATION and use for ValueObject and mirror::Object. Make X86 assembly operand types ValueObjects to fix compilation errors. Tidy the use of iostream and ostream. Avoid making cutils a dependency via mutex-inl.h for tests that link against libart. Push tracing dependencies into appropriate files and mutex.cc. x86 32-bit host symbols size is increased for libarttest, avoid copying this in run-test 115 by using symlinks and remove this test's higher than normal ulimit. Fix the RunningOnValgrind test in RosAllocSpace to not use GetHeap as it returns NULL when the heap is under construction by Runtime. Change-Id: Ia246f7ac0c11f73072b30d70566a196e9b78472b
|
c7dd295a4e0cc1d15c0c96088e55a85389bade74 |
|
22-Oct-2014 |
Ian Rogers <irogers@google.com> |
Tidy up logging. Move gVerboseMethods to CompilerOptions. Now "--verbose-methods=" option to dex2oat rather than runtime argument "-verbose-methods:". Move ToStr and Dumpable out of logging.h, move LogMessageData into logging.cc except for a forward declaration. Remove ConstDumpable as Dump methods are all const (and make this so if not currently true). Make LogSeverity an enum and improve compile time assertions and type checking. Remove log_severity.h that's only used in logging.h. With system headers gone from logging.h, go add to .cc files missing system header includes. Also, make operator new in ValueObject private for compile time instantiation checking. Change-Id: I3228f614500ccc9b14b49c72b9821c8b0db3d641
|
fc787ecd91127b2c8458afd94e5148e2ae51a1f5 |
|
10-Oct-2014 |
Ian Rogers <irogers@google.com> |
Enable -Wimplicit-fallthrough. Falling through switch cases on a clang build must now annotate the fallthrough with the FALLTHROUGH_INTENDED macro. Bug: 17731372 Change-Id: I836451cd5f96b01d1ababdbf9eef677fe8fa8324
|
c8ccf68b805c92674545f63e0341ba47e8d9701c |
|
30-Sep-2014 |
Andreas Gampe <agampe@google.com> |
ART: Fix some -Wpedantic errors Remove extra semicolons. Dollar signs in C++ identifiers are an extension. Named variadic macros are an extension. Binary literals are a C++14 feature. Enum re-declarations are not allowed. Overflow. Change-Id: I7d16b2217b2ef2959ca69de84eaecc754517714a
|
2cbaccb67e22c0b313a9785bfc65bcb4b25d0676 |
|
15-Sep-2014 |
Brian Carlstrom <bdc@google.com> |
Avoid printing absolute addresses in oatdump - Added printing of OatClass offsets. - Added printing of OatMethod offsets. - Added bounds checks for code size size, code size, mapping table, gc map, vmap table. - Added sanity check of 100k for code size. - Added partial disassembly of questionable code. - Added --no-disassemble to disable disassembly. - Added --no-dump:vmap to disable vmap dumping. - Reordered OatMethod info to be in file order. Bug: 15567083 (cherry picked from commit 34fa79ece5b3a1940d412cd94dbdcc4225aae72f) Change-Id: I2c368f3b81af53b735149a866f3e491c9ac33fb8
|
34fa79ece5b3a1940d412cd94dbdcc4225aae72f |
|
15-Sep-2014 |
Brian Carlstrom <bdc@google.com> |
Avoid printing absolute addresses in oatdump - Added printing of OatClass offsets. - Added printing of OatMethod offsets. - Added bounds checks for code size size, code size, mapping table, gc map, vmap table. - Added sanity check of 100k for code size. - Added partial disassembly of questionable code. - Added --no-disassemble to disable disassembly. - Added --no-dump:vmap to disable vmap dumping. - Reordered OatMethod info to be in file order. Bug: 15567083 Change-Id: Id86a21e06d4a28f29f16fd018cba7e55c57f849a
|
b3a84e2f308b3ed7d17b8e96fc7adfcac36ebe77 |
|
28-Jul-2014 |
Lupusoru, Razvan A <razvan.a.lupusoru@intel.com> |
ART: Vectorization opcode implementation fixes This patch fixes the implementation of the x86 vectorization opcodes. Change-Id: I0028d54a9fa6edce791b7e3a053002d076798748 Signed-off-by: Razvan A Lupusoru <razvan.a.lupusoru@intel.com> Signed-off-by: Udayan Banerji <udayan.banerji@intel.com> Signed-off-by: Philbert Lin <philbert.lin@intel.com>
|
b5bce7cc9f1130ab4932ba8e6917c362bf871f24 |
|
25-Jul-2014 |
Jean Christophe Beyler <jean.christophe.beyler@intel.com> |
ART: Add non-temporal store support Added non-temporal store support as a hint from the ME. Added the implementation of the memory barrier extended instruction that supports non-temporal stores by explicitly serializing all previous store-to-memory instructions. Change-Id: I8205a92083f9725253d8ce893671a133a0b6849d Signed-off-by: Jean Christophe Beyler <jean.christophe.beyler@intel.com> Signed-off-by: Chao-ying Fu <chao-ying.fu@intel.com>
|
f40f890ae3acd7b3275355ec90e2814bba8d4fd6 |
|
14-Aug-2014 |
Yixin Shou <yixin.shou@intel.com> |
Implement inlined shift long for 32bit Added support for x86 inlined shift long for 32bit Change-Id: I6caef60dd7d80227c3057fd6f64b0ecb11025afa Signed-off-by: Yixin Shou <yixin.shou@intel.com>
|
ec95f72490de0a7f86c35de3d00b50bb80d036a1 |
|
22-Jul-2014 |
Vladimir Kostyukov <vladimir.kostyukov@intel.com> |
ART: Correct disassembling of 64bit immediates on x86_64 The patch fixes an issue with disassembling 'movsxd' and 'movabsq' instructions altered with 64bit immediates: not only a REX.W prefix may be prepended to these instructions. Change-Id: Ida7c7b368327a6b5cae1ff12ec00ceb0769c0a3d Signed-off-by: Vladimir Kostyukov <vladimir.kostyukov@intel.com>
|
79bb184ec0a661bf1276eef555dd5e20828bc528 |
|
01-Jul-2014 |
Vladimir Kostyukov <vladimir.kostyukov@intel.com> |
ART: Correct disassembling of regs from opcodes Registers, which are part of opcode might have 1-byte size or 2-byte size depending on the instruction and 66h prefix. This patch makes the decoding of such instruction correct. Examples: - '664155' should be decoded as 'push r13w' (66h + REX.B) - '41B320' should be decoded as 'mov r11l, 0x20' (byte-operand + REX.B) Change-Id: I83913e3a5f2ef03c4019c0f5eea6b11fc51ee4cc Signed-off-by: Vladimir Kostyukov <vladimir.kostyukov@intel.com>
|
60bfe7b3e8f00f0a8ef3f5d8716adfdf86b71f43 |
|
09-Jul-2014 |
Udayan Banerji <udayan.banerji@intel.com> |
X86 Backend support for vectorized float and byte 16x16 operations Add support for reserving vector registers for the duration of vector loop. Add support for 16x16 multiplication, shifts, and add reduce. Changed the vectorization implementation to be able to use the dataflow elements for SSA recreation and fixed a few implementation details. Change-Id: I2f358f05f574fc4ab299d9497517b9906f234b98 Signed-off-by: Jean Christophe Beyler <jean.christophe.beyler@intel.com> Signed-off-by: Olivier Come <olivier.come@intel.com> Signed-off-by: Udayan Banerji <udayan.banerji@intel.com>
|
94f3eb0c757d0a6a145e24ef95ef7d35c091bb01 |
|
24-Jun-2014 |
Serguei Katkov <serguei.i.katkov@intel.com> |
x86_64: Clean-up after cmp-long fix The patch addresses the comments from review done by Ian Rogers. Clean-up of assembler. Change-Id: I9dbb350dfc6645f8a63d624b2b785233529459a9 Signed-off-by: Serguei Katkov <serguei.i.katkov@intel.com>
|
e443a8063518fb1c5229afa3081b9fd1f6d33b16 |
|
30-Jun-2014 |
Vladimir Kostyukov <vladimir.kostyukov@intel.com> |
ART: FF-opcodes are target-specific Some of the FF-opcodes' (i.e., push, call, jmp) register names depend on the target (32-bit vs 64-bit). This patch makes such opcodes target-specific. Change-Id: I4fa0b7ee5310e14f4022850ac2160c21be5d1c99 Signed-off-by: Vladimir Kostyukov <vladimir.kostyukov@intel.com>
|
5192cbb12856b12620dc346758605baaa1469ced |
|
01-Jul-2014 |
Yixin Shou <yixin.shou@intel.com> |
Load 64 bit constant into GPR by single instruction for 64bit mode This patch loads a 64 bit constant into a register with a single movabsq instruction on 64 bit instead of the previous mov, shift, add instruction sequences. Change-Id: I9d013c4f6c0b5c2e43bd125f91436263c7e6028c Signed-off-by: Yixin Shou <yixin.shou@intel.com>
|
d48b8a2bc111d30ebafdd2c661e9c0789f5c66a7 |
|
24-Jun-2014 |
Vladimir Kostyukov <vladimir.kostyukov@intel.com> |
ART: FPU instructions support in disassembler This patch extends the disassembler with new FPU instructions: - fstsw - fucompp - fprem Change-Id: I9458510bc17f2b3b286edec102552f64be05147e Signed-off-by: Vladimir Kostyukov <vladimir.kostyukov@intel.com>
|
fb0fecffb31398adb6f74f58482f2c4aac95b9bf |
|
20-Jun-2014 |
Olivier Come <olivier.come@intel.com> |
ART: Add HADDPS/HADDPD/SHUFPS/SHUFPD instruction generation The patch adds the HADDPS, HADDPD, SHUFPS, and SHUFPD instruction generation for X86. Change-Id: Ida105d3e57be231a5331564c1a9bc298cf176ce6 Signed-off-by: Olivier Come <olivier.come@intel.com>
|
a33720c7370d1c9e0d6569d7126bb06f2083c614 |
|
19-Jun-2014 |
Mark Mendell <mark.p.mendell@intel.com> |
X86 Dis: Add missing mov byte; Add size suffixes Yet another instruction not disassembled properly. Add 'b', 'w', 'q' to opcodes to differentiate between various versions and make it more understandable. Change-Id: Ib794aac660bc8bc4900bfa49eab5aed682996adc Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>
|
33ecf8d692eb192aa0ddb752d3ffe1e899e0f42e |
|
06-Jun-2014 |
Mark Mendell <mark.p.mendell@intel.com> |
Add Move with Sign Extend Double to disassembler I noticed another missing instruction. Change-Id: I71170496b014ac2609116eff2aeb13a13e71e263 Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>
|
88649c790cb437c130dcb6e428cddeb1ae62601c |
|
05-Jun-2014 |
Mark Mendell <mark.p.mendell@intel.com> |
Fix X86 disassembler printing of XMM, MM registers Printing of uint8_t is done as a char, rather than an integer. Change-Id: I996e7d7dd902695be6366ab816fea65b675c2ad9 Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>
|
122113a8a233f824c014a8fe9d90626218c4dcca |
|
30-May-2014 |
Vladimir Kostyukov <vladimir.kostyukov@intel.com> |
ART: x86_64 disassembler improvements This patch (a) enables full support of 64bit extended regs r8-r15, including 8bit r8l-r15l, 16bit r8w-r15w and also 32bit r8d-r15d (b) fixes an issue with decoding reg from ModRM byte (REX.B should be used) (c) fixes an issue with decoding regs from SIB byte (regs that contain addr are target-specific) Change-Id: I6bf3d7102780907b1cbe2a46927352ac0b506295 Signed-off-by: Vladimir Kostyukov <vladimir.kostyukov@intel.com>
|
ffddfdf6fec0b9d98a692e27242eecb15af5ead2 |
|
03-Jun-2014 |
Tim Murray <timmurray@google.com> |
DO NOT MERGE Merge ART from AOSP to lmp-preview-dev. Change-Id: I0f578733a4b8756fd780d4a052ad69b746f687a9
|
67d18be2a5bddbd8ee9ef144b34ccaeba08a1db2 |
|
30-May-2014 |
Mark Mendell <mark.p.mendell@intel.com> |
Support disassembly of 16-bit immediates Change-Id: I66f5ce93077241204311e52c547599f5287bae04 Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>
|
fe94578b63380f464c3abd5c156b7b31d068db6c |
|
22-May-2014 |
Mark Mendell <mark.p.mendell@intel.com> |
Implement all vector instructions for X86 Add X86 code generation for the vector operations. Added support for X86 disassembler for the new instructions. Change-Id: I72b48f5efa3a516a16bb1dd4bdb5c9270a8db53a Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>
|
e8861b30ac8b2b1ca49386f9c9218f1d6fedc511 |
|
18-Apr-2014 |
Vladimir Kostyukov <vladimir.kostyukov@intel.com> |
ART: Enables x86_64 disassembly This patch (a) cuts a REX prefix from the instruction and (b) adds missed 32bit disp to instructions with ModR/M and SIB bytes. Change-Id: I2674678224ca27746b33d4006ed38d497972309f Signed-off-by: Vladimir Kostyukov <vladimir.kostyukov@intel.com>
|
fba52f1b4bf753790c1d98265c4b0fabb54c7536 |
|
15-Apr-2014 |
Vladimir Kostyukov <vladimir.kostyukov@intel.com> |
ART: Fixes an issue with REX prefix for instructions with no ModRM byte There are instructions (such as push, pop, mov) in the x86 ISA that encode first operands in their opcodes (opcode + reg). In order to enable the extended 64bit registers (R8-R15) a special prefix REX.B should be emitted before such instructions. This patch fixes the issue when REX.R prefix was emitted before instructions with no ModRM byte. So, the REX-prefix was simply ignored by CPU for those instructions whose operands are encoded in their opcodes. This patch makes the jni_compiler_test pass with JNI compiler enabled for x86_64 target. Change-Id: Ib84da1cf9f8ff96bd7afd4e0fc53078f3231f8ec Signed-off-by: Vladimir Kostyukov <vladimir.kostyukov@intel.com>
|
dd7624d2b9e599d57762d12031b10b89defc9807 |
|
15-Mar-2014 |
Ian Rogers <irogers@google.com> |
Allow mixing of thread offsets between 32 and 64bit architectures. Begin a more full implementation of x86-64 REX prefixes. Doesn't implement 64bit thread offset support for the JNI compiler. Change-Id: If9af2f08a1833c21ddb4b4077f9b03add1a05147
|
99ad7230ccaace93bf323dea9790f35fe991a4a2 |
|
26-Feb-2014 |
Razvan A Lupusoru <razvan.a.lupusoru@intel.com> |
Relaxed memory barriers for x86 X86 provides stronger memory guarantees and thus the memory barriers can be optimized. This patch ensures that all memory barriers for x86 are treated as scheduling barriers. And in cases where a barrier is needed (StoreLoad case), an mfence is used. Change-Id: I13d02bf3f152083ba9f358052aedb583b0d48640 Signed-off-by: Razvan A Lupusoru <razvan.a.lupusoru@intel.com>
|
38e12034f1ef2b32e98b6e49cb36b7cc37a7f1be |
|
14-Mar-2014 |
Ian Rogers <irogers@google.com> |
x86-64 disassembler support. Change-Id: I0ae39ae1ffdae2500ff368354f9e4702445176f0
|
4028a6c83a339036864999fdfd2855b012a9f1a7 |
|
20-Feb-2014 |
Mark Mendell <mark.p.mendell@intel.com> |
Inline x86 String.indexOf Take advantage of the presence of a constant search char or start index to tune the generated code. Change-Id: I0adcf184fb91b899a95aa4d8ef044a14deb51d88 Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>
|
614c2b4e219631e8c190fd9fd5d4d9cd343434e1 |
|
29-Jan-2014 |
Razvan A Lupusoru <razvan.a.lupusoru@intel.com> |
Support to generate inline long to FP bytecodes for x86 long-to-float and long-to-double are now generated inline instead of calling a helper routine. The conversion is done by using x87. Change-Id: I196e526afec1be212898baceca8527549c3655b6 Signed-off-by: Razvan A Lupusoru <razvan.a.lupusoru@intel.com>
|
2c498d1f28e62e81fbdb477ff93ca7454e7493d7 |
|
30-Jan-2014 |
Razvan A Lupusoru <razvan.a.lupusoru@intel.com> |
Specializing x86 range argument copying The ARM implementation of range argument copying was specialized in some cases. For all other architectures, it would fall back to generating memcpy. This patch updates the x86 implementation so it does not call memcpy and instead generates loads and stores, favoring movement of 128-bit chunks. Change-Id: Ic891e5609a4b0e81a47c29cc5a9b301bd10a1933 Signed-off-by: Razvan A Lupusoru <razvan.a.lupusoru@intel.com>
|
d3266bcc340d653e178e3ab9d74512c8db122eee |
|
24-Jan-2014 |
Razvan A Lupusoru <razvan.a.lupusoru@intel.com> |
Reduce x86 sequence for GP pair to XMM Added support for punpckldq which is useful for interleaving 32-bit values from two xmm registers. This new instruction is now used for transfers from GP pairs to XMM in order to reduce path length. Change-Id: I70d9b69449dfcfb9a94a628deb74a7cffe96bac7 Signed-off-by: Razvan A Lupusoru <razvan.a.lupusoru@intel.com>
|
4708dcd68eebf1173aef1097dad8ab13466059aa |
|
22-Jan-2014 |
Mark Mendell <mark.p.mendell@intel.com> |
Improve x86 long multiply and shifts Generate inline code for long shifts by constants and do long multiplication inline. Convert multiplication by a constant to a shift when we can. Fix some x86 assembler problems and add the new instructions that were needed (64 bit shifts). Change-Id: I6237a31c36159096e399d40d01eb6bfa22ac2772 Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>
|
2bf31e67694da24a19fc1f328285cebb1a4b9964 |
|
23-Jan-2014 |
Mark Mendell <mark.p.mendell@intel.com> |
Improve x86 long divide Implement inline division for literal and variable divisors. Use the general case for dividing by a literal by using a double length multiply by the appropriate constant with fixups. This is the Hacker's Delight algorithm. Change-Id: I563c250f99d89fca5ff8bcbf13de74de13815cfe Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>
|
bd288c2c1206bc99fafebfb9120a83f13cf9723b |
|
21-Dec-2013 |
Razvan A Lupusoru <razvan.a.lupusoru@intel.com> |
Add conditional move support to x86 and allow GenMinMax to use it X86 supports conditional moves, which are useful for reducing branchiness. This patch adds support to the x86 backend to generate conditional reg to reg operations. Both encoder and decoder support was added for cmov. The x86 version of GenMinMax used for generating the inlined version of Math.min/max has been updated to make use of the conditional move support. Change-Id: I92c5428e40aa8ff88bd3071619957ac3130efae7 Signed-off-by: Razvan A Lupusoru <razvan.a.lupusoru@intel.com>
|
d19b55a05b52b7f7da9f894eba63ed03e2a62283 |
|
12-Dec-2013 |
Mark Mendell <mark.p.mendell@intel.com> |
Disassemble more x86 instructions By using oatdump on the core.oat, I found a couple more instructions that didn't disassemble properly. These included another form of imul and some FP instructions used by the JNI code. Now the only unknown opcodes I could find seem to be literal data at the end of the method. Change-Id: Icea1da1c7d1f9dce99e6b6517cfca34b47d6827a Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>
|
f723f0cdc693f81581c0781fa472b1c85a8b42d6 |
|
12-Dec-2013 |
Mark Mendell <mark.p.mendell@intel.com> |
Add missing x86 imul opcode to disassembler When playing with ART, I noticed that an integer multiply didn't disassemble properly. This patch adds the instruction. Change-Id: Ic4d4921b1b301a9d674a257f094e8b3d834ed991 Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>
|
70b797d998f2a28e39f7d6ffc8a07c9cbc47da14 |
|
03-Dec-2013 |
Vladimir Marko <vmarko@google.com> |
Unsafe.compareAndSwapLong() intrinsic for x86. Change-Id: Idbc5371a62dfdd84485a657d4548990519200205
|
a8b4caf7526b6b66a8ae0826bd52c39c66e3c714 |
|
24-Oct-2013 |
Vladimir Marko <vmarko@google.com> |
Add byte swap instructions for ARM and x86. Change-Id: I03fdd61ffc811ae521141f532b3e04dda566c77d
|
02ed4c04468ca5f5540c5b704ac3e2f30eb9e8f4 |
|
06-Sep-2013 |
Ian Rogers <irogers@google.com> |
Move disassembler out of runtime. Bug: 9877500. Change-Id: Ica6d9f5ecfd20c86e5230a2213827bd78cd29a29
|