History log of /art/compiler/dex/quick/x86/target_x86.cc
Revision	Date	Author	Comments
db7239ccce7748f2b494fb3b91c128b37019a093	17-Sep-2014	avignate <aleksey.v.ignatenko@intel.com>	ART: Overflow of bound check in ArrayCopy intrinsic System.arraycopy method is implemented as intrinsic on x86. It has a bound check which has a bug: it overflows in certain conditions when the sum of the array offset and the number of elements to be copied is more than MAX_INT. For the methods like CarArrayBuffer.get it means no OutOfBound exception to be thrown. The proposed solution fixed that. b/17711775 Signed-off-by: avignate <aleksey.v.ignatenko@intel.com> (cherry picked from commit f9f0ed401f7fe4138a71b36719423b908a3b7bfb) Change-Id: I1d4ca900df262d483a94ebea8fa686ea361772c8
02ff2d4187249d26fabe8e5eacc27b99984ee353	04-Sep-2014	Serguei Katkov <serguei.i.katkov@intel.com>	AddIntrinsicSlowPath with resume requires clobbering AddIntrinsicSlowPath with resume results in a call. So all temps must be clobbered at the point where AddIntrinsicSlowPath returns. (cherry-picked from 9863daf4fdc1a08339edac794452dbc719aef4f1) Change-Id: If9eb887e295ff5e59920f4da1cef63258ad490b0 Signed-off-by: Serguei Katkov <serguei.i.katkov@intel.com>
b5874a47a4d2c4d2971116b031b4068021ffda05	19-Aug-2014	Vladimir Marko <vmarko@google.com>	X86: Fix alias info in GenInlinedIndexOf(). For 32-bit X86, GenInlinedIndexOf() pushes and pops EDI. In one branch it then calls Load32Disp() with adjusted stack offset. That calculates wrong alias_info for the generated insn. If left unfixed, this could confuse load hoisting. Bug: 17128502 (cherry picked from commit 74de63bb1cc275b411cae28a96f9b3a78b939bc2) Change-Id: I5dc82b7aae9e9655e75843a952b8ebb04269f46b
b0f05b9654eb005bc8c8e15f615a7f5a312f640c	17-Jul-2014	Dave Allison <dallison@google.com>	Add implicit checks for x86_64 architecture. This combines the x86 and x86_64 fault handlers into one. It also merges in the change to the entrypoints for X86_64. Replaces generic instruction length calculator with one that only works with the specific instructions we use. Bug: 16256184 Change-Id: I1e8ab5ad43f46060de9597615b423c89a836035c Signed-off-by: Chao-ying Fu <chao-ying.fu@intel.com>
5a5e85693b1d5952d88377be5826068b67b0dcec	18-Jul-2014	DaniilSokolov <daniil.y.sokolov@intel.com>	ART: Enable x86_64 bit support for intrinsic for System.arraycopy(char[], ..) Implements x86_64 support for intrinsic for java.lang.System.arraycopy(char[], int, char[], int, int). With this fix the intrinsic works on x86 and x86_64 architectures. Change-Id: Icc2889ccd0cf7d821522abb7437893e3149e7c99 Signed-off-by: Daniil Sokolov <daniil.y.sokolov@intel.com>
6bbf0967d217ab2b7bdbb78bfd076b8fb07a44e8	14-Jul-2014	Alexei Zavjalov <alexei.zavjalov@intel.com>	ART: Implement the easy long division/remainder by a constant Also optimizes long/int divisions by power-of-two values. Also do some clean-up. Change-Id: Ie414e64aac251c81361ae107d157c14439e6dab5 Signed-off-by: Alexei Zavjalov <alexei.zavjalov@intel.com>
bda2722ba62e5be9f9fd6a6eb0db8259bb383629	31-Jul-2014	Andreas Gampe <agampe@google.com>	ART: Build fix Make lint happy, as comments should be separated from code. Change-Id: I4bfd88357302be9a6a104f1152e3b1fda386371e
35e1e6ad4b50f1adbe9f93fe467766f042491896	30-Jul-2014	Tong Shen <endlessroad@google.com>	1. Fix CFI for quick compiled code in x86 & x86_64; 2. Emit CFI in .eh_frame instead of .debug_frame. With CFI, we can correctly unwind past quick generated code. Now gdb should unwind to main() for both x86 & x86_64 host-side ART. Note that it does not work with relocation yet. Testing: 1. art/test/run-test --host --gdb [--64] --no-relocate 005 2. In gdb, run 'b art_quick_invoke_stub', then 'r', then 'c' a few times 3. In gdb, run 'bt'. You should see stack frames down to main() Change-Id: I5350d4097dc3d360a60cb17c94f1d02b99bc58bb
984305917bf57b3f8d92965e4715a0370cc5bcfb	28-Jul-2014	Andreas Gampe <agampe@google.com>	ART: Rework quick entrypoint code in Mir2Lir, cleanup To reduce the complexity of calling trampolines in generic code, introduce an enumeration for entrypoints. Introduce a header that lists the entrypoint enum and exposes a templatized method that translates an enum value to the corresponding thread offset value. Call helpers are rewritten to have an enum parameter instead of the thread offset. Also rewrite LoadHelper and GenConversionCall this way. It is now LoadHelper's duty to select the right thread offset size. Introduce InvokeTrampoline virtual method to Mir2Lir. This allows to further simplify the call helpers, as well as make OpThreadMem specific to X86 only (removed from Mir2Lir). Make GenInlinedCharAt virtual, move a copy to X86 backend, and simplify both copies. Remove LoadBaseIndexedDisp and OpRegMem from Mir2Lir, as they are now specific to X86 only. Remove StoreBaseIndexedDisp from Mir2Lir, as it was only ever used in the X86 backend. Remove OpTlsCmp from Mir2Lir, as it was only ever used in the X86 backend. Remove OpLea from Mir2Lir, as it was only ever defined in the X86 backend. Remove GenImmedCheck from Mir2Lir as it was neither used nor implemented. Change-Id: If0a6182288c5d57653e3979bf547840a4c47626e
147eb41b53729ec8d5c188d1cac90964a51afb8a	11-Jul-2014	Dave Allison <dallison@google.com>	Revert "Revert "Revert "Revert "Add implicit null and stack checks for x86"""" This reverts commit 0025a86411145eb7cd4971f9234fc21c7b4aced1. Bug: 16256184 Change-Id: Ie0760a0c293aa3b62e2885398a8c512b7a946a73 Conflicts: compiler/dex/quick/arm64/target_arm64.cc compiler/image_test.cc runtime/fault_handler.cc
c3561ae381960cbd52a83b7591504f158ec06920	17-Jul-2014	nikolay serdjuk <nikolay.y.serdjuk@intel.com>	Improved implementation of inline of String.indexOf This version pushes EDI only once and only in 32-bit mode. Change-Id: I4e871d3531ac539536f8f53ec09ffb664409c9cc
8e3acdd132aef1391676a5db2696804900aacd8e	14-Jul-2014	Serguei Katkov <serguei.i.katkov@intel.com>	x86_64: Fix GenDalvikArgsRange for 64-bit ref 32-bit virtual register can be in 64-bit solo register. So we should not compute the size of virtual register based on the size of the physical register. Change-Id: I4e11be13df8469be63808d0ce9d1ca6f80bef483 Signed-off-by: Serguei Katkov <serguei.i.katkov@intel.com>
69dfe51b684dd9d510dbcb63295fe180f998efde	11-Jul-2014	Dave Allison <dallison@google.com>	Revert "Revert "Revert "Revert "Add implicit null and stack checks for x86"""" This reverts commit 0025a86411145eb7cd4971f9234fc21c7b4aced1. Bug: 16256184 Change-Id: Ie0760a0c293aa3b62e2885398a8c512b7a946a73
d9cb8ae2ed78f957a773af61759432d7a7bf78af	09-Jul-2014	Douglas Leung <douglas@mips.com>	Fix art test failures for Mips. This patch fixes the following art test failures for Mips: 003-omnibus-opcodes 030-bad-finalizer 041-narrowing 059-finalizer-throw Change-Id: I4e0e9ff75f949c92059dd6b8d579450dc15f4467 Signed-off-by: Douglas Leung <douglas@mips.com>
af263df7f643e699abf622c64447d31bacc14c34	12-Jul-2014	Andreas Gampe <agampe@google.com>	ART: Change GenPCUseDefEncoding(), turn on Load Hoisting for ARM64 This defines the PC resource mask as empty, as the PC is not accessible on ARM64. Unify code paths with x86 in LoadStoreElimination and LoadHoisting. Change-Id: Iea8b9e666f306c7a6ff52b6c5bf7e05b35346b2c
48f5c47907654350ce30a8dfdda0e977f5d3d39f	27-Jun-2014	Hans Boehm <hboehm@google.com>	Replace memory barriers to better reflect Java needs. Replaces barriers that enforce ordering of one access type (e.g. Load) with respect to another (e.g. store) with more general ones that better reflect both Java requirements and actual hardware barrier/fence instructions. The old code was inconsistent and unclear about which barriers implied which others. Sometimes multiple barriers were generated and then eliminated; sometimes it was assumed that certain barriers implied others. The new barriers closely parallel those in C++11, though, for now, we use something closer to the old naming. Bug: 14685856 Change-Id: Ie1c80afe3470057fc6f2b693a9831dfe83add831
ccc60264229ac96d798528d2cb7dbbdd0deca993	05-Jul-2014	Andreas Gampe <agampe@google.com>	ART: Rework TargetReg(symbolic_reg, wide) Make the standard implementation in Mir2Lir and the specialized one in the x86 backend return a pair when wide = "true". Introduce WideKind enumeration to improve code readability. Simplify generic code based on this implementation. Change-Id: I670d45aa2572eedfdc77ac763e6486c83f8e26b4
7fb36ded9cd5b1d254b63b3091f35c1e6471b90e	10-Jul-2014	Dave Allison <dallison@google.com>	Revert "Revert "Add implicit null and stack checks for x86"" Fixes x86_64 cross compile issue. Removes command line options and property to set implicit checks - this is hard coded now. This reverts commit 3d14eb620716e92c21c4d2c2d11a95be53319791. Change-Id: I5404473b5aaf1a9c68b7181f5952cb174d93a90d
c380191f3048db2a3796d65db8e5d5a5e7b08c65	08-Jul-2014	Serguei Katkov <serguei.i.katkov@intel.com>	x86_64: Enable fp-reg promotion Patch introduces 4 register XMM12-15 available for promotion of fp virtual registers. Change-Id: I3f89ad07fc8ae98b70f550eada09be7b693ffb67 Signed-off-by: Serguei Katkov <serguei.i.katkov@intel.com> Signed-off-by: Chao-ying Fu <chao-ying.fu@intel.com>
0025a86411145eb7cd4971f9234fc21c7b4aced1	11-Jul-2014	Nicolas Geoffray <ngeoffray@google.com>	Revert "Revert "Revert "Add implicit null and stack checks for x86""" Broke the build. This reverts commit 7fb36ded9cd5b1d254b63b3091f35c1e6471b90e. Change-Id: I9df0e7446ff0913a0e1276a558b2ccf6c8f4c949
34e826ccc80dc1cf7c4c045de6b7f8360d504ccf	29-May-2014	Dave Allison <dallison@google.com>	Add implicit null and stack checks for x86 This adds compiler and runtime changes for x86 implicit checks. 32 bit only. Both host and target are supported. By default, on the host, the implicit checks are null pointer and stack overflow. Suspend is implemented but not switched on. Change-Id: I88a609e98d6bf32f283eaa4e6ec8bbf8dc1df78a
3d14eb620716e92c21c4d2c2d11a95be53319791	10-Jul-2014	Dave Allison <dallison@google.com>	Revert "Add implicit null and stack checks for x86" It breaks cross compilation with x86_64. This reverts commit 34e826ccc80dc1cf7c4c045de6b7f8360d504ccf. Change-Id: I34ba07821fc0a022fda33a7ae21850957bbec5e7
60bfe7b3e8f00f0a8ef3f5d8716adfdf86b71f43	09-Jul-2014	Udayan Banerji <udayan.banerji@intel.com>	X86 Backend support for vectorized float and byte 16x16 operations Add support for reserving vector registers for the duration of vector loop. Add support for 16x16 multiplication, shifts, and add reduce. Changed the vectorization implementation to be able to use the dataflow elements for SSA recreation and fixed a few implementation details. Change-Id: I2f358f05f574fc4ab299d9497517b9906f234b98 Signed-off-by: Jean Christophe Beyler <jean.christophe.beyler@intel.com> Signed-off-by: Olivier Come <olivier.come@intel.com> Signed-off-by: Udayan Banerji <udayan.banerji@intel.com>
407a9d2847161b843966a443b71760b1280bd396	04-Jul-2014	Serguei Katkov <serguei.i.katkov@intel.com>	Clean-up call_x86.cc Also adds some DCHECKs and fixes for the bugs found by them. Change-Id: I455bbfe2c6018590cf491880cd9273edbe39c4c7 Signed-off-by: Serguei Katkov <serguei.i.katkov@intel.com>
70c4f06f9965cdb9319a2c85f65acda20086d765	25-Jun-2014	DaniilSokolov <daniil.y.sokolov@intel.com>	ART: Intrinsic implementation for java.lang.System.arraycopy. Implements intrinsic for java.lang.System.arraycopy(char[], int, char[], int, int) - this method is internal to android class libraries and used in such classes as StringBuffer and StringBuilder. It is not possible to call it from application code. The intrinsic for this method is implemented as inline method (assembly code is generated manually). The intrinsic is x86 32 bit only. Change-Id: Id1b1e0a20d5f6d5f5ebfe1fdc2447b6d8a515432 Signed-off-by: Daniil Sokolov <daniil.y.sokolov@intel.com>
a77ee5103532abb197f492c14a9e6fb437054e2a	02-Jul-2014	Chao-ying Fu <chao-ying.fu@intel.com>	x86_64: TargetReg update for x86 Also includes changes in common code. Elimination of use of TargetReg with one parameter and direct access to special target registers. Change-Id: Ied2c1f87d4d1e4345248afe74bca40487a46a371 Signed-off-by: Serguei Katkov <serguei.i.katkov@intel.com> Signed-off-by: Chao-ying Fu <chao-ying.fu@intel.com>
b5860fb459f1ed71f39d8a87b45bee6727d79fe8	22-Jun-2014	buzbee <buzbee@google.com>	Register promotion support for 64-bit targets Not sufficiently tested for 64-bit targets, but should be fairly close. A significant amount of refactoring could still be done (in later CLs). With this change we are not making any changes to the vmap scheme. As a result, it is a requirement that if a vreg is promoted to both a 32-bit view and the low half of a 64-bit view it must share the same physical register. We may change this restriction later on to allow for more flexibility for 32-bit Arm. For example, if v4, v5, v4/v5 and v5/v6 are all hot enough to promote, we'd end up with something like: v4 (as an int) -> r10 v4/v5 (as a long) -> r10 v5 (as an int) -> r11 v5/v6 (as a long) -> r11 Fix a couple of ARM64 bugs on the way... Change-Id: I6a152b9c164d9f1a053622266e165428045362f3
c5e4ce116e4d44bfdf162f0c949e77772d7e0654	10-Jun-2014	nikolay serdjuk <nikolay.y.serdjuk@intel.com>	x86_64: Fix intrinsics The following intrinsics have been ported: - Abs(double/long/int/float) - String.indexOf/charAt/compareTo/is_empty/length - Float.floatToRawIntBits, Float.intBitsToFloat - Double.doubleToRawLongBits, Double.longBitsToDouble - Thread.currentThread - Unsafe.getInt/Long/Object, Unsafe.putInt/Long/Object - Math.sqrt, Math.max, Math.min - Long.reverseBytes Math.min and max for longs have been implemented for x86_64. Commented out until good tests available: - Memory.peekShort/Int/Long, Memory.pokeShort/Int/Long Turned off on x86-64 as reported having problems - Cas Change-Id: I934bc9c90fdf953be0d3836a17b6ee4e7c98f244
5192cbb12856b12620dc346758605baaa1469ced	01-Jul-2014	Yixin Shou <yixin.shou@intel.com>	Load 64 bit constant into GPR by single instruction for 64bit mode This patch loads a 64 bit constant into a register by a single movabsq instruction on 64 bit instead of the previous mov, shift, add instruction sequences. Change-Id: I9d013c4f6c0b5c2e43bd125f91436263c7e6028c Signed-off-by: Yixin Shou <yixin.shou@intel.com>
dd64450b37776f68b9bfc47f8d9a88bc72c95727	01-Jul-2014	Elena Sayapina <elena.v.sayapina@intel.com>	x86_64: Unify 64-bit check in x86 compiler Update x86-specific Gen64Bit() check with the CompilationUnit target64 field which is set using unified Is64BitInstructionSet(InstructionSet) check. Change-Id: Ic00ac863ed19e4543d7ea878d6c6c76d0bd85ce8 Signed-off-by: Elena Sayapina <elena.v.sayapina@intel.com>
4d5d794382cd6d3a25392d17543d5987e432d314	26-Jun-2014	Dmitry Petrochenko <dmitry.petrochenko@intel.com>	x86_64: Simplify FlushIns This change simplifies FlushIns for x86_64. Change-Id: I2b41fae32603e0951e3847cc1e4f9c6bfab349a0 Signed-off-by: Serguei Katkov <serguei.i.katkov@intel.com> Signed-off-by: Dmitry Petrochenko <dmitry.petrochenko@intel.com>
b6564c19c5e14a3caa3f8da423b0da510fda7026	24-Jun-2014	Chao-ying Fu <chao-ying.fu@intel.com>	x86_64: Fix wide argument increment This patch fixes to always increment the index for a wide argument, and fixes the index upper bound. Otherwise, the mapping may be incorrect. Change-Id: I0116d8fd0a0a5c1270a23129c73a9e3651132977 Signed-off-by: Chao-ying Fu <chao-ying.fu@intel.com> Signed-off-by: Serguei Katkov <serguei.i.katkov@intel.com>
de68676b24f61a55adc0b22fe828f036a5925c41	24-Jun-2014	Andreas Gampe <agampe@google.com>	Revert "ART: Split out more cases of Load/StoreRef, volatile as parameter" This reverts commit 2689fbad6b5ec1ae8f8c8791a80c6fd3cf24144d. Breaks the build. Change-Id: I9faad4e9a83b32f5f38b2ef95d6f9a33345efa33
3c12c512faf6837844d5465b23b9410889e5eb11	24-Jun-2014	Andreas Gampe <agampe@google.com>	Revert "Revert "ART: Split out more cases of Load/StoreRef, volatile as parameter"" This reverts commit de68676b24f61a55adc0b22fe828f036a5925c41. Fixes an API comment, and differentiates between inserting and appending. Change-Id: I0e9a21bb1d25766e3cbd802d8b48633ae251a6bf
2689fbad6b5ec1ae8f8c8791a80c6fd3cf24144d	23-Jun-2014	Andreas Gampe <agampe@google.com>	ART: Split out more cases of Load/StoreRef, volatile as parameter Splits out more cases of ref registers being loaded or stored. For code clarity, adds volatile as a flag parameter instead of a separate method. On ARM64, continue cleanup. Add flags to print/fatal on size mismatches. Change-Id: I30ed88433a6b4ff5399aefffe44c14a5e6f4ca4e
5655e84e8d71697d8ef3ea901a0b853af42c559e	18-Jun-2014	Andreas Gampe <agampe@google.com>	ART: Implicit checks in the compiler are independent from Runtime When cross-compiling, those flags are independent. This is an initial CL that helps bypass fatal failures when cross-compiling, as not all architectures support (and have turned on) implicit checks. The actual transport for the target architecture when it is different from the runtime needs to be implemented in a follow-up CL. Bug: 15703710 Change-Id: Idc881a9a4abfd38643b862a491a5af9b8841f693
35ec2b5faf9a2dbc3c0cddb7ebc09952b8a27d2a	17-Jun-2014	Chao-ying Fu <chao-ying.fu@intel.com>	x86_64: Clobber r8 to r11 and xmm0 to xmm15 This clobbers r8 to r11 and xmm0 to xmm15, so that they can be reloaded after an external C call. Change-Id: If5cac97e475083912026309891dc332f14f8683a Signed-off-by: Chao-ying Fu <chao-ying.fu@intel.com>
7e399fd3a99ba9c9dbfafdf14f75dd318fa7d454	11-Jun-2014	Chao-ying Fu <chao-ying.fu@intel.com>	x86_64: Disable all optimizations and fix bugs This disables all optimizations and ensures that art tests still pass. Change-Id: I43217378d6889bb04f4d064f8d53cb3ff4c20aa0 Signed-off-by: Chao-ying Fu <chao-ying.fu@intel.com> Signed-off-by: Serguei Katkov <serguei.i.katkov@intel.com> Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>
8dea81ca9c0201ceaa88086b927a5838a06a3e69	06-Jun-2014	Vladimir Marko <vmarko@google.com>	Rewrite use/def masks to support 128 bits. Reduce LIR memory usage by holding masks by pointers in the LIR rather than directly and using pre-defined const masks for the common cases, allocating very few on the arena. Change-Id: I0f6d27ef6867acd157184c8c74f9612cebfe6c16
55884bc1e2e1b324809b462455ccaf5811ffafd8	10-Jun-2014	Mark Mendell <mark.p.mendell@intel.com>	X86_64: Proper IMT fix Unfortunately, 97184: X86_64: Pass 'hidden method index' in EAX wasn't correct. TargetReg(kInvokeTgt) is ALSO EAX, and so invoke-interface blows up, since the saved index is overwritten by the generated code. Change kInvokeTgt to EDI (the same as ARG0). Change-Id: I4b1d260237274ee26b9283d810d1b74484ea59af Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>
ea248f8b048d904a8fe806b6a52372985945274d	10-Jun-2014	Ian Rogers <irogers@google.com>	Remove TARGET_REX_SUPPORT define. Change-Id: I1c3644176c101064261d13b50484d2e3ae456316
0f9b9c508814a62c6e21c6a06cfe4de39b5036c0	09-Jun-2014	Ian Rogers <irogers@google.com>	Tidy up x86 assembler and fix byte register encoding. Also fix reg storage int size issues. Also fix bad use of byte registers in GenInlinedCas. Change-Id: Id47424f36f9000e051110553e0b51816910e2fe8
d3703d82a0afc28a4ea0cb0f6d88e9f8adc23e43	09-Jun-2014	Mark Mendell <mark.p.mendell@intel.com>	X86_64: Pass 'hidden method index' in EAX Method* is in EDI, and EAX isn't an argument register, so EAX is free to hold the hidden method index. Change-Id: I793a54d00a4593e140f97144419d849b53bfdf44 Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>
a014776f4474579d4dfc72e3374ba45c6f6e5f35	07-Jun-2014	Chao-ying Fu <chao-ying.fu@intel.com>	x86_64: Add long bytecode supports (2/2) This patch adds implementation of math and complex long bytecodes, and basic long arithmetic. Change-Id: I811397d7e0ee8ad0d12b23d32ba58314d479d714 Signed-off-by: Chao-ying Fu <chao-ying.fu@intel.com> Signed-off-by: Serguei Katkov <serguei.i.katkov@intel.com> Signed-off-by: Dmitry Petrochenko <dmitry.petrochenko@intel.com> Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>
e0ccdc0dd166136cd43e5f54201179a4496d33e8	07-Jun-2014	Chao-ying Fu <chao-ying.fu@intel.com>	x86_64: Add long bytecode supports (1/2) This patch includes switch enabling and GenFillArray, assembler changes, updates of regalloc behavior for 64-bit, usage in basic utility operations, loading constants, and update for memory operations. Change-Id: I6d8aa35a75c5fd01d69c38a770c3398d0188cc8a Signed-off-by: Chao-ying Fu <chao-ying.fu@intel.com> Signed-off-by: Serguei Katkov <serguei.i.katkov@intel.com> Signed-off-by: Dmitry Petrochenko <dmitry.petrochenko@intel.com> Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>
58994cdb00b323339bd83828eddc53976048006f	16-May-2014	Dmitry Petrochenko <dmitry.petrochenko@intel.com>	x86_64: Hard Float ABI support in QCG This patch shows our efforts on resolving the ART limitations: - passing "float"/"double" arguments via FPR - passing "long" arguments via single GPR, not pair - passing more than 3 arguments via GPR. Work done: - Extended SpecialTargetRegister enum with kARG4, kARG5, fARG4..fARG7. - Created initial LoadArgRegs/GenDalvikX/FlushIns version in X86Mir2Lir. - Unlimited number of long/double/float arguments support - Refactored (v2) Change-Id: I5deadd320b4341d5b2f50ba6fa4a98031abc3902 Signed-off-by: Serguei Katkov <serguei.i.katkov@intel.com> Signed-off-by: Dmitry Petrochenko <dmitry.petrochenko@intel.com> Signed-off-by: Chao-ying Fu <chao-ying.fu@intel.com> Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>
76af0d307194045ece429dbaf62e93d3e08c6c20	05-Jun-2014	Dmitry Petrochenko <dmitry.petrochenko@intel.com>	x86_64: Turn on 64-bit core registers initialization. This enables 64-bit core registers initialization for x86_64. The backend update with 64-bit temp support is in progress. Change-Id: If7c9a62c1145f81050adda86f2beed427220baa2 Signed-off-by: Chao-ying Fu <chao-ying.fu@intel.com> Signed-off-by: Dmitry Petrochenko <dmitry.petrochenko@intel.com> Signed-off-by: Serguei Katkov <serguei.i.katkov@intel.com> Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>
089142cf1d0c028b5a7c703baf0b97f4a4ada3f7	05-Jun-2014	Vladimir Marko <vmarko@google.com>	Avoid register pool allocations on the heap. Create a helper template class ArrayRef and use it instead of std::vector<> for register pools in target_<arch>.cc to avoid these heap allocations during program startup. Change-Id: I4ab0205af9c1d28a239c0a105fcdc60ba800a70a
a0cd2d701f29e0bc6275f1b13c0edfd4ec391879	01-Jun-2014	buzbee <buzbee@google.com>	Quick compiler: reference cleanup For 32-bit targets, object references are 32 bits wide both in Dalvik virtual registers and in core physical registers. Because of this, object references and non-floating point values were both handled as if they had the same register class (kCoreReg). However, for 64-bit systems, references are 32 bits in Dalvik vregs, but 64 bits in physical registers. Although the same underlying physical core registers will still be used for object reference and non-float values, different register class views will be used to represent them. For example, an object reference in arm64 might be held in x3 at some point, while the same underlying physical register, w3, would be used to hold a 32-bit int. This CL breaks apart the handling of object reference and non-float values to allow the proper register class (or register view) to be used. A new register class, kRefReg, is introduced which will map to a 32-bit core register on 32-bit targets, and 64-bit core registers on 64-bit targets. From this point on, object references should be allocated registers in the kRefReg class rather than kCoreReg. Change-Id: I6166827daa8a0ea3af326940d56a6a14874f5810
ffddfdf6fec0b9d98a692e27242eecb15af5ead2	03-Jun-2014	Tim Murray <timmurray@google.com>	DO NOT MERGE Merge ART from AOSP to lmp-preview-dev. Change-Id: I0f578733a4b8756fd780d4a052ad69b746f687a9
a20468c004264592f309a548fc71ba62a69b8742	30-Apr-2014	Dmitry Petrochenko <dmitry.petrochenko@intel.com>	x86_64: Support r8-r15, xmm8-xmm15 in assembler Added REX support. The TARGET_REX_SUPPORT should be used during build. Change-Id: I82b457ff5085c8192ad873923bd939fbb91022ce Signed-off-by: Dmitry Petrochenko <dmitry.petrochenko@intel.com>
fe94578b63380f464c3abd5c156b7b31d068db6c	22-May-2014	Mark Mendell <mark.p.mendell@intel.com>	Implement all vector instructions for X86 Add X86 code generation for the vector operations. Added support for X86 disassembler for the new instructions. Change-Id: I72b48f5efa3a516a16bb1dd4bdb5c9270a8db53a Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>
0999a6f7c83d10aa59b75f079f0d2fdbac982cf7	21-May-2014	Dmitry Petrochenko <dmitry.petrochenko@intel.com>	x86_64: Rebase on top of "64-bit temp register support" Added the 64-bit core/temp register definition, fixed RegisterPool creation for x86_64 so that 64-bit core/temps are NOT used for now. The long arithmetic still operates with register pair on x86_64 and it is a subject for change in a separate patch. Change-Id: I2be06d5aefaf80141983bc9d8ed8a2ee24c2b21b Signed-off-by: Dmitry Petrochenko <dmitry.petrochenko@intel.com>
b01bf15d18f9b08d77e7a3c6e2897af0e02bf8ca	14-May-2014	buzbee <buzbee@google.com>	64-bit temp register support. Add a 64-bit temp register allocation path. The recent physical register handling rework supports multiple views of the same physical register (or, such as for Arm's float/double regs, different parts of the same physical register). This CL adds a 64-bit core register view for 64-bit targets. In short, each core register will have a 64-bit name, and a 32-bit name. The different views will be kept in separate register pools, but aliasing will be tracked. The core temp register allocation routines will be largely identical - except for 32-bit targets, which will continue to use pairs of 32-bit core registers for holding long values. Change-Id: I8f118e845eac7903ad8b6dcec1952f185023c053
e87f9b5185379c8cf8392d65a63e7bf7e51b97e7	30-Apr-2014	Mark Mendell <mark.p.mendell@intel.com>	Allow X86 QBE to be extended Enhancements and updates to allow X86Mir2LIR Backend to be subclassed for experimentation. Add virtual in a whole bunch of places, and make some other changes to get this to work. Change-Id: I0980a19bc5d5725f91660f98c95f1f51c17ee9b6 Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>
082833c8d577db0b2bebc100602f31e4e971613e	18-May-2014	buzbee <buzbee@google.com>	Quick compiler, out of registers fix It turns out that the register pool sanity checker was not working as expected, leaving some inconsistencies unreported. This could result in "out of registers" failures, as well as other more subtle problems. This CL fixes the sanity checker, adds a lot more check and cleans up the previously undetected episodes of insanity. Cherry-pick of internal change 468162 Change-Id: Id2da97e99105a4c272c5fd256205a94b904ecea8
05d3aeb33683b16837741f9348d6fba9a8432068	18-May-2014	buzbee <buzbee@google.com>	Quick compiler, out of registers fix Fixes b/15024623 It turns out that the register pool sanity checker was not working as expected, leaving some inconsistencies unreported. This CL fixes the sanity checker, adds a lot more check and cleans up the previously undetected episodes of insanity. Change-Id: I4d67db864ca5926a1975db251e7e631b65a86275
d65c51a556e6649db4e18bd083c8fec37607a442	29-Apr-2014	Mark Mendell <mark.p.mendell@intel.com>	ART: Add support for constant vector literals Add in some vector instructions. Implement the ConstVector instruction, which takes 4 words of data and loads it into an XMM register. Initially, only the ConstVector MIR opcode is implemented. Others will be added after this one goes in. Change-Id: I5c79bc8b7de9030ef1c213fc8b227debc47f6337 Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>
b14329f90f725af0f67c45dfcb94933a426d63ce	15-May-2014	Andreas Gampe <agampe@google.com>	ART: Fix MonitorExit code on ARM We do not emit barriers on non-SMP systems. But on ARM, we have places that need to conditionally execute, which is done through an IT instruction. The guide of said instruction thus changes between SMP and non-SMP systems. To cleanly approach this, change the API so that GenMemBarrier returns whether it generated an instruction. ARM will have to query the result and update any dependent IT. Throw a build system error if TARGET_CPU_SMP is not set. Fix runtime/Android.mk to work with new multilib host. Bug: 14989275 Change-Id: I9e611b770e8a1cd4ca19367d7dae0573ec08dc61
9ee801f5308aa3c62ae3bedae2658612762ffb91	12-May-2014	Dmitry Petrochenko <dmitry.petrochenko@intel.com>	Add x86_64 code generation support Utilizes r0..r7 in register allocator, implements spill/unspill core regs as well as operations with stack pointer. Change-Id: I973d5a1acb9aa735f6832df3d440185d9e896c67 Signed-off-by: Dmitry Petrochenko <dmitry.petrochenko@intel.com>
2f244e9faccfcca68af3c5484c397a01a1c3a342	08-May-2014	Andreas Gampe <agampe@google.com>	ART: Add more ThreadOffset in Mir2Lir and backends This duplicates all methods with ThreadOffset parameters, so that both ThreadOffset<4> and ThreadOffset<8> can be handled. Dynamic checks against the compilation unit's instruction set determine which pointer size to use and therefore which methods to call. Methods with unsupported pointer sizes should fatally fail, as this indicates an issue during method selection. Change-Id: Ifdb445b3732d3dc5e6a220db57374a55e91e1bf6
30adc7383a74eb3cb6db3bf42cea3a5595055ce1	10-May-2014	buzbee <buzbee@google.com>	Quick compiler: Fix liveness tracking Rework temp register liveness tracking to play nicely with aliased physical registers, and re-enable liveness tracking optimization. Add a pair of x86 utility routines that act like UpdateLoc(), but only show in-register live temps if they are of the expected register class. Change-Id: I92779e0da2554689103e7488025be281f1a58989
674744e635ddbdfb311fbd25b5a27356560d30c3	24-Apr-2014	Vladimir Marko <vmarko@google.com>	Use atomic load/store for volatile IGET/IPUT/SGET/SPUT. Bug: 14112919 Change-Id: I79316f438dd3adea9b2653ffc968af83671ad282
091cc408e9dc87e60fb64c61e186bea568fc3d3a	31-Mar-2014	buzbee <buzbee@google.com>	Quick compiler: allocate doubles as doubles Significant refactoring of register handling to unify usage across all targets & 32/64 backends. Reworked RegStorage encoding to allow expanded use of x86 xmm registers; removed vector registers as a separate register type. Reworked RegisterInfo to describe aliased physical registers. Eliminated quite a bit of target-specific code and generalized common code. Use of RegStorage instead of int for registers now propagated down to the NewLIRx() level. In future CLs, the NewLIRx() routines will be replaced with versions that are explicit about what kind of operand they expect (RegStorage, displacement, etc.). The goal is to eventually use RegStorage all the way to the assembly phase. TBD: MIPS needs verification. TBD: Re-enable liveness tracking. Change-Id: I388c006d5fa9b3ea72db4e37a19ce257f2a15964
695d13a82d6dd801aaa57a22a9d4b3f6db0d0fdb	19-Apr-2014	buzbee <buzbee@google.com>	Update load/store utilities for 64-bit backends This CL replaces the typical use of LoadWord/StoreWord utilities (which, in practice, were 32-bit load/store) in favor of a new set that make the size explicit. We now have: LoadWordDisp/StoreWordDisp: 32 or 64 depending on target. Load or store the natural word size. Expect this to be used infrequently - generally when we know we're dealing with a native pointer or flushed register not holding a Dalvik value (Dalvik values will flush to home location sizes based on Dalvik, rather than the target). Load32Disp/Store32Disp: Load or store 32 bits, regardless of target. Load64Disp/Store64Disp: Load or store 64 bits, regardless of target. LoadRefDisp: Load a 32-bit compressed reference, and expand it to the natural word size in the target register. StoreRefDisp: Compress a reference held in a register of the natural word size and store it as a 32-bit compressed reference. Change-Id: I50fcbc8684476abd9527777ee7c152c61ba41c6f
3a74d15ccc9a902874473ac9632e568b19b91b1c	22-Apr-2014	Mingyao Yang <mingyao@google.com>	Delete throw launchpads. Bug: 13170824 Change-Id: I9d5834f5a66f5eb00f2ac80774e8c27dea99949e
a1758d83e298c9ee31848bcae07c2a35f6efd618	16-Apr-2014	Alexei Zavjalov <alexei.zavjalov@intel.com>	String.IndexOf method handles negative start index value in incorrect way The standard implementation of String.IndexOf converts the negative value of the start index to 0 and searching will start from the beginning of the string. But current implementation may start searching from the incorrect memory offset, that can lead to sigsegv or return incorrect result. This patch adds the handler for cases when fromIndex is negative. Change-Id: I3ac86290712789559eaf5e46bef0006872395bfa Signed-off-by: Alexei Zavjalov <alexei.zavjalov@intel.com>
6a58cb16d803c9a7b3a75ccac8be19dd9d4e520d	02-Apr-2014	Dmitry Petrochenko <dmitry.petrochenko@intel.com>	art: Handle x86_64 architecture equal to x86 This patch forces FE/ME to treat x86_64 as x86 exactly. The x86_64 logic will be revised later when assembly will be ready. Change-Id: I4a92477a6eeaa9a11fd710d35c602d8d6f88cbb6 Signed-off-by: Dmitry Petrochenko <dmitry.petrochenko@intel.com>
dd7624d2b9e599d57762d12031b10b89defc9807	15-Mar-2014	Ian Rogers <irogers@google.com>	Allow mixing of thread offsets between 32 and 64bit architectures. Begin a more full implementation x86-64 REX prefixes. Doesn't implement 64bit thread offset support for the JNI compiler. Change-Id: If9af2f08a1833c21ddb4b4077f9b03add1a05147
2700f7e1edbcd2518f4978e4cd0e05a4149f91b6	07-Mar-2014	buzbee <buzbee@google.com>	Continuing register cleanup Ready for review. Continue the process of using RegStorage rather than ints to hold register value in the top layers of codegen. Given the huge number of changes in this CL, I've attempted to minimize the number of actual logic changes. With this CL, the use of ints for registers has largely been eliminated except in the lowest utility levels. "Wide" utility routines have been updated to take a single RegStorage rather than a pair of ints representing low and high registers. Upcoming CLs will be smaller and more targeted. My expectations: o Allocate float double registers as a single double rather than a pair of float single registers. o Refactor to push code which assumes long and double Dalvik values are held in a pair of register to the target dependent layer. o Clean-up of the xxx_mir.h files to reduce the amount of #defines for registers. May also do a register renumbering to bring all of our targets' register naming more consistent. Possibly introduce a target-independent float/non-float test at the RegStorage level. Change-Id: I646de7392bdec94595dd2c6f76e0f1c4331096ff
99ad7230ccaace93bf323dea9790f35fe991a4a2	26-Feb-2014	Razvan A Lupusoru <razvan.a.lupusoru@intel.com>	Relaxed memory barriers for x86 X86 provides stronger memory guarantees and thus the memory barriers can be optimized. This patch ensures that all memory barriers for x86 are treated as scheduling barriers. And in cases where a barrier is needed (StoreLoad case), an mfence is used. Change-Id: I13d02bf3f152083ba9f358052aedb583b0d48640 Signed-off-by: Razvan A Lupusoru <razvan.a.lupusoru@intel.com>
e90501da0222717d75c126ebf89569db3976927e	12-Mar-2014	Serguei Katkov <serguei.i.katkov@intel.com>	Add dependency for operations with x86 FPU stack Load Hoisting optimization can re-order operations with FPU stack due to no dependency set. Patch adds resource dependency between these operations. Change-Id: Iccce98c8f3c565903667c03803884d9de1281ea8 Signed-off-by: Serguei Katkov <serguei.i.katkov@intel.com>
b373e091eac39b1a79c11f2dcbd610af01e9e8a9	21-Feb-2014	Dave Allison <dallison@google.com>	Implicit null/suspend checks (oat version bump) This adds the ability to use SEGV signals to throw NullPointerException exceptions from Java code rather than having the compiler generate explicit comparisons and branches. It does this by using sigaction to trap SIGSEGV and when triggered makes sure it's in compiled code and if so, sets the return address to the entry point to throw the exception. It also uses this signal mechanism to determine whether to check for thread suspension. Instead of the compiler generating calls to a function to check for threads being suspended, the compiler will now load indirect via an address in the TLS area. To trigger a suspend, the contents of this address are changed from something valid to 0. A SIGSEGV will occur and the handler will check for a valid instruction pattern before invoking the thread suspension check code. If a user program traps SIGSEGV it will prevent our signal handler working. This will cause a failure in the runtime. There are two signal handlers at present. You can control them individually using the flags -implicit-checks: on the runtime command line. This takes a string parameter, a comma separated set of strings. Each can be one of: none (switch off), null (null pointer checks), suspend (suspend checks), all (all checks). So to switch only suspend checks on, pass: -implicit-checks:suspend There is also -explicit-checks to provide the reverse once we change the default. For dalvikvm, pass --runtime-arg -implicit-checks:foo,bar The default is -implicit-checks:none There is also a property 'dalvik.vm.implicit_checks' whose value is the same string as the command option. The default is 'none'. For example to switch on null checks using the option: setprop dalvik.vm.implicit_checks null It only works for ARM right now. Bumps OAT version number due to change to Thread offsets. Bug: 13121132 Change-Id: If743849138162f3c7c44a523247e413785677370
3bc8615332b7848dec8c2297a40f7e4d176c0efb	13-Mar-2014	Vladimir Marko <vmarko@google.com>	Use LIRSlowPath for intrinsics, improve String.indexOf(). Rewrite intrinsic launchpads to use the LIRSlowPath. Improve String.indexOf for constant chars by avoiding the check for code points over 0xFFFF. Change-Id: I7fd5583214c5b4ab9c38ee36c5d6f003dd6345a8
34fa0d935bed7a0e17bc6df4bd079e3428a179e7	12-Mar-2014	Yevgeny Rouban <yevgeny.y.rouban@intel.com>	ART's intrinsic for String.indexOf use the incorrect register ART's intrinsic for String.indexOf of x86 platform use the incorrect register to compare start with the string length. It should be fixed. Change-Id: I22986b4d4b23f62b4bb97baab9fe43152d12145e Signed-off-by: Vladimir Ivanov <vladimir.a.ivanov@intel.com> Signed-off-by: Yevgeny Rouban <yevgeny.y.rouban@intel.com>
49161cef10a308aedada18e9aa742498d6e6c8c7	12-Mar-2014	Jeff Hao <jeffhao@google.com>	Allow patching between dex files in the boot classpath. Change-Id: I53f219a5382d0fcd580e96e50025fdad4fc399df
83cc7ae96d4176533dd0391a1591d321b0a87f4f	12-Feb-2014	Vladimir Marko <vmarko@google.com>	Create a scoped arena allocator and use that for LVN. This saves more than 0.5s of boot.oat compilation time on Nexus 5. TODO: Move other stuff to the scoped allocator. This CL alone increases the peak memory allocation. By reusing the memory for other parts of the compilation we should reduce this overhead. Change-Id: Ifbc00aab4f3afd0000da818dfe68b96713824a08
a44d4f508fa1642294e79d3ebecd790afe75ea60	05-Mar-2014	buzbee <buzbee@google.com>	Fix read of uninitialized memory in InlineIndexOf There are two flavors of IndexOf that we treat as an intrinsic: a zero-based version with 2 args and a 3-arg version that also takes a start position. The same code is used for both, but Valgrind reminded us that we shouldn't try loading a RegLocation for the nonexistent 3rd arg in the 2 argument version. We got lucky in that the bug was benign - the generated code would still be correct. Change-Id: I0bc7798c8034d35007ffe6d6d62f9ceb91fc44fd
00e1ec6581b5b7b46ca4c314c2854e9caa647dd2	28-Feb-2014	Bill Buzbee <buzbee@android.com>	Revert "Revert "Rework Quick compiler's register handling"" This reverts commit 86ec520fc8b696ed6f164d7b756009ecd6e4aace. Ready. Fixed the original type, plus some mechanical changes for rebasing. Still needs additional testing, but the problem with the original CL appears to have been a typo in the definition of the x86 double return template RegLocation. Change-Id: I828c721f91d9b2546ef008c6ea81f40756305891
ae9fd93c39a341e2dffe15c61cc7d9e841fa92c4	11-Feb-2014	Mark Mendell <mark.p.mendell@intel.com>	Tell GDB about Quick ART generated code This is actually a lot of work. To do this, we need: .debug_info .debug_abbrev .debug_frame .debug_str These are generated into the OAT file by OatWriter and ElfWriterQuick. Since the Quick ART runtime doesn't use dlopen to load the OAT files, GDB can't find this information. Use the alternate GDB JIT interface, which can be invoked at runtime. To use this interface, an ELF image needs to be built in memory. Read the information from the OAT file, fixup the addresses to point to the real locations, add a symbol table to hold the .text symbol, and then let GDB know about the information, which will be read from the runtime address space. This is quite primitive now, and could be cleaned up considerably. It probably needs symbol table entries for the methods, and descriptions of parameters and return types. Currently only supported for X86. This defaults to enabled for debug builds. Added dex2oat --gen-gdb-info and --no-gen-gdb-info flags to override. Change-Id: I4d18b2370f6dfaa00c8cc1925f10717be3bd1a62 Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>
86ec520fc8b696ed6f164d7b756009ecd6e4aace	26-Feb-2014	Bill Buzbee <buzbee@android.com>	Revert "Rework Quick compiler's register handling" This reverts commit 2c1ed456dcdb027d097825dd98dbe48c71599b6c. Change-Id: If88d69ba88e0af0b407ff2240566d7e4545d8a99
2c1ed456dcdb027d097825dd98dbe48c71599b6c	20-Feb-2014	buzbee <buzbee@google.com>	Rework Quick compiler's register handling For historical reasons, the Quick backend found it convenient to consider all 64-bit Dalvik values held in registers to be contained in a pair of 32-bit registers. Though this worked well for ARM (with double-precision registers also treated as a pair of 32-bit single-precision registers) it doesn't play well with other targets. And, it is somewhat problematic for 64-bit architectures. This is the first of several CLs that will rework the way the Quick backend deals with physical registers. The goal is to eliminate the "64-bit value backed with 32-bit register pair" requirement from the target-independent portions of the backend and support 64-bit registers throughout. The key RegLocation struct, which describes the location of Dalvik virtual register & register pairs, previously contained fields for high and low physical registers. The low_reg and high_reg fields are being replaced with a new type: RegStorage. There will be a single instance of RegStorage for each RegLocation. Note that RegStorage does not increase the space used. It is 16 bits wide, the same as the sum of the 8-bit low_reg and high_reg fields. At a target-independent level, it will describe whether the physical register storage associated with the Dalvik value is a single 32 bit, single 64 bit, pair of 32 bit or vector. The actual register number encoding is left to the target-dependent code layer. Because physical register handling is pervasive throughout the backend, this restructuring necessarily involves large CLs with lots of changes. I'm going to roll these out in stages, and attempt to segregate the CLs with largely mechanical changes from those which restructure or rework the logic. This CL is of the mechanical change variety - it replaces low_reg and high_reg from RegLocation and introduces RegStorage. It also includes a lot of new code (such as many calls to GetReg()) that should go away in upcoming CLs. The tentative plan for the subsequent CLs is: o Rework standard register utilities such as AllocReg() and FreeReg() to use RegStorage instead of ints. o Rework the target-independent GenXXX, OpXXX, LoadValue, StoreValue, etc. routines to take RegStorage rather than int register encodings. o Take advantage of the vector representation and eliminate the current vector field in RegLocation. o Replace the "wide" variants of codegen utilities that take low_reg/high_reg pairs with versions that use RegStorage. o Add 64-bit register target independent codegen utilities where possible, and where not virtualize with 32-bit general register and 64-bit general register variants in the target dependent layer. o Expand/rework the LIR def/use flags to allow for more registers (currently, we lose out on 16 MIPS floating point regs as well as ARM's D16..D31 for lack of space in the masks). o [Possibly] move the float/non-float determination of a register from the target-dependent encoding to RegStorage. In other words, replace IsFpReg(register_encoding_bits). At the end of the day, all code in the target independent layer should be using RegStorage, as should much of the target dependent layer. Ideally, we won't be using the physical register number encoding extracted from RegStorage (i.e. GetReg()) until the NewLIRx() layer. Change-Id: Idc5c741478f720bdd1d7123b94e4288be5ce52cb
e19c91fdb88ff6fd4e88bc5984772dcfb1e86f80	25-Feb-2014	Mark Mendell <mark.p.mendell@intel.com>	Fix hardcoded offsets in x86 String.indexOf. Use runtime code that will work for 32 and 64 bit too. The old code copied constants from the runtime .S file and is correct for 32 bit code only. Change-Id: I668e1d7f2db8186518c358bde0759633be0d7c40 Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>
4028a6c83a339036864999fdfd2855b012a9f1a7	20-Feb-2014	Mark Mendell <mark.p.mendell@intel.com>	Inline x86 String.indexOf Take advantage of the presence of a constant search char or start index to tune the generated code. Change-Id: I0adcf184fb91b899a95aa4d8ef044a14deb51d88 Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>
f3e2cc4a38389aa75eb8ee3973a535254bf1c8d2	18-Feb-2014	Nicolas Geoffray <ngeoffray@google.com>	Code cleanup to avoid LLVM dependency when building with quick only. Change-Id: I0985c227d775c72fd23975d4c9bf673ba32615c2
3bc01748ef1c3e43361bdf520947a9d656658bf8	06-Feb-2014	Razvan A Lupusoru <razvan.a.lupusoru@intel.com>	GenSpecialCase support for x86 Moved GenSpecialCase from being ARM specific to common code to allow it to be used by x86 quick as well. Change-Id: I728733e8f4c4da99af6091ef77e5c76ae0fee850 Signed-off-by: Razvan A Lupusoru <razvan.a.lupusoru@intel.com>
55d0eac918321e0525f6e6491f36a80977e0d416	06-Feb-2014	Mark Mendell <mark.p.mendell@intel.com>	Support Direct Method/Type access for X86 Thumb generates code to optimize calls to methods within core.oat. Implement this for X86 as well, but take advantage of mov with 32 bit immediate and call relative with 32 bit immediate. Fix some incorrect return locations for long inlines. Change-Id: I1907bdfc7574f3d0aa76c7fad13dc537acdf1ed3 Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>
67c39c4aefca23cb136157b889c09ee200b3dec6	01-Feb-2014	Mark Mendell <mark.p.mendell@intel.com>	Support Literal pools for x86 They are being used to store double constants, which are very expensive to generate into XMM registers. Uses the 'Compiler Temporary' support just added. The MIR instructions are scanned for a reference to a double constant, a packed switch or a FillArray. These all need the address of the start of the method, since 32 bit x86 doesn't have a PC-relative addressing mode. If needed, a compiler temporary is allocated, and the address of the base of the method is calculated, and stored. Later uses can just refer to the saved value. Trickiness comes when generating the load from the literal area, as the offset is unknown before final assembler. Assume a 32 bit displacement is needed, and fix this if it wasn't necessary. Use LoadValue to load the 'base of method' pointer. Fix an incorrect test in GetRegLocation. Change-Id: I53ffaa725dabc370e9820c4e0e78664ede3563e6 Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>
107c31e598b649a8bb8d959d6a0377937e63e624	24-Jan-2014	Ian Rogers <irogers@google.com>	64bit friendly printf modifiers in LIR dumping. Also correct header file inclusion ordering. Change-Id: I8fb99e80cf1487e8b2278d4c1d110d14ed18c086
e02d48fb24747f90fd893e1c3572bb3c500afced	15-Jan-2014	Mark Mendell <mark.p.mendell@intel.com>	Optimize x86 long arithmetic Be smarter about taking advantage of a constant operand for x86 long add/sub/and/or/xor. Using instructions with immediates and generating results directly into memory reduces the number of temporary registers and avoids hardcoded register usage. Also rewrite the existing non-const x86 arithmetic to avoid fixed register use, and use the fact that x86 instructions are two operand. Pass the opcode to the XXXLong() routines to easily detect two operand DEX opcodes. Add a new StoreFinalValueWide() routine, which is similar to StoreValueWide, but doesn't do an EvalLoc to allocate registers. The src operand must already be in registers, and it just updates the dest location, and calls the right live/dirty routines to get the src into the dest properly. Change-Id: Iefc16e7bc2236a73dc780d3d5137ae8343171f62 Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>
d61ba4ba6fcde666adb5d5c81b1c32f0534fb2c8	13-Jan-2014	Bill Buzbee <buzbee@android.com>	Revert "Revert "Better support for x86 XMM registers"" This reverts commit 8ff67e3338952c70ccf3b609559bf8cc0f379cfd. Fix applied to loc.fp usage. Change-Id: I1eb3005392544fcf30c595923ed25bcee2dc4859
8ff67e3338952c70ccf3b609559bf8cc0f379cfd	11-Jan-2014	Bill Buzbee <buzbee@android.com>	Revert "Better support for x86 XMM registers" The invalid usage of loc.fp must be corrected before this change can be submitted. This reverts commit 766a5e5940b469ab40e52770862c81cfec1d835b. Change-Id: I1173a9bf829da89cccd9c2898f5e11164987a22b
766a5e5940b469ab40e52770862c81cfec1d835b	10-Jan-2014	Mark Mendell <mark.p.mendell@intel.com>	Better support for x86 XMM registers Currently, ART Quick mode assumes that a double FP register is composed of two single consecutive FP registers. This is true for ARM and MIPS, but not x86. This means that only half of the 8 XMM registers are available for use by x86 doubles. This patch breaks the assumption that a wide FP RegisterLocation must be a paired set of FP registers. This is done by making some routines in common code virtual and overriding them in the X86Mir2Lir class. For these wide fp locations, the high register is set to the same value as the low register, in order to minimize changes to common code. In a couple of places, the common code checks for this case. The changes are also supposed to allow the possibility of using the XMM registers for vector operations, but that support is still WIP. Change-Id: Ic6ef24ea764991c6f4d9fb88d483a619f5a468cb Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>
988e6ea9ac66edf1e205851df9bb53de3f3763f3	08-Jan-2014	Ian Rogers <irogers@google.com>	Fix -O0 builds. Use snprintf rather than sprintf to avoid Werror failures. Work around an annotalysis bug when compiling -O0. Change-Id: Ie7e0a70dbceea5fa85f98262b91bcdbd74fdef1c
31c2aac7137b69d5622eea09597500731fbee2ef	09-Dec-2013	Vladimir Marko <vmarko@google.com>	Rename ClobberCalleeSave to Caller, fix it for x86. Change-Id: I6a72703a11985e2753fa9b4520c375a164301433
70b797d998f2a28e39f7d6ffc8a07c9cbc47da14	03-Dec-2013	Vladimir Marko <vmarko@google.com>	Unsafe.compareAndSwapLong() intrinsic for x86. Change-Id: Idbc5371a62dfdd84485a657d4548990519200205
88474b416eb257078e590bf9bc7957cee604a186	24-Oct-2013	Jeff Hao <jeffhao@google.com>	Implement Interface Method Tables (IMT). Change-Id: Idf7fe85e1293453a8ad862ff2380dcd5db4e3a39
0d82948094d9a198e01aa95f64012bdedd5b6fc9	12-Oct-2013	buzbee <buzbee@google.com>	64-bit prep Preparation for 64-bit roll. o Eliminated storing pointers in 32-bit int slots in LIR. o General size reductions of common structures to reduce impact of doubled pointer sizes: - BasicBlock struct was 72 bytes, now is 48. - MIR struct was 72 bytes, now is 64. - RegLocation was 12 bytes, now is 8. o Generally replaced uses of BasicBlock* pointers with 16-bit Ids. o Replaced several doubly-linked lists with singly-linked to save one stored pointer per node. o We had quite a few uses of uintptr_t's that were a holdover from the JIT (which used pointers to mapped dex & actual code cache addresses rather than trace-relative offsets). Replaced those with uint32_t's. o Clean up handling of embedded data for switch tables and array data. o Miscellaneous cleanup. I anticipate one or two additional CLs to reduce the size of MIR and LIR structs. Change-Id: I58e426d3f8e5efe64c1146b2823453da99451230
409fe94ad529d9334587be80b9f6a3d166805508	11-Oct-2013	buzbee <buzbee@google.com>	Quick assembler fix This CL re-instates the select pattern optimization disabled by CL 374310, and fixes the underlying problem: improper handling of the kPseudoBarrier LIR opcode. The bug was introduced in the recent assembler restructuring. In short, LIR pseudo opcodes (which have values < 0), should always have size 0 - and thus cause no bits to be emitted during assembly. In this case, bad logic caused us to set the size of a kPseudoBarrier opcode via lookup through the EncodingMap. Because all pseudo ops are < 0, this meant we did an array underflow load, picking up whatever garbage was located before the EncodingMap. This explains why this error showed up recently - we'd previously just gotten a lucky layout. This CL corrects the faulty logic, and adds DCHECKs to uses of the EncodingMap to ensure that we don't try to access w/ a pseudo op. Additionally, the existing is_pseudo_op() macro is replaced with IsPseudoLirOp(), named similar to the existing IsPseudoMirOp(). Change-Id: I46761a0275a923d85b545664cadf052e1ab120dc
b48819db07f9a0992a72173380c24249d7fc648a	15-Sep-2013	buzbee <buzbee@google.com>	Compile-time tuning: assembly phase Not as much compile-time gain from reworking the assembly phase as I'd hoped, but still worthwhile. Should see ~2% improvement thanks to the assembly rework. On the other hand, expect some huge gains for some applications thanks to better detection of large machine-generated init methods. Thinkfree shows a 25% improvement. The major assembly change was to thread the LIR nodes that require fixup into a fixup chain. Only those are processed during the final assembly pass(es). This doesn't help for methods which only require a single pass to assemble, but does speed up the larger methods which required multiple assembly passes. Also replaced the block_map_ basic block lookup table (which contained space for a BasicBlock* for each dex instruction unit) with a block id map - cutting its space requirements by half in a 32-bit pointer environment. Changes: o Reduce size of LIR struct by 12.5% (one of the big memory users) o Repurpose the use/def portion of the LIR after optimization complete. o Encode instruction bits to LIR o Thread LIR nodes requiring pc fixup o Change follow-on assembly passes to only consider fixup LIRs o Switch on pc-rel fixup kind o Fast-path for small methods - single pass assembly o Avoid using cb[n]z for null checks (almost always exceed displacement) o Improve detection of large initialization methods. o Rework def/use flag setup. o Remove a sequential search from FindBlock using lookup table of 16-bit block ids rather than full block pointers. o Eliminate pcRelFixup and use fixup kind instead. o Add check for 16-bit overflow on dex offset. Change-Id: I4c6615f83fed46f84629ad6cfe4237205a9562b4
bd663de599b16229085759366c56e2ed5a1dc7ec	11-Sep-2013	buzbee <buzbee@google.com>	Compile-time tuning: register/bb utilities This CL yields about a 4% improvement in the compilation phase of dex2oat (single-threaded; multi-threaded compilation is more difficult to accurately measure). The register utilities could stand to be completely rewritten, but this gets most of the easy benefit. Next up: the assembly phase. Change-Id: Ife5a474e9b1a6d9e501e888dda6749d34eb77e96
f6c4b3ba3825de1dbb3e747a68b809c6cc8eb4db	25-Aug-2013	Mathieu Chartier <mathieuc@google.com>	New arena memory allocator. Before we were creating arenas for each method. The issue with doing this is that we needed to memset each memory allocation. This can be improved if you start out with arenas that contain all zeroed memory and recycle them for each method. When you give memory back to the arena pool you do a single memset to zero out all of the memory that you used. Always inlined the fast path of the allocation code. Removed the "zero" parameter since the new arena allocator always returns zeroed memory. Host dex2oat time on target oat apks (2 samples each). Before: real 1m11.958s user 4m34.020s sys 1m28.570s After: real 1m9.690s user 4m17.670s sys 1m23.960s Target device dex2oat samples (Mako, Thinkfree.apk): Without new arena allocator: 0m26.47s real 0m54.60s user 0m25.85s system 0m25.91s real 0m54.39s user 0m26.69s system 0m26.61s real 0m53.77s user 0m27.35s system 0m26.33s real 0m54.90s user 0m25.30s system 0m26.34s real 0m53.94s user 0m27.23s system With new arena allocator: 0m25.02s real 0m54.46s user 0m19.94s system 0m25.17s real 0m55.06s user 0m20.72s system 0m24.85s real 0m55.14s user 0m19.30s system 0m24.59s real 0m54.02s user 0m20.07s system 0m25.06s real 0m55.00s user 0m20.42s system Correctness of Thinkfree.apk.oat verified by diffing both of the oat files. Change-Id: I5ff7b85ffe86c57d3434294ca7a621a695bf57a9
468532ea115657709bc32ee498e701a4c71762d4	05-Aug-2013	Ian Rogers <irogers@google.com>	Entry point clean up. Create set of entry points needed for image methods to avoid fix-up at load time: - interpreter - bridge to interpreter, bridge to compiled code - jni - dlsym lookup - quick - resolution and bridge to interpreter - portable - resolution and bridge to interpreter Fix JNI work around to use JNI work around argument rewriting code that'd been accidentally disabled. Remove abstract method error stub, use interpreter bridge instead. Consolidate trampoline (previously stub) generation in generic helper. Simplify trampolines to jump directly into assembly code, keeps stack crawlable. Dex: replace use of int with ThreadOffset for values that are thread offsets. Tidy entry point routines between interpreter, jni, quick and portable. Change-Id: I52a7c2bbb1b7e0ff8a3c3100b774212309d0828e (cherry picked from commit 848871b4d8481229c32e0d048a9856e5a9a17ef9)
848871b4d8481229c32e0d048a9856e5a9a17ef9	05-Aug-2013	Ian Rogers <irogers@google.com>	Entry point clean up. Create set of entry points needed for image methods to avoid fix-up at load time: - interpreter - bridge to interpreter, bridge to compiled code - jni - dlsym lookup - quick - resolution and bridge to interpreter - portable - resolution and bridge to interpreter Fix JNI work around to use JNI work around argument rewriting code that'd been accidentally disabled. Remove abstract method error stub, use interpreter bridge instead. Consolidate trampoline (previously stub) generation in generic helper. Simplify trampolines to jump directly into assembly code, keeps stack crawlable. Dex: replace use of int with ThreadOffset for values that are thread offsets. Tidy entry point routines between interpreter, jni, quick and portable. Change-Id: I52a7c2bbb1b7e0ff8a3c3100b774212309d0828e
7934ac288acfb2552bb0b06ec1f61e5820d924a4	26-Jul-2013	Brian Carlstrom <bdc@google.com>	Fix cpplint whitespace/comments issues Change-Id: Iae286862c85fb8fd8901eae1204cd6d271d69496
2ce745c06271d5223d57dbf08117b20d5b60694a	18-Jul-2013	Brian Carlstrom <bdc@google.com>	Fix cpplint whitespace/braces issues Change-Id: Ide80939faf8e8690d8842dde8133902ac725ed1a
7940e44f4517de5e2634a7e07d58d0fb26160513	12-Jul-2013	Brian Carlstrom <bdc@google.com>	Create separate Android.mk for main build targets The runtime, compiler, dex2oat, and oatdump now are in separate trees to prevent dependency creep. They can now be individually built without rebuilding the rest of the art projects. dalvikvm and jdwpspy were already this way. Builds in the art directory should behave as before, building everything including tests. Change-Id: Ic6b1151e5ed0f823c3dd301afd2b13eb2d8feb81