History log of /art/compiler/dex/quick/x86/target_x86.cc
Revision	Date	Author	Comments
3d21bdf8894e780d349c481e5c9e29fe1556051c	22-Apr-2015	Mathieu Chartier <mathieuc@google.com>	Move mirror::ArtMethod to native Optimizing + quick tests are passing, devices boot. TODO: Test and fix bugs in mips64. Saves 16 bytes per most ArtMethod, 7.5MB reduction in system PSS. Some of the savings are from removal of virtual methods and direct methods object arrays. Bug: 19264997 (cherry picked from commit e401d146407d61eeb99f8d6176b2ac13c4df1e33) Change-Id: I622469a0cfa0e7082a2119f3d6a9491eb61e3f3d Fix some ArtMethod related bugs Added root visiting for runtime methods, not currently required since the GcRoots in these methods are null. Added missing GetInterfaceMethodIfProxy in GetMethodLine, fixes --trace run-tests 005, 044. Fixed optimizing compiler bug where we used a normal stack location instead of double on ARM64, this fixes the debuggable tests. TODO: Fix JDWP tests. Bug: 19264997 Change-Id: I7c55f69c61d1b45351fd0dc7185ffe5efad82bd3 ART: Fix casts for 64-bit pointers on 32-bit compiler. Bug: 19264997 Change-Id: Ief45cdd4bae5a43fc8bfdfa7cf744e2c57529457 Fix JDWP tests after ArtMethod change Fixes Throwable::GetStackDepth for exception event detection after internal stack trace representation change. Adds missing ArtMethod::GetInterfaceMethodIfProxy call in case of proxy method. Bug: 19264997 Change-Id: I363e293796848c3ec491c963813f62d868da44d2 Fix accidental IMT and root marking regression Was always using the conflict trampoline. Also included fix for regression in GC time caused by extra roots. Most of the regression was IMT. Fixed bug in DumpGcPerformanceInfo where we would get SIGABRT due to detached thread. EvaluateAndApplyChanges: From ~2500 -> ~1980 GC time: 8.2s -> 7.2s due to 1s less of MarkConcurrentRoots Bug: 19264997 Change-Id: I4333e80a8268c2ed1284f87f25b9f113d4f2c7e0 Fix bogus image test assert Previously we were comparing the size of the non moving space to size of the image file. Now we properly compare the size of the image space against the size of the image file. Bug: 19264997 Change-Id: I7359f1f73ae3df60c5147245935a24431c04808a [MIPS64] Fix art_quick_invoke_stub argument offsets. ArtMethod reference's size got bigger, so we need to move other args and leave enough space for ArtMethod* and 'this' pointer. This fixes mips64 boot. Bug: 19264997 Change-Id: I47198d5f39a4caab30b3b77479d5eedaad5006ab
848f70a3d73833fc1bf3032a9ff6812e429661d9	15-Jan-2014	Jeff Hao <jeffhao@google.com>	Replace String CharArray with internal uint16_t array. Summary of high level changes: - Adds compiler inliner support to identify string init methods - Adds compiler support (quick & optimizing) with new invoke code path that calls method off the thread pointer - Adds thread entrypoints for all string init methods - Adds map to verifier to log when receiver of string init has been copied to other registers. used by compiler and interpreter Change-Id: I797b992a8feb566f9ad73060011ab6f51eb7ce01
2cebb24bfc3247d3e9be138a3350106737455918	22-Apr-2015	Mathieu Chartier <mathieuc@google.com>	Replace NULL with nullptr Also fixed some lines that were too long, and a few other minor details. Change-Id: I6efba5fb6e03eb5d0a300fddb2a75bf8e2f175cb
1961b609bfefaedb71cee3651c4f931cc3e7393d	08-Apr-2015	Vladimir Marko <vmarko@google.com>	Quick: PC-relative loads from dex cache arrays on x86. Rewrite all PC-relative addressing on x86 and implement PC-relative loads from dex cache arrays. Don't adjust the base to point to the start of the method, let it point to the anchor, i.e. the target of the "call +0" insn. Change-Id: Ic22544a8bc0c5e49eb00a75154dc8f3ead816989
1109fb3cacc8bb667979780c2b4b12ce5bb64549	07-Apr-2015	David Srbecky <dsrbecky@google.com>	Implement CFI for Quick. CFI is necessary for stack unwinding in gdb, lldb, and libunwind. Change-Id: Ic3b84c9dc91c4bae80e27cda02190f3274e95ae8
8c57831b2b07185ee1986b9af68a351e1ca584c3	07-Apr-2015	David Srbecky <dsrbecky@google.com>	Remove the old CFI infrastructure. Change-Id: I12a17a8a1c39ffccaa499c328ebac36e4d74dc4e
dc56cc509d8e1718ad321f7a91661dbe85ec8cef	27-Mar-2015	Vladimir Marko <vmarko@google.com>	PC-relative loads from dex cache arrays for x86-64. Change-Id: I6cfe22c7e69512b3c0f95b073aaa572db74ec189
f6737f7ed741b15cfd60c2530dab69f897540735	23-Mar-2015	Vladimir Marko <vmarko@google.com>	Quick: Clean up Mir2Lir codegen. Clean up WrapPointer()/UnwrapPointer() and OpPcRelLoad(). Change-Id: I1a91f01e1e779599c77f3f6efcac2a6ad34629cf
085b733d15ec09afa27b85358acb89d9bc02e843	24-Feb-2015	Maxim Kazantsev <maxim.kazantsev@intel.com>	ART: AddVectorReduce should store result in memory carefully When generating AddVectorReduce, in some circumstances we add the value reduced from the vector directly to memory. We must ensure that local LIR optimizations are aware of it. Change-Id: I8fe19939f67dcd184b08f63026b0da18007d34b8
80b96d1a76790527f72a660ac03d9c215eed17ce	19-Feb-2015	Vladimir Marko <vmarko@google.com>	Replace a few std::vector with ArenaVector in Mir2Lir. Change-Id: I7867d60afc60f57cdbbfd312f02883854d65c805
b3cdf93d70256c4b0a9f6ed55ba4601f8c70bad4	27-Jan-2015	Mark Mendell <mark.p.mendell@intel.com>	ART: Fix to X86Mir2Lir::GenReduceVector When generating the result to memory, the existing code didn't set the aliasing correctly. Mark the result as going to a Dalvik VR, and mark it as only a write. Change-Id: I12f3156b7f84548b320a4fc142ff5a87a14e73d1 Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>
3e6a3bf797e49b7f449256455c7e522e888687d8	19-Jan-2015	Mark Mendell <mark.p.mendell@intel.com>	ART: Change x86 long param ABI (Quick/JNI/Opt) Ensure that we don't pass a long parameter across the last register and the stack: skip the register and allocate it only on the stack. This was requested to simplify the optimizing compiler code generation for x86. Optimizing (Baseline) compiler support for x86 longs: - Remove QuickParameter from Location, as there are no longer any uses of it. Bump oat.h version because we changed an ABI again. I changed IsParamALong() to return false for argument 0 (this argument). I am not sure why it differed from all other tests. I have not tested on ARM. I followed Nicolas's suggestions for setting the value of kSplitPairAcrossRegisterAndStack for different architectures. Change-Id: I2f16b33c1dac58dd4f4f503e9c2309d845f5fb7a Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>
966c3ae95d3c699ee9fbdbccc1acdaaf02325faf	27-Jan-2015	Mark P Mendell <mark.p.mendell@intel.com>	Revert "Revert "ART: Implement X86 hard float (Quick/JNI/Baseline)"" This reverts commit 949c91fb91f40a4a80b2b492913cf8541008975e. This time, don't clobber EBX before saving it. Redo some of the macros to make register usage explicit. Change-Id: I8db8662877cd006816e16a28f42444ab7c36bfef
949c91fb91f40a4a80b2b492913cf8541008975e	27-Jan-2015	Vladimir Marko <vmarko@google.com>	Revert "ART: Implement X86 hard float (Quick/JNI/Baseline)" And the 3 Mac build fixes. Fix conflicts in context_x86.* . This reverts commits 3d2c8e74c27efee58e24ec31441124f3f21384b9 , 34eda1dd66b92a361797c63d57fa19e83c08a1b4 , f601d1954348b71186fa160a0ae6a1f4f1c5aee6 , bc503348a1da573488503cc2819c9e30807bea31 . Bug: 19150481 Change-Id: I6650ee30a7d261159380fe2119e14379e4dc9970
0b9203e7996ee1856f620f95d95d8a273c43a3df	23-Jan-2015	Andreas Gampe <agampe@google.com>	ART: Some Quick cleanup Make several fields const in CompilationUnit. May benefit some Mir2Lir code that repeats tests, and in general immutability is good. Remove compiler_internals.h and refactor some other headers to reduce overly broad imports (and thus forced recompiles on changes). Change-Id: I898405907c68923581373b5981d8a85d2e5d185a
24c846a0df02d4cc2ef8a9c476305dca96be40db	26-Jan-2015	Vladimir Marko <vmarko@google.com>	Quick: Fix range check for intrinsic String.charAt() on x86. Bug: 19125146 (cherry picked from commit 00ca84730a21578dcc6b47bd8e08b78ab9b2dded) Change-Id: I67184371597fdcc9d9186172c1cff4efd3ca3093
00ca84730a21578dcc6b47bd8e08b78ab9b2dded	26-Jan-2015	Vladimir Marko <vmarko@google.com>	Quick: Fix range check for intrinsic String.charAt() on x86. Bug: 19125146 Change-Id: I274190a7a60cd2e29a854738ed1ec99a9e611969
3d2c8e74c27efee58e24ec31441124f3f21384b9	13-Jan-2015	Mark Mendell <mark.p.mendell@intel.com>	ART: Implement X86 hard float (Quick/JNI/Baseline) Use XMM0-XMM3 as parameter registers for float/double on X86. X86_64 already uses XMM0-XMM7 for parameters. Change the 'hidden' argument register from XMM0 to XMM7 to avoid a conflict. Add support for FPR save/restore in runtime/arch/x86. Minimal support for Optimizing baseline compiler. Bump the version in runtime/oat.h because this is an ABI change. Change-Id: Ia6fe150e8488b9e582b0178c0dda65fc81d5a8ba Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>
0f9b03c65e0ee8bdc5ddf58af100f5fc356cc98b	12-Jan-2015	Vladimir Marko <vmarko@google.com>	Revert "ART: Implement hard float for X86" This reverts commit 59b9cf7ec0ccc13df91be0bd5c723b8c52410739. Change-Id: I08333b528032480def474286dc368d916a07e17f
59b9cf7ec0ccc13df91be0bd5c723b8c52410739	09-Jan-2015	Mark Mendell <mark.p.mendell@intel.com>	ART: Implement hard float for X86 Use XMM0-XMM3 as parameter registers for float/double on X86. X86_64 already uses XMM0-XMM7 for parameters. Change the 'hidden' argument register from XMM0 to XMM7 to avoid a conflict. This change was requested to simplify the Optimizing compiler implementation. Change-Id: I89ba8ade99b9a8a5b1ad1ee5f5cbfd33d656bfaa Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>
bfe400bb1a28cde991cdb3e39bc27bae6b04b8c2	19-Dec-2014	Vladimir Marko <vmarko@google.com>	Fix running out of temps when storing invoke-interface result. On ARM, after emitting invoke-interface we didn't have any free temps to use for storing the result, so we would crash if the result was an unpromoted dalvik register with stack location too far from SP. Bug: 18769895 (cherry picked from commit d6bd06c713e8ec69de96510ef57bdf7adb4781ed) Change-Id: Id88f6f3788eaf6ecbc7bd68880b445423f6e4f94
d6bd06c713e8ec69de96510ef57bdf7adb4781ed	19-Dec-2014	Vladimir Marko <vmarko@google.com>	Fix running out of temps when storing invoke-interface result. On ARM, after emitting invoke-interface we didn't have any free temps to use for storing the result, so we would crash if the result was an unpromoted dalvik register with stack location too far from SP. Bug: 18769895 Change-Id: Ie6c131d68f1853a8317b305a22eab22faea80e90
6f5f5d05caed8465ad15ca5728e2a30c7a080d94	07-Dec-2014	Maxim Kazantsev <maxim.kazantsev@intel.com>	ART: Implement FP packed reduce for x86 This patch implements correct FP vector reduction by index. Previous implementation corresponded to packed add reduction. Change-Id: I02a9bcb8e8945937ba7a511b723f23ec30667d34
ca5413403192022d734ce76fda9a84aa63eb9148	15-Oct-2014	Mark Mendell <mark.p.mendell@intel.com>	ART: Ensure FP GET/PUT doesn't use Core register Routine void org.jbox2d.collision.AABB.combine( org.jbox2d.collision.AABB, org.jbox2d.collision.AABB) in the icyrocks application generated code for an iget of a FP field that was loaded into a Core register, and then into an XMM register. This was caused by the Dex code: 0x0030: iget v2, v2, F org.jbox2d.common.Vec2.x // field@3747 I traced this to GenIGet using a reg_class of kAnyReg, and EvalLoc finding that v2 was available in EDX. Since kAnyReg is compatible with EDX, the iget loaded the FP value into EDX, and then into an XMM register for subsequent use. Fix: Pass kSingle/kDouble into IGET/IPUT/SGET/SPUT/AGET/APUT when the source/destination is FP. Change X86Mir2Lir::RegClassForFieldLoadStore to return kFPReg for those cases. This causes EvalLoc to return an XMM register, and the load is done right to the XMM register. Change-Id: Ifbcc9e4d80bc6da8ea4ebf7e6cebaaf672a2766e Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>
717a3e447c6f7a922cf9c3efe522747a187a045d	13-Nov-2014	Serguei Katkov <serguei.i.katkov@intel.com>	Re-factor Quick ABI support Now every architecture must provide a mapper between VR parameters and physical registers. Additionally, an architecture can provide a bulk copy helper for the GenDalvikArgs utility. Everything else becomes common code: GetArgMappingToPhysicalReg, GenDalvikArgsNoRange, GenDalvikArgsRange, FlushIns. The mapper now uses the shorty representation of input parameters. This is required because locations alone are not enough to detect the type of a parameter (fp or core). For the details see https://android-review.googlesource.com/#/c/113936/. Change-Id: Ie762b921e0acaa936518ee6b63c9a9d25f83e434 Signed-off-by: Serguei Katkov <serguei.i.katkov@intel.com>
27dee8bcd7b4a53840b60818da8d2c819ef199bd	02-Dec-2014	Mark Mendell <mark.p.mendell@intel.com>	X86_64 QBE: use RIP addressing Take advantage of RIP addressing in 64 bit mode to improve the code generation for accesses to the constant area as well as packed switches. Avoid computing the address of the start of the method, which is needed in 32 bit mode. To do this, we add a new 'pseudo-register' kRIPReg to minimize the changes needed to get the new addressing mode to be generated. Change-Id: Ia28c93f98b09939806d91ff0bd7392e58996d108 Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>
b72c723bfb21e05cb9b0a7999db805df93fcaee8	29-Oct-2014	Razvan A Lupusoru <razvan.a.lupusoru@intel.com>	ART: X86 vectorized reduce may use incorrect extract index In the case of reduction to memory VR, the extract index is ignored. However, it should not be ignored because it is needed for pextr instruction. Change-Id: I46a0c76218a0553e677225e403786522c079d27d Signed-off-by: Razvan A Lupusoru <razvan.a.lupusoru@intel.com>
8366ca0d7ba3b80a2d5be65ba436446cc32440bd	17-Nov-2014	Elliott Hughes <enh@google.com>	Fix the last users of TARGET_CPU_SMP. Everyone else assumes SMP. Change-Id: I7ff7faef46fbec6c67d6e446812d599e473cba39
2d7210188805292e463be4bcf7a133b654d7e0ea	10-Nov-2014	Mathieu Chartier <mathieuc@google.com>	Change 64 bit ArtMethod fields to be pointer sized Changed the 64 bit entrypoint and gc map fields in ArtMethod to be pointer sized. This saves a large amount of memory on 32 bit systems. Reduces ArtMethod size by 16 bytes on 32 bit. Total number of ArtMethod on low memory mako: 169957 Image size: 49203 methods -> 787248 image size reduction. Zygote space size: 1070 methods -> 17120 size reduction. App methods: ~120k -> 2 MB savings. Savings per app on low memory mako: 125K+ per app (less active apps -> more image methods per app). Savings depend on how often the shared methods are on dirty pages vs shared. TODO in another CL, delete gc map field from ArtMethod since we should be able to get it from the Oat method header. Bug: 17643507 Change-Id: Ie9508f05907a9f693882d4d32a564460bf273ee8 (cherry picked from commit e832e64a7e82d7f72aedbd7d798fb929d458ee8f)
b28c1c06236751aa5c9e64dcb68b3c940341e496	08-Nov-2014	Ian Rogers <irogers@google.com>	Tidy RegStorage for X86. Don't use global variables initialized in constructors to hold onto constant values, instead use the TargetReg32 helper. Improve this helper with the use of lookup tables. Elsewhere prefer to use constexpr values as they will have less runtime cost. Add an ostream operator to RegStorage for CHECK_EQ and use. Change-Id: Ib8d092d46c10dac5909ecdff3cc1e18b7e9b1633
277ccbd200ea43590dfc06a93ae184a765327ad0	04-Nov-2014	Andreas Gampe <agampe@google.com>	ART: More warnings Enable -Wno-conversion-null, -Wredundant-decls and -Wshadow in general, and -Wunused-but-set-parameter for GCC builds. Change-Id: I81bbdd762213444673c65d85edae594a523836e5
6a3c1fcb4ba42ad4d5d142c17a3712a6ddd3866f	31-Oct-2014	Ian Rogers <irogers@google.com>	Remove -Wno-unused-parameter and -Wno-sign-promo from base cflags. Fix associated errors about unused parameters and implicit sign conversions. For sign conversion this was largely in the area of enums, so add ostream operators for the affected enums and fix tools/generate-operator-out.py. Tidy arena allocation code and arena allocated data types, rather than fixing new and delete operators. Remove dead code. Change-Id: I5b433e722d2f75baacfacae4d32aef4a828bfe1b
5c5676b26a08454b3f0133783778991bbe5dd681	30-Sep-2014	Razvan A Lupusoru <razvan.a.lupusoru@intel.com>	ART: Add div/rem zero check elimination flag Just as with other throwing bytecodes, it is possible to prove in some cases that a divide/remainder won't throw ArithmeticException. For example, in case two divides with same denominator are in order, then provably the second one cannot throw if the first one did not. This patch adds the elimination flag and updates the signature of several Mir2Lir methods to take the instruction optimization flags into account. Change-Id: I0b078cf7f29899f0f059db1f14b65a37444b84e8 Signed-off-by: Razvan A Lupusoru <razvan.a.lupusoru@intel.com>
7e70b002c4552347ed1af8c002a0e13f08864f20	08-Oct-2014	Ian Rogers <irogers@google.com>	Header file clean up. Remove runtime.h from object.h. Move TypeStaticIf to its own header file to avoid bringing utils.h into allocator.h. Move Array::DataOffset into -inl.h as it now has a utils.h dependency. Fix include issues arising from this. Change-Id: I4605b1aa4ff5f8dc15706a0132e15df03c7c8ba0
db7239ccce7748f2b494fb3b91c128b37019a093	17-Sep-2014	avignate <aleksey.v.ignatenko@intel.com>	ART: Overflow of bound check in ArrayCopy intrinsic The System.arraycopy method is implemented as an intrinsic on x86. Its bounds check can overflow when the sum of the array offset and the number of elements to be copied exceeds MAX_INT. For methods like CarArrayBuffer.get this means that no OutOfBound exception is thrown. The proposed solution fixes that. b/17711775 Signed-off-by: avignate <aleksey.v.ignatenko@intel.com> (cherry picked from commit f9f0ed401f7fe4138a71b36719423b908a3b7bfb) Change-Id: I1d4ca900df262d483a94ebea8fa686ea361772c8
02ff2d4187249d26fabe8e5eacc27b99984ee353	04-Sep-2014	Serguei Katkov <serguei.i.katkov@intel.com>	AddIntrinsicSlowPath with resume requires clobbering AddIntrinsicSlowPath with resume results in a call. So all temps must be clobbered at the point where AddIntrinsicSlowPath returns. (cherry-picked from 9863daf4fdc1a08339edac794452dbc719aef4f1) Change-Id: If9eb887e295ff5e59920f4da1cef63258ad490b0 Signed-off-by: Serguei Katkov <serguei.i.katkov@intel.com>
53cec00aa6789382621a53b33b13f45bd27148ca	26-Sep-2014	Udayan Banerji <udayan.banerji@intel.com>	ART: Fix GenReduceVector and GenSetVector For GenReduceVector: We now correctly load non-wide values for non-wide destination registers, and generate reg-reg and reg-mem forms of pextr correctly. For GenSetVector: We use the correct opcode from loading into an xmm from a 64-bit GPR Change-Id: I0a01d1f0b12b32a0dee8f79a0139ffcf6d6cb4d5 Signed-off-by: Udayan Banerji <udayan.banerji@intel.com>
f4da675bbc4615c5f854c81964cac9dd1153baea	01-Aug-2014	Vladimir Marko <vmarko@google.com>	Implement method calls using relative BL on ARM. Store the linker patches with each CompiledMethod instead of keeping them in CompilerDriver. Reorganize oat file creation to apply the patches as we're writing the method code. Add framework for platform-specific relative call patches in the OatWriter. Implement relative call patches for ARM. Change-Id: Ie2effb3d92b61ac8f356140eba09dc37d62290f8
e39c54ea575ec710d5e84277fcdcc049f8acb3c9	22-Sep-2014	Vladimir Marko <vmarko@google.com>	Deprecate GrowableArray, use ArenaVector instead. Purge GrowableArray from Quick and Portable. Remove GrowableArray<T>::Iterator. Change-Id: I92157d3a6ea5975f295662809585b2dc15caa1c6
f9f0ed401f7fe4138a71b36719423b908a3b7bfb	17-Sep-2014	avignate <aleksey.v.ignatenko@intel.com>	ART: Overflow of bound check in ArrayCopy intrinsic The System.arraycopy method is implemented as an intrinsic on x86. Its bounds check can overflow when the sum of the array offset and the number of elements to be copied exceeds MAX_INT. For methods like CarArrayBuffer.get this means that no OutOfBound exception is thrown. The proposed solution fixes that. Change-Id: Id16a26163a61d934b862a8729a52ca5c1a56caec Signed-off-by: avignate <aleksey.v.ignatenko@intel.com>
0a1174efd81fc25110ad106a84063c62af9ce7e5	11-Sep-2014	Mark Mendell <mark.p.mendell@intel.com>	X86 QBE: Make some X86 routines virtual Add virtual in one place, and move some code into a virtual routine. This allows subclassing and overriding for my purposes. Change-Id: Ie415df943b17b56ad1f057513b2df2a31801a72f Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>
9863daf4fdc1a08339edac794452dbc719aef4f1	04-Sep-2014	Serguei Katkov <serguei.i.katkov@intel.com>	AddIntrinsicSlowPath with resume requires clobbering AddIntrinsicSlowPath with resume results in a call. So all temps must be clobbered at the point where AddIntrinsicSlowPath returns. Change-Id: If9eb887e295ff5e59920f4da1cef63258ad490b0 Signed-off-by: Serguei Katkov <serguei.i.katkov@intel.com>
6dccdc2511c9f22d3cc2ea83386ce9db2688fa19	18-Aug-2014	Maxim Kazantsev <maxim.kazantsev@intel.com>	ART: Reduce LockCallTemps usage Using FlushAllRegs/LockCallTemps in integer arithmetics causes excess register flushing and clobbering. This patch adds API that allows to flush, clobber and lock only those registers we really need for calculations. Change-Id: Idabaa4fff4d18a33e5040a80f66f2df6432f8be0 Signed-off-by: Max Kazantsev <maxim.kazantsev@intel.com>
b3a84e2f308b3ed7d17b8e96fc7adfcac36ebe77	28-Jul-2014	Lupusoru, Razvan A <razvan.a.lupusoru@intel.com>	ART: Vectorization opcode implementation fixes This patch fixes the implementation of the x86 vectorization opcodes. Change-Id: I0028d54a9fa6edce791b7e3a053002d076798748 Signed-off-by: Razvan A Lupusoru <razvan.a.lupusoru@intel.com> Signed-off-by: Udayan Banerji <udayan.banerji@intel.com> Signed-off-by: Philbert Lin <philbert.lin@intel.com>
8d0d03e24325463f0060abfd05dba5598044e9b1	07-Jun-2014	Razvan A Lupusoru <razvan.a.lupusoru@intel.com>	ART: Change temporaries to positive names Changes compiler temporaries to have positive names. The numbering now puts them above the code VRs (locals + ins, in that order). The patch also introduces APIs to query the number of temporaries, locals and ins. The compiler temp infrastructure suffered from several issues which are also addressed by this patch: -There is no longer a queue of compiler temps. This would be polluted with Method* when post opts were called multiple times. -Sanity checks have been added to allow requesting of temps from BE and to prevent temps after frame is committed. -None of the structures holding temps can overflow because they are allocated to allow holding maximum temps. Thus temps can be requested by BE with no problem. -Since the queue of compiler temps is no longer maintained, it is no longer possible to refer to a temp that has invalid ssa (because it was requested before ssa was run). -The BE can now request temps after all ME allocations and it is guaranteed to actually receive them. -ME temps are now treated like normal VRs in all cases with no special handling. Only the BE temps are handled specially because there are no references to them from MIRs. -Deprecated and removed several fields in CompilationUnit that saved register information and updated callsites to call the new interface from MIRGraph. Change-Id: Ia8b1fec9384a1a83017800a59e5b0498dfb2698c Signed-off-by: Razvan A Lupusoru <razvan.a.lupusoru@intel.com> Signed-off-by: Udayan Banerji <udayan.banerji@intel.com>
b5bce7cc9f1130ab4932ba8e6917c362bf871f24	25-Jul-2014	Jean Christophe Beyler <jean.christophe.beyler@intel.com>	ART: Add non-temporal store support Added non-temporal store support as a hint from the ME. Added the implementation of the memory barrier extended instruction that supports non-temporal stores by explicitly serializing all previous store-to-memory instructions. Change-Id: I8205a92083f9725253d8ce893671a133a0b6849d Signed-off-by: Jean Christophe Beyler <jean.christophe.beyler@intel.com> Signed-off-by: Chao-ying Fu <chao-ying.fu@intel.com>
53c913bb71b218714823c8c87a1f92830c336f61	13-Aug-2014	Andreas Gampe <agampe@google.com>	ART: Clean up compiler Clean up the compiler: less extern functions, dis-entangle compilers, hide some compiler specifics, lower global includes. Change-Id: Ibaf88d02505d86994d7845cf0075be5041cc8438
b5874a47a4d2c4d2971116b031b4068021ffda05	19-Aug-2014	Vladimir Marko <vmarko@google.com>	X86: Fix alias info in GenInlinedIndexOf(). For 32-bit X86, GenInlinedIndexOf() pushes and pops EDI. In one branch it then calls Load32Disp() with adjusted stack offset. That calculates wrong alias_info for the generated insn. If left unfixed, this could confuse load hoisting. Bug: 17128502 (cherry picked from commit 74de63bb1cc275b411cae28a96f9b3a78b939bc2) Change-Id: I5dc82b7aae9e9655e75843a952b8ebb04269f46b
74de63bb1cc275b411cae28a96f9b3a78b939bc2	19-Aug-2014	Vladimir Marko <vmarko@google.com>	X86: Fix alias info in GenInlinedIndexOf(). For 32-bit X86, GenInlinedIndexOf() pushes and pops EDI. In one branch it then calls Load32Disp() with adjusted stack offset. That calculates wrong alias_info for the generated insn. If left unfixed, this could confuse load hoisting. Bug: 17128502 Change-Id: I0ea07b8f5e25410e290304f662d5fd5bf66c0933
e3ea83811d47152c00abea24a9b420651a33b496	08-Aug-2014	Yevgeny Rouban <yevgeny.y.rouban@intel.com>	ART source line debug info in OAT files OAT files have source line information enough for ART runtime needs like jump to/from interpreter and thread suspension. But this information is not enough for finer grained source level debugging and low-level profiling (VTune or perf). This patch adds to OAT files two additional sections: .debug_line - DWARF formatted Elf32 section with detailed source line information (mapping from native PC to Java source lines). In addition to the debugging symbols added using the dex2oat option --include-debug-symbols, the source line information is added to the section .debug_line. The source line info can be read by many Elf reading tools like objdump, readelf, dwarfdump, gdb, perf, VTune, ... gdb can use this debug line information in x86. In 64-bit mode the information can be used if the oat file is mapped in the lower address space (address has higher 32 bits zeroed). Relocation works. Testing: 1. art/test/run-test --host --gdb [--64] 001-HelloWorld 2. in gdb: break Main.java:19 3. in gdb: break Runtime.java:111 4. in gdb: run - stops at void java.lang.Runtime.<init>() 5. in gdb: backtrace - shows call stack down to main() 6. in gdb: continue - stops at void Main.main() (only in 32-bit mode) 7. in gdb: backtrace - shows call stack down to main() 8. objdump -W <oat-file> - addresses are from VMA range of .text section reported by objdump -h <file> 9. dwarfdump -ka <oat-file> - no errors expected Size of aosp-x86-eng boot.oat increased by 11% from 80.5Mb to 89.2Mb with two sections added .debug_line (7.2Mb) and .rel.debug (1.5Mb). Change-Id: Ib8828832686e49782a63d5529008ff4814ed9cda Signed-off-by: Yevgeny Rouban <yevgeny.y.rouban@intel.com>
06839f868c9c4bb1f2f6333f9e88a560e80bcad8	15-Aug-2014	Chao-ying Fu <chao-ying.fu@intel.com>	x86_64: Use RegClassBySize() for field accesses This patch optimizes x86_64 field accesses to use kAnyReg whenever possible via RegClassBySize(). Previously, kCoreReg was used, which is too strict. Change-Id: I55a48765b9bfe6b11c4b09f85c4eb08a6e269f98 Signed-off-by: Chao-ying Fu <chao-ying.fu@intel.com>
b0f05b9654eb005bc8c8e15f615a7f5a312f640c	17-Jul-2014	Dave Allison <dallison@google.com>	Add implicit checks for x86_64 architecture. This combines the x86 and x86_64 fault handlers into one. It also merges in the change to the entrypoints for X86_64. Replaces generic instruction length calculator with one that only works with the specific instructions we use. Bug: 16256184 Change-Id: I1e8ab5ad43f46060de9597615b423c89a836035c Signed-off-by: Chao-ying Fu <chao-ying.fu@intel.com>
8bd698fb785b58302be684efcbb24a0b8c6535d7	01-Aug-2014	nikolay serdjuk <nikolay.y.serdjuk@intel.com>	x86: A couple of minor changes for String.indexOf() inlining 1. Removed sequence of FlushReg + Clobber + LockTemp for particular registers and added FlushAllRegs. I believe it will make sources more readable and not affect the performance too much. 2. Made MarkPossibleNullPointerException call unconditional Change-Id: I817f77718e15ec76cae35cf9fb04c0e5dcfb2d16
e70f179aca4f13b15be8a47a4d9e5b6c2422c69a	09-Aug-2014	Haitao Feng <haitao.feng@intel.com>	ART: Fix two small DumpLIRInsn issues for x86_64 port. Change-Id: I81ef32380bfc73d6c2bfc37a7f4903d912a5d9c8 Signed-off-by: Haitao Feng <haitao.feng@intel.com>
dfd3b47813c14c5f1607cbe7b10a28b1b2f29cbc	17-Jul-2014	Dave Allison <dallison@google.com>	Add implicit checks for x86_64 architecture. This combines the x86 and x86_64 fault handlers into one. It also merges in the change to the entrypoints for X86_64. Replaces generic instruction length calculator with one that only works with the specific instructions we use. Bug: 16256184 Change-Id: I1e8ab5ad43f46060de9597615b423c89a836035c Signed-off-by: Chao-ying Fu <chao-ying.fu@intel.com>
79273802f2b788bcd3eb76edf4df1bcaa57f886f	06-Aug-2014	Andreas Gampe <agampe@google.com>	ART: Rework CFA frame initialization and writing code Move eh_frame initialization code and CFI writing code to elf_writer_quick to remove hard-wired dependencies on specific Quick-compiler backends. Change-Id: I27ee8ce7245da33a20c90e0086b8d4fd0a2baf4d
e7f82e2515f47f3c3292281312d7031a34a58ffc	06-Aug-2014	Fred Shih <ffred@google.com>	Added support for patching classes from different dex files. Added support for class patching from different dex files and moved ScopedObjectAccess from the quick compiler to driver. Slight refactoring for clarity. Bug: 16656190 Change-Id: I107fcbce75db42ca61321ea1c5d5f236680a1b3d
547cdfd21ee21e4ab9ca8692d6ef47c62ee7ea52	05-Aug-2014	Tong Shen <endlessroad@google.com>	Emit CFI for x86 & x86_64 JNI compiler. Now for host-side x86 & x86_64 ART, we are able to get complete stacktrace with even mixed C/C++ & Java stack frames. Testing: 1. art/test/run-test --host --gdb [--64] --no-relocate 005 2. In gdb, run 'b art::Class_classForName' which is implementation of a Java native method, then 'r' 3. In gdb, run 'bt'. You should see stack frames down to main() Change-Id: I2d17e9aa0f6d42d374b5362a15ea35a2fce96302
5a5e85693b1d5952d88377be5826068b67b0dcec	18-Jul-2014	DaniilSokolov <daniil.y.sokolov@intel.com>	ART: Enable x86_64 bit support for intrinsic for System.arraycopy(char[], ..) Implements x86_64 support for intrinsic for java.lang.System.arraycopy(char[], int, char[], int, int). With this fix the intrinsic works on x86 and x86_64 architectures. Change-Id: Icc2889ccd0cf7d821522abb7437893e3149e7c99 Signed-off-by: Daniil Sokolov <daniil.y.sokolov@intel.com>
6bbf0967d217ab2b7bdbb78bfd076b8fb07a44e8	14-Jul-2014	Alexei Zavjalov <alexei.zavjalov@intel.com>	ART: Implement the easy long division/remainder by a constant Also optimizes long/int divisions by power-of-two values. Also do some clean-up. Change-Id: Ie414e64aac251c81361ae107d157c14439e6dab5 Signed-off-by: Alexei Zavjalov <alexei.zavjalov@intel.com>
bda2722ba62e5be9f9fd6a6eb0db8259bb383629	31-Jul-2014	Andreas Gampe <agampe@google.com>	ART: Build fix Make lint happy, as comments should be separated from code. Change-Id: I4bfd88357302be9a6a104f1152e3b1fda386371e
35e1e6ad4b50f1adbe9f93fe467766f042491896	30-Jul-2014	Tong Shen <endlessroad@google.com>	1. Fix CFI for quick compiled code in x86 & x86_64; 2. Emit CFI in .eh_frame instead of .debug_frame. With CFI, we can correctly unwind past quick generated code. Now gdb should unwind to main() for both x86 & x86_64 host-side ART. Note that it does not work with relocation yet. Testing: 1. art/test/run-test --host --gdb [--64] --no-relocate 005 2. In gdb, run 'b art_quick_invoke_stub', then 'r', then 'c' a few times 3. In gdb, run 'bt'. You should see stack frames down to main() Change-Id: I5350d4097dc3d360a60cb17c94f1d02b99bc58bb
984305917bf57b3f8d92965e4715a0370cc5bcfb	28-Jul-2014	Andreas Gampe <agampe@google.com>	ART: Rework quick entrypoint code in Mir2Lir, cleanup To reduce the complexity of calling trampolines in generic code, introduce an enumeration for entrypoints. Introduce a header that lists the entrypoint enum and exposes a templatized method that translates an enum value to the corresponding thread offset value. Call helpers are rewritten to have an enum parameter instead of the thread offset. Also rewrite LoadHelper and GenConversionCall this way. It is now LoadHelper's duty to select the right thread offset size. Introduce InvokeTrampoline virtual method to Mir2Lir. This allows to further simplify the call helpers, as well as make OpThreadMem specific to X86 only (removed from Mir2Lir). Make GenInlinedCharAt virtual, move a copy to X86 backend, and simplify both copies. Remove LoadBaseIndexedDisp and OpRegMem from Mir2Lir, as they are now specific to X86 only. Remove StoreBaseIndexedDisp from Mir2Lir, as it was only ever used in the X86 backend. Remove OpTlsCmp from Mir2Lir, as it was only ever used in the X86 backend. Remove OpLea from Mir2Lir, as it was only ever defined in the X86 backend. Remove GenImmedCheck from Mir2Lir as it was neither used nor implemented. Change-Id: If0a6182288c5d57653e3979bf547840a4c47626e
147eb41b53729ec8d5c188d1cac90964a51afb8a	11-Jul-2014	Dave Allison <dallison@google.com>	Revert "Revert "Revert "Revert "Add implicit null and stack checks for x86"""" This reverts commit 0025a86411145eb7cd4971f9234fc21c7b4aced1. Bug: 16256184 Change-Id: Ie0760a0c293aa3b62e2885398a8c512b7a946a73 Conflicts: compiler/dex/quick/arm64/target_arm64.cc compiler/image_test.cc runtime/fault_handler.cc
c3561ae381960cbd52a83b7591504f158ec06920	17-Jul-2014	nikolay serdjuk <nikolay.y.serdjuk@intel.com>	Improved implementation of inline of String.indexOf This version pushes EDI only once and only in 32-bit mode. Change-Id: I4e871d3531ac539536f8f53ec09ffb664409c9cc
8e3acdd132aef1391676a5db2696804900aacd8e	14-Jul-2014	Serguei Katkov <serguei.i.katkov@intel.com>	x86_64: Fix GenDalvikArgsRange for 64-bit ref A 32-bit virtual register can be in a 64-bit solo register, so we should not compute the size of a virtual register based on the size of the physical register. Change-Id: I4e11be13df8469be63808d0ce9d1ca6f80bef483 Signed-off-by: Serguei Katkov <serguei.i.katkov@intel.com>
69dfe51b684dd9d510dbcb63295fe180f998efde	11-Jul-2014	Dave Allison <dallison@google.com>	Revert "Revert "Revert "Revert "Add implicit null and stack checks for x86"""" This reverts commit 0025a86411145eb7cd4971f9234fc21c7b4aced1. Bug: 16256184 Change-Id: Ie0760a0c293aa3b62e2885398a8c512b7a946a73
d9cb8ae2ed78f957a773af61759432d7a7bf78af	09-Jul-2014	Douglas Leung <douglas@mips.com>	Fix art test failures for Mips. This patch fixes the following art test failures for Mips: 003-omnibus-opcodes 030-bad-finalizer 041-narrowing 059-finalizer-throw Change-Id: I4e0e9ff75f949c92059dd6b8d579450dc15f4467 Signed-off-by: Douglas Leung <douglas@mips.com>
af263df7f643e699abf622c64447d31bacc14c34	12-Jul-2014	Andreas Gampe <agampe@google.com>	ART: Change GenPCUseDefEncoding(), turn on Load Hoisting for ARM64 This defines the PC resource mask as empty, as the PC is not accessible on ARM64. Unify code paths with x86 in LoadStoreElimination and LoadHoisting. Change-Id: Iea8b9e666f306c7a6ff52b6c5bf7e05b35346b2c
48f5c47907654350ce30a8dfdda0e977f5d3d39f	27-Jun-2014	Hans Boehm <hboehm@google.com>	Replace memory barriers to better reflect Java needs. Replaces barriers that enforce ordering of one access type (e.g. Load) with respect to another (e.g. store) with more general ones that better reflect both Java requirements and actual hardware barrier/fence instructions. The old code was inconsistent and unclear about which barriers implied which others. Sometimes multiple barriers were generated and then eliminated; sometimes it was assumed that certain barriers implied others. The new barriers closely parallel those in C++11, though, for now, we use something closer to the old naming. Bug: 14685856 Change-Id: Ie1c80afe3470057fc6f2b693a9831dfe83add831
ccc60264229ac96d798528d2cb7dbbdd0deca993	05-Jul-2014	Andreas Gampe <agampe@google.com>	ART: Rework TargetReg(symbolic_reg, wide) Make the standard implementation in Mir2Lir and the specialized one in the x86 backend return a pair when wide = "true". Introduce WideKind enumeration to improve code readability. Simplify generic code based on this implementation. Change-Id: I670d45aa2572eedfdc77ac763e6486c83f8e26b4
7fb36ded9cd5b1d254b63b3091f35c1e6471b90e	10-Jul-2014	Dave Allison <dallison@google.com>	Revert "Revert "Add implicit null and stack checks for x86"" Fixes x86_64 cross compile issue. Removes command line options and property to set implicit checks - this is hard coded now. This reverts commit 3d14eb620716e92c21c4d2c2d11a95be53319791. Change-Id: I5404473b5aaf1a9c68b7181f5952cb174d93a90d
c380191f3048db2a3796d65db8e5d5a5e7b08c65	08-Jul-2014	Serguei Katkov <serguei.i.katkov@intel.com>	x86_64: Enable fp-reg promotion Patch introduces 4 register XMM12-15 available for promotion of fp virtual registers. Change-Id: I3f89ad07fc8ae98b70f550eada09be7b693ffb67 Signed-off-by: Serguei Katkov <serguei.i.katkov@intel.com> Signed-off-by: Chao-ying Fu <chao-ying.fu@intel.com>
0025a86411145eb7cd4971f9234fc21c7b4aced1	11-Jul-2014	Nicolas Geoffray <ngeoffray@google.com>	Revert "Revert "Revert "Add implicit null and stack checks for x86""" Broke the build. This reverts commit 7fb36ded9cd5b1d254b63b3091f35c1e6471b90e. Change-Id: I9df0e7446ff0913a0e1276a558b2ccf6c8f4c949
34e826ccc80dc1cf7c4c045de6b7f8360d504ccf	29-May-2014	Dave Allison <dallison@google.com>	Add implicit null and stack checks for x86 This adds compiler and runtime changes for x86 implicit checks. 32 bit only. Both host and target are supported. By default, on the host, the implicit checks are null pointer and stack overflow. Suspend is implemented but not switched on. Change-Id: I88a609e98d6bf32f283eaa4e6ec8bbf8dc1df78a
3d14eb620716e92c21c4d2c2d11a95be53319791	10-Jul-2014	Dave Allison <dallison@google.com>	Revert "Add implicit null and stack checks for x86" It breaks cross compilation with x86_64. This reverts commit 34e826ccc80dc1cf7c4c045de6b7f8360d504ccf. Change-Id: I34ba07821fc0a022fda33a7ae21850957bbec5e7
60bfe7b3e8f00f0a8ef3f5d8716adfdf86b71f43	09-Jul-2014	Udayan Banerji <udayan.banerji@intel.com>	X86 Backend support for vectorized float and byte 16x16 operations Add support for reserving vector registers for the duration of vector loop. Add support for 16x16 multiplication, shifts, and add reduce. Changed the vectorization implementation to be able to use the dataflow elements for SSA recreation and fixed a few implementation details. Change-Id: I2f358f05f574fc4ab299d9497517b9906f234b98 Signed-off-by: Jean Christophe Beyler <jean.christophe.beyler@intel.com> Signed-off-by: Olivier Come <olivier.come@intel.com> Signed-off-by: Udayan Banerji <udayan.banerji@intel.com>
407a9d2847161b843966a443b71760b1280bd396	04-Jul-2014	Serguei Katkov <serguei.i.katkov@intel.com>	Clean-up call_x86.cc Also adds some DCHECKs and fixes for the bugs found by them. Change-Id: I455bbfe2c6018590cf491880cd9273edbe39c4c7 Signed-off-by: Serguei Katkov <serguei.i.katkov@intel.com>
70c4f06f9965cdb9319a2c85f65acda20086d765	25-Jun-2014	DaniilSokolov <daniil.y.sokolov@intel.com>	ART: Intrinsic implementation for java.lang.System.arraycopy. Implements intrinsic for java.lang.System.arraycopy(char[], int, char[], int, int) - this method is internal to android class libraries and used in such classes as StringBuffer and StringBuilder. It is not possible to call it from application code. The intrinsic for this method is implemented as inline method (assembly code is generated manually). The intrinsic is x86 32 bit only. Change-Id: Id1b1e0a20d5f6d5f5ebfe1fdc2447b6d8a515432 Signed-off-by: Daniil Sokolov <daniil.y.sokolov@intel.com>
a77ee5103532abb197f492c14a9e6fb437054e2a	02-Jul-2014	Chao-ying Fu <chao-ying.fu@intel.com>	x86_64: TargetReg update for x86 Also includes changes in common code. Elimination of use of TargetReg with one parameter and direct access to special target registers. Change-Id: Ied2c1f87d4d1e4345248afe74bca40487a46a371 Signed-off-by: Serguei Katkov <serguei.i.katkov@intel.com> Signed-off-by: Chao-ying Fu <chao-ying.fu@intel.com>
b5860fb459f1ed71f39d8a87b45bee6727d79fe8	22-Jun-2014	buzbee <buzbee@google.com>	Register promotion support for 64-bit targets Not sufficiently tested for 64-bit targets, but should be fairly close. A significant amount of refactoring could still be done (in later CLs). With this change we are not making any changes to the vmap scheme. As a result, it is a requirement that if a vreg is promoted to both a 32-bit view and the low half of a 64-bit view it must share the same physical register. We may change this restriction later on to allow for more flexibility for 32-bit Arm. For example, if v4, v5, v4/v5 and v5/v6 are all hot enough to promote, we'd end up with something like: v4 (as an int) -> r10 v4/v5 (as a long) -> r10 v5 (as an int) -> r11 v5/v6 (as a long) -> r11 Fix a couple of ARM64 bugs on the way... Change-Id: I6a152b9c164d9f1a053622266e165428045362f3
c5e4ce116e4d44bfdf162f0c949e77772d7e0654	10-Jun-2014	nikolay serdjuk <nikolay.y.serdjuk@intel.com>	x86_64: Fix intrinsics The following intrinsics have been ported: - Abs(double/long/int/float) - String.indexOf/charAt/compareTo/is_empty/length - Float.floatToRawIntBits, Float.intBitsToFloat - Double.doubleToRawLongBits, Double.longBitsToDouble - Thread.currentThread - Unsafe.getInt/Long/Object, Unsafe.putInt/Long/Object - Math.sqrt, Math.max, Math.min - Long.reverseBytes Math.min and max for longs have been implemented for x86_64. Commented out until good tests available: - Memory.peekShort/Int/Long, Memory.pokeShort/Int/Long Turned off on x86-64 as reported having problems - Cas Change-Id: I934bc9c90fdf953be0d3836a17b6ee4e7c98f244
5192cbb12856b12620dc346758605baaa1469ced	01-Jul-2014	Yixin Shou <yixin.shou@intel.com>	Load 64 bit constant into GPR by single instruction for 64bit mode This patch loads a 64-bit constant into a register with a single movabsq instruction in 64-bit mode instead of the previous mov, shift, add instruction sequence. Change-Id: I9d013c4f6c0b5c2e43bd125f91436263c7e6028c Signed-off-by: Yixin Shou <yixin.shou@intel.com>
dd64450b37776f68b9bfc47f8d9a88bc72c95727	01-Jul-2014	Elena Sayapina <elena.v.sayapina@intel.com>	x86_64: Unify 64-bit check in x86 compiler Update x86-specific Gen64Bit() check with the CompilationUnit target64 field which is set using unified Is64BitInstructionSet(InstructionSet) check. Change-Id: Ic00ac863ed19e4543d7ea878d6c6c76d0bd85ce8 Signed-off-by: Elena Sayapina <elena.v.sayapina@intel.com>
4d5d794382cd6d3a25392d17543d5987e432d314	26-Jun-2014	Dmitry Petrochenko <dmitry.petrochenko@intel.com>	x86_64: Simplify FlushIns This change simplifies FlushIns for x86_64. Change-Id: I2b41fae32603e0951e3847cc1e4f9c6bfab349a0 Signed-off-by: Serguei Katkov <serguei.i.katkov@intel.com> Signed-off-by: Dmitry Petrochenko <dmitry.petrochenko@intel.com>
b6564c19c5e14a3caa3f8da423b0da510fda7026	24-Jun-2014	Chao-ying Fu <chao-ying.fu@intel.com>	x86_64: Fix wide argument increment This patch fixes to always increment the index for a wide argument, and fixes the index upper bound. Otherwise, the mapping may be incorrect. Change-Id: I0116d8fd0a0a5c1270a23129c73a9e3651132977 Signed-off-by: Chao-ying Fu <chao-ying.fu@intel.com> Signed-off-by: Serguei Katkov <serguei.i.katkov@intel.com>
de68676b24f61a55adc0b22fe828f036a5925c41	24-Jun-2014	Andreas Gampe <agampe@google.com>	Revert "ART: Split out more cases of Load/StoreRef, volatile as parameter" This reverts commit 2689fbad6b5ec1ae8f8c8791a80c6fd3cf24144d. Breaks the build. Change-Id: I9faad4e9a83b32f5f38b2ef95d6f9a33345efa33
3c12c512faf6837844d5465b23b9410889e5eb11	24-Jun-2014	Andreas Gampe <agampe@google.com>	Revert "Revert "ART: Split out more cases of Load/StoreRef, volatile as parameter"" This reverts commit de68676b24f61a55adc0b22fe828f036a5925c41. Fixes an API comment, and differentiates between inserting and appending. Change-Id: I0e9a21bb1d25766e3cbd802d8b48633ae251a6bf
2689fbad6b5ec1ae8f8c8791a80c6fd3cf24144d	23-Jun-2014	Andreas Gampe <agampe@google.com>	ART: Split out more cases of Load/StoreRef, volatile as parameter Splits out more cases of ref registers being loaded or stored. For code clarity, adds volatile as a flag parameter instead of a separate method. On ARM64, continue cleanup. Add flags to print/fatal on size mismatches. Change-Id: I30ed88433a6b4ff5399aefffe44c14a5e6f4ca4e
5655e84e8d71697d8ef3ea901a0b853af42c559e	18-Jun-2014	Andreas Gampe <agampe@google.com>	ART: Implicit checks in the compiler are independent from Runtime When cross-compiling, those flags are independent. This is an initial CL that helps bypass fatal failures when cross-compiling, as not all architectures support (and have turned on) implicit checks. The actual transport for the target architecture when it is different from the runtime needs to be implemented in a follow-up CL. Bug: 15703710 Change-Id: Idc881a9a4abfd38643b862a491a5af9b8841f693
35ec2b5faf9a2dbc3c0cddb7ebc09952b8a27d2a	17-Jun-2014	Chao-ying Fu <chao-ying.fu@intel.com>	x86_64: Clobber r8 to r11 and xmm0 to xmm15 This clobbers r8 to r11 and xmm0 to xmm15, so that they can be reloaded after an external C call. Change-Id: If5cac97e475083912026309891dc332f14f8683a Signed-off-by: Chao-ying Fu <chao-ying.fu@intel.com>
7e399fd3a99ba9c9dbfafdf14f75dd318fa7d454	11-Jun-2014	Chao-ying Fu <chao-ying.fu@intel.com>	x86_64: Disable all optimizations and fix bugs This disables all optimizations and ensures that art tests still pass. Change-Id: I43217378d6889bb04f4d064f8d53cb3ff4c20aa0 Signed-off-by: Chao-ying Fu <chao-ying.fu@intel.com> Signed-off-by: Serguei Katkov <serguei.i.katkov@intel.com> Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>
8dea81ca9c0201ceaa88086b927a5838a06a3e69	06-Jun-2014	Vladimir Marko <vmarko@google.com>	Rewrite use/def masks to support 128 bits. Reduce LIR memory usage by holding masks by pointers in the LIR rather than directly and using pre-defined const masks for the common cases, allocating very few on the arena. Change-Id: I0f6d27ef6867acd157184c8c74f9612cebfe6c16
55884bc1e2e1b324809b462455ccaf5811ffafd8	10-Jun-2014	Mark Mendell <mark.p.mendell@intel.com>	X86_64: Proper IMT fix Unfortunately, 97184: X86_64: Pass 'hidden method index' in EAX wasn't correct. TargetReg(kInvokeTgt) is ALSO EAX, and so invoke-interface blows up, since the saved index is overwritten by the generated code. Change kInvokeTgt to EDI (the same as ARG0). Change-Id: I4b1d260237274ee26b9283d810d1b74484ea59af Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>
ea248f8b048d904a8fe806b6a52372985945274d	10-Jun-2014	Ian Rogers <irogers@google.com>	Remove TARGET_REX_SUPPORT define. Change-Id: I1c3644176c101064261d13b50484d2e3ae456316
0f9b9c508814a62c6e21c6a06cfe4de39b5036c0	09-Jun-2014	Ian Rogers <irogers@google.com>	Tidy up x86 assembler and fix byte register encoding. Also fix reg storage int size issues. Also fix bad use of byte registers in GenInlinedCas. Change-Id: Id47424f36f9000e051110553e0b51816910e2fe8
d3703d82a0afc28a4ea0cb0f6d88e9f8adc23e43	09-Jun-2014	Mark Mendell <mark.p.mendell@intel.com>	X86_64: Pass 'hidden method index' in EAX Method* is in EDI, and EAX isn't an argument register, so EAX is free to hold the hidden method index. Change-Id: I793a54d00a4593e140f97144419d849b53bfdf44 Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>
a014776f4474579d4dfc72e3374ba45c6f6e5f35	07-Jun-2014	Chao-ying Fu <chao-ying.fu@intel.com>	x86_64: Add long bytecode supports (2/2) This patch adds implementation of math and complex long bytecodes, and basic long arithmetic. Change-Id: I811397d7e0ee8ad0d12b23d32ba58314d479d714 Signed-off-by: Chao-ying Fu <chao-ying.fu@intel.com> Signed-off-by: Serguei Katkov <serguei.i.katkov@intel.com> Signed-off-by: Dmitry Petrochenko <dmitry.petrochenko@intel.com> Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>
e0ccdc0dd166136cd43e5f54201179a4496d33e8	07-Jun-2014	Chao-ying Fu <chao-ying.fu@intel.com>	x86_64: Add long bytecode supports (1/2) This patch includes switch enabling and GenFillArray, assembler changes, updates of regalloc behavior for 64-bit, usage in basic utility operations, loading constants, and update for memory operations. Change-Id: I6d8aa35a75c5fd01d69c38a770c3398d0188cc8a Signed-off-by: Chao-ying Fu <chao-ying.fu@intel.com> Signed-off-by: Serguei Katkov <serguei.i.katkov@intel.com> Signed-off-by: Dmitry Petrochenko <dmitry.petrochenko@intel.com> Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>
58994cdb00b323339bd83828eddc53976048006f	16-May-2014	Dmitry Petrochenko <dmitry.petrochenko@intel.com>	x86_64: Hard Float ABI support in QCG This patch shows our efforts on resolving the ART limitations: - passing "float"/"double" arguments via FPR - passing "long" arguments via single GPR, not pair - passing more than 3 arguments via GPR. Work done: - Extended SpecialTargetRegister enum with kARG4, kARG5, fARG4..fARG7. - Created initial LoadArgRegs/GenDalvikX/FlushIns version in X86Mir2Lir. - Unlimited number of long/double/float arguments support - Refactored (v2) Change-Id: I5deadd320b4341d5b2f50ba6fa4a98031abc3902 Signed-off-by: Serguei Katkov <serguei.i.katkov@intel.com> Signed-off-by: Dmitry Petrochenko <dmitry.petrochenko@intel.com> Signed-off-by: Chao-ying Fu <chao-ying.fu@intel.com> Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>
76af0d307194045ece429dbaf62e93d3e08c6c20	05-Jun-2014	Dmitry Petrochenko <dmitry.petrochenko@intel.com>	x86_64: Turn on 64-bit core registers initialization. This enables 64-bit core registers initialization for x86_64. The backend update with 64-bit temp support is in progress. Change-Id: If7c9a62c1145f81050adda86f2beed427220baa2 Signed-off-by: Chao-ying Fu <chao-ying.fu@intel.com> Signed-off-by: Dmitry Petrochenko <dmitry.petrochenko@intel.com> Signed-off-by: Serguei Katkov <serguei.i.katkov@intel.com> Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>
089142cf1d0c028b5a7c703baf0b97f4a4ada3f7	05-Jun-2014	Vladimir Marko <vmarko@google.com>	Avoid register pool allocations on the heap. Create a helper template class ArrayRef and use it instead of std::vector<> for register pools in target_<arch>.cc to avoid these heap allocations during program startup. Change-Id: I4ab0205af9c1d28a239c0a105fcdc60ba800a70a
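As a rough illustration of the idea in the entry above, a non-owning array view lets a register pool refer to a statically allocated table without copying it into a heap-backed std::vector<>. This is a minimal sketch under assumptions, not ART's actual ArrayRef: the class name ArrayRefSketch, the table kCoreRegsSketch, and the register numbers in it are all hypothetical.

```cpp
#include <cassert>
#include <cstddef>

// Minimal non-owning view over a contiguous array (sketch only; the real
// ArrayRef in ART may differ in interface and features).
template <typename T>
class ArrayRefSketch {
 public:
  // Binds to a C array; the element count is deduced at compile time.
  template <size_t N>
  constexpr ArrayRefSketch(const T (&array)[N]) : data_(array), size_(N) {}

  constexpr size_t size() const { return size_; }
  constexpr const T* begin() const { return data_; }
  constexpr const T* end() const { return data_ + size_; }

  const T& operator[](size_t i) const {
    assert(i < size_);
    return data_[i];
  }

 private:
  const T* data_;  // Points into storage owned by someone else.
  size_t size_;
};

// Hypothetical register numbers, stored in static read-only data.
static constexpr int kCoreRegsSketch[] = {0, 1, 2, 3, 5, 6, 7};

// The view is only a pointer and a length, so no heap allocation happens
// at program startup.
static constexpr ArrayRefSketch<int> kCoreRegsView(kCoreRegsSketch);
```

Because the view does not own its elements, it is only safe while the underlying array outlives it, which holds trivially for static tables like the one above.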
a0cd2d701f29e0bc6275f1b13c0edfd4ec391879	01-Jun-2014	buzbee <buzbee@google.com>	Quick compiler: reference cleanup For 32-bit targets, object references are 32 bits wide both in Dalvik virtual registers and in core physical registers. Because of this, object references and non-floating point values were both handled as if they had the same register class (kCoreReg). However, for 64-bit systems, references are 32 bits in Dalvik vregs, but 64 bits in physical registers. Although the same underlying physical core registers will still be used for object reference and non-float values, different register class views will be used to represent them. For example, an object reference in arm64 might be held in x3 at some point, while the same underlying physical register, w3, would be used to hold a 32-bit int. This CL breaks apart the handling of object reference and non-float values to allow the proper register class (or register view) to be used. A new register class, kRefReg, is introduced which will map to a 32-bit core register on 32-bit targets, and 64-bit core registers on 64-bit targets. From this point on, object references should be allocated registers in the kRefReg class rather than kCoreReg. Change-Id: I6166827daa8a0ea3af326940d56a6a14874f5810
ffddfdf6fec0b9d98a692e27242eecb15af5ead2	03-Jun-2014	Tim Murray <timmurray@google.com>	DO NOT MERGE Merge ART from AOSP to lmp-preview-dev. Change-Id: I0f578733a4b8756fd780d4a052ad69b746f687a9
a20468c004264592f309a548fc71ba62a69b8742	30-Apr-2014	Dmitry Petrochenko <dmitry.petrochenko@intel.com>	x86_64: Support r8-r15, xmm8-xmm15 in assembler Added REX support. The TARGET_REX_SUPPORT should be used during build. Change-Id: I82b457ff5085c8192ad873923bd939fbb91022ce Signed-off-by: Dmitry Petrochenko <dmitry.petrochenko@intel.com>
fe94578b63380f464c3abd5c156b7b31d068db6c	22-May-2014	Mark Mendell <mark.p.mendell@intel.com>	Implement all vector instructions for X86 Add X86 code generation for the vector operations. Added support for X86 disassembler for the new instructions. Change-Id: I72b48f5efa3a516a16bb1dd4bdb5c9270a8db53a Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>
0999a6f7c83d10aa59b75f079f0d2fdbac982cf7	21-May-2014	Dmitry Petrochenko <dmitry.petrochenko@intel.com>	x86_64: Rebase on top of "64-bit temp register support" Added the 64-bit core/temp register definition, fixed RegisterPool creation for x86_64 so that 64-bit core/temps are NOT used for now. The long arithmetic still operates with register pair on x86_64 and it is a subject for change in a separate patch. Change-Id: I2be06d5aefaf80141983bc9d8ed8a2ee24c2b21b Signed-off-by: Dmitry Petrochenko <dmitry.petrochenko@intel.com>
b01bf15d18f9b08d77e7a3c6e2897af0e02bf8ca	14-May-2014	buzbee <buzbee@google.com>	64-bit temp register support. Add a 64-bit temp register allocation path. The recent physical register handling rework supports multiple views of the same physical register (or, such as for Arm's float/double regs, different parts of the same physical register). This CL adds a 64-bit core register view for 64-bit targets. In short, each core register will have a 64-bit name, and a 32-bit name. The different views will be kept in separate register pools, but aliasing will be tracked. The core temp register allocation routines will be largely identical - except for 32-bit targets, which will continue to use pairs of 32-bit core registers for holding long values. Change-Id: I8f118e845eac7903ad8b6dcec1952f185023c053
e87f9b5185379c8cf8392d65a63e7bf7e51b97e7	30-Apr-2014	Mark Mendell <mark.p.mendell@intel.com>	Allow X86 QBE to be extended Enhancements and updates to allow X86Mir2LIR Backend to be subclassed for experimentation. Add virtual in a whole bunch of places, and make some other changes to get this to work. Change-Id: I0980a19bc5d5725f91660f98c95f1f51c17ee9b6 Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>
082833c8d577db0b2bebc100602f31e4e971613e	18-May-2014	buzbee <buzbee@google.com>	Quick compiler, out of registers fix It turns out that the register pool sanity checker was not working as expected, leaving some inconsistencies unreported. This could result in "out of registers" failures, as well as other more subtle problems. This CL fixes the sanity checker, adds many more checks and cleans up the previously undetected episodes of insanity. Cherry-pick of internal change 468162 Change-Id: Id2da97e99105a4c272c5fd256205a94b904ecea8
05d3aeb33683b16837741f9348d6fba9a8432068	18-May-2014	buzbee <buzbee@google.com>	Quick compiler, out of registers fix Fixes b/15024623 It turns out that the register pool sanity checker was not working as expected, leaving some inconsistencies unreported. This CL fixes the sanity checker, adds many more checks and cleans up the previously undetected episodes of insanity. Change-Id: I4d67db864ca5926a1975db251e7e631b65a86275
d65c51a556e6649db4e18bd083c8fec37607a442	29-Apr-2014	Mark Mendell <mark.p.mendell@intel.com>	ART: Add support for constant vector literals Add in some vector instructions. Implement the ConstVector instruction, which takes 4 words of data and loads it into an XMM register. Initially, only the ConstVector MIR opcode is implemented. Others will be added after this one goes in. Change-Id: I5c79bc8b7de9030ef1c213fc8b227debc47f6337 Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>
b14329f90f725af0f67c45dfcb94933a426d63ce	15-May-2014	Andreas Gampe <agampe@google.com>	ART: Fix MonitorExit code on ARM We do not emit barriers on non-SMP systems. But on ARM, we have places that need to conditionally execute, which is done through an IT instruction. The guide of said instruction thus changes between SMP and non-SMP systems. To cleanly approach this, change the API so that GenMemBarrier returns whether it generated an instruction. ARM will have to query the result and update any dependent IT. Throw a build system error if TARGET_CPU_SMP is not set. Fix runtime/Android.mk to work with new multilib host. Bug: 14989275 Change-Id: I9e611b770e8a1cd4ca19367d7dae0573ec08dc61
9ee801f5308aa3c62ae3bedae2658612762ffb91	12-May-2014	Dmitry Petrochenko <dmitry.petrochenko@intel.com>	Add x86_64 code generation support Utilizes r0..r7 in register allocator, implements spill/unspill of core regs as well as operations with stack pointer. Change-Id: I973d5a1acb9aa735f6832df3d440185d9e896c67 Signed-off-by: Dmitry Petrochenko <dmitry.petrochenko@intel.com>
2f244e9faccfcca68af3c5484c397a01a1c3a342	08-May-2014	Andreas Gampe <agampe@google.com>	ART: Add more ThreadOffset in Mir2Lir and backends This duplicates all methods with ThreadOffset parameters, so that both ThreadOffset<4> and ThreadOffset<8> can be handled. Dynamic checks against the compilation unit's instruction set determine which pointer size to use and therefore which methods to call. Methods with unsupported pointer sizes should fatally fail, as this indicates an issue during method selection. Change-Id: Ifdb445b3732d3dc5e6a220db57374a55e91e1bf6
30adc7383a74eb3cb6db3bf42cea3a5595055ce1	10-May-2014	buzbee <buzbee@google.com>	Quick compiler: Fix liveness tracking Rework temp register liveness tracking to play nicely with aliased physical registers, and re-enable liveness tracking optimization. Add a pair of x86 utility routines that act like UpdateLoc(), but only show in-register live temps if they are of the expected register class. Change-Id: I92779e0da2554689103e7488025be281f1a58989
674744e635ddbdfb311fbd25b5a27356560d30c3	24-Apr-2014	Vladimir Marko <vmarko@google.com>	Use atomic load/store for volatile IGET/IPUT/SGET/SPUT. Bug: 14112919 Change-Id: I79316f438dd3adea9b2653ffc968af83671ad282
091cc408e9dc87e60fb64c61e186bea568fc3d3a	31-Mar-2014	buzbee <buzbee@google.com>	Quick compiler: allocate doubles as doubles Significant refactoring of register handling to unify usage across all targets & 32/64 backends. Reworked RegStorage encoding to allow expanded use of x86 xmm registers; removed vector registers as a separate register type. Reworked RegisterInfo to describe aliased physical registers. Eliminated quite a bit of target-specific code and generalized common code. Use of RegStorage instead of int for registers now propagated down to the NewLIRx() level. In future CLs, the NewLIRx() routines will be replaced with versions that are explicit about what kind of operand they expect (RegStorage, displacement, etc.). The goal is to eventually use RegStorage all the way to the assembly phase. TBD: MIPS needs verification. TBD: Re-enable liveness tracking. Change-Id: I388c006d5fa9b3ea72db4e37a19ce257f2a15964
695d13a82d6dd801aaa57a22a9d4b3f6db0d0fdb	19-Apr-2014	buzbee <buzbee@google.com>	Update load/store utilities for 64-bit backends This CL replaces the typical use of LoadWord/StoreWord utilities (which, in practice, were 32-bit load/store) in favor of a new set that make the size explicit. We now have: LoadWordDisp/StoreWordDisp: 32 or 64 depending on target. Load or store the natural word size. Expect this to be used infrequently - generally when we know we're dealing with a native pointer or flushed register not holding a Dalvik value (Dalvik values will flush to home location sizes based on Dalvik, rather than the target). Load32Disp/Store32Disp: Load or store 32 bits, regardless of target. Load64Disp/Store64Disp: Load or store 64 bits, regardless of target. LoadRefDisp: Load a 32-bit compressed reference, and expand it to the natural word size in the target register. StoreRefDisp: Compress a reference held in a register of the natural word size and store it as a 32-bit compressed reference. Change-Id: I50fcbc8684476abd9527777ee7c152c61ba41c6f
3a74d15ccc9a902874473ac9632e568b19b91b1c	22-Apr-2014	Mingyao Yang <mingyao@google.com>	Delete throw launchpads. Bug: 13170824 Change-Id: I9d5834f5a66f5eb00f2ac80774e8c27dea99949e
a1758d83e298c9ee31848bcae07c2a35f6efd618	16-Apr-2014	Alexei Zavjalov <alexei.zavjalov@intel.com>	String.IndexOf method handles a negative start index incorrectly The standard implementation of String.IndexOf converts a negative start index to 0, so the search starts from the beginning of the string. The current implementation, however, may start searching from an incorrect memory offset, which can lead to SIGSEGV or an incorrect result. This patch adds a handler for the case when fromIndex is negative. Change-Id: I3ac86290712789559eaf5e46bef0006872395bfa Signed-off-by: Alexei Zavjalov <alexei.zavjalov@intel.com>
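The behavior that the entry above restores can be sketched in portable C++: a negative start index is clamped to 0 before any memory is touched. This is an illustration only, not the generated x86 code; the function name IndexOfFrom and its signature are hypothetical.

```cpp
#include <cstddef>
#include <string>

// Hypothetical reference model of String.indexOf(ch, fromIndex):
// returns the index of the first occurrence of ch at or after from_index,
// or -1 if there is none.
inline int IndexOfFrom(const std::u16string& s, char16_t ch, int from_index) {
  // Clamp a negative start index to 0 so the scan never begins before
  // the start of the character data.
  if (from_index < 0) {
    from_index = 0;
  }
  for (size_t i = static_cast<size_t>(from_index); i < s.size(); ++i) {
    if (s[i] == ch) {
      return static_cast<int>(i);
    }
  }
  return -1;  // Also covers from_index >= length: the loop body never runs.
}
```

The clamp has to come before the start offset is used to compute an address, which is the step the intrinsic was missing according to the commit message.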
6a58cb16d803c9a7b3a75ccac8be19dd9d4e520d	02-Apr-2014	Dmitry Petrochenko <dmitry.petrochenko@intel.com>	art: Handle x86_64 architecture equal to x86 This patch forces FE/ME to treat x86_64 as x86 exactly. The x86_64 logic will be revised later when assembly will be ready. Change-Id: I4a92477a6eeaa9a11fd710d35c602d8d6f88cbb6 Signed-off-by: Dmitry Petrochenko <dmitry.petrochenko@intel.com>
dd7624d2b9e599d57762d12031b10b89defc9807	15-Mar-2014	Ian Rogers <irogers@google.com>	Allow mixing of thread offsets between 32 and 64bit architectures. Begin a fuller implementation of x86-64 REX prefixes. Doesn't implement 64bit thread offset support for the JNI compiler. Change-Id: If9af2f08a1833c21ddb4b4077f9b03add1a05147
2700f7e1edbcd2518f4978e4cd0e05a4149f91b6	07-Mar-2014	buzbee <buzbee@google.com>	Continuing register cleanup Ready for review. Continue the process of using RegStorage rather than ints to hold register value in the top layers of codegen. Given the huge number of changes in this CL, I've attempted to minimize the number of actual logic changes. With this CL, the use of ints for registers has largely been eliminated except in the lowest utility levels. "Wide" utility routines have been updated to take a single RegStorage rather than a pair of ints representing low and high registers. Upcoming CLs will be smaller and more targeted. My expectations: o Allocate float double registers as a single double rather than a pair of float single registers. o Refactor to push code which assumes long and double Dalvik values are held in a pair of register to the target dependent layer. o Clean-up of the xxx_mir.h files to reduce the amount of #defines for registers. May also do a register renumbering to bring all of our targets' register naming more consistent. Possibly introduce a target-independent float/non-float test at the RegStorage level. Change-Id: I646de7392bdec94595dd2c6f76e0f1c4331096ff
99ad7230ccaace93bf323dea9790f35fe991a4a2	26-Feb-2014	Razvan A Lupusoru <razvan.a.lupusoru@intel.com>	Relaxed memory barriers for x86 X86 provides stronger memory guarantees and thus the memory barriers can be optimized. This patch ensures that all memory barriers for x86 are treated as scheduling barriers. And in cases where a barrier is needed (StoreLoad case), an mfence is used. Change-Id: I13d02bf3f152083ba9f358052aedb583b0d48640 Signed-off-by: Razvan A Lupusoru <razvan.a.lupusoru@intel.com>
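The rule described in the entry above can be summarized in a few lines. Under the x86 memory model, the only reordering the hardware performs on ordinary accesses is a later load passing an earlier store, so only a StoreLoad barrier needs a real fence; the other kinds only have to stop the compiler from reordering code. The sketch below is an assumption-laden illustration: the enum BarrierKindSketch and the function NeedsMfenceOnX86 are hypothetical names, not ART's.

```cpp
// Hypothetical barrier kinds, named after the orderings they enforce.
enum class BarrierKindSketch {
  kLoadLoad,
  kLoadStore,
  kStoreStore,
  kStoreLoad,
};

// Returns true if an actual fence instruction (mfence) would be needed on
// x86. For every other kind the barrier acts only as a scheduling barrier
// and emits no instruction.
constexpr bool NeedsMfenceOnX86(BarrierKindSketch kind) {
  return kind == BarrierKindSketch::kStoreLoad;
}

static_assert(NeedsMfenceOnX86(BarrierKindSketch::kStoreLoad),
              "StoreLoad is the one ordering x86 does not give for free");
```

A backend built this way would still record a scheduling barrier for the three free cases, so that later passes such as load hoisting do not move memory operations across it.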
e90501da0222717d75c126ebf89569db3976927e	12-Mar-2014	Serguei Katkov <serguei.i.katkov@intel.com>	Add dependency for operations with x86 FPU stack Load Hoisting optimization can re-order operations with FPU stack due to no dependency set. Patch adds resource dependency between these operations. Change-Id: Iccce98c8f3c565903667c03803884d9de1281ea8 Signed-off-by: Serguei Katkov <serguei.i.katkov@intel.com>
b373e091eac39b1a79c11f2dcbd610af01e9e8a9	21-Feb-2014	Dave Allison <dallison@google.com>	Implicit null/suspend checks (oat version bump) This adds the ability to use SEGV signals to throw NullPointerException exceptions from Java code rather than having the compiler generate explicit comparisons and branches. It does this by using sigaction to trap SIGSEGV and when triggered makes sure it's in compiled code and if so, sets the return address to the entry point to throw the exception. It also uses this signal mechanism to determine whether to check for thread suspension. Instead of the compiler generating calls to a function to check for threads being suspended, the compiler will now load indirect via an address in the TLS area. To trigger a suspend, the contents of this address are changed from something valid to 0. A SIGSEGV will occur and the handler will check for a valid instruction pattern before invoking the thread suspension check code. If a user program traps SIGSEGV it will prevent our signal handler working. This will cause a failure in the runtime. There are two signal handlers at present. You can control them individually using the flags -implicit-checks: on the runtime command line. This takes a string parameter, a comma separated set of strings. Each can be one of: none (switch off), null (null pointer checks), suspend (suspend checks), all (all checks). So to switch only suspend checks on, pass: -implicit-checks:suspend There is also -explicit-checks to provide the reverse once we change the default. For dalvikvm, pass --runtime-arg -implicit-checks:foo,bar The default is -implicit-checks:none There is also a property 'dalvik.vm.implicit_checks' whose value is the same string as the command option. The default is 'none'. For example to switch on null checks using the option: setprop dalvik.vm.implicit_checks null It only works for ARM right now. Bumps OAT version number due to change to Thread offsets. Bug: 13121132 Change-Id: If743849138162f3c7c44a523247e413785677370
3bc8615332b7848dec8c2297a40f7e4d176c0efb	13-Mar-2014	Vladimir Marko <vmarko@google.com>	Use LIRSlowPath for intrinsics, improve String.indexOf(). Rewrite intrinsic launchpads to use the LIRSlowPath. Improve String.indexOf for constant chars by avoiding the check for code points over 0xFFFF. Change-Id: I7fd5583214c5b4ab9c38ee36c5d6f003dd6345a8
34fa0d935bed7a0e17bc6df4bd079e3428a179e7	12-Mar-2014	Yevgeny Rouban <yevgeny.y.rouban@intel.com>	ART's intrinsic for String.indexOf uses the incorrect register ART's intrinsic for String.indexOf on the x86 platform uses the incorrect register to compare start with the string length. It should be fixed. Change-Id: I22986b4d4b23f62b4bb97baab9fe43152d12145e Signed-off-by: Vladimir Ivanov <vladimir.a.ivanov@intel.com> Signed-off-by: Yevgeny Rouban <yevgeny.y.rouban@intel.com>
49161cef10a308aedada18e9aa742498d6e6c8c7	12-Mar-2014	Jeff Hao <jeffhao@google.com>	Allow patching between dex files in the boot classpath. Change-Id: I53f219a5382d0fcd580e96e50025fdad4fc399df
83cc7ae96d4176533dd0391a1591d321b0a87f4f	12-Feb-2014	Vladimir Marko <vmarko@google.com>	Create a scoped arena allocator and use that for LVN. This saves more than 0.5s of boot.oat compilation time on Nexus 5. TODO: Move other stuff to the scoped allocator. This CL alone increases the peak memory allocation. By reusing the memory for other parts of the compilation we should reduce this overhead. Change-Id: Ifbc00aab4f3afd0000da818dfe68b96713824a08
a44d4f508fa1642294e79d3ebecd790afe75ea60	05-Mar-2014	buzbee <buzbee@google.com>	Fix read of uninitialized memory in InlineIndexOf There are two flavors of IndexOf that we treat as an intrinsic: a zero-based version with 2 args and a 3-arg version that also takes a start position. The same code is used for both, but Valgrind reminded us that we shouldn't try loading a RegLocation for the non-existent 3rd arg in the 2 argument version. We got lucky in that the bug was benign - the generated code would still be correct. Change-Id: I0bc7798c8034d35007ffe6d6d62f9ceb91fc44fd
00e1ec6581b5b7b46ca4c314c2854e9caa647dd2	28-Feb-2014	Bill Buzbee <buzbee@android.com>	Revert "Revert "Rework Quick compiler's register handling"" This reverts commit 86ec520fc8b696ed6f164d7b756009ecd6e4aace. Ready. Fixed the original typo, plus some mechanical changes for rebasing. Still needs additional testing, but the problem with the original CL appears to have been a typo in the definition of the x86 double return template RegLocation. Change-Id: I828c721f91d9b2546ef008c6ea81f40756305891
ae9fd93c39a341e2dffe15c61cc7d9e841fa92c4	11-Feb-2014	Mark Mendell <mark.p.mendell@intel.com>	Tell GDB about Quick ART generated code This is actually a lot of work. To do this, we need: .debug_info .debug_abbrev .debug_frame .debug_str These are generated into the OAT file by OatWriter and ElfWriterQuick. Since the Quick ART runtime doesn't use dlopen to load the OAT files, GDB can't find this information. Use the alternate GDB JIT interface, which can be invoked at runtime. To use this interface, an ELF image needs to be built in memory. Read the information from the OAT file, fixup the addresses to point to the real locations, add a symbol table to hold the .text symbol, and then let GDB know about the information, which will be read from the runtime address space. This is quite primitive now, and could be cleaned up considerably. It probably needs symbol table entries for the methods, and descriptions of parameters and return types. Currently only supported for X86. This defaults to enabled for debug builds. Added dex2oat --gen-gdb-info and --no-gen-gdb-info flags to override. Change-Id: I4d18b2370f6dfaa00c8cc1925f10717be3bd1a62 Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>
86ec520fc8b696ed6f164d7b756009ecd6e4aace	26-Feb-2014	Bill Buzbee <buzbee@android.com>	Revert "Rework Quick compiler's register handling" This reverts commit 2c1ed456dcdb027d097825dd98dbe48c71599b6c. Change-Id: If88d69ba88e0af0b407ff2240566d7e4545d8a99
2c1ed456dcdb027d097825dd98dbe48c71599b6c	20-Feb-2014	buzbee <buzbee@google.com>	Rework Quick compiler's register handling For historical reasons, the Quick backend found it convenient to consider all 64-bit Dalvik values held in registers to be contained in a pair of 32-bit registers. Though this worked well for ARM (with double-precision registers also treated as a pair of 32-bit single-precision registers) it doesn't play well with other targets. And, it is somewhat problematic for 64-bit architectures. This is the first of several CLs that will rework the way the Quick backend deals with physical registers. The goal is to eliminate the "64-bit value backed with 32-bit register pair" requirement from the target-independent portions of the backend and support 64-bit registers throughout. The key RegLocation struct, which describes the location of Dalvik virtual register & register pairs, previously contained fields for high and low physical registers. The low_reg and high_reg fields are being replaced with a new type: RegStorage. There will be a single instance of RegStorage for each RegLocation. Note that RegStorage does not increase the space used. It is 16 bits wide, the same as the sum of the 8-bit low_reg and high_reg fields. At a target-independent level, it will describe whether the physical register storage associated with the Dalvik value is a single 32 bit, single 64 bit, pair of 32 bit or vector. The actual register number encoding is left to the target-dependent code layer. Because physical register handling is pervasive throughout the backend, this restructuring necessarily involves large CLs with lots of changes. I'm going to roll these out in stages, and attempt to segregate the CLs with largely mechanical changes from those which restructure or rework the logic. This CL is of the mechanical change variety - it replaces low_reg and high_reg from RegLocation and introduces RegStorage. It also includes a lot of new code (such as many calls to GetReg()) that should go away in upcoming CLs. The tentative plan for the subsequent CLs is: o Rework standard register utilities such as AllocReg() and FreeReg() to use RegStorage instead of ints. o Rework the target-independent GenXXX, OpXXX, LoadValue, StoreValue, etc. routines to take RegStorage rather than int register encodings. o Take advantage of the vector representation and eliminate the current vector field in RegLocation. o Replace the "wide" variants of codegen utilities that take low_reg/high_reg pairs with versions that use RegStorage. o Add 64-bit register target independent codegen utilities where possible, and where not virtualize with 32-bit general register and 64-bit general register variants in the target dependent layer. o Expand/rework the LIR def/use flags to allow for more registers (currently, we lose out on 16 MIPS floating point regs as well as ARM's D16..D31 for lack of space in the masks). o [Possibly] move the float/non-float determination of a register from the target-dependent encoding to RegStorage. In other words, replace IsFpReg(register_encoding_bits). At the end of the day, all code in the target independent layer should be using RegStorage, as should much of the target dependent layer. Ideally, we won't be using the physical register number encoding extracted from RegStorage (i.e. GetReg()) until the NewLIRx() layer. Change-Id: Idc5c741478f720bdd1d7123b94e4288be5ce52cb
e19c91fdb88ff6fd4e88bc5984772dcfb1e86f80	25-Feb-2014	Mark Mendell <mark.p.mendell@intel.com>	Fix hardcoded offsets in x86 String.indexOf. Use runtime code that will work for 32 and 64 bit too. The old code copied constants from the runtime .S file and is correct for 32 bit code only. Change-Id: I668e1d7f2db8186518c358bde0759633be0d7c40 Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>
4028a6c83a339036864999fdfd2855b012a9f1a7	20-Feb-2014	Mark Mendell <mark.p.mendell@intel.com>	Inline x86 String.indexOf Take advantage of the presence of a constant search char or start index to tune the generated code. Change-Id: I0adcf184fb91b899a95aa4d8ef044a14deb51d88 Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>
f3e2cc4a38389aa75eb8ee3973a535254bf1c8d2	18-Feb-2014	Nicolas Geoffray <ngeoffray@google.com>	Code cleanup to avoid LLVM dependency when building with quick only. Change-Id: I0985c227d775c72fd23975d4c9bf673ba32615c2
3bc01748ef1c3e43361bdf520947a9d656658bf8	06-Feb-2014	Razvan A Lupusoru <razvan.a.lupusoru@intel.com>	GenSpecialCase support for x86 Moved GenSpecialCase from being ARM specific to common code to allow it to be used by x86 quick as well. Change-Id: I728733e8f4c4da99af6091ef77e5c76ae0fee850 Signed-off-by: Razvan A Lupusoru <razvan.a.lupusoru@intel.com>
55d0eac918321e0525f6e6491f36a80977e0d416	06-Feb-2014	Mark Mendell <mark.p.mendell@intel.com>	Support Direct Method/Type access for X86 Thumb generates code to optimize calls to methods within core.oat. Implement this for X86 as well, but take advantage of mov with 32 bit immediate and call relative with 32 bit immediate. Fix some incorrect return locations for long inlines. Change-Id: I1907bdfc7574f3d0aa76c7fad13dc537acdf1ed3 Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>
67c39c4aefca23cb136157b889c09ee200b3dec6	01-Feb-2014	Mark Mendell <mark.p.mendell@intel.com>	Support Literal pools for x86 They are being used to store double constants, which are very expensive to generate into XMM registers. Uses the 'Compiler Temporary' support just added. The MIR instructions are scanned for a reference to a double constant, a packed switch or a FillArray. These all need the address of the start of the method, since 32 bit x86 doesn't have a PC-relative addressing mode. If needed, a compiler temporary is allocated, and the address of the base of the method is calculated, and stored. Later uses can just refer to the saved value. Trickiness comes when generating the load from the literal area, as the offset is unknown before final assembler. Assume a 32 bit displacement is needed, and fix this if it wasn't necessary. Use LoadValue to load the 'base of method' pointer. Fix an incorrect test in GetRegLocation. Change-Id: I53ffaa725dabc370e9820c4e0e78664ede3563e6 Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>
107c31e598b649a8bb8d959d6a0377937e63e624	24-Jan-2014	Ian Rogers <irogers@google.com>	64bit friendly printf modifiers in LIR dumping. Also correct header file inclusion ordering. Change-Id: I8fb99e80cf1487e8b2278d4c1d110d14ed18c086
e02d48fb24747f90fd893e1c3572bb3c500afced	15-Jan-2014	Mark Mendell <mark.p.mendell@intel.com>	Optimize x86 long arithmetic Be smarter about taking advantage of a constant operand for x86 long add/sub/and/or/xor. Using instructions with immediates and generating results directly into memory reduces the number of temporary registers and avoids hardcoded register usage. Also rewrite the existing non-const x86 arithmetic to avoid fixed register use, and use the fact that x86 instructions are two operand. Pass the opcode to the XXXLong() routines to easily detect two operand DEX opcodes. Add a new StoreFinalValueWide() routine, which is similar to StoreValueWide, but doesn't do an EvalLoc to allocate registers. The src operand must already be in registers, and it just updates the dest location, and calls the right live/dirty routines to get the src into the dest properly. Change-Id: Iefc16e7bc2236a73dc780d3d5137ae8343171f62 Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>
d61ba4ba6fcde666adb5d5c81b1c32f0534fb2c8	13-Jan-2014	Bill Buzbee <buzbee@android.com>	Revert "Revert "Better support for x86 XMM registers"" This reverts commit 8ff67e3338952c70ccf3b609559bf8cc0f379cfd. Fix applied to loc.fp usage. Change-Id: I1eb3005392544fcf30c595923ed25bcee2dc4859
8ff67e3338952c70ccf3b609559bf8cc0f379cfd	11-Jan-2014	Bill Buzbee <buzbee@android.com>	Revert "Better support for x86 XMM registers" The invalid usage of loc.fp must be corrected before this change can be submitted. This reverts commit 766a5e5940b469ab40e52770862c81cfec1d835b. Change-Id: I1173a9bf829da89cccd9c2898f5e11164987a22b
766a5e5940b469ab40e52770862c81cfec1d835b	10-Jan-2014	Mark Mendell <mark.p.mendell@intel.com>	Better support for x86 XMM registers Currently, ART Quick mode assumes that a double FP register is composed of two single consecutive FP registers. This is true for ARM and MIPS, but not x86. This means that only half of the 8 XMM registers are available for use by x86 doubles. This patch breaks the assumption that a wide FP RegisterLocation must be a paired set of FP registers. This is done by making some routines in common code virtual and overriding them in the X86Mir2Lir class. For these wide fp locations, the high register is set to the same value as the low register, in order to minimize changes to common code. In a couple of places, the common code checks for this case. The changes are also supposed to allow the possibility of using the XMM registers for vector operations, but that support is still WIP. Change-Id: Ic6ef24ea764991c6f4d9fb88d483a619f5a468cb Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>
988e6ea9ac66edf1e205851df9bb53de3f3763f3	08-Jan-2014	Ian Rogers <irogers@google.com>	Fix -O0 builds. Use snprintf rather than sprintf to avoid Werror failures. Work around an annotalysis bug when compiling -O0. Change-Id: Ie7e0a70dbceea5fa85f98262b91bcdbd74fdef1c
31c2aac7137b69d5622eea09597500731fbee2ef	09-Dec-2013	Vladimir Marko <vmarko@google.com>	Rename ClobberCalleeSave to Caller, fix it for x86. Change-Id: I6a72703a11985e2753fa9b4520c375a164301433
70b797d998f2a28e39f7d6ffc8a07c9cbc47da14	03-Dec-2013	Vladimir Marko <vmarko@google.com>	Unsafe.compareAndSwapLong() intrinsic for x86. Change-Id: Idbc5371a62dfdd84485a657d4548990519200205
88474b416eb257078e590bf9bc7957cee604a186	24-Oct-2013	Jeff Hao <jeffhao@google.com>	Implement Interface Method Tables (IMT). Change-Id: Idf7fe85e1293453a8ad862ff2380dcd5db4e3a39
0d82948094d9a198e01aa95f64012bdedd5b6fc9	12-Oct-2013	buzbee <buzbee@google.com>	64-bit prep Preparation for 64-bit roll. o Eliminated storing pointers in 32-bit int slots in LIR. o General size reductions of common structures to reduce impact of doubled pointer sizes: - BasicBlock struct was 72 bytes, now is 48. - MIR struct was 72 bytes, now is 64. - RegLocation was 12 bytes, now is 8. o Generally replaced uses of BasicBlock* pointers with 16-bit Ids. o Replaced several doubly-linked lists with singly-linked to save one stored pointer per node. o We had quite a few uses of uintptr_t's that were a holdover from the JIT (which used pointers to mapped dex & actual code cache addresses rather than trace-relative offsets). Replaced those with uint32_t's. o Clean up handling of embedded data for switch tables and array data. o Miscellaneous cleanup. I anticipate one or two additional CLs to reduce the size of MIR and LIR structs. Change-Id: I58e426d3f8e5efe64c1146b2823453da99451230
409fe94ad529d9334587be80b9f6a3d166805508	11-Oct-2013	buzbee <buzbee@google.com>	Quick assembler fix This CL re-instates the select pattern optimization disabled by CL 374310, and fixes the underlying problem: improper handling of the kPseudoBarrier LIR opcode. The bug was introduced in the recent assembler restructuring. In short, LIR pseudo opcodes (which have values < 0), should always have size 0 - and thus cause no bits to be emitted during assembly. In this case, bad logic caused us to set the size of a kPseudoBarrier opcode via lookup through the EncodingMap. Because all pseudo ops are < 0, this meant we did an array underflow load, picking up whatever garbage was located before the EncodingMap. This explains why this error showed up recently - we'd previously just gotten a lucky layout. This CL corrects the faulty logic, and adds DCHECKs to uses of the EncodingMap to ensure that we don't try to access w/ a pseudo op. Additionally, the existing is_pseudo_op() macro is replaced with IsPseudoLirOp(), named similar to the existing IsPseudoMirOp(). Change-Id: I46761a0275a923d85b545664cadf052e1ab120dc
b48819db07f9a0992a72173380c24249d7fc648a	15-Sep-2013	buzbee <buzbee@google.com>	Compile-time tuning: assembly phase Not as much compile-time gain from reworking the assembly phase as I'd hoped, but still worthwhile. Should see ~2% improvement thanks to the assembly rework. On the other hand, expect some huge gains for some applications thanks to better detection of large machine-generated init methods. Thinkfree shows a 25% improvement. The major assembly change was to thread the LIR nodes that require fixup into a fixup chain. Only those are processed during the final assembly pass(es). This doesn't help for methods which only require a single pass to assemble, but does speed up the larger methods which required multiple assembly passes. Also replaced the block_map_ basic block lookup table (which contained space for a BasicBlock* for each dex instruction unit) with a block id map - cutting its space requirements by half in a 32-bit pointer environment. Changes: o Reduce size of LIR struct by 12.5% (one of the big memory users) o Repurpose the use/def portion of the LIR after optimization complete. o Encode instruction bits to LIR o Thread LIR nodes requiring pc fixup o Change follow-on assembly passes to only consider fixup LIRs o Switch on pc-rel fixup kind o Fast-path for small methods - single pass assembly o Avoid using cb[n]z for null checks (almost always exceed displacement) o Improve detection of large initialization methods. o Rework def/use flag setup. o Remove a sequential search from FindBlock using lookup table of 16-bit block ids rather than full block pointers. o Eliminate pcRelFixup and use fixup kind instead. o Add check for 16-bit overflow on dex offset. Change-Id: I4c6615f83fed46f84629ad6cfe4237205a9562b4
bd663de599b16229085759366c56e2ed5a1dc7ec	11-Sep-2013	buzbee <buzbee@google.com>	Compile-time tuning: register/bb utilities This CL yields about a 4% improvement in the compilation phase of dex2oat (single-threaded; multi-threaded compilation is more difficult to accurately measure). The register utilities could stand to be completely rewritten, but this gets most of the easy benefit. Next up: the assembly phase. Change-Id: Ife5a474e9b1a6d9e501e888dda6749d34eb77e96
f6c4b3ba3825de1dbb3e747a68b809c6cc8eb4db	25-Aug-2013	Mathieu Chartier <mathieuc@google.com>	New arena memory allocator. Before we were creating arenas for each method. The issue with doing this is that we needed to memset each memory allocation. This can be improved if you start out with arenas that contain all zeroed memory and recycle them for each method. When you give memory back to the arena pool you do a single memset to zero out all of the memory that you used. Always inlined the fast path of the allocation code. Removed the "zero" parameter since the new arena allocator always returns zeroed memory. Host dex2oat time on target oat apks (2 samples each). Before: real 1m11.958s user 4m34.020s sys 1m28.570s After: real 1m9.690s user 4m17.670s sys 1m23.960s Target device dex2oat samples (Mako, Thinkfree.apk): Without new arena allocator: 0m26.47s real 0m54.60s user 0m25.85s system 0m25.91s real 0m54.39s user 0m26.69s system 0m26.61s real 0m53.77s user 0m27.35s system 0m26.33s real 0m54.90s user 0m25.30s system 0m26.34s real 0m53.94s user 0m27.23s system With new arena allocator: 0m25.02s real 0m54.46s user 0m19.94s system 0m25.17s real 0m55.06s user 0m20.72s system 0m24.85s real 0m55.14s user 0m19.30s system 0m24.59s real 0m54.02s user 0m20.07s system 0m25.06s real 0m55.00s user 0m20.42s system Correctness of Thinkfree.apk.oat verified by diffing both of the oat files. Change-Id: I5ff7b85ffe86c57d3434294ca7a621a695bf57a9
468532ea115657709bc32ee498e701a4c71762d4	05-Aug-2013	Ian Rogers <irogers@google.com>	Entry point clean up. Create set of entry points needed for image methods to avoid fix-up at load time: - interpreter - bridge to interpreter, bridge to compiled code - jni - dlsym lookup - quick - resolution and bridge to interpreter - portable - resolution and bridge to interpreter Fix JNI work around to use JNI work around argument rewriting code that'd been accidentally disabled. Remove abstract method error stub, use interpreter bridge instead. Consolidate trampoline (previously stub) generation in generic helper. Simplify trampolines to jump directly into assembly code, keeps stack crawlable. Dex: replace use of int with ThreadOffset for values that are thread offsets. Tidy entry point routines between interpreter, jni, quick and portable. Change-Id: I52a7c2bbb1b7e0ff8a3c3100b774212309d0828e (cherry picked from commit 848871b4d8481229c32e0d048a9856e5a9a17ef9)
848871b4d8481229c32e0d048a9856e5a9a17ef9	05-Aug-2013	Ian Rogers <irogers@google.com>	Entry point clean up. Create set of entry points needed for image methods to avoid fix-up at load time: - interpreter - bridge to interpreter, bridge to compiled code - jni - dlsym lookup - quick - resolution and bridge to interpreter - portable - resolution and bridge to interpreter Fix JNI work around to use JNI work around argument rewriting code that'd been accidentally disabled. Remove abstract method error stub, use interpreter bridge instead. Consolidate trampoline (previously stub) generation in generic helper. Simplify trampolines to jump directly into assembly code, keeps stack crawlable. Dex: replace use of int with ThreadOffset for values that are thread offsets. Tidy entry point routines between interpreter, jni, quick and portable. Change-Id: I52a7c2bbb1b7e0ff8a3c3100b774212309d0828e
7934ac288acfb2552bb0b06ec1f61e5820d924a4	26-Jul-2013	Brian Carlstrom <bdc@google.com>	Fix cpplint whitespace/comments issues Change-Id: Iae286862c85fb8fd8901eae1204cd6d271d69496
2ce745c06271d5223d57dbf08117b20d5b60694a	18-Jul-2013	Brian Carlstrom <bdc@google.com>	Fix cpplint whitespace/braces issues Change-Id: Ide80939faf8e8690d8842dde8133902ac725ed1a
7940e44f4517de5e2634a7e07d58d0fb26160513	12-Jul-2013	Brian Carlstrom <bdc@google.com>	Create separate Android.mk for main build targets The runtime, compiler, dex2oat, and oatdump now are in separate trees to prevent dependency creep. They can now be individually built without rebuilding the rest of the art projects. dalvikvm and jdwpspy were already this way. Builds in the art directory should behave as before, building everything including tests. Change-Id: Ic6b1151e5ed0f823c3dd301afd2b13eb2d8feb81