e7738e8bf567153fde593404d380a5c79ba6bfa8 |
19-Jun-2015 |
Vladimir Marko <vmarko@google.com> |
Quick: Handle total high/low register overlap on arm/mips. OpRegCopyWide() in arm and mips backends didn't handle the total register overlap when the registers holding the source and destination pairs are the same but in reverse order. Bug: 21897012 (cherry picked from commit 8958f7f8702327e713264d0538ab5dec586f3738) Change-Id: I20afce6cc3213e7f7b3edaef91f3ec29c469f877
int_arm.cc
|
3d21bdf8894e780d349c481e5c9e29fe1556051c |
22-Apr-2015 |
Mathieu Chartier <mathieuc@google.com> |
Move mirror::ArtMethod to native Optimizing + quick tests are passing, devices boot. TODO: Test and fix bugs in mips64. Saves 16 bytes per most ArtMethod, 7.5MB reduction in system PSS. Some of the savings are from removal of virtual methods and direct methods object arrays. Bug: 19264997 (cherry picked from commit e401d146407d61eeb99f8d6176b2ac13c4df1e33) Change-Id: I622469a0cfa0e7082a2119f3d6a9491eb61e3f3d Fix some ArtMethod related bugs Added root visiting for runtime methods, not currently required since the GcRoots in these methods are null. Added missing GetInterfaceMethodIfProxy in GetMethodLine, fixes --trace run-tests 005, 044. Fixed optimizing compiler bug where we used a normal stack location instead of double on ARM64, this fixes the debuggable tests. TODO: Fix JDWP tests. Bug: 19264997 Change-Id: I7c55f69c61d1b45351fd0dc7185ffe5efad82bd3 ART: Fix casts for 64-bit pointers on 32-bit compiler. Bug: 19264997 Change-Id: Ief45cdd4bae5a43fc8bfdfa7cf744e2c57529457 Fix JDWP tests after ArtMethod change Fixes Throwable::GetStackDepth for exception event detection after internal stack trace representation change. Adds missing ArtMethod::GetInterfaceMethodIfProxy call in case of proxy method. Bug: 19264997 Change-Id: I363e293796848c3ec491c963813f62d868da44d2 Fix accidental IMT and root marking regression Was always using the conflict trampoline. Also included fix for regression in GC time caused by extra roots. Most of the regression was IMT. Fixed bug in DumpGcPerformanceInfo where we would get SIGABRT due to detached thread. EvaluateAndApplyChanges: From ~2500 -> ~1980 GC time: 8.2s -> 7.2s due to 1s less of MarkConcurrentRoots Bug: 19264997 Change-Id: I4333e80a8268c2ed1284f87f25b9f113d4f2c7e0 Fix bogus image test assert Previously we were comparing the size of the non moving space to size of the image file. Now we properly compare the size of the image space against the size of the image file. 
Bug: 19264997 Change-Id: I7359f1f73ae3df60c5147245935a24431c04808a [MIPS64] Fix art_quick_invoke_stub argument offsets. ArtMethod reference's size got bigger, so we need to move other args and leave enough space for ArtMethod* and 'this' pointer. This fixes mips64 boot. Bug: 19264997 Change-Id: I47198d5f39a4caab30b3b77479d5eedaad5006ab
call_arm.cc
codegen_arm.h
int_arm.cc
|
41b175aba41c9365a1c53b8a1afbd17129c87c14 |
19-May-2015 |
Vladimir Marko <vmarko@google.com> |
ART: Clean up arm64 kNumberOfXRegisters usage. Avoid undefined behavior for arm64 stemming from 1u << 32 in loops with upper bound kNumberOfXRegisters. Create iterators for enumerating bits in an integer either from high to low or from low to high and use them for <arch>Context::FillCalleeSaves() on all architectures. Refactor runtime/utils.{h,cc} by moving all bit-fiddling functions to runtime/base/bit_utils.{h,cc} (together with the new bit iterators) and all time-related functions to runtime/base/time_utils.{h,cc}. Improve test coverage and fix some corner cases for the bit-fiddling functions. Bug: 13925192 (cherry picked from commit 80afd02024d20e60b197d3adfbb43cc303cf29e0) Change-Id: I905257a21de90b5860ebe1e39563758f721eab82
call_arm.cc
int_arm.cc
|
848f70a3d73833fc1bf3032a9ff6812e429661d9 |
15-Jan-2014 |
Jeff Hao <jeffhao@google.com> |
Replace String CharArray with internal uint16_t array. Summary of high level changes: - Adds compiler inliner support to identify string init methods - Adds compiler support (quick & optimizing) with new invoke code path that calls method off the thread pointer - Adds thread entrypoints for all string init methods - Adds map to verifier to log when receiver of string init has been copied to other registers. used by compiler and interpreter Change-Id: I797b992a8feb566f9ad73060011ab6f51eb7ce01
call_arm.cc
|
084f7d43f4dc38bfc71446b1a3b07af085d778bf |
23-Apr-2015 |
Vladimir Marko <vmarko@google.com> |
Quick: Fix out of temp regs in ArmMir2Lir::GenMulLong(). This fixes running out of temp registers for mul-long that needs a temporary to store the result, i.e. when it's stored to stack location [sp, #offset] with offset >= 1024. The bug is currently not reproducible because ARM_R4_SUSPEND_FLAG is off and thus we have the extra register available. However, the code generation could be cleaned up and make use of that extra register, so pre-emptively fix it anyway. Bug: 20110806 Change-Id: I8362c349961dbe28fc3ec8a9299b66fd72f26779
int_arm.cc
|
2cebb24bfc3247d3e9be138a3350106737455918 |
22-Apr-2015 |
Mathieu Chartier <mathieuc@google.com> |
Replace NULL with nullptr Also fixed some lines that were too long, and a few other minor details. Change-Id: I6efba5fb6e03eb5d0a300fddb2a75bf8e2f175cb
assemble_arm.cc
call_arm.cc
int_arm.cc
utility_arm.cc
|
fac10700fd99516e8a14f751fe35553021ce6982 |
22-Apr-2015 |
Vladimir Marko <vmarko@google.com> |
Quick: Remove broken Mir2Lir::LocToRegClass(). Its use in intrinsics has been bogus. In all other instances it's been used under the assumption that the inferred type matches the return type of associated calls. However, if the type inference identifies a type mismatch, the assumption doesn't hold and there isn't necessarily a valid value that the function could reasonably return. Bug: 19918641 Change-Id: I050934e6f9eb00427d0b888ee29ae9eeb509bb3f
fp_arm.cc
int_arm.cc
|
3b7b6cc6f26f1fffbad47a7279525198f4e22652 |
09-Apr-2015 |
Vladimir Marko <vmarko@google.com> |
Merge "Quick: PC-relative loads from dex cache arrays on x86."
|
1961b609bfefaedb71cee3651c4f931cc3e7393d |
08-Apr-2015 |
Vladimir Marko <vmarko@google.com> |
Quick: PC-relative loads from dex cache arrays on x86. Rewrite all PC-relative addressing on x86 and implement PC-relative loads from dex cache arrays. Don't adjust the base to point to the start of the method, let it point to the anchor, i.e. the target of the "call +0" insn. Change-Id: Ic22544a8bc0c5e49eb00a75154dc8f3ead816989
utility_arm.cc
|
1109fb3cacc8bb667979780c2b4b12ce5bb64549 |
07-Apr-2015 |
David Srbecky <dsrbecky@google.com> |
Implement CFI for Quick. CFI is necessary for stack unwinding in gdb, lldb, and libunwind. Change-Id: Ic3b84c9dc91c4bae80e27cda02190f3274e95ae8
call_arm.cc
|
cc23481b66fd1f2b459d82da4852073e32f033aa |
07-Apr-2015 |
Vladimir Marko <vmarko@google.com> |
Promote pointer to dex cache arrays on arm. Do the use-count analysis on temps (ArtMethod* and the new PC-relative temp) in Mir2Lir, rather than MIRGraph. MIRGraph isn't really supposed to know how the ArtMethod* is used by the backend. Change-Id: Iaf56a46ae203eca86281b02b54f39a80fe5cc2dd
call_arm.cc
codegen_arm.h
int_arm.cc
target_arm.cc
utility_arm.cc
|
e5c76c515a481074aaa6b869aa16490a47ba98bc |
06-Apr-2015 |
Vladimir Marko <vmarko@google.com> |
PC-relative loads from dex cache arrays for arm. Change-Id: Ic25df4b51a901ff1d2ca356b5eec71d4acc5d9b7
call_arm.cc
codegen_arm.h
int_arm.cc
target_arm.cc
|
6f7158927fee233255f8e96719c374694b10cad3 |
30-Mar-2015 |
David Srbecky <dsrbecky@google.com> |
Write .debug_line section using the new DWARF library. Also simplify dex to java mapping and handle mapping in prologues and epilogues. Change-Id: I410f06024580f2a8788f2c93fe9bca132805029a
assemble_arm.cc
call_arm.cc
|
20f85597828194c12be10d3a927999def066555e |
19-Mar-2015 |
Vladimir Marko <vmarko@google.com> |
Fixed layout for dex caches in boot image. Define a fixed layout for dex cache arrays (type, method, string and field arrays) for dex caches in the boot image. This gives those arrays fixed offsets from the boot image code and allows PC-relative addressing of their elements. Use the PC-relative load on arm64 for relevant instructions, i.e. invoke-static, invoke-direct, const-string, const-class, check-cast and instance-of. This reduces the arm64 boot.oat on Nexus 9 by 1.1MiB. This CL provides the infrastructure and shows on the arm64 the gains that we can achieve by having fixed dex cache arrays' layout. To fully use this for the boot images, we need to implement the PC-relative addressing for other architectures. To achieve similar gains for apps, we need to move the dex cache arrays to a .bss section of the oat file. These changes will be implemented in subsequent CLs. (Also remove some compiler_driver.h dependencies to reduce incremental build times.) Change-Id: Ib1859fa4452d01d983fd92ae22b611f45a85d69b
call_arm.cc
|
f6737f7ed741b15cfd60c2530dab69f897540735 |
23-Mar-2015 |
Vladimir Marko <vmarko@google.com> |
Quick: Clean up Mir2Lir codegen. Clean up WrapPointer()/UnwrapPointer() and OpPcRelLoad(). Change-Id: I1a91f01e1e779599c77f3f6efcac2a6ad34629cf
assemble_arm.cc
call_arm.cc
codegen_arm.h
int_arm.cc
target_arm.cc
|
0b40ecf156e309aa17c72a28cd1b0237dbfb8746 |
20-Mar-2015 |
Vladimir Marko <vmarko@google.com> |
Quick: Clean up slow paths. Change-Id: I278d42be77b02778c4a419ae9024b37929915b64
call_arm.cc
|
f2674eac59c02dc2046c7080a799c03ccf66384d |
16-Mar-2015 |
Andrew Hsieh <andrewhsieh@google.com> |
Fixed maybe-used-uninitialized warning. GCC 4.9 found that ops[1].op may be uninitialized in ArmMir2Lir::GetEasyMultiplyTwoOps, but is used unconditionally in ArmMir2Lir::GenEasyMultiplyTwoOps. Change-Id: Icf8fdf3b888bd54ccb252e95637774889c7a0f9d
int_arm.cc
|
e15ea086439b41a805d164d2beb07b4ba96aaa97 |
10-Feb-2015 |
Hiroshi Yamauchi <yamauchi@google.com> |
Reserve bits in the lock word for read barriers. This prepares for the CC collector to use the standard object header model by storing the read barrier state in the lock word. Bug: 19355854 Bug: 12687968 Change-Id: Ia7585662dd2cebf0479a3e74f734afe5059fb70f
call_arm.cc
|
335c55527846fc9019246163be0ac1ac02e95057 |
04-Feb-2015 |
Ningsheng Jian <ningsheng.jian@arm.com> |
ARM: Fix LIR flags in encoding map. Also correct memory reference type for PC relative load. Change-Id: I7a5258f2ed718448dc3e6e7fda6569b3f0c2fe46
assemble_arm.cc
int_arm.cc
|
6561551f094f79ce569160b29131b07be7aa3363 |
19-Feb-2015 |
Mathieu Chartier <mathieuc@google.com> |
Merge "Move arenas into runtime"
|
b666f4805c8ae707ea6fd7f6c7f375e0b000dba8 |
18-Feb-2015 |
Mathieu Chartier <mathieuc@google.com> |
Move arenas into runtime Moved arena pool into the runtime. Motivation: Allow GC to use arena allocators, recycle arena pool for linear alloc. Bug: 19264997 Change-Id: I8ddbb6d55ee923a980b28fb656c758c5d7697c2f
codegen_arm.h
|
6ce3eba0f2e6e505ed408cdc40d213c8a512238d |
16-Feb-2015 |
Vladimir Marko <vmarko@google.com> |
Add suspend checks to special methods. Generate suspend checks at the beginning of special methods. If we need to call to runtime, go to the slow path where we create a simplified but valid frame, spill all arguments, call art_quick_test_suspend, restore necessary arguments and return back to the fast path. This keeps the fast path overhead to a minimum. Bug: 19245639 Change-Id: I3de5aee783943941322a49c4cf2c4c94411dbaa2
call_arm.cc
codegen_arm.h
|
72f53af0307b9109a1cfc0671675ce5d45c66d3a |
12-Nov-2014 |
Chao-ying Fu <chao-ying.fu@intel.com> |
ART: Remove MIRGraph::dex_pc_to_block_map_ This patch removes MIRGraph::dex_pc_to_block_map_, adds a local variable dex_pc_to_block_map inside MIRGraph::InlineMethod(), and updates several functions to pass dex_pc_to_block_map. The goal is to limit the scope of dex_pc_to_block_map and the usage of FindBlock, so that various compiler optimizations cannot rely on dex pc to look up basic blocks to avoid duplicated dex pc issues. Also, this patch changes quick targets to use successor blocks for switch case target generation at Mir2Lir::InstallSwitchTables(). Change-Id: I9f571efebd2706b4e1606279bd61f3b406ecd1c4 Signed-off-by: Chao-ying Fu <chao-ying.fu@intel.com>
call_arm.cc
|
a34e760fa5cc3102ce1998f10816d380c37f43aa |
02-Feb-2015 |
Zheng Xu <zheng.xu@arm.com> |
ARM/ARM64: Dump thread offset. Dump thread offset in compiler verbose log for arm32/arm64 and oatdump for arm64. Before patch : 0x4e: ldr lr, [rSELF, #604] After patch : 0x4e: ldr lr, [rSELF, #604] ; pTestSuspend Change-Id: I514e69dc44b1cf4c8a8fa085b31f93cf6a1b7c91
target_arm.cc
|
a2e18ed9397f21c96eae4a26df9ca35a6a97341d |
27-Jan-2015 |
Andreas Gampe <agampe@google.com> |
ART: Fix inlining of Mir2Lir Missed -inl includes. Change-Id: I39e6d603c7f5d36693aca3816653594488bff63f
assemble_arm.cc
|
0b9203e7996ee1856f620f95d95d8a273c43a3df |
23-Jan-2015 |
Andreas Gampe <agampe@google.com> |
ART: Some Quick cleanup Make several fields const in CompilationUnit. May benefit some Mir2Lir code that repeats tests, and in general immutability is good. Remove compiler_internals.h and refactor some other headers to reduce overly broad imports (and thus forced recompiles on changes). Change-Id: I898405907c68923581373b5981d8a85d2e5d185a
arm_lir.h
assemble_arm.cc
call_arm.cc
codegen_arm.h
fp_arm.cc
int_arm.cc
target_arm.cc
utility_arm.cc
|
97d9f286971a4c1eec70e08f9f18f990d21780d5 |
20-Jan-2015 |
Andreas Gampe <agampe@google.com> |
Merge "ART: Some Quick cleanup"
|
d500b53ff8742f76b63c9f7593082d9e8114b85f |
17-Jan-2015 |
Andreas Gampe <agampe@google.com> |
ART: Some Quick cleanup Move some definitions around. In case a method is already virtual, avoid instruction-set tests. Change-Id: I8d98f098e55ade1bc0cfa32bb2aad006caccd07d
codegen_arm.h
int_arm.cc
|
ddf05aa4c4ed4a86ac51fd5eca87739c3164fd9c |
14-Jan-2015 |
Dmitry Petrochenko <dmitry.petrochenko@intel.com> |
ART: Fix compiler warning at arm/int_arm.cc:644 Clang can report warning: int_arm.cc:644:89: error: 'ops.art::ArmMir2Lir::EasyMultiplyOp::shift' may be used uninitialized in this function OpRegRegRegShift(kOpRsub, r_tmp1, r_src, r_src, EncodeShift(kArmLsl, ops[0].shift)); That warning becomes blocker for libart-compiler.so compilation. This patch fixes the only case where 'shift' member was uninitialized. Change-Id: I6428170994c9f97e7a3d85d752b97dfcdff0c8a4 Signed-off-by: Dmitry Petrochenko <dmitry.petrochenko@intel.com>
int_arm.cc
|
ea1c64dba97cb367e6594332f2e542bfcf30e295 |
14-Jan-2015 |
Vladimir Marko <vmarko@google.com> |
Merge "Fix wide volatile IGET/IPUT on ARM without atomic ldrd/strd."
|
ee5e273e4d0dd91b480c8d5dbcccad15c1b7353c |
13-Jan-2015 |
Vladimir Marko <vmarko@google.com> |
Fix wide volatile IGET/IPUT on ARM without atomic ldrd/strd. If ldrd/strd isn't atomic, IPUT_WIDE uses ldrexd+strexd and we need to record the safepoint for the ldrexd rather than strexd. IGET_WIDE was simply missing the memory barrier. Bug: 18993519 Change-Id: I4e9270b994f413c1a047c1c4bb9cce5f29e42cb4
utility_arm.cc
|
69c15d340e7e76821bbc5d4494d4cef383774dee |
13-Jan-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Skip r1 on arm if first parameter is a long. Change-Id: I16d927ee0a0b55031ade4c92c0095fd74e18ed5b
target_arm.cc
|
d30feca670d0af02783bbdfd4a29c5078c18bdc5 |
06-Jan-2015 |
Andreas Gampe <agampe@google.com> |
Merge "ART: Remove LowestSetBit and IsPowerOfTwo"
|
cfe71e59c667abb35bc2363c49af7f8b549c44d0 |
06-Jan-2015 |
Andreas Gampe <agampe@google.com> |
ART: Fix divide-by-zero for ARM There was an infinite loop in the code generation for a divide by literal zero. Bug: 18887754 Change-Id: Ibd481918d3c6d7bc62fdd1a6807042009f561d95
int_arm.cc
|
7e499925f8b4da46ae51040e9322690f3df992e6 |
06-Jan-2015 |
Andreas Gampe <agampe@google.com> |
ART: Remove LowestSetBit and IsPowerOfTwo Remove those functions from Mir2Lir and replace with functionality from utils.h. Change-Id: Ieb67092b22d5d460b5241c7c7931c15b9faf2815
call_arm.cc
int_arm.cc
|
bfe400bb1a28cde991cdb3e39bc27bae6b04b8c2 |
19-Dec-2014 |
Vladimir Marko <vmarko@google.com> |
Fix running out of temps when storing invoke-interface result. On ARM, after emitting invoke-interface we didn't have any free temps to use for storing the result, so we would crash if the result was an unpromoted dalvik register with stack location too far from SP. Bug: 18769895 (cherry picked from commit d6bd06c713e8ec69de96510ef57bdf7adb4781ed) Change-Id: Id88f6f3788eaf6ecbc7bd68880b445423f6e4f94
target_arm.cc
|
a262f7707330dccfb50af6345813083182b61043 |
25-Nov-2014 |
Ningsheng Jian <ningsheng.jian@arm.com> |
ARM: Combine multiply accumulate operations. Try to combine integer multiply and add(sub) into a MAC operation. For AArch64, also try to combine long type multiply and add(sub). Change-Id: Ic85812e941eb5a66abc355cab81a4dd16de1b66e
arm_lir.h
assemble_arm.cc
codegen_arm.h
int_arm.cc
target_arm.cc
|
6c964c98400b8c0949d5e369968da2d4809b772f |
08-Dec-2014 |
Vladimir Marko <vmarko@google.com> |
Merge "Re-factor Quick ABI support"
|
717a3e447c6f7a922cf9c3efe522747a187a045d |
13-Nov-2014 |
Serguei Katkov <serguei.i.katkov@intel.com> |
Re-factor Quick ABI support Now every architecture must provide a mapper between VRs parameters and physical registers. Additionally as a helper function architecture can provide a bulk copy helper for GenDalvikArgs utility. All other things becomes a common code stuff: GetArgMappingToPhysicalReg, GenDalvikArgsNoRange, GenDalvikArgsRange, FlushIns. Mapper now uses shorty representation of input parameters. This is required due to location are not enough to detect the type of parameter (fp or core). For the details see https://android-review.googlesource.com/#/c/113936/. Change-Id: Ie762b921e0acaa936518ee6b63c9a9d25f83e434 Signed-off-by: Serguei Katkov <serguei.i.katkov@intel.com>
codegen_arm.h
target_arm.cc
|
aed3ad734c47fdccf179ff65971284a0d38583cd |
03-Dec-2014 |
Vladimir Marko <vmarko@google.com> |
Quick: Use fewer insns for ARM LDR/STR with large offsets. LDR with large offset is frequently used for reading from DexCache arrays, for example for static and direct invokes. STR with large offset is rarely used but it's updated for consistency. Change-Id: I75871416cecbfd7fe7de590922cea0376a2f4019
utility_arm.cc
|
a29f698b1754ee0ea2f46b6f5900e0da840dff79 |
25-Nov-2014 |
Vladimir Marko <vmarko@google.com> |
Implement InexpensiveConstantInt(., opcode) for ARM. Fix kThumb2{Add,Sub}RRI12 to be used for their full range. Add ORN for completeness. Change-Id: I49a51541fa9ea085d4674b9131d8dd94da5337f3
arm_lir.h
assemble_arm.cc
codegen_arm.h
int_arm.cc
utility_arm.cc
|
758662e9d02727d9a88be395a92b476843c44d36 |
01-Dec-2014 |
Vladimir Marko <vmarko@google.com> |
Merge "Quick: Fix neg-long on ARM for overlapping regs."
|
0a080978fc822c51f06cb615662ee9ddcba4f677 |
01-Dec-2014 |
Vladimir Marko <vmarko@google.com> |
Merge "Quick: Use 16-bit conditional branch in Thumb2."
|
2f340a843ea5b3413c901f8c2365243b68864468 |
01-Dec-2014 |
Vladimir Marko <vmarko@google.com> |
Quick: Fix neg-long on ARM for overlapping regs. Bug: 18569347 Change-Id: I764a4648b7ea5fd92f1ffbb9038b9d101b50d137
int_arm.cc
|
174636dad59068fc6e879b147ae02ac932f38c6f |
26-Nov-2014 |
Vladimir Marko <vmarko@google.com> |
Quick: Use 16-bit conditional branch in Thumb2. We were using the 32-bit version because the compilation time impact of having to change the instruction length and reassemble instructions when the target is out of range was too high. However, the assembly phase has been rewritten since making that decision and the compile time impact is now insignificant, so we prefer to save space. Change-Id: Ib90f90d3f4e0c4e310267af272e3b16611026bbe
utility_arm.cc
|
9d5c25acdd1e9635fde8f8bf52a126b4d371dabd |
26-Nov-2014 |
Vladimir Marko <vmarko@google.com> |
Quick: Use 16-bit Thumb2 PUSH/POP when possible. Generate correct PUSH/POP in Gen{Entry,Exit}Sequence() to avoid extra processing during insn fixup. Change-Id: I396168e2a42faee6980d40779c7de9657531867b
assemble_arm.cc
call_arm.cc
|
743b98cd3d7db1cfd6b3d7f7795e8abd9d07a42d |
24-Nov-2014 |
Vladimir Marko <vmarko@google.com> |
Skip null check in MarkGCCard() for known non-null values. Use GVN's knowledge of non-null values to set a new MIR flag for IPUT/SPUT/APUT to skip the value null check. Change-Id: I97a8d1447acb530c9bbbf7b362add366d1486ee1
int_arm.cc
|
4514d2ac529064819d4f02699527764afa140008 |
21-Nov-2014 |
Vladimir Marko <vmarko@google.com> |
Merge "Add card mark to filled-new-array."
|
bf535be514570fc33fc0a6347a87dcd9097d9bfd |
19-Nov-2014 |
Vladimir Marko <vmarko@google.com> |
Add card mark to filled-new-array. Bug: 18032332 Change-Id: I35576b27f9115e4d0b02a11afc5e483b9e93a04a
call_arm.cc
codegen_arm.h
|
8366ca0d7ba3b80a2d5be65ba436446cc32440bd |
17-Nov-2014 |
Elliott Hughes <enh@google.com> |
Fix the last users of TARGET_CPU_SMP. Everyone else assumes SMP. Change-Id: I7ff7faef46fbec6c67d6e446812d599e473cba39
int_arm.cc
|
2d7210188805292e463be4bcf7a133b654d7e0ea |
10-Nov-2014 |
Mathieu Chartier <mathieuc@google.com> |
Change 64 bit ArtMethod fields to be pointer sized Changed the 64 bit entrypoint and gc map fields in ArtMethod to be pointer sized. This saves a large amount of memory on 32 bit systems. Reduces ArtMethod size by 16 bytes on 32 bit. Total number of ArtMethod on low memory mako: 169957 Image size: 49203 methods -> 787248 image size reduction. Zygote space size: 1070 methods -> 17120 size reduction. App methods: ~120k -> 2 MB savings. Savings per app on low memory mako: 125K+ per app (less active apps -> more image methods per app). Savings depend on how often the shared methods are on dirty pages vs shared. TODO in another CL, delete gc map field from ArtMethod since we should be able to get it from the Oat method header. Bug: 17643507 Change-Id: Ie9508f05907a9f693882d4d32a564460bf273ee8 (cherry picked from commit e832e64a7e82d7f72aedbd7d798fb929d458ee8f)
call_arm.cc
|
d582fa4ea62083a7598dded5b82dc2198b3daac7 |
06-Nov-2014 |
Ian Rogers <irogers@google.com> |
Instruction set features for ARM64, MIPS and X86. Also, refactor how feature strings are handled so they are additive or subtractive. Make MIPS have features for FPU 32-bit and MIPS v2. Use in the quick compiler rather than #ifdefs that wouldn't have worked in cross-compilation. Add SIMD features for x86/x86-64 proposed in: https://android-review.googlesource.com/#/c/112370/ Bug: 18056890 Change-Id: Ic88ff84a714926bd277beb74a430c5c7d5ed7666
utility_arm.cc
|
b28c1c06236751aa5c9e64dcb68b3c940341e496 |
08-Nov-2014 |
Ian Rogers <irogers@google.com> |
Tidy RegStorage for X86. Don't use global variables initialized in constructors to hold onto constant values, instead use the TargetReg32 helper. Improve this helper with the use of lookup tables. Elsewhere prefer to use constexpr values as they will have less runtime cost. Add an ostream operator to RegStorage for CHECK_EQ and use. Change-Id: Ib8d092d46c10dac5909ecdff3cc1e18b7e9b1633
utility_arm.cc
|
8ba17f6ce3853d4bdeee7527c9900e018781cf24 |
28-Oct-2014 |
Ian Rogers <irogers@google.com> |
Don't enable ARM_R4_SUSPEND_FLAG. Bug: 17953517 Change-Id: I4578f1ffbfc987d5d178c7586b6bb99882ed19bb
arm_lir.h
int_arm.cc
|
675e09b2753c2fcd521bd8f0230a0abf06e9b0e9 |
23-Oct-2014 |
Ningsheng Jian <ningsheng.jian@arm.com> |
ARM: Strength reduction for floating-point division For floating-point division by power of two constants, generate multiplication by the reciprocal instead. Change-Id: I39c79eeb26b60cc754ad42045362b79498c755be
codegen_arm.h
fp_arm.cc
|
277ccbd200ea43590dfc06a93ae184a765327ad0 |
04-Nov-2014 |
Andreas Gampe <agampe@google.com> |
ART: More warnings Enable -Wno-conversion-null, -Wredundant-decls and -Wshadow in general, and -Wunused-but-set-parameter for GCC builds. Change-Id: I81bbdd762213444673c65d85edae594a523836e5
utility_arm.cc
|
ad17d41841ba1fb177fb0bf175ec0e9f5e1412b3 |
04-Nov-2014 |
Andreas Gampe <agampe@google.com> |
Merge "ART: Replace COMPILE_ASSERT with static_assert (compiler)"
|
785d2f2116bb57418d81bb55b55a087afee11053 |
04-Nov-2014 |
Andreas Gampe <agampe@google.com> |
ART: Replace COMPILE_ASSERT with static_assert (compiler) Replace all occurrences of COMPILE_ASSERT in the compiler tree. Change-Id: Icc40a38c8bdeaaf7305ab3352a838a2cd7e7d840
target_arm.cc
|
6a3c1fcb4ba42ad4d5d142c17a3712a6ddd3866f |
31-Oct-2014 |
Ian Rogers <irogers@google.com> |
Remove -Wno-unused-parameter and -Wno-sign-promo from base cflags. Fix associated errors about unused parameters and implicit sign conversions. For sign conversion this was largely in the area of enums, so add ostream operators for the affected enums and fix tools/generate-operator-out.py. Tidy arena allocation code and arena allocated data types, rather than fixing new and delete operators. Remove dead code. Change-Id: I5b433e722d2f75baacfacae4d32aef4a828bfe1b
arm_lir.h
call_arm.cc
codegen_arm.h
int_arm.cc
target_arm.cc
utility_arm.cc
|
fb311f8a0d0eafd535f8d25d262dcea35a8feaa4 |
28-Oct-2014 |
Vladimir Marko <vmarko@google.com> |
Remove useless suspend points from arm/arm64 AGET/APUT. Change-Id: Ib17da0c02599b943cb62582a8a25f187272d423b
int_arm.cc
|
5667fdbb6e441dee7534ade18b628ed396daf593 |
23-Oct-2014 |
Zheng Xu <zheng.xu@arm.com> |
ARM: Use hardfp calling convention between java to java call. This patch default to use hardfp calling convention. Softfp can be enabled by setting kArm32QuickCodeUseSoftFloat to true. We get about -1 ~ +5% performance improvement with different benchmark tests. Hopefully, we should be able to get more performance by address the left TODOs, as some part of the code takes the original assumption which is not optimal. DONE: 1. Interpreter to quick code 2. Quick code to interpreter 3. Transition assembly and callee-saves 4. Trampoline(generic jni, resolution, invoke with access check and etc.) 5. Pass fp arg reg following aapcs(gpr and stack do not follow aapcs) 6. Quick helper assembly routines to handle ABI differences 7. Quick code method entry 8. Quick code method invocation 9. JNI compiler TODO: 10. Rework ArgMap, FlushIn, GenDalvikArgs and affected common code. 11. Rework CallRuntimeHelperXXX(). Change-Id: I9965d8a007f4829f2560b63bcbbde271bdcf6ec2
arm_lir.h
codegen_arm.h
int_arm.cc
target_arm.cc
utility_arm.cc
|
b62ff579cd870b0bf213765b07d7b404d15ece7b |
25-Oct-2014 |
Ian Rogers <irogers@google.com> |
Merge "ART: Add div/rem zero check elimination flag"
|
6f3dbbadf4ce66982eb3d400e0a74cb73eb034f3 |
15-Oct-2014 |
Ian Rogers <irogers@google.com> |
Make ART compile with GCC -O0 again. Tidy up InstructionSetFeatures so that it has a type hierarchy dependent on architecture. Add to instruction_set_test to warn when InstructionSetFeatures don't agree with ones from system properties, AT_HWCAP and /proc/cpuinfo. Clean-up class linker entry point logic to not return entry points but to test whether the passed code is the particular entrypoint. This works around image trampolines that replicate entrypoints. Bug: 17993736 Change-Id: I5f4b49e88c3b02a79f9bee04f83395146ed7be23
utility_arm.cc
|
5c5676b26a08454b3f0133783778991bbe5dd681 |
30-Sep-2014 |
Razvan A Lupusoru <razvan.a.lupusoru@intel.com> |
ART: Add div/rem zero check elimination flag Just as with other throwing bytecodes, it is possible to prove in some cases that a divide/remainder won't throw ArithmeticException. For example, in case two divides with same denominator are in order, then provably the second one cannot throw if the first one did not. This patch adds the elimination flag and updates the signature of several Mir2Lir methods to take the instruction optimization flags into account. Change-Id: I0b078cf7f29899f0f059db1f14b65a37444b84e8 Signed-off-by: Razvan A Lupusoru <razvan.a.lupusoru@intel.com>
codegen_arm.h
int_arm.cc
|
fc787ecd91127b2c8458afd94e5148e2ae51a1f5 |
10-Oct-2014 |
Ian Rogers <irogers@google.com> |
Enable -Wimplicit-fallthrough. Falling through switch cases on a clang build must now annotate the fallthrough with the FALLTHROUGH_INTENDED macro. Bug: 17731372 Change-Id: I836451cd5f96b01d1ababdbf9eef677fe8fa8324
assemble_arm.cc
int_arm.cc
utility_arm.cc
|
d8c3e3608a7b47e82186e4f8118541ef06d9eab2 |
08-Oct-2014 |
Alexei Zavjalov <alexei.zavjalov@intel.com> |
ART: X86: GenLongArith should handle overlapped VRs In the case when src and dest VRs overlap, calling GenLongArith may cause incorrect use of regs. The solution is to map src to a physical reg and work with this reg instead of mem. Renamed BadOverlap() to PartiallyIntersects() for consistency. Change-Id: Ia3fc7f741f0a92556e1b2a1b084506662ef04c9d Signed-off-by: Katkov, Serguei I <serguei.i.katkov@intel.com> Signed-off-by: Alexei Zavjalov <alexei.zavjalov@intel.com>
int_arm.cc
|
832336b3c9eb892045a8de1bb12c9361112ca3c5 |
09-Oct-2014 |
Ian Rogers <irogers@google.com> |
Don't copy fill array data to quick literal pool. Currently quick copies the fill array data from the dex file to the literal pool. It then has to go through hoops to pass this PC relative address down to out-of-line code. Instead, pass the offset of the table to the out-of-line code and use the CodeItem data associated with the ArtMethod. This reduces the size of oat code while greatly simplifying it. Unify the FillArrayData implementation in quick, portable and the interpreters. Change-Id: I9c6971cf46285fbf197856627368c0185fdc98ca
call_arm.cc
codegen_arm.h
|
7e70b002c4552347ed1af8c002a0e13f08864f20 |
08-Oct-2014 |
Ian Rogers <irogers@google.com> |
Header file clean up. Remove runtime.h from object.h. Move TypeStaticIf to its own header file to avoid bringing utils.h into allocator.h. Move Array::DataOffset into -inl.h as it now has a utils.h dependency. Fix include issues arising from this. Change-Id: I4605b1aa4ff5f8dc15706a0132e15df03c7c8ba0
int_arm.cc
|
7c02e918e752ab36f0b6cab7528f10c0cf55a4ee |
03-Oct-2014 |
buzbee <buzbee@google.com> |
Quick compiler: Fix ambiguous LoadValue() Internal b/17790197 & hat tip to Stephen Kyle The following custom-edited dex program demonstrated incorrect code generation caused by type confusion. In the example, the constant held in v0 is used in both float and int contexts, and the register class gets confused at the if-eq. .method private static getInt()I .registers 4 const/16 v0, 100 const/4 v1, 1 const/4 v2, 7 :loop if-eq v2, v0, :done add-int v2, v2, v1 goto :loop :done add-float v3, v0, v1 return v2 .end method The bug was introduced in c/96499, "Quick compiler: reference cleanup" That CL created a convenience variant of LoadValue which selected the target register type based on the type of the RegLocation. It should not have done so. The type of a RegLocation is the compiler's best guess of the Dalvik type - and Dalvik allows constants to be used in multiple type contexts. All code generation utilities must specify desired register class based on the capabilities of the instructions to be emitted. In the failing case, OpCmpImmBranch (and GenCompareZeroAndBranch) will be using core registers, so the LoadValue must specify either kCoreReg or kRefReg. The CL deletes the dangerous LoadValue() variant. Change-Id: Ie4ec6e51b19676dbbb9628c72c8b3473a419e7ec
int_arm.cc
|
750359753444498d509a756fa9a042e9f3c432df |
12-Sep-2014 |
Razvan A Lupusoru <razvan.a.lupusoru@intel.com> |
ART: Deprecate CompilationUnit's code_item The code_item field is tracked in both the CompilationUnit and the MIRGraph. However, the existence of this field in CompilationUnit promotes bad practice because it creates assumption only a single code_item can be part of method. This patch deprecates this field and updates MIRGraph methods to make it easy to get same information as before. Part of this is the update to interface GetNumDalvikInsn which ensures to count all code_items in MIRGraph. Some dead code was also removed because it was not friendly to these updates. Change-Id: Ie979be73cc56350321506cfea58f06d688a7fe99 Signed-off-by: Razvan A Lupusoru <razvan.a.lupusoru@intel.com>
int_arm.cc
|
f4da675bbc4615c5f854c81964cac9dd1153baea |
01-Aug-2014 |
Vladimir Marko <vmarko@google.com> |
Implement method calls using relative BL on ARM. Store the linker patches with each CompiledMethod instead of keeping them in CompilerDriver. Reorganize oat file creation to apply the patches as we're writing the method code. Add framework for platform-specific relative call patches in the OatWriter. Implement relative call patches for ARM. Change-Id: Ie2effb3d92b61ac8f356140eba09dc37d62290f8
rm_lir.h
ssemble_arm.cc
all_arm.cc
odegen_arm.h
arget_arm.cc
|
e39c54ea575ec710d5e84277fcdcc049f8acb3c9 |
22-Sep-2014 |
Vladimir Marko <vmarko@google.com> |
Deprecate GrowableArray, use ArenaVector instead. Purge GrowableArray from Quick and Portable. Remove GrowableArray<T>::Iterator. Change-Id: I92157d3a6ea5975f295662809585b2dc15caa1c6
all_arm.cc
arget_arm.cc
|
9863daf4fdc1a08339edac794452dbc719aef4f1 |
04-Sep-2014 |
Serguei Katkov <serguei.i.katkov@intel.com> |
AddIntrinsicSlowPath with resume requires clobbering AddIntrinsicSlowPath with resume results in a call. So all temps must be clobbered at the point where AddIntrinsicSlowPath returns. Change-Id: If9eb887e295ff5e59920f4da1cef63258ad490b0 Signed-off-by: Serguei Katkov <serguei.i.katkov@intel.com>
nt_arm.cc
|
b5477f0f8080cef4cb8a9dcea4367344ea7cdab4 |
10-Sep-2014 |
Junmo Park <junmoz.park@samsung.com> |
Fix kThumb2Vldrd definition to set correct flag. kThumb2Vldrd should be set to IS_LOAD_OFF4, not IS_LOAD_OFF. Change-Id: I6b8ec3c54513f687a846ba7f3a817f6e439abcc9 Signed-off-by: Junmo Park <junmoz.park@samsung.com>
ssemble_arm.cc
|
eacc5f015fab4d6607f72165b0902f49f7d18763 |
01-Sep-2014 |
Junmo Park <junmoz.park@samsung.com> |
Fix Thumb2Stm, ldm definition of EncodingMap for arm The Thumb2Stm instruction can save r0-r12 and r14, but the definition in the EncodingMap only sets r0-r12, so it is fixed like Thumb2Stmia. Add new assembler formats kFmtLdmRegList and kFmtStmRegList. Change-Id: Id03118d602f9d49d9d916f3dd9f3198f24ab9c37
rm_lir.h
ssemble_arm.cc
|
8d0d03e24325463f0060abfd05dba5598044e9b1 |
07-Jun-2014 |
Razvan A Lupusoru <razvan.a.lupusoru@intel.com> |
ART: Change temporaries to positive names Changes compiler temporaries to have positive names. The numbering now puts them above the code VRs (locals + ins, in that order). The patch also introduces APIs to query the number of temporaries, locals and ins. The compiler temp infrastructure suffered from several issues which are also addressed by this patch: -There is no longer a queue of compiler temps. This would be polluted with Method* when post opts were called multiple times. -Sanity checks have been added to allow requesting of temps from BE and to prevent temps after frame is committed. -None of the structures holding temps can overflow because they are allocated to allow holding maximum temps. Thus temps can be requested by BE with no problem. -Since the queue of compiler temps is no longer maintained, it is no longer possible to refer to a temp that has invalid ssa (because it was requested before ssa was run). -The BE can now request temps after all ME allocations and it is guaranteed to actually receive them. -ME temps are now treated like normal VRs in all cases with no special handling. Only the BE temps are handled specially because there are no references to them from MIRs. -Deprecated and removed several fields in CompilationUnit that saved register information and updated callsites to call the new interface from MIRGraph. Change-Id: Ia8b1fec9384a1a83017800a59e5b0498dfb2698c Signed-off-by: Razvan A Lupusoru <razvan.a.lupusoru@intel.com> Signed-off-by: Udayan Banerji <udayan.banerji@intel.com>
all_arm.cc
odegen_arm.h
|
53c913bb71b218714823c8c87a1f92830c336f61 |
13-Aug-2014 |
Andreas Gampe <agampe@google.com> |
ART: Clean up compiler Clean up the compiler: less extern functions, dis-entangle compilers, hide some compiler specifics, lower global includes. Change-Id: Ibaf88d02505d86994d7845cf0075be5041cc8438
ackend_arm.h
odegen_arm.h
arget_arm.cc
|
a5f90b6f0f5b33487e71eaeb05508555f17dcf30 |
14-Aug-2014 |
Vladimir Marko <vmarko@google.com> |
Fix intrinsic Math.abs(double) for ARM. Bug: 16930909 Change-Id: I1210cb3aa82a73b9e4d4df1ceddeff78ac1df42b
p_arm.cc
|
648d7112609dd19c38131b3e71c37bcbbd19d11e |
26-Jul-2014 |
Dave Allison <dallison@google.com> |
Reduce stack usage for overflow checks This reduces the stack space reserved for overflow checks to 12K, split into an 8K gap and a 4K protected region. GC needs over 8K when running in a stack overflow situation. Also prevents signal runaway by detecting a signal inside code that resulted from a signal handler invocation. And adds a max signal count to the SignalTest to prevent it running forever. Also reduces the number of iterations for the InterfaceTest as this was taking (almost) forever with the --trace option on run-test. Bug: 15435566 Change-Id: Id4fd46f22d52d42a9eb431ca07948673e8fda694
all_arm.cc
|
947717a2b085f36ea007ac64f728e19ff1c8db0b |
07-Aug-2014 |
Zheng Xu <zheng.xu@arm.com> |
Add arraycopy intrinsic for arm and arm64. Implement intrinsic for java.lang.System.arraycopy(char[], int, char[], int, int). Bug: 16241558 Change-Id: I558a9c4403d0c3abb07af1511d394981bbfcabc5
odegen_arm.h
nt_arm.cc
|
48971b3242e5126bcd800cc9c68df64596b43d13 |
06-Aug-2014 |
Andreas Gampe <agampe@google.com> |
ART: Generate chained compare-and-branch for short switches Refactor Mir2Lir to generate chained compare-and-branch sequences for short switches on all architectures. Change-Id: Ie2a572ae69d462ba68a119e9fb93ae538cddd08f
all_arm.cc
odegen_arm.h
|
c76c614d681d187d815760eb909e5faf488a3c35 |
05-Aug-2014 |
Andreas Gampe <agampe@google.com> |
ART: Refactor long ops in quick compiler Make GenArithOpLong virtual. Let the implementation in gen_common be very basic, without instruction-set checks, and meant as a fall-back. Backends should implement and dispatch to code for better implementations. This allows to remove the GenXXXLong virtual methods from Mir2Lir, and clean up the backends (especially removing some LOG(FATAL) implementations). Change-Id: I6366443c0c325c1999582d281608b4fa229343cf
odegen_arm.h
nt_arm.cc
|
63999683329612292d534e6be09dbde9480f1250 |
15-Jul-2014 |
Serban Constantinescu <serban.constantinescu@arm.com> |
Revert "Revert "Enable Load Store Elimination for ARM and ARM64"" This patch refactors the implementation of the LoadStoreElimination optimisation pass. Please note that this pass was disabled and not functional for any of the backends. The current implementation tracks aliases and handles DalvikRegs as well as Heap memory regions. It has been tested and it is known to optimise out the following: * Load - Load * Store - Load * Store - Store * Load Literals Change-Id: I3aadb12a787164146a95bc314e85fa73ad91e12b
ssemble_arm.cc
odegen_arm.h
tility_arm.cc
|
984305917bf57b3f8d92965e4715a0370cc5bcfb |
28-Jul-2014 |
Andreas Gampe <agampe@google.com> |
ART: Rework quick entrypoint code in Mir2Lir, cleanup To reduce the complexity of calling trampolines in generic code, introduce an enumeration for entrypoints. Introduce a header that lists the entrypoint enum and exposes a templatized method that translates an enum value to the corresponding thread offset value. Call helpers are rewritten to have an enum parameter instead of the thread offset. Also rewrite LoadHelper and GenConversionCall this way. It is now LoadHelper's duty to select the right thread offset size. Introduce InvokeTrampoline virtual method to Mir2Lir. This allows to further simplify the call helpers, as well as make OpThreadMem specific to X86 only (removed from Mir2Lir). Make GenInlinedCharAt virtual, move a copy to X86 backend, and simplify both copies. Remove LoadBaseIndexedDisp and OpRegMem from Mir2Lir, as they are now specific to X86 only. Remove StoreBaseIndexedDisp from Mir2Lir, as it was only ever used in the X86 backend. Remove OpTlsCmp from Mir2Lir, as it was only ever used in the X86 backend. Remove OpLea from Mir2Lir, as it was only ever defined in the X86 backend. Remove GenImmedCheck from Mir2Lir as it was neither used nor implemented. Change-Id: If0a6182288c5d57653e3979bf547840a4c47626e
odegen_arm.h
p_arm.cc
nt_arm.cc
arget_arm.cc
tility_arm.cc
|
c32447bcc8c36ee8ff265ed678c7df86936a9ebe |
27-Jul-2014 |
Bill Buzbee <buzbee@android.com> |
Revert "Enable Load Store Elimination for ARM and ARM64" On extended testing, I'm seeing a CHECK failure at utility_arm.cc:1201. This reverts commit fcc36ba2a2b8fd10e6eebd21ecb6329606443ded. Change-Id: Icae3d49cd7c8fcab09f2f989cbcb1d7e5c6d137a
ssemble_arm.cc
odegen_arm.h
tility_arm.cc
|
fcc36ba2a2b8fd10e6eebd21ecb6329606443ded |
15-Jul-2014 |
Serban Constantinescu <serban.constantinescu@arm.com> |
Enable Load Store Elimination for ARM and ARM64 This patch refactors the implementation of the LoadStoreElimination optimisation pass. Please note that this pass was disabled and not functional for any of the backends. The current implementation tracks aliases and handles DalvikRegs as well as Heap memory regions. It has been tested and it is known to optimise out the following: * Load - Load * Store - Load * Store - Store * Load Literals Change-Id: Iefae9b696f87f833ef35c451ed4d49c5a1b6fde0
ssemble_arm.cc
odegen_arm.h
tility_arm.cc
|
9ee4519afd97121f893f82d41d23164fc6c9ed34 |
17-Jul-2014 |
Serguei Katkov <serguei.i.katkov@intel.com> |
x86: GenSelect utility update This is a follow-up to https://android-review.googlesource.com/#/c/101396/ to make the x86 GenSelectConst32 implementation complete. Change-Id: I69f318e18093f9a5b00f8f00f0f1c2e4ff7a9ab2 Signed-off-by: Serguei Katkov <serguei.i.katkov@intel.com>
tility_arm.cc
|
2983d230534aee99090d28b2666dae094440f1c4 |
22-Jul-2014 |
Vladimir Marko <vmarko@google.com> |
Merge "Use vabs/fabs on arm/arm64 for intrinsic abs()."
|
5030d3ee8c6fe10394912ede107cbc8df63b7b16 |
17-Jul-2014 |
Vladimir Marko <vmarko@google.com> |
Use vabs/fabs on arm/arm64 for intrinsic abs(). Bug: 11579369 Change-Id: If09da85e22786faa13a2d74f62cee68ea67bd087
odegen_arm.h
p_arm.cc
|
7ea6f79bbddd69d5db86a8656a31aaaf64ae2582 |
15-Jul-2014 |
Andreas Gampe <agampe@google.com> |
ART: Throw StackOverflowError in native code Initialize stack-overflow errors in native code to be able to reduce the preserved area size of the stack. Includes a refactoring away from constexpr in instruction_set.h to allow for easy changing of the values. Change-Id: I117cc8485f43da5f0a470f0f5e5b3dc3b5a06246
all_arm.cc
|
fb8a07bdf92ab097c1d309a8a6b70dacc81f4478 |
17-Jul-2014 |
Andreas Gampe <agampe@google.com> |
Merge "ART: Refactor GenSelect, refactor gen_common accordingly"
|
90969af6deb19b1dbe356d62fe68d8f5698d3d8f |
16-Jul-2014 |
Andreas Gampe <agampe@google.com> |
ART: Refactor GenSelect, refactor gen_common accordingly This adds a GenSelect method meant for selection of constants. The general-purpose GenInstanceof code is refactored to take advantage of this. This cleans up code and squashes a branch-over on ARM64 to a cset. Also add a slow-path for type initialization in GenInstanceof. Change-Id: Ie4494858bb8c26d386cf2e628172b81bba911ae5
odegen_arm.h
nt_arm.cc
|
9791bb427fd812c1268edab6fb3ac7b82ad9fb93 |
17-Jul-2014 |
Jeff Hao <jeffhao@google.com> |
Merge "Fix art test failures for Mips."
|
69dfe51b684dd9d510dbcb63295fe180f998efde |
11-Jul-2014 |
Dave Allison <dallison@google.com> |
Revert "Revert "Revert "Revert "Add implicit null and stack checks for x86"""" This reverts commit 0025a86411145eb7cd4971f9234fc21c7b4aced1. Bug: 16256184 Change-Id: Ie0760a0c293aa3b62e2885398a8c512b7a946a73
all_arm.cc
|
d9cb8ae2ed78f957a773af61759432d7a7bf78af |
09-Jul-2014 |
Douglas Leung <douglas@mips.com> |
Fix art test failures for Mips. This patch fixes the following art test failures for Mips: 003-omnibus-opcodes 030-bad-finalizer 041-narrowing 059-finalizer-throw Change-Id: I4e0e9ff75f949c92059dd6b8d579450dc15f4467 Signed-off-by: Douglas Leung <douglas@mips.com>
odegen_arm.h
arget_arm.cc
|
9522af985466b2a05ef5cdede0808777dea7236e |
15-Jul-2014 |
Andreas Gampe <agampe@google.com> |
ART: Squash a cmp w/ zero and b.ls to cbz (ARM/ARM64) In case of array bounds checks at constant index 0 we generate a compare and a branch. Squash into a cbz. Change-Id: I1c6a6e37a7a2356b2c4580a3387cedb55436e251
nt_arm.cc
|
0f73aa8f64417232e3f3d09e53f49084d2783fe0 |
12-Jul-2014 |
Andreas Gampe <agampe@google.com> |
Merge "Update counting VR for promotion"
|
59a42afc2b23d2e241a7e301e2cd68a94fba51e5 |
04-Jul-2014 |
Serguei Katkov <serguei.i.katkov@intel.com> |
Update counting VR for promotion For 64-bit targets it makes sense to count VR uses for int and long together, because the core register is shared. Change-Id: Ie8676ece12c928d090da2465dfb4de4e91411920 Signed-off-by: Serguei Katkov <serguei.i.katkov@intel.com>
odegen_arm.h
|
48f5c47907654350ce30a8dfdda0e977f5d3d39f |
27-Jun-2014 |
Hans Boehm <hboehm@google.com> |
Replace memory barriers to better reflect Java needs. Replaces barriers that enforce ordering of one access type (e.g. Load) with respect to another (e.g. store) with more general ones that better reflect both Java requirements and actual hardware barrier/fence instructions. The old code was inconsistent and unclear about which barriers implied which others. Sometimes multiple barriers were generated and then eliminated; sometimes it was assumed that certain barriers implied others. The new barriers closely parallel those in C++11, though, for now, we use something closer to the old naming. Bug: 14685856 Change-Id: Ie1c80afe3470057fc6f2b693a9831dfe83add831
all_arm.cc
nt_arm.cc
tility_arm.cc
|
0025a86411145eb7cd4971f9234fc21c7b4aced1 |
11-Jul-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Revert "Revert "Revert "Add implicit null and stack checks for x86""" Broke the build. This reverts commit 7fb36ded9cd5b1d254b63b3091f35c1e6471b90e. Change-Id: I9df0e7446ff0913a0e1276a558b2ccf6c8f4c949
all_arm.cc
|
7fb36ded9cd5b1d254b63b3091f35c1e6471b90e |
10-Jul-2014 |
Dave Allison <dallison@google.com> |
Revert "Revert "Add implicit null and stack checks for x86"" Fixes x86_64 cross compile issue. Removes command line options and property to set implicit checks - this is hard coded now. This reverts commit 3d14eb620716e92c21c4d2c2d11a95be53319791. Change-Id: I5404473b5aaf1a9c68b7181f5952cb174d93a90d
all_arm.cc
|
23abec955e2e733999a1e2c30e4e384e46e5dde4 |
02-Jul-2014 |
Serban Constantinescu <serban.constantinescu@arm.com> |
AArch64: Add few more inline functions This patch adds inlining support for the following functions: * Math.max/min(long, long) * Math.max/min(float, float) * Math.max/min(double, double) * Integer.reverse(int) * Long.reverse(long) Change-Id: Ia2b1619fd052358b3a0d23e5fcbfdb823d2029b9 Signed-off-by: Serban Constantinescu <serban.constantinescu@arm.com>
odegen_arm.h
nt_arm.cc
|
b5860fb459f1ed71f39d8a87b45bee6727d79fe8 |
22-Jun-2014 |
buzbee <buzbee@google.com> |
Register promotion support for 64-bit targets Not sufficiently tested for 64-bit targets, but should be fairly close. A significant amount of refactoring could still be done (in later CLs). With this change we are not making any changes to the vmap scheme. As a result, it is a requirement that if a vreg is promoted to both a 32-bit view and the low half of a 64-bit view it must share the same physical register. We may change this restriction later on to allow for more flexibility for 32-bit Arm. For example, if v4, v5, v4/v5 and v5/v6 are all hot enough to promote, we'd end up with something like: v4 (as an int) -> r10 v4/v5 (as a long) -> r10 v5 (as an int) -> r11 v5/v6 (as a long) -> r11 Fix a couple of ARM64 bugs on the way... Change-Id: I6a152b9c164d9f1a053622266e165428045362f3
odegen_arm.h
nt_arm.cc
arget_arm.cc
tility_arm.cc
|
3c12c512faf6837844d5465b23b9410889e5eb11 |
24-Jun-2014 |
Andreas Gampe <agampe@google.com> |
Revert "Revert "ART: Split out more cases of Load/StoreRef, volatile as parameter"" This reverts commit de68676b24f61a55adc0b22fe828f036a5925c41. Fixes an API comment, and differentiates between inserting and appending. Change-Id: I0e9a21bb1d25766e3cbd802d8b48633ae251a6bf
all_arm.cc
odegen_arm.h
nt_arm.cc
tility_arm.cc
|
de68676b24f61a55adc0b22fe828f036a5925c41 |
24-Jun-2014 |
Andreas Gampe <agampe@google.com> |
Revert "ART: Split out more cases of Load/StoreRef, volatile as parameter" This reverts commit 2689fbad6b5ec1ae8f8c8791a80c6fd3cf24144d. Breaks the build. Change-Id: I9faad4e9a83b32f5f38b2ef95d6f9a33345efa33
all_arm.cc
odegen_arm.h
nt_arm.cc
tility_arm.cc
|
2689fbad6b5ec1ae8f8c8791a80c6fd3cf24144d |
23-Jun-2014 |
Andreas Gampe <agampe@google.com> |
ART: Split out more cases of Load/StoreRef, volatile as parameter Splits out more cases of ref registers being loaded or stored. For code clarity, adds volatile as a flag parameter instead of a separate method. On ARM64, continue cleanup. Add flags to print/fatal on size mismatches. Change-Id: I30ed88433a6b4ff5399aefffe44c14a5e6f4ca4e
all_arm.cc
odegen_arm.h
nt_arm.cc
tility_arm.cc
|
995b32cc8e94a9730d6cf663a23afc9c997c1771 |
19-Jun-2014 |
Andreas Gampe <agampe@google.com> |
Merge "ART: Implicit checks in the compiler are independent from Runtime"
|
5655e84e8d71697d8ef3ea901a0b853af42c559e |
18-Jun-2014 |
Andreas Gampe <agampe@google.com> |
ART: Implicit checks in the compiler are independent from Runtime When cross-compiling, those flags are independent. This is an initial CL that helps bypass fatal failures when cross-compiling, as not all architectures support (and have turned on) implicit checks. The actual transport for the target architecture when it is different from the runtime needs to be implemented in a follow-up CL. Bug: 15703710 Change-Id: Idc881a9a4abfd38643b862a491a5af9b8841f693
all_arm.cc
|
7cd26f355ba83be75b72ed628ed5ee84a3245c4f |
19-Jun-2014 |
Andreas Gampe <agampe@google.com> |
ART: Target-dependent stack overflow, less check elision Refactor the separate stack overflow reserved sizes from thread.h into instruction_set.h and make sure they're used in the compiler. Refactor the decision on when to elide stack overflow checks: especially with large interpreter stack frames, it is not a good idea to elide checks when the frame size is even close to the reserved size. Currently enforce checks when the frame size is >= 2KB, but make sure that frame sizes 1KB and below will elide the checks (number from experience). Bug: 15728765 Change-Id: I016bfd3d8218170cbccbd123ed5e2203db167c06
all_arm.cc
|
37573977769e9068874506050c62acd4e324d246 |
16-Jun-2014 |
Vladimir Marko <vmarko@google.com> |
Clean up ARM load/store with offset imm8 << 2. Change-Id: I95ed6860131b99eef7ed727f54745976949cbcb3
odegen_arm.h
tility_arm.cc
|
5aa6e04061ced68cca8111af1e9c19781b8a9c5d |
14-Jun-2014 |
Ian Rogers <irogers@google.com> |
Tidy x86 assembler. Use helper functions to compute when the kind has a SIB, a ModRM and RegReg form. Change-Id: I86a5cb944eec62451c63281265e6974cd7a08e07
ssemble_arm.cc
odegen_arm.h
|
c0090a4206306a80a830de35c7b4c74a43df690a |
12-Jun-2014 |
Vladimir Marko <vmarko@google.com> |
Merge "Rewrite use/def masks to support 128 bits."
|
8dea81ca9c0201ceaa88086b927a5838a06a3e69 |
06-Jun-2014 |
Vladimir Marko <vmarko@google.com> |
Rewrite use/def masks to support 128 bits. Reduce LIR memory usage by holding masks by pointers in the LIR rather than directly and using pre-defined const masks for the common cases, allocating very few on the arena. Change-Id: I0f6d27ef6867acd157184c8c74f9612cebfe6c16
rm_lir.h
all_arm.cc
odegen_arm.h
nt_arm.cc
arget_arm.cc
tility_arm.cc
|
db9d523ff305721d4ca3f1470d1b2ce64c736e0a |
10-Jun-2014 |
Vladimir Marko <vmarko@google.com> |
Clean up ArmMirToLir::LoadDispBody()/StoreDispBody(). Refactor the 64-bit load and store code to use a shared helper function that can be used for any opcode with the displacement limited to 8-bit value shifted by 2 (i.e. max 1020). Use that function also for 32-bit float load and store as it is actually better than the old code for offsets exceeding the 1020 byte limit. Change-Id: I7dec38bae8cd9891420d2e92b1bac6138af5d64e
odegen_arm.h
tility_arm.cc
|
8550197244d470bf7645075e5400750f2cab4e42 |
07-Jun-2014 |
Bill Buzbee <buzbee@android.com> |
Merge "x86_64: Hard Float ABI support in QCG"
|
58994cdb00b323339bd83828eddc53976048006f |
16-May-2014 |
Dmitry Petrochenko <dmitry.petrochenko@intel.com> |
x86_64: Hard Float ABI support in QCG This patch shows our efforts to resolve the ART limitations: - passing "float"/"double" arguments via FPR - passing "long" arguments via a single GPR, not a pair - passing more than 3 arguments via GPR. Work done: - Extended SpecialTargetRegister enum with kARG4, kARG5, fARG4..fARG7. - Created initial LoadArgRegs/GenDalvikX/FlushIns version in X86Mir2Lir. - Unlimited number of long/double/float arguments support - Refactored (v2) Change-Id: I5deadd320b4341d5b2f50ba6fa4a98031abc3902 Signed-off-by: Serguei Katkov <serguei.i.katkov@intel.com> Signed-off-by: Dmitry Petrochenko <dmitry.petrochenko@intel.com> Signed-off-by: Chao-ying Fu <chao-ying.fu@intel.com> Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>
arget_arm.cc
|
576ca0cd692c0b6ae70e776de91015b8ff000a08 |
07-Jun-2014 |
Ian Rogers <irogers@google.com> |
Reduce header files including header files. Main focus is getting heap.h out of runtime.h. Change-Id: I8d13dce8512816db2820a27b24f5866cc871a04b
all_arm.cc
|
04f4d8abe45d6e79eca983e057de76aea24b7df9 |
30-May-2014 |
Wei Jin <wejin@google.com> |
Add an optimization for removing redundant suspend tests in ART This CL: (1) eliminates redundant suspend checks (dominated by another check), (2) removes the special treatment of the R4 register, which got reset on every native call, possibly yielding long execution sequences without any suspend checks, and (3) fixes the absence of suspend checks in leaf methods. (2) and (3) increase the frequency of suspend checks, which improves the performance of GC and the accuracy of profile data. To compensate for the increased number of checks, we implemented an optimization that leverages dominance information to remove redundant suspend checks on back edges. Based on the results of running the Caffeine benchmark on Nexus 7, the patch performs roughly 30% more useful suspend checks, spreading them much more evenly along the execution trace, while incurring less than 1% overhead. For flexibility consideration, this CL defines two flags to control the enabling of optimizations. The original implementation is the default. Change-Id: I31e81a5b3c53030444dbe0434157274c9ab8640f Signed-off-by: Wei Jin <wejin@google.com>
rm_lir.h
nt_arm.cc
arget_arm.cc
|
089142cf1d0c028b5a7c703baf0b97f4a4ada3f7 |
05-Jun-2014 |
Vladimir Marko <vmarko@google.com> |
Avoid register pool allocations on the heap. Create a helper template class ArrayRef and use it instead of std::vector<> for register pools in target_<arch>.cc to avoid these heap allocations during program startup. Change-Id: I4ab0205af9c1d28a239c0a105fcdc60ba800a70a
arget_arm.cc
|
a0cd2d701f29e0bc6275f1b13c0edfd4ec391879 |
01-Jun-2014 |
buzbee <buzbee@google.com> |
Quick compiler: reference cleanup For 32-bit targets, object references are 32 bits wide both in Dalvik virtual registers and in core physical registers. Because of this, object references and non-floating point values were both handled as if they had the same register class (kCoreReg). However, for 64-bit systems, references are 32 bits in Dalvik vregs, but 64 bits in physical registers. Although the same underlying physical core registers will still be used for object reference and non-float values, different register class views will be used to represent them. For example, an object reference in arm64 might be held in x3 at some point, while the same underlying physical register, w3, would be used to hold a 32-bit int. This CL breaks apart the handling of object reference and non-float values to allow the proper register class (or register view) to be used. A new register class, kRefReg, is introduced which will map to a 32-bit core register on 32-bit targets, and 64-bit core registers on 64-bit targets. From this point on, object references should be allocated registers in the kRefReg class rather than kCoreReg. Change-Id: I6166827daa8a0ea3af326940d56a6a14874f5810
all_arm.cc
odegen_arm.h
p_arm.cc
nt_arm.cc
arget_arm.cc
|
67c482f6737343f3afbf214995d67d98b0b36c91 |
27-May-2014 |
buzbee <buzbee@google.com> |
Merge "Art compiler: remove unnecessary sqrt call"
|
055c29fd0f752328981f1b7ccadb1862eecedd40 |
27-May-2014 |
buzbee <buzbee@google.com> |
Art compiler: remove unnecessary sqrt call For reasons lost in the mists of time, the Dalvik JIT tested the results of an inlined sqrt for NaN on Arm targets, and then called an out-of-line routine to recompute if true. The Quick compiler inherited this behavior. It is not necessary, and the CL purges it (along with the out-of-line sqrt entrypoint). Change-Id: I8c8fa6feacf9b7c3b9e190dfc6f728932fd948c6
p_arm.cc
|
85089dd28a39dd20f42ac258398b2a08668f9ef1 |
26-May-2014 |
buzbee <buzbee@google.com> |
Quick compiler: generalize NarrowRegLoc() Some of the RegStorage utilities (DoubleToLowSingle(), DoubleToHighSingle(), etc.) worked only for targets which treat double precision registers as a pair of aliased single precision registers. This CL eliminates those utilities, and replaces them with a new RegisterInfo utility that will search an aliased register set and return the member matching the required storage configuration (if it exists). Change-Id: Iff5de10f467d20a56e1a89df9fbf30d1cf63c240
p_arm.cc
arget_arm.cc
|
ed65c5e982705defdb597d94d1aa3f2997239c9b |
22-May-2014 |
Serban Constantinescu <serban.constantinescu@arm.com> |
AArch64: Enable LONG_* and INT_* opcodes. This patch fixes some of the issues with LONG and INT opcodes. The patch has been tested and passes all the dalvik tests except for 018 and 107. Change-Id: Idd1923ed935ee8236ab0c7e5fa969eaefeea8708 Signed-off-by: Serban Constantinescu <serban.constantinescu@arm.com>
odegen_arm.h
nt_arm.cc
|
b01bf15d18f9b08d77e7a3c6e2897af0e02bf8ca |
14-May-2014 |
buzbee <buzbee@google.com> |
64-bit temp register support. Add a 64-bit temp register allocation path. The recent physical register handling rework supports multiple views of the same physical register (or, such as for Arm's float/double regs, different parts of the same physical register). This CL adds a 64-bit core register view for 64-bit targets. In short, each core register will have a 64-bit name, and a 32-bit name. The different views will be kept in separate register pools, but aliasing will be tracked. The core temp register allocation routines will be largely identical - except for 32-bit targets, which will continue to use pairs of 32-bit core registers for holding long values. Change-Id: I8f118e845eac7903ad8b6dcec1952f185023c053
odegen_arm.h
arget_arm.cc
|
082833c8d577db0b2bebc100602f31e4e971613e |
18-May-2014 |
buzbee <buzbee@google.com> |
Quick compiler, out of registers fix It turns out that the register pool sanity checker was not working as expected, leaving some inconsistencies unreported. This could result in "out of registers" failures, as well as other more subtle problems. This CL fixes the sanity checker, adds a lot more checks, and cleans up the previously undetected episodes of insanity. Cherry-pick of internal change 468162 Change-Id: Id2da97e99105a4c272c5fd256205a94b904ecea8
odegen_arm.h
nt_arm.cc
arget_arm.cc
tility_arm.cc
|
13ff8cd5d29c66de49506b0d7dddf8e0a959e104 |
15-May-2014 |
Bill Buzbee <buzbee@google.com> |
Merge "Quick Compiler: fix Arm cts failures"
|
fe8cf8b1c1b4af0f8b4bb639576f7a5fc59f52ea |
15-May-2014 |
Bill Buzbee <buzbee@google.com> |
Quick Compiler: fix Arm cts failures Fixes move_wide_16#testN1, move_wide_16#testN2 Two bugs for the price of one (thanks CTS!) First, the new stack overflow checking code was broken for very large frames. For Arm on method entry, we only have 1 available temp register, r12, until argument registers are flushed. Previously, for explicit checks on large frames, r12 was immediately loaded with the stack_end value. However, later on when the frame is extended, if the frame size exceeds the range of a reg-reg-imm subtract, the codegen utilities will allocate a new temporary register to complete the operation. r12 was getting clobbered. Similarly, for medium-large frames r12 could get clobbered during frame creation. What we should always do when directly using fixed registers like this is to lock them to prevent them from being allocated as a temp. The other half of the first bug is easily solved by delaying the load of stack_end until after the new sp is computed. We'll increase the stall cost, but this is an uncommon case. The second bug was likely a typo in LoadValueDisp(). I'm a bit surprised we hadn't hit this one earlier - but perhaps it was recently introduced. The wrong base register was being used in the non-float, wide, excessive offset case (which I suppose is also somewhat uncommon). Cherry-pick of internal commit If5b30f729e31d86db604045dd7581fd4626e0b55 Change-Id: If5b30f729e31d86db604045dd7581fd4626e0b55
all_arm.cc
tility_arm.cc
|
b14329f90f725af0f67c45dfcb94933a426d63ce |
15-May-2014 |
Andreas Gampe <agampe@google.com> |
ART: Fix MonitorExit code on ARM We do not emit barriers on non-SMP systems. But on ARM, we have places that need to conditionally execute, which is done through an IT instruction. The guard of said instruction thus changes between SMP and non-SMP systems. To cleanly approach this, change the API so that GenMemBarrier returns whether it generated an instruction. ARM will have to query the result and update any dependent IT. Throw a build system error if TARGET_CPU_SMP is not set. Fix runtime/Android.mk to work with new multilib host. Bug: 14989275 Change-Id: I9e611b770e8a1cd4ca19367d7dae0573ec08dc61
all_arm.cc
odegen_arm.h
nt_arm.cc
|
9b9dec8bbcb812315eb0b68b3465c6c567f09527 |
14-May-2014 |
Andreas Gampe <agampe@google.com> |
ART: Fix ARM dmb placement in monitor-exit This moves the dmb in quick-compiled monitor-exit before the str performing the unlock. Change-Id: I231f98ff21eb7bac45b4a1b7ff57316deeb858cc
all_arm.cc
|
2f244e9faccfcca68af3c5484c397a01a1c3a342 |
08-May-2014 |
Andreas Gampe <agampe@google.com> |
ART: Add more ThreadOffset in Mir2Lir and backends This duplicates all methods with ThreadOffset parameters, so that both ThreadOffset<4> and ThreadOffset<8> can be handled. Dynamic checks against the compilation unit's instruction set determine which pointer size to use and therefore which methods to call. Methods with unsupported pointer sizes should fatally fail, as this indicates an issue during method selection. Change-Id: Ifdb445b3732d3dc5e6a220db57374a55e91e1bf6
odegen_arm.h
nt_arm.cc
arget_arm.cc
tility_arm.cc
|
674744e635ddbdfb311fbd25b5a27356560d30c3 |
24-Apr-2014 |
Vladimir Marko <vmarko@google.com> |
Use atomic load/store for volatile IGET/IPUT/SGET/SPUT. Bug: 14112919 Change-Id: I79316f438dd3adea9b2653ffc968af83671ad282
odegen_arm.h
arget_arm.cc
tility_arm.cc
|
3bf7c60a86d49bf8c05c5d2ac5ca8e9f80bd9824 |
07-May-2014 |
Vladimir Marko <vmarko@google.com> |
Cleanup ARM load/store wide and remove unused param s_reg. Use a single LDRD/VLDR instruction for wide load/store on ARM, adjust the base pointer if needed. Remove unused parameter s_reg from LoadBaseDisp(), LoadBaseIndexedDisp() and StoreBaseIndexedDisp() on all architectures. Change-Id: I25a9a42d523a68addbc11abe44ddc55a4401df98
odegen_arm.h
nt_arm.cc
tility_arm.cc
|
455759b5702b9435b91d1b4dada22c4cce7cae3c |
06-May-2014 |
Vladimir Marko <vmarko@google.com> |
Remove LoadBaseDispWide and StoreBaseDispWide. Just pass k64 or kDouble to non-wide versions. Change-Id: I000619c3b78d3a71db42edc747c8a0ba1ee229be
odegen_arm.h
nt_arm.cc
tility_arm.cc
|
1f1d2513a11eaaa59601d7599ac2e80ddfa1bcf5 |
06-May-2014 |
Andreas Gampe <agampe@google.com> |
Merge "ART: Use utils.h::RoundUp instead of explicit bit-fiddling"
|
660188264dee3c8f3510e2e24c11816c6b60f197 |
06-May-2014 |
Andreas Gampe <agampe@google.com> |
ART: Use utils.h::RoundUp instead of explicit bit-fiddling Change-Id: I249a2cfeb044d3699d02e13d42b8e72518571640
ssemble_arm.cc
|
5cd33753b96d92c03e3cb10cb802e68fb6ef2f21 |
16-Apr-2014 |
Dave Allison <dallison@google.com> |
Handle implicit stack overflow without affecting stack walks This changes the way in which implicit stack overflows are handled to satisfy concerns about changes to the stack walk code. Instead of creating a gap in the stack and checking for it in the stack walker, use the ManagedStack infrastructure to concoct an invisible gap that will never be seen by a stack walk. Also, this uses madvise to tell the kernel that the main stack's protected region will probably never be accessed, and instead of using memset to map the pages in, use memcpy to read from them. This will save 32K on the main stack. Also adds a 'signals' verbosity level as per a review request. Bug: 14066862 Change-Id: I5257305feeaea241d11e6aa6f021d2a81da20b81
call_arm.cc
|
091cc408e9dc87e60fb64c61e186bea568fc3d3a |
31-Mar-2014 |
buzbee <buzbee@google.com> |
Quick compiler: allocate doubles as doubles Significant refactoring of register handling to unify usage across all targets & 32/64 backends. Reworked RegStorage encoding to allow expanded use of x86 xmm registers; removed vector registers as a separate register type. Reworked RegisterInfo to describe aliased physical registers. Eliminated quite a bit of target-specific code and generalized common code. Use of RegStorage instead of int for registers now propagated down to the NewLIRx() level. In future CLs, the NewLIRx() routines will be replaced with versions that are explicit about what kind of operand they expect (RegStorage, displacement, etc.). The goal is to eventually use RegStorage all the way to the assembly phase. TBD: MIPS needs verification. TBD: Re-enable liveness tracking. Change-Id: I388c006d5fa9b3ea72db4e37a19ce257f2a15964
arm_lir.h
assemble_arm.cc
call_arm.cc
codegen_arm.h
fp_arm.cc
int_arm.cc
target_arm.cc
utility_arm.cc
|
6ffcfa04ebb2660e238742a6000f5ccebdd5df15 |
25-Apr-2014 |
Mingyao Yang <mingyao@google.com> |
Rewrite suspend test check with LIRSlowPath. Change-Id: I2dc17d079655586bfc588349c7a04afc2c6879af
call_arm.cc
|
7a11ab09f93f54b1c07c0bf38dd65ed322e86bc6 |
29-Apr-2014 |
buzbee <buzbee@google.com> |
Quick compiler: debugging assists A few minor assists to ease A/B debugging in the Quick compiler: 1. To save time, the assemblers for some targets only update the object code offsets on instructions involved with pc-relative fixups. We add code to fix up all offsets when doing a verbose codegen listing. 2. Temp registers are normally allocated in a round-robin fashion. When disabling liveness tracking, we now reset the round-robin pool to 0 on each instruction boundary. This makes it easier to spot real codegen differences. 3. Self-register copies were previously emitted, but marked as nops. Minor change to avoid generating them in the first place and reduce clutter. Change-Id: I7954bba3b9f16ee690d663be510eac7034c93723
codegen_arm.h
int_arm.cc
|
fd698e67953e40e804d7c9d1a3e8460e9d67382a |
28-Apr-2014 |
buzbee <buzbee@google.com> |
Quick compiler: fix DCHECKS The recent change to introduce k32, k64 and kReference operand sizes missed updating a few DCHECKS. Change-Id: I66eb617b07766e781b38962dc862fc5b023c2fbd
utility_arm.cc
|
125011d70aa84b3fd9052f1c90101401b0851928 |
24-Apr-2014 |
Mingyao Yang <mingyao@google.com> |
Merge "Delete throw launchpads."
|
0ea4bf7edb20be30f63566bce2d9db23f0b1c87f |
22-Apr-2014 |
buzbee <buzbee@google.com> |
Merge "Update load/store utilities for 64-bit backends"
|
695d13a82d6dd801aaa57a22a9d4b3f6db0d0fdb |
19-Apr-2014 |
buzbee <buzbee@google.com> |
Update load/store utilities for 64-bit backends This CL replaces the typical use of LoadWord/StoreWord utilities (which, in practice, were 32-bit load/store) in favor of a new set that make the size explicit. We now have: LoadWordDisp/StoreWordDisp: 32 or 64 depending on target. Load or store the natural word size. Expect this to be used infrequently - generally when we know we're dealing with a native pointer or flushed register not holding a Dalvik value (Dalvik values will flush to home location sizes based on Dalvik, rather than the target). Load32Disp/Store32Disp: Load or store 32 bits, regardless of target. Load64Disp/Store64Disp: Load or store 64 bits, regardless of target. LoadRefDisp: Load a 32-bit compressed reference, and expand it to the natural word size in the target register. StoreRefDisp: Compress a reference held in a register of the natural word size and store it as a 32-bit compressed reference. Change-Id: I50fcbc8684476abd9527777ee7c152c61ba41c6f
call_arm.cc
int_arm.cc
target_arm.cc
utility_arm.cc
|
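The LoadRefDisp/StoreRefDisp semantics described in the commit above — load a 32-bit compressed reference and expand it to the natural word size, or narrow a natural-word reference back to 32 bits on store — can be sketched as follows. This is a minimal illustration only; `CompressedRef`, `LoadRef` and `StoreRef` are hypothetical names, not ART's actual API.

```cpp
#include <cstdint>

// 32-bit compressed reference as stored in the Dalvik home location.
using CompressedRef = uint32_t;

// LoadRefDisp-style load: read 32 bits, zero-extend to the natural word size.
uintptr_t LoadRef(const CompressedRef* home) {
  return static_cast<uintptr_t>(*home);  // zero-extension is implicit
}

// StoreRefDisp-style store: narrow the natural-word reference to 32 bits.
void StoreRef(CompressedRef* home, uintptr_t ref) {
  *home = static_cast<uint32_t>(ref);
}
```

On a 64-bit target the load widens and the store narrows; on a 32-bit target both are plain 32-bit moves, which is why the explicit-size Load32Disp/Load64Disp variants exist alongside them.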
3a74d15ccc9a902874473ac9632e568b19b91b1c |
22-Apr-2014 |
Mingyao Yang <mingyao@google.com> |
Delete throw launchpads. Bug: 13170824 Change-Id: I9d5834f5a66f5eb00f2ac80774e8c27dea99949e
codegen_arm.h
int_arm.cc
|
80365d9bb947edef0eae0bfe62b9f7a239416e6b |
18-Apr-2014 |
Mingyao Yang <mingyao@google.com> |
Revert "Revert "Use LIRSlowPath for throwing ArrayOutOfBoundsException."" This adds back using LIRSlowPath for ArrayIndexOutOfBoundsException. And fix the host test crash. Change-Id: Idbb602f4bb2c5ce59233feb480a0ff1b216e4887
int_arm.cc
|
7fff544c38f0dec3a213236bb785c3ca13d21a0f |
18-Apr-2014 |
Brian Carlstrom <bdc@google.com> |
Revert "Use LIRSlowPath for throwing ArrayOutOfBoundsException." This reverts commit 9d46314a309aff327f9913789b5f61200c162609.
int_arm.cc
|
9d46314a309aff327f9913789b5f61200c162609 |
18-Apr-2014 |
Mingyao Yang <mingyao@google.com> |
Use LIRSlowPath for throwing ArrayOutOfBoundsException. Get rid of launchpads for throwing ArrayOutOfBoundsException and use LIRSlowPath instead. Bug: 13170824 Change-Id: I0e27f7a261a6a7fb5c0645e6113a957e098f699e
int_arm.cc
|
e643a179cf5585ba6bafdd4fa51730d9f50c06f6 |
08-Apr-2014 |
Mingyao Yang <mingyao@google.com> |
Use LIRSlowPath for throwing NPE. Get rid of launchpads for throwing NPE and use LIRSlowPath instead. Also clean up some code of using LIRSlowPath for checking div by zero. Bug: 13170824 Change-Id: I0c20a49c39feff3eb1f147755e557d9bc0ff15bb
codegen_arm.h
int_arm.cc
|
d6ed642458c8820e1beca72f3d7b5f0be4a4b64b |
10-Apr-2014 |
Dave Allison <dallison@google.com> |
Revert "Revert "Revert "Use trampolines for calls to helpers""" This reverts commit f9487c039efb4112616d438593a2ab02792e0304. Change-Id: Id48a4aae4ecce73db468587967968a3f7618b700
arm_lir.h
assemble_arm.cc
call_arm.cc
codegen_arm.h
utility_arm.cc
|
f9487c039efb4112616d438593a2ab02792e0304 |
09-Apr-2014 |
Dave Allison <dallison@google.com> |
Revert "Revert "Use trampolines for calls to helpers"" This reverts commit 081f73e888b3c246cf7635db37b7f1105cf1a2ff. Change-Id: Ibd777f8ce73cf8ed6c4cb81d50bf6437ac28cb61 Conflicts: compiler/dex/quick/mir_to_lir.h
arm_lir.h
assemble_arm.cc
call_arm.cc
codegen_arm.h
utility_arm.cc
|
1512ea155cbe0a4b33776b0320c1ce38583ab09b |
08-Apr-2014 |
buzbee <buzbee@google.com> |
Merge "Quick compiler: fix CmpLong pair handling"
|
4289456fa265b833434c2a8eee9e7a16da31c524 |
07-Apr-2014 |
Mingyao Yang <mingyao@google.com> |
Use LIRSlowPath for throwing div by zero exception. Get rid of launchpads for throwing div by zero exception and use LIRSlowPath instead. Add a CallRuntimeHelper that takes no argument for the runtime function. Bug: 13170824 Change-Id: I7e0563e736c6f92bd63e3fbdfe3a777ad333e338
int_arm.cc
|
a1983d4dab10b0cc51e9d1b6bcafa9a723fabcd9 |
07-Apr-2014 |
buzbee <buzbee@google.com> |
Quick compiler: fix CmpLong pair handling OpCmpLong wasn't properly extracting the low register of a pair. Change-Id: I6d6cc3de1f543f4316e561648f371f793502fddb
int_arm.cc
|
081f73e888b3c246cf7635db37b7f1105cf1a2ff |
07-Apr-2014 |
Dave Allison <dallison@google.com> |
Revert "Use trampolines for calls to helpers" This reverts commit 754ddad084ccb610d0cf486f6131bdc69bae5bc6. Change-Id: Icd979adee1d8d781b40a5e75daf3719444cb72e8
arm_lir.h
assemble_arm.cc
call_arm.cc
codegen_arm.h
utility_arm.cc
|
754ddad084ccb610d0cf486f6131bdc69bae5bc6 |
19-Feb-2014 |
Dave Allison <dallison@google.com> |
Use trampolines for calls to helpers This is an ARM specific optimization to the compiler that uses trampoline islands to make calls to runtime helper functions. The intention is to reduce the size of the generated code (by 2 bytes per call) without affecting performance. By default this is on when generating an OAT file. It is off when compiling to memory. To switch this off in dex2oat, use the command line option: --no-helper-trampolines Enhances disassembler to print the trampoline entry on the BL instruction like this: 0xb6a850c0: f7ffff9e bl -196 (0xb6a85000) ; pTestSuspend Bug: 12607709 Change-Id: I9202bdb7cf21252ad807bd48701f1f6ce8e3d0fe
arm_lir.h
assemble_arm.cc
call_arm.cc
codegen_arm.h
utility_arm.cc
|
45157a41b6c0ac9f73aeeb1f064c2270a6a68a60 |
05-Apr-2014 |
Ian Rogers <irogers@google.com> |
Merge "ARM: enable optimisation for easy multiply, add modulus pattern."
|
09379fd9f20e25ee71687e2c60f6a84c9ede8cd6 |
04-Apr-2014 |
Dave Allison <dallison@google.com> |
Merge "Disable use of R4 as a promotable register"
|
7efad5d3a806a15166109837439f2e149031feef |
04-Apr-2014 |
Vladimir Marko <vmarko@google.com> |
Merge "Disassemble Thumb2 shifts and more VFP instructions."
|
8325296769a77ecf3ab647b5ab516f439f5b3206 |
04-Apr-2014 |
Dave Allison <dallison@google.com> |
Disable use of R4 as a promotable register When we are using implicit suspend checks we can potentially use r4 as a register into which variables can be promoted. However the runtime doesn't save this and thus will corrupt it. Not good. This disables the promotion of r4 until we can figure out how to make the runtime save it properly. Change-Id: Ib95ce93579e1c364de5ecc8e728f2cb7990da77a
target_arm.cc
|
c777e0de83cdffdb2e240d439c5595a4836553e8 |
03-Apr-2014 |
Vladimir Marko <vmarko@google.com> |
Disassemble Thumb2 shifts and more VFP instructions. Disassemble Thumb2 instructions LSL, LSR, ASR, ROR and VFP instructions VABS, VADD, VSUB, VMOV, VMUL, VNMUL, VDIV. Clean up disassembly of VCMP, VCMPE, VNEG and VSQRT. These could have been erroneously used for other insns (VSQRT for VMOV was encountered) and one VSQRT branch was unreachable. Remove duplicate VMOV opcodes from compiler. Change-Id: I160a1e3e4b6eabb6a5101ce348ffd49c0573257d
arm_lir.h
assemble_arm.cc
|
3da67a558f1fd3d8a157d8044d521753f3f99ac8 |
03-Apr-2014 |
Dave Allison <dallison@google.com> |
Add OpEndIT() for marking the end of OpIT blocks In ARM we need to prevent code motion to the inside of an IT block. This was done using a GenBarrier() to mark the end, but it wasn't obvious that this is what was happening. This CL adds an explicit OpEndIT() that takes the LIR of the OpIT for future checks. Bug: 13751744 Change-Id: If41d2adea1f43f11ebb3b72906bd308252ce3d01
call_arm.cc
codegen_arm.h
fp_arm.cc
int_arm.cc
|
f9719f9abbea060e086fe1304d72be50cbc8808e |
02-Apr-2014 |
Zheng Xu <zheng.xu@arm.com> |
ARM: enable optimisation for easy multiply, add modulus pattern. Fix the issue when src/dest registers overlap in easy multiply. Change-Id: Ie8cc098c29c74fd06c1b67359ef94f2c6b88a71e
int_arm.cc
|
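The "easy multiply" optimisation referenced in the commit above replaces a multiply by certain literals with shift/add/subtract sequences. The selection logic can be sketched like this — a hypothetical illustration with made-up helper names, not the Quick compiler's actual OpRegRegShift-based codegen. "Easy" literals here are 2^k, 2^k + 1 and 2^k - 1.

```cpp
#include <cstdint>

static bool IsPowerOfTwo(uint32_t v) { return v != 0 && (v & (v - 1)) == 0; }

int32_t EasyMultiply(int32_t x, uint32_t lit) {
  if (IsPowerOfTwo(lit))                                   // x * 2^k
    return x << __builtin_ctz(lit);
  if (IsPowerOfTwo(lit - 1))                               // x * (2^k + 1)
    return x + (x << __builtin_ctz(lit - 1));
  if (IsPowerOfTwo(lit + 1))                               // x * (2^k - 1)
    return (x << __builtin_ctz(lit + 1)) - x;
  return x * static_cast<int32_t>(lit);  // not easy: fall back to a real mul
}
```

The source/destination overlap fixed by the commit matters because the add-with-shift form reads `x` twice; if the destination register aliases `x`, the second read would observe the already-shifted value unless a temp is used.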
f6b65c123dafad62004a93a43eb82de00ddc8214 |
02-Apr-2014 |
Dave Allison <dallison@google.com> |
Add r4 as promotable register if implicit suspend checks If we are doing implicit suspend checks we should be able to use r4 as a target for promotion. Also bump OAT version Change-Id: Ia27d14ece3b3259dbb74bcf89feaa9da2cda6db8
target_arm.cc
|
43a065ce1dda78e963868f9753a6e263721af927 |
02-Apr-2014 |
Dave Allison <dallison@google.com> |
Add GenBarrier() calls to terminate all IT blocks. This is needed to prevent things like load hoisting from putting instructions inside the IT block. Bug: 13749123 Change-Id: I98a010453b163ac20a90f626144f798fc06e65a9
int_arm.cc
|
80fdef4018cde9bee8cdb0159ba660db1c4c4bf7 |
01-Apr-2014 |
buzbee <buzbee@google.com> |
Quick compiler: add comment to Arm encoding A question from an AOSP contributor demonstrated the need for explanation of a seemingly odd encoding for vldrd/vldrs. In short, we add a "def" bit for lr on those instructions to cover the cases in which we have to materialize a new base pointer at assembly time using lr as a temp register. Change-Id: I22c5740218a90e0ff387c6aac2bd20cc98eece85
assemble_arm.cc
|
7ea687d886be7b8c106b0e0190dab299d14adcad |
01-Apr-2014 |
Mathieu Chartier <mathieuc@google.com> |
Merge "Fix stack overflow slow path error."
|
88e0463fa7e8ea7b427b65a07cd7b28111575174 |
01-Apr-2014 |
Ian Rogers <irogers@google.com> |
Merge "Revert "Revert "Optimize easy multiply and easy div remainder."""
|
dd7624d2b9e599d57762d12031b10b89defc9807 |
15-Mar-2014 |
Ian Rogers <irogers@google.com> |
Allow mixing of thread offsets between 32 and 64bit architectures. Begin a more full implementation x86-64 REX prefixes. Doesn't implement 64bit thread offset support for the JNI compiler. Change-Id: If9af2f08a1833c21ddb4b4077f9b03add1a05147
call_arm.cc
codegen_arm.h
fp_arm.cc
int_arm.cc
target_arm.cc
utility_arm.cc
|
f943914730db8ad2ff03d49a2cacd31885d08fd7 |
27-Mar-2014 |
Dave Allison <dallison@google.com> |
Implement implicit stack overflow checks This also fixes some failing run tests due to missing null pointer markers. The implementation of the implicit stack overflow checks introduces the ability to have a gap in the stack that is skipped during stack walk backs. This gap is protected against read/write and is used to trigger a SIGSEGV at function entry if the stack will overflow. Change-Id: I0c3e214c8b87dc250cf886472c6d327b5d58653e
call_arm.cc
utility_arm.cc
|
05a48b1f8e62564abb7c2fe674e3234d5861647f |
01-Apr-2014 |
Mathieu Chartier <mathieuc@google.com> |
Fix stack overflow slow path error. The frame size without spill was being passed into the slow path instead of the spill size. This was incorrect since only the spills will have been pushed at the point of the overflow check. Also addressed another comment. Change-Id: Ic6e455122473a8f796b291d71f945bcf72788662
call_arm.cc
|
306f017dd883c0bf806d239d97e0bca3194afbd7 |
07-Jan-2014 |
Vladimir Marko <vmarko@google.com> |
Faster AssembleLIR for ARM. This also reduces sizeof(LIR) by 4 bytes (32-bit builds). Change-Id: I0cb81f9bf098dfc50050d5bc705c171af26464ce
assemble_arm.cc
codegen_arm.h
|
e2143c0a4af68c08e811885eb2f3ea5bfdb21ab6 |
28-Mar-2014 |
Ian Rogers <irogers@google.com> |
Revert "Revert "Optimize easy multiply and easy div remainder."" This reverts commit 3654a6f50a948ead89627f398aaf86a2c2db0088. Remove the part of the change that confused !is_div with being multiply rather than implying remainder. Change-Id: I202610069c69351259a320e8852543cbed4c3b3e
codegen_arm.h
int_arm.cc
utility_arm.cc
|
3654a6f50a948ead89627f398aaf86a2c2db0088 |
28-Mar-2014 |
Brian Carlstrom <bdc@google.com> |
Revert "Optimize easy multiply and easy div remainder." This reverts commit 08df4b3da75366e5db37e696eaa7e855cba01deb.
codegen_arm.h
int_arm.cc
utility_arm.cc
|
08df4b3da75366e5db37e696eaa7e855cba01deb |
25-Mar-2014 |
Zheng Xu <zheng.xu@arm.com> |
Optimize easy multiply and easy div remainder. Update OpRegRegShift and OpRegRegRegShift to use RegStorage parameters. Add special cases for *0 and *1. Add more easy multiply special cases for Arm. Reuse easy multiply in SmallLiteralDivRem() to support remainder cases. Change-Id: Icd76a993d3ac8d4988e9653c19eab4efca14fad0
codegen_arm.h
int_arm.cc
utility_arm.cc
|
2700f7e1edbcd2518f4978e4cd0e05a4149f91b6 |
07-Mar-2014 |
buzbee <buzbee@google.com> |
Continuing register cleanup Ready for review. Continue the process of using RegStorage rather than ints to hold register value in the top layers of codegen. Given the huge number of changes in this CL, I've attempted to minimize the number of actual logic changes. With this CL, the use of ints for registers has largely been eliminated except in the lowest utility levels. "Wide" utility routines have been updated to take a single RegStorage rather than a pair of ints representing low and high registers. Upcoming CLs will be smaller and more targeted. My expectations: o Allocate float double registers as a single double rather than a pair of float single registers. o Refactor to push code which assumes long and double Dalvik values are held in a pair of register to the target dependent layer. o Clean-up of the xxx_mir.h files to reduce the amount of #defines for registers. May also do a register renumbering to bring all of our targets' register naming more consistent. Possibly introduce a target-independent float/non-float test at the RegStorage level. Change-Id: I646de7392bdec94595dd2c6f76e0f1c4331096ff
arm_lir.h
call_arm.cc
codegen_arm.h
fp_arm.cc
int_arm.cc
target_arm.cc
utility_arm.cc
|
99ad7230ccaace93bf323dea9790f35fe991a4a2 |
26-Feb-2014 |
Razvan A Lupusoru <razvan.a.lupusoru@intel.com> |
Relaxed memory barriers for x86 X86 provides stronger memory guarantees and thus the memory barriers can be optimized. This patch ensures that all memory barriers for x86 are treated as scheduling barriers. And in cases where a barrier is needed (StoreLoad case), an mfence is used. Change-Id: I13d02bf3f152083ba9f358052aedb583b0d48640 Signed-off-by: Razvan A Lupusoru <razvan.a.lupusoru@intel.com>
int_arm.cc
|
66e4c3e96dccdec7423d673ad6bbf7821a776651 |
19-Mar-2014 |
Mathieu Chartier <mathieuc@google.com> |
Merge "Optimize stack overflow handling."
|
40bbb39b85c063cd6a9f4ab00ff70372370e08cf |
19-Mar-2014 |
buzbee <buzbee@google.com> |
Fix Quick compiler "out of registers" There are a few places in the Arm backend that expect to be able to survive on a single temp register - in particular entry code generation and argument passing. However, in the case of a very large frame and floating point ld/st, the existing code could end up using 2 temps. In short, if there is a displacement overflow we try to use indexed load/store instructions (slightly more efficient). However, there are none for floating point - so we ended up burning yet another register to construct a direct pointer. This CL detects this case and doesn't try to use the indexed load/store mechanism for floats. Fix for https://code.google.com/p/android/issues/detail?id=67349 Change-Id: I1ea596ea660e4add89fd4fddb8cbf99a54fbd343
utility_arm.cc
|
0d507d1e0441e6bd6f3affca3a60774ea920f317 |
19-Mar-2014 |
Mathieu Chartier <mathieuc@google.com> |
Optimize stack overflow handling. We now subtract the frame size from the stack pointer for methods which have a frame smaller than a certain size. Also changed code to use slow paths instead of launchpads. Delete kStackOverflow launchpad since it is no longer needed. ARM optimizations: One less move per stack overflow check (without fault handler for stack overflows). Use ldr pc instead of ldr r12, b r12. Code size (boot.oat): Before: 58405348 After: 57803236 TODO: X86 doesn't have the case for large frames. This could cause an incoming signal to go past the end of the stack (unlikely however). Change-Id: Ie3a5635cd6fb09de27960e1f8cee45bfae38fb33
call_arm.cc
|
60d7a65f7fb60f502160a2e479e86014c7787553 |
14-Mar-2014 |
Brian Carlstrom <bdc@google.com> |
Fix stack overflow for mutual recursion. There was an error where we would have a pc that was in the method which generated the stack overflow. This didn't work however because the stack overflow check was before we stored the method in the stack. The result was that the stack overflow handler had a PC which wasn't necessarily in the method at the top of the stack. This is now fixed by always restoring the link register before branching to the throw entrypoint. Slight code size regression on ARM/Mips (unmeasured). Regression on ARM is 4 bytes of code per stack overflow check. Some of this regression is mitigated by having one less GC safepoint. Also adds test case for StackOverflowError issue (from bdc). Tests passing: ARM, X86, Mips Phone booting: ARM Bug: https://code.google.com/p/android/issues/detail?id=66411 Bug: 12967914 Change-Id: I96fe667799458b58d1f86671e051968f7be78d5d (cherry-picked from c0f96d03a1855fda7d94332331b94860404874dd)
utility_arm.cc
|
d7f8e02041e9d16160bc81bd1fa19189bffc04b3 |
13-Mar-2014 |
Zheng Xu <zheng.xu@arm.com> |
ARM: Do not allocate temp registers in MulLong if possible. Just use rl_result if we have enough registers and it is *not* either operand. Change-Id: I5a6f3ec09653b97e41bbc6dce823aa8534f98a13
int_arm.cc
|
b373e091eac39b1a79c11f2dcbd610af01e9e8a9 |
21-Feb-2014 |
Dave Allison <dallison@google.com> |
Implicit null/suspend checks (oat version bump) This adds the ability to use SEGV signals to throw NullPointerException exceptions from Java code rather than having the compiler generate explicit comparisons and branches. It does this by using sigaction to trap SIGSEGV and when triggered makes sure it's in compiled code and if so, sets the return address to the entry point to throw the exception. It also uses this signal mechanism to determine whether to check for thread suspension. Instead of the compiler generating calls to a function to check for threads being suspended, the compiler will now load indirect via an address in the TLS area. To trigger a suspend, the contents of this address are changed from something valid to 0. A SIGSEGV will occur and the handler will check for a valid instruction pattern before invoking the thread suspension check code. If a user program taps SIGSEGV it will prevent our signal handler working. This will cause a failure in the runtime. There are two signal handlers at present. You can control them individually using the flags -implicit-checks: on the runtime command line. This takes a string parameter, a comma separated set of strings. Each can be one of: none switch off null null pointer checks suspend suspend checks all all checks So to switch only suspend checks on, pass: -implicit-checks:suspend There is also -explicit-checks to provide the reverse once we change the default. For dalvikvm, pass --runtime-arg -implicit-checks:foo,bar The default is -implicit-checks:none There is also a property 'dalvik.vm.implicit_checks' whose value is the same string as the command option. The default is 'none'. For example to switch on null checks using the option: setprop dalvik.vm.implicit_checks null It only works for ARM right now. Bumps OAT version number due to change to Thread offsets. Bug: 13121132 Change-Id: If743849138162f3c7c44a523247e413785677370
call_arm.cc
codegen_arm.h
int_arm.cc
target_arm.cc
|
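The implicit-check mechanism described in the commit above hinges on trapping SIGSEGV with sigaction. A heavily simplified, hypothetical sketch follows — the real handler (as the commit explains) verifies that the faulting PC lies in compiled code and rewrites the saved context to branch to the exception-throw entrypoint, whereas this one merely reports the fault address and restores the default action; all names here are made up for illustration.

```cpp
#include <csignal>
#include <cstdio>

// Placeholder handler: report the fault, then let a re-raised fault abort.
static void FaultHandler(int /*sig*/, siginfo_t* info, void* /*ucontext*/) {
  std::fprintf(stderr, "implicit check fault at %p\n", info->si_addr);
  std::signal(SIGSEGV, SIG_DFL);  // restore default so a real crash still dies
}

// Install the SIGSEGV handler; returns true on success.
bool InstallImplicitCheckHandler() {
  struct sigaction sa = {};
  sa.sa_sigaction = FaultHandler;
  sa.sa_flags = SA_SIGINFO;
  return sigaction(SIGSEGV, &sa, nullptr) == 0;
}
```

With such a handler in place, a null-pointer dereference emitted without an explicit compare-and-branch reaches FaultHandler instead of crashing the process, which is what lets the compiler drop the explicit checks.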
0f6784737882199197796b67b99e5f1ded383bee |
11-Mar-2014 |
Ian Rogers <irogers@google.com> |
Unify 64bit int constant definitions. LL and ULL prefixes are word size dependent, use the INT64_C and UINT64_C macros instead. Change-Id: I5b70027651898814fc0b3e9e22a18a1047e76cb9
fp_arm.cc
utility_arm.cc
|
3dfc5c168506b89e345c977355a4eabebfede72a |
10-Mar-2014 |
Ian Rogers <irogers@google.com> |
Merge "ARM: Inline codegen for long-to-float on ARM."
|
e19649a91702234f9aa9941d76da447a1e0dcc2a |
27-Feb-2014 |
Zheng Xu <zheng.xu@arm.com> |
ARM: Remove duplicated instructions; add vcvt, vmla, vmls disassembler. Remove kThumb2VcvtID in the assembler which was duplicated. Add vcvt, vmla, vmls in the disassembler. Change-Id: I14cc39375c922c9917274d8dcfcb515e888fdf26
arm_lir.h
assemble_arm.cc
fp_arm.cc
|
f0e6c9c7b395f4fce4d00d31cabd362302e1d924 |
10-Mar-2014 |
Zheng Xu <zheng.xu@arm.com> |
ARM: Inline codegen for long-to-float on ARM. long-to-double-to-float using vfp instructions should be faster than the function provided by toolchain. Change-Id: I7ff809bca6665f0c1a0d7e6db98d570ce86b7c66
fp_arm.cc
|
83cc7ae96d4176533dd0391a1591d321b0a87f4f |
12-Feb-2014 |
Vladimir Marko <vmarko@google.com> |
Create a scoped arena allocator and use that for LVN. This saves more than 0.5s of boot.oat compilation time on Nexus 5. TODO: Move other stuff to the scoped allocator. This CL alone increases the peak memory allocation. By reusing the memory for other parts of the compilation we should reduce this overhead. Change-Id: Ifbc00aab4f3afd0000da818dfe68b96713824a08
call_arm.cc
target_arm.cc
|
a1a7074eb8256d101f7b5d256cda26d7de6ce6ce |
03-Mar-2014 |
Vladimir Marko <vmarko@google.com> |
Rewrite kMirOpSelect for all IF_ccZ opcodes. Also improve special cases for ARM and add tests. Change-Id: I06f575b9c7b547dbc431dbfadf2b927151fe16b9
int_arm.cc
|
00e1ec6581b5b7b46ca4c314c2854e9caa647dd2 |
28-Feb-2014 |
Bill Buzbee <buzbee@android.com> |
Revert "Revert "Rework Quick compiler's register handling"" This reverts commit 86ec520fc8b696ed6f164d7b756009ecd6e4aace. Ready. Fixed the original typo, plus some mechanical changes for rebasing. Still needs additional testing, but the problem with the original CL appears to have been a typo in the definition of the x86 double return template RegLocation. Change-Id: I828c721f91d9b2546ef008c6ea81f40756305891
arm_lir.h
call_arm.cc
codegen_arm.h
fp_arm.cc
int_arm.cc
target_arm.cc
|
dbb8c49d540edd2a39076093163c7218f03aa502 |
28-Feb-2014 |
Vladimir Marko <vmarko@google.com> |
Remove non-existent ARM insn kThumb2SubsRRI12. For kOpSub/kOpAdd, prefer modified immediate encodings because they set flags. Change-Id: I41dcd2d43ba1e62120c99eaf9106edc61c41e157
arm_lir.h
assemble_arm.cc
call_arm.cc
int_arm.cc
utility_arm.cc
|
86ec520fc8b696ed6f164d7b756009ecd6e4aace |
26-Feb-2014 |
Bill Buzbee <buzbee@android.com> |
Revert "Rework Quick compiler's register handling" This reverts commit 2c1ed456dcdb027d097825dd98dbe48c71599b6c. Change-Id: If88d69ba88e0af0b407ff2240566d7e4545d8a99
arm_lir.h
call_arm.cc
codegen_arm.h
fp_arm.cc
int_arm.cc
target_arm.cc
|
2c1ed456dcdb027d097825dd98dbe48c71599b6c |
20-Feb-2014 |
buzbee <buzbee@google.com> |
Rework Quick compiler's register handling For historical reasons, the Quick backend found it convenient to consider all 64-bit Dalvik values held in registers to be contained in a pair of 32-bit registers. Though this worked well for ARM (with double-precision registers also treated as a pair of 32-bit single-precision registers) it doesn't play well with other targets. And, it is somewhat problematic for 64-bit architectures. This is the first of several CLs that will rework the way the Quick backend deals with physical registers. The goal is to eliminate the "64-bit value backed with 32-bit register pair" requirement from the target-independent portions of the backend and support 64-bit registers throughout. The key RegLocation struct, which describes the location of Dalvik virtual register & register pairs, previously contained fields for high and low physical registers. The low_reg and high_reg fields are being replaced with a new type: RegStorage. There will be a single instance of RegStorage for each RegLocation. Note that RegStorage does not increase the space used. It is 16 bits wide, the same as the sum of the 8-bit low_reg and high_reg fields. At a target-independent level, it will describe whether the physical register storage associated with the Dalvik value is a single 32 bit, single 64 bit, pair of 32 bit or vector. The actual register number encoding is left to the target-dependent code layer. Because physical register handling is pervasive throughout the backend, this restructuring necessarily involves large CLs with lots of changes. I'm going to roll these out in stages, and attempt to segregate the CLs with largely mechanical changes from those which restructure or rework the logic. This CL is of the mechanical change variety - it replaces low_reg and high_reg from RegLocation and introduces RegStorage. It also includes a lot of new code (such as many calls to GetReg()) that should go away in upcoming CLs. 
The tentative plan for the subsequent CLs is: o Rework standard register utilities such as AllocReg() and FreeReg() to use RegStorage instead of ints. o Rework the target-independent GenXXX, OpXXX, LoadValue, StoreValue, etc. routines to take RegStorage rather than int register encodings. o Take advantage of the vector representation and eliminate the current vector field in RegLocation. o Replace the "wide" variants of codegen utilities that take low_reg/high_reg pairs with versions that use RegStorage. o Add 64-bit register target independent codegen utilities where possible, and where not virtualize with 32-bit general register and 64-bit general register variants in the target dependent layer. o Expand/rework the LIR def/use flags to allow for more registers (currently, we lose out on 16 MIPS floating point regs as well as ARM's D16..D31 for lack of space in the masks). o [Possibly] move the float/non-float determination of a register from the target-dependent encoding to RegStorage. In other words, replace IsFpReg(register_encoding_bits). At the end of the day, all code in the target independent layer should be using RegStorage, as should much of the target dependent layer. Ideally, we won't be using the physical register number encoding extracted from RegStorage (i.e. GetReg()) until the NewLIRx() layer. Change-Id: Idc5c741478f720bdd1d7123b94e4288be5ce52cb
arm_lir.h
call_arm.cc
codegen_arm.h
fp_arm.cc
int_arm.cc
target_arm.cc
|
3bc01748ef1c3e43361bdf520947a9d656658bf8 |
06-Feb-2014 |
Razvan A Lupusoru <razvan.a.lupusoru@intel.com> |
GenSpecialCase support for x86 Moved GenSpecialCase from being ARM specific to common code to allow it to be used by x86 quick as well. Change-Id: I728733e8f4c4da99af6091ef77e5c76ae0fee850 Signed-off-by: Razvan A Lupusoru <razvan.a.lupusoru@intel.com>
call_arm.cc
codegen_arm.h
target_arm.cc
|
502c2a84888b7da075049dcaaeb0156602304f65 |
06-Feb-2014 |
Vladimir Marko <vmarko@google.com> |
Generate ARM special methods from InlineMethod data. Change-Id: I204b01660a1e515879524018d1371e31f41da59b
call_arm.cc
codegen_arm.h
int_arm.cc
|
c9bf407643329fee7eb2603fdace46eebf618cc6 |
10-Feb-2014 |
Vladimir Marko <vmarko@google.com> |
Fix special getter/setter generation. Change-Id: I381618bdcc46c51b50e94042f332db99c3a71a38
call_arm.cc
|
2bc47809febcf36369dd40877b8226318642b428 |
10-Feb-2014 |
Vladimir Marko <vmarko@google.com> |
Revert "Revert "Check FastInstance() early for special getters and setters."" This reverts commit 632e458dc267fadfb8120be3ab02701e09e64875. Change-Id: I5098c41ee84fbbb39397133a7ecfd367fecebe42
call_arm.cc
codegen_arm.h
|
632e458dc267fadfb8120be3ab02701e09e64875 |
08-Feb-2014 |
Ian Rogers <irogers@google.com> |
Revert "Check FastInstance() early for special getters and setters." This reverts commit 5dc5727261e87ba8a418e2d0e970c75f67e4ab79. Change-Id: I3299c8ca5c3ce3f2de994bab61ea16a734f1de33
call_arm.cc
codegen_arm.h
|
5dc5727261e87ba8a418e2d0e970c75f67e4ab79 |
05-Feb-2014 |
Vladimir Marko <vmarko@google.com> |
Check FastInstance() early for special getters and setters. Perform the FastInstance() check for getters and setters when they are detected by the inliner. This will help avoid the FastInstance() check for inlining. We also record the field offset and whether the field is volatile and whether the method is static for use when inlining or generating the special accessors. Change-Id: I3f832fc9ae263883b8a984be89a3b7793398b55a
call_arm.cc
codegen_arm.h
|
2c498d1f28e62e81fbdb477ff93ca7454e7493d7 |
30-Jan-2014 |
Razvan A Lupusoru <razvan.a.lupusoru@intel.com> |
Specializing x86 range argument copying The ARM implementation of range argument copying was specialized in some cases. For all other architectures, it would fall back to generating memcpy. This patch updates the x86 implementation so it does not call memcpy and instead generates loads and stores, favoring movement of 128-bit chunks. Change-Id: Ic891e5609a4b0e81a47c29cc5a9b301bd10a1933 Signed-off-by: Razvan A Lupusoru <razvan.a.lupusoru@intel.com>
codegen_arm.h
utility_arm.cc
|
4708dcd68eebf1173aef1097dad8ab13466059aa |
22-Jan-2014 |
Mark Mendell <mark.p.mendell@intel.com> |
Improve x86 long multiply and shifts Generate inline code for long shifts by constants and do long multiplication inline. Convert multiplication by a constant to a shift when we can. Fix some x86 assembler problems and add the new instructions that were needed (64 bit shifts). Change-Id: I6237a31c36159096e399d40d01eb6bfa22ac2772 Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>
codegen_arm.h
int_arm.cc
|
a278ac31a1beeebd093ec64026d27a02fdc28807 |
24-Jan-2014 |
Ian Rogers <irogers@google.com> |
Merge "Improve x86 long divide"
|
67122a03a4c66e01c4b64364a3701fe6ec3c5a18 |
24-Jan-2014 |
Ian Rogers <irogers@google.com> |
Merge "64bit friendly printf modifiers in LIR dumping."
|
2bf31e67694da24a19fc1f328285cebb1a4b9964 |
23-Jan-2014 |
Mark Mendell <mark.p.mendell@intel.com> |
Improve x86 long divide Implement inline division for literal and variable divisors. Use the general case for dividing by a literal by using a double length multiply by the appropriate constant with fixups. This is the Hacker's Delight algorithm. Change-Id: I563c250f99d89fca5ff8bcbf13de74de13815cfe Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>
codegen_arm.h
int_arm.cc
|
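The "double length multiply by the appropriate constant with fixups" in the commit above is the classic Hacker's Delight magic-number division. A sketch for the single divisor 3 — `Div3` is a hypothetical name for illustration, not the generated code itself:

```cpp
#include <cstdint>

// Divide a signed 32-bit value by 3 with no divide instruction:
// multiply by the magic constant 0x55555556 in 64-bit arithmetic,
// keep the high 32 bits, then fix up negative dividends.
int32_t Div3(int32_t n) {
  int32_t q = static_cast<int32_t>((INT64_C(0x55555556) * n) >> 32);
  return q - (n >> 31);  // n >> 31 is -1 for negative n, so this adds 1
}
```

The general scheme derives a (magic, shift) pair per literal divisor; 3 happens to need no extra shift, which keeps the example short.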
3f5b42f1d31c877abca2571a51dd0a5055a9b94c |
24-Jan-2014 |
Vladimir Marko <vmarko@google.com> |
Merge "Optimize x86 long arithmetic"
|
107c31e598b649a8bb8d959d6a0377937e63e624 |
24-Jan-2014 |
Ian Rogers <irogers@google.com> |
64bit friendly printf modifiers in LIR dumping. Also correct header file inclusion ordering. Change-Id: I8fb99e80cf1487e8b2278d4c1d110d14ed18c086
codegen_arm.h
target_arm.cc
|
e02d48fb24747f90fd893e1c3572bb3c500afced |
15-Jan-2014 |
Mark Mendell <mark.p.mendell@intel.com> |
Optimize x86 long arithmetic Be smarter about taking advantage of a constant operand for x86 long add/sub/and/or/xor. Using instructions with immediates and generating results directly into memory reduces the number of temporary registers and avoids hardcoded register usage. Also rewrite the existing non-const x86 arithmetic to avoid fixed register use, and use the fact that x86 instructions are two operand. Pass the opcode to the XXXLong() routines to easily detect two operand DEX opcodes. Add a new StoreFinalValueWide() routine, which is similar to StoreValueWide, but doesn't do an EvalLoc to allocate registers. The src operand must already be in registers, and it just updates the dest location, and calls the right live/dirty routines to get the src into the dest properly. Change-Id: Iefc16e7bc2236a73dc780d3d5137ae8343171f62 Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>
codegen_arm.h
int_arm.cc
|
a894607bca7eb623bc957363e4b36f44cfeea1b6 |
22-Jan-2014 |
Vladimir Marko <vmarko@google.com> |
Move fused cmp branch ccode to MIR::meta. This a small refactoring towards removing the large DecodedInstruction from the MIR class. Change-Id: I10f9ed5eaac42511d864c71d20a8ff6360292cec
fp_arm.cc
int_arm.cc
|
d61ba4ba6fcde666adb5d5c81b1c32f0534fb2c8 |
13-Jan-2014 |
Bill Buzbee <buzbee@android.com> |
Revert "Revert "Better support for x86 XMM registers"" This reverts commit 8ff67e3338952c70ccf3b609559bf8cc0f379cfd. Fix applied to loc.fp usage. Change-Id: I1eb3005392544fcf30c595923ed25bcee2dc4859
arm_lir.h
|
8ff67e3338952c70ccf3b609559bf8cc0f379cfd |
11-Jan-2014 |
Bill Buzbee <buzbee@android.com> |
Revert "Better support for x86 XMM registers" The invalid usage of loc.fp must be corrected before this change can be submitted. This reverts commit 766a5e5940b469ab40e52770862c81cfec1d835b. Change-Id: I1173a9bf829da89cccd9c2898f5e11164987a22b
arm_lir.h
|
766a5e5940b469ab40e52770862c81cfec1d835b |
10-Jan-2014 |
Mark Mendell <mark.p.mendell@intel.com> |
Better support for x86 XMM registers Currently, ART Quick mode assumes that a double FP register is composed of two single consecutive FP registers. This is true for ARM and MIPS, but not x86. This means that only half of the 8 XMM registers are available for use by x86 doubles. This patch breaks the assumption that a wide FP RegisterLocation must be a paired set of FP registers. This is done by making some routines in common code virtual and overriding them in the X86Mir2Lir class. For these wide fp locations, the high register is set to the same value as the low register, in order to minimize changes to common code. In a couple of places, the common code checks for this case. The changes are also supposed to allow the possibility of using the XMM registers for vector operations, but that support is still WIP. Change-Id: Ic6ef24ea764991c6f4d9fb88d483a619f5a468cb Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>
arm_lir.h
|
0adc680c388913a63666797e907f87c4c6b0b4ea |
08-Jan-2014 |
Ian Rogers <irogers@google.com> |
Merge "Add conditional move support to x86 and allow GenMinMax to use it"
|
ef6a776af2b4b8607d5f91add0ed0e8497100e31 |
20-Dec-2013 |
Ian Rogers <irogers@google.com> |
Inline codegen for long-to-double on ARM. Change-Id: I4fc443c1b942a2231d680fc2c7a1530c86104584
arm_lir.h
assemble_arm.cc
fp_arm.cc
|
988e6ea9ac66edf1e205851df9bb53de3f3763f3 |
08-Jan-2014 |
Ian Rogers <irogers@google.com> |
Fix -O0 builds. Use snprintf rather than sprintf to avoid Werror failures. Work around an annotalysis bug when compiling -O0. Change-Id: Ie7e0a70dbceea5fa85f98262b91bcdbd74fdef1c
target_arm.cc
|
bd288c2c1206bc99fafebfb9120a83f13cf9723b |
21-Dec-2013 |
Razvan A Lupusoru <razvan.a.lupusoru@intel.com> |
Add conditional move support to x86 and allow GenMinMax to use it X86 supports conditional moves which is useful for reducing branchiness. This patch adds support to the x86 backend to generate conditional reg to reg operations. Both encoder and decoder support was added for cmov. The x86 version of GenMinMax used for generating inlined version Math.min/max has been updated to make use of the conditional move support. Change-Id: I92c5428e40aa8ff88bd3071619957ac3130efae7 Signed-off-by: Razvan A Lupusoru <razvan.a.lupusoru@intel.com>
codegen_arm.h
utility_arm.cc
|
47d79fd31295b29e4abeb6d3fc318e6a6dd1e97c |
20-Dec-2013 |
Vladimir Marko <vmarko@google.com> |
Merge "Clean up usage of carry flag condition codes."
|
58af1f9385742f70aca4fcb5e13aba53b8be2ef4 |
19-Dec-2013 |
Vladimir Marko <vmarko@google.com> |
Clean up usage of carry flag condition codes. On X86, kCondUlt and kCondUge are bound to CS and CC, respectively, while on ARM it's the other way around. The explicit binding in ConditionCode was wrong and misleading and could lead to subtle bugs. Therefore, we detach those constants and clean up usage. The CS and CC conditions are now effectively unused but we keep them around as they may eventually be useful. And some minor cleanup and comments. Change-Id: Ic5ed81d86b6c7f9392dd8fe9474b3ff718fee595
call_arm.cc
fp_arm.cc
int_arm.cc
target_arm.cc
utility_arm.cc
|
b122a4bbed34ab22b4c1541ee25e5cf22f12a926 |
20-Nov-2013 |
Ian Rogers <irogers@google.com> |
Tidy up memory barriers. Change-Id: I937ea93e6df1835ecfe2d4bb7d84c24fe7fc097b
int_arm.cc
|
5816ed48bc339c983b40dc493e96b97821ce7966 |
27-Nov-2013 |
Vladimir Marko <vmarko@google.com> |
Detect special methods at the end of verification. This moves special method handling to method inliner and prepares for eventual inlining of these methods. Change-Id: I51c51b940fb7bc714e33135cd61be69467861352
call_arm.cc
codegen_arm.h
|
8d4122f24d1d964e91444300045936b42986e00e |
10-Dec-2013 |
Vladimir Marko <vmarko@google.com> |
Merge "Get rid of platform-specific method inliners."
|
867a2b35e67ddcbec089964e8f3cd9a827186e48 |
10-Dec-2013 |
Vladimir Marko <vmarko@google.com> |
Get rid of platform-specific method inliners. The DexFileToMethodInlinerMap dependency on CompilerDriver and its instruction set makes it impossible to implement verification-time checking for methods we want to inline. Therefore, we get rid of the platform-specific method inliners and rely on the backend's existing ability to recognize when it can actually emit an intrinsic function. Change-Id: I57947db93f13a26c1c794cb3584130321106306f
arm_dex_file_method_inliner.cc
arm_dex_file_method_inliner.h
|
31c2aac7137b69d5622eea09597500731fbee2ef |
09-Dec-2013 |
Vladimir Marko <vmarko@google.com> |
Rename ClobberCalleeSave to *Caller*, fix it for x86. Change-Id: I6a72703a11985e2753fa9b4520c375a164301433
call_arm.cc
codegen_arm.h
fp_arm.cc
target_arm.cc
|
1da1e2fceb0030b4b76b43510b1710a9613e0c2e |
15-Nov-2013 |
buzbee <buzbee@google.com> |
More compile-time tuning Another round of compile-time tuning, this time yielding in the vicinity of 3% total reduction in compile time (which means about double that for the Quick Compile portion). Primary improvements are skipping the basic block combine optimization pass when using Quick (because we already have big blocks), combining the null check elimination and type inference passes, and limiting expensive local value number analysis to only those blocks which might benefit from it. Following this CL, the actual compile phase consumes roughly 60% of the total dex2oat time on the host, and 55% on the target (Note, I'm subtracting out the Deduping time here, which the timing logger normally counts against the compiler). A sample breakdown of the compilation time follows (this taken on PlusOne.apk w/ a Nexus 4): 39.00% -> MIR2LIR: 1374.90 (Note: includes local optimization & scheduling) 10.25% -> MIROpt:SSATransform: 361.31 8.45% -> BuildMIRGraph: 297.80 7.55% -> Assemble: 266.16 6.87% -> MIROpt:NCE_TypeInference: 242.22 5.56% -> Dedupe: 196.15 3.45% -> MIROpt:BBOpt: 121.53 3.20% -> RegisterAllocation: 112.69 3.00% -> PcMappingTable: 105.65 2.90% -> GcMap: 102.22 2.68% -> Launchpads: 94.50 1.16% -> MIROpt:InitRegLoc: 40.94 1.16% -> Cleanup: 40.93 1.10% -> MIROpt:CodeLayout: 38.80 0.97% -> MIROpt:ConstantProp: 34.35 0.96% -> MIROpt:UseCount: 33.75 0.86% -> MIROpt:CheckFilters: 30.28 0.44% -> SpecialMIR2LIR: 15.53 0.44% -> MIROpt:BBCombine: 15.41 (cherry pick of 9e8e234af4430abe8d144414e272cd72d215b5f3) Change-Id: I86c665fa7e88b75eb75629a99fd292ff8c449969
assemble_arm.cc
|
3e5af82ae1a2cd69b7b045ac008ac3b394d17f41 |
21-Nov-2013 |
Vladimir Marko <vmarko@google.com> |
Intrinsic Unsafe.CompareAndSwapLong() for ARM. (cherry picked from cb53fcd79b1a5ce608208ec454b5c19f64aaba37) Change-Id: Iadd3cc8b4ed390670463b80f8efd579ce6ece226
arm_dex_file_method_inliner.cc
arm_lir.h
assemble_arm.cc
int_arm.cc
|
1c282e2b9a9b432e132b2c332f861cad9feb4a73 |
21-Nov-2013 |
Vladimir Marko <vmarko@google.com> |
Refactor intrinsic CAS, prepare for 64-bit version. Bug: 11391018 Change-Id: Ic0f740e0cd0eb47f2c915f81be02f52f7721f8a3
arm_dex_file_method_inliner.cc
codegen_arm.h
int_arm.cc
|
2247984899247b1402408d39731ff64048f0e274 |
19-Nov-2013 |
Vladimir Marko <vmarko@google.com> |
Clean up kOpCmp on ARM. kThumb2CmnRI8M is now used. Change-Id: I300299258ed99d86c300dee45c904c360dd44638
int_arm.cc
utility_arm.cc
|
3cebbc759a1e34d5900d35933bb364e160072c1e |
18-Nov-2013 |
Vladimir Marko <vmarko@google.com> |
Merge "Rewrite intrinsics detection." into dalvik-dev
|
332b7aa6220124dc638b9f7e59611c376473f128 |
18-Nov-2013 |
Vladimir Marko <vmarko@google.com> |
Improve Thumb2 instructions' use of constant operands. Rename instructions using modified immediate to use suffix I8M. Many were using I8 which may lead to confusion with Thumb I8 instructions and some were using other suffixes. Add and use CmnRI8M, increase constant range of AddRRI12 and SubRRI12 and use BicRRI8M for applicable kOpAnd constants. In particular, this should marginally improve Math.abs(float) and Math.abs(double) by converting x & 0x7fffffff to BIC. Bug: 11579369 Change-Id: I0f17a9eb80752d2625730a60555152cdffed50ba
arm_lir.h
assemble_arm.cc
fp_arm.cc
int_arm.cc
utility_arm.cc
|
5c96e6b4dc354a7439b211b93462fbe8edea5e57 |
14-Nov-2013 |
Vladimir Marko <vmarko@google.com> |
Rewrite intrinsics detection. Intrinsic methods should be treated as a special case of inline methods. They should be detected early and used to guide other optimizations. This CL rewrites the intrinsics detection so that it can be moved to any compilation phase. Change-Id: I4424a6a869bd98b9c478953c9e3bcaf1c6de2b33
arm_dex_file_method_inliner.cc
arm_dex_file_method_inliner.h
|
7020278bce98a0735dc6abcbd33bdf1ed2634f1d |
23-Oct-2013 |
Dave Allison <dallison@google.com> |
Support hardware divide instruction Bug: 11299025 Uses sdiv for division and a combo of sdiv, mul and sub for modulus. Only does this on processors that are capable of the sdiv instruction, as determined by the build system. Also provides a command line arg --instruction-set-features= to allow cross compilation. Makefile adds the --instruction-set-features= arg to build-time dex2oat runs and defaults it to something obtained from the target architecture. Provides a GetInstructionSetFeatures() function on CompilerDriver that can be queried for various features. The only feature supported right now is hasDivideInstruction(). Also adds a few more instructions to the ARM disassembler. b/11535253 is an addition to this CL to be done later. Change-Id: Ia8aaf801fd94bc71e476902749cf20f74eba9f68
arm_lir.h
assemble_arm.cc
int_arm.cc
utility_arm.cc
|
e508a2090b19fe705fbc6b99d76474037a74bbfb |
04-Nov-2013 |
Vladimir Marko <vmarko@google.com> |
Fix unaligned Memory peek/poke intrinsics. Change-Id: Id454464d0b28aa37f5239f1c6589ceb0b3bbbdea
codegen_arm.h
int_arm.cc
|
88474b416eb257078e590bf9bc7957cee604a186 |
24-Oct-2013 |
Jeff Hao <jeffhao@google.com> |
Implement Interface Method Tables (IMT). Change-Id: Idf7fe85e1293453a8ad862ff2380dcd5db4e3a39
target_arm.cc
|
a8b4caf7526b6b66a8ae0826bd52c39c66e3c714 |
24-Oct-2013 |
Vladimir Marko <vmarko@google.com> |
Add byte swap instructions for ARM and x86. Change-Id: I03fdd61ffc811ae521141f532b3e04dda566c77d
arm_lir.h
assemble_arm.cc
utility_arm.cc
|
a61f49539a59b610e557b5513695295639496750 |
23-Aug-2013 |
buzbee <buzbee@google.com> |
Add timing logger to Quick compiler Current Quick compiler breakdown for compiling the boot class path: MIR2LIR: 29.674% MIROpt:SSATransform: 17.656% MIROpt:BBOpt: 11.508% BuildMIRGraph: 7.815% Assemble: 6.898% MIROpt:ConstantProp: 5.151% Cleanup: 4.916% MIROpt:NullCheckElimination: 4.085% RegisterAllocation: 3.972% GcMap: 2.359% Launchpads: 2.147% PcMappingTable: 2.145% MIROpt:CodeLayout: 0.697% LiteralData: 0.654% SpecialMIR2LIR: 0.323% Change-Id: I9f77e825faf79e6f6b214bb42edcc4b36f55d291
assemble_arm.cc
|
a8d24bf578a1022ff14f89f650074dc39b9667fe |
21-Oct-2013 |
buzbee <buzbee@google.com> |
Merge "64-bit prep" into dalvik-dev
|
0d82948094d9a198e01aa95f64012bdedd5b6fc9 |
12-Oct-2013 |
buzbee <buzbee@google.com> |
64-bit prep Preparation for 64-bit roll. o Eliminated storing pointers in 32-bit int slots in LIR. o General size reductions of common structures to reduce impact of doubled pointer sizes: - BasicBlock struct was 72 bytes, now is 48. - MIR struct was 72 bytes, now is 64. - RegLocation was 12 bytes, now is 8. o Generally replaced uses of BasicBlock* pointers with 16-bit Ids. o Replaced several doubly-linked lists with singly-linked to save one stored pointer per node. o We had quite a few uses of uintptr_t's that were a holdover from the JIT (which used pointers to mapped dex & actual code cache addresses rather than trace-relative offsets). Replaced those with uint32_t's. o Clean up handling of embedded data for switch tables and array data. o Miscellaneous cleanup. I anticipate one or two additional CLs to reduce the size of MIR and LIR structs. Change-Id: I58e426d3f8e5efe64c1146b2823453da99451230
assemble_arm.cc
call_arm.cc
codegen_arm.h
fp_arm.cc
int_arm.cc
target_arm.cc
utility_arm.cc
|
379067c3970fb225332cca25301743f5010d3ef9 |
16-Oct-2013 |
Ian Rogers <irogers@google.com> |
Don't clobber array reg if it's needed for card marking Change-Id: I4377717a2431ffd7e8fafc2e2cca7c1285b38668
int_arm.cc
|
773aab1e8992b2834153eb23c976a4eb0da51a71 |
14-Oct-2013 |
Ian Rogers <irogers@google.com> |
Correct free-ing of temp register. Bug 11199874. The card mark was potentially using a register freed just before. Make the free-ing of temps strongly correspond to their allocation. Change-Id: I3d1e8c923b7fd8b3666e841d3ff9a46e6eb58318
int_arm.cc
|
409fe94ad529d9334587be80b9f6a3d166805508 |
11-Oct-2013 |
buzbee <buzbee@google.com> |
Quick assembler fix This CL re-instates the select pattern optimization disabled by CL 374310, and fixes the underlying problem: improper handling of the kPseudoBarrier LIR opcode. The bug was introduced in the recent assembler restructuring. In short, LIR pseudo opcodes (which have values < 0), should always have size 0 - and thus cause no bits to be emitted during assembly. In this case, bad logic caused us to set the size of a kPseudoBarrier opcode via lookup through the EncodingMap. Because all pseudo ops are < 0, this meant we did an array underflow load, picking up whatever garbage was located before the EncodingMap. This explains why this error showed up recently - we'd previously just gotten a lucky layout. This CL corrects the faulty logic, and adds DCHECKs to uses of the EncodingMap to ensure that we don't try to access w/ a pseudo op. Additionally, the existing is_pseudo_op() macro is replaced with IsPseudoLirOp(), named similar to the existing IsPseudoMirOp(). Change-Id: I46761a0275a923d85b545664cadf052e1ab120dc
assemble_arm.cc
target_arm.cc
utility_arm.cc
|
a9a8254c920ce8e22210abfc16c9842ce0aea28f |
04-Oct-2013 |
Ian Rogers <irogers@google.com> |
Improve quick codegen for aput-object. 1) don't type check known null. 2) if we know types in verify don't check at runtime. 3) if we're runtime checking then move all the code out-of-line. Also, don't set up a callee-save frame for check-cast, do an instance-of test then throw an exception if that fails. Tidy quick entry point of Ldivmod to Lmod which it is on x86 and mips. Fix monitor-enter/exit NPE for MIPS. Fix benign bug in mirror::Class::CannotBeAssignedFromOtherTypes, a byte[] cannot be assigned to from other types. Change-Id: I9cb3859ec70cca71ed79331ec8df5bec969d6745
codegen_arm.h
int_arm.cc
|
c68fb2094b562186a571f496fc46ad2b85b02a39 |
02-Oct-2013 |
Ian Rogers <irogers@google.com> |
Merge "Inflate contended lock word by suspending owner." into dalvik-dev
|
d9c4fc94fa618617f94e1de9af5f034549100753 |
02-Oct-2013 |
Ian Rogers <irogers@google.com> |
Inflate contended lock word by suspending owner. Bug 6961405. Don't inflate monitors for Notify and NotifyAll. Tidy lock word, handle recursive lock case alongside unlocked case and move assembly out of line (except for ARM quick). Also handle null in out-of-line assembly as the test is quick and the enter/exit code is already a safepoint. To gain ownership of a monitor on behalf of another thread, monitor contenders must not hold the monitor_lock_, so they wait on a condition variable. Reduce size of per mutex contention log. Be consistent in calling thin lock thread ids just thread ids. Fix potential thread death races caused by the use of FindThreadByThreadId, make it invariant that returned threads are either self or suspended now. Code size reduction on ARM boot.oat 0.2%. Old nexus 7 speedup 0.25%, new nexus 7 speedup 1.4%, nexus 10 speedup 2.24%, nexus 4 speedup 2.09% on DeltaBlue. Change-Id: Id52558b914f160d9c8578fdd7fc8199a9598576a
call_arm.cc
int_arm.cc
|
b48819db07f9a0992a72173380c24249d7fc648a |
15-Sep-2013 |
buzbee <buzbee@google.com> |
Compile-time tuning: assembly phase Not as much compile-time gain from reworking the assembly phase as I'd hoped, but still worthwhile. Should see ~2% improvement thanks to the assembly rework. On the other hand, expect some huge gains for some applications thanks to better detection of large machine-generated init methods. Thinkfree shows a 25% improvement. The major assembly change was to thread the LIR nodes that require fixup into a fixup chain. Only those are processed during the final assembly pass(es). This doesn't help for methods which only require a single pass to assemble, but does speed up the larger methods which required multiple assembly passes. Also replaced the block_map_ basic block lookup table (which contained space for a BasicBlock* for each dex instruction unit) with a block id map - cutting its space requirements by half in a 32-bit pointer environment. Changes: o Reduce size of LIR struct by 12.5% (one of the big memory users) o Repurpose the use/def portion of the LIR after optimization complete. o Encode instruction bits to LIR o Thread LIR nodes requiring pc fixup o Change follow-on assembly passes to only consider fixup LIRs o Switch on pc-rel fixup kind o Fast-path for small methods - single pass assembly o Avoid using cb[n]z for null checks (almost always exceed displacement) o Improve detection of large initialization methods. o Rework def/use flag setup. o Remove a sequential search from FindBlock using lookup table of 16-bit block ids rather than full block pointers. o Eliminate pcRelFixup and use fixup kind instead. o Add check for 16-bit overflow on dex offset. Change-Id: I4c6615f83fed46f84629ad6cfe4237205a9562b4
arm_lir.h
assemble_arm.cc
codegen_arm.h
int_arm.cc
target_arm.cc
utility_arm.cc
|
1fc5800def46a2fa6cbd235fcb8af099ee35a127 |
18-Sep-2013 |
buzbee <buzbee@google.com> |
Art compiler: minor instruction assembler fix During the assembly phase, we iteratively walk through the LIR encoding instructions until we can complete a full pass without having to change the sequence because of displacement overflow. In the (fairly common) situation in which a 16-bit cbnz/cbz can't reach the target, we expand it to a compare and branch sequence. Initially, we use a 16-bit Thumb1 unconditional branch, which itself may be expanded in a future pass to a 32-bit branch. The original cbnz/cbz LIR is converted into a cmp, and a new branch instruction is inserted following. The problem here is that by doing a following insertion, that new instruction will be the next one considered to determine if it can reach its target. Because it is new, though, its starting offset will show as zero - making it much more likely that it will be treated as a displacement overflow and be converted to a 32-bit branch. This is not a correctness issue - the bad offset will be corrected on the next pass, but it does result in unnecessary uses of 32-bit branches where 16-bit ones would work. Change-Id: Ie68a93fd319f0f7c603e1d870588047ad6a0779f
assemble_arm.cc
|
2de2aa1a96dfa5bebc004f29b5dbfafd37039cee |
13-Sep-2013 |
Jeff Hao <jeffhao@google.com> |
Make inlined CAS32 loop until store is successful if values match. The native implementation of compareAndSwap uses android_atomic_cas, which will repeat the strex until it succeeds. The compiled version was changed to do the same. Bug: 10530407 Change-Id: I7efb3f92d0d0610fcc5a885e2c97f1d701b5a4ea
int_arm.cc
|
77695d2e13d522426c973546391c07ac88242bc2 |
12-Sep-2013 |
Jeff Hao <jeffhao@google.com> |
am 715084a2: am 9d7e507f: am 95848d01: Revert "Fix CAS intrinsic to clear exclusive if values don't match." * commit '715084a24a9db1b898c38bbf4a8a7383da76e326': Revert "Fix CAS intrinsic to clear exclusive if values don't match."
|
95848d01adae14c6a9ba433f6789a9462edb8e7d |
12-Sep-2013 |
Jeff Hao <jeffhao@google.com> |
Revert "Fix CAS intrinsic to clear exclusive if values don't match." Ian is correct. I can still see this bug even with this change. This reverts commit 3a0831507637028a439712dedaaddd7cd0893995. Change-Id: I780f2de926f1ff7576adc679c56a6cf491dad127
int_arm.cc
|
3a0831507637028a439712dedaaddd7cd0893995 |
12-Sep-2013 |
Jeff Hao <jeffhao@google.com> |
Fix CAS intrinsic to clear exclusive if values don't match. The LDREX has a matching STREX if the values match, but it needed a CLREX for the case where they didn't. Bug: 10530407 Change-Id: I46b474cca326a251536e7f214c80486694431386 (cherry picked from commit 78765e84a3654357a03f84b76985556cf7d9731a)
int_arm.cc
|
78765e84a3654357a03f84b76985556cf7d9731a |
12-Sep-2013 |
Jeff Hao <jeffhao@google.com> |
Fix CAS intrinsic to clear exclusive if values don't match. The LDREX has a matching STREX if the values match, but it needed a CLREX for the case where they didn't. Bug: 10530407 Change-Id: I46b474cca326a251536e7f214c80486694431386
int_arm.cc
|
bd663de599b16229085759366c56e2ed5a1dc7ec |
11-Sep-2013 |
buzbee <buzbee@google.com> |
Compile-time tuning: register/bb utilities This CL yields about a 4% improvement in the compilation phase of dex2oat (single-threaded; multi-threaded compilation is more difficult to accurately measure). The register utilities could stand to be completely rewritten, but this gets most of the easy benefit. Next up: the assembly phase. Change-Id: Ife5a474e9b1a6d9e501e888dda6749d34eb77e96
codegen_arm.h
target_arm.cc
|
252254b130067cd7a5071865e793966871ae0246 |
09-Sep-2013 |
buzbee <buzbee@google.com> |
More Quick compile-time tuning: labels & branches This CL represents a roughly 3.5% performance improvement for the compile phase of dex2oat. Most of the gain comes from avoiding the generation of dex boundary LIR labels unless a debug listing is requested. The other significant change is moving from a basic block ending branch model of "always generate a fall-through branch, and then delete it if we can" to a "only generate a fall-through branch if we need it" model. The data motivating these changes follow. Note that two areas of potentially attractive gain remain: restructuring the assembler model and reworking the register handling utilities. These will be addressed in subsequent CLs. --- data follows The Quick compiler's assembler has shown up on profile reports a bit more than seems reasonable. We've tried a few quick fixes to apparently hot portions of the code, but without much gain. So, I've been looking at the assembly process at a somewhat higher level. There look to be several potentially good opportunities. First, an analysis of the makeup of the LIR graph showed a surprisingly high proportion of LIR pseudo ops. Using the boot classpath as a basis, we get: 32.8% of all LIR nodes are pseudo ops. 10.4% are LIR instructions which require pc-relative fixups. 11.8% are LIR instructions that have been nop'd by the various optimization passes. Looking only at the LIR pseudo ops, we get: kPseudoDalvikByteCodeBoundary 43.46% kPseudoNormalBlockLabel 21.14% kPseudoSafepointPC 20.20% kPseudoThrowTarget 6.94% kPseudoTarget 3.03% kPseudoSuspendTarget 1.95% kPseudoMethodExit 1.26% kPseudoMethodEntry 1.26% kPseudoExportedPC 0.37% kPseudoCaseLabel 0.30% kPseudoBarrier 0.07% kPseudoIntrinsicRetry 0.02% Total LIR count: 10167292 The standout here is the Dalvik opcode boundary marker. This is just a label inserted at the beginning of the codegen for each Dalvik bytecode. If we're also doing a verbose listing, this is also where we hang the pretty-print disassembly string.
However, this label was also being used as a convenient way to find the target of switch case statements (and, I think at one point was used in the Mir->GBC conversion process). This CL moves the use of kPseudoDalvikByteCodeBoundary labels to only verbose listing runs, and replaces the codegen uses of the label with the kPseudoNormalBlockLabel attached to the basic block that contains the switch case target. Great savings here - 14.3% reduction in the number of LIR nodes needed. After this CL, our LIR pseudo proportions drop to 21.6% of all LIR. That's still a lot, but much better. Possible further improvements via combining normal labels with kPseudoSafepointPC labels where appropriate, and also perhaps reduce memory usage by using a short-hand form for labels rather than a full LIR node. Also, many of the basic block labels are no longer branch targets by the time we get to assembly - cheaper to delete, or just ignore? Here's the "after" LIR pseudo op breakdown: kPseudoNormalBlockLabel 37.39% kPseudoSafepointPC 35.72% kPseudoThrowTarget 12.28% kPseudoTarget 5.36% kPseudoSuspendTarget 3.45% kPseudoMethodEntry 2.24% kPseudoMethodExit 2.22% kPseudoExportedPC 0.65% kPseudoCaseLabel 0.53% kPseudoBarrier 0.12% kPseudoIntrinsicRetry 0.04% Total LIR count: 5748232 Not done in this CL, but it will be worth experimenting with actually deleting LIR nodes from the graph when they are optimized away, rather than just setting the NOP bit. Keeping them around is invaluable during debugging - but when not debugging it may pay off if the cost of node removal is less than the cost of traversing through dead nodes in subsequent passes. Next up (and partially in this CL - but mostly to be done in follow-on CLs) is the overall assembly process. Inherited from the trace JIT, the Quick compiler has a fairly simple-minded approach to instruction assembly. First, a pass is made over the LIR list to assign offsets to each instruction.
Then, the assembly pass is made - which generates the actual machine instruction bit patterns and pushes the instruction data into the code_buffer. However, the code generator takes the "always optimistic" approach to instruction selection and emits the shortest instruction. If, during assembly, we find that a branch or load doesn't reach, that short-form instruction is replaced with a longer sequence. Of course, this invalidates the previously-computed offset calculations. Assembly thus is an iterative process: compute offsets and then assemble until we survive an assembly pass without invalidation. This seems like a likely candidate for improvement. First, I analyzed the number of retries required, and the reason for invalidation over the boot classpath load. The results: more than half of methods don't require a retry, and very few require more than 1 extra pass: 5 or more: 6 of 96334 4 or more: 22 of 96334 3 or more: 140 of 96334 2 or more: 1794 of 96334 - 2% 1 or more: 40911 of 96334 - 40% 0 retries: 55423 of 96334 - 58% The interesting group here is the one that requires 1 retry. Looking at the reason, we see three typical reasons: 1. A cbnz/cbz doesn't reach (only 7 bits of offset) 2. A 16-bit Thumb1 unconditional branch doesn't reach. 3. An unconditional branch which branches to the next instruction is encountered, and deleted. The first 2 cases are the cost of the optimistic strategy - nothing much to change there. However, the interesting case is #3 - dead branch elimination. A further analysis of the single retry group showed that 42% of the methods (16305) that required a single retry did so *only* because of dead branch elimination. The big question here is why so many dead branches survive to the assembly stage. We have a dead branch elimination pass which is supposed to catch these - perhaps it's not working correctly, should be moved later in the optimization process, or perhaps run multiple times.
Other things to consider: o Combine the offset generation pass with the assembly pass. Skip pc-relative fixup assembly (other than assigning offset), but push LIR* for them into work list. Following the main pass, zip through the work list and assemble the pc-relative instructions (now that we know the offsets). This would significantly cut back on traversal costs. o Store the assembled bits into both the code buffer and the LIR. In the event we have to retry, only the pc-relative instructions would need to be assembled, and we'd finish with a pass over the LIR just to dump the bits into the code buffer. Change-Id: I50029d216fa14f273f02b6f1c8b6a0dde5a7d6a6
assemble_arm.cc
call_arm.cc
int_arm.cc
|
9b297bfc588c7d38efd12a6f38cd2710fc513ee3 |
06-Sep-2013 |
Ian Rogers <irogers@google.com> |
Refactor CompilerDriver::Compute..FieldInfo Don't use non-const reference arguments. Move ins before outs. Change-Id: I7b251156388d8f07513b3da62ebfd29e5fd9ff76
call_arm.cc
|
11b63d13f0a3be0f74390b66b58614a37f9aa6c1 |
27-Aug-2013 |
buzbee <buzbee@google.com> |
Quick compiler: division by literal fix The constant propagation optimization pass attempts to identify constants in Dalvik virtual registers and handle them more efficiently. The use of small constants in division, though, was handled incorrectly in that the high level code correctly detected the use of a constant, but the actual code generation routine was only expecting the use of a special constant form opcode. See b/10503566. Change-Id: I88aa4d2eafebb2b1af1a1e88049f1845aefae261
codegen_arm.h
int_arm.cc
|
f6c4b3ba3825de1dbb3e747a68b809c6cc8eb4db |
25-Aug-2013 |
Mathieu Chartier <mathieuc@google.com> |
New arena memory allocator. Before we were creating arenas for each method. The issue with doing this is that we needed to memset each memory allocation. This can be improved if you start out with arenas that contain all zeroed memory and recycle them for each method. When you give memory back to the arena pool you do a single memset to zero out all of the memory that you used. Always inlined the fast path of the allocation code. Removed the "zero" parameter since the new arena allocator always returns zeroed memory. Host dex2oat time on target oat apks (2 samples each). Before: real 1m11.958s user 4m34.020s sys 1m28.570s After: real 1m9.690s user 4m17.670s sys 1m23.960s Target device dex2oat samples (Mako, Thinkfree.apk): Without new arena allocator: 0m26.47s real 0m54.60s user 0m25.85s system 0m25.91s real 0m54.39s user 0m26.69s system 0m26.61s real 0m53.77s user 0m27.35s system 0m26.33s real 0m54.90s user 0m25.30s system 0m26.34s real 0m53.94s user 0m27.23s system With new arena allocator: 0m25.02s real 0m54.46s user 0m19.94s system 0m25.17s real 0m55.06s user 0m20.72s system 0m24.85s real 0m55.14s user 0m19.30s system 0m24.59s real 0m54.02s user 0m20.07s system 0m25.06s real 0m55.00s user 0m20.42s system Correctness of Thinkfree.apk.oat verified by diffing both of the oat files. Change-Id: I5ff7b85ffe86c57d3434294ca7a621a695bf57a9
call_arm.cc
target_arm.cc
|
468532ea115657709bc32ee498e701a4c71762d4 |
05-Aug-2013 |
Ian Rogers <irogers@google.com> |
Entry point clean up. Create the set of entry points needed for image methods to avoid fix-up at load time:
  - interpreter: bridge to interpreter, bridge to compiled code
  - jni: dlsym lookup
  - quick: resolution and bridge to interpreter
  - portable: resolution and bridge to interpreter
Fix JNI work around to use the JNI work around argument rewriting code that'd been accidentally disabled. Remove abstract method error stub, use interpreter bridge instead. Consolidate trampoline (previously stub) generation in a generic helper. Simplify trampolines to jump directly into assembly code, keeping the stack crawlable. Dex: replace use of int with ThreadOffset for values that are thread offsets. Tidy entry point routines between interpreter, jni, quick and portable. Change-Id: I52a7c2bbb1b7e0ff8a3c3100b774212309d0828e (cherry picked from commit 848871b4d8481229c32e0d048a9856e5a9a17ef9)
all_arm.cc
odegen_arm.h
nt_arm.cc
arget_arm.cc
tility_arm.cc
|
7655f29fabc0a12765de828914a18314382e5a35 |
29-Jul-2013 |
Ian Rogers <irogers@google.com> |
Portable refactorings. Separate quick from portable entrypoints. Move architectural dependencies into arch. Change-Id: I9adbc0a9782e2959fdc3308215f01e3107632b7c
all_arm.cc
p_arm.cc
nt_arm.cc
|
166db04e259ca51838c311891598664deeed85ad |
26-Jul-2013 |
Ian Rogers <irogers@google.com> |
Move assembler out of runtime into compiler/utils. Other directory layout bits of clean up. There is still work to separate quick and portable in some files (e.g. argument visitor, proxy..). Change-Id: If8fecffda8ba5c4c47a035f0c622c538c6b58351
all_arm.cc
nt_arm.cc
|
7934ac288acfb2552bb0b06ec1f61e5820d924a4 |
26-Jul-2013 |
Brian Carlstrom <bdc@google.com> |
Fix cpplint whitespace/comments issues Change-Id: Iae286862c85fb8fd8901eae1204cd6d271d69496
rm_lir.h
all_arm.cc
p_arm.cc
nt_arm.cc
arget_arm.cc
tility_arm.cc
|
4274889d48ef82369bf2c1ca70d84689b4f9e93a |
19-Jul-2013 |
Brian Carlstrom <bdc@google.com> |
Fixing cpplint readability/check issues Change-Id: Ia81db7238b4a13ff2e585aaac9d5e3e91df1e3e0
nt_arm.cc
|
6f485c62b9cfce3ab71020c646ab9f48d9d29d6d |
19-Jul-2013 |
Brian Carlstrom <bdc@google.com> |
Fix cpplint whitespace/indent issues Change-Id: I7c1647f0c39e1e065ca5820f9b79998691ba40b1
all_arm.cc
tility_arm.cc
|
9b7085a4e7c40e7fa01932ea1647a4a33ac1c585 |
19-Jul-2013 |
Brian Carlstrom <bdc@google.com> |
Fix cpplint readability/braces issues Change-Id: I56b88956510077b0e13aad4caee8898313fab55b
tility_arm.cc
|
38f85e4892f6504971bde994fec81fd61780ac30 |
18-Jul-2013 |
Brian Carlstrom <bdc@google.com> |
Fix cpplint whitespace/operators issues Change-Id: I730bd87b476bfa36e93b42e816ef358006b69ba5
arget_arm.cc
tility_arm.cc
|
df62950e7a32031b82360c407d46a37b94188fbb |
18-Jul-2013 |
Brian Carlstrom <bdc@google.com> |
Fix cpplint whitespace/parens issues Change-Id: Ifc678d59a8bed24ffddde5a0e543620b17b0aba9
p_arm.cc
nt_arm.cc
tility_arm.cc
|
0cd7ec2dcd8d7ba30bf3ca420b40dac52849876c |
18-Jul-2013 |
Brian Carlstrom <bdc@google.com> |
Fix cpplint whitespace/blank_line issues Change-Id: Ice937e95e23dd622c17054551d4ae4cebd0ef8a2
ssemble_arm.cc
|
b1eba213afaf7fa6445de863ddc9680ab99762ea |
18-Jul-2013 |
Brian Carlstrom <bdc@google.com> |
Fix cpplint whitespace/comma issues Change-Id: I456fc8d80371d6dfc07e6d109b7f478c25602b65
rm_lir.h
ssemble_arm.cc
nt_arm.cc
arget_arm.cc
|
2ce745c06271d5223d57dbf08117b20d5b60694a |
18-Jul-2013 |
Brian Carlstrom <bdc@google.com> |
Fix cpplint whitespace/braces issues Change-Id: Ide80939faf8e8690d8842dde8133902ac725ed1a
ssemble_arm.cc
all_arm.cc
p_arm.cc
nt_arm.cc
arget_arm.cc
tility_arm.cc
|
fc0e3219edc9a5bf81b166e82fd5db2796eb6a0d |
17-Jul-2013 |
Brian Carlstrom <bdc@google.com> |
Fix multiple inclusion guards to match new pathnames Change-Id: Id7735be1d75bc315733b1773fba45c1deb8ace43
rm_lir.h
odegen_arm.h
|
7940e44f4517de5e2634a7e07d58d0fb26160513 |
12-Jul-2013 |
Brian Carlstrom <bdc@google.com> |
Create separate Android.mk for main build targets The runtime, compiler, dex2oat, and oatdump are now in separate trees to prevent dependency creep. They can now be individually built without rebuilding the rest of the art projects. dalvikvm and jdwpspy were already this way. Builds in the art directory should behave as before, building everything including tests. Change-Id: Ic6b1151e5ed0f823c3dd301afd2b13eb2d8feb81
rm_lir.h
ssemble_arm.cc
all_arm.cc
odegen_arm.h
p_arm.cc
nt_arm.cc
arget_arm.cc
tility_arm.cc
|