a1935c4fa255b5c20f5e9b2abce6be2d0f7cb0a8 |
|
26-Jun-2015 |
Roland Levillain <rpl@google.com> |
MIPS: Initial version of optimizing compiler for MIPS64R6. (cherry picked from commit 4dda3376b71209fae07f5c3c8ac3eb4b54207aa8) (amended for mnc-dev) Bug: 21555893 Change-Id: I874dc356eee6ab061a32f8f3df5f8ac3a4ab7dcf Signed-off-by: Alexey Frunze <Alexey.Frunze@imgtec.com> Signed-off-by: Douglas Leung <douglas.leung@imgtec.com>
|
dd3c7d2d6124ceb346b4ed9aa7115f75fc6d3f9f |
|
18-Jun-2015 |
David Brazdil <dbrazdil@google.com> |
ART: Remove old DCHECK that trips Baseline Codegen verified that the entry block always falls through to the next block. While this is the case with Optimizing, it doesn't hold for Baseline but it doesn't need to since codegen handles it fine. Bug:21913514 Change-Id: I751ef227e6cf103af3e7fc35fca4b01c663385a1 (cherry picked from commit 015c7e63604c038e866d7af3850c557403cddc8b)
|
3d21bdf8894e780d349c481e5c9e29fe1556051c |
|
22-Apr-2015 |
Mathieu Chartier <mathieuc@google.com> |
Move mirror::ArtMethod to native Optimizing + quick tests are passing, devices boot. TODO: Test and fix bugs in mips64. Saves 16 bytes per most ArtMethod, 7.5MB reduction in system PSS. Some of the savings are from removal of virtual methods and direct methods object arrays. Bug: 19264997 (cherry picked from commit e401d146407d61eeb99f8d6176b2ac13c4df1e33) Change-Id: I622469a0cfa0e7082a2119f3d6a9491eb61e3f3d Fix some ArtMethod related bugs Added root visiting for runtime methods, not currently required since the GcRoots in these methods are null. Added missing GetInterfaceMethodIfProxy in GetMethodLine, fixes --trace run-tests 005, 044. Fixed optimizing compiler bug where we used a normal stack location instead of double on ARM64, this fixes the debuggable tests. TODO: Fix JDWP tests. Bug: 19264997 Change-Id: I7c55f69c61d1b45351fd0dc7185ffe5efad82bd3 ART: Fix casts for 64-bit pointers on 32-bit compiler. Bug: 19264997 Change-Id: Ief45cdd4bae5a43fc8bfdfa7cf744e2c57529457 Fix JDWP tests after ArtMethod change Fixes Throwable::GetStackDepth for exception event detection after internal stack trace representation change. Adds missing ArtMethod::GetInterfaceMethodIfProxy call in case of proxy method. Bug: 19264997 Change-Id: I363e293796848c3ec491c963813f62d868da44d2 Fix accidental IMT and root marking regression Was always using the conflict trampoline. Also included fix for regression in GC time caused by extra roots. Most of the regression was IMT. Fixed bug in DumpGcPerformanceInfo where we would get SIGABRT due to detached thread. EvaluateAndApplyChanges: From ~2500 -> ~1980 GC time: 8.2s -> 7.2s due to 1s less of MarkConcurrentRoots Bug: 19264997 Change-Id: I4333e80a8268c2ed1284f87f25b9f113d4f2c7e0 Fix bogus image test assert Previously we were comparing the size of the non moving space to size of the image file. Now we properly compare the size of the image space against the size of the image file. Bug: 19264997 Change-Id: I7359f1f73ae3df60c5147245935a24431c04808a [MIPS64] Fix art_quick_invoke_stub argument offsets. ArtMethod reference's size got bigger, so we need to move other args and leave enough space for ArtMethod* and 'this' pointer. This fixes mips64 boot. Bug: 19264997 Change-Id: I47198d5f39a4caab30b3b77479d5eedaad5006ab
|
0a23d74dc2751440822960eab218be4cb8843647 |
|
07-May-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Add a parent environment to HEnvironment. This code has no functionality change. It adds a placeholder for chaining inlined frames. Change-Id: I5ec57335af76ee406052345b947aad98a6a4423a
|
3e3d73349a2de81d14e2279f60ffbd9ab3f3ac28 |
|
28-Apr-2015 |
Roland Levillain <rpl@google.com> |
Have HInvoke instructions know their number of actual arguments. Add an art::HInvoke::GetNumberOfArguments routine so that art::HInvoke and its subclasses can return the number of actual arguments of the called method. Use it in code generators and intrinsics handlers. Consequently, no longer remove a clinit check as last input of a static invoke if it is still present during baseline code generation, but ensure that static invokes have no such check as last input in optimized compilations. Change-Id: Iaf9e07d1057a3b15b83d9638538c02b70211e476
|
4f46ac5179967dda5966f2dcecf2cf08977951ef |
|
23-Apr-2015 |
Calin Juravle <calin@google.com> |
Cleanup and improve stack map stream - transform AddStackMapEntry into BeginStackMapEntry/EndStackMapEntry. This allows for nicer code and less assumptions when searching for equal dex register maps. - store the components sizes and their start positions as fields to avoid re-computation. - store the current stack map entry as a field to avoid the copy semantic when updating its value in the stack maps array. - remove redundant methods and fix visibility for the remaining ones. Change-Id: Ica2d2969d7e15993bdbf8bc41d9df083cddafd24
|
641547a5f18ca2ea54469cceadcfef64f132e5e0 |
|
21-Apr-2015 |
Calin Juravle <calin@google.com> |
[optimizing] Fix a bug in moving the null check to the user. When taking the decision to move a null check to the user we did not verify if the next instruction checks the same object. Change-Id: I2f4533a4bb18aa4b0b6d5e419f37dcccd60354d2
|
88c13cddc3a4184908662b0f3de796565d348c76 |
|
14-Apr-2015 |
Alexandre Rames <alexandre.rames@arm.com> |
Opt compiler: Correctly require register or FPU register. Also add a check that location summary are correctly typed with the HInstruction. Change-Id: I699762ff4e8f4e321c7db01ea005236ea1934af9
|
9021825d1e73998b99c81e89c73796f6f2845471 |
|
15-Apr-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Type MoveOperands. The ParallelMoveResolver implementation needs to know if a move is for 64bits or not, to handle swaps correctly. Bug found, and test case courtesy of Serguei I. Katkov. Change-Id: I9a0917a1cfed398c07e57ad6251aea8c9b0b8506
|
c6b4dd8980350aaf250f0185f73e9c42ec17cd57 |
|
07-Apr-2015 |
David Srbecky <dsrbecky@google.com> |
Implement CFI for Optimizing. CFI is necessary for stack unwinding in gdb, lldb, and libunwind. Change-Id: I1a3480e3a4a99f48bf7e6e63c4e83a80cfee40a2
|
65b798ea10dd716c1bb3dda029f9bf255435af72 |
|
06-Apr-2015 |
Andreas Gampe <agampe@google.com> |
ART: Enable more Clang warnings Change-Id: Ie6aba02f4223b1de02530e1515c63505f37e184c
|
fb8d279bc011b31d0765dc7ca59afea324fd0d0c |
|
01-Apr-2015 |
Mark Mendell <mark.p.mendell@intel.com> |
[optimizing] Implement x86/x86_64 math intrinsics Implement floor/ceil/round/RoundFloat on x86 and x86_64. Implement RoundDouble on x86_64. Add support for roundss and roundsd on both architectures. Support them in the disassembler as well. Add the instruction set features for x86, as the 'round' instruction is only supported if SSE4.1 is supported. Fix the tests to handle the addition of passing the instruction set features to x86 and x86_64. Add assembler tests for roundsd and roundss to x86_64 assembler tests. Change-Id: I9742d5930befb0bbc23f3d6c83ce0183ed9fe04f Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>
|
46e2a3915aa68c77426b71e95b9f3658250646b7 |
|
16-Mar-2015 |
David Brazdil <dbrazdil@google.com> |
ART: Boolean simplifier The optimization recognizes the negation pattern generated by 'javac' and replaces it with a single condition. To this end, boolean values are now consistently assumed to be represented by an integer. This is a first optimization which deletes blocks from the HGraph and does so by replacing the corresponding entries with null. Hence, existing code can continue indexing the list of blocks with the block ID, but must check for null when iterating over the list. Change-Id: I7779da69cfa925c6521938ad0bcc11bc52335583
|
da4d79bc9a4aeb9da7c6259ce4c9c1c3bf545eb8 |
|
24-Mar-2015 |
Roland Levillain <rpl@google.com> |
Unify ART's various implementations of bit_cast. ART had several implementations of art::bit_cast: 1. one in runtime/base/casts.h, declared as: template <class Dest, class Source> inline Dest bit_cast(const Source& source); 2. another one in runtime/utils.h, declared as: template<typename U, typename V> static inline V bit_cast(U in); 3. and a third local version, in runtime/memory_region.h, similar to the previous one: template<typename Source, typename Destination> static Destination MemoryRegion::local_bit_cast(Source in); This CL removes versions 2. and 3. and changes their callers to use 1. instead. That version was chosen over the others as: - it was the oldest one in the code base; and - its syntax was closer to the standard C++ cast operators, as it supports the following use: bit_cast<Destination>(source) since `Source' can be deduced from `source'. Change-Id: I7334fd5d55bf0b8a0c52cb33cfbae6894ff83633
|
eeefa1276e83776f08704a3db4237423b0627e20 |
|
13-Mar-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Update locations of registers after slow paths spilling. Change-Id: Id9aafcc13c1a085c17ce65d704c67b73f9de695d
|
fead4e4f397455aa31905b2982d4d861126ab89d |
|
13-Mar-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
[optimizing] Don't record None locations in the stack maps. - moved environment recording from code generator to stack map stream - added creation/loading factory methods for the DexRegisterMap (hides internal details) - added new tests Change-Id: Ic8b6d044f0d8255c6759c19a41df332ef37876fe
|
a8ac9130b872c080299afacf5dcaab513d13ea87 |
|
13-Mar-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Refactor code in preparation of correct stack maps in slow path. Move the logic of saving/restoring live registers in slow path in the SlowPathCode method. Also add a RecordPcInfo helper to SlowPathCode, that will act as the placeholder of saving correct stack maps. Change-Id: I25c2bc7a642ef854bbc8a3eb570e5c8c8d2d030c
|
a4d120c88e79eece333e66eec64c4e909d770e3e |
|
13-Mar-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Fix build breakage. Change-Id: I86959eca5d8f5458ff75c78776b0af9db9c26800
|
915b9d0c13bb5091875d868fbfa551d7b65d7477 |
|
11-Mar-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Tweak liveness when instructions are used in environments. Instructions remain live when debuggable, but only instructions with object types remain live when non-debuggable. Enable StackVisitor::GetThisObject for optimizing. Change-Id: Id87b2cbf33a02450059acc9993995782e5f28987
|
a2d8ec6876325e89e5d82f5dbeca59f96ced3ec1 |
|
12-Mar-2015 |
Roland Levillain <rpl@google.com> |
Compress the Dex register maps built by the optimizing compiler. - Replace the current list-based (fixed-size) Dex register encoding in stack maps emitted by the optimizing compiler with another list-based variable-size Dex register encoding compressing short locations on 1 byte (3 bits for the location kind, 5 bits for the value); other (large) values remain encoded on 5 bytes. - In addition, use slot offsets instead of byte offsets to encode the location of Dex registers placed in stack slots at small offsets, as it enables more values to use the short (1-byte wide) encoding instead of the large (5-byte wide) one. - Rename art::DexRegisterMap::LocationKind as art::DexRegisterLocation::Kind, turn it into a strongly-typed enum based on a uint8_t, and extend it to support new kinds (kInStackLargeOffset and kConstantLargeValue). - Move art::DexRegisterEntry from compiler/optimizing/stack_map_stream.h to runtime/stack_map.h and rename it as art::DexRegisterLocation. - Adjust art::StackMapStream, art::CodeGenerator::RecordPcInfo, art::CheckReferenceMapVisitor::CheckOptimizedMethod, art::StackVisitor::GetVRegFromOptimizedCode, and art::StackVisitor::SetVRegFromOptimizedCode. - Implement unaligned memory accesses in art::MemoryRegion. - Use them to manipulate data in Dex register maps. - Adjust oatdump to support the new Dex register encoding. - Update compiler/optimizing/stack_map_test.cc. Change-Id: Icefaa2e2b36b3c80bb1b882fe7ea2f77ba85c505
|
5f8741860d465410bfed495dbb5f794590d338da |
|
04-Mar-2015 |
Mark Mendell <mark.p.mendell@intel.com> |
[optimizing] Use callee-save registers for x86 Add ESI, EDI, EBP to available registers for non-baseline mode. Ensure that they aren't used when byte addressible registers are needed. Change-Id: Ie7130d4084c2ae9cfcd1e47c26eb3e5dcac1ebd6 Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>
|
579885a26d761f5ba9550f2a1cd7f0f598c2e1e3 |
|
22-Feb-2015 |
Serban Constantinescu <serban.constantinescu@arm.com> |
Opt Compiler: ARM64: Enable explicit memory barriers over acquire/release Implement remaining explicit memory barrier code paths and temporarily enable the use of explicit memory barriers for testing. This CL also enables the use of instruction set features in the ARM64 backend. kUseAcquireRelease has been replaced with PreferAcquireRelease(), which for now is statically set to false (prefer explicit memory barriers). Please note that we still prefer acquire-release for the ARM64 Optimizing Compiler, but we would like to exercise the explicit memory barrier code path too. Change-Id: I84e047ecd43b6fbefc5b82cf532e3f5c59076458 Signed-off-by: Serban Constantinescu <serban.constantinescu@arm.com>
|
d6138ef1ea13d07ae555542f8898b30d89e9ac9a |
|
18-Feb-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Ensure the graph is correctly typed. We used to be forgiving because of HIntConstant(0) also being used for null. We now create a special HNullConstant for such uses. Also, we need to run the dead phi elimination twice during ssa building to ensure the correctness. Change-Id: If479efa3680d3358800aebb1cca692fa2d94f6e5
|
442b46a087c389a91a0b51547ac9205058432364 |
|
18-Feb-2015 |
Roland Levillain <rpl@google.com> |
Display optimizing compiler's CodeInfo objects in oatdump. A few elements are not displayed yet (stack mask, inline info) though. Change-Id: I5e51a801c580169abc5d1ef43ad581aadc110754
|
dc23d8318db08cb42e20f1d16dbc416798951a8b |
|
16-Feb-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Avoid generating jmp +0. When a block branches to a non-following block, but blocks in-between do branch to it, we can avoid doing the branch. Change-Id: I9b343f662a4efc718cd4b58168f93162a24e1219
|
c0572a451944f78397619dec34a38c36c11e9d2a |
|
06-Feb-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Optimize leaf methods. Avoid suspend checks and stack changes when not needed. Change-Id: I0fdb31e8c631e99091b818874a558c9aa04b1628
|
829280cc90b7a84db42864589b4bafb4c94a79d9 |
|
28-Jan-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Finally implement Location::kNoOutputOverlap. The [i, i + 1) interval scheme we chose for representing lifetime positions is not optimal for doing this optimization. It however doesn't prevent recognizing a non-split interval during the TryAllocateFreeReg phase, and try to re-use its inputs' registers. Change-Id: I80a2823b0048d3310becfc5f5fb7b1230dfd8201
|
4c204bafbc8d596894f8cb8ec696f5be1c6f12d8 |
|
03-Feb-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Use a different block order when not compiling baseline. Use the linearized order instead, as it puts blocks logically next to each other in a better way. Also, it does not contain dead blocks. Change-Id: Ie65b56041a093c8155e6c1e06351cb36a4053505
|
4dee636d21d9ce54386cdfbb824e5eb2a9c1af0d |
|
23-Jan-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Support callee-save registers on ARM. Change-Id: I7c519b7a828c9891b1141a8e51e12d6a8bc84118
|
d97dc40d186aec46bfd318b6a2026a98241d7e9c |
|
22-Jan-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Support callee save floating point registers on x64. - Share the computation of core_spill_mask and fpu_spill_mask between backends. - Remove explicit stack overflow check support: we need to adjust them and since they are not tested, they will easily bitrot. Change-Id: I0b619b8de4e1bdb169ea1ae7c6ede8df0d65837a
|
988939683c26c0b1c8808fc206add6337319509a |
|
21-Jan-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Enable core callee-save on x64. Will work on other architectures and FP support in other CLs. Change-Id: I8cef0343eedc7202d206f5217fdf0349035f0e4d
|
0ada95d8de4b04b5f201b4b7e9c3c2fd2cc321ae |
|
04-Dec-2014 |
Jean Christophe Beyler <jean.christophe.beyler@intel.com> |
ART: Replace NULL to nullptr in the optimizing compiler Replace macro NULL to the nullptr variation for C++. Change-Id: Ib6e48dd4bb3c254343383011b67372622578ca76 Signed-off-by: Jean Christophe Beyler <jean.christophe.beyler@intel.com>
|
6c2dff8ff8e1440fa4d9e1b2ba2a44d036882801 |
|
21-Jan-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Revert "Revert "Fully support pairs in the register allocator."" This reverts commit c399fdc442db82dfda66e6c25518872ab0f1d24f. Change-Id: I19f8215c4b98f2f0827e04bf7806c3ca439794e5
|
77520bca97ec44e3758510cebd0f20e3bb4584ea |
|
12-Jan-2015 |
Calin Juravle <calin@google.com> |
Record implicit null checks at the actual invoke time. ImplicitNullChecks are recorded only for instructions directly (see NB below) preceeded by NullChecks in the graph. This way we avoid recording redundant safepoints and minimize the code size increase. NB: ParallalelMoves might be inserted by the register allocator between the NullChecks and their uses. These modify the environment and the correct action would be to reverse their modification. This will be addressed in a follow-up CL. Change-Id: Ie50006e5a4bd22932dcf11348f5a655d253cd898
|
c399fdc442db82dfda66e6c25518872ab0f1d24f |
|
21-Jan-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Revert "Fully support pairs in the register allocator." Libcore tests fail. This reverts commit 41aedbb684ccef76ff8373f39aba606ce4cb3194. Change-Id: I2572f120d4bbaeb7a4d4cbfd47ab00c9ea39ac6c
|
41aedbb684ccef76ff8373f39aba606ce4cb3194 |
|
14-Jan-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Fully support pairs in the register allocator. Enabled on ARM for longs and doubles. Change-Id: Id8792d08bd7ca9fb049c5db8a40ae694bafc2d8b
|
cd6dffedf1bd8e6dfb3fb0c933551f9a90f7de3f |
|
08-Jan-2015 |
Calin Juravle <calin@google.com> |
Add implicit null checks for the optimizing compiler - for backends: arm, arm64, x86, x86_64 - fixed parameter passing for CodeGenerator - 003-omnibus-opcodes test verifies that NullPointerExceptions work as expected Change-Id: I1b302acd353342504716c9169a80706cf3aba2c8
|
42d1f5f006c8bdbcbf855c53036cd50f9c69753e |
|
16-Jan-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Do not use register pair in a parallel move. The ParallelMoveResolver does not work with pairs. Instead, decompose the pair into two individual moves. Change-Id: Ie9d3f0b078cef8dc20640c98b20bb20cc4971a7f
|
f85a9ca9859ad843dc03d3a2b600afbaf2e9bbdd |
|
13-Jan-2015 |
Mark Mendell <mark.p.mendell@intel.com> |
[optimizing compiler] Compute live spill size The current stack frame calculation assumes that each live register to be saved/restored has the word size of the machine. This fails for X86, where a double in an XMM register takes up 8 bytes. Change the calculation to keep track of the number of core registers and number of fp registers to handle this distinction. This is slightly pessimal, as the registers may not be active at the same time, but the only way to handle this would be to allocate both classes of registers simultaneously, or remember all the active intervals, matching them up and compute the size of each safepoint interval. Change-Id: If7860aa319b625c214775347728cdf49a56946eb Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>
|
12df9ebf72255544b0147c81b1dca6644a29764e |
|
09-Jan-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Move code around in OptimizingCompiler::Compile to reduce stack space. Also fix an (intentional) memory leak, by allocating the CodeGenerator on the heap instead of the arena: they construct an Assembler object that requires destruction. BUG:18787334 Change-Id: I8cf0667cb70ce5b14d4ac334bd4487a562635f1b
|
840e5461a85f8908f51e7f6cd562a9129ff0e7ce |
|
07-Jan-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Implement double and float support for arm in register allocator. The basic approach is: - An instruction that needs two registers gets two intervals. - When allocating the low part, we also allocate the high part. - When splitting a low (or high) interval, we also split the high (or low) equivalent. - Allocation follows the (S/D register) requirement that low registers are always even and the high equivalent is low + 1. Change-Id: I06a5148e05a2ffc7e7555d08e871ed007b4c2797
|
3416601a9e9be81bb7494864287fd3602d18ef13 |
|
19-Dec-2014 |
Calin Juravle <calin@google.com> |
Look at instruction set features when generating volatiles code Change-Id: Ia882405719fdd60b63e4102af7e085f7cbe0bb2a
|
e21dc3db191df04c100620965bee4617b3b24397 |
|
09-Dec-2014 |
Andreas Gampe <agampe@google.com> |
ART: Swap-space in the compiler Introduce a swap-space and corresponding allocator to transparently switch native allocations to memory backed by a file. Bug: 18596910 (cherry picked from commit 62746d8d9c4400e4764f162b22bfb1a32be287a9) Change-Id: I131448f3907115054a592af73db86d2b9257ea33
|
5b4b898ed8725242ee6b7229b94467c3ea3054c8 |
|
18-Dec-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Revert "Don't block quick callee saved registers for optimizing." X64 has one libcore test failing, and codegen_test on arm is failing. This reverts commit 6004796d6c630696127df2494dcd4f30d1367a34. Change-Id: I20e00431fa18e11ce4c0cb6fffa91977fa8e9b4f
|
6004796d6c630696127df2494dcd4f30d1367a34 |
|
15-Dec-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Don't block quick callee saved registers for optimizing. This change builds on: https://android-review.googlesource.com/#/c/118983/ - Also fix x86_64 assembler bug triggered by this change. - Fix (and improve) x86's backend byte register usage. - Fix a bug in baseline register allocator: a fixed out register must prevent inputs from allocating it. Change-Id: I4883862e29b4e4b6470f1823cf7eab7e7863d8ad
|
e53798a7e3267305f696bf658e418c92e63e0834 |
|
01-Dec-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Inlining support in optimizing. Currently only inlines simple things that don't require an environment, such as: - Returning a constant. - Returning a parameter. - Returning an arithmetic operation. Change-Id: Ie844950cb44f69e104774a3cf7a8dea66bc85661
|
d2ec87d84057174d4884ee16f652cbcfd31362e9 |
|
08-Dec-2014 |
Calin Juravle <calin@google.com> |
[optimizing compiler] Add REM_FLOAT and REM_DOUBLE - for arm, x86, x86_64 backends - reinstated fmod quick entry points for x86. This is a partial revert of bd3682eada753de52975ae2b4a712bd87dc139a6 which added inline assembly for floting point rem on x86. Note that Quick still uses the inline version. - fix rem tests for longs Change-Id: I73be19a9f2f2bcf3f718d9ca636e67bdd72b5440
|
624279f3c70f9904cbaf428078981b05d3b324c0 |
|
04-Dec-2014 |
Roland Levillain <rpl@google.com> |
Add support for float-to-long in the optimizing compiler. - Add support for the float-to-long Dex instruction in the optimizing compiler. - Add a Dex PC field to art::HTypeConversion to allow the x86 and ARM code generators to produce runtime calls. - Instruct art::CodeGenerator::RecordPcInfo not to record PC information for HTypeConversion instructions. - Add S0 to the list of ARM FPU parameter registers. - Have art::x86_64::X86_64Assembler::cvttss2si work with 64-bit operands. - Generate x86, x86-64 and ARM (but not ARM64) code for float to long HTypeConversion nodes. - Add related tests to test/422-type-conversion. Change-Id: I954214f0d537187883f83f7a83a1bb2dd8a21fd4
|
32f5b4d2c8c9b52e9522941c159577b21752d0fa |
|
25-Nov-2014 |
Serban Constantinescu <serban.constantinescu@arm.com> |
Vixl: Update the VIXL interface to VIXL 1.7 and enable VIXL debug. This patch updates the interface to VIXL 1.7 and enables the debug version of VIXL when ART is built in debug mode. Change-Id: I443fb941bec3cffefba7038f93bb972e6b7d8db5 Signed-off-by: Serban Constantinescu <serban.constantinescu@arm.com>
|
647b9ed41cdb7cf302fd356627a3ba372419b78c |
|
27-Nov-2014 |
Roland Levillain <rpl@google.com> |
Add support for long-to-double in the optimizing compiler. - Add support for the long-to-double Dex instruction in the optimizing compiler. - Enable requests of temporary FPU (double) registers during code generation. - Fix art::x86::X86Assembler::LoadLongConstant and extend it to int64_t values. - Have art::x86_64::X86_64Assembler::cvtsi2sd work with 64-bit operands. - Generate x86, x86-64 and ARM (but not ARM64) code for long to double HTypeConversion nodes. - Add related tests to test/422-type-conversion. Change-Id: Ie73d9e5e25bd2e15f585c371e8fc2dcb83438ccd
|
87d03761f35ad6cbe0bffbf1ec739875a471da6d |
|
19-Nov-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Fix safepoint bug when computing live registers. Change-Id: I8f28dd287c0e04223c49dea6a323058c1b210913
|
f97f9fbfdf7f2e23c662f21081fadee6af37809d |
|
11-Nov-2014 |
Calin Juravle <calin@google.com> |
[optimizing compiler] add HTemporary support for long and doubles Change-Id: I5247ecd71d0193050484b7632c804c9bfd20f924
|
f0e3937b87453234d0d7970b8712082062709b8d |
|
12-Nov-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Do a parallel move in BoundsCheckSlowPath. The two locations of the index and length could overlap, so we need a parallel move. Also factorize the code for doing a parallel move based on two locations. Change-Id: Iee8b3459e2eed6704d45e9a564fb2cd050741ea4
|
52839d17c06175e19ca4a093fb878450d1c4310d |
|
07-Nov-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Support invoke-interface in optimizing. Change-Id: Ic18d7c3d2810557231caf0571956e0c431f5d384
|
f43083d560565aea46c602adb86423daeefe589d |
|
07-Nov-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Do not update Out after it has a valid location. Slow paths use LocationSummary to know where to move things around, and they are executed at the end of the code generation. This fix is needed for https://android-review.googlesource.com/#/c/113345/. Change-Id: Id336c6409479b1de6dc839b736a7234d08a7774a
|
de58ab2c03ff8112b07ab827c8fa38f670dfc656 |
|
05-Nov-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Implement try/catch/throw in optimizing. - We currently don't run optimizations in the presence of a try/catch. - We therefore implement Quick's mapping table. - Also fix a missing null check on array-length. Change-Id: I6917dfcb868e75c1cf6eff32b7cbb60b6cfbd68f
|
19a19cffd197a28ae4c9c3e59eff6352fd392241 |
|
22-Oct-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Add support for static fields in optimizing compiler. Change-Id: Id2f010589e2bd6faf42c05bb33abf6816ebe9fa9
|
3c03503d66df3b4440f851ae7d0c4fae5e7872df |
|
28-Oct-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Follow-up CL after hard float changes. Addressing comments from Zheng Xu. Change-Id: I8c599cdfab03373e82a1b90b711005c490bc6ca0
|
1ba0f596e9e4ddd778ab431237d11baa85594eba |
|
27-Oct-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Support hard float on arm in optimizing compiler. Also bump oat version, needed after latest hard float switch. Change-Id: Idf5acfb36c07e74acff00edab998419a3c6b2965
|
5319defdf502fc4569316473846b83180ec08035 |
|
23-Oct-2014 |
Alexandre Rames <alexandre.rames@arm.com> |
ART: optimizing compiler: initial support for ARM64. The ARM64 port uses VIXL for code generation, to which it defers work like label binding and branch resolving, register type coherency checking, and immediate values handling. Change-Id: I0a44508c0c991f472a63e67b3469cdd878fe1a68 Signed-off-by: Serban Constantinescu <serban.constantinescu@arm.com> Signed-off-by: Alexandre Rames <alexandre.rames@arm.com>
|
102cbed1e52b7c5f09458b44903fe97bb3e14d5f |
|
15-Oct-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Implement register allocator for floating point registers. Also: - Fix misuses of emitting the rex prefix in the x86_64 assembler. - Fix movaps code generation in the x86_64 assembler. Change-Id: Ib6dcf6e7c4a9c43368cfc46b02ba50f69ae69cbe
|
92a73aef279be78e3c2b04db1713076183933436 |
|
16-Oct-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Don't use assembler classes in code_generator.h. The arm64 backend uses its own assembler and does not share the same classes as the other backends. To avoid conflicts or unnecessary mappings, just don't use those classes in the shared part of the code generator. Change-Id: I9e5fa40c1021d2e83a4ef14c52cd1ccd03f2f73d
|
71175b7f19a4f6cf9cc264feafd820dbafa371fb |
|
09-Oct-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Cleanup baseline register allocator. - Use three arrays for blocking regsters instead of one and computing offsets in that array.] - Don't pass blocked_registers_ to methods, just use the field. Change-Id: Ib698564c31127c59b5a64c80f4262394b8394dc6
|
56b9ee6fe1d6880c5fca0e7feb28b25a1ded2e2f |
|
09-Oct-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Stop converting from Location to ManagedRegister. Now the source of truth is the Location object that knows which register (core, pair, fpu) it needs to refer to. Change-Id: I62401343d7479ecfb24b5ed161ec7829cda5a0b1
|
7fb49da8ec62e8a10ed9419ade9f32c6b1174687 |
|
06-Oct-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Add support for floats and doubles. - Follows Quick conventions. - Currently only works with baseline register allocator. Change-Id: Ie4b8e298f4f5e1cd82364da83e4344d4fc3621a3
|
3c04974a90b0e03f4b509010bff49f0b2a3da57f |
|
24-Sep-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Optimize suspend checks in optimizing compiler. - Remove the ones added during graph build (they were added for the baseline code generator). - Emit them at loop back edges after phi moves, so that the test can directly jump to the loop header. - Fix x86 and x86_64 suspend check by using cmpw instead of cmpl. Change-Id: I6fad5795a55705d86c9e1cb85bf5d63dadfafa2a
|
3bca0df855f0e575c6ee020ed016999fc8f14122 |
|
19-Sep-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Support for saving and restoring live registers in a slow path. And use it in suspend check slow paths. Change-Id: I79caf28f334c145a36180c79a6e2fceae3990c31
|
8a16d97fb8f031822b206e65f9109a071da40563 |
|
11-Sep-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Fix valgrind errors. For now just stack allocate the code generator. Will think about cleaning up the root problem later (CodeGenerator being an arena object). Change-Id: I161a6f61c5f27ea88851b446f3c1e12ee9c594d7
|
3946844c34ad965515f677084b07d663d70ad1b8 |
|
02-Sep-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Runtime support for the new stack maps for the opt compiler. Now most of the methods supported by the compiler can be optimized, instead of using the baseline. Change-Id: I80ab36a34913fa4e7dd576c7bf55af63594dc1fa
|
e3ea83811d47152c00abea24a9b420651a33b496 |
|
08-Aug-2014 |
Yevgeny Rouban <yevgeny.y.rouban@intel.com> |
ART source line debug info in OAT files OAT files have source line information enough for ART runtime needs like jump to/from interpreter and thread suspension. But this information is not enough for finer grained source level debugging and low-level profiling (VTune or perf). This patch adds to OAT files two additional sections: .debug_line - DWARF formatted Elf32 section with detailed source line information (mapping from native PC to Java source lines). In addition to the debugging symbols added using the dex2oat option --include-debug-symbols, the source line information is added to the section .debug_line. The source line info can be read by many Elf reading tools like objdump, readelf, dwarfdump, gdb, perf, VTune, ... gdb can use this debug line information in x86. In 64-bit mode the information can be used if the oat file is mapped in the lower address space (address has higher 32 bits zeroed). Relocation works. Testing: 1. art/test/run-test --host --gdb [--64] 001-HelloWorld 2. in gdb: break Main.java:19 3. in gdb: break Runtime.java:111 4. in gdb: run - stops at void java.lang.Runtime.<init>() 5. in gdb: backtrace - shows call stack down to main() 6. in gdb: continue - stops at void Main.main() (only in 32-bit mode) 7. in gdb: backtrace - shows call stack down to main() 8. objdump -W <oat-file> - addresses are from VMA range of .text section reported by objdump -h <file> 9. dwarfdump -ka <oat-file> - no errors expected Size of aosp-x86-eng boot.oat increased by 11% from 80.5Mb to 89.2Mb with two sections added .debug_line (7.2Mb) and .rel.debug (1.5Mb). Change-Id: Ib8828832686e49782a63d5529008ff4814ed9cda Signed-off-by: Yevgeny Rouban <yevgeny.y.rouban@intel.com>
|
73e80c3ae76fafdb53afe3a85306dcb491fb5b00 |
|
22-Jul-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Make unit test tell if a method is a leaf. The runtime is not initialized completely in gtests, so we cannot run code (such as explicit stack overflow checks) that look at tls values. Change-Id: I74a4449b01eb203f1b411dda700e9459878d0d55
|
f12feb8e0e857f2832545b3f28d31bad5a9d3903 |
|
17-Jul-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Stack overflow checks and NPE checks for optimizing. Change-Id: I59e97448bf29778769b79b51ee4ea43f43493d96
|
ab032bc1ff57831106fdac6a91a136293609401f |
|
15-Jul-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Fix a braino in the stack layout. Also do some refactoring to have this code be just in CodeGenerator. Change-Id: I88de109889138af8d60027973c12a64bee813cb7
|
e50383288a75244255d3ecedcc79ffe9caf774cb |
|
04-Jul-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Support fields in optimizing compiler. - Required support for temporaries, to be only used by baseline compiler. - Also fixed a few invalid assumptions around locations and instructions that don't need materialization. These instructions should not have an Out. Change-Id: Idc4a30dd95dd18015137300d36bec55fc024cf62
|
86dbb9a12119273039ce272b41c809fa548b37b6 |
|
04-Jun-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Final CL to enable register allocation on x86. This CL implements: 1) Resolution after allocation: connecting the locations allocated to an interval within a block and between blocks. 2) Handling of fixed registers: some instructions require inputs/output to be at a specific location, and the allocator needs to deal with them in a special way. 3) ParallelMoveResolver::EmitNativeCode for x86. Change-Id: I0da6bd7eb66877987148b87c3be6a983b4e3f858
|
9cf35523764d829ae0470dae2d5dd99be469c841 |
|
09-Jun-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Add x86_64 support to the optimizing compiler. Change-Id: I4462d9ae15be56c4a3dc1bd4d1c0c6548c1b94be
|
f635e63318447ca04731b265a86a573c9ed1737c |
|
14-May-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Add a compilation tracing mechanism to the new compiler. Code mostly imported from: https://android-review.googlesource.com/#/c/81653/. Change-Id: I150fe942be0fb270e03fabb19032180f7a065d13
|
804d09372cc3d80d537da1489da4a45e0e19aa5d |
|
02-May-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Build live-in, live-out and kill sets for each block. This information will be used when computing live ranges of instructions. Change-Id: I345ee833c1ccb4a8e725c7976453f6d58d350d74
|
f529d776ca9f48b115714f6c79677755ecc37d24 |
|
02-May-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Make all registers available when allocating an output register. On ARM we currently only have two register pairs available, so we need to use one already used for an input. Change-Id: I5411862310009a41e50ddab3549d3a9e9052266a
|
a7aca370a7d62ca04a1e24423d90e8020d6f1a58 |
|
28-Apr-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Setup policies for register allocation. Change-Id: I857e77530fca3e2fb872fc142a916af1b48400dc
|
c32e770f21540e4e9eda6dc7f770e745d33f1b9f |
|
24-Apr-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Add a Transform to SSA phase to the optimizing compiler. Change-Id: Ia9700756a0396d797a00b529896487d52c989329
|
b55f835d66a61e5da6fc1895ba5a0482868c9552 |
|
07-Apr-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Test control flow instruction with optimizing compiler. Add support for basic instructions to implement these tests. Change-Id: I3870bf9301599043b3511522bb49dc6364c9b4c0
|
f583e5976e1de9aa206fb8de4f91000180685066 |
|
07-Apr-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Add support for taking parameters in optimizing compiler. - Fix stack layout to mimic Quick's. - Implement some sub operations. Change-Id: I8cf75a4d29b662381a64f02c0bc61d859482fc4e
|
707c809f661554713edfacf338365adca8dfd3a3 |
|
04-Apr-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Use target-specific word instead of runtime word. Change-Id: Ia11dc3cc520a1a5c7bd017013e5699af9570ce91
|
4a34a428c6a2588e0857ef6baf88f1b73ce65958 |
|
03-Apr-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Support passing arguments to invoke-static* instructions. - Stop using the frame pointer for accessing locals. - Stop emulating a stack when doing code generation. Instead, rely on dex register model, where instructions only reference registers. Change-Id: Id51bd7d33ac430cb87a53c9f4b0c864eeb1006f9
|
6a58cb16d803c9a7b3a75ccac8be19dd9d4e520d |
|
02-Apr-2014 |
Dmitry Petrochenko <dmitry.petrochenko@intel.com> |
art: Handle x86_64 architecture equal to x86 This patch forces FE/ME to treat x86_64 as x86 exactly. The x86_64 logic will be revised later when assembly will be ready. Change-Id: I4a92477a6eeaa9a11fd710d35c602d8d6f88cbb6 Signed-off-by: Dmitry Petrochenko <dmitry.petrochenko@intel.com>
|
8ccc3f5d06fd217cdaabd37e743adab2031d3720 |
|
19-Mar-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Add support for invoke-static in optimizing compiler. Support is limited to calls without parameters and returning void. For simplicity, we currently follow the Quick ABI. Change-Id: I54805161141b7eac5959f1cae0dc138dd0b2e8a5
|
92cf83e001357329cbf41fa15a6e053fab6f4933 |
|
18-Mar-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Run Java tests with the optimizing compiler. Also fix a vector.reserve -> vector.resize braino, and build a GC map that dex2oat expects. Change-Id: I6acf2f90a4c32f90b79bf7709bf2e43931b98757
|
787c3076635cf117eb646c5a89a9014b2072fb44 |
|
17-Mar-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Plug new optimizing compiler in compilation pipeline. Also rename accessors to ART's conventions. Change-Id: I344807055b98aa4b27215704ec362191464acecc
|
bab4ed7057799a4fadc6283108ab56f389d117d4 |
|
11-Mar-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
More code generation for the optimizing compiler. - Add HReturn instruction - Generate code for locals/if/return - Setup infrastructure for register allocation. Currently emulate a stack. Change-Id: Ib28c2dba80f6c526177ed9a7b09c0689ac8122fb
|
d4dd255db1d110ceb5551f6d95ff31fb57420994 |
|
28-Feb-2014 |
Nicolas Geoffray <ngeoffray@google.com> |
Add codegen support to the optimizing compiler. Change-Id: I9aae76908ff1d6e64fb71a6718fc1426b67a5c28
|