History log of /art/compiler/optimizing/ssa_liveness_analysis.cc
Revision	Date	Author	Comments
2dd7b672ea0afd7ea4448b43d24829e9886de3af	07-Dec-2017	Aart Bik <ajcbik@google.com>	Fixed spilling bug (visible on ARM64): missed SIMD type. Test: test-art-host test-art-target Change-Id: I6f321446f54943e02f250732ec9da729f633c3a9
e764d2e50c544c2cb98ee61a15d613161ac6bd17	05-Oct-2017	Vladimir Marko <vmarko@google.com>	Use ScopedArenaAllocator for register allocation. Memory needed to compile the two most expensive methods for aosp_angler-userdebug boot image: BatteryStats.dumpCheckinLocked() : 25.1MiB -> 21.1MiB BatteryStats.dumpLocked(): 49.6MiB -> 42.0MiB This is because all the memory previously used by Scheduler is reused by the register allocator; the register allocator has a higher peak usage of the ArenaStack. And continue the "arena"->"allocator" renaming. Test: m test-art-host-gtest Test: testrunner.py --host Bug: 64312607 Change-Id: Idfd79a9901552b5147ec0bf591cb38120de86b01
ca6fff898afcb62491458ae8bcd428bfb3043da1	03-Oct-2017	Vladimir Marko <vmarko@google.com>	ART: Use ScopedArenaAllocator for pass-local data. Passes using local ArenaAllocator were hiding their memory usage from the allocation counting, making it difficult to track down where memory was used. Using ScopedArenaAllocator reveals the memory usage. This changes the HGraph constructor which requires a lot of changes in tests. Refactor these tests to limit the amount of work needed the next time we change that constructor. Test: m test-art-host-gtest Test: testrunner.py --host Test: Build with kArenaAllocatorCountAllocations = true. Bug: 64312607 Change-Id: I34939e4086b500d6e827ff3ef2211d1a421ac91a
0ebe0d83138bba1996e9c8007969b5381d972b32	21-Sep-2017	Vladimir Marko <vmarko@google.com>	ART: Introduce compiler data type. Replace most uses of the runtime's Primitive in compiler with a new class DataType. This prepares for introducing new types, such as Uint8, that the runtime does not need to know about. Test: m test-art-host-gtest Test: testrunner.py --host Bug: 23964345 Change-Id: Iec2ad82454eec678fffcd8279a9746b90feb9b0c
5e3afa950f05bca470ef6b92460940f37831c27f	20-Sep-2017	Aart Bik <ajcbik@google.com>	Ensure extract is seen as having scalar result. Rationale: Extracting from a vector yields a scalar, yet our parallel mover and one DCHECK did not account for that fact (note that moving towards a vector type system will prevent such errors). Regression test for this is part of the SAD CL. Test: test-art-host test-art-target Bug: 64091002 Change-Id: Id154edd1a069c54e7d8da069c368dea0a8f973f4
c9c310487b8730fce5edfa72e79c4188629898a3	29-Jun-2017	Nicolas Geoffray <ngeoffray@google.com>	Turn a few DCHECK into CHECKs. To help diagnose b/63070152. bug: 63070152 Test: test.py Change-Id: I1ac1cf9bfe1bc15ecfa94b5b8537cd3afda6fd14
82b0740f03b1a6acab4558214d3edc362e27e238	01-Mar-2017	Vladimir Marko <vmarko@google.com>	Use IntrusiveForwardList<> for Env-/UsePosition. Test: m test-art-host-gtest Test: testrunner.py --host Change-Id: I2b720e2ed8f96303cf80e9daa6d5278bf0c3da2f
f8f5a16ed7bad1e18179e38453e59c96a944de10	07-Feb-2017	Aart Bik <ajcbik@google.com>	ART vectorizer. Rationale: Make SIMD great again with a retargetable and easily extendable vectorizer. Provides a full x86/x86_64 and a proof-of-concept ARM implementation. Sample improvement (without any perf tuning yet) for Linpack on x86 is about 20% to 50%. Test: test-art-host, test-art-target (angler) Bug: 34083438, 30933338 Change-Id: Ifb77a0f25f690a87cd65bf3d5e9f6be7ea71d6c1
5576f3741c58cb8b5fb2f68f3b3a9415efe05f4f	24-Mar-2017	Aart Bik <ajcbik@google.com>	Implement a SIMD spilling slot. Rationale: The last ART vectorizer break-out CL \O/ This ensures spilling on x86 and x86_64 is correct. Also, it paves the way to wider SIMD on ARM and MIPS. Test: test-art-host Bug: 34083438 Change-Id: I5b27d18c2045f3ab70b64c335423b3ff2a507ac2
cc89525c13894247cb82a1973617da6cba286f0c	21-Mar-2017	Aart Bik <ajcbik@google.com>	Change 1/2 spill slots to more general number of spill slots. Rationale: This prepares requesting a different number of spill slots during SIMD vectorization. Bug: 34083438 Test: test-art-host, test-art-host-gtest-register_allocator_test Change-Id: I6d22966ba483deec72b5eea5061c403c12b2ada7
2c45bc9137c29f886e69923535aff31a74d90829	25-Oct-2016	Vladimir Marko <vmarko@google.com>	Remove H[Reverse]PostOrderIterator and HInsertionOrderIterator. Use range-based loops instead, introducing helper functions ReverseRange() for iteration in reverse order in containers. When the contents of the underlying container change inside the loop, use an index-based loop that better exposes the container data modifications, compared to the old iterator interface that's hiding it which may lead to subtle bugs. Test: m test-art-host Change-Id: I2a4e6c508b854c37a697fc4b1e8423a8c92c5ea0
9620230700d4b451097c2163faa70627c9d8088a	05-Oct-2016	Aart Bik <ajcbik@google.com>	Refactoring of graph linearization and linear order. Rationale: Ownership of graph's linear order and iterators was a bit unclear now that other phases are using it. New approach allows phases to compute their own order, while ssa_liveness is sole owner for graph (since it is not mutated afterwards). Also shortens lifetime of loop's arena. Test: test-art-host Change-Id: Ib7137d1203a1e0a12db49868f4117d48a4277f30
20e9db6db787e007e7032878c9899b28ec43e93f	14-Sep-2016	Aart Bik <ajcbik@google.com>	Make LinearizeGraph() public (and move it to nodes files) Rationale: It is strange that HLinearOrderIterator is defined (and visible) in nodes.h, but clients have no way to build this order. This CL makes the building available at the usual place. Change-Id: Ib66f2edf6dfc8edd6b429bd4bea3ac7e37440b28 Tests: m test-art
30f766688006813ce90f42160c4b31112e90da60	02-Sep-2016	David Brazdil <dbrazdil@google.com>	Cache result of an expensive DCHECK LiveInterval::AddBackEdgeUses tests whether linear order is well formed on debug builds. This is expensive and can significantly hinder compilation times when many back edge uses are added. This patch moves the IsLinearOrderWellFormed test at the end of linear order generation. Bug: 31163119 Change-Id: Ic4fe66bee2055f4b2cb065d9451ad5f21ba00676
d9ffd0dd7266f6a5e76f29d98dbe1a04f64cbb9b	22-Jun-2016	Matthew Gharrity <gharrma@google.com>	Implement a graph coloring register allocator Test: m test-art-host Change-Id: I8c0d77f339ab02b33588a54b96ecce5c8322cfce
e90049140fdfb89080e5cc9b000b0c9be8c18bcd	16-Jun-2016	Vladimir Marko <vmarko@google.com>	Create a typedef for HInstruction::GetInputs() return type. And some other cleanup after https://android-review.googlesource.com/230742 Test: No new tests. ART test suite passed (tested on host). Change-Id: I4743bf17544d0234c6ccb46dd0c1b9aae5c93e17
372f10e5b0b34e2bb6e2b79aeba6c441e14afd1f	17-May-2016	Vladimir Marko <vmarko@google.com>	Refactor handling of input records. Introduce HInstruction::GetInputRecords(), a new virtual function that returns an ArrayRef<> to all input records. Implement all other functions dealing with input records as wrappers around GetInputRecords(). Rewrite functions that previously used multiple virtual calls to deal with input records, especially in loops, to prefetch the ArrayRef<> only once for each instruction. Besides avoiding all the extra calls, this also allows the compiler (clang++) to perform additional optimizations. This speeds up the Nexus 5 boot image compilation by ~0.5s (4% of "Compile Dex File", 2% of dex2oat time) on AOSP ToT. Change-Id: Id8ebe0fb9405e38d918972a11bd724146e4ca578
d7c2fdc939bb7efb3e7204d62e54c6a3f7d77f9b	10-May-2016	Nicolas Geoffray <ngeoffray@google.com>	Fix another case of live_in at irreducible loop entry. GVN was implicitly extending the liveness of an instruction across an irreducible loop. Fix this problem by clearing the value set at loop entries that contain an irreducible loop. bug:28252896 (cherry picked from commit 77ce6430af2709432b22344ed656edd8ec80581b) Change-Id: Ie0121e83b2dfe47bcd184b90a69c0194d13fce54
77ce6430af2709432b22344ed656edd8ec80581b	10-May-2016	Nicolas Geoffray <ngeoffray@google.com>	Fix another case of live_in at irreducible loop entry. GVN was implicitly extending the liveness of an instruction across an irreducible loop. Fix this problem by clearing the value set at loop entries that contain an irreducible loop. bug:28252896 Change-Id: I68823cb88dceb4c2b4545286ba54fd0c958a48b0
d59f3b1b7f5c1ab9f0731ff9dc60611e8d9a6ede	29-Mar-2016	Vladimir Marko <vmarko@google.com>	Use iterators "before" the use node in HUserRecord<>. Create a new template class IntrusiveForwardList<> that mimics std::forward_list<> except that all allocations are handled externally. This is essentially the same as boost::intrusive::slist<> but since we're not using Boost we have to reinvent the wheel. Use the new container to replace the HUseList and use the iterators to "before" use nodes in HUserRecord<> to avoid the extra pointer to the previous node which was used exclusively for removing nodes from the list. This reduces the size of the HUseListNode by 25%, 32B to 24B in 64-bit compiler, 16B to 12B in 32-bit compiler. This translates directly to overall memory savings for the 64-bit compiler but due to rounding up of the arena allocations to 8B, we do not get any improvement in the 32-bit compiler. Compiling the Nexus 5 boot image with the 64-bit dex2oat on host this CL reduces the memory used for compiling the most hungry method, BatteryStats.dumpLocked(), by ~3.3MiB: Before: MEM: used: 47829200, allocated: 48769120, lost: 939920 Number of arenas allocated: 345, Number of allocations: 815492, avg size: 58 ... UseListNode 13744640 ... After: MEM: used: 44393040, allocated: 45361248, lost: 968208 Number of arenas allocated: 319, Number of allocations: 815492, avg size: 54 ... UseListNode 10308480 ... Note that while we do not ship the 64-bit dex2oat to the device, the JIT compilation for 64-bit processes is using the 64-bit libart-compiler. Bug: 28173563 Bug: 27856014 (cherry picked from commit 46817b876ab00d6b78905b80ed12b4344c522b6c) Change-Id: Ifb2d7b357064b003244e92c0d601d81a05e56a7b
46817b876ab00d6b78905b80ed12b4344c522b6c	29-Mar-2016	Vladimir Marko <vmarko@google.com>	Use iterators "before" the use node in HUserRecord<>. Create a new template class IntrusiveForwardList<> that mimics std::forward_list<> except that all allocations are handled externally. This is essentially the same as boost::intrusive::slist<> but since we're not using Boost we have to reinvent the wheel. Use the new container to replace the HUseList and use the iterators to "before" use nodes in HUserRecord<> to avoid the extra pointer to the previous node which was used exclusively for removing nodes from the list. This reduces the size of the HUseListNode by 25%, 32B to 24B in 64-bit compiler, 16B to 12B in 32-bit compiler. This translates directly to overall memory savings for the 64-bit compiler but due to rounding up of the arena allocations to 8B, we do not get any improvement in the 32-bit compiler. Compiling the Nexus 5 boot image with the 64-bit dex2oat on host this CL reduces the memory used for compiling the most hungry method, BatteryStats.dumpLocked(), by ~3.3MiB: Before: MEM: used: 47829200, allocated: 48769120, lost: 939920 Number of arenas allocated: 345, Number of allocations: 815492, avg size: 58 ... UseListNode 13744640 ... After: MEM: used: 44393040, allocated: 45361248, lost: 968208 Number of arenas allocated: 319, Number of allocations: 815492, avg size: 54 ... UseListNode 10308480 ... Note that while we do not ship the 64-bit dex2oat to the device, the JIT compilation for 64-bit processes is using the 64-bit libart-compiler. Bug: 28173563 Change-Id: I985eabd4816f845372d8aaa825a1489cf9569208
3563c44464ca55b2106373b35110e5ecaae38abf	18-Apr-2016	Vladimir Marko <vmarko@google.com>	Fix inlining loops in OSR mode. When compiling a method in OSR mode and the method does not contain a loop (arguably, a very odd case) but we inline another method with a loop and then the final DCE re-runs the loop identification, the inlined loop would previously be marked as irreducible. However, the SSA liveness analysis expects irreducible loop to have extra loop Phis which were already eliminated from the loop before the inner graph was inlined to the outer graph, so we would fail a DCHECK(). We fix this by not marking inlined loops as irreducible when compiling in OSR mode. Bug: 28210356 (cherry picked from commit fd66c50d64c38e40bafde83b4872e27bbff7546d) Change-Id: I149273b766d1c713c571baad6033c5f70e6dd960
fd66c50d64c38e40bafde83b4872e27bbff7546d	18-Apr-2016	Vladimir Marko <vmarko@google.com>	Fix inlining loops in OSR mode. When compiling a method in OSR mode and the method does not contain a loop (arguably, a very odd case) but we inline another method with a loop and then the final DCE re-runs the loop identification, the inlined loop would previously be marked as irreducible. However, the SSA liveness analysis expects irreducible loop to have extra loop Phis which were already eliminated from the loop before the inner graph was inlined to the outer graph, so we would fail a DCHECK(). We fix this by not marking inlined loops as irreducible when compiling in OSR mode. Bug: 28210356 Change-Id: If10057ed883333c62a878ed2ae3fe01bb5280e33
badd826664896d4a9628a5a89b78016894aa414b	02-Feb-2016	David Brazdil <dbrazdil@google.com>	ART: Run SsaBuilder from HGraphBuilder First step towards merging the two passes, which will later result in HGraphBuilder directly producing SSA form. This CL mostly just updates tests broken by not being able to inspect the pre-SSA form. Using HLocals outside the HGraphBuilder is now deprecated. Bug: 27150508 Change-Id: I00fb6050580f409dcc5aa5b5aa3a536d6e8d759e
674f519fe00ae07e0db90c4374f785bb418ae332	02-Feb-2016	David Brazdil <dbrazdil@google.com>	ART: Enable multi-level instruction inlining Change-Id: I4b4c927d7b1598dc197793c25185fb079dec7fe1
b3e773eea39a156b3eacf915ba84e3af1a5c14fa	26-Jan-2016	David Brazdil <dbrazdil@google.com>	ART: Implement support for instruction inlining Optimizing HIR contains 'non-materialized' instructions which are emitted at their use sites rather than their defining sites. This was not properly handled by the liveness analysis which did not adjust the use positions of the inputs of such instructions. Despite the analysis being incorrect, the current use cases never produce incorrect code. This patch generalizes the concept of inlined instructions and updates liveness analysis to set the compute use positions correctly. Change-Id: Id703c154b20ab861241ae5c715a150385d3ff621
15bd22849ee6a1ffb3fb3630f686c2870bdf1bbc	05-Jan-2016	Nicolas Geoffray <ngeoffray@google.com>	Implement irreducible loop support in optimizing. So we don't fallback to the interpreter in the presence of irreducible loops. Implications: - A loop pre-header does not necessarily dominate a loop header. - Non-constant redundant phis will be kept in loop headers, to satisfy our linear scan register allocation algorithm. - while-graph optimizations, such as gvn, licm, lse, and dce need to know when they are dealing with irreducible loops. Change-Id: I2cea8934ce0b40162d215353497c7f77d6c9137e
ec7802a102d49ab5c17495118d4fe0bcc7287beb	01-Oct-2015	Vladimir Marko <vmarko@google.com>	Add DCHECKs to ArenaVector and ScopedArenaVector. Implement dchecked_vector<> template that DCHECK()s element access and insert()/emplace()/erase() positions. Change the ArenaVector<> and ScopedArenaVector<> aliases to use the new template instead of std::vector<>. Remove DCHECK()s that have now become unnecessary from the Optimizing compiler. Change-Id: Ib8506bd30d223f68f52bd4476c76d9991acacadc
2aaa4b5532d30c4e65d8892b556400bb61f9dc8c	17-Sep-2015	Vladimir Marko <vmarko@google.com>	Optimizing: Tag more arena allocations. Replace GrowableArray with ArenaVector and tag arena allocations with new allocation types. As part of this, make the register allocator a bit more efficient, doing bulk insert/erase. Some loops are now O(n) instead of O(n^2). Change-Id: Ifac0871ffb34b121cc0447801a2d07eefd308c14
fa6b93c4b69e6d7ddfa2a4ed0aff01b0608c5a3a	15-Sep-2015	Vladimir Marko <vmarko@google.com>	Optimizing: Tag arena allocations in HGraph. Replace GrowableArray with ArenaVector in HGraph and related classes HEnvironment, HLoopInformation, HInvoke and HPhi, and tag allocations with new arena allocation types. Change-Id: I3d79897af405b9a1a5b98bfc372e70fe0b3bc40d
77a48ae01bbc5b05ca009cf09e2fcb53e4c8ff23	15-Sep-2015	David Brazdil <dbrazdil@google.com>	Revert "Revert "ART: Register allocation and runtime support for try/catch"" The original CL triggered b/24084144 which has been fixed by Ib72e12a018437c404e82f7ad414554c66a4c6f8c. This reverts commit 659562aaf133c41b8d90ec9216c07646f0f14362. Change-Id: Id8980436172457d0fcb276349c4405f7c4110a55
659562aaf133c41b8d90ec9216c07646f0f14362	14-Sep-2015	David Brazdil <dbrazdil@google.com>	Revert "ART: Register allocation and runtime support for try/catch" Breaks libcore test org.apache.harmony.security.tests.java.security.KeyStorePrivateKeyEntryTest#testGetCertificateChain. Need to investigate. This reverts commit b022fa1300e6d78639b3b910af0cf85c43df44bb. Change-Id: Ib24d3a80064d963d273e557a93469c95f37b1f6f
b022fa1300e6d78639b3b910af0cf85c43df44bb	20-Aug-2015	David Brazdil <dbrazdil@google.com>	ART: Register allocation and runtime support for try/catch This patch completes a series of CLs that add support for try/catch in the Optimizing compiler. With it, Optimizing can compile all methods containing try/catch, provided they don't contain catch loops. Future work will focus on improving performance of the generated code. SsaLivenessAnalysis was updated to propagate liveness information of instructions live at catch blocks, and to keep location information on instructions which may be caught by catch phis. RegisterAllocator was extended to spill values used after catch, and to allocate spill slots for catch phis. Catch phis generated for the same vreg share a spill slot as the raw value must be the same. Location builders and slow paths were updated to reflect the fact that throwing an exception may not lead to escaping the method. Instruction code generators are forbidden from using implicit null checks in try blocks as live registers need to be saved before handing over to the runtime. CodeGenerator emits a stack map for each catch block, storing locations of catch phis. CodeInfo and StackMapStream recognize this new type of stack map and store them separate from other stack maps to avoid dex_pc conflicts. After having found the target catch block to deliver an exception to, QuickExceptionHandler looks up the dex register maps at the throwing instruction and the catch block and copies the values over to their respective locations. The runtime-support approach was selected because it allows for the best performance in the normal control-flow path, since no propagation of catch phi values is necessary until the exception is thrown. In addition, it also greatly simplifies the register allocation phase. ConstantHoisting was removed from LICMTest because it instantiated (now abstract) HConstant and was bogus anyway (constants are always in the entry block). Change-Id: Ie31038ad8e3ee0c13a5bbbbaf5f0b3e532310e4e
6058455d486219994921b63a2d774dc9908415a2	03-Sep-2015	Vladimir Marko <vmarko@google.com>	Optimizing: Tag basic block allocations with their source. Replace GrowableArray with ArenaVector in HBasicBlock and, to track the source of allocations, assign one new and two Quick's arena allocation types to these vectors. Rename kArenaAllocSuccessor to kArenaAllocSuccessors. Bug: 23736311 Change-Id: Ib52e51698890675bde61f007fe6039338cf1a025
145acc5361deb769eed998f057bc23abaef6e116	03-Sep-2015	Vladimir Marko <vmarko@google.com>	Revert "Optimizing: Tag basic block allocations with their source." Reverting so that we can have more discussion about the STL API. This reverts commit 91e11c0c840193c6822e66846020b6647de243d5. Change-Id: I187fe52f2c16b6e7c5c9d49c42921eb6c7063dba
91e11c0c840193c6822e66846020b6647de243d5	02-Sep-2015	Vladimir Marko <vmarko@google.com>	Optimizing: Tag basic block allocations with their source. Replace GrowableArray with ArenaVector in HBasicBlock and, to track the source of allocations, assign one new and two Quick's arena allocation types to these vectors. Rename kArenaAllocSuccessor to kArenaAllocSuccessors. Bug: 23736311 Change-Id: I984aef6e615ae2380a532f5c6726af21015f43f5
681652d8e8a33bc07c5c082a71aea13d0f15e0a0	23-Jul-2015	Mingyao Yang <mingyao@google.com>	HDeoptimize should hold values live in env. Values that are not live in compiled code anymore may still be needed in interpreter, due to code motion, etc. (cherry-picked from commit 718493c6c3c8e380663cb8a94e57ce160a6c473f) Bug: 22665511 Change-Id: I8b85833c5c462f8fe36f86d6026a51b07563995a
718493c6c3c8e380663cb8a94e57ce160a6c473f	23-Jul-2015	Mingyao Yang <mingyao@google.com>	HDeoptimize should hold values live in env. Values that are not live in compiled code anymore may still be needed in interpreter, due to code motion, etc. Bug: 22665511 Change-Id: I8b85833c5c462f8fe36f86d6026a51b07563995a
94015b939060f5041d408d48717f22443e55b6ad	04-Jun-2015	Nicolas Geoffray <ngeoffray@google.com>	Revert "Revert "Use HCurrentMethod in HInvokeStaticOrDirect."" Fix was to special case baseline for x86, which does not have enough registers to allocate the current method. This reverts commit c345f141f11faad177aa9635a78088d00cf66086. Change-Id: I5997aa52f8d4df373ae5ff4d4150dac0c44c4c10
c345f141f11faad177aa9635a78088d00cf66086	04-Jun-2015	Nicolas Geoffray <ngeoffray@google.com>	Revert "Use HCurrentMethod in HInvokeStaticOrDirect." Fails on baseline/x86. This reverts commit 38207af82afb6f99c687f64b15601ed20d82220a. Change-Id: Ib71018367eb7c6046965494a7e996c22af3de403
38207af82afb6f99c687f64b15601ed20d82220a	01-Jun-2015	Nicolas Geoffray <ngeoffray@google.com>	Use HCurrentMethod in HInvokeStaticOrDirect. Change-Id: I0d15244b6b44c8b10079398c55da5071a3e3af66
8272688499c2232355db34d94057983fd436173d	01-Jun-2015	Nicolas Geoffray <ngeoffray@google.com>	Tweak one hint and one split in the linear scan. - Return a hinted register if it is available. Otherwise another move will be necessary. - Use SplitBetween instead of raw split when a register is not fully available. This will find the best split position. Change-Id: Ie464e536204ab556eb09345fe6426621eb86e5ac
0a23d74dc2751440822960eab218be4cb8843647	07-May-2015	Nicolas Geoffray <ngeoffray@google.com>	Add a parent environment to HEnvironment. This code has no functionality change. It adds a placeholder for chaining inlined frames. Change-Id: I5ec57335af76ee406052345b947aad98a6a4423a
db216f4d49ea1561a74261c29f1264952232728a	05-May-2015	Nicolas Geoffray <ngeoffray@google.com>	Relax the only one back-edge restriction. The rule is in the way for better register allocation, as it creates an artificial join point between multiple paths. Change-Id: Ia4392890f95bcea56d143138f28ddce6c572ad58
fbda5f3e1378f07ae202f62da625ee43a063a052	29-Apr-2015	Nicolas Geoffray <ngeoffray@google.com>	Find better split positions in the register allocator. In a standard if/else control flow graph, this avoids doing a move in one branch if the other branch decided to move an interval. This also needs a new register hint kind, which is what was the location of the interval at the predecessor block. Change-Id: I18b78264587b4d693540fbb5e014d12df2add3e2
579026039080252878106118645ed70706f4838e	21-Apr-2015	Nicolas Geoffray <ngeoffray@google.com>	Add synthesized uses at back edge. This reduces the cost of linearizing the graph (hence removing the notion of back edge). Since linear scan allocates/spills registers based on next use, adding a use at a back edge ensures we account for loop uses. Change-Id: Idaa882cb120edbdd08ca6bff142d326a8245bd14
4ed947a58de87d19d0609be773207c905ccb0f7f	27-Apr-2015	Nicolas Geoffray <ngeoffray@google.com>	Dissociate uses with environment uses. They are most of the times in the way when iterating. They also complicate the logic of (future) back edge uses. Change-Id: I152595d9913073fe901b267ca623fa0fe7432484
241a486267bdb59b32fe4c8db370eb936068fb39	16-Apr-2015	David Brazdil <dbrazdil@google.com>	ART: Replace expensive calls to Covers in reg alloc LiveInterval::Covers is implemented as a linear-time search over liveness ranges and can therefore be rather expensive and should be avoided unless necessary. This patch replaces calls to Covers when searching for a sibling with the cheaper IsDefinedAt call. Change-Id: I93fc73529c15a518335f4cbdc3a0def52d9501e5
0d9f17de8f21a10702de1510b73e89d07b3b9bbf	15-Apr-2015	Nicolas Geoffray <ngeoffray@google.com>	Move the linear order to the HGraph. Bug found by Zheng Xu: SsaLivenessAnalysis being a stack allocated object, we should not refer to it in later phases of the compiler. Specifically, the code generator was using the linear order, which was stored in the liveness analysis object. Change-Id: I574641f522b7b86fc43f3914166108efc72edb3b
d8126bef62df7f40f2e6abc74004f52e664daf45	27-Mar-2015	Nicolas Geoffray <ngeoffray@google.com>	Fix locations at environment uses. We were too aggressive in not recording environment uses when the instruction was not of type object. We have to record the use to the use list of an interval, but it should not affect the live ranges of that interval. Change-Id: Id16fb7cc06f14083766d408a345837793583b6ea
f01d34445953e6b9c9b13de1dd32a5c0ee5abab5	27-Mar-2015	Nicolas Geoffray <ngeoffray@google.com>	Implement a proper solution for temps. We used to play some trickery when updating locations of temps. This change creates a proper use of the temp, and use it for updating its location. Change-Id: I53e9447b87a55137a3a79841db21ad3864854825
46e2a3915aa68c77426b71e95b9f3658250646b7	16-Mar-2015	David Brazdil <dbrazdil@google.com>	ART: Boolean simplifier The optimization recognizes the negation pattern generated by 'javac' and replaces it with a single condition. To this end, boolean values are now consistently assumed to be represented by an integer. This is a first optimization which deletes blocks from the HGraph and does so by replacing the corresponding entries with null. Hence, existing code can continue indexing the list of blocks with the block ID, but must check for null when iterating over the list. Change-Id: I7779da69cfa925c6521938ad0bcc11bc52335583
915b9d0c13bb5091875d868fbfa551d7b65d7477	11-Mar-2015	Nicolas Geoffray <ngeoffray@google.com>	Tweak liveness when instructions are used in environments. Instructions remain live when debuggable, but only instructions with object types remain live when non-debuggable. Enable StackVisitor::GetThisObject for optimizing. Change-Id: Id87b2cbf33a02450059acc9993995782e5f28987
5b8e6a594b827f7dc88b2e3d895e08f5b3f22446	25-Feb-2015	David Brazdil <dbrazdil@google.com>	ART: Cache last returned range in LiveInterval::Covers Optimizing spends ~10% of compilation time in the register allocator. One of the frequently called methods is LiveInterval::Covers which has linear complexity w.r.t. the number of gaps in liveness intervals. This patch leverages the fact that the register allocator calls Covers with non-decreasing position values and caches the last returned result to start the iteration closer to the result the next time the method is invoked. Stats from compiling the framework show that this optimization reduces the average number of iterations needed to find the result by 40%. Change-Id: I4dd26b900879d5e1d03818ebc1e117cc6a53053c
da02afe615191a19eae9a039786c4c4fc20dbfff	11-Feb-2015	Nicolas Geoffray <ngeoffray@google.com>	Support hints for register pairs. Change-Id: Ia49dc5bf3e9a2bd481425bfe7fbeea9feb66c8e6
c0572a451944f78397619dec34a38c36c11e9d2a	06-Feb-2015	Nicolas Geoffray <ngeoffray@google.com>	Optimize leaf methods. Avoid suspend checks and stack changes when not needed. Change-Id: I0fdb31e8c631e99091b818874a558c9aa04b1628
ed59619b370ef23ffbb25d1d01f615e60a9262b6	23-Jan-2015	David Brazdil <dbrazdil@google.com>	Optimizing: Speed up HEnvironment use removal Removal of use records from HEnvironment vregs involved iterating over potentially large linked lists which made compilation of huge methods very slow. This patch turns use lists into doubly-linked lists, stores pointers to the relevant nodes inside HEnvironment and subsequently turns the removals into constant-time operations. Change-Id: I0e1d4d782fd624e7b8075af75d4adf0a0634a1ee
840e5461a85f8908f51e7f6cd562a9129ff0e7ce	07-Jan-2015	Nicolas Geoffray <ngeoffray@google.com>	Implement double and float support for arm in register allocator. The basic approach is: - An instruction that needs two registers gets two intervals. - When allocating the low part, we also allocate the high part. - When splitting a low (or high) interval, we also split the high (or low) equivalent. - Allocation follows the (S/D register) requirement that low registers are always even and the high equivalent is low + 1. Change-Id: I06a5148e05a2ffc7e7555d08e871ed007b4c2797
a8eed3acbc39c71ec22dc2943e71eaa07c6507dd	24-Nov-2014	Nicolas Geoffray <ngeoffray@google.com>	Revert "Revert "Fix the computation of linear ordering."" PS2 fixes the obvious typos/wrong refactoring. This reverts commit e50fa5887b1342b845826197d81950e26753fc9c. Change-Id: I22f81d63a12cf01aafd61535abc2399d936d49c2
e50fa5887b1342b845826197d81950e26753fc9c	24-Nov-2014	Nicolas Geoffray <ngeoffray@google.com>	Revert "Fix the computation of linear ordering." Build is broken. This reverts commit 3054a90063d379ab8c9e5a42a7daf0d644b48b07. Change-Id: I259bc2bd6a58e30391b8176f3db5fdb5c07e4d6d
3054a90063d379ab8c9e5a42a7daf0d644b48b07	21-Nov-2014	Nicolas Geoffray <ngeoffray@google.com>	Fix the computation of linear ordering. The register allocator makes assumptions on the order, and we ended up not computing the right one. The algorithm worked fine when the loop header is the block branching to the exit, but in the presence of breaks or do/while, it was incorrect. Change-Id: Iad0a89872cd3f7b7a8b2bdf560f0d03493f93ba5
277ccbd200ea43590dfc06a93ae184a765327ad0	04-Nov-2014	Andreas Gampe <agampe@google.com>	ART: More warnings Enable -Wno-conversion-null, -Wredundant-decls and -Wshadow in general, and -Wunused-but-set-parameter for GCC builds. Change-Id: I81bbdd762213444673c65d85edae594a523836e5
296bd60423e0630d8152b99fb7afb20fbff5a18a	07-Oct-2014	Mingyao Yang <mingyao@google.com>	Some improvement to reg alloc. Change-Id: If579a37791278500a7e5bc763f144c241f261920
102cbed1e52b7c5f09458b44903fe97bb3e14d5f	15-Oct-2014	Nicolas Geoffray <ngeoffray@google.com>	Implement register allocator for floating point registers. Also: - Fix misuses of emitting the rex prefix in the x86_64 assembler. - Fix movaps code generation in the x86_64 assembler. Change-Id: Ib6dcf6e7c4a9c43368cfc46b02ba50f69ae69cbe
56b9ee6fe1d6880c5fca0e7feb28b25a1ded2e2f	09-Oct-2014	Nicolas Geoffray <ngeoffray@google.com>	Stop converting from Location to ManagedRegister. Now the source of truth is the Location object that knows which register (core, pair, fpu) it needs to refer to. Change-Id: I62401343d7479ecfb24b5ed161ec7829cda5a0b1
01ef345767ea609417fc511e42007705c9667546	01-Oct-2014	Nicolas Geoffray <ngeoffray@google.com>	Add trivial register hints to the register allocator. - Add hints for phis, same as first input, and expected registers. - Make the if instruction accept non-condition instructions. Change-Id: I34fa68393f0d0c19c68128f017b7a05be556fbe5
8ddb00ca935733f5d3b07816e5bb33d6cabe6ec4	29-Sep-2014	Nicolas Geoffray <ngeoffray@google.com>	Improve detection of lifetime holes. The check concluding that the next use was in a successor was too conservative: two blocks following each other in terms of liveness are not necessarily predecessor/successor. Change-Id: Ideec98046c812aa5fb63781141b5fde24c706d6d
8a16d97fb8f031822b206e65f9109a071da40563	11-Sep-2014	Nicolas Geoffray <ngeoffray@google.com>	Fix valgrind errors. For now just stack allocate the code generator. Will think about cleaning up the root problem later (CodeGenerator being an arena object). Change-Id: I161a6f61c5f27ea88851b446f3c1e12ee9c594d7
e77493c7217efdd1a0ecef521a6845a13da0305b	21-Aug-2014	Ian Rogers <irogers@google.com>	Make common BitVector operations inline-able. Change-Id: Ie25de4fae56c6712539f04172c42e3eff57df7ca
e50383288a75244255d3ecedcc79ffe9caf774cb	04-Jul-2014	Nicolas Geoffray <ngeoffray@google.com>	Support fields in optimizing compiler. - Required support for temporaries, to be only used by baseline compiler. - Also fixed a few invalid assumptions around locations and instructions that don't need materialization. These instructions should not have an Out. Change-Id: Idc4a30dd95dd18015137300d36bec55fc024cf62
31d76b42ef5165351499da3f8ee0ac147428c5ed	09-Jun-2014	Nicolas Geoffray <ngeoffray@google.com>	Plug code generator into liveness analysis. Also implement spill slot support. Change-Id: If5e28811e9fbbf3842a258772c633318a2f4fafc
ec7e4727e99aa1416398ac5a684f5024817a25c7	06-Jun-2014	Nicolas Geoffray <ngeoffray@google.com>	Fix some bugs in graph construction/simplification methods. Also fix a brano during SSA construction. The code should not have been commented out. Added a test to cover what the code intends. Change-Id: Ia00ae79dcf75eb0d412f07649d73e7f94dbfb6f0
ffddfdf6fec0b9d98a692e27242eecb15af5ead2	03-Jun-2014	Tim Murray <timmurray@google.com>	DO NOT MERGE Merge ART from AOSP to lmp-preview-dev. Change-Id: I0f578733a4b8756fd780d4a052ad69b746f687a9
a7062e05e6048c7f817d784a5b94e3122e25b1ec	22-May-2014	Nicolas Geoffray <ngeoffray@google.com>	Add a linear scan register allocator to the optimizing compiler. This is a "by-the-book" implementation. It currently only deals with allocating registers, with no hint optimizations. The changes remaining to make it functional are: - Allocate spill slots. - Resolution and placements of Move instructions. - Connect it to the code generator. Change-Id: Ie0b2f6ba1b98da85425be721ce4afecd6b4012a4
a5b8fde2d2bc3167078694fad417fddfe442a6fd	23-May-2014	Vladimir Marko <vmarko@google.com>	Rewrite BitVector index iterator. The BitVector::Iterator was not iterating over the bits but rather over indexes of the set bits. Therefore, we rename it to IndexIterator and provide a BitVector::Indexes() to get a container-style interface with begin() and end() for range based for loops. Also, simplify InsertPhiNodes where the tmp_blocks isn't needed since the phi_nodes and input_blocks cannot lose any blocks in subsequent iterations, so we can do the Union() directly in those bit vectors and we need to repeat the loop only if we have new input_blocks, rather than on phi_nodes change. And move the temporary bit vectors to scoped arena. Change-Id: I6cb87a2f60724eeef67c6aaa34b36ed5acde6d43
ddb311fdeca82ca628fed694c4702f463b5c4927	16-May-2014	Nicolas Geoffray <ngeoffray@google.com>	Build live ranges in preparation for register allocation. Change-Id: I7ae24afaa4e49276136bf34f4ba7d62db7f28c01
0d3f578909d0d1ea072ca68d78301b6fb7a44451	14-May-2014	Nicolas Geoffray <ngeoffray@google.com>	Linearize the graph before creating live ranges. Change-Id: I02eb5671e3304ab062286131745c1366448aff58
f635e63318447ca04731b265a86a573c9ed1737c	14-May-2014	Nicolas Geoffray <ngeoffray@google.com>	Add a compilation tracing mechanism to the new compiler. Code mostly imported from: https://android-review.googlesource.com/#/c/81653/. Change-Id: I150fe942be0fb270e03fabb19032180f7a065d13
622d9c31febd950255b36a48b47e1f630197c5fe	12-May-2014	Nicolas Geoffray <ngeoffray@google.com>	Add loop recognition and CFG simplifications in new compiler. We do three simplifications: - Split critical edges, for code generation from SSA (new). - Ensure one back edge per loop, to simplify loop recognition (new). - Ensure only one pre header for a loop, to simplify SSA creation (existing). Change-Id: I9bfccd4b236a00486a261078627b091c8a68be33
804d09372cc3d80d537da1489da4a45e0e19aa5d	02-May-2014	Nicolas Geoffray <ngeoffray@google.com>	Build live-in, live-out and kill sets for each block. This information will be used when computing live ranges of instructions. Change-Id: I345ee833c1ccb4a8e725c7976453f6d58d350d74