History log of /art/compiler/optimizing/parallel_move_resolver.cc
Revision	Date	Author	Comments
ad4450e5c3ffaa9566216cc6fafbf5c11186c467	17-Apr-2015	Zheng Xu <zheng.xu@arm.com>	Opt compiler: Implement parallel move resolver without using swap. The algorithm of ParallelMoveResolverNoSwap() is almost the same as that of ParallelMoveResolverWithSwap(), except for the way circular dependencies are resolved. NoSwap() uses an additional scratch register to resolve a circular dependency. For example, (0->1) (1->2) (2->0) will be performed as (2->scratch) (1->2) (0->1) (scratch->0). On architectures without swap register support, NoSwap() can reduce the number of moves from 3x(N-1) to (N+1) when there is a circular dependency with N moves. In addition, the NoSwap() algorithm does not depend on architecture register layout information, which means it can support register pairs on arm32 and X/W, D/S registers on arm64 without additional modification. Change-Id: Idf56bd5469bb78c0e339e43ab16387428a082318
e14590bdfed24df30e6b7545fc819ba03ff8bba1	15-Apr-2015	Guillaume Sanchez <guillaumesa@google.com>	Revert "[optimizing] Improve x86 parallel moves/swaps" This reverts commit a5c19ce8d200d68a528f2ce0ebff989106c4a933. This commit introduces a performance regression on CaffeineLogic of 30%. Change-Id: I917e206e249d44e1748537bc1b2d31054ea4959d
9021825d1e73998b99c81e89c73796f6f2845471	15-Apr-2015	Nicolas Geoffray <ngeoffray@google.com>	Type MoveOperands. The ParallelMoveResolver implementation needs to know if a move is for 64bits or not, to handle swaps correctly. Bug found, and test case courtesy of Serguei I. Katkov. Change-Id: I9a0917a1cfed398c07e57ad6251aea8c9b0b8506
a5c19ce8d200d68a528f2ce0ebff989106c4a933	01-Apr-2015	Mark Mendell <mark.p.mendell@intel.com>	[optimizing] Improve x86 parallel moves/swaps Add a new constructor to ScratchRegisterScope that will supply a register if there is a free one, but will not spill to force one. Use this to generate alternate code that doesn't use a temporary, as the spill/restore of a register generates extra instructions that aren't necessary on x86. Here is the benefit for a 32 bit memory-to-memory exchange with no free registers: < 50 push eax < 53 push ebx < 8B44244C mov eax, [esp + 76] < 8B5C246C mov ebx, [esp + 108] < 8944246C mov [esp + 108], eax < 895C244C mov [esp + 76], ebx < 5B pop ebx < 58 pop eax --- > FF742444 push [esp + 68] > FF742468 push [esp + 104] > 8F44244C pop [esp + 72] > 8F442468 pop [esp + 100] Avoid using xchg instruction, as it is slow on smaller processors. Change-Id: Id29ee3abd998577baaee552d55d23e60ae0c7871 Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>
a2d15b5e486021bef330b70c21e99557cb116ee5	31-Mar-2015	Nicolas Geoffray <ngeoffray@google.com>	Fix wrong assumptions about ParallelMove. Registers involved in single and double operations can drag stack locations as well, so it is possible to update a single stack location with a slot from a double stack location. bug:19999189 Change-Id: Ibeec7d6f1b3126c4ae226fca56e84dccf798d367
f7a0c4e421b5edaad5b7a15bfff687da28d0b287	10-Feb-2015	Nicolas Geoffray <ngeoffray@google.com>	Improve ParallelMoveResolver to work with pairs. Change-Id: Ie2a540ffdb78f7f15d69c16a08ca2d3e794f65b9
42d1f5f006c8bdbcbf855c53036cd50f9c69753e	16-Jan-2015	Nicolas Geoffray <ngeoffray@google.com>	Do not use register pair in a parallel move. The ParallelMoveResolver does not work with pairs. Instead, decompose the pair into two individual moves. Change-Id: Ie9d3f0b078cef8dc20640c98b20bb20cc4971a7f
48c310c431b110f6ab54907da20c4fa39a8f76b8	14-Jan-2015	Nicolas Geoffray <ngeoffray@google.com>	Remove constant moves after emitting them in parallel resolver. This fixes the case where a constant move requires a scratch register. Note that there is no backend that needs this for now, but X86 might with the move to hard float. Change-Id: I37f6b8961b48f2cf6fbc0cd281e70d58466d018e
277ccbd200ea43590dfc06a93ae184a765327ad0	04-Nov-2014	Andreas Gampe <agampe@google.com>	ART: More warnings Enable -Wno-conversion-null, -Wredundant-decls and -Wshadow in general, and -Wunused-but-set-parameter for GCC builds. Change-Id: I81bbdd762213444673c65d85edae594a523836e5
56b9ee6fe1d6880c5fca0e7feb28b25a1ded2e2f	09-Oct-2014	Nicolas Geoffray <ngeoffray@google.com>	Stop converting from Location to ManagedRegister. Now the source of truth is the Location object that knows which register (core, pair, fpu) it needs to refer to. Change-Id: I62401343d7479ecfb24b5ed161ec7829cda5a0b1
e27f31a81636ad74bd3376ee39cf215941b85c0e	12-Jun-2014	Nicolas Geoffray <ngeoffray@google.com>	Enable the register allocator on ARM. - Also fixes a few bugs/wrong assumptions in code not hit by x86. - We need to differentiate between moves due to connecting siblings within a block, and moves due to control flow resolution. Change-Id: Idd05cf138a71c8f36f5531c473de613c0166fe38
86dbb9a12119273039ce272b41c809fa548b37b6	04-Jun-2014	Nicolas Geoffray <ngeoffray@google.com>	Final CL to enable register allocation on x86. This CL implements: 1) Resolution after allocation: connecting the locations allocated to an interval within a block and between blocks. 2) Handling of fixed registers: some instructions require inputs/output to be at a specific location, and the allocator needs to deal with them in a special way. 3) ParallelMoveResolver::EmitNativeCode for x86. Change-Id: I0da6bd7eb66877987148b87c3be6a983b4e3f858
ffddfdf6fec0b9d98a692e27242eecb15af5ead2	03-Jun-2014	Tim Murray <timmurray@google.com>	DO NOT MERGE Merge ART from AOSP to lmp-preview-dev. Change-Id: I0f578733a4b8756fd780d4a052ad69b746f687a9
4e3d23aa1523718ea1fdf3a32516d2f9d81e84fe	22-May-2014	Nicolas Geoffray <ngeoffray@google.com>	Import Dart's parallel move resolver. And write a few tests while at it. A parallel move resolver will be needed for performing multiple moves that are conceptually parallel, for example moves at a block exit that branches to a block with phi nodes. Change-Id: Ib95b247b4fc3f2c2fcab3b8c8d032abbd6104cd7
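
Illustrative sketch (not the ART implementation): the 17-Apr-2015 entry describes breaking a circular dependency with a scratch register instead of swaps. The C++ below is a minimal, self-contained sketch of that idea only. The names Move, kScratch, IsReadByOther and SequentializeWithScratch are hypothetical and do not appear in parallel_move_resolver.cc, and locations are modelled as plain integers rather than ART Location objects. It assumes that all destinations in the parallel move are distinct and that the scratch location is always free.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical location ids: non-negative values are registers,
// kScratch stands for the one spare scratch register.
constexpr int kScratch = -1;

struct Move {
  int src;
  int dst;
};

// Returns true if a pending move other than pending[self] still reads `loc`.
static bool IsReadByOther(const std::vector<Move>& pending, std::size_t self, int loc) {
  for (std::size_t i = 0; i < pending.size(); ++i) {
    if (i != self && pending[i].src == loc) return true;
  }
  return false;
}

// Turns a parallel move (every source is read before any destination is
// written) into an equivalent sequence of ordinary moves. Cycles are broken
// by saving one source into kScratch; no swap is ever emitted.
std::vector<Move> SequentializeWithScratch(std::vector<Move> pending) {
  for (const Move& m : pending) {
    assert(m.src != kScratch && m.dst != kScratch);  // scratch is reserved
  }
  // Moves from a location to itself are no-ops; drop them up front.
  pending.erase(std::remove_if(pending.begin(), pending.end(),
                               [](const Move& m) { return m.src == m.dst; }),
                pending.end());

  std::vector<Move> out;
  while (!pending.empty()) {
    bool emitted = false;
    for (std::size_t i = 0; i < pending.size(); ++i) {
      // A move is safe to emit once nobody else still needs to read its destination.
      if (!IsReadByOther(pending, i, pending[i].dst)) {
        out.push_back(pending[i]);
        pending.erase(pending.begin() + static_cast<std::ptrdiff_t>(i));
        emitted = true;
        break;
      }
    }
    if (emitted) continue;
    // Every remaining destination is still needed as a source, so only cycles
    // are left. Save the source of the last move in the scratch register and
    // make that move read from the scratch instead.
    Move& last = pending.back();
    out.push_back({last.src, kScratch});
    last.src = kScratch;
  }
  return out;
}
```

Under these assumptions, calling SequentializeWithScratch({{0, 1}, {1, 2}, {2, 0}}) yields (2->scratch) (1->2) (0->1) (scratch->0), the same order as the example in the commit message: N+1 = 4 moves for a cycle of N = 3. Always breaking the cycle at the last pending move is an arbitrary choice made here to keep the sketch short.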