History log of /art/compiler/optimizing/code_generator_vector_mips.cc
Revision Date Author Comments
66c158ef6b2a16257f1590b3ace78848a7c2407b 31-Jan-2018 Aart Bik <ajcbik@google.com> Clean up signed/unsigned in vectorizer.

Rationale:
Currently we have some remaining ugliness around signed and unsigned
SIMD operations due to lack of kUint32 and kUint64 in the HIR. By
"softly" introducing these types, ABS/MIN/MAX/HALVING_ADD/SAD_ACCUMULATE
operations can solely rely on the packed data types to distinguish
between signed and unsigned operations. Cleaner, and also allows for
some code removal in the current loop optimizer.

Bug: 72709770

Test: test-art-host test-art-target
Change-Id: I68e4cdfba325f622a7256adbe649735569cab2a3
38e380b3c5139c0993495a7a0f040bfe8aa1e9e9 30-Oct-2017 Lena Djokic <Lena.Djokic@imgtec.com> MIPS: Implement Sum-of-Abs-Differences

Test: test-art-host test-art-target

Change-Id: I32a3e21f96cdcbab2e108d71746670408deb901a
7a0222e78ca39dc8f8e084b0c84f71a10852d24a 08-Nov-2017 Goran Jakovljevic <goran.jakovljevic@mips.com> MIPS32: Don't leave vector registers in an unpredictable state

Don't use regular FPU instructions in the Vectorizer because some
of those (such as mtc1 and mthc1) may leave remaining elements in
an unpredictable state.

Test: ./testrunner --optimizing --target --32 in QEMU (MIPS32R6)
Change-Id: I9b082bfd83360374b2f4831ef5f5bd35b00272a5
e434c4f47e30a6440834d14ee060f2879cbc18aa 23-Oct-2017 Lena Djokic <Lena.Djokic@imgtec.com> MIPS: Basic SIMD reduction support.

Enables vectorization of x += .... for very basic (simple, same-type)
constructs for MIPS.

Note: Testing is done with checker parts of tests 661 and 665,
locally changed to cover MIPS32 cases. These changes can't
be included in this patch since MSA is not a default option.

Test: test-art-host test-art-target
Change-Id: Ia3b3646afecb76c2f00996a30923ca70302be57e
e764d2e50c544c2cb98ee61a15d613161ac6bd17 05-Oct-2017 Vladimir Marko <vmarko@google.com> Use ScopedArenaAllocator for register allocation.

Memory needed to compile the two most expensive methods for
aosp_angler-userdebug boot image:
BatteryStats.dumpCheckinLocked() : 25.1MiB -> 21.1MiB
BatteryStats.dumpLocked(): 49.6MiB -> 42.0MiB
This is because all the memory previously used by Scheduler
is reused by the register allocator; the register allocator
has a higher peak usage of the ArenaStack.

And continue the "arena"->"allocator" renaming.

Test: m test-art-host-gtest
Test: testrunner.py --host
Bug: 64312607
Change-Id: Idfd79a9901552b5147ec0bf591cb38120de86b01
ca6fff898afcb62491458ae8bcd428bfb3043da1 03-Oct-2017 Vladimir Marko <vmarko@google.com> ART: Use ScopedArenaAllocator for pass-local data.

Passes using local ArenaAllocator were hiding their memory
usage from the allocation counting, making it difficult to
track down where memory was used. Using ScopedArenaAllocator
reveals the memory usage.

This changes the HGraph constructor which requires a lot of
changes in tests. Refactor these tests to limit the amount
of work needed the next time we change that constructor.

Test: m test-art-host-gtest
Test: testrunner.py --host
Test: Build with kArenaAllocatorCountAllocations = true.
Bug: 64312607
Change-Id: I34939e4086b500d6e827ff3ef2211d1a421ac91a
d5d2f2ce627aa0f6920d7ae05197abd1a396e035 26-Sep-2017 Vladimir Marko <vmarko@google.com> ART: Introduce Uint8 compiler data type.

This CL adds all the necessary codegen for the Uint8 type
but does not add code transformations that use that code.
Vectorization codegens are modified to use Uint8 as the
packed type when appropriate. The side effects are now
disconnected from the instruction's type after the graph has
been built to allow changing HArrayGet/H*FieldGet/HVecLoad
to use a type different from the underlying field or array.

Note: HArrayGet for String.charAt() is modified to have
no side effects whatsoever; Strings are immutable.

Test: m test-art-host-gtest
Test: testrunner.py --host --optimizing --jit
Test: testrunner.py --target --optimizing on Nexus 6P
Test: Nexus 6P boots.
Bug: 23964345
Change-Id: If2dfffedcfb1f50db24570a1e9bd517b3f17bfd0
0ebe0d83138bba1996e9c8007969b5381d972b32 21-Sep-2017 Vladimir Marko <vmarko@google.com> ART: Introduce compiler data type.

Replace most uses of the runtime's Primitive in compiler
with a new class DataType. This prepares for introducing
new types, such as Uint8, that the runtime does not need
to know about.

Test: m test-art-host-gtest
Test: testrunner.py --host
Bug: 23964345
Change-Id: Iec2ad82454eec678fffcd8279a9746b90feb9b0c
dbbac8f812a866b1b53f3007721f66038d208549 01-Sep-2017 Aart Bik <ajcbik@google.com> Implement Sum-of-Abs-Differences idiom recognition.

Rationale:
Currently just on ARM64 (x86 lacks proper support),
using the SAD idiom yields great speedup on loops
that compute the sum-of-abs-difference operation.
Also includes some refinements around type conversions.

Speedup ExoPlayerAudio (golem run):
1.3x on ARM64
1.1x on x86

Test: test-art-host test-art-target

Bug: 64091002

Change-Id: Ia2b711d2bc23609a2ed50493dfe6719eedfe0130
0148de41a5c77c2f61252c219f1a02413c7c4a32 05-Sep-2017 Aart Bik <ajcbik@google.com> Basic SIMD reduction support.

Rationale:
Enables vectorization of x += .... for very basic (simple, same-type)
constructs. Paves the way for more complex (narrower and/or mixed-type)
constructs, which will be handled by the next CL.

This is a revert of Icb5d6c805516db0a1d911c3ede9a246ccef89a22
and thus a revert^2 of I2454778dd0ef1da915c178c7274e1cf33e271d0f
and thus a revert^3 of I1c1c87b6323e01442e8fbd94869ddc9e760ea1fc
and thus a revert^4 of I7880c135aee3ed0a39da9ae5b468cbf80e613766

PS1-2 shows what needed to change

Test: test-art-host test-art-target

Bug: 64091002
Change-Id: I647889e0da0959ca405b70081b79c7d3c9bcb2e9
982334cef17d47ef2477d88a97203a9587a4b86f 02-Sep-2017 Nicolas Geoffray <ngeoffray@google.com> Revert "Basic SIMD reduction support."

Fails 530-checker-lse on arm64.

Bug: 64091002, 65212948

This reverts commit cfa59b49cde265dc5329a7e6956445f9f7a75f15.

Change-Id: Icb5d6c805516db0a1d911c3ede9a246ccef89a22
cfa59b49cde265dc5329a7e6956445f9f7a75f15 31-Aug-2017 Aart Bik <ajcbik@google.com> Basic SIMD reduction support.

Rationale:
Enables vectorization of x += .... for very basic (simple, same-type)
constructs. Paves the way for more complex (narrower and/or mixed-type)
constructs, which will be handled by the next CL.

This is a revert^2 of I7880c135aee3ed0a39da9ae5b468cbf80e613766
and thus a revert of I1c1c87b6323e01442e8fbd94869ddc9e760ea1fc

PS1-2 shows what needed to change, with regression tests

Test: test-art-host test-art-target

Bug: 64091002, 65212948
Change-Id: I2454778dd0ef1da915c178c7274e1cf33e271d0f
a57b4ee7b15ce6abfb5fa88c8dc8a516fe40e0d9 30-Aug-2017 Aart Bik <ajcbik@google.com> Revert "Basic SIMD reduction support."

This reverts commit 9879d0eac8fe2aae19ca6a4a2a83222d6383afc2.

Getting these type check failures in some builds. Need time to look at this better, so reverting for now :-(


dex2oatd F 08-30 21:14:29 210122 226218
code_generator.cc:115] Check failed: CheckType(instruction->GetType(), locations->InAt(0)) PrimDouble C

Change-Id: I1c1c87b6323e01442e8fbd94869ddc9e760ea1fc
9879d0eac8fe2aae19ca6a4a2a83222d6383afc2 15-Aug-2017 Aart Bik <ajcbik@google.com> Basic SIMD reduction support.

Rationale:
Enables vectorization of x += .... for very basic (simple, same-type)
constructs. Paves the way for more complex (narrower and/or mixed-type)
constructs, which will be handled by the next CL.

Test: test-art-host test-art-target

Bug: 64091002

Change-Id: I7880c135aee3ed0a39da9ae5b468cbf80e613766
bc5460b850a0fa2d8dcf6c8d36b0eb86f8fe46a8 20-Jul-2017 Lena Djokic <Lena.Djokic@imgtec.com> MIPS: Support MultiplyAccumulate for SIMD.

Moved support for multiply accumulate from arm64-specific to
general instruction simplification.
Also extended 550-checker-multiply-accumulate test.

Test: test-art-host, test-art-target

Change-Id: If113f0f0d5cb48e8a76273c919cfa2f49fce667d
51765b098301fff1897361b2d1a21af356d9d6d8 22-Jun-2017 Lena Djokic <Lena.Djokic@imgtec.com> MIPS32: ART Vectorizer

MIPS32 implementation which uses MSA extension.

Note: Testing is done with checker parts of tests 640, 645, 646 and
651, locally changed to cover MIPS32 cases. These changes can't
be included in this patch since MSA is not a default option.

Test: ./testrunner.py --target --optimizing -j1 in QEMU (mips32r6)
Change-Id: Ieba28f94c48c943d5444017bede9a5d409149762
f34dd206d0073fb3949be872224420a8488f551f 10-Apr-2017 Artem Serov <artem.serov@linaro.org> ARM64: Support MultiplyAccumulate for SIMD.

Test: test-art-host, test-art-target.

Change-Id: I06af8415e15352d09d176cae828163cbe99ae7a7
f3e61ee363fe7f82ef56704f06d753e2034a67dd 13-Apr-2017 Aart Bik <ajcbik@google.com> Implement halving add idiom (with checker tests).

Rationale:
First of several idioms that map to very efficient SIMD instructions.
Note that the is-zero-ext and is-sign-ext are general-purpose utilities
that will be widely used in the vectorizer to detect low precision
idioms, so expect that code to be shared with many CLs to come.

Test: test-art-host, test-art-target
Change-Id: If7dc2926c72a2e4b5cea15c44ef68cf5503e9be9
6daebeba6ceab4e7dff5a3d65929eeac9a334004 03-Apr-2017 Aart Bik <ajcbik@google.com> Implemented ABS vectorization.

Rationale:
This CL adds the concept of vectorizing intrinsics
to the ART vectorizer. More can follow (MIN, MAX, etc).

Test: test-art-host, test-art-target (angler)
Change-Id: Ieed8aa83ec64c1250ac0578570249cce338b5d36
f8f5a16ed7bad1e18179e38453e59c96a944de10 07-Feb-2017 Aart Bik <ajcbik@google.com> ART vectorizer.

Rationale:
Make SIMD great again with a retargetable and easily extendable vectorizer.

Provides a full x86/x86_64 and a proof-of-concept ARM implementation. Sample
improvement (without any perf tuning yet) for Linpack on x86 is about 20% to 50%.

Test: test-art-host, test-art-target (angler)
Bug: 34083438, 30933338

Change-Id: Ifb77a0f25f690a87cd65bf3d5e9f6be7ea71d6c1