History log of /frameworks/rs/cpu_ref/rsCpuIntrinsicBLAS.cpp
Revision Date Author Comments
41ab8faaf0d90238d42d8e2bbb7177467c10b4f6 08-Sep-2016 Miao Wang <miaowang@google.com> Implement multi-thread CPU GEMM for BLAS Intrinsics

- Multi-thread GEMM utilizes existing RS thread pool on top of
Eigen.
- Large matrix-matrix multiplication is decomposed into multiple
tiled matrix-matrix multiplications. Each thread repeatedly picks up
the next unfinished tile until none remain.
- The tiling applies to ONLY ONE dimension of each input matrix,
and whether to tile X or Y depends on the transpose of the matrix.
- The performance increase is proportional to the number of
available CPU cores, for sufficiently large matrices.

Test: CTS test (rsblas) passes on Angler, Fugu and new devices.
Performance test with RsBlasBenchmark and RsNeuralNet demo
on Angler, Ryu, Seed, Shamu, Volantis, Fugu and new devices,
showing roughly 70% (Volantis, 2 cores) to 400+% (Angler, 8 cores) perf gain.

Change-Id: If96f4119fd34d5d9d98a2542801495e7ffe577ae
/frameworks/rs/cpu_ref/rsCpuIntrinsicBLAS.cpp
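
Illustrative sketch for the tiling scheme described in 41ab8fa. This is not the code from the commit: it uses std::thread rather than the RS thread pool, a plain triple loop rather than Eigen, and ignores transposes. The name tiledGemm and all of its parameters are hypothetical. It shows the two ideas from the commit message: A is tiled along its rows only and B along its columns only (the shared inner dimension stays whole), and each thread keeps claiming the next unfinished tile from a shared counter.

```cpp
#include <algorithm>
#include <atomic>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical helper: C (m x n) = A (m x k) * B (k x n), all row-major.
// Assumption: no transposes, so A is tiled by rows and B by columns.
void tiledGemm(const std::vector<float>& a, const std::vector<float>& b,
               std::vector<float>& c, std::size_t m, std::size_t n,
               std::size_t k, std::size_t tile, unsigned numThreads) {
    if (tile == 0) tile = 1;
    if (numThreads == 0) numThreads = 1;

    const std::size_t rowTiles = (m + tile - 1) / tile;  // tiles over rows of A
    const std::size_t colTiles = (n + tile - 1) / tile;  // tiles over columns of B
    const std::size_t totalTiles = rowTiles * colTiles;

    // Shared work counter: each thread claims the next unfinished tile.
    std::atomic<std::size_t> next{0};

    auto worker = [&]() {
        for (;;) {
            const std::size_t t = next.fetch_add(1);
            if (t >= totalTiles) break;

            const std::size_t r0 = (t / colTiles) * tile;
            const std::size_t c0 = (t % colTiles) * tile;
            const std::size_t r1 = std::min(r0 + tile, m);
            const std::size_t c1 = std::min(c0 + tile, n);

            // Each tile writes a disjoint block of C, so no locking is needed.
            for (std::size_t i = r0; i < r1; ++i) {
                for (std::size_t j = c0; j < c1; ++j) {
                    float sum = 0.0f;
                    for (std::size_t p = 0; p < k; ++p) {
                        sum += a[i * k + p] * b[p * n + j];
                    }
                    c[i * n + j] = sum;
                }
            }
        }
    };

    std::vector<std::thread> threads;
    for (unsigned i = 0; i < numThreads; ++i) threads.emplace_back(worker);
    for (auto& th : threads) th.join();
}
```

Pulling tiles from a shared counter, instead of assigning a fixed range to each thread, keeps all cores busy when tiles take uneven time to finish.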
462de21ac2e1773b99aedee012adb374e476ae36 16-Nov-2016 Chih-Hung Hsieh <chh@google.com> Fix google-build-using-namespace warnings in cpu_ref.

* Remove "using namespace ..." statements.
* Replace them with using declarations of the required names.
* Enclose the C++ methods and static and extern "C" functions with
namespace android and renderscript.
* Keep global C++ functions as-is and add using declarations for them.

Bug: 32670901
Test: build with WITH_TIDY=1
Change-Id: I818de466e8786a6c4f9ce0cd8e0fe027f34d7fad
/frameworks/rs/cpu_ref/rsCpuIntrinsicBLAS.cpp
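
Illustrative sketch for the kind of change listed in 462de21: a "using namespace ..." directive is replaced with using declarations for only the names that are needed, and the code is enclosed in namespace android and renderscript. The function joinNames is hypothetical and does not appear in the actual file.

```cpp
#include <cstddef>
#include <string>
#include <vector>

namespace android {
namespace renderscript {

// Instead of "using namespace std;", import only the required names.
using std::string;
using std::vector;

// Hypothetical helper, used only to exercise the declarations above.
inline string joinNames(const vector<string>& names) {
    string out;
    for (std::size_t i = 0; i < names.size(); ++i) {
        if (i != 0) out += ",";
        out += names[i];
    }
    return out;
}

}  // namespace renderscript
}  // namespace android
```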
e4f999b761180a227864d97b172a42ca1d8c0df3 03-Feb-2016 Miao Wang <miaowang@google.com> Switch "transpose" for Matrix A & B, after gemmlowp change.

Change-Id: I26fcaebcca828388ef6fe53c6e9e4db8e60dd4d9
/frameworks/rs/cpu_ref/rsCpuIntrinsicBLAS.cpp
9195e5188cb0e72d874512de01e7e58f1f47e0b7 15-Sep-2015 Miao Wang <miaowang@google.com> Update IntrinsicBLAS call to gemmlowp after rebase.

Change-Id: Id084ac7b53ea0b3c61311b4f4c78312f397b7c5f
/frameworks/rs/cpu_ref/rsCpuIntrinsicBLAS.cpp
223231fe99c9c958de4a1c8723aff88cb667de52 17-Jul-2015 Miao Wang <miaowang@google.com> Update eight_bit_int_gemm call after gemmlowp rebase and provide
non-optimal path for armv7 without NEON.

- gemmlowp will handle the optimal path for x86, NEON and aarch64

Change-Id: I67ce4c1e5b3195017a3d46895a8ce096682bc172
/frameworks/rs/cpu_ref/rsCpuIntrinsicBLAS.cpp
e941f18202b9c9883ff81c63710f7faec5c988e4 15-Jul-2015 Miao Wang <miaowang@google.com> Making libRSSupport able to optionally bundle libblas(V8) through dlopen
and dlsym.

Change-Id: I3ade3ad2802f3b8e5fc5661319b98a6212e6d8a2
/frameworks/rs/cpu_ref/rsCpuIntrinsicBLAS.cpp
99d0e8130f5b4bb83d1a68d96496fa558e35193a 07-Jul-2015 Miao Wang <miaowang@google.com> Update the BNNM cpu reference implementation with NEON friendly
gemmlowp.

Change-Id: I5bcfd0fa988d8075e70272f277d7d7fab93d5fea
/frameworks/rs/cpu_ref/rsCpuIntrinsicBLAS.cpp
06deda3751a4a7358a7c7e03fbf1e4325fafb807 30-Jun-2015 Miao Wang <miaowang@google.com> update the offset type for BLAS.BNNM

bug: 22184114

Change-Id: I6ec212f8d5feb46fc9d0f97862b206978af1675b
(cherry picked from commit 22cb808b0dfc9bd514d2e19b302a97f8455b5731)
/frameworks/rs/cpu_ref/rsCpuIntrinsicBLAS.cpp
c060f1435e7b9405f3be8974417fa6f410f03753 14-May-2015 Stephen Hines <srhines@google.com> Use "override" instead of "virtual" when replacing methods.

Bug: 20306487

Change-Id: Ic83cb04cac153a7556f5d516e8f5ec88b5527b6f
/frameworks/rs/cpu_ref/rsCpuIntrinsicBLAS.cpp
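
Illustrative sketch for the style change in c060f14: a method that replaces a base-class virtual is marked "override" rather than repeating "virtual", so the compiler rejects it if the signature stops matching the base. The class names BaseIntrinsic and BlasIntrinsic are hypothetical and are not the classes in the actual file.

```cpp
#include <string>

// Hypothetical base class with one virtual method.
class BaseIntrinsic {
public:
    virtual ~BaseIntrinsic() = default;
    virtual std::string name() const { return "base"; }
};

// Hypothetical derived class.
class BlasIntrinsic : public BaseIntrinsic {
public:
    // "override" makes this a compile error if BaseIntrinsic::name()
    // changes its signature; a bare "virtual" would silently declare
    // a new, unrelated method instead.
    std::string name() const override { return "blas"; }
};
```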
e64c3d560186f976f9b8923e0f2c6ac3080913a2 07-May-2015 Miao Wang <miaowang@google.com> remove dead code (ALOGE) in rsCpuIntrinsicBLAS.cpp

bug: 21028875

Change-Id: Ia2d85a265f6e4a2617373f99b5c7bdc3810a7f24
/frameworks/rs/cpu_ref/rsCpuIntrinsicBLAS.cpp
08ef7b7f7977e9c991d8ba94a63860edcb88a3d9 30-Apr-2015 Miao Wang <miaowang@google.com> fix the CHER, CHPR, ZHER, ZHPR crash due to incorrect param order.

Change-Id: If91cbf969c75e01afc6d93b204bc8167180c9ef9
/frameworks/rs/cpu_ref/rsCpuIntrinsicBLAS.cpp
b75ba0fc7469d0bb4c1a6679664a846b3741792e 27-Apr-2015 Miao Wang <miaowang@google.com> fix RsBlas_xgemv and RsBlas_xgbmv crash. (typo)

Change-Id: Ia948afa2bc4af22f99323618738d5eb7d415ca97
/frameworks/rs/cpu_ref/rsCpuIntrinsicBLAS.cpp
2b999883f2f390ee43ed18317d77c810a0c6657b 13-Apr-2015 Tim Murray <timmurray@google.com> Rename BGEMM to BNNM. Modify layout of eight-bit GEMM-like intrinsic storage.

Change-Id: If4b1267dfd42d6dd65bedf20c0b674479eefab35
/frameworks/rs/cpu_ref/rsCpuIntrinsicBLAS.cpp
aff744561bea3c8a7a7d59c0cb8cd9438f6dcd1c 31-Mar-2015 Tim Murray <timmurray@google.com> Add eight-bit GEMM-like intrinsic.

Change-Id: I9b920900b4cb8b27e2ab27386d05f4175142d6b2
/frameworks/rs/cpu_ref/rsCpuIntrinsicBLAS.cpp
64c682b65cd04ac83b51251b40dca14423df351a 09-Jan-2015 Tim Murray <timmurray@google.com> Add BLAS to supported intrinsics.

Change-Id: I8e776b2ffdbac09a73924035eee2eca0a12facb3
/frameworks/rs/cpu_ref/rsCpuIntrinsicBLAS.cpp