History log of /frameworks/rs/cpu_ref/rsCpuBLASDispatch.h
Revision Date Author Comments
41ab8faaf0d90238d42d8e2bbb7177467c10b4f6 08-Sep-2016 Miao Wang <miaowang@google.com> Implement multi-thread CPU GEMM for BLAS Intrinsics

- Multi-thread GEMM utilizes existing RS thread pool on top of
Eigen.
- Large matrix-matrix multiplication is decomposed into multiple
tiled matrix-matrix multiplications. Each thread keeps picking up
unfinished tiles until none remain.
- The tiling applies to ONLY ONE dimension of each input matrix,
and whether to tile X or Y depends on the transpose of the matrix.
- The performance increase is proportional to the number of
available CPU cores, for sufficiently large matrices.
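
The scheme in the list above can be sketched as follows. This is a minimal illustration under stated assumptions, not the code from this change: it uses plain std::thread and a naive inner loop as stand-ins for the RS thread pool and Eigen, handles only the non-transposed row-major case, and tiles only the rows of A. The name tiledGemm and all of its parameters are hypothetical.

```cpp
#include <algorithm>
#include <atomic>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical helper: C (m x n) = A (m x k) * B (k x n), all row-major.
// Only one dimension is tiled: the rows of A (and therefore of C).
// Assumes tileRows > 0.
void tiledGemm(const std::vector<float>& a, const std::vector<float>& b,
               std::vector<float>& c, std::size_t m, std::size_t k,
               std::size_t n, std::size_t tileRows, unsigned numThreads) {
    const std::size_t numTiles = (m + tileRows - 1) / tileRows;
    std::atomic<std::size_t> nextTile{0};  // index of the next unclaimed tile

    // Each worker keeps claiming unfinished tiles until none remain.
    auto worker = [&]() {
        for (;;) {
            const std::size_t tile = nextTile.fetch_add(1);
            if (tile >= numTiles) break;
            const std::size_t rowBegin = tile * tileRows;
            const std::size_t rowEnd = std::min(m, rowBegin + tileRows);
            // Naive multiply of one row tile; a stand-in for the Eigen call.
            for (std::size_t i = rowBegin; i < rowEnd; ++i) {
                for (std::size_t j = 0; j < n; ++j) {
                    float sum = 0.0f;
                    for (std::size_t p = 0; p < k; ++p) {
                        sum += a[i * k + p] * b[p * n + j];
                    }
                    c[i * n + j] = sum;
                }
            }
        }
    };

    // Spawn numThreads - 1 helpers; the calling thread also does work.
    std::vector<std::thread> helpers;
    for (unsigned t = 1; t < numThreads; ++t) helpers.emplace_back(worker);
    worker();
    for (auto& th : helpers) th.join();
}
```

Tiles write to disjoint rows of C, so the shared atomic counter is the only synchronization needed. Because workers claim tiles on demand, a slower core simply processes fewer tiles.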

Test: CTS tests (rsblas) pass on Angler, Fugu and new devices.
Performance tested with RsBlasBenchmark and the RsNeuralNet demo
on Angler, Ryu, Seed, Shamu, Volantis, Fugu and new devices,
showing a perf gain of roughly 70% (Volantis, 2 cores) to 400+% (Angler, 8 cores).

Change-Id: If96f4119fd34d5d9d98a2542801495e7ffe577ae
4fcef498e1508f2f3a0b9332ea89b0064c2dc6e9 01-Jul-2016 Chih-Hung Hsieh <chh@google.com> Fix clang-tidy warnings in frameworks/rs.

* Declare explicit conversion constructors.
* Add parentheses around macro arguments beside operators.
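
The two kinds of fix listed above might look like the following sketch. The macro and class names are hypothetical and are not taken from frameworks/rs; they only illustrate the pattern.

```cpp
#include <cstddef>

// Without parentheses, "#define SCALE_BY_TWO(x) x * 2" would expand
// SCALE_BY_TWO(1 + 2) to 1 + 2 * 2, which is 5 instead of 6.
// Parenthesizing the argument (and the whole body) avoids that.
#define SCALE_BY_TWO(x) ((x) * 2)

// Hypothetical class: "explicit" stops a bare std::size_t from being
// silently converted to a TileCount at call sites.
class TileCount {
public:
    explicit TileCount(std::size_t count) : mCount(count) {}
    std::size_t get() const { return mCount; }

private:
    std::size_t mCount;
};
```

With the explicit constructor, `TileCount n = 4;` no longer compiles, while `TileCount n(4);` still does.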

Bug: 28341362
Bug: 28705665
Change-Id: I2eef68ab0edd33f765bcc5dd73f6baf25b6f7585
Test: build with clang-tidy
e941f18202b9c9883ff81c63710f7faec5c988e4 15-Jul-2015 Miao Wang <miaowang@google.com> Make libRSSupport able to optionally bundle libblas (V8) through dlopen
and dlsym.
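
Optional loading through dlopen and dlsym generally follows the pattern sketched below. This is an assumption-laden illustration, not the dispatch code from this change: the struct BlasDispatch, the function loadBlas, and the ScaleFn signature are all hypothetical, and real BLAS entry points take many more arguments.

```cpp
#include <dlfcn.h>

#include <cstdio>

// Hypothetical function signature, used only for illustration.
using ScaleFn = float (*)(float);

// Hypothetical dispatch table holding the library handle and one entry point.
struct BlasDispatch {
    void* handle = nullptr;
    ScaleFn scale = nullptr;
};

// Tries to load libPath and resolve symbol. Returns false, leaving the
// table empty, if the library or the symbol is unavailable, so the caller
// can fall back to another code path.
bool loadBlas(const char* libPath, const char* symbol, BlasDispatch* out) {
    out->handle = dlopen(libPath, RTLD_LAZY | RTLD_LOCAL);
    if (out->handle == nullptr) {
        const char* err = dlerror();
        std::fprintf(stderr, "dlopen failed: %s\n", err ? err : "unknown");
        return false;
    }
    out->scale = reinterpret_cast<ScaleFn>(dlsym(out->handle, symbol));
    if (out->scale == nullptr) {
        dlclose(out->handle);
        out->handle = nullptr;
        return false;
    }
    return true;
}
```

Resolving symbols at run time means the binary carries no link-time dependency on the library, which is what makes bundling it optional. On older glibc versions, this needs to be linked with -ldl.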

Change-Id: I3ade3ad2802f3b8e5fc5661319b98a6212e6d8a2