Cross Reference: /frameworks/rs/cpu_ref/rsCpuBLASDispatch.h

History log of /frameworks/rs/cpu_ref/rsCpuBLASDispatch.h
Revision	Date	Author	Comments
41ab8faaf0d90238d42d8e2bbb7177467c10b4f6	08-Sep-2016	Miao Wang <miaowang@google.com>	Implement multi-thread CPU GEMM for BLAS Intrinsics - Multi-thread GEMM utilizes the existing RS thread pool on top of Eigen. - Large matrix-matrix multiplication is decomposed into multiple tiled matrix-matrix multiplications. Each thread iterates over the unfinished work. - The tiling applies to ONLY ONE dimension of each input matrix, and whether to tile X or Y depends on the transpose of the matrix. - The performance increase is proportional to the number of available CPU cores, for sufficiently large matrices. Test: CTS test (rsblas) passes on Angler, Fugu and new devices. Performance test with RsBlasBenchmark and RsNeuralNet demo on Angler, Ryu, Seed, Shamu, Volantis, Fugu and new devices, showing roughly 70% (Volantis, 2 cores) to 400+% (Angler, 8 cores) perf gain. Change-Id: If96f4119fd34d5d9d98a2542801495e7ffe577ae /frameworks/rs/cpu_ref/rsCpuBLASDispatch.h
4fcef498e1508f2f3a0b9332ea89b0064c2dc6e9	01-Jul-2016	Chih-Hung Hsieh <chh@google.com>	Fix clang-tidy warnings in frameworks/rs. * Declare explicit conversion constructors. * Add parentheses around macro arguments beside operators. Bug: 28341362 Bug: 28705665 Change-Id: I2eef68ab0edd33f765bcc5dd73f6baf25b6f7585 Test: build with clang-tidy /frameworks/rs/cpu_ref/rsCpuBLASDispatch.h
e941f18202b9c9883ff81c63710f7faec5c988e4	15-Jul-2015	Miao Wang <miaowang@google.com>	Making libRSSupport able to optionally bundle libblas(V8) through dlopen and dlsym. Change-Id: I3ade3ad2802f3b8e5fc5661319b98a6212e6d8a2 /frameworks/rs/cpu_ref/rsCpuBLASDispatch.h
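
The tiling scheme described in revision 41ab8fa (split one dimension of the multiplication into tiles and let each thread keep claiming unfinished tiles) can be illustrated with a minimal sketch. This is not the RenderScript code: it assumes std::thread in place of the RS thread pool and a naive triple loop in place of the Eigen kernel, it ignores transposes, and the name tiledGemm and its parameters are hypothetical.

```cpp
#include <algorithm>
#include <atomic>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical helper, not the RenderScript implementation.
// Computes C = A * B for row-major float matrices:
// A is m x k, B is k x n, C is m x n (c must already hold m * n elements).
// Only one dimension is tiled: the rows of A (and therefore of C).
// B is shared read-only by all workers.
void tiledGemm(const std::vector<float>& a, const std::vector<float>& b,
               std::vector<float>& c, std::size_t m, std::size_t k,
               std::size_t n, std::size_t tileRows, unsigned numThreads) {
    if (tileRows == 0) tileRows = 1;
    if (numThreads == 0) numThreads = 1;

    const std::size_t numTiles = (m + tileRows - 1) / tileRows;
    std::atomic<std::size_t> nextTile{0};  // index of the next unclaimed tile

    auto worker = [&]() {
        // Each worker keeps claiming the next unfinished tile until none remain.
        for (;;) {
            const std::size_t tile = nextTile.fetch_add(1);
            if (tile >= numTiles) break;

            const std::size_t rowBegin = tile * tileRows;
            const std::size_t rowEnd = std::min(rowBegin + tileRows, m);

            // Naive per-tile multiplication; tiles write disjoint rows of C,
            // so no locking is needed on the output.
            for (std::size_t i = rowBegin; i < rowEnd; ++i) {
                for (std::size_t j = 0; j < n; ++j) {
                    float sum = 0.0f;
                    for (std::size_t p = 0; p < k; ++p) {
                        sum += a[i * k + p] * b[p * n + j];
                    }
                    c[i * n + j] = sum;
                }
            }
        }
    };

    std::vector<std::thread> pool;
    for (unsigned t = 0; t < numThreads; ++t) pool.emplace_back(worker);
    for (auto& th : pool) th.join();
}
```

Claiming tiles through a shared atomic counter means a fast thread simply takes more tiles than a slow one, which is one way to read "each thread iterates over the unfinished work". Because every tile owns a disjoint set of output rows, the result does not depend on the thread count or tile size.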