7974fc03e11f3a8dd40f794f3b33b4889483090c |
|
09-Feb-2017 |
Rahul Chaudhry <rahulchaudhry@google.com> |
frameworks/rs: fix typos and clang-tidy warnings This change fixes a few typos and clang-tidy warnings related to the "llvm-namespace-comment" checks. Bug: 26936282 Test: WITH_TIDY=1 WITH_TIDY_CHECKS="llvm-namespace-comment" mm Change-Id: Ic65182e5b4999fbd48d6a8ad7172e4bfeeb541f4
/frameworks/rs/cpu_ref/rsCpuIntrinsic.h
|
67b5f524ef188efb77f0dcff6e6bf483827e2f82 |
|
08-Sep-2016 |
Miao Wang <miaowang@google.com> |
Implement multi-thread CPU GEMM for BLAS Intrinsics - Multi-thread GEMM utilizes the existing RS thread pool on top of Eigen. - Large matrix-matrix multiplication is decomposed into multiple tiled matrix-matrix multiplications. Each thread iterates on the unfinished work. - The tiling applies to ONLY ONE dimension of each input matrix, and whether to tile X or Y depends on the transpose of the matrix. - The performance increase is proportional to the number of available CPU cores, for sufficiently large matrices. Test: CTS tests (rsblas) pass on Angler, Fugu and new devices. Performance test with RsBlasBenchmark and RsNeuralNet demo on Angler, Ryu, Seed, Shamu, Volantis, Fugu and new devices, showing a perf gain of roughly 70% (Volantis, 2 cores) to 400+% (Angler, 8 cores). Change-Id: If96f4119fd34d5d9d98a2542801495e7ffe577ae (cherry picked from commit 41ab8faaf0d90238d42d8e2bbb7177467c10b4f6)
/frameworks/rs/cpu_ref/rsCpuIntrinsic.h
|
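The tiling scheme in the GEMM entry above can be illustrated with a minimal sketch. It is not the RenderScript implementation: it uses plain `std::thread` rather than the RS thread pool, a naive inner loop rather than Eigen, and assumes row-major, non-transposed inputs. The name `tiledGemm` and all of its parameters are hypothetical. Only the rows of A (and therefore of C) are tiled, and each worker keeps claiming the next unfinished tile from a shared counter.

```cpp
#include <algorithm>
#include <atomic>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical helper: C (m x n) = A (m x k) * B (k x n), all row-major.
// Only one dimension is tiled: the rows of A, which are also the rows of C.
// Assumes tileRows > 0 and numThreads > 0.
void tiledGemm(const std::vector<float>& a, const std::vector<float>& b,
               std::vector<float>& c, std::size_t m, std::size_t k,
               std::size_t n, std::size_t tileRows, unsigned numThreads) {
    const std::size_t numTiles = (m + tileRows - 1) / tileRows;
    std::atomic<std::size_t> nextTile{0};  // index of the next unclaimed tile

    auto worker = [&]() {
        for (;;) {
            // Claim one tile; stop once every tile has been handed out.
            const std::size_t tile = nextTile.fetch_add(1);
            if (tile >= numTiles) break;
            const std::size_t rowBegin = tile * tileRows;
            const std::size_t rowEnd = std::min(m, rowBegin + tileRows);
            // Each tile writes a disjoint set of rows of C, so no locking is needed.
            for (std::size_t i = rowBegin; i < rowEnd; ++i) {
                for (std::size_t j = 0; j < n; ++j) {
                    float sum = 0.0f;
                    for (std::size_t p = 0; p < k; ++p) {
                        sum += a[i * k + p] * b[p * n + j];
                    }
                    c[i * n + j] = sum;
                }
            }
        }
    };

    std::vector<std::thread> threads;
    for (unsigned t = 0; t < numThreads; ++t) threads.emplace_back(worker);
    for (auto& th : threads) th.join();
}
```

Because workers pull tiles dynamically, a slow core simply processes fewer tiles, which is one plausible reading of "each thread iterates on the unfinished work".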
14ce007a633b10e3b9a3fae29d8f53a7e8c9b59f |
|
31-Jul-2015 |
Matt Wala <wala@google.com> |
Add a basic implementation of the reduce kernel API to the CPU reference implementation. Bug: 22631253 For now, this just runs a serial reduction on one thread. Change-Id: I34c96d24bb6f44274de72bb53160abcf79d143b0
/frameworks/rs/cpu_ref/rsCpuIntrinsic.h
|
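A serial, single-threaded reduction of the kind the entry above mentions can be sketched as a simple left fold. This is an illustration only, not the reduce kernel API itself; `serialReduce` and its parameters are hypothetical names.

```cpp
#include <cstddef>

// Hypothetical helper: folds `count` elements into one value on the calling
// thread, applying `combine` from left to right starting from `init`.
template <typename T, typename Fn>
T serialReduce(const T* data, std::size_t count, T init, Fn combine) {
    T accum = init;
    for (std::size_t i = 0; i < count; ++i) {
        accum = combine(accum, data[i]);
    }
    return accum;
}
```

A parallel version would have to split the input and merge partial results, which requires `combine` to be associative; the serial form has no such requirement.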
c060f1435e7b9405f3be8974417fa6f410f03753 |
|
14-May-2015 |
Stephen Hines <srhines@google.com> |
Use "override" instead of "virtual" when replacing methods. Bug: 20306487 Change-Id: Ic83cb04cac153a7556f5d516e8f5ec88b5527b6f
/frameworks/rs/cpu_ref/rsCpuIntrinsic.h
|
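The benefit of the change above is that `override` makes the compiler verify that a method really replaces a base-class virtual. A minimal sketch, using hypothetical class names that do not appear in the RenderScript sources:

```cpp
// Hypothetical base class standing in for an intrinsic interface.
struct IntrinsicBase {
    virtual ~IntrinsicBase() = default;
    virtual int invoke(int x) const { return x; }
};

// `override` tells the compiler this method must match a virtual in the base.
// If the signature drifted (for example, dropping `const`), compilation would
// fail instead of silently declaring a new, unrelated virtual method.
struct DoublingIntrinsic : IntrinsicBase {
    int invoke(int x) const override { return x * 2; }
};
```

With a plain `virtual` on the derived method, a mismatched signature would compile and calls through the base class would keep reaching the base implementation.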
f37121300217d3b39ab66dd9c8881bcbcad932df |
|
17-Jul-2014 |
Chris Wailes <chriswailes@google.com> |
Collapse code paths for single- and multi-input kernels. This patch simplifies the RenderScript driver and CPU reference implementation by removing the distinction between single- and multi-input kernels in many places. The distinction is maintained in some places due to the need to maintain backwards compatibility. This permits the deletion of some functions and struct members that are no longer needed. Several related functions were also cleaned up. Change-Id: Id70a223ea5e3aa2b0b935b2b7f9af933339ae8a4
/frameworks/rs/cpu_ref/rsCpuIntrinsic.h
|
4b2bea3dc20865f3a198797702e19912a6a2171c |
|
13-Aug-2014 |
Stephen Hines <srhines@google.com> |
Revert "Collapse code paths for single- and multi-input kernels." This reverts commit 818cfa034e257c7bb48356257f5cb67334e19aa6. Change-Id: I59f39f52e6c8f60bb01cbcb8ccf2215eaf46a57f
/frameworks/rs/cpu_ref/rsCpuIntrinsic.h
|
818cfa034e257c7bb48356257f5cb67334e19aa6 |
|
17-Jul-2014 |
Chris Wailes <chriswailes@google.com> |
Collapse code paths for single- and multi-input kernels. This patch simplifies the RenderScript driver and CPU reference implementation by removing the distinction between single- and multi-input kernels in many places. The distinction is maintained in some places due to the need to maintain backwards compatibility. This permits the deletion of some functions and struct members that are no longer needed. Several related functions were also cleaned up. Change-Id: I77e4b155cc7ca1581b05bf901c70ae53a9ff0b12
/frameworks/rs/cpu_ref/rsCpuIntrinsic.h
|
c5a20170784a6a44ee1de1a754ca8c7175b78a6d |
|
09-Jul-2014 |
Stephen Hines <srhines@google.com> |
Fix build break for size_t vs. uint32_t difference. Change-Id: I11b9592214c4fa57ef62f42fd086a5a3df33abbf
/frameworks/rs/cpu_ref/rsCpuIntrinsic.h
|
4b3c34e6833e39bc89c2128002806b654b8e623d |
|
11-Jun-2014 |
Chris Wailes <chriswailes@google.com> |
Adds support for multi-input kernels to Frameworks/RS. This patch modifies Frameworks/RS in the following ways: * Adjusted the data-layout of the C/C++ version of RsForEachStubParamStruct to accommodate a pointer to an array of input allocations and a pointer to an array of stride sizes for each of these allocations. * Adds a new code path for Java code to pass multiple allocations to a RS kernel. * Packs base pointers and step values for multi-input kernels into the new RsForEachStubParamStruct members. Change-Id: I46d2834c37075b2a2407fd8b010546818a4540d1
/frameworks/rs/cpu_ref/rsCpuIntrinsic.h
|
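The layout described in the entry above (a pointer to an array of input base pointers plus a pointer to an array of per-input strides) can be sketched as follows. This is not the actual `RsForEachStubParamStruct`; the struct `MultiInputParams`, its field names, and `sumAt` are hypothetical and only illustrate how a kernel stub might address element `x` of every input.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical parameter block for a multi-input kernel.
struct MultiInputParams {
    const uint8_t* const* inputs;  // base pointer of each input allocation
    const uint32_t* strides;       // byte step between elements, per input
    uint32_t inputCount;           // number of entries in both arrays
};

// Reads element `x` from every input (assumed here to hold floats) and
// returns the sum, using base pointer + x * stride for each input.
float sumAt(const MultiInputParams& p, uint32_t x) {
    float total = 0.0f;
    for (uint32_t i = 0; i < p.inputCount; ++i) {
        float value;
        const uint8_t* src = p.inputs[i] + static_cast<std::size_t>(x) * p.strides[i];
        std::memcpy(&value, src, sizeof(value));  // avoids aliasing issues
        total += value;
    }
    return total;
}
```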
ac8d146a41f18afad5314ac8af440d6aedbe20bf |
|
25-Jun-2014 |
Stephen Hines <srhines@google.com> |
Switch the dimensions array to use uint32_t instead of size_t. size_t isn't safe, since we pack/unpack the array as a 32-bit int array, but that is the wrong type for 64-bit. Switching to uint32_t is better, since we only support 1 dimension today, and won't need many more than that even for complex cases in the future. Change-Id: Ie0dda264a9398b0e385e0f9ee0a91cda08325dbc
/frameworks/rs/cpu_ref/rsCpuIntrinsic.h
|
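The reasoning in the entry above is that an array packed and unpacked as 32-bit integers should be declared with a fixed 32-bit element type, since `size_t` is 8 bytes on typical 64-bit targets. A minimal round-trip sketch, with hypothetical helper names `packDims` and `unpackDims` that are not taken from the RenderScript sources:

```cpp
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical helper: serializes dimensions as 4 bytes each. Because the
// element type is uint32_t, the byte count is the same on 32- and 64-bit builds.
std::vector<uint8_t> packDims(const std::vector<uint32_t>& dims) {
    std::vector<uint8_t> bytes(dims.size() * sizeof(uint32_t));
    if (!dims.empty()) {
        std::memcpy(bytes.data(), dims.data(), bytes.size());
    }
    return bytes;
}

// Hypothetical helper: reverses packDims. Had the dimensions been stored as
// size_t, a 64-bit build would produce 8 bytes per entry and this 4-byte
// reader would misinterpret the buffer.
std::vector<uint32_t> unpackDims(const std::vector<uint8_t>& bytes) {
    std::vector<uint32_t> dims(bytes.size() / sizeof(uint32_t));
    if (!dims.empty()) {
        std::memcpy(dims.data(), bytes.data(), dims.size() * sizeof(uint32_t));
    }
    return dims;
}
```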
2282e2816ac5f5de53f9bd4f3ecbdfd6d756d120 |
|
18-Jun-2013 |
Jason Sams <jsams@google.com> |
add histogram intrinsic Change-Id: I42c297bfe116ea29cf015680fcc2143ff4cc95d2
/frameworks/rs/cpu_ref/rsCpuIntrinsic.h
|
c905efd76fdcc1b8846b229bf7d991d185a7b4b7 |
|
27-Nov-2012 |
Jason Sams <jsams@google.com> |
Cleanup pass + implement blur uchar Change-Id: Ib7f1c5218663b468a3c11daa2c3373ae132145ac Conflicts: cpu_ref/rsCpuIntrinsicBlend.cpp
/frameworks/rs/cpu_ref/rsCpuIntrinsic.h
|
709a0978ae141198018ca9769f8d96292a8928e6 |
|
16-Nov-2012 |
Jason Sams <jsams@google.com> |
Separate CPU driver impl from reference driver. Change-Id: Ifb484edda665959b81d7b1f890d108bfa20a535d
/frameworks/rs/cpu_ref/rsCpuIntrinsic.h
|