History log of /frameworks/rs/cpu_ref/rsCpuCore.cpp
Revision Date Author Comments
ae2ec3febedfc29376b9104413fb4042028f1265 01-Jun-2016 David Gross <dgross@google.com> Delete simple reduction implementation.

Bug: 27298560
Change-Id: I8c3d568e98aaf0b7d86881c985d13ed5b8e95338
633beab5de5ffe4c5068cf1c7a1c0bd09ee9f195 20-Apr-2016 Yang Ni <yangni@google.com> Remove unnecessary logcat message

Bug: 28281435
Change-Id: If0865b20576322845cd23743916c4589ea199287
415a249300cb9ca3f560814bfde52c6796fed1f7 07-Apr-2016 David Gross <dgross@google.com> Multithreaded execution of general reduction kernels over 2D and 3D iteration spaces.

Includes fix for bug in reduce_backward fz3 kernel (combiner function).

Bug: 27299475

Change-Id: I48ed2f99f53dfc786a85e04dc0206cb3ebe98034
6760f7ba7934ddd51938a8d0206fc41c2a7cb419 01-Apr-2016 David Gross <dgross@google.com> Guard general reduction logging output under property "debug.rs.reduce".

Bug: 27299475
Change-Id: I5be634fe38d20b9fe6867ad3c0c0b982442b52fd
10adb0c2029f112b5738228617d5645f6ecea0c5 29-Mar-2016 David Gross <dgross@google.com> Multithreaded execution of certain general reduction kernels; reduction test overhaul.

A reduction kernel is eligible for multithreaded execution if it has a
combiner function and it is launched over a 1D iteration space.

Note: Properties debug.rs.reduce-accum and debug.rs.reduce-split-accum
are added for debugging multithreaded reduction.

The following changes are made to reduction tests in RsTest:
- Overhaul the test framework: it is now data-driven, can execute the
same test multiple times with different seeds and input sizes, and
features separate sets of quick correctness tests, full correctness
tests, and performance tests. (Performance tests are not run by default.)
- Report timing information for test execution.
- Report more information for fz* kernel testing.
- Remove dp kernel testing -- this involved floating-point arithmetic
which is not guaranteed to produce identical results between java
and rs or for different rs multithreaded executions.
- Add sumgcd kernel testing. This is intended to be representative of
a compute-heavy kernel.
- findMinAndMax kernel testing must compare cell value not cell index
-- two or more cells might have the same min or max value, and java
and various rs multithreaded executions are not guaranteed to find
the same cell.
- Fix bug in findMinAndMax kernel's combiner function. (It behaved
incorrectly when operating on an accumulator datum that has been
initialized but never passed to the accumulator function.)
- RsTest now requests largeHeap.

Bug: 27299475
Change-Id: I58f99c21389dbae5c8e3ad85d98700dc165664bb
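
Illustration for the commit above: a minimal sketch of why a combiner function makes a 1D reduction splittable, and why the combiner has to cope with an accumulator that was initialized but never passed to the accumulator function. All names here (MinMax, accumulate, combine, reduceInChunks) are hypothetical and not taken from the RenderScript sources. The chunks are processed sequentially for simplicity; in a multithreaded run each chunk could be handled by its own thread.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical accumulator for a findMinAndMax-style reduction over values.
struct MinMax {
    int min = 0;
    int max = 0;
    bool seen = false;  // stays false until accumulate() has run at least once
};

// Accumulator function: folds one input value into an accumulator.
void accumulate(MinMax& acc, int v) {
    if (!acc.seen) {
        acc.min = v;
        acc.max = v;
        acc.seen = true;
        return;
    }
    acc.min = std::min(acc.min, v);
    acc.max = std::max(acc.max, v);
}

// Combiner function: merges `other` into `acc`. Either side may be an
// accumulator that was initialized but never accumulated into (for example,
// a chunk that received no elements), so that case is handled explicitly.
void combine(MinMax& acc, const MinMax& other) {
    if (!other.seen) {
        return;  // nothing to merge
    }
    if (!acc.seen) {
        acc = other;
        return;
    }
    acc.min = std::min(acc.min, other.min);
    acc.max = std::max(acc.max, other.max);
}

// Splits a 1D iteration space into `chunks` pieces, reduces each piece into
// its own accumulator, then merges the partial results with the combiner.
MinMax reduceInChunks(const std::vector<int>& data, std::size_t chunks) {
    if (chunks == 0) {
        chunks = 1;
    }
    const std::size_t chunkSize = (data.size() + chunks - 1) / chunks;
    MinMax result;
    for (std::size_t c = 0; c < chunks; ++c) {
        const std::size_t begin = std::min(c * chunkSize, data.size());
        const std::size_t end = std::min(begin + chunkSize, data.size());
        MinMax partial;  // may stay "never accumulated" if the chunk is empty
        for (std::size_t i = begin; i < end; ++i) {
            accumulate(partial, data[i]);
        }
        combine(result, partial);
    }
    return result;
}
```

The sketch compares values rather than cell indices, in line with the test change described above: when several cells share the minimum or maximum, different splits may visit them in a different order.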
6c1876bbef1b2c89975dce91230a168bd2d2ce4c 15-Jan-2016 David Gross <dgross@google.com> Support for general reduction kernels.

Requires coordinated change in frameworks/base.

Requires coordinated change in frameworks/compile/libbcc in order
for RsTest to run.

At present, general reduction kernels are run single-threaded.

Also: Remove dead struct field MTLaunchStructForEach::sig.

Bug: 23535724
Change-Id: Ice17ccf20a902f8a106eaa62ec071d46e3c0ad8c
14ce007a633b10e3b9a3fae29d8f53a7e8c9b59f 31-Jul-2015 Matt Wala <wala@google.com> Add a basic implementation of the reduce kernel API to the CPU
reference implementation.

Bug: 22631253

For now, this just runs a serial reduction on one thread.

Change-Id: I34c96d24bb6f44274de72bb53160abcf79d143b0
5d70cb591d78d62d10839a52302ec9087c6f3350 17-Jul-2015 Miao Wang <miaowang@google.com> Fix GetCpuInfo() routine to correctly check the cpuinfo file to make
sure we don't miss SIMD path if there is one.

Change-Id: I8c8841ba9924ee28ae56be8b3c66c50b5badf796
11fd9ec1ab8dfa7ae45c6edeea48dddc4633efea 11-Jul-2015 Matt Wala <wala@google.com> CPU ref: Fix potential buffer over-read / uninitialized memory access.

GetCpuInfo() was reading /proc/cpuinfo into a string without properly
null terminating the result. The resulting unterminated string was
being passed to strstr().

Change the code to read the file with fgets(), which ensures the
result is null terminated. Also, document the GetCpuInfo() function
and the global variable that it sets.

Change-Id: I041331fdc25d79217ff7c1bf36a4aff2be8e0192
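
Illustration for the commit above: a minimal sketch of scanning a text file with fgets() so that every buffer handed to strstr() is null terminated. The helper name fileMentionsFeature and the 4096-byte line buffer are assumptions made for this example; this is not the actual GetCpuInfo() code.

```cpp
#include <cassert>
#include <cstdio>
#include <cstring>

// Hypothetical helper: returns true if any line of the file at `path`
// contains the substring `feature`. Returns false if the file cannot be
// opened.
bool fileMentionsFeature(const char* path, const char* feature) {
    assert(path != nullptr && feature != nullptr);
    FILE* f = std::fopen(path, "r");
    if (f == nullptr) {
        return false;
    }
    char line[4096];
    bool found = false;
    // fgets() stores at most sizeof(line) - 1 characters and always appends
    // a terminating '\0', so strstr() cannot read past the end of the buffer.
    // A line longer than the buffer is delivered in several pieces, so a
    // match that straddles two pieces would be missed by this sketch.
    while (std::fgets(line, sizeof(line), f) != nullptr) {
        if (std::strstr(line, feature) != nullptr) {
            found = true;
            break;
        }
    }
    std::fclose(f);
    return found;
}
```

A caller could, for example, pass "/proc/cpuinfo" and a feature name such as "neon"; which strings the real routine looks for is not stated in the commit message.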
4b5b295f6437c580083997c8b7cbaad3e5975df8 18-Jun-2015 Miao Wang <miaowang@google.com> Make support lib CPU driver able to access IntrinsicBLAS

bug: 21902810

(cherry picked from commit 0d6b6f51b9d1f98478a32a270fa2304f0839ca8c)

Change-Id: I1f15306f1cf333d6932127ae62e9af8386ecb3fe
b043df0676fef226336deb3a00ead2f31e02343f 29-May-2015 David Gross <dgross@google.com> Remove dead uses of RSCompilerDriver and of compiler callbacks.

Change-Id: Ibe8725074724b75e35c25a404daaba07ffbca2ab
8409d6414dd4a42aa59779fcfe9fce18648cb135 29-Apr-2015 Stephen Hines <srhines@google.com> Add RSGlobalInfoPass information to RS driver.

Bug: 20306487

This change enables vendor drivers to configure support for including
additional information about global variables in the emitted CPU code.
This information includes the number of total global variables, the
names of these variables, the addresses of these variables and the
sizes of these variables. The driver can also select whether the
information includes constant (immutable) globals or not.

The reference driver defaults to embedding information about each of
the existing, non-constant global variables.

Change-Id: I1e55fc3f08e518f04eeee3e4f9dc7b6ea3b80d7c
247f8ee57196d6cf3264e6f7505f53e8f8a7860d 19-Apr-2015 Logan Chien <tzuhsiang.chien@gmail.com> Code cleanup: Remove unused typedefs and declarations.

Change-Id: I48dafb2bc1dc335a52b289db2981397251f673c8
a9139c724f8312b3634d213599f2d6b3b2505db2 17-Apr-2015 Jason Sams <jsams@google.com> Fix allocation-less launches.

Change-Id: I6d6b46c55f3e88a810ebe51def3ebaccb1fd3fa2
59b35f29cc576243a322bf88bc16063a9810da55 17-Mar-2015 Jason Sams <jsams@google.com> Fix issues with >2D launches

mtls->fep was being passed to setup in place of per-thread fep.

Change-Id: Ic26154fcf47dc7bc70cec43f0daf023fb83dfd78
b0abb140ac51b93d1a85aadaa63fe057f2d29850 12-Mar-2015 David Gross <dgross@google.com> Pass RsExpandKernelDriverInfo not RsExpandKernelParams.

Which is to say: retire RsExpandKernelParams and pass RsExpandKernelDriverInfo
directly to kernel wrapper functions instead.

Requires related change in frameworks/compile/libbcc.

Change-Id: I453f45ec18f389e88e27fcfa57ddf245d077cb98
64c682b65cd04ac83b51251b40dca14423df351a 09-Jan-2015 Tim Murray <timmurray@google.com> Add BLAS to supported intrinsics.

Change-Id: I8e776b2ffdbac09a73924035eee2eca0a12facb3
bf2111d3b3de310932099514f06924e48fa1d7b2 27-Jan-2015 Jason Sams <jsams@google.com> add array launch support.

Change-Id: I66cd89b5b44eafa92f391708a06464cd7cdde3ed
c0d68470b978a79ce024fde56f23ea3690603ccd 20-Jan-2015 Jason Sams <jsams@google.com> Cleanup of ForEachParams in cpu ref

Change-Id: I8cc51915b2a605c240d98e3010619b741a13bae2
dc0d8f7c0f1f43f25c34fbc04656ad578f6e953b 03-Dec-2014 Pirama Arumuga Nainar <pirama@google.com> Skip linkloader, use shared object files

Bug: 18322681

- In rsCpuScript, if property rs.skip.linkloader is set, look for a .so
file in the cache directory and load it. If it is not available, use
bcc to generate relocatable object file and link it to a .so using
ld.mc. Use the embedded symbols in .rs.info and follow steps similar
to the compatibility library to invoke script functions or access
script variables.
- Add rs* symbols like rsGetAllocation to libRSCpuRef (ala
libRSSupport). Do necessary changes to argument types to get mangled
names correct.
- Make 64-bit version of rsSetObject take two pointers instead of a
pointer and a large object. rsIsObject takes a pointer instead of a
large object. Otherwise, we get failures in x86_64 due to calling
convention mismatch. To match the function names in the shared object
path, define these functions as 'extern "C"' with their mangled names.
- Add stubbed Math functions from rsCpuRuntimeMath and
rsCpuRuntimeMathFuncs into libRSCpuRef.so.
- Coalesce separate #ifdef paths in libRSCpuRef. Function parameters
for runtime callbacks and bcc plugin are needed in the
non-compatibility path, but take default NULL arguments. This patch
introduces these parameters into the compatibility path as well, and
passes default NULL arguments.

Change-Id: I8a853350e39d30b4d852c30e4b5da5a75a2f2820
1ffd86b448d78366190c540f98f8b6d641cdb6cf 07-Jan-2015 Yang Ni <yangni@google.com> New Script Group API: runtime and cpu driver support.

Change-Id: I9c612cf8874aabaf0ca7d1640567464c71ed3070
77d57a305f4134e78ebc91869011c4009988104e 24-Oct-2014 Jason Sams <jsams@google.com> Fix query for CPU count.

Some devices report fewer processors online than
configured. Always get the configured (higher) number.

bug 18108290

Change-Id: Ic6202e05ad8c4686dd79795f880baf5429674d70
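
Illustration for the commit above: a minimal sketch of the difference between the configured and the online processor counts, assuming the counts are queried through sysconf() on Linux or Android. The helper names are hypothetical, and the commit message does not say which call the driver actually uses.

```cpp
#include <unistd.h>

// Hypothetical helper: number of processors configured on the system.
// Cores that are temporarily offline (for example, hot-unplugged to save
// power) are still counted here. Falls back to 1 if the query fails.
long configuredCpuCount() {
    const long n = sysconf(_SC_NPROCESSORS_CONF);
    return n < 1 ? 1 : n;
}

// Hypothetical helper: number of processors currently online. On some
// devices this can be lower than the configured count at the moment of the
// query. Falls back to 1 if the query fails.
long onlineCpuCount() {
    const long n = sysconf(_SC_NPROCESSORS_ONLN);
    return n < 1 ? 1 : n;
}
```

Sizing a worker pool from configuredCpuCount() avoids under-provisioning threads when some cores happen to be offline at startup.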
fe0922575f26af84ee33429626f36049410cb7b6 09-Oct-2014 Jason Sams <jsams@google.com> Revert "RS: Add VP9 LoopFilter Intrinsic"

This reverts commit 6fc3e12b8912458cb4adcfd32e2f53d76b0cc737.

Change-Id: I4eb50620548805344bd45669fda1af81128195f5
44bef6fba6244292b751387f3d6c31cca96c28ad 12-Aug-2014 Chris Wailes <chriswailes@google.com> Replace NULL macros with nullptr literals.

Change-Id: I918c40879aa547438f77e7d1a95fa2aa33bec398
9ed79105cc6a8dbfaf959875249f36022cc2c798 26-Jul-2014 Chris Wailes <chriswailes@google.com> Remove the instep parameter.

This patch removes the instep parameter from calls to expanded kernels and
from the CPU reference implementation intrinsics.

Change-Id: I059db548a57702c576963f6b17a002b2ee393cdb
f37121300217d3b39ab66dd9c8881bcbcad932df 17-Jul-2014 Chris Wailes <chriswailes@google.com> Collapse code paths for single- and multi-input kernels.

This patch simplifies the RenderScript driver and CPU reference implementation
by removing the distinction between single- and multi-input kernels in many
places. The distinction is maintained in some places due to the need to
maintain backwards compatibility. This permits the deletion of some functions
and struct members that are no longer needed. Several related functions were
also cleaned up.

Change-Id: Id70a223ea5e3aa2b0b935b2b7f9af933339ae8a4
4b2bea3dc20865f3a198797702e19912a6a2171c 13-Aug-2014 Stephen Hines <srhines@google.com> Revert "Collapse code paths for single- and multi-input kernels."

This reverts commit 818cfa034e257c7bb48356257f5cb67334e19aa6.

Change-Id: I59f39f52e6c8f60bb01cbcb8ccf2215eaf46a57f
818cfa034e257c7bb48356257f5cb67334e19aa6 17-Jul-2014 Chris Wailes <chriswailes@google.com> Collapse code paths for single- and multi-input kernels.

This patch simplifies the RenderScript driver and CPU reference implementation
by removing the distinction between single- and multi-input kernels in many
places. The distinction is maintained in some places due to the need to
maintain backwards compatibility. This permits the deletion of some functions
and struct members that are no longer needed. Several related functions were
also cleaned up.

Change-Id: I77e4b155cc7ca1581b05bf901c70ae53a9ff0b12
80ef693674f69c0343c41564e30f80e7fb513b60 08-Jul-2014 Chris Wailes <chriswailes@google.com> Split the RsForEachStubParamStruct in two.

This patch splits the RsForEachStubParamStruct into two smaller structs, one
used specifically by the driver and the other by the expanded kernels. Doing
so makes it clearer what data is used where. In addition, fewer data are
copied between memory locations during kernel invocation.

Several fields that were not being used were removed from the structs.

Change-Id: I7788ef754add44463b17a6b571c7cde6e73b9712
4b3c34e6833e39bc89c2128002806b654b8e623d 11-Jun-2014 Chris Wailes <chriswailes@google.com> Adds support for multi-input kernels to Frameworks/RS.

This patch modifies Frameworks/RS in the following ways:
* Adjusted the data-layout of the C/C++ version of RsForEachStubParamStruct to
accommodate a pointer to an array of input allocations and a pointer to an
array of stride sizes for each of these allocations.
* Adds a new code path for Java code to pass multiple allocations to a RS
* Packs base pointers and step values for multi-input kernels into the new
RsForEachStubParamStruct members.

Change-Id: I46d2834c37075b2a2407fd8b010546818a4540d1
074424a4ac5b093331df2c92e7a5bcbfff136b71 22-May-2014 Jason Sams <jsams@google.com> Enable ARM64 intrinsics.

This also moves ARM intrinsic ifdefs behind ARCH_ARM_USE_INTRINSICS instead of ARCH_ARM_HAVE_VFP.

Change-Id: I48d3d55c77feb931e22288828247e281db43d32b
005113297b19ed256b6db9d6bc293ed9266899fc 31-Jan-2014 Stephen Hines <srhines@google.com> Configure standalone bcc compiler to work with plugin libraries.

Bug: 7342767

This change adds support (hidden behind the EXTERNAL_BCC_COMPILER ifdef)
for loading plugin libraries via the external bcc toolchain. The external
bcc compiler loads the named library and will then invoke a customized
rsCompilerDriverInit() from that library.

Change-Id: I07c2ea68be54c2255d36926fd37e395db790ef8f
d3fe4992b47848deb9a2876951aeb0bb1c62ad3f 24-Apr-2014 Jason Sams <jsams@google.com> Merge "Revert "Add VP9 inter-frame prediction intrinsic""
ee0f4835e065ef08a6283e3f86cdc671a5a156c7 24-Apr-2014 Jason Sams <jsams@google.com> Revert "Add VP9 inter-frame prediction intrinsic"

This reverts commit 60498fe9679ea25a260a503d6dfd27cbc0a0c079.

Change-Id: I4d8bb284793874a08c0cc991c0e04ecc104e1e0f

7b7060c61e4182b29186849c5a857ea5f0898e56 21-Apr-2014 Rose, James <james.rose@intel.com> Improve RS intrinsics performance.

RenderScript CPU performance for intrinsics is poor on x86 platforms.
In many cases it is significantly slower even with SIMD intrinsics. The current x86 implementation
uses full 32-bit multiplies, which are not well supported on current Atom platforms.

This patch uses the pmaddwd instruction (16-bit multiply with 32-bit add) where appropriate.
It also adds Atom-specific optimizations to improve RS intrinsics performance.

Change-Id: Ifc01b5a6d6f7430d2dc218f1618b9df3fb7937fe
Signed-off-by: Xiaofei Wan <xiaofei.wan@intel.com>
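
Illustration for the commit above: a minimal sketch of what pmaddwd computes, using the SSE2 intrinsic _mm_madd_epi16 when SSE2 is available and an equivalent scalar loop otherwise. The helper name maddPairs is hypothetical and this is not code from the patch.

```cpp
#include <cstdint>
#if defined(__SSE2__)
#include <emmintrin.h>  // SSE2; _mm_madd_epi16 maps to the pmaddwd instruction
#endif

// Hypothetical helper: multiplies eight pairs of signed 16-bit values and
// adds each pair of adjacent 32-bit products:
//   out[i] = a[2i] * b[2i] + a[2i+1] * b[2i+1]   for i = 0..3
void maddPairs(const int16_t a[8], const int16_t b[8], int32_t out[4]) {
#if defined(__SSE2__)
    // Unaligned loads and store, so the arrays need no special alignment.
    const __m128i va = _mm_loadu_si128(reinterpret_cast<const __m128i*>(a));
    const __m128i vb = _mm_loadu_si128(reinterpret_cast<const __m128i*>(b));
    _mm_storeu_si128(reinterpret_cast<__m128i*>(out), _mm_madd_epi16(va, vb));
#else
    // Scalar fallback for targets without SSE2.
    for (int i = 0; i < 4; ++i) {
        out[i] = static_cast<int32_t>(a[2 * i]) * b[2 * i] +
                 static_cast<int32_t>(a[2 * i + 1]) * b[2 * i + 1];
    }
#endif
}
```

One instruction yields four 32-bit sums from 16-bit inputs, which is why it can stand in for full 32-bit multiplies when the operands fit in 16 bits.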
39ab94aafb7f0916a7f6e345ee1fa0f5ff3bbacd 17-Apr-2014 Jason Sams <jsams@google.com> Bicubic resize intrinsic

Change-Id: Ie869484505c3e25e8ea57ff208b9e052ee8dca7b
6dc86b492191f9062e912afd948e08362201f332 25-Mar-2014 Jason Sams <jsams@google.com> Merge "RS: Add VP9 LoopFilter Intrinsic"
6fc3e12b8912458cb4adcfd32e2f53d76b0cc737 04-Mar-2014 Matthieu Delahaye <matthieu@multicorewareinc.com> RS: Add VP9 LoopFilter Intrinsic

Change-Id: I5caa46da2c825a95cc1ed35a1cdbcd6da0ffce88
00dbeacfd62bdecd5fce9426c4795aec8618753b 18-Mar-2014 Jason Sams <jsams@google.com> Merge "Revert "RS: Add VP9 LoopFilter Intrinsic""
933bdc9b648995ab68da746c6daa2206eec02b0f 18-Mar-2014 Jason Sams <jsams@google.com> Revert "RS: Add VP9 LoopFilter Intrinsic"

This has build errors with the x86 SDK.

This reverts commit 64048e720cf940cb0f7f6f9a4ab4f061918a1fd9.

Change-Id: Ia712a46abd06e2a580853c863bfa53410b7f99e9
21dbc8ba61dd6a5852b1346c14bd29373326c240 18-Mar-2014 Jason Sams <jsams@google.com> Merge "Solve four separate memory leaks related to rsdHalInit"
64048e720cf940cb0f7f6f9a4ab4f061918a1fd9 04-Mar-2014 Matthieu Delahaye <matthieu@multicorewareinc.com> RS: Add VP9 LoopFilter Intrinsic

Change-Id: If1ac77774c74b5513ce7a2db4ef31888a351a9c5
07ef704308b514272ed2f5c3e6a2f4c055550158 19-Feb-2014 Jens Gulin <jens.gulin@sonymobile.com> Solve four separate memory leaks related to rsdHalInit

Three of the items are local to RsdCpuReferenceImpl and now freed in
destructor after all threads are stopped.
Last one is the RsdHal item itself where the pointer for some reason
was explicitly cleared but not freed. There is no reference counting
but it should be ok to free in Shutdown.

Change-Id: I7832e412d12f4bd7cc728481ae0c782fa57b57e4
1e2aedbef554a10a16296d3b529327fffcb10e0d 14-Mar-2014 Jason Sams <jsams@google.com> Revert "RS: Add VP9 LoopFilter Intrinsic"

This reverts commit e4749f3a5a6a6041ef2894162edce5115b307db0.

Change-Id: I45ccdacb1706abd4df7f635c5e64dcb1ee4b876d
e4749f3a5a6a6041ef2894162edce5115b307db0 04-Mar-2014 Matthieu Delahaye <matthieu@multicorewareinc.com> RS: Add VP9 LoopFilter Intrinsic

Change-Id: Ia49e56c7e21fee1601a0418bd105ef6429c336ca
83f304cb26008d3f4da154cec19c3a12fa2e6c74 06-Mar-2014 Jason Sams <jsams@google.com> Fix build issues with external patch.

Change-Id: Ib5ea4338df179eb27e4ce9958ef42df1e3ac3eb1
60498fe9679ea25a260a503d6dfd27cbc0a0c079 18-Feb-2014 Matthieu Delahaye <matthieu@multicorewareinc.com> Add VP9 inter-frame prediction intrinsic

Change-Id: If8985a6200fb6d34083eff711ccdf2f1b3c374e6
5cb36d9b36617f6b0493602ef61d620dc8f7e0ae 09-Aug-2013 Jason Sams <jsams@google.com> Merge commit 'b10a68c3' into manualmerge


Change-Id: Ibc2f1514f8858d99f08380f698bc9ae533c69212
f5ef8df639ba6363aa5d546e57ce872d04144cb6 06-Aug-2013 Jason Sams <jsams@google.com> Neon detection for RS SDK compat lib.

Change-Id: I3887158c7ec97ba116c28dc7b1d0c789b81fae60
140a7acade66ab5d1f3dc55803a3a65a71f3f86c 11-Jul-2013 Stephen Hines <srhines@google.com> resolved conflicts for merge of 5376c9bf to master

Change-Id: I51507da10f8d7116a2aa29446a00a43d397a37c8
b0934b67b95cc27e2358c2aa4db5f7c1067c8f9b 04-Jul-2013 Stephen Hines <srhines@google.com> Remove libutils and fix rsDebug for RS support library.

Bug: 9664050

Our bitcode runtime library translates vector rsDebug() calls into passing
their parameters via pointers. The previous version of libRSSupport.so was
being created with non-pointer versions of these routines accidentally.
This change also fixes a missing permission issue for ImageProcessing2, so
that the compatibility library can be verified.

This change also removes the use of libutils by switching the implementation of
String8/Vector in the compatibility library to internal types backed by

Change-Id: I20da75e8c19a82a42dc2bceaba1937d21372db84
2282e2816ac5f5de53f9bd4f3ecbdfd6d756d120 18-Jun-2013 Jason Sams <jsams@google.com> add histogram intrinsic

Change-Id: I42c297bfe116ea29cf015680fcc2143ff4cc95d2
b7d9c80c98fc96aa7c638e3124be24f13a6436b2 30-Apr-2013 Stephen Hines <srhines@google.com> Provide a mechanism for adjusting RSCompilerDriver after construction.

We add a simple callback to the reference implementation of libRSDriver.so,
such that additional BCC flags can be toggled/adjusted before doing any actual
CPU compilation.

Change-Id: Iaf253b7d967d0382937369b1c5dae2d23a99e8be
1d476620399d54774e4fd386c1d23cc583d49522 30-Mar-2013 Stephen Hines <srhines@google.com> Add callback to allow replacement of runtime support library.

Change-Id: I84ec56dfb29a0158015ebf31b3a73ac5bf34ef98
962e720b3d1c27bcfec90374ff393584b99577b3 19-Mar-2013 Tim Murray <timmurray@google.com> Merge "Add x86 server support." into jb-mr2-dev
0b575de8ed0b628d84d256f5846500b0385979bd 15-Mar-2013 Tim Murray <timmurray@google.com> Add x86 server support.

Change-Id: I674acaf15b67afa48bc736f72942a11e2e38e940
8ca358a2abe7e0dba23993e0fc8d64b8b55bd9ca 19-Mar-2013 Jason Sams <jsams@google.com> Fix bug reporting CPU count.

Change-Id: Ib76a17c3239dc5b52624a567b40cace16f412327
cadfac411e6690e39de36c4f9e94deb9b7d2d08e 07-Mar-2013 Jason Sams <jsams@google.com> Sync with compat lib.

Change-Id: Id8ace103814cf126f0d157100d1d4a12cc0b8664
f218bf115af4ae4fd79adbb8842608b308a4cf07 13-Feb-2013 Stephen Hines <srhines@google.com> Support LinkRuntimeCallback() with RS compiler.

Change-Id: I28ada4e7c462cb9673de6886d934dce855fac339
7c4b888f2147edf99690b6af75470774ff31c43b 04-Jan-2013 Jason Sams <jsams@google.com> Functional 3D LUT intrinsic.

1600x1000 takes ~23ms on manta.

Change-Id: I142d6dedded66df05aa5f49e3da409a34c6e1b6e
4d252d6e807b89764dad123ac845df298c52ca97 29-Nov-2012 Tim Murray <timmurray@google.com> enable synchronous mode (functional)

Change-Id: I613610013e7e4d1623620ab94d2d25d8a1bd82b3
Bug: 5972398
c905efd76fdcc1b8846b229bf7d991d185a7b4b7 27-Nov-2012 Jason Sams <jsams@google.com> Cleanup pass + implement blur uchar

Change-Id: Ib7f1c5218663b468a3c11daa2c3373ae132145ac


709a0978ae141198018ca9769f8d96292a8928e6 16-Nov-2012 Jason Sams <jsams@google.com> Separate CPU driver impl from reference driver.

Change-Id: Ifb484edda665959b81d7b1f890d108bfa20a535d