da7ddd8477dc802c8736c7ab860fc09f33689ce9 |
|
12-Jul-2013 |
Tobias Grosser <grosser@google.com> |
Simplify code of convolve3x3

Instead of first doing all multiplications and then adding the results in a tree manner, we just repeatedly perform a load/multiply/add pattern. Both with and without tuning for A15, this yields a 5% performance increase for N10. This commit also exposes more instructions to be transformed into fused multiply-adds.

Change-Id: I1215d75da236e6b2d6b6aa48b3ab35606cdba7b8
/frameworks/rs/java/tests/ImageProcessing/src/com/android/rs/image/convolve3x3.fs
|
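The change described in the commit message can be illustrated with a minimal sketch in plain scalar C. This is not the code from convolve3x3.fs: the function names `convolve_tree` and `convolve_chain`, the flat nine-element arrays `p` (pixels) and `k` (coefficients), and the scalar `float` type are all assumptions made for illustration. The first function computes every product up front and sums them pairwise; the second keeps one accumulator and repeats load/multiply/add, a shape that a compiler may be able to contract into fused multiply-add instructions.

```c
/* Hypothetical "before" shape: compute all nine products first,
 * then add the results pairwise in a tree. */
float convolve_tree(const float p[9], const float k[9]) {
    float m[9];
    for (int i = 0; i < 9; ++i) {
        m[i] = p[i] * k[i];
    }
    float s01 = m[0] + m[1];
    float s23 = m[2] + m[3];
    float s45 = m[4] + m[5];
    float s67 = m[6] + m[7];
    return ((s01 + s23) + (s45 + s67)) + m[8];
}

/* Hypothetical "after" shape: a single accumulator updated by a
 * repeated load/multiply/add step. Each step has the form
 * acc + p[i] * k[i], which is a candidate for a fused multiply-add. */
float convolve_chain(const float p[9], const float k[9]) {
    float acc = p[0] * k[0];
    for (int i = 1; i < 9; ++i) {
        acc += p[i] * k[i];
    }
    return acc;
}
```

Because floating-point addition is not associative, the two forms can differ in the last bits for general inputs; they agree exactly when every intermediate value is exactly representable, for example with small integer-valued pixels and coefficients.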