History log of /external/skia/src/opts/SkBlitRow_opts_SSE2.h
Revision Date Author Comments
c6820383b2526de95296ed8436f76333e0651d75 05-May-2017 Mike Klein <mtklein@chromium.org> remove old 565 destination opts

This is not an important format, and the code is dead or close to it.
The code is an occasional maintenance burden so I'd like it gone.

Change-Id: I4ad921533abf3211e6a81e6e475b848795eea060
Reviewed-on: https://skia-review.googlesource.com/15600
Reviewed-by: Mike Reed <reed@google.com>
Commit-Queue: Mike Klein <mtklein@chromium.org>
/external/skia/src/opts/SkBlitRow_opts_SSE2.h
b4a7dc99b1a01cdd5c0cd5913b630436ca696210 23-Mar-2016 mtklein <mtklein@chromium.org> Port S32A_opaque blit row to SkOpts.

This should be a pixel-for-pixel (i.e. bug-for-bug) port.

BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search2?unt=true&query=source_type%3Dgm&master=false&issue=1820313002
CQ_EXTRA_TRYBOTS=client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot

Review URL: https://codereview.chromium.org/1820313002
/external/skia/src/opts/SkBlitRow_opts_SSE2.h
4977983510028712528743aa877f6da83781b381 10-Aug-2015 mtklein <mtklein@chromium.org> Sk4px blit mask.

Local SKP nanobench ratios on SSE range between 1.05x and 0.87x, weighted much more heavily toward <1.0x ratios (speedups).
I profiled the top five regressions (1.05x-1.01x) and they look like noise. Will follow up after broad bot results.

NEON looks similar but less extreme than SSE, ranging between 1.02x and 0.95x, again mostly speedups in the 0.99x-0.97x range.

The old code trifurcated into black, opaque-but-not-black, and general versions as a function of the constant src color. I did not see a significant difference between general and opaque-but-not-black, and I don't think a black version would be faster using SIMD. So we have here just one version of the code, the general version.
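
To make the "one general version" point concrete, here is a minimal scalar sketch of blending a constant source color through an 8-bit coverage mask. It is an illustration under stated assumptions, not the Sk4px code from this change: the names `Pixel`, `mul_div255`, and `blit_mask_general` are hypothetical, pixels are assumed to be premultiplied RGBA, and the rounding may differ from what Skia actually does.

```cpp
#include <cstdint>

// Illustrative only: these names and the rounding are assumptions, not Skia's code.
struct Pixel { uint8_t r, g, b, a; };  // premultiplied RGBA

// Round-to-nearest (a * b) / 255 for a, b in [0, 255].
static inline uint8_t mul_div255(unsigned a, unsigned b) {
    unsigned prod = a * b + 128;
    return static_cast<uint8_t>((prod + (prod >> 8)) >> 8);
}

// General case: scale the constant src color by the mask coverage,
// then composite the scaled color over dst (src-over).
static void blit_mask_general(Pixel* dst, const uint8_t* mask, Pixel src, int count) {
    for (int i = 0; i < count; ++i) {
        const unsigned m = mask[i];
        const Pixel s = { mul_div255(src.r, m), mul_div255(src.g, m),
                          mul_div255(src.b, m), mul_div255(src.a, m) };
        const unsigned inv = 255u - s.a;  // fraction of dst that shows through
        dst[i].r = static_cast<uint8_t>(s.r + mul_div255(dst[i].r, inv));
        dst[i].g = static_cast<uint8_t>(s.g + mul_div255(dst[i].g, inv));
        dst[i].b = static_cast<uint8_t>(s.b + mul_div255(dst[i].b, inv));
        dst[i].a = static_cast<uint8_t>(s.a + mul_div255(dst[i].a, inv));
    }
}
```

In this sketch a black source and an opaque-but-not-black source need no separate path: both go through the same arithmetic, with some multiplies simply producing 0 or leaving a value unchanged.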

Somewhat fantastically, I see no pixel diffs on GMs or SKPs.

I will be following up with more CLs for the other procs called by SkBlitMask.
BUG=skia:

Review URL: https://codereview.chromium.org/1278253003
/external/skia/src/opts/SkBlitRow_opts_SSE2.h
95cc012ccaea20f372893ae277ea0a8a6339d094 28-Apr-2015 mtklein <mtklein@chromium.org> De-proc Color32

Also strips SK_SUPPORT_LEGACY_COLOR32_MATH,
which is no longer needed.

Seems handy to have SkTypes include the relevant intrinsics when
we know we've got them, but I'm not married to it.

Locally this looks like a pointlessly small perf win, but I'm mostly
keen to get all the code together.

BUG=skia:

Committed: https://skia.googlesource.com/skia/+/376e9bc206b69d9190f38dfebb132a8769bbd72b

Committed: https://skia.googlesource.com/skia/+/d65dc0cedd5b50dd407b6ff8fdc39123f11511cc

CQ_EXTRA_TRYBOTS=client.skia.compile:Build-Ubuntu-GCC-Mips-Debug-Android-Trybot

Review URL: https://codereview.chromium.org/1104183004
/external/skia/src/opts/SkBlitRow_opts_SSE2.h
641c3ff7c680ef7935d47d2e68f8301acc79e3de 27-Apr-2015 mtklein <mtklein@google.com> Revert of De-proc Color32 (patchset #5 id:80001 of https://codereview.chromium.org/1104183004/)

Reason for revert:
duh

Original issue's description:
> De-proc Color32
>
> Also strips SK_SUPPORT_LEGACY_COLOR32_MATH,
> which is no longer needed.
>
> Seems handy to have SkTypes include the relevant intrinsics when
> we know we've got them, but I'm not married to it.
>
> Locally this looks like a pointlessly small perf win, but I'm mostly
> keen to get all the code together.
>
> BUG=skia:
>
> Committed: https://skia.googlesource.com/skia/+/376e9bc206b69d9190f38dfebb132a8769bbd72b
>
> Committed: https://skia.googlesource.com/skia/+/d65dc0cedd5b50dd407b6ff8fdc39123f11511cc

TBR=reed@google.com,mtklein@chromium.org
NOPRESUBMIT=true
NOTREECHECKS=true
NOTRY=true
BUG=skia:

Review URL: https://codereview.chromium.org/1102363006
/external/skia/src/opts/SkBlitRow_opts_SSE2.h
d65dc0cedd5b50dd407b6ff8fdc39123f11511cc 27-Apr-2015 mtklein <mtklein@chromium.org> De-proc Color32

Also strips SK_SUPPORT_LEGACY_COLOR32_MATH,
which is no longer needed.

Seems handy to have SkTypes include the relevant intrinsics when
we know we've got them, but I'm not married to it.

Locally this looks like a pointlessly small perf win, but I'm mostly
keen to get all the code together.

BUG=skia:

Committed: https://skia.googlesource.com/skia/+/376e9bc206b69d9190f38dfebb132a8769bbd72b

Review URL: https://codereview.chromium.org/1104183004
/external/skia/src/opts/SkBlitRow_opts_SSE2.h
498856ebc6f22d7018071bd6696756d7cd077ab8 27-Apr-2015 mtklein <mtklein@google.com> Revert of De-proc Color32 (patchset #4 id:60001 of https://codereview.chromium.org/1104183004/)

Reason for revert:
MIPS

Original issue's description:
> De-proc Color32
>
> Also strips SK_SUPPORT_LEGACY_COLOR32_MATH,
> which is no longer needed.
>
> Seems handy to have SkTypes include the relevant intrinsics when
> we know we've got them, but I'm not married to it.
>
> Locally this looks like a pointlessly small perf win, but I'm mostly
> keen to get all the code together.
>
> BUG=skia:
>
> Committed: https://skia.googlesource.com/skia/+/376e9bc206b69d9190f38dfebb132a8769bbd72b

TBR=reed@google.com,mtklein@chromium.org
NOPRESUBMIT=true
NOTREECHECKS=true
NOTRY=true
BUG=skia:

Review URL: https://codereview.chromium.org/1108163002
/external/skia/src/opts/SkBlitRow_opts_SSE2.h
376e9bc206b69d9190f38dfebb132a8769bbd72b 27-Apr-2015 mtklein <mtklein@chromium.org> De-proc Color32

Also strips SK_SUPPORT_LEGACY_COLOR32_MATH,
which is no longer needed.

Seems handy to have SkTypes include the relevant intrinsics when
we know we've got them, but I'm not married to it.

Locally this looks like a pointlessly small perf win, but I'm mostly
keen to get all the code together.

BUG=skia:

Review URL: https://codereview.chromium.org/1104183004
/external/skia/src/opts/SkBlitRow_opts_SSE2.h
70840cbd898df67f603987213164c798415d76bf 20-Mar-2015 henrik.smiding <henrik.smiding@intel.com> Replace SSE optimization of Color32A_D565

Adds an SSE2 version of the Color32A_D565 function, to replace
the existing SSE4 version. Also does some minor cleanup.

Performance improvement in the following Skia benchmarks.
Measured on Atom Silvermont:
Xfermode_SrcOver - x3
luma_colorfilter_large - x4.6
luma_colorfilter_small - x2
tablebench - ~15%
chart_bw - ~10%

Measured on Corei7 Haswell:
luma_colorfilter_large running SSE2 - x2
luma_colorfilter_large running SSE4 - x2.3

Also improves performance in the WPS Office application and the 2D subtest of 0xbenchmark on Android.

Signed-off-by: Henrik Smiding <henrik.smiding@intel.com>

Review URL: https://codereview.chromium.org/923523002
/external/skia/src/opts/SkBlitRow_opts_SSE2.h
8c4953c6f176469ad287c3270ab146e292b23bad 30-Apr-2014 commit-bot@chromium.org <commit-bot@chromium.org@2bbb7eff-a529-9590-31e7-b0007b416f81> Cleanup of SSE optimization files.

General cleanup of optimization files for x86/SSEx.
Renamed the opts_check_SSE2.cpp file to _x86, since it's not specific
to SSE2. Commented out the ColorRect32 optimization, since it's
disabled anyway, to make it more visible.
Also fixed a lot of indentation, inclusion guards, spelling,
copyright headers, braces, whitespace, and sorting of includes.

Author: henrik.smiding@intel.com

Signed-off-by: Henrik Smiding <henrik.smiding@intel.com>

R=reed@google.com, mtklein@google.com, tomhudson@google.com, djsollen@google.com, joakim.landberg@intel.com

Review URL: https://codereview.chromium.org/264603002

git-svn-id: http://skia.googlecode.com/svn/trunk@14464 2bbb7eff-a529-9590-31e7-b0007b416f81
/external/skia/src/opts/SkBlitRow_opts_SSE2.h
fe089b383aeae512ee39678a667c81867f730cd0 07-Mar-2014 commit-bot@chromium.org <commit-bot@chromium.org@2bbb7eff-a529-9590-31e7-b0007b416f81> SSE2 implementation of S32A_D565_Opaque_Dither

Run benchmarks with the command line option "--forceDither true --forceBlend
1". Almost all the benchmarks that exercise S32A_D565_Opaque_Dither improve
by about 20%-70%.
Here are the data on i7-3770:
benchmark                                          before     after  improvement
verts                                             4314.81   3627.64       15.93%
constXTile_MM_filter_trans                        1434.22    432.82       69.82%
constXTile_CC_filter_trans_scale                  1440.17    437.00       69.66%
constXTile_RR_filter_trans                        1436.96    431.93       69.94%
constXTile_MM_trans_scale                         1436.33    435.77       69.66%
constXTile_CC_trans                               1433.12    431.36       69.90%
constXTile_RR_trans_scale                         1436.13    436.06       69.64%
constXTile_MM_filter                              1411.55    408.06       71.09%
constXTile_CC_filter_scale                        1416.68    414.18       70.76%
constXTile_RR_filter                              1429.46    409.81       71.33%
constXTile_MM_scale                               1415.00    412.56       70.84%
constXTile_CC                                     1410.32    408.36       71.04%
constXTile_RR_scale                               1413.26    413.16       70.77%
repeatTile_4444_A                                 1922.01    879.03       54.27%
repeatTile_4444_A                                 1430.68    818.34       42.80%
repeatTile_4444_X                                 1817.43    816.63       55.07%
maskshader                                        5911.09   5895.46        0.26%
gradient_create_alpha                                4.41      4.41       -0.15%
gradient_conical_clamp_3color                    35298.71  27574.34       21.88%
gradient_conical_clamp_hicolor                   35262.15  27538.99       21.90%
gradient_conical_clamp                           35276.21  27599.80       21.76%
gradient_radial2_mirror                          20846.74  12969.39       37.79%
gradient_radial2_clamp_hicolor                   21848.12  13967.57       36.07%
gradient_radial2_clamp                           21829.95  13978.57       35.97%
bitmap_4444_A_scale_rotate_bicubic                 105.31     87.13       17.26%
bitmap_4444_A_scale_bicubic                         73.69     47.76       35.20%
bitmap_4444_update_scale_rotate_bilerp             125.65     87.86       30.08%
bitmap_4444_update_volatile_scale_rotate_bilerp    125.50     87.65       30.16%
bitmap_4444_scale_rotate_bilerp                    124.46     87.91       29.37%
bitmap_4444_A_scale_rotate_bilerp                  105.09     87.27       16.96%
bitmap_4444_update_scale_bilerp                    106.78     63.28       40.74%
bitmap_4444_update_volatile_scale_bilerp           106.66     63.66       40.32%
bitmap_4444_scale_bilerp                           106.70     63.19       40.78%
bitmap_4444_A_scale_bilerp                          83.05     62.25       25.04%
bitmap_a8                                           98.11     52.76       46.22%
bitmap_a8_A                                         98.24     52.85       46.20%
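
The percentage in the last column of the table above appears to be the relative reduction from the "before" value to the "after" value. The log does not state the formula or the units, so the helper below is an assumption that happens to reproduce the listed figures; the name `percent_improvement` is hypothetical.

```cpp
// Hypothetical helper: relative reduction from `before` to `after`, in percent.
// Assumes lower values are better and that before is non-zero.
static double percent_improvement(double before, double after) {
    return (before - after) / before * 100.0;
}
```

For the verts row, `percent_improvement(4314.81, 3627.64)` comes out at roughly 15.9, in line with the 15.93% listed.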

BUG=
R=mtklein@google.com

Author: qiankun.miao@intel.com

Review URL: https://codereview.chromium.org/179443003

git-svn-id: http://skia.googlecode.com/svn/trunk@13699 2bbb7eff-a529-9590-31e7-b0007b416f81
/external/skia/src/opts/SkBlitRow_opts_SSE2.h
275804782f7b752cc9c25cb556db2a0cfc711dd9 07-Mar-2014 commit-bot@chromium.org <commit-bot@chromium.org@2bbb7eff-a529-9590-31e7-b0007b416f81> SSE2 implementation of S32_D565_Opaque_Dither

Run benchmarks with the command line option "--forceDither true". The results
show that all benchmarks that exercise S32_D565_Opaque_Dither benefit from
this SSE2 optimization. Here are the data on i7-3770:
benchmark                                          before     after  improvement
constXTile_MM_filter                               900.93    217.75       75.83%
constXTile_CC_filter_scale                         907.59    225.65       75.14%
constXTile_RR_filter                               903.33    219.41       75.71%
constXTile_MM_scale                                902.45    221.46       75.46%
constXTile_CC                                      898.55    218.37       75.70%
constXTile_RR_scale                                902.69    222.35       75.37%
repeatTile_4444_X                                  938.53    240.49       74.38%
gradient_radial2_mirror                          16999.49  11540.39       32.11%
gradient_radial2_clamp_hicolor                   17943.38  12501.71       30.33%
gradient_radial2_clamp                           17816.36  12492.04       29.88%
bitmaprect_FF_filter_trans                          47.81     10.98       77.03%
bitmaprect_FF_nofilter_trans                        47.79     10.91       77.18%
bitmaprect_FF_filter_identity                       47.74     10.89       77.18%
bitmaprect_FF_nofilter_identity                     47.83     10.89       77.24%
bitmap_4444_update_scale_rotate_bilerp             100.45     76.84       23.50%
bitmap_4444_update_volatile_scale_rotate_bilerp    100.80     76.70       23.91%
bitmap_4444_scale_rotate_bilerp                    100.43     77.18       23.15%
bitmap_4444_update_scale_bilerp                     79.00     49.03       37.93%
bitmap_4444_update_volatile_scale_bilerp            78.90     48.87       38.06%
bitmap_4444_scale_bilerp                            78.92     48.81       38.16%
bitmap_4444_update                                  42.19     11.53       72.68%
bitmap_4444_update_volatile                         42.28     11.49       72.82%
bitmap_a8                                           60.37     29.75       50.72%
bitmap_4444                                         42.19     11.52       72.69%

BUG=
R=mtklein@google.com

Author: qiankun.miao@intel.com

Review URL: https://codereview.chromium.org/181293002

git-svn-id: http://skia.googlecode.com/svn/trunk@13698 2bbb7eff-a529-9590-31e7-b0007b416f81
/external/skia/src/opts/SkBlitRow_opts_SSE2.h
39ce33a1facae795eb2f02e35674702de7eb23b5 24-Feb-2014 commit-bot@chromium.org <commit-bot@chromium.org@2bbb7eff-a529-9590-31e7-b0007b416f81> SSE2 implementation of S32_D565_Opaque

Benchmarks hitting this path can benefit from this patch.
Here are the data:
benchmark                          before     after  improvement
gradient_radial2_mirror          10885.52  10849.48        0.33%
gradient_radial2_clamp_hicolor   11819.69  11644.83        1.48%
gradient_radial2_clamp           11816.10  11649.91        1.41%
bitmaprect_FF_filter_trans           6.27      4.88       22.17%
bitmaprect_FF_nofilter_trans         6.27      4.88       22.17%
bitmaprect_FF_filter_identity        6.31      4.86       22.98%
bitmaprect_FF_nofilter_identity      6.25      4.86       22.24%
bitmap_4444_update                   6.26      5.05       19.33%
bitmap_4444_update_volatile          6.21      5.06       18.52%
bitmap_4444                          6.22      5.06       18.65%

BUG=
R=mtklein@google.com

Author: qiankun.miao@intel.com

Review URL: https://codereview.chromium.org/172083003

git-svn-id: http://skia.googlecode.com/svn/trunk@13556 2bbb7eff-a529-9590-31e7-b0007b416f81
/external/skia/src/opts/SkBlitRow_opts_SSE2.h
475910750cdc7d14da3071d4052ba9ab98383be9 19-Feb-2014 commit-bot@chromium.org <commit-bot@chromium.org@2bbb7eff-a529-9590-31e7-b0007b416f81> SSE2 implementation of S32A_D565_Opaque

A microbenchmark of S32A_D565_Opaque() shows a 3x speedup after SSE optimization across various values of count on an i7-3770.

BUG=
R=mtklein@google.com, reed@google.com

Author: qiankun.miao@intel.com

Review URL: https://codereview.chromium.org/138163013

git-svn-id: http://skia.googlecode.com/svn/trunk@13495 2bbb7eff-a529-9590-31e7-b0007b416f81
/external/skia/src/opts/SkBlitRow_opts_SSE2.h
7329dc9976e5a61e2382974884d2e075f4f856f1 27-Jul-2012 reed@google.com <reed@google.com@2bbb7eff-a529-9590-31e7-b0007b416f81> revert 4799-4801 -- red and blue are reversed on windows and linux



git-svn-id: http://skia.googlecode.com/svn/trunk@4803 2bbb7eff-a529-9590-31e7-b0007b416f81
/external/skia/src/opts/SkBlitRow_opts_SSE2.h
e5a196f4403529076dfa335facb2122c5768d8aa 27-Jul-2012 reed@google.com <reed@google.com@2bbb7eff-a529-9590-31e7-b0007b416f81> use SK_RESTRICT instead of __restrict__



git-svn-id: http://skia.googlecode.com/svn/trunk@4801 2bbb7eff-a529-9590-31e7-b0007b416f81
/external/skia/src/opts/SkBlitRow_opts_SSE2.h
37fe48489907870067e9cef9cf987b505ae6f02c 27-Jul-2012 reed@google.com <reed@google.com@2bbb7eff-a529-9590-31e7-b0007b416f81> land http://codereview.appspot.com/6327044/

SSE optimization for 565 pixel format -- by Lei



git-svn-id: http://skia.googlecode.com/svn/trunk@4799 2bbb7eff-a529-9590-31e7-b0007b416f81
/external/skia/src/opts/SkBlitRow_opts_SSE2.h
d6770e69e05c9dcc12f2a1a2d509c0b174372ee7 14-Feb-2012 tomhudson@google.com <tomhudson@google.com@2bbb7eff-a529-9590-31e7-b0007b416f81> SSE2 version of blit_lcd16, courtesy of Jin Yang.
Yields 25-30% speedup on Windows (32b), 4-7% on Linux (64b, less register
pressure), not invoked on Mac (lcd text is 32b instead of 16b).

Followup: GDI system settings on Windows can suppress LCD text for small
fonts, interfering with our benchmarks.
(http://code.google.com/p/skia/issues/detail?id=483)

http://codereview.appspot.com/5617058/



git-svn-id: http://skia.googlecode.com/svn/trunk@3189 2bbb7eff-a529-9590-31e7-b0007b416f81
/external/skia/src/opts/SkBlitRow_opts_SSE2.h
edb606cb999887d54629f361bcbf57c5fede1bb0 18-Oct-2011 reed@google.com <reed@google.com@2bbb7eff-a529-9590-31e7-b0007b416f81> move LCD blits into opts, so they can have assembly versions



git-svn-id: http://skia.googlecode.com/svn/trunk@2484 2bbb7eff-a529-9590-31e7-b0007b416f81
/external/skia/src/opts/SkBlitRow_opts_SSE2.h
ec3ed6a5ebf6f2c406d7bcf94b6bc34fcaeb976e 28-Jul-2011 epoger@google.com <epoger@google.com@2bbb7eff-a529-9590-31e7-b0007b416f81> Automatic update of all copyright notices to reflect new license terms.

I have manually examined all of these diffs and restored a few files that
seem to require manual adjustment.

The following files still need to be modified manually, in a separate CL:

android_sample/SampleApp/AndroidManifest.xml
android_sample/SampleApp/res/layout/layout.xml
android_sample/SampleApp/res/menu/sample.xml
android_sample/SampleApp/res/values/strings.xml
android_sample/SampleApp/src/com/skia/sampleapp/SampleApp.java
android_sample/SampleApp/src/com/skia/sampleapp/SampleView.java
experimental/CiCarbonSampleMain.c
experimental/CocoaDebugger/main.m
experimental/FileReaderApp/main.m
experimental/SimpleCocoaApp/main.m
experimental/iOSSampleApp/Shared/SkAlertPrompt.h
experimental/iOSSampleApp/Shared/SkAlertPrompt.m
experimental/iOSSampleApp/SkiOSSampleApp-Base.xcconfig
experimental/iOSSampleApp/SkiOSSampleApp-Debug.xcconfig
experimental/iOSSampleApp/SkiOSSampleApp-Release.xcconfig
gpu/src/android/GrGLDefaultInterface_android.cpp
gyp/common.gypi
gyp_skia
include/ports/SkHarfBuzzFont.h
include/views/SkOSWindow_wxwidgets.h
make.bat
make.py
src/opts/memset.arm.S
src/opts/memset16_neon.S
src/opts/memset32_neon.S
src/opts/opts_check_arm.cpp
src/ports/SkDebug_brew.cpp
src/ports/SkMemory_brew.cpp
src/ports/SkOSFile_brew.cpp
src/ports/SkXMLParser_empty.cpp
src/utils/ios/SkImageDecoder_iOS.mm
src/utils/ios/SkOSFile_iOS.mm
src/utils/ios/SkStream_NSData.mm
tests/FillPathTest.cpp
Review URL: http://codereview.appspot.com/4816058

git-svn-id: http://skia.googlecode.com/svn/trunk@1982 2bbb7eff-a529-9590-31e7-b0007b416f81
/external/skia/src/opts/SkBlitRow_opts_SSE2.h
981d4798007b91e2e19c13b171583927a56df63b 09-Mar-2011 reed@google.com <reed@google.com@2bbb7eff-a529-9590-31e7-b0007b416f81> http://codereview.appspot.com/3980041/

Add blitmask procs (with optional platform acceleration)
patch by yaojie.yan



git-svn-id: http://skia.googlecode.com/svn/trunk@910 2bbb7eff-a529-9590-31e7-b0007b416f81
/external/skia/src/opts/SkBlitRow_opts_SSE2.h
4e753558fc8cc2f77cbcd46fba80d8612e836a1e 16-Nov-2009 senorblanco@chromium.org <senorblanco@chromium.org@2bbb7eff-a529-9590-31e7-b0007b416f81> More SSE2-ification; fix for gcc -msse2.

Review URL: http://codereview.appspot.com/154163



git-svn-id: http://skia.googlecode.com/svn/trunk@428 2bbb7eff-a529-9590-31e7-b0007b416f81
/external/skia/src/opts/SkBlitRow_opts_SSE2.h