History log of /external/skia/src/core/SkOpts.h
Revision Date Author Comments (<<< Hide modified files) (Show modified files >>>)
8729e5bbf7c436fd7c7c13182adbbfb419f566b5 16-Feb-2017 Mike Klein <mtklein@chromium.org> Simplify more: remove SkRasterPipeline::compile().

It's easier to work on SkJumper if everything funnels through run().

I don't anticipate huge benefit from compile() without JITing,
but it's something we can always put back if we find a need.

Change-Id: Id5256fd21495e8195cad1924dbad81856416d913
Reviewed-on: https://skia-review.googlesource.com/8468
Commit-Queue: Mike Klein <mtklein@chromium.org>
Reviewed-by: Mike Klein <mtklein@chromium.org>
/external/skia/src/core/SkOpts.h
2e83b13122f93a4f88d1c5471e4b4d71b446e223 10-Feb-2017 Herb Derby <herb@google.com> Remove SkTextureCompressor.

This is the ultimate state of what it looks like to remove
SkTextureCompressor.

This end result will result from the following steps.

1) Remove Skia dep on ktx (done)
2) Move format over to ktx (done)
3) Remove all SkTexture compressor code

CQ_INCLUDE_TRYBOTS=skia.primary:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD

Change-Id: I3ad7a6abbea006a3034d95662c652d6db90b86ef
Reviewed-on: https://skia-review.googlesource.com/8272
Commit-Queue: Herb Derby <herb@google.com>
Reviewed-by: Robert Phillips <robertphillips@google.com>
Reviewed-by: Leon Scroggins <scroggo@google.com>
/external/skia/src/core/SkOpts.h
319ba3d3a177498095c31696e0aec8b3af25f663 20-Jan-2017 Mike Klein <mtklein@chromium.org> Move shader register setup to SkRasterPipelineBlitter.

We've been seeding the initial values of our registers to x+0.5,y+0.5,
1,0, 0,0,0,0 (useful values for shaders to start with) in all pipelines.
This CL changes that to do so only when blitting, and only when we have
a shader.

The nicest part of this change is that SkRasterPipeline itself no longer
needs to have a concept of y, or what x means. It just marches x
through [x,x+n), and the blitter handles y and layers the meaning of
"dst x coordinate" onto x.

This ought to make SkSplicer a little easier to work with too.

dm --src gm --config f16 srgb 565 all draws the same.

Change-Id: I69d8c1cc14a06e5dfdd6a7493364f43a18f8dec5
Reviewed-on: https://skia-review.googlesource.com/7353
Reviewed-by: Herb Derby <herb@google.com>
Commit-Queue: Mike Klein <mtklein@chromium.org>
/external/skia/src/core/SkOpts.h
efaad3cd53330f063e6feaee8b14ad43ca251184 20-Jan-2017 Mike Klein <mtklein@chromium.org> Remove SkColorCubeFilter. It is unused.

Change-Id: Iec5fc759e331de24caea1347f9510917260d379b
Reviewed-on: https://skia-review.googlesource.com/7363
Reviewed-by: Mike Reed <reed@google.com>
Commit-Queue: Mike Klein <mtklein@chromium.org>
/external/skia/src/core/SkOpts.h
c789b61167dd98efc3c3bfcf9673eef24c2e57f4 30-Nov-2016 Mike Klein <mtklein@chromium.org> Bring back SkRasterPipeline::run() for one-off uses.

CQ_INCLUDE_TRYBOTS=skia.primary:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD

Change-Id: I308b6d75f2987a667eead9a55760a2ff6aec2984
Reviewed-on: https://skia-review.googlesource.com/5353
Commit-Queue: Mike Klein <mtklein@chromium.org>
Reviewed-by: Mike Klein <mtklein@chromium.org>
/external/skia/src/core/SkOpts.h
d2265e537c8015f8115d7b5b7f6de970aa688172 18-Nov-2016 xiangze.zhang <xiangze.zhang@intel.com> Port convolve functions to SkOpts

This patch moves the C++/SSE2/NEON implementations of convolve functions
into the same place and uses SkOpts framework.
Also some indentation fix.

BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2500113004
CQ_INCLUDE_TRYBOTS=master.client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot

Review-Url: https://codereview.chromium.org/2500113004
/external/skia/src/core/SkOpts.h
d47067392848ba132d4e86ffbeebe2dcacda9534 15-Nov-2016 Mike Reed <reed@google.com> make SkXfermode.h go away

This is step one:
- make SkXfermode useless to public clients
- everything they should need is in SkBlendMode.h

Step two:
- remove SkXfermode.h entirely (since skia core will already be using SkXfermodePriv.h)

BUG=skia:

GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=4534

Change-Id: If2cea9f71df92430ed6644edb98dd306c5572cbc
Reviewed-on: https://skia-review.googlesource.com/4534
Commit-Queue: Mike Reed <reed@google.com>
Reviewed-by: Florin Malita <fmalita@chromium.org>
/external/skia/src/core/SkOpts.h
af49b19582f1f7a55e1300647804212252315b8a 15-Nov-2016 Mike Klein <mtklein@chromium.org> Start each pipeline with (x,y) in (dr,dg) registers for the shader.

Image shaders need to do some geometry work before sampling the image colors:
1) determine dst coordinates
2) map back to src coordinates
3) tiling

Feeding (x,y) through as (dr,dg) registers makes step 1) easy, perhaps trivial, while leaving (r,g,b,a) with their usual meanings, "the color", starting with the paint color.

This is easy to tweak into something like (x+0.5, y+0.5, 1) in (dr,dg,db) once this lands. Mostly I just want to get all the uninteresting boilerplate out of the way first.

BUG=skia:

GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=4791
CQ_INCLUDE_TRYBOTS=master.client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot

Change-Id: Ia07815d942ded6672dc1df785caf80a508fc8f37
Reviewed-on: https://skia-review.googlesource.com/4791
Reviewed-by: Herb Derby <herb@google.com>
Commit-Queue: Mike Klein <mtklein@chromium.org>
/external/skia/src/core/SkOpts.h
7d954ad797176afedb9262fdea4507d0fc60eb9d 28-Oct-2016 Mike Reed <reed@google.com> remove xfermode from public api

BUG=skia:

GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=4020

CQ_INCLUDE_TRYBOTS=master.client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot

Change-Id: I19cd056f2af778f10e8c6c2b7b2735593b43dbac
Reviewed-on: https://skia-review.googlesource.com/4020
Reviewed-by: Florin Malita <fmalita@chromium.org>
Reviewed-by: Mike Klein <mtklein@chromium.org>
Commit-Queue: Mike Reed <reed@google.com>
/external/skia/src/core/SkOpts.h
e9f74b89c09772dd5abae1c0709c711d7cdb6535 25-Oct-2016 Mike Klein <mtklein@chromium.org> SkRasterPipeline::compile().

I'm not yet caching these in the blitter, and speed is essentially unchanged in the bench where I am now building and compiling the pipeline only once. This may not be able to stay a simple std::function after I figure out caching, but for now it's a nice fit.

GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=3911

Change-Id: I9545af589f73baf9f17cb4e6ace9a814c2478fe9
Reviewed-on: https://skia-review.googlesource.com/3911
Reviewed-by: Herb Derby <herb@google.com>
Commit-Queue: Mike Klein <mtklein@chromium.org>
/external/skia/src/core/SkOpts.h
aebfb45104eeb6dab5dbbedda13c2eaa7b7f7868 25-Oct-2016 Mike Klein <mtklein@chromium.org> Move SkRasterPipeline further into SkOpts.

The portable code now becomes entirely focused on enum+ptr descriptions, leaving the concrete implementation of the pipeline to SkOpts::run_pipeline().

As implemented, the concrete implementation is basically the same, with a little more type safety.

Speed is essentially unchanged on my laptop, and that's having run_pipeline() rebuild its concrete state every call. There's room for improvement there if we split this into a compile_pipeline() / run_pipeline() sort of thing, which is my next planned CL.

BUG=skia:

GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=3920
CQ_INCLUDE_TRYBOTS=master.client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot

Change-Id: Ie4c554f51040426de7c5c144afa5d9d9d8938012
Reviewed-on: https://skia-review.googlesource.com/3920
Reviewed-by: Herb Derby <herb@google.com>
Commit-Queue: Mike Klein <mtklein@chromium.org>
/external/skia/src/core/SkOpts.h
04adfda9c74481d0b640c0ce18864588babfcdf6 12-Oct-2016 Mike Klein <mtklein@chromium.org> SkRasterPipeline: 8x pipelines, without any 8x code enabled.

Original review here: https://skia-review.googlesource.com/c/2990/
Second attempt here: https://skia-review.googlesource.com/c/3064/

This is the same as the second attempt, but with the change to SkOpts_hsw.cpp left out.
That omitted part is the key piece... this just lands the refactoring.

CQ_INCLUDE_TRYBOTS=master.client.skia:Perf-Ubuntu-Clang-GCE-CPU-AVX2-x86_64-Debug-ASAN-Trybot,Perf-Ubuntu-Clang-GCE-CPU-AVX2-x86_64-Debug-GN,Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot,Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-Fast-Trybot;master.client.skia.compile:Build-Win-MSVC-x86_64-Debug-Trybot

GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=3242

Change-Id: Iaafa793a4854c2c9cd7e85cca3701bf871253f71
Reviewed-on: https://skia-review.googlesource.com/3242
Reviewed-by: Mike Klein <mtklein@chromium.org>
Commit-Queue: Mike Klein <mtklein@chromium.org>
/external/skia/src/core/SkOpts.h
42f4b42e8311f168aeeadd939b476c05b329500e 10-Oct-2016 Mike Klein <mtklein@chromium.org> Revert "SkRasterPipeline: 8x pipelines, attempt 2"

This reverts commit Id0ba250037e271a9475fe2f0989d64f0aa909bae.

crbug.com/654213
Looks like Chrome Canary's picking up Haswell code on non-Haswell machines.

Change-Id: I16f976da24db86d5c99636c472ffad56db213a2a
Reviewed-on: https://skia-review.googlesource.com/3108
Commit-Queue: Mike Klein <mtklein@chromium.org>
Reviewed-by: Mike Klein <mtklein@chromium.org>
/external/skia/src/core/SkOpts.h
a71e151c6f0be68dc96ad2d169bbc31edca8f946 07-Oct-2016 Mike Klein <mtklein@chromium.org> SkRasterPipeline: 8x pipelines, attempt 2

Original review here: https://skia-review.googlesource.com/c/2990/

Changes since:
- simpler implementations of load_tail() / store_tail(): slower, but more obviously correct to all compilers
- fleshed out math ops on Sk8i and Sk8u to make unit tests happy on -Fast bot (where we always have AVX2)
- now storing stage functions as void(*)() to avoid undefined behavior and/or linker problems. This restores 32-bit Windows.
- all AVX2 Sk8x methods are marked always-inline, to avoid linking the "wrong" version on Debug builds.

CQ_INCLUDE_TRYBOTS=master.client.skia:Perf-Ubuntu-Clang-GCE-CPU-AVX2-x86_64-Debug-ASAN-Trybot,Perf-Ubuntu-Clang-GCE-CPU-AVX2-x86_64-Debug-GN,Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot,Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-Fast-Trybot;master.client.skia.compile:Build-Win-MSVC-x86_64-Debug-Trybot

GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=3064

Change-Id: Id0ba250037e271a9475fe2f0989d64f0aa909bae
Reviewed-on: https://skia-review.googlesource.com/3064
Reviewed-by: Mike Klein <mtklein@chromium.org>
Commit-Queue: Mike Klein <mtklein@chromium.org>
/external/skia/src/core/SkOpts.h
942858774831fea6e3f5f9d6c39b9538cf031bfd 07-Oct-2016 Mike Klein <mtklein@chromium.org> Revert "SkRasterPipeline: 8x pipelines"

This reverts commit I1c82e5755d8e44cc0b9c6673d04b117f85d71a3a.

Reason for revert: lots of failing bots.

TBR=mtklein@chromium.org,msarett@google.com,reviews@skia.org
NOPRESUBMIT=true
NOTREECHECKS=true
NOTRY=true

Change-Id: I653bed3905187f43196504f19424985fa2a765b5
Reviewed-on: https://skia-review.googlesource.com/3063
Commit-Queue: Mike Klein <mtklein@chromium.org>
Reviewed-by: Mike Klein <mtklein@chromium.org>
/external/skia/src/core/SkOpts.h
1aebdaee0e2aa4324509fd3ad4c40c21703ae4a2 06-Oct-2016 Mike Klein <mtklein@chromium.org> SkRasterPipeline: 8x pipelines

Bench runtime changes:
sRGB: 7194 -> 3735 = 1.93x faster
F16: 6531 -> 2559 = 2.55x faster

Instead of building 4x and 1-3x pipelines and then maybe 8x and 1-7x, instead build either the short ones or the long ones, but not both. If we just take care to use a compatible run_pipeline(), there's some cross-module type disagreement but everything works out in the end.

Oddly, a few places that looked like they'd be faster using SkNx_fma() or Sk4f_round()/Sk8f_round() are actually faster the long way, e.g. multiply, add 0.5, truncate. Curious! In all the other places you see here that I've used SkNx_fma(), it's been a significant speedup.

This folds in a couple refactors and cleanups that I've been meaning to do. Hope you don't mind... if find the new code considerably easier to read than the old code.

BUG=skia:

GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2990
CQ_INCLUDE_TRYBOTS=master.client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot

Change-Id: I1c82e5755d8e44cc0b9c6673d04b117f85d71a3a
Reviewed-on: https://skia-review.googlesource.com/2990
Reviewed-by: Matt Sarett <msarett@google.com>
Commit-Queue: Mike Klein <mtklein@chromium.org>
/external/skia/src/core/SkOpts.h
fa9f241a85c55c32a3fe2ae0324811de998f7a2e 29-Sep-2016 Mike Klein <mtklein@chromium.org> Add an enum layer of indirection for stock raster pipeline stages.

This is handy now, and becomes necessary with fancier backends:
- most code can't speak the type of AVX pipeline stages,
so indirection's definitely needed there;
- if the pipleine is entirely composed of stock stages,
these enum values become an abstract recipe that can be JITted.

BUG=skia:

GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2782
CQ_INCLUDE_TRYBOTS=master.client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot

Change-Id: Iedd62e99ce39e94cf3e6ffc78c428f0ccc182342
Reviewed-on: https://skia-review.googlesource.com/2782
Reviewed-by: Mike Klein <mtklein@chromium.org>
Commit-Queue: Mike Klein <mtklein@chromium.org>
/external/skia/src/core/SkOpts.h
baaf8ad95237d1defdb7d93077d9bf8410d8ad7f 29-Sep-2016 Mike Klein <mtklein@chromium.org> Start moving SkRasterPipeline stages to SkOpts.

This lets them pick up runtime CPU specializations. Here I've plugged in SSE4.1. This is still one of the N prelude CLs to full 8-at-a-time AVX.

I've moved the union of the stages used by SkRasterPipelineBench and SkRasterPipelineBlitter to SkOpts... they'll all be used by the blitter eventually. Picking up SSE4.1 specialization here (even still just 4 pixels at a time) is a significant speedup, especially to store_srgb(), so much that it's no longer really interesting to compare against the fused-but-default-instruction-set version in the bench. So that's gone now.

That left the SkRasterPipeline unit test as the only other user of the EasyFn simplified interface to SkRasterPipeline. So I converted that back down to the bare-metal interface, and EasyFn and its friends became SkRasterPipeline_opts.h exclusive abbreviations (now called Kernel_Sk4f). This isn't really unexpected: SkXfermode also wanted to build up its own little abstractions, and once you build your own abstraction, the value of an additional EasyFn-like layer plummets to negative.

For simplicity I've left the SkXfermode stages alone, except srcover() which was always part of the blitter. No particular reason except keeping the churn down while I hack. These _can_ be in SkOpts, but don't have to be until we go 8-at-a-time.

BUG=skia:

GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2752
CQ_INCLUDE_TRYBOTS=master.client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot

Change-Id: I3b476b18232a1598d8977e425be2150059ab71dc
Reviewed-on: https://skia-review.googlesource.com/2752
Reviewed-by: Mike Klein <mtklein@chromium.org>
Commit-Queue: Mike Klein <mtklein@chromium.org>
/external/skia/src/core/SkOpts.h
4e97607d9a1cef66fac16f347c5ca813ec4f9515 08-Aug-2016 mtklein <mtklein@chromium.org> Use sse4.2 CRC32 instructions to hash when available.

About 9x faster than Murmur3 for long inputs.

Most of this is a mechanical change from SkChecksum::Murmur3(...) to SkOpts::hash(...).

BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2208903002
CQ_INCLUDE_TRYBOTS=master.client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot;master.client.skia.compile:Build-Ubuntu-GCC-x86_64-Release-CMake-Trybot,Build-Mac-Clang-x86_64-Release-CMake-Trybot

Review-Url: https://codereview.chromium.org/2208903002
/external/skia/src/core/SkOpts.h
15ee3deee8aca2bf6e658449f25ee34a8153e6ee 02-Aug-2016 msarett <msarett@google.com> Refactor of SkColorSpaceXformOpts

(1) Performance is better or stays the same.

(2) Code is split into functions (RasterPipeline-ish
design). IMO, it's not really more or less readable.
But I think it's now much easier add capabilities,
apply optimizations, or do more refactors. Or to
actually use RasterPipeline. I help back from trying
any of these to try to keep this CL sane.

BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2194303002
CQ_INCLUDE_TRYBOTS=master.client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot

Review-Url: https://codereview.chromium.org/2194303002
/external/skia/src/core/SkOpts.h
50ce1f28ffede3fa3e38d330d4114ee52b387848 29-Jul-2016 msarett <msarett@google.com> Add color space xform support to SkJpegCodec (includes F16!)

Also changes SkColorXform to support:
RGBA->RGBA
RGBA->BGRA

Instead of:
RGBA->SkPMColor

TBR=reed@google.com
BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2174493002
CQ_INCLUDE_TRYBOTS=master.client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot

Committed: https://skia.googlesource.com/skia/+/73d55332e2846dd05e9efdaa2f017bcc3872884b
Review-Url: https://codereview.chromium.org/2174493002
/external/skia/src/core/SkOpts.h
39979d8c6b97889f600a212cfc9b063360f3de2f 29-Jul-2016 msarett <msarett@google.com> Revert of Add color space xform support to SkJpegCodec (includes F16!) (patchset #9 id:260001 of https://codereview.chromium.org/2174493002/ )

Reason for revert:
Breaking MSAN

Original issue's description:
> Add color space xform support to SkJpegCodec (includes F16!)
>
> Also changes SkColorXform to support:
> RGBA->RGBA
> RGBA->BGRA
>
> Instead of:
> RGBA->SkPMColor
>
> TBR=reed@google.com
> BUG=skia:
> GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2174493002
> CQ_INCLUDE_TRYBOTS=master.client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot
>
> Committed: https://skia.googlesource.com/skia/+/73d55332e2846dd05e9efdaa2f017bcc3872884b

TBR=mtklein@google.com,reed@google.com,herb@google.com,brianosman@google.com
# Skipping CQ checks because original CL landed less than 1 days ago.
NOPRESUBMIT=true
NOTREECHECKS=true
NOTRY=true
BUG=skia:

Review-Url: https://codereview.chromium.org/2195523002
/external/skia/src/core/SkOpts.h
73d55332e2846dd05e9efdaa2f017bcc3872884b 29-Jul-2016 msarett <msarett@google.com> Add color space xform support to SkJpegCodec (includes F16!)

Also changes SkColorXform to support:
RGBA->RGBA
RGBA->BGRA

Instead of:
RGBA->SkPMColor

TBR=reed@google.com
BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2174493002
CQ_INCLUDE_TRYBOTS=master.client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot

Review-Url: https://codereview.chromium.org/2174493002
/external/skia/src/core/SkOpts.h
959d45b43357a40854938586c303177c6aa53220 21-Jul-2016 msarett <msarett@google.com> Miscellaneous color space refactors

(1) Use float matrix[16] everywhere (enables future code
sharing).
(2) SkColorLookUpTable refactors
*** Store in a single allocation (like SkGammas)
*** Eliminate fOutputChannels (we always require 3,
and probably always will)
(3) Change names of read_big_endian_* helpers

BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2166093003
CQ_INCLUDE_TRYBOTS=master.client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot

Review-Url: https://codereview.chromium.org/2166093003
/external/skia/src/core/SkOpts.h
9ce3a543c92a73e6daca420defc042886b3f2019 15-Jul-2016 msarett <msarett@google.com> Add capability for SkColorXform to output half floats

BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2147763002
CQ_INCLUDE_TRYBOTS=client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot;master.client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot

Review-Url: https://codereview.chromium.org/2147763002
/external/skia/src/core/SkOpts.h
05c73b7ed5b73a55cbecf7f787a35fcc0e84e983 13-Jul-2016 mtklein <mtklein@chromium.org> Remove bulk float <-> half routines. These are dead code.

BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2152583002

Review-Url: https://codereview.chromium.org/2152583002
/external/skia/src/core/SkOpts.h
6006f678e78af7b6f67a454cd4bc213048983f9d 11-Jul-2016 msarett <msarett@google.com> Make all color xforms 'fast' (step 1)

This refactors opt code to handle arbitrary src and dst
gammas that are specified by tables.

BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2130013002
CQ_EXTRA_TRYBOTS=client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot

Review-Url: https://codereview.chromium.org/2130013002
/external/skia/src/core/SkOpts.h
d2809573deb7b99e764f7f71fe34a5b5322df0b2 20-Jun-2016 msarett <msarett@google.com> Support sRGB dsts in opt code

201295.jpg on HP z620 (300x280)

QCMS Xform 0.418 ms
Skia NEW Xform 0.378 ms

Vs QCMS 1.11x

BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2078623002
CQ_EXTRA_TRYBOTS=client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot

Review-Url: https://codereview.chromium.org/2078623002
/external/skia/src/core/SkOpts.h
dea0340cadb759932e53416a657f5ea75fee8b5f 16-Jun-2016 msarett <msarett@google.com> Implement fast, correct gamma conversion for color xforms

201295.jpg on HP z620
(300x280, most common form of sRGB profile)

QCMS Xform 0.495 ms
Skia Old Xform 0.235 ms
Skia NEW Xform 0.423 ms

Vs Old Code 0.56x
Vs QCMS 1.17x

So to summarize, we are now much slower than before,
but still a bit faster than QCMS. And now we are also
far more accurate than QCMS :).

BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2060823003
CQ_EXTRA_TRYBOTS=client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot

Review-Url: https://codereview.chromium.org/2060823003
/external/skia/src/core/SkOpts.h
a9e878c836994bce695274b4c28890290139dcdf 08-Jun-2016 msarett <msarett@google.com> Optimize color xforms with 2.2 gammas for SSE2

Because we recognize commonly used gamma tables and
parameters as 2.2f, about 98% of jpegs with color profiles
will pass through this xform (assuming the dst is also
2.2f). Sample size is 10,322 jpegs.

I won't go crazy with performance numbers because this is
a work in progress, particularly in terms of correctness.

201295.jpg on HP z620
(300x280, most common form of sRGB profile)

Decode Time + QCMS Xform 1.28 ms
QCMS Xform Only 0.495 ms
Decode Time + Skia Opt Xform 1.01 ms
Skia Opt Xform Only 0.235 ms

Decode Time + Xform Speed-up 1.27x
Xform Only Speed-up 2.11x

FWIW, Skia xform time before these optimizations was
41.1 ms. But we expected that code to be slow.

BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2046013002
CQ_EXTRA_TRYBOTS=client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot

Review-Url: https://codereview.chromium.org/2046013002
/external/skia/src/core/SkOpts.h
c5091b5b6c4b8a7aef8c12db9ea2a85e907b01c4 02-May-2016 mtklein <mtklein@chromium.org> Add a hook for CPU-optimized sRGB-sRGB srcover.

Herb's really starting to get serious about tweaking this, which becomes
a lot easier when you've got SkOpts' runtime CPU detection. We should be
able to optimize this usefully for SSSE3, SSE4.1, AVX, AVX2, or NEON.
(We can of course implement a subset.)

This function takes two counts to give us flexibility to write src patterns:
nsrc >= ndst -> the usual srcover function
nsrc < ndst -> repeat src until it fills dst
nsrc << ndst -> possibly preprocess src into registers
nsrc == 1 -> equivalent of blitrow_color32, srcover_1, etc.

BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search2?unt=true&query=source_type%3Dgm&master=false&issue=1939783003

Review-Url: https://codereview.chromium.org/1939783003
/external/skia/src/core/SkOpts.h
05db63b5fc262136da9b289418b871180cd1359b 19-Apr-2016 mtklein <mtklein@chromium.org> Remove static initializer for SkOpts::Init()

Static initializers run in a confusing unspecified order,
so it's best to have as few of them as possible.

Most tools and clients I can find already call SkGraphics::Init(),
(or equivalently create an SkAutoGraphics) which calls SkOpts::Init():
- Chrome
- Chrome renderer
- Android
- DM
- nanobench
- SampleApp
- VisualBench
- the old debugger

Seems like the only thing relying on this static initializer today is
the new debugger, fixed here.

TBR=reed@google.com

BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search2?unt=true&query=source_type%3Dgm&master=false&issue=1903503002

Review URL: https://codereview.chromium.org/1903503002
/external/skia/src/core/SkOpts.h
567118fbe69f9d0e245ddc0a6c312b6b9b70233f 14-Apr-2016 mtklein <mtklein@chromium.org> Graduate matrix map-point procs out of SkOpts.

These are implemented generically with Sk4s and don't benefit
from anything fancier than vanilla SSE/NEON.

This means there's no need to hide this code away in another
file or behind a function pointer... it's readable and we have
compile-time support for all the instructions it needs.

BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search2?unt=true&query=source_type%3Dgm&master=false&issue=1872193002
CQ_EXTRA_TRYBOTS=client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot

Review URL: https://codereview.chromium.org/1872193002
/external/skia/src/core/SkOpts.h
b4a7dc99b1a01cdd5c0cd5913b630436ca696210 23-Mar-2016 mtklein <mtklein@chromium.org> Port S32A_opaque blit row to SkOpts.

This should be a pixel-for-pixel (i.e. bug-for-bug) port.

BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search2?unt=true&query=source_type%3Dgm&master=false&issue=1820313002
CQ_EXTRA_TRYBOTS=client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot

Review URL: https://codereview.chromium.org/1820313002
/external/skia/src/core/SkOpts.h
3bc2624a4b89c49efd65f5e548ac5f2dd9351431 17-Feb-2016 mtklein <mtklein@chromium.org> try plain-old code for sk_memset16/32 now that NEON is compile-time

Most of these implementations now just say "always inline".
Let's see if we can get away with the simplicity of doing that all the time.

These inlined implementations can autovectorize easily.

BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search2?unt=true&query=source_type%3Dgm&master=false&issue=1639863002
CQ_EXTRA_TRYBOTS=client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot

Review URL: https://codereview.chromium.org/1639863002
/external/skia/src/core/SkOpts.h
a525cb151bb39fb6362af051f69b6d633f660fd9 09-Feb-2016 mtklein <mtklein@chromium.org> skeleton for float <-> half optimized procs

Nothing fancy yet, just calls the serial code in a loop.

I will try to folow this up with at least some of:
- SSE2 version of serial code
- NEON version of serial code
- NEON version using vcvt.f32.f16/vcvt.f16.f32
- F16C (between AVX and AVX2) version using vcvtph2ps/vcvtps2ph
The last two are fastest but need runtime detection.

BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search2?unt=true&query=source_type%3Dgm&master=false&issue=1686543003

Review URL: https://codereview.chromium.org/1686543003
/external/skia/src/core/SkOpts.h
c5c322d8ecfc05718f9f04360956c4f1f9dc33c1 08-Feb-2016 msarett <msarett@google.com> Optimize CMYK->RGBA (BGRA) transform for jpeg decodes

Swizzle Bench Runtime
Nexus 6P 0.14x
Dell Venue 8 0.12x

CMYK Jpeg Decode Runtime
Nexus 6P 0.81x
Dell Venue 8 0.85x

BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search2?unt=true&query=source_type%3Dgm&master=false&issue=1676773003
CQ_EXTRA_TRYBOTS=client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot

Review URL: https://codereview.chromium.org/1676773003
/external/skia/src/core/SkOpts.h
1e06079b259d1091b735492b2f71d9897c14c608 03-Feb-2016 msarett <msarett@google.com> NEON optimizations for GrayAlpha -> RGBA/BGRA Premul/Unpremul

PNG Decode Time Nexus 6P (for a test set of GrayAlpha encoded PNGs)
Regular Unpremul 0.91x
Zero Init Unpremul 0.92x
Regular Premul 0.84x
Zero Init Premul 0.86x

BUG=skia:4767
GOLD_TRYBOT_URL= https://gold.skia.org/search2?unt=true&query=source_type%3Dgm&master=false&issue=1663623002
CQ_EXTRA_TRYBOTS=client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot

Review URL: https://codereview.chromium.org/1663623002
/external/skia/src/core/SkOpts.h
2eff71c9b5f984b58961e5a6b4e66774c4385224 02-Feb-2016 msarett <msarett@google.com> NEON optimizations for gray -> RGBA (or BGRA) conversions

Swizzle Bench Runtime
Nexus 6P 0.32x
Nexus 9 0.89x

PNG Decode Time (for test set of gray encoded PNGs)
Nexus 6P 0.88x
Nexus 9 0.91x

BUG=skia:4767
GOLD_TRYBOT_URL= https://gold.skia.org/search2?unt=true&query=source_type%3Dgm&master=false&issue=1656383002
CQ_EXTRA_TRYBOTS=client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot

Review URL: https://codereview.chromium.org/1656383002
/external/skia/src/core/SkOpts.h
a766ca87bfb1129a8cf4f8a428121caf60726de4 26-Jan-2016 mtklein <mtklein@chromium.org> de-proc sk_float_rsqrt

This is the first of many little baby steps to have us stop runtime-detecting NEON.

BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search2?unt=true&query=source_type%3Dgm&master=false&issue=1616013003
CQ_EXTRA_TRYBOTS=client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot

Committed: https://skia.googlesource.com/skia/+/efcc125acd2d71eb077caf6db65fdd6b9eb1dc0d

Review URL: https://codereview.chromium.org/1616013003
/external/skia/src/core/SkOpts.h
ed814f34c72248c7eecf3b0a5f335c3a7c2e9dd1 22-Jan-2016 mtklein <mtklein@google.com> Revert of de-proc sk_float_rsqrt (patchset #3 id:40001 of https://codereview.chromium.org/1616013003/ )

Reason for revert:
This is somehow blocking the Google3 roll in ways neither Ben nor I understand. Precautionary revert... will try again Monday.

Original issue's description:
> de-proc sk_float_rsqrt
>
> This is the first of many little baby steps to have us stop runtime-detecting NEON.
>
> BUG=skia:
> GOLD_TRYBOT_URL= https://gold.skia.org/search2?unt=true&query=source_type%3Dgm&master=false&issue=1616013003
> CQ_EXTRA_TRYBOTS=client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot
>
> Committed: https://skia.googlesource.com/skia/+/efcc125acd2d71eb077caf6db65fdd6b9eb1dc0d

TBR=reed@google.com,mtklein@chromium.org
# Skipping CQ checks because original CL landed less than 1 days ago.
NOPRESUBMIT=true
NOTREECHECKS=true
NOTRY=true
BUG=skia:

Review URL: https://codereview.chromium.org/1629503002
/external/skia/src/core/SkOpts.h
f1b8b6ae34e5a1f4b29e423401da39f88f0c117a 22-Jan-2016 msarett <msarett@google.com> Use NEON optimizations for RGB -> RGB(FF) or BGR(FF) in SkSwizzler

Swizzle Bench Runtime Nexus 6P
xxx_xxxa 0.32x
xxx_swaprb_xxxa 0.31x

Swizzle Bench Runtime Nexus 9
xxx_xxxa 1.11x
xxx_swaprb_xxxa 1.14x
(This is a slow down.)

Swizzle Bench Runtime Nexus 5
xxx_xxxa 0.12x
xxx_swaprb 0.12x

RGB PNG Decode Runtime
Nexus 6P 0.94x
Nexus 9 0.98x

I don't know how to explain the fact that the Swizzle Bench was
slower on Nexus 9, but the decode times got faster.

BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search2?unt=true&query=source_type%3Dgm&master=false&issue=1618003002
CQ_EXTRA_TRYBOTS=client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot

Review URL: https://codereview.chromium.org/1618003002
/external/skia/src/core/SkOpts.h
efcc125acd2d71eb077caf6db65fdd6b9eb1dc0d 22-Jan-2016 mtklein <mtklein@chromium.org> de-proc sk_float_rsqrt

This is the first of many little baby steps to have us stop runtime-detecting NEON.

BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search2?unt=true&query=source_type%3Dgm&master=false&issue=1616013003
CQ_EXTRA_TRYBOTS=client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot

Review URL: https://codereview.chromium.org/1616013003
/external/skia/src/core/SkOpts.h
8bf7b79cf9776b4edb3f6810e5ab8c80c49d3480 22-Jan-2016 mtklein <mtklein@chromium.org> Refactor swizzle names and types.

- Plant a flag to say "pretend all the inputs are RGBA".
This is how libpng thinks.
This is the opposite of what the implementation had been doing,
so I've rearranged everything to reflect the new orientation.

- Rewrite the names to be less mysterious looking. No more Xs.

- Make the src type uniformly const void*, to allow for 888 (RGB) srcs.

This should be performance and pixel neutral. (Please revert if it's not.)

BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search2?unt=true&query=source_type%3Dgm&master=false&issue=1626463002
CQ_EXTRA_TRYBOTS=client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot

Review URL: https://codereview.chromium.org/1626463002
/external/skia/src/core/SkOpts.h
a1bfaad00486883669e20a94cf8f7a8f4e34266a 07-Jan-2016 mtklein <mtklein@chromium.org> Set up some hooks for premul/swizzzle opts.

You can call these as SkOpts::premul_xxxa, SkOpts::swaprb_xxxa, etc.

For now, I just backed the function pointers with some (untested) portable
code, which may autovectorize. We can override with optimized versions in
Init_ssse3() (in SkOpts_ssse3.cpp), Init_neon() (SkOpts_neon.cpp), etc.

BUG=skia:4767
GOLD_TRYBOT_URL= https://gold.skia.org/search2?unt=true&query=source_type%3Dgm&master=false&issue=1569013002

Review URL: https://codereview.chromium.org/1569013002
/external/skia/src/core/SkOpts.h
29a440d2d6f306498a7b54e978e3b034bd625c19 02-Nov-2015 senorblanco <senorblanco@chromium.org> Make SkBlurImageFilter capable of cropping during blur (raster path)

SkBlurImageFilter can currently only process a source image
which is larger than or equal to the destination rect. If
the source image (or crop rect) is smaller, it is padded
out to dest size with transparent black via the 6-param
version of applyCropRect().

Fixing this requires modifying all the flavours of RGBA
box_blur() to accept a src crop rect.

BUG=skia:4502, skia:4526
CQ_EXTRA_TRYBOTS=client.skia.android:Test-Android-GCC-Nexus5-CPU-NEON-Arm7-Release-Trybot;client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot

Committed: https://skia.googlesource.com/skia/+/1b82ceb737c73327412f2e8a91748481e1aec9e4

Review URL: https://codereview.chromium.org/1415653003
/external/skia/src/core/SkOpts.h
8de01c642c145e9a6e0b9d68066e9fe63c4c5492 02-Nov-2015 senorblanco <senorblanco@chromium.org> Revert of Make SkBlurImageFilter capable of cropping during blur (patchset #16 id:400001 of https://codereview.chromium.org/1415653003/ )

Reason for revert:
ASAN failures (see https://codereview.chromium.org/1415653003/)

Original issue's description:
> Make SkBlurImageFilter capable of cropping during blur (raster path)
>
> SkBlurImageFilter can currently only process a source image
> which is larger than or equal to the destination rect. If
> the source image (or crop rect) is smaller, it is padded
> out to dest size with transparent black via the 6-param
> version of applyCropRect().
>
> Fixing this requires modifying all the flavours of RGBA
> box_blur() to accept a src crop rect.
>
> BUG=skia:4502, skia:4526
> CQ_EXTRA_TRYBOTS=client.skia.android:Test-Android-GCC-Nexus5-CPU-NEON-Arm7-Release-Trybot;client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot
>
> Committed: https://skia.googlesource.com/skia/+/1b82ceb737c73327412f2e8a91748481e1aec9e4

TBR=mtklein@google.com,reed@google.com
NOPRESUBMIT=true
NOTREECHECKS=true
NOTRY=true
BUG=skia:4502, skia:4526

Review URL: https://codereview.chromium.org/1428053002
/external/skia/src/core/SkOpts.h
1b82ceb737c73327412f2e8a91748481e1aec9e4 02-Nov-2015 senorblanco <senorblanco@chromium.org> Make SkBlurImageFilter capable of cropping during blur (raster path)

SkBlurImageFilter can currently only process a source image
which is larger than or equal to the destination rect. If
the source image (or crop rect) is smaller, it is padded
out to dest size with transparent black via the 6-param
version of applyCropRect().

Fixing this requires modifying all the flavours of RGBA
box_blur() to accept a src crop rect.

BUG=skia:4502, skia:4526
CQ_EXTRA_TRYBOTS=client.skia.android:Test-Android-GCC-Nexus5-CPU-NEON-Arm7-Release-Trybot;client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot

Review URL: https://codereview.chromium.org/1415653003
/external/skia/src/core/SkOpts.h
4e8a09d3672702704436112f3aa5611bd79b2690 10-Sep-2015 mtklein <mtklein@chromium.org> Port SkMatrix opts to SkOpts.

No changes to the code, just moved around.

This will have the effect of enabling vectorized code on ARMv7.
Should be no effect on ARMv8 or x86, which would have been vectorized already.

nanobench --match mappoints changes on Nexus 5 (ARMv7):

_affine: 132 -> 95
_scale: 118 -> 47
_trans: 60 -> 37

A teaser:
We should next look at the ABCD->BADC shuffle we've noted that we need in _affine. A quick hack showed doing that optimally is another ~35% speedup on x86. Got to figure out how to do it best on ARM though: that same quick hack was a 2x slowdown there. Good reason to resurrect that SkNx_shuffle() CL!

(I believe the answers are vrev64q_f32(v) and _mm_shuffle_ps(v,v, _MM_SHUFFLE(2,3,0,1), but we should probably find out in another CL.)

BUG=skia:4117

Review URL: https://codereview.chromium.org/1320673014
/external/skia/src/core/SkOpts.h
4a37d08382a16717cde52c3d2687b021c5413464 10-Sep-2015 mtklein <mtklein@chromium.org> Port SkBlitRow::Color32 to SkOpts.

This was a pre-SkOpts attempt that we can bring under its wing now.

This should be a perf no-op, deo volente.

BUG=skia:4117

Review URL: https://codereview.chromium.org/1314863006
/external/skia/src/core/SkOpts.h
2d141ba2df8f7506848aa9369f502944e837cd09 18-Aug-2015 mtklein <mtklein@chromium.org> Patches on top of Radu's latest.

patch from issue 1273033005 at patchset 120001 (http://crrev.com/1273033005#ps120001)

BUG=skia:

Review URL: https://codereview.chromium.org/1288323004
/external/skia/src/core/SkOpts.h
4977983510028712528743aa877f6da83781b381 10-Aug-2015 mtklein <mtklein@chromium.org> Sk4px blit mask.

Local SKP nanobenching ranges SSE between 1.05x and 0.87x, much more heavily weighted toward <1.0x ratios (speedups).
I profiled the top five regressions (1.05x-1.01x) and they look like noise. Will follow up after broad bot results.

NEON looks similar but less extreme than SSE changes, ranging between 1.02x and 0.95x, again mostly speedups in 0.99x-0.97x range.

The old code trifurcated into black, opaque-but-not-black, and general versions as a function of the constant src color. I did not see a significant difference between general and opaque-but-not-black, and I don't think a black version would be faster using SIMD. So we have here just one version of the code, the general version.

Somewhat fantastically, I see no pixel diffs on GMs or SKPs.

I will be following up with more CLs for the other procs called by SkBlitMask.
BUG=skia:

Review URL: https://codereview.chromium.org/1278253003
/external/skia/src/core/SkOpts.h
b6394746ff546a9c60d68e3be162cb38feffa803 06-Aug-2015 mtklein <mtklein@chromium.org> Port SkTextureCompression opts to SkOpts

Pretty vanilla translation. I cleaned up who calls whom a little.
Used to be utils -> opts -> utils, now it's just utils -> opts.

I may follow up with a pass over the NEON code for readability
and to clean up dead code.

This turns on NEON A8->R11EAC conversion for ARMv8.
Unit tests which now hit the NEON code still pass.
I can't find any related bench.

BUG=skia:4117

Review URL: https://codereview.chromium.org/1273103002
/external/skia/src/core/SkOpts.h
d029ded92d409a004f2096c78f5a99b524206481 04-Aug-2015 mtklein <mtklein@chromium.org> Port morphology to SkOpts.

Nothing too fancy.

Direction enums become enum classes so they don't get all confused. An
alternative is to create one single Direction enum that both blur and
morphology opts use.

BUG=skia:4117

Review URL: https://codereview.chromium.org/1267343004
/external/skia/src/core/SkOpts.h
dce5ce4276e2825efc6d8c4daa819c965794cd12 04-Aug-2015 mtklein <mtklein@chromium.org> Port SkBlurImage opts to SkOpts.

+268 -535 lines

I also rearranged the code a little bit to encapsulate itself better,
mostly replacing static helper functions with lambdas. This also
let me merge the SSE2 and SSE4.1 code paths.

BUG=skia:4117

Review URL: https://codereview.chromium.org/1264103004
/external/skia/src/core/SkOpts.h
bdb34d0345748f424e3c787ad9bbef04134c7e3b 31-Jul-2015 mtklein <mtklein@chromium.org> Move SkOpts.h back to src/core.

The Chrome opts targets (sse2, ssse3, sse41, etc) don't have include/private on
their include path. This should unblock the roll.

TBR=reed@google.com

BUG=skia:4117

Review URL: https://codereview.chromium.org/1268853007
/external/skia/src/core/SkOpts.h
7eb0945af254d376df11475150d184623104cf93 31-Jul-2015 mtklein <mtklein@chromium.org> Port SkUtils opts to SkOpts.

With this new arrangement, the benefits of inlining sk_memset16/32 have changed.

On x86, they're not significantly different, except for small N<=10 where the inlined code is significantly slower.
On ARMv7 with NEON, our custom code is still significantly faster for N>10 (up to 2x faster). For small N<=10 inlining is still significantly faster.
On ARMv7 without NEON, our custom code is still ridiculously faster (up to 10x) than inlining for N>10, though for small N<=10 inlining is still a little faster.

We were not using the NEON memset16 and memset32 procs on ARMv8. At first blush, that seems to be an oversight, but if so it's an extremely lucky one. The ARMv8 code generation for our memset16/32 procs is total garbage, leaving those methods ~8x slower than just inlining the memset, using the compiler's autovectorization.

So, no need to inline any more on x86, and still inline for N<=10 on ARMv7. Always inline for ARMv8.

BUG=skia:4117

Review URL: https://codereview.chromium.org/1270573002
/external/skia/src/core/SkOpts.h
f684a78d9ea988883c9b2c7bcc4ea4d5e68bd998 30-Jul-2015 mtklein <mtklein@chromium.org> Runtime CPU detection for rsqrt().

This enables the NEON sk_float_rsqrt() code for configurations that have NEON at run-time but not compile-time.

These devices will see about a 2x (1.26 -> 2.33) slowdown in sk_float_rsqrt(), but it should be more precise than our portable fallback.

(When inlined, the portable fallback and the NEON code are almost identical in speed. The only difference is precision. Going through a function pointer is causing all this slowdown. This is a good example of a place where Skia really benefits from compile-time NEON.)

BUG=skia:4117,skia:4114

No public API changes.
TBR=reed@google.com

Review URL: https://codereview.chromium.org/1264893002
/external/skia/src/core/SkOpts.h
8317a1832f55e175531ee7ae7ccd12a3a15e3c75 30-Jul-2015 mtklein <mtklein@chromium.org> Lay groundwork for SkOpts.

This doesn't really do anything yet. It's just the CPU detection code, skeleton new .cpp files, and a few little .gyp tweaks.

BUG=skia:4117

Committed: https://skia.googlesource.com/skia/+/ce2c5055cee5d5d3c9fc84c1b3eeed4b4d84a827

Review URL: https://codereview.chromium.org/1255193002
/external/skia/src/core/SkOpts.h
56b78a7a2aa87998e6a3f3d026e2184f1bccaf0c 27-Jul-2015 mtklein <mtklein@chromium.org> Revert of Lay groundwork for SkOpts. (patchset #3 id:40001 of https://codereview.chromium.org/1255193002/)

Reason for revert:
Chromium doesn't call SkGraphics::Init(). This setup won't work.

Original issue's description:
> Lay groundwork for SkOpts.
>
> This doesn't really do anything yet. It's just the CPU detection code, skeleton new .cpp files, and a few little .gyp tweaks.
>
> BUG=skia:4117
>
> Committed: https://skia.googlesource.com/skia/+/ce2c5055cee5d5d3c9fc84c1b3eeed4b4d84a827

TBR=djsollen@google.com
NOPRESUBMIT=true
NOTREECHECKS=true
NOTRY=true
BUG=skia:4117

Review URL: https://codereview.chromium.org/1261743002
/external/skia/src/core/SkOpts.h
ce2c5055cee5d5d3c9fc84c1b3eeed4b4d84a827 27-Jul-2015 mtklein <mtklein@chromium.org> Lay groundwork for SkOpts.

This doesn't really do anything yet. It's just the CPU detection code, skeleton new .cpp files, and a few little .gyp tweaks.

BUG=skia:4117

Review URL: https://codereview.chromium.org/1255193002
/external/skia/src/core/SkOpts.h