History log of /external/skia/src/core/SkCpu.cpp
Revision Date Author Comments
8f11d4dcafef4447fa68ea0ab28a72589241e9fd 24-Jan-2018 Mike Klein <mtklein@chromium.org> eliminate SK_BUILD_FOR_WIN32

SK_BUILD_FOR_WIN and SK_BUILD_FOR_WIN32 have long meant the same thing.

Chrome fix is https://chromium-review.googlesource.com/c/chromium/src/+/884007

Change-Id: I0e907b1bcd2a358eabf776f414fd3aeb3c689561
Reviewed-on: https://skia-review.googlesource.com/99340
Reviewed-by: Mike Reed <reed@google.com>
/external/skia/src/core/SkCpu.cpp
592c225b03ca677a1217eabdbc38eede6afcdb14 28-Nov-2017 bsheedy <bsheedy@google.com> Make Skia compatible with Android NDK r16

Changes to Skia that are necessary to make Chromium compile with
Android NDK r16, which switches to unified headers.

Sister CLs:
src/third_party/android_tools/ndk: https://chromium-review.googlesource.com/c/android_ndk/+/784230
src/: https://chromium-review.googlesource.com/c/chromium/src/+/777822

Bug: chromium:771171
Change-Id: I3d35df5b99d8eb7d7d938d21b5aecdf4c2d5da0f
Reviewed-on: https://skia-review.googlesource.com/75422
Reviewed-by: Mike Klein <mtklein@chromium.org>
Commit-Queue: Mike Klein <mtklein@chromium.org>
/external/skia/src/core/SkCpu.cpp
ca5d2cfe41c28c1d446e633b5991a2eab4681ce2 25-Oct-2017 Mike Klein <mtklein@chromium.org> fine-grained ARMv7 CPU feature detection

VFPv4 does not imply NEON, so check that bit separately.
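
A minimal sketch of checking the two bits independently, assuming the hwcap word has already been read (for example with getauxval(AT_HWCAP)). The bit positions are the ones named HWCAP_NEON and HWCAP_VFPv4 in the 32-bit ARM Linux hwcap.h; the k-prefixed names, the Armv7Features struct and decode_armv7() are hypothetical and are not taken from SkCpu.cpp.

```cpp
#include <cstdint>

// Bit positions from the 32-bit ARM Linux <asm/hwcap.h>.
// The k-prefixed names are invented for this sketch.
constexpr uint32_t kHWCAP_NEON  = 1u << 12;
constexpr uint32_t kHWCAP_VFPv4 = 1u << 16;

// Hypothetical result type: one flag per feature, nothing inferred.
struct Armv7Features {
    bool neon;
    bool vfpv4;
};

// Decode each capability from its own bit instead of assuming
// that VFPv4 support also means NEON support.
constexpr Armv7Features decode_armv7(uint32_t hwcap) {
    return { (hwcap & kHWCAP_NEON)  != 0,
             (hwcap & kHWCAP_VFPv4) != 0 };
}
```

A core that reports VFPv4 without NEON then decodes as vfpv4 == true, neon == false, so NEON code paths stay disabled on it.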

Bug: b/63553517
Change-Id: Ibc218871804204d5a91d0b7fc8d5c91fe2e95f01
Reviewed-on: https://skia-review.googlesource.com/63640
Reviewed-by: Bailey Forrest <bcf@google.com>
Commit-Queue: Mike Klein <mtklein@chromium.org>
/external/skia/src/core/SkCpu.cpp
04d488cf138e4d6935aa26f1e9f742b597c56cc7 18-Jul-2017 Mike Klein <mtklein@chromium.org> Tweak HWCAP_... names to avoid clash with hwcap.h

These HWCAP_... values are defined in hwcap.h, but we don't get them
from there because some platforms have an older hwcap.h that doesn't
have these bits named yet.

Even though we don't directly include hwcap.h, it seems it can get
itself included somehow on some platforms. That leads to a name
clash with the HWCAP_... #defines in there. To avoid it, rename them.

Change-Id: I70788b5e4072c307c6eee55d6f197c3b9a49f5dc
Reviewed-on: https://skia-review.googlesource.com/24408
Reviewed-by: Mike Klein <mtklein@chromium.org>
Commit-Queue: Mike Klein <mtklein@chromium.org>
/external/skia/src/core/SkCpu.cpp
c665fddeab30dde8f43862d2e8381f4c73e80079 06-Jun-2017 Mike Klein <mtklein@chromium.org> define HWCAP_* ourselves in SkCpu.cpp

For compatibility with older system headers, instead of looking for HWCAP_
values in asm/hwcap.h, just define the bits we want to test ourselves.

This lets us compile this code on systems before those bits were defined.
At runtime the bits will harmlessly test as zero.

Change-Id: I44b6aba7d6f0fc2c5df08ad262c2b0537d900209
Reviewed-on: https://skia-review.googlesource.com/18844
Reviewed-by: Herb Derby <herb@google.com>
Commit-Queue: Mike Klein <mtklein@chromium.org>
/external/skia/src/core/SkCpu.cpp
c6a449d6bf80c5bb9e02aeaed049a99870e8b1e8 01-Mar-2017 Mike Klein <mtklein@chromium.org> Add AVX-512 detection to SkCpu, try 2.

This time, don't call xgetbv() before checking we can.
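
A sketch of the ordering this describes, not the code from the CL: XGETBV is only executed after CPUID.1:ECX reports OSXSAVE (bit 27), because the instruction faults when the OS has not enabled XSAVE. The function name, parameter names and the read_xcr0 callback are hypothetical; the callback stands in for the real xgetbv(0) so the logic can be exercised on any machine.

```cpp
#include <cstdint>
#include <functional>

constexpr uint32_t kOSXSAVE     = 1u << 27;  // CPUID.1:ECX
constexpr uint32_t kAVX512F     = 1u << 16;  // CPUID.(EAX=7,ECX=0):EBX
// XCR0 bits the OS must enable for AVX-512 state:
// SSE (1), AVX (2), opmask (5), ZMM_Hi256 (6), Hi16_ZMM (7).
constexpr uint64_t kXCR0_AVX512 = 0xE6;

// Returns true only if the CPU has AVX-512F and the OS saves its state.
// read_xcr0 is a stand-in for xgetbv(0) and is never invoked unless
// OSXSAVE is set.
bool avx512f_usable(uint32_t cpuid1_ecx, uint32_t cpuid7_ebx,
                    const std::function<uint64_t()>& read_xcr0) {
    if (!(cpuid1_ecx & kOSXSAVE)) {
        return false;  // executing xgetbv here would fault
    }
    if ((read_xcr0() & kXCR0_AVX512) != kXCR0_AVX512) {
        return false;  // OS does not preserve the AVX-512 registers
    }
    return (cpuid7_ebx & kAVX512F) != 0;
}
```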

This reverts commit b26373cfd8151b2fa56bdf532ddcde4919cce09f.

CQ_INCLUDE_TRYBOTS=skia.primary:Test-Mac-Clang-MacMini4.1-GPU-GeForce320M-x86_64-Debug

Change-Id: I148302cb36446891b1d79b2e60cde0b43420c1a8
Reviewed-on: https://skia-review.googlesource.com/9089
Reviewed-by: Mike Klein <mtklein@chromium.org>
Commit-Queue: Mike Klein <mtklein@chromium.org>
/external/skia/src/core/SkCpu.cpp
b26373cfd8151b2fa56bdf532ddcde4919cce09f 28-Feb-2017 Cary Clark <caryclark@google.com> Revert "Add AVX-512 detection to SkCpu"

This reverts commit 3c322e23a013e78fcbe0edd7adccd580af8466bc.

Reason for revert: crash in SkCpu on Mac

Original change's description:
> Add AVX-512 detection to SkCpu
>
> I've added a SKY alias for the five new bits detected on a Skylake Xeon.
>
> Change-Id: I9f7dd48f4dc866608d81befd061434ca325ef451
> Reviewed-on: https://skia-review.googlesource.com/9043
> Reviewed-by: Herb Derby <herb@google.com>
> Commit-Queue: Mike Klein <mtklein@chromium.org>
>

TBR=mtklein@chromium.org,herb@google.com,reviews@skia.org
NOPRESUBMIT=true
NOTREECHECKS=true
NOTRY=true

Change-Id: I3cc06c7e32391e68d6cfe084786b18270cdab631
Reviewed-on: https://skia-review.googlesource.com/9074
Reviewed-by: Cary Clark <caryclark@google.com>
Commit-Queue: Cary Clark <caryclark@google.com>
/external/skia/src/core/SkCpu.cpp
3c322e23a013e78fcbe0edd7adccd580af8466bc 28-Feb-2017 Mike Klein <mtklein@chromium.org> Add AVX-512 detection to SkCpu

I've added a SKY alias for the five new bits detected on a Skylake Xeon.

Change-Id: I9f7dd48f4dc866608d81befd061434ca325ef451
Reviewed-on: https://skia-review.googlesource.com/9043
Reviewed-by: Herb Derby <herb@google.com>
Commit-Queue: Mike Klein <mtklein@chromium.org>
/external/skia/src/core/SkCpu.cpp
30ec0b3735d5f728c2aea4184736a3e286a5ccda 08-Feb-2017 Mike Klein <mtklein@chromium.org> Simplify SkCpu.cpp preprocessor guards.

We have a couple ways to detect CPU features on ARM:
- on ARMv8, getauxval(AT_HWCAP)
- on ARMv7, getauxval(AT_HWCAP) and cpu-features.h

This guards each of these methods with preprocessor guards to match
exactly when we can use them. Today they're sort of a mix of that and
higher level expectations about particular build and operating systems.

I'm looking into doing this directly by reading CPU registers,
much like we do for x86 further up the file.

None of this is super important right now, so as long as we don't decide
that we have these features when we don't, things will be fine. It's no
big deal for now if we fail to detect them.

Change-Id: I3b7768483086d0f3f4f6516b754c3ea5ec2d03e5
Reviewed-on: https://skia-review.googlesource.com/8182
Reviewed-by: Chinmay Garde <chinmaygarde@google.com>
Reviewed-by: Mike Klein <mtklein@chromium.org>
Commit-Queue: Mike Klein <mtklein@chromium.org>
/external/skia/src/core/SkCpu.cpp
4ef8cb3527b7e3f453dccd39eea76e31eb2c33c7 12-Jan-2017 Mike Klein <mtklein@chromium.org> some armv7 hacking

We can splice these stages if we drop them down to 2 at a time.
Turns out this is significantly (2-3x) faster than the status quo.

SkRasterPipeline_…
    …f16_compile   1x
    …srgb_compile  2.06x
    …f16_run       3.08x
    …srgb_run      4.61x

Added a couple ways to detect (likely) the required VFPv4 support:
- use hwcap when available (NDK ≥21, Android framework)
- use cpu-features when not (NDK <21)

The code in SkSplicer_generated.h is ARM, not Thumb2. SkSplicer seems
to be blx'ing into it, so that's great, and we bx lr out. There's no
point in attempting to use Thumb2 in vector heavy code... it'll all be
4 byte anyway.

Follow ups:
- vpush {d8-d9} before the loop, vpop {d8-d9} afterwards,
skip these instructions when splicing;
- (probably) drop jumping stages down to 2-at-a-time also.

Change-Id: If151394ec10e8cbd6a05e2d81808488d743bfe15
Reviewed-on: https://skia-review.googlesource.com/6940
Reviewed-by: Herb Derby <herb@google.com>
Commit-Queue: Mike Klein <mtklein@chromium.org>
/external/skia/src/core/SkCpu.cpp
f44703a87f532b3f593d91605d66d52c6bbc45c9 12-Dec-2016 Mike Klein <mtklein@chromium.org> Remove dependency on NDK cpufeatures.

Instead of relying on cpu-features.c, just do what it does.

Good reading: http://man7.org/linux/man-pages/man3/getauxval.3.html

While it's nice to use the headers when possible, should either of these headers not be available, we can fall back to doing it all manually:

extern "C" unsigned long getauxval(unsigned long);
static const int AT_HWCAP = 16;
static const int HWCAP_CRC32 = (1<<7);


To keep things simple I've slimmed CPU feature detection down to just the features we actually make use of. This removes all runtime feature detection for ARMv7: we expect NEON to be globally available, and so far we haven't used the other FMA/FP16 bits on ARMv7. ARMv8 feature detection remains the same: CRC32 before, CRC32 after. x86 (cpuid-based detection) and MIPS (nothing) are untouched.


We need to keep //third_party/cpu-features for //third_party/libwebp.

Change-Id: I6c96df9a09ae68c8c0e54c1152aa177ba9bafc83
Reviewed-on: https://skia-review.googlesource.com/5800
Reviewed-by: Derek Sollenberger <djsollen@google.com>
Commit-Queue: Mike Klein <mtklein@chromium.org>
/external/skia/src/core/SkCpu.cpp
78d5a3bac5cbde50cd12d8b9ab6dd269324b5272 30-Sep-2016 Mike Klein <mtklein@chromium.org> Add an SkOpts target for Haswell+ Intel chips.

Haswell brought a whole slew of handy new instructions for us (AVX2, FMA, BMI1+BMI2) and also features F16C, which arrived one generation earlier on Ivy Bridge. We work with integers often enough that we really want to target AVX2 instead of AVX, which makes it pretty practical to ask for all those other goodies along with it.
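
One way to picture "asking for all of them together" is a combined feature mask that only matches when every listed bit is present. This is a sketch: the bit assignments, the HSW alias and supports() are illustrative assumptions, not the values from SkCpu.h.

```cpp
#include <cstdint>

// Hypothetical bit assignments, for illustration only.
enum : uint32_t {
    AVX2 = 1u << 0,
    BMI1 = 1u << 1,
    BMI2 = 1u << 2,
    F16C = 1u << 3,
    FMA  = 1u << 4,

    // A Haswell-class target requires all five at once.
    HSW  = AVX2 | BMI1 | BMI2 | F16C | FMA,
};

// True only when every bit in `wanted` was detected.
constexpr bool supports(uint32_t detected, uint32_t wanted) {
    return (detected & wanted) == wanted;
}
```

With this shape, a chip that has F16C but lacks AVX2 (the Ivy Bridge case mentioned above) fails supports(detected, HSW) and falls back to an older target.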

Chrome's GN files and Google3's BUILD file will need an update, before or after this CL.

BUG=skia:

GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2840
CQ_INCLUDE_TRYBOTS=master.client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot

Change-Id: I826daf77b5104664c5d31ddaabee347e287b87a2
Reviewed-on: https://skia-review.googlesource.com/2840
Commit-Queue: Mike Klein <mtklein@chromium.org>
Reviewed-by: Herb Derby <herb@google.com>
/external/skia/src/core/SkCpu.cpp
7d6fb2c92d096ac3630e23d561a4077a974a815c 25-Aug-2016 mtklein <mtklein@chromium.org> GN: Android

Once you have downloaded an Android NDK, you can set the ndk GN arg to use it.
E.g. my gn.args looks like:
is_debug = false
ndk = "/opt/android-ndk"

This should be enough to get you going for an arm64 build. You ought to be able to tweak that to other architectures by changing target_cpu to "arm", "x86", "x86-64", etc. That won't quite work until I follow this up a bit, but the skeleton is there.

This is enough to get me compiled, linked, and running to completion on my N5x.

BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2275983004

Review-Url: https://codereview.chromium.org/2275983004
/external/skia/src/core/SkCpu.cpp
f1b6030b44a4a9523183c3809a165b6b5353fff5 19-Aug-2016 mtklein <mtklein@chromium.org> Detect CRC32 instructions on ARMv8.

I have successfully detected CRC32 instruction support on my Nexus 5x.

Use of these instructions to follow... I am not yet sure which compilers, if any, will give me intrinsics or let me write them in asm.

defined(__ARM_FEATURE_CRC32) should cover users like Android Framework who build with the best settings possible. cpu-features.h covers use cases like Clank and our bots.
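
A sketch of the two-tier check: a compile-time answer when the build already targets CRC32, and a runtime probe otherwise. The runtime branch here uses getauxval(AT_HWCAP) on 64-bit ARM Linux in place of cpu-features.h, which is a substitution made for this example. has_crc32(), crc32_from_hwcap() and kHWCAP_CRC32 are hypothetical names; bit 7 is the arm64 Linux HWCAP_CRC32 position.

```cpp
#if defined(__linux__) && defined(__aarch64__)
    #include <sys/auxv.h>   // getauxval(), AT_HWCAP
#endif

// arm64 Linux hwcap bit for the CRC32 instructions (name invented here).
constexpr unsigned long kHWCAP_CRC32 = 1ul << 7;

constexpr bool crc32_from_hwcap(unsigned long hwcap) {
    return (hwcap & kHWCAP_CRC32) != 0;
}

inline bool has_crc32() {
#if defined(__ARM_FEATURE_CRC32)
    // The compiler was told the target has CRC32; no runtime check needed.
    return true;
#elif defined(__linux__) && defined(__aarch64__)
    // Ask the kernel what the running CPU supports.
    return crc32_from_hwcap(getauxval(AT_HWCAP));
#else
    // Unknown platform: report "not supported" rather than guess.
    return false;
#endif
}
```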

BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2259133002

Review-Url: https://codereview.chromium.org/2259133002
/external/skia/src/core/SkCpu.cpp
5608e2ed2299496eee3c57e0fe426ae9bd0d07a4 11-Jul-2016 mtklein <mtklein@chromium.org> Clean up hyper-local SkCpu feature test experiment.

This removes the code paths where we make SkCpu::Supports() calls
from within a tight loop. It keeps code paths using SkCpu::Supports()
to choose entire routines from src/opts/.

We can't rely on these hyper-local checks to be hoisted up reliably enough.
It worked pretty well with the first couple platforms we tried (e.g. Clang
on Linux/Mac) but we can't guarantee it works everywhere.

Further, I'm not able to actually do anything fancy with those tests
outside of x86... I've not found a way to get, say, NEON+F16 conversion
code embedded into ordinary NEON code outside of writing the entire function
in external assembly.

This whole idea becomes less important now that we've got a way to chain
separate function calls together efficiently. We can now, e.g., use an
AVX+F16C method to load some pixels, then chain that into an ordinary AVX
method to color filter them.
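
The "choose entire routines" pattern can be sketched as a function pointer picked once, outside the loop, so the feature test never runs per pixel. Everything here is hypothetical: ScaleFn, the two scale_* routines and choose_scale() are stand-ins, and scale_wide is plain C++ standing in for a routine that would be compiled with wider SIMD enabled.

```cpp
#include <cstddef>

// Hypothetical routine type: scale n floats in place by k.
using ScaleFn = void (*)(float* px, size_t n, float k);

// Baseline routine, usable on any CPU.
static void scale_portable(float* px, size_t n, float k) {
    for (size_t i = 0; i < n; i++) {
        px[i] *= k;
    }
}

// Stand-in for a specialized build of the same routine.
// It handles four values per iteration, then the leftover tail.
static void scale_wide(float* px, size_t n, float k) {
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        px[i + 0] *= k;
        px[i + 1] *= k;
        px[i + 2] *= k;
        px[i + 3] *= k;
    }
    for (; i < n; i++) {
        px[i] *= k;
    }
}

// The feature check happens here, once, not inside either loop.
inline ScaleFn choose_scale(bool has_wide_simd) {
    return has_wide_simd ? scale_wide : scale_portable;
}
```

A caller would evaluate choose_scale() at startup with the result of its feature query and reuse the returned pointer for every buffer.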

BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2138073002
CQ_EXTRA_TRYBOTS=client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot

Review-Url: https://codereview.chromium.org/2138073002
/external/skia/src/core/SkCpu.cpp
eb85fd746d6390f53e250583a0544bf59ed34b35 21-Apr-2016 mtklein <mtklein@chromium.org> SkCpu w/o static initializer

I think I cracked it.

Though, this may not technically be legal C++...
I've only got one definition of SkCpu::gCachedFeatures,
but two different declarations: non-const in SkCpu.cpp, const elsewhere.

Is this...
- legal C++?
- not C++ but probably works as I think?
- not C++ and will probably blow up?
- who knows, let's see?

I have tested that the features are cached properly, read properly, and that the generated code treats SkCpu::gCachedFeatures as a global constant outside SkCpu.cpp. So it all observably works optimally.

Expanding testing to more bots.
TBR=reed@google.com

BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search2?unt=true&query=source_type%3Dgm&master=false&issue=1905683003

Review URL: https://codereview.chromium.org/1905683003
/external/skia/src/core/SkCpu.cpp
4311f016612a814282029daa4bd102053a853d82 19-Apr-2016 mtklein <mtklein@chromium.org> Move CPU feature detection to its own file.

- Moves CPU feature detection to its own file.
- Cleans up some redundant feature detection scattered around core/ and opts/.
- Can now detect a few new CPU features:
* F16C -> Intel f16<->f32 instructions, added between AVX and AVX2
* FMA -> Intel FMA instructions, added at the same time as AVX2
* VFP_FP16 -> ARM f16<->f32 instructions, quite common
* NEON_FMA -> ARM FMA instructions, also quite common
* SSE and SSE3... why not?

This new internal API makes it very cheap to do fine-grained runtime CPU
feature detection. Redundant calls to SkCpu::Supports() should be eliminated
and it's hoistable out of loops. It compiles away entirely when we have the
appropriate instructions available at compile time.

This means we can call it to guard even a little snippet of 1 or 2 instructions
right where needed and let inlining hoist the check (if any at all) up to
somewhere that doesn't hurt performance. I've explained how I made this work
in the private section of the new header.
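
The "compiles away entirely" property can be illustrated with a baseline mask built from compiler-defined macros. This is a sketch under assumptions, not the mechanism described in the header: the bit assignments, kBaseline and supports() are invented for illustration, while __SSE2__, __SSE4_1__ and __AVX__ are the macros GCC and Clang define when those instruction sets are enabled for the build.

```cpp
#include <cstdint>

// Hypothetical bit assignments.
enum : uint32_t {
    SSE2  = 1u << 0,
    SSE41 = 1u << 1,
    AVX   = 1u << 2,
};

// Features the compiler was already allowed to use for this build.
constexpr uint32_t kBaseline = 0
#if defined(__SSE2__)
    | SSE2
#endif
#if defined(__SSE4_1__)
    | SSE41
#endif
#if defined(__AVX__)
    | AVX
#endif
    ;

// Only bits outside the baseline need the runtime-detected value.
// If `wanted` is a constant fully covered by kBaseline, `missing` is 0
// and the whole expression is a compile-time `true`.
constexpr bool supports(uint32_t runtime_detected, uint32_t wanted) {
    return (runtime_detected & (wanted & ~kBaseline)) == (wanted & ~kBaseline);
}
```

In a build compiled with AVX enabled, supports(detected, AVX) never reads the detected value, so an optimizer can drop the branch around the guarded snippet.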

Once this lands and bakes a bit, I'll start following up with CLs to use it more
and to add a bunch of those little 1-2 instruction snippets we've been wanting,
e.g. cvtps2ph, cvtph2ps, ptest, pmulld, pmovzxbd, blendvps, pshufb, roundps
(for floor) on x86, and vcvt.f32.f16, vcvt.f16.f32 on ARM.

BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search2?unt=true&query=source_type%3Dgm&master=false&issue=1890483002
CQ_EXTRA_TRYBOTS=client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot

Committed: https://skia.googlesource.com/skia/+/872ea29357439f05b1f6995dd300fc054733e607

Review URL: https://codereview.chromium.org/1890483002
/external/skia/src/core/SkCpu.cpp
86498fbfcb93a9048bbe1c28cc0df40d8d0c96e9 15-Apr-2016 mtklein <mtklein@google.com> Revert of Move CPU feature detection to its own file. (patchset #7 id:120001 of https://codereview.chromium.org/1890483002/ )

Reason for revert:
many unexpected GM diffs across GPU+CPU configs on Windows (hopefully just text masks on GPU?). seems like we pick a different srcover variant in some places.

Original issue's description:
> Move CPU feature detection to its own file.
>
> - Moves CPU feature detection to its own file.
> - Cleans up some redundant feature detection scattered around core/ and opts/.
> - Can now detect a few new CPU features:
> * F16C -> Intel f16<->f32 instructions, added between AVX and AVX2
> * FMA -> Intel FMA instructions, added at the same time as AVX2
> * VFP_FP16 -> ARM f16<->f32 instructions, quite common
> * NEON_FMA -> ARM FMA instructions, also quite common
> * SSE and SSE3... why not?
>
> This new internal API makes it very cheap to do fine-grained runtime CPU
> feature detection. Redundant calls to SkCpu::Supports() should be eliminated
> and it's hoistable out of loops. It compiles away entirely when we have the
> appropriate instructions available at compile time.
>
> This means we can call it to guard even a little snippet of 1 or 2 instructions
> right where needed and let inlining hoist the check (if any at all) up to
> somewhere that doesn't hurt performance. I've explained how I made this work
> in the private section of the new header.
>
> Once this lands and bakes a bit, I'll start following up with CLs to use it more
> and to add a bunch of those little 1-2 instruction snippets we've been wanting,
> e.g. cvtps2ph, cvtph2ps, ptest, pmulld, pmovzxbd, blendvps, pshufb, roundps
> (for floor) on x86, and vcvt.f32.f16, vcvt.f16.f32 on ARM.
>
> BUG=skia:
> GOLD_TRYBOT_URL= https://gold.skia.org/search2?unt=true&query=source_type%3Dgm&master=false&issue=1890483002
> CQ_EXTRA_TRYBOTS=client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot
>
> Committed: https://skia.googlesource.com/skia/+/872ea29357439f05b1f6995dd300fc054733e607

TBR=fmalita@chromium.org,herb@google.com,reed@google.com,mtklein@chromium.org
# Skipping CQ checks because original CL landed less than 1 days ago.
NOPRESUBMIT=true
NOTREECHECKS=true
NOTRY=true
BUG=skia:

Review URL: https://codereview.chromium.org/1892643003
/external/skia/src/core/SkCpu.cpp
872ea29357439f05b1f6995dd300fc054733e607 14-Apr-2016 mtklein <mtklein@chromium.org> Move CPU feature detection to its own file.

- Moves CPU feature detection to its own file.
- Cleans up some redundant feature detection scattered around core/ and opts/.
- Can now detect a few new CPU features:
* F16C -> Intel f16<->f32 instructions, added between AVX and AVX2
* FMA -> Intel FMA instructions, added at the same time as AVX2
* VFP_FP16 -> ARM f16<->f32 instructions, quite common
* NEON_FMA -> ARM FMA instructions, also quite common
* SSE and SSE3... why not?

This new internal API makes it very cheap to do fine-grained runtime CPU
feature detection. Redundant calls to SkCpu::Supports() should be eliminated
and it's hoistable out of loops. It compiles away entirely when we have the
appropriate instructions available at compile time.

This means we can call it to guard even a little snippet of 1 or 2 instructions
right where needed and let inlining hoist the check (if any at all) up to
somewhere that doesn't hurt performance. I've explained how I made this work
in the private section of the new header.

Once this lands and bakes a bit, I'll start following up with CLs to use it more
and to add a bunch of those little 1-2 instruction snippets we've been wanting,
e.g. cvtps2ph, cvtph2ps, ptest, pmulld, pmovzxbd, blendvps, pshufb, roundps
(for floor) on x86, and vcvt.f32.f16, vcvt.f16.f32 on ARM.

BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search2?unt=true&query=source_type%3Dgm&master=false&issue=1890483002
CQ_EXTRA_TRYBOTS=client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot

Review URL: https://codereview.chromium.org/1890483002
/external/skia/src/core/SkCpu.cpp