History log of /external/skia/src/core/SkHalf.h
Revision Date Author Comments
be8c19e8d3deac9b9585c44b9a423912dd00a75a 19-Feb-2016 mtklein <mtklein@chromium.org> NEON f32 <-> f16 and f32 <-> u16

Adds f32 <-> f16 ARMv7 and ARMv8 NEON code.
Also adds NEON f32 <-> u16 code to make the comparison fair.

The NDK GCC does not support the ARMv8 NEON intrinsics needed to go fastest, so we use a tiny amount of inline assembly.

The ARMv7 half -> float is different enough from the SSE version that it does not make sense to use SkNx.

Still TODO:
ARMv7 float -> half. Naively translating the SSE version results in 0x0000 where we'd expect a denormal output.
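
One plausible reading of this TODO, offered as an assumption and not as the reviewed code: ARMv7 NEON flushes f32 denormals to zero, so any conversion that passes through an intermediate f32 denormal loses the value. The scalar sketch below emulates that. The helper names (`float_bits`, `bits_float`, `flush_denorm`, `scaled_float_to_half`) and the scale-then-shift conversion are hypothetical and do not come from SkHalf.h. Inputs are assumed to lie in [0,1].

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>

// Hypothetical helpers for illustration; none of these names come from SkHalf.h.
inline uint32_t float_bits(float f) { uint32_t b; std::memcpy(&b, &f, sizeof b); return b; }
inline float bits_float(uint32_t b) { float f; std::memcpy(&f, &b, sizeof f); return f; }

// Emulates flush-to-zero: an f32 whose exponent field is zero becomes signed zero.
inline float flush_denorm(float f) {
    const uint32_t b = float_bits(f);
    return (b & 0x7F800000u) == 0 ? bits_float(b & 0x80000000u) : f;
}

// Assumed conversion for inputs in [0,1]: multiply by 2^-112 so the f32 exponent
// field matches the f16 bias, then drop the low 13 mantissa bits.
// Inputs below 2^-14 become f32 denormals after the multiply.
inline uint16_t scaled_float_to_half(float f, bool flush) {
    const float rebias = bits_float(15u << 23);  // 2^-112
    float scaled = f * rebias;
    if (flush) {
        scaled = flush_denorm(scaled);           // what a denorm==zero unit would do
    }
    return static_cast<uint16_t>(float_bits(scaled) >> 13);
}
```

With `flush` set to false, 2^-20 converts to the f16 denormal 0x0010. With `flush` set to true, the same input yields 0x0000, while normal values such as 0.5 are unaffected.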

BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search2?unt=true&query=source_type%3Dgm&master=false&issue=1700473003
CQ_EXTRA_TRYBOTS=client.skia.android:Test-Android-GCC-Nexus5-CPU-NEON-Arm7-Release-Trybot,Test-Android-GCC-Nexus9-CPU-Denver-Arm64-Release-Trybot;client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot

Review URL: https://codereview.chromium.org/1700473003
/external/skia/src/core/SkHalf.h
ddb64c81fbe05ef7188135564bbd695edea9fdf0 11-Feb-2016 mtklein <mtklein@chromium.org> new version of SkHalfToFloat_01

This is a little faster than the previous version, and much better explained.

BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search2?unt=true&query=source_type%3Dgm&master=false&issue=1688233002

Review URL: https://codereview.chromium.org/1688233002
/external/skia/src/core/SkHalf.h
fff055cc5f9ca5015678f4f13a4f842084bd62d5 11-Feb-2016 mtklein <mtklein@chromium.org> SkHalfToFloat_01 / SkFloatToHalf_01

These are basically inlined, 4-at-a-time versions of our existing functions,
but cut down to avoid any work that's only necessary outside [0,1].

Both f16 and f32 denorms should work fine modulo the usual ARMv7 NEON denorm==zero caveat.

In exchange for a little speed, f32->f16 does not round properly.
Instead it truncates, so it's never off by more than 1 bit.

Support for finite values >1 or <0 is straightforward to add back.
>1 might already work as-is.
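
The trade-off described above (a conversion restricted to [0,1] that truncates instead of rounding) can be sketched in scalar form. This is a minimal illustration under stated assumptions, not the 4-at-a-time code from this change: the names `float_to_half_01_trunc` and `half_to_float_01` are hypothetical, and the sketch assumes the host handles f32 denormals (no flush-to-zero).

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>

// Hypothetical bit-cast helpers; not part of SkHalf.h.
inline uint32_t to_bits(float f) { uint32_t b; std::memcpy(&b, &f, sizeof b); return b; }
inline float from_bits(uint32_t b) { float f; std::memcpy(&f, &b, sizeof f); return f; }

// f32 -> f16 for inputs assumed to be in [0,1].
// Multiplying by 2^-112 moves the exponent from bias 127 to bias 15.
// Shifting right by 13 then discards the low mantissa bits (truncation),
// so the result is never more than 1 bit below the correctly rounded value.
inline uint16_t float_to_half_01_trunc(float f) {
    const float rebias = from_bits(15u << 23);   // 2^-112
    return static_cast<uint16_t>(to_bits(f * rebias) >> 13);
}

// f16 -> f32 for inputs assumed to encode values in [0,1].
// Place the 15 half bits in the top of an f32, then undo the bias with 2^112.
inline float half_to_float_01(uint16_t h) {
    const float unbias = from_bits(239u << 23);  // 2^112
    return from_bits(static_cast<uint32_t>(h) << 13) * unbias;
}
```

For example, 0.3f converts to 0x34CC here, whereas round-to-nearest would give 0x34CD. Converting 0x3C00 back yields exactly 1.0f.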

Getting close to _u16 performance:
   micros    bench
   261.13    xferu64_bw_1_opaque_u16
  1833.51    xferu64_bw_1_alpha_u16
  2762.32 ?  xferu64_aa_1_opaque_u16
  3334.29    xferu64_aa_1_alpha_u16
   249.78    xferu64_bw_1_opaque_f16
  3383.18    xferu64_bw_1_alpha_f16
  4214.72    xferu64_aa_1_opaque_f16
  4701.19    xferu64_aa_1_alpha_f16

BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search2?unt=true&query=source_type%3Dgm&master=false&issue=1685133005

Committed: https://skia.googlesource.com/skia/+/9ea11a4235b3e3521cc8bf914a27c2d0dc062db9

CQ_EXTRA_TRYBOTS=client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot

Review URL: https://codereview.chromium.org/1685133005
/external/skia/src/core/SkHalf.h
cbefc5e4ca7fd7aaa5d2a3aa85b30f16148c3d2f 11-Feb-2016 mtklein <mtklein@google.com> Revert of SkHalfToFloat_01 / SkFloatToHalf_01 (patchset #11 id:200001 of https://codereview.chromium.org/1685133005/ )

Reason for revert:
Gotta fix Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD

Original issue's description:
> SkHalfToFloat_01 / SkFloatToHalf_01
>
> These are basically inlined, 4-at-a-time versions of our existing functions,
> but cut down to avoid any work that's only necessary outside [0,1].
>
> Both f16 and f32 denorms should work fine modulo the usual ARMv7 NEON denorm==zero caveat.
>
> In exchange for a little speed, f32->f16 does not round properly.
> Instead it truncates, so it's never off by more than 1 bit.
>
> Support for finite values >1 or <0 is straightforward to add back.
> >1 might already work as-is.
>
> Getting close to _u16 performance:
>    micros    bench
>    261.13    xferu64_bw_1_opaque_u16
>   1833.51    xferu64_bw_1_alpha_u16
>   2762.32 ?  xferu64_aa_1_opaque_u16
>   3334.29    xferu64_aa_1_alpha_u16
>    249.78    xferu64_bw_1_opaque_f16
>   3383.18    xferu64_bw_1_alpha_f16
>   4214.72    xferu64_aa_1_opaque_f16
>   4701.19    xferu64_aa_1_alpha_f16
>
>
> BUG=skia:
> GOLD_TRYBOT_URL= https://gold.skia.org/search2?unt=true&query=source_type%3Dgm&master=false&issue=1685133005
>
> Committed: https://skia.googlesource.com/skia/+/9ea11a4235b3e3521cc8bf914a27c2d0dc062db9

TBR=jvanverth@google.com,reed@google.com,mtklein@chromium.org
# Skipping CQ checks because original CL landed less than 1 day ago.
NOPRESUBMIT=true
NOTREECHECKS=true
NOTRY=true
BUG=skia:

Review URL: https://codereview.chromium.org/1693443003
/external/skia/src/core/SkHalf.h
9ea11a4235b3e3521cc8bf914a27c2d0dc062db9 11-Feb-2016 mtklein <mtklein@chromium.org> SkHalfToFloat_01 / SkFloatToHalf_01

These are basically inlined, 4-at-a-time versions of our existing functions,
but cut down to avoid any work that's only necessary outside [0,1].

Both f16 and f32 denorms should work fine modulo the usual ARMv7 NEON denorm==zero caveat.

In exchange for a little speed, f32->f16 does not round properly.
Instead it truncates, so it's never off by more than 1 bit.

Support for finite values >1 or <0 is straightforward to add back.
>1 might already work as-is.

Getting close to _u16 performance:
   micros    bench
   261.13    xferu64_bw_1_opaque_u16
  1833.51    xferu64_bw_1_alpha_u16
  2762.32 ?  xferu64_aa_1_opaque_u16
  3334.29    xferu64_aa_1_alpha_u16
   249.78    xferu64_bw_1_opaque_f16
  3383.18    xferu64_bw_1_alpha_f16
  4214.72    xferu64_aa_1_opaque_f16
  4701.19    xferu64_aa_1_alpha_f16

BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search2?unt=true&query=source_type%3Dgm&master=false&issue=1685133005

Review URL: https://codereview.chromium.org/1685133005
/external/skia/src/core/SkHalf.h
28f9c606e4c8e61015e864219c4bc83a3fdb4a86 05-Dec-2014 jvanverth <jvanverth@google.com> Add support for half float alpha textures.

This allows us to create distance field textures with better precision,
which may help text quality.

BUG=skia:3103

Review URL: https://codereview.chromium.org/762923003
/external/skia/src/core/SkHalf.h
936799204b34e7a2f20ac6c0868058799ceb851e 26-Nov-2014 jvanverth <jvanverth@google.com> Add float-to-half (binary16) conversion functions.

Based on code by Fabian Giesen at
https://fgiesen.wordpress.com/2012/03/28/half-to-float-done-quic/.

These will be needed for creating binary16 textures from floating point data.
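
For readers unfamiliar with binary16, the layout is 1 sign bit, 5 exponent bits (bias 15), and 10 mantissa bits. The sketch below decodes that layout field by field. It is a plain reference decoder written for clarity and is not the technique from the linked article or the code added in this change. The name `half_bits_to_float` is hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <limits>

// Hypothetical reference decoder for IEEE 754 binary16; not part of SkHalf.h.
inline float half_bits_to_float(uint16_t h) {
    const uint32_t sign = (h >> 15) & 0x1;
    const uint32_t exp  = (h >> 10) & 0x1F;
    const uint32_t man  = h & 0x3FF;

    float magnitude;
    if (exp == 0) {
        // Zero or denormal: value is man * 2^-24.
        magnitude = std::ldexp(static_cast<float>(man), -24);
    } else if (exp == 31) {
        // All-ones exponent: infinity when the mantissa is zero, otherwise NaN.
        magnitude = man == 0 ? std::numeric_limits<float>::infinity()
                             : std::numeric_limits<float>::quiet_NaN();
    } else {
        // Normal: (1 + man/1024) * 2^(exp - 15) == (1024 + man) * 2^(exp - 25).
        magnitude = std::ldexp(static_cast<float>(1024 + man),
                               static_cast<int>(exp) - 25);
    }
    return sign ? -magnitude : magnitude;
}
```

Every finite binary16 value is exactly representable in f32, so this direction involves no rounding. The float-to-half direction is where rounding and overflow policy must be chosen.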

BUG=skia:3103

Review URL: https://codereview.chromium.org/760753003
/external/skia/src/core/SkHalf.h