commit | 5c12e0b2de33e9a3031526c1f392cc0d11d49f5f | |
---|---|---|
author | George Steed <george.steed@arm.com> | Tue May 07 13:26:07 2024 +0100 |
committer | Frank Barchard <fbarchard@chromium.org> | Thu Nov 07 18:53:00 2024 +0000 |
tree | d27b4ade9d05bbbad61d0d2d02810c5d050036a7 | |
parent | 7d383c2f1a11957c6cc71c2856d498f3e8819de5 | |
[AArch64] Add SVE2 implementations of HalfFloat{,1}Row

For HalfFloat1Row, SVE has direct 16-bit integer to half-float conversion instructions, so there is no need to widen to 32 bits.

For HalfFloatRow, SVE zero-extending loads avoid the need for separate UXTL(2) instructions.

Observed reductions in runtime compared to the existing Neon code:

| | HalfFloat1Row | HalfFloatRow |
|---|---|---|
| Cortex-A510 | -38.3% | -17.3% |
| Cortex-A520 | -37.6% | -18.8% |
| Cortex-A720 | -50.1% | -7.8% |
| Cortex-X2 | -50.2% | -0.4% |
| Cortex-X4 | -51.5% | -12.5% |

Bug: b/42280942
Change-Id: I445071ccd453113144ce42d465ba03c9ee89ec9e
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/5975319
Reviewed-by: Justin Green <greenjustin@google.com>
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
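For readers unfamiliar with these row functions, the following is a minimal portable C sketch of the operation the commit message describes: converting 16-bit unsigned samples to half-float bit patterns, optionally scaled. It is not the SVE2 code from this change. The name `half_float_row_sketch` is hypothetical, and the exponent-rebias trick with truncation is an assumption about one reasonable scalar approach; the hardware conversion instructions may round differently. A scale of 1.0 corresponds to the HalfFloat1Row case.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper, not a libyuv symbol: convert `width` 16-bit unsigned
 * samples to IEEE 754 binary16 bit patterns, scaling each sample first. */
static void half_float_row_sketch(const uint16_t* src, uint16_t* dst,
                                  float scale, int width) {
  /* Multiplying by 2^-112 rebiases the exponent: the binary32 bias (127)
   * minus the binary16 bias (15) is 112. */
  const float mult = 0x1p-112f * scale;
  assert(width >= 0);
  for (int i = 0; i < width; ++i) {
    /* Widen to 32-bit float; this is the step the commit says SVE can skip
     * when no scaling is needed. */
    float value = (float)src[i] * mult;
    uint32_t bits;
    memcpy(&bits, &value, sizeof(bits)); /* well-defined type pun */
    /* Drop the 13 extra mantissa bits (23 - 10). This truncates rather
     * than rounding to nearest. */
    dst[i] = (uint16_t)(bits >> 13);
  }
}
```

With a scale of 1.0, the inputs 1, 2 and 3 map to the half-float bit patterns 0x3C00, 0x4000 and 0x4200. Because the sketch truncates, results for values that are not exactly representable in half precision can differ in the last bit from a round-to-nearest conversion.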
libyuv is an open source project that includes YUV scaling and conversion functionality.
See Getting started for instructions on how to get started developing.
You can also browse the docs directory for more documentation.