commit | f27b983f382be8d49b1d473562918820aa124ed1 | |
---|---|---|
author | George Steed <george.steed@arm.com> | Sun May 05 20:51:51 2024 +0100 |
committer | Frank Barchard <fbarchard@chromium.org> | Thu Nov 07 18:46:02 2024 +0000 |
tree | 1fa88f415d99483dfe7968fbe4361131b23babcd | |
parent | aec4b4e22ef0782601affe488b4f596b0991b127 | |
[AArch64] Add SVE2 implementation of DivideRow_16

SVE contains the UMULH instruction which allows us to multiply and take
the high half of the result in a single instruction rather than needing
separate widening multiply and then narrowing shift steps.

Observed reduction in runtime compared to the existing Neon code:

    Cortex-A510: -21.2%
    Cortex-A520: -20.9%
    Cortex-A715: -47.9%
    Cortex-A720: -47.6%
    Cortex-X2: -5.2%
    Cortex-X3: -2.6%
    Cortex-X4: -32.4%
    Cortex-X925: -1.5%

Bug: b/42280942
Change-Id: I25154699b17772db1fb5cb84c049919181d86f4b
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/5975318
Reviewed-by: Justin Green <greenjustin@google.com>
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
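For readers unfamiliar with the "multiply and take the high half" operation the commit message describes, the following is a minimal scalar sketch in C. It is not the SVE2 code from this change: the names `mulhi_u16` and `divide_row_16_sketch` are hypothetical, and the assumption that the row operation computes `(src * scale) >> 16` per 16-bit element is an illustration of the technique, not a statement of the library's exact contract.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Scalar model of an unsigned 16-bit "multiply high": form the full
 * 32-bit product, then keep only its upper 16 bits. A Neon-style
 * sequence does this as a widening multiply followed by a narrowing
 * shift; a multiply-high instruction yields the same value directly. */
static uint16_t mulhi_u16(uint16_t a, uint16_t b) {
  uint32_t product = (uint32_t)a * (uint32_t)b; /* widening multiply */
  return (uint16_t)(product >> 16);             /* narrowing shift */
}

/* Hypothetical row helper (assumed semantics): scale each 16-bit sample
 * by scale / 65536 using the multiply-high model above. */
static void divide_row_16_sketch(const uint16_t* src, uint16_t* dst,
                                 uint16_t scale, size_t width) {
  for (size_t x = 0; x < width; ++x) {
    dst[x] = mulhi_u16(src[x], scale);
  }
}
```

With `scale` set to 32768 (that is, 0.5 in this fixed-point form), each sample is halved and truncated toward zero, so an input of 1023 becomes 511.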
libyuv is an open source project that includes YUV scaling and conversion functionality.
See Getting started for instructions on how to begin development.
You can also browse the docs directory for more documentation.