commit | 776a509891796d07006d2aedfe40e26b1075b026 | |
---|---|---|
author | George Steed <george.steed@arm.com> | Wed May 15 21:26:24 2024 +0100 |
committer | Frank Barchard <fbarchard@chromium.org> | Fri Jul 19 19:51:45 2024 +0000 |
tree | dce5565ba0394900e0b4c81d63cd27b1273589e2 | |
parent | be5de19db32f6191f2533d0fa69166e89ea60915 | |

[AArch64] Unroll ScaleRowDown34_1_Box_NEON

We can make use of wider instructions for the loads and stores as well as the URHADD instructions. In addition, the instruction duplication that results from unrolling provides a further small improvement for little cores with limited out-of-order capability.

Reduction in runtimes observed compared to the existing Neon implementation:

Cortex-A55: -23.5%
Cortex-A510: -35.4%
Cortex-A520: -40.5%
Cortex-A76: -15.1%
Cortex-A715: -6.2%
Cortex-A720: -6.2%
Cortex-X1: -17.9%
Cortex-X2: -18.4%
Cortex-X3: -18.3%
Cortex-X4: -14.0%

Bug: b/42280945
Change-Id: I5905e026a0507870bfc580b702906d6acb4ed6f4
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/5725170
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
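
For readers unfamiliar with the kernel, the following is a rough scalar sketch of the kind of computation a 3/4 box downscale with a 1:1 row blend performs, and of what URHADD does per byte lane (an unsigned rounding halving add, `(a + b + 1) >> 1`). It is not the code from this commit: the names `urhadd_u8` and `scale_row_down34_1_box_ref` are hypothetical, the filter weights are an assumption modeled on a typical 4-to-3 box filter, and the Neon kernel may order the row blend and horizontal filter differently, which can change results by a rounding step.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Scalar model of one byte lane of AArch64 URHADD:
 * unsigned rounding halving add, (a + b + 1) >> 1, with no overflow. */
static uint8_t urhadd_u8(uint8_t a, uint8_t b) {
  return (uint8_t)(((unsigned)a + (unsigned)b + 1u) >> 1);
}

/* Hypothetical scalar reference (assumed weights, not the commit's code):
 * every 4 source pixels from two adjacent rows produce 3 output pixels.
 * Each row is filtered horizontally with 3:1, 1:1 and 1:3 weights, then
 * the two rows are blended 1:1 with a rounding halving add. */
static void scale_row_down34_1_box_ref(const uint8_t* src,
                                       ptrdiff_t src_stride,
                                       uint8_t* dst,
                                       int dst_width) {
  const uint8_t* s = src;              /* first source row */
  const uint8_t* t = src + src_stride; /* second source row */
  assert(dst_width > 0 && dst_width % 3 == 0);
  for (int x = 0; x < dst_width; x += 3) {
    /* Horizontal 4 -> 3 filter on the first row. */
    uint8_t a0 = (uint8_t)((s[0] * 3 + s[1] + 2) >> 2);
    uint8_t a1 = (uint8_t)((s[1] + s[2] + 1) >> 1);
    uint8_t a2 = (uint8_t)((s[2] + s[3] * 3 + 2) >> 2);
    /* Same filter on the second row. */
    uint8_t b0 = (uint8_t)((t[0] * 3 + t[1] + 2) >> 2);
    uint8_t b1 = (uint8_t)((t[1] + t[2] + 1) >> 1);
    uint8_t b2 = (uint8_t)((t[2] + t[3] * 3 + 2) >> 2);
    /* 1:1 vertical blend; this is the step URHADD performs per lane. */
    dst[0] = urhadd_u8(a0, b0);
    dst[1] = urhadd_u8(a1, b1);
    dst[2] = urhadd_u8(a2, b2);
    s += 4;
    t += 4;
    dst += 3;
  }
}
```

A vector implementation applies the same arithmetic to many lanes per instruction, which is why wider loads, stores and URHADD operations can reduce the instruction count per output pixel.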
libyuv is an open source project that includes YUV scaling and conversion functionality.
See Getting started for instructions on how to get started developing.
You can also browse the docs directory for more documentation.