commit | 7eb552c891d3f874f9b87f1860d4c3ba65cd2c5d | [log] [tgz] |
---|---|---|
author | George Steed <george.steed@arm.com> | Mon Sep 16 16:56:18 2024 +0100 |
committer | Frank Barchard <fbarchard@chromium.org> | Fri Sep 20 00:28:12 2024 +0000 |
tree | d4aa0516d09cfde1013ac384ab2e245da0c89b9b | |
parent | 23a6a412e5d3d10c3bbd79b147c1eab4d284bc77 [diff] |
[AArch64] Avoid unnecessary MOVs in ScaleARGBRowDownEvenBox_NEON

The existing code uses three MOV instructions through a temporary register to swap the low and high halves of a vector register, however this can be done with a pair of ZIP instructions instead.

Also use a pair of RSHRN rather than RSHRN2 to allow these to execute in parallel on little cores.

Reduction in runtime observed compared to the existing Neon implementation:

    Cortex-A55: -8.3%
    Cortex-A510: -20.6%
    Cortex-A520: -16.6%
    Cortex-A76: -6.8%
    Cortex-A715: -6.2%
    Cortex-A720: -6.2%
    Cortex-X1: -22.0%
    Cortex-X2: -18.7%
    Cortex-X3: -21.1%
    Cortex-X4: -25.8%
    Cortex-X925: -21.9%

Change-Id: I87ae133be86c3c9f850d5848ec19d9b71ebda4d9
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/5872801
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
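The two changes described in the commit message can be illustrated with a small scalar model in portable C. This is only a sketch, not the libyuv code: the `vec2d` type and every function name below are hypothetical, and it assumes the exchange is between the high 64-bit half of one register and the low 64-bit half of another, which is the result a ZIP1/ZIP2 pair on `.2d` lanes produces. The shift amount passed to the narrowing helper is likewise left to the caller rather than taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical scalar stand-in for a 128-bit vector seen as two 64-bit lanes. */
typedef struct {
  uint64_t d[2]; /* d[0] = low half, d[1] = high half */
} vec2d;

/* Models ZIP1 Vd.2D, Vn.2D, Vm.2D: result is the low lane of each source. */
static vec2d zip1_2d(vec2d n, vec2d m) {
  vec2d r = {{n.d[0], m.d[0]}};
  return r;
}

/* Models ZIP2 Vd.2D, Vn.2D, Vm.2D: result is the high lane of each source. */
static vec2d zip2_2d(vec2d n, vec2d m) {
  vec2d r = {{n.d[1], m.d[1]}};
  return r;
}

/* Models the older approach: three lane moves through a temporary,
 * exchanging the high half of *a with the low half of *b in place. */
static void exchange_with_movs(vec2d* a, vec2d* b) {
  uint64_t tmp = a->d[1]; /* MOV tmp, a.d[1]    */
  a->d[1] = b->d[0];      /* MOV a.d[1], b.d[0] */
  b->d[0] = tmp;          /* MOV b.d[0], tmp    */
}

/* Models one element of RSHRN: rounding shift right, then narrow 16 -> 8 bits.
 * shift must be in the range 1..8. */
static uint8_t rshrn_u16(uint16_t x, unsigned shift) {
  return (uint8_t)((x + (1u << (shift - 1))) >> shift);
}
```

In this model, `zip1_2d(a, b)` and `zip2_2d(a, b)` yield the same two vectors that `exchange_with_movs(&a, &b)` leaves in `a` and `b`, but each ZIP result depends only on the two inputs, whereas the MOV sequence is a chain of dependent partial-register writes. The RSHRN point is similar: RSHRN2 writes the upper half of a destination whose lower half must already be in place, while two plain RSHRN instructions write separate destinations and have no such dependency on each other.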
libyuv is an open source project that includes YUV scaling and conversion functionality.
See Getting started for instructions on how to get started developing.
You can also browse the docs directory for more documentation.