commit | 00886670bbed1e939c09e22e0b9ec067cf93c609 | |
---|---|---|
author | George Steed <george.steed@arm.com> | Thu May 16 17:12:28 2024 +0100 |
committer | Frank Barchard <fbarchard@chromium.org> | Mon Sep 16 04:27:39 2024 +0000 |
tree | 0b92db3832b6128caf1de9795786d0c85a951af6 | |
parent | 4620f1705822fd6ab99939f43ce63099bd3d9ae0 | |
[AArch64] Avoid LD4/ST2 in ScaleARGBRowDown2_NEON

Use separate permute instructions to avoid using LD4/ST2 as these instructions are known to be slow on some micro-architectures.

Observed reduction in runtimes compared to the existing Neon code:

Cortex-A55: -12.4%
Cortex-A510: -44.8%
Cortex-A520: -31.1%
Cortex-A76: -55.3%
Cortex-A715: -63.7%
Cortex-A720: -62.3%
Cortex-X1: -79.0%
Cortex-X2: -78.9%
Cortex-X3: -79.6%
Cortex-X4: -59.8%

Co-authored-by: Cosmina Dunca <cosmina.dunca@arm.com>
Bug: libyuv:976
Change-Id: I33cf27ae5e16c1ce62f1f343043e6bd9fca92558
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/5790971
Reviewed-by: Justin Green <greenjustin@google.com>
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
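The idea in the commit message, replacing a de-interleaving load with a plain contiguous load followed by a separate permute, can be modelled in portable C. This is only a sketch and not the assembly from the change. The `u32x4` type, the helper functions, and the name `ScaleARGBRowDown2_Sketch` are hypothetical. It assumes that the 2x point-sampling scale keeps the second ARGB pixel of each source pair, and that a UZP2-style permute on 32-bit lanes is one way to select those pixels.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

// Portable stand-in for a 128-bit vector of four 32-bit lanes,
// one ARGB pixel per lane. Hypothetical type, for illustration only.
typedef struct {
  uint32_t lane[4];
} u32x4;

// Models a plain contiguous 16-byte load: no de-interleaving is done.
static u32x4 load_u32x4(const uint8_t* p) {
  u32x4 v;
  memcpy(v.lane, p, sizeof(v.lane));
  return v;
}

// Models a plain contiguous 16-byte store.
static void store_u32x4(uint8_t* p, u32x4 v) {
  memcpy(p, v.lane, sizeof(v.lane));
}

// Models UZP2 on 32-bit lanes: the odd-indexed lanes of a,
// followed by the odd-indexed lanes of b.
static u32x4 uzp2_u32x4(u32x4 a, u32x4 b) {
  u32x4 r = {{a.lane[1], a.lane[3], b.lane[1], b.lane[3]}};
  return r;
}

// Halves an ARGB row by keeping the second pixel of every source pair
// (an assumption about the point-sampling behaviour).
static void ScaleARGBRowDown2_Sketch(const uint8_t* src_argb,
                                     uint8_t* dst_argb,
                                     int dst_width) {
  assert(dst_width >= 0);
  int x = 0;
  // Vector body: 8 source pixels in, 4 destination pixels out.
  for (; x + 4 <= dst_width; x += 4) {
    u32x4 a = load_u32x4(src_argb + 8 * x);       // source pixels 2x .. 2x+3
    u32x4 b = load_u32x4(src_argb + 8 * x + 16);  // source pixels 2x+4 .. 2x+7
    store_u32x4(dst_argb + 4 * x, uzp2_u32x4(a, b));
  }
  // Scalar tail: copy source pixel 2x+1 into destination pixel x.
  for (; x < dst_width; ++x) {
    memcpy(dst_argb + 4 * x, src_argb + 8 * x + 4, 4);
  }
}
```

Calling `ScaleARGBRowDown2_Sketch(src, dst, dst_width)` reads `2 * dst_width` source pixels and writes `dst_width` pixels. Separating the load from the permute keeps the memory accesses contiguous, which is the property the commit message relies on for micro-architectures where LD4/ST2 are slow.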
libyuv is an open source project that includes YUV scaling and conversion functionality.
See Getting started for instructions on how to get started developing.
You can also browse the docs directory for more documentation.