commit | e6297afd14dc22c7d2835bae4ae657503d7c2a32 | |
---|---|---|
author | George Steed <george.steed@arm.com> | Thu May 16 17:13:30 2024 +0100 |
committer | Frank Barchard <fbarchard@chromium.org> | Mon Sep 16 04:28:25 2024 +0000 |
tree | 032402c6e1cd926c053f892d1046637f84810b76 | |
parent | 00886670bbed1e939c09e22e0b9ec067cf93c609 | |
[AArch64] Optimize ScaleARGBRowDown2Linear_NEON

Replace LD4 with a pair of LD2 instructions to avoid needing an ST2 instruction for storing the result, since ST2 instructions are known to be slow on some micro-architectures.

Observed reduction in runtimes compared to the existing Neon code:

Cortex-A55: -23.3%
Cortex-A510: -49.6%
Cortex-A520: -31.1%
Cortex-A76: -44.5%
Cortex-A715: -45.8%
Cortex-A720: -46.0%
Cortex-X1: -74.5%
Cortex-X2: -72.4%
Cortex-X3: -76.8%
Cortex-X4: -39.5%

Co-authored-by: Cosmina Dunca <cosmina.dunca@arm.com>
Bug: libyuv:976
Change-Id: Iab9e802d0784d69b7e970dcc8f1f4036985cd2e1
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/5790972
Reviewed-by: Justin Green <greenjustin@google.com>
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
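For readers unfamiliar with the kernel, the following is a minimal scalar sketch of what a 2x horizontal linear ARGB downscale computes, assuming each output pixel is the per-channel rounding average of two adjacent source pixels. It is not the code from this commit: the names `avg_round_u8` and `scale_argb_row_down2_linear_sketch` are hypothetical, and the comments describe one plausible reading of the change. On that reading, an LD2 with 32-bit elements splits even and odd pixels into separate registers, so the averaged result is already in memory order and can be written with a plain store instead of an interleaving ST2.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Rounding average of two bytes: (a + b + 1) >> 1.
 * This is the per-byte operation a Neon URHADD performs. */
static uint8_t avg_round_u8(uint8_t a, uint8_t b) {
  return (uint8_t)((a + b + 1) >> 1);
}

/* Hypothetical scalar model of a 2x horizontal linear ARGB downscale.
 * Assumption: treating each 4-byte pixel as one unit, even-indexed and
 * odd-indexed source pixels are separated (as an LD2 on 32-bit lanes
 * would do), then averaged byte by byte. The outputs come out in
 * memory order, so no re-interleaving is needed before storing. */
static void scale_argb_row_down2_linear_sketch(const uint8_t* src_argb,
                                               uint8_t* dst_argb,
                                               int dst_width) {
  assert(dst_width >= 0);
  for (int x = 0; x < dst_width; ++x) {
    const uint8_t* even = src_argb + 8 * (size_t)x; /* source pixel 2x */
    const uint8_t* odd = even + 4;                  /* source pixel 2x + 1 */
    for (int c = 0; c < 4; ++c) {
      dst_argb[4 * (size_t)x + c] = avg_round_u8(even[c], odd[c]);
    }
  }
}
```

The source row must hold `2 * dst_width` ARGB pixels (`8 * dst_width` bytes). A vectorized version would process several output pixels per iteration, but the arithmetic per byte is the same as in this sketch.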
libyuv is an open source project that includes YUV scaling and conversion functionality.
See Getting started for instructions on how to get started developing.
You can also browse the docs directory for more documentation.