vmx: implement fast path composite_add_8888_8888
Copied the implementation from the sse2 file and edited it to use vmx functions.
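
For readers unfamiliar with the operation, here is a minimal scalar sketch of what an ADD fast path on 32-bit pixels computes: each 8-bit channel of the source is added to the matching channel of the destination and clamped at 255. This is not the code from this patch; the names sat_add_u8, add_8888_pixel and add_8888_8888_row are hypothetical and exist only for illustration. A vector version would presumably do the same work on four pixels at a time with a saturating byte add such as AltiVec's vec_adds.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: add two 8-bit channels, clamping the sum at 255. */
static uint8_t sat_add_u8(uint8_t a, uint8_t b)
{
    unsigned sum = (unsigned)a + (unsigned)b;
    return sum > 255u ? 255u : (uint8_t)sum;
}

/* Hypothetical scalar reference: ADD one 32-bit pixel, channel by channel. */
static uint32_t add_8888_pixel(uint32_t src, uint32_t dst)
{
    uint32_t out = 0;
    for (int shift = 0; shift < 32; shift += 8) {
        uint8_t s = (uint8_t)((src >> shift) & 0xffu);
        uint8_t d = (uint8_t)((dst >> shift) & 0xffu);
        out |= (uint32_t)sat_add_u8(s, d) << shift;
    }
    return out;
}

/* Hypothetical scalar reference: ADD a row of src pixels into dst in place. */
static void add_8888_8888_row(uint32_t *dst, const uint32_t *src, size_t width)
{
    for (size_t i = 0; i < width; i++)
        dst[i] = add_8888_pixel(src[i], dst[i]);
}
```

For example, adding 0x80402010 to 0x90402010 gives 0xff804020: the top channel overflows and saturates, while the other three channels simply sum.
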
It was benchmarked against commit id 2be523b from pixman/master
POWER8, 16 cores, 3.4GHz, ppc64le :
reference memcpy speed = 27036.4MB/s (6759.1MP/s for 32bpp fills)
            Before          After           Change
---------------------------------------------
L1          248.76          3284.48         +1220.34%
L2          264.09          2826.47         +970.27%
M           261.24          2405.06         +820.63%
HT          217.27          857.3           +294.58%
VT          213.78          980.09          +358.46%
R           176.61          442.95          +150.81%
RT          107.54          150.08          +39.56%
Kops/s      917             1125            +22.68%
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Acked-by: Siarhei Siamashka <siarhei.siamashka@gmail.com>
1 file changed