vmx: implement fast path vmx_composite_over_n_8888
Running "lowlevel-blt-bench over_n_8888" on a PlayStation 3 (3.2GHz,
Gentoo ppc, 32-bit userland) gave the following results:
before: over_n_8888 = L1: 147.47 L2: 205.86 M: 121.07
after:  over_n_8888 = L1: 287.27 L2: 261.09 M: 133.48
Cairo non-trimmed benchmarks on POWER8 (3.4GHz, 8 cores):
ocitysmap 659.69 -> 611.71 : 1.08x speedup
xfce4-terminal-a1 2725.22 -> 2547.47 : 1.07x speedup
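
For reference, the operation this fast path accelerates (a solid source
composited OVER a 32-bit destination) can be sketched in scalar C. This is
only a minimal sketch, not the VMX code from this patch: it assumes
premultiplied a8r8g8b8 pixels, and the names mul_un8, over_pixel and
over_n_8888_scalar are hypothetical helpers introduced for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Multiply two 8-bit values treated as fractions of 255, with rounding. */
static uint8_t mul_un8(uint8_t a, uint8_t b)
{
    uint32_t t = (uint32_t)a * b + 0x80;
    return (uint8_t)((t + (t >> 8)) >> 8);
}

/* OVER for one premultiplied pixel: result = src + dst * (255 - src_alpha) / 255 */
static uint32_t over_pixel(uint32_t src, uint32_t dst)
{
    uint8_t inv_alpha = (uint8_t)(255 - (src >> 24));
    uint32_t result = 0;

    for (int shift = 0; shift < 32; shift += 8) {
        uint32_t s = (src >> shift) & 0xff;
        uint32_t d = mul_un8((uint8_t)((dst >> shift) & 0xff), inv_alpha);
        uint32_t c = s + d;

        if (c > 255)    /* saturate, in case the input is not properly premultiplied */
            c = 255;
        result |= c << shift;
    }
    return result;
}

/* Composite one solid color over a scanline of 'width' destination pixels. */
static void over_n_8888_scalar(uint32_t src, uint32_t *dst, size_t width)
{
    if (src == 0)       /* fully transparent source leaves the destination unchanged */
        return;

    for (size_t i = 0; i < width; i++)
        dst[i] = over_pixel(src, dst[i]);
}
```

A vectorized version would process several destination pixels per
iteration instead of one, which is where the L1/L2 gains above come from.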
Signed-off-by: Siarhei Siamashka <siarhei.siamashka@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
1 file changed