vmx: implement fast path vmx_fill

Based on sse2 impl.

It was benchmarked against commid id e2d211a from pixman/master

Tested cairo trimmed benchmarks on POWER8, 8 cores, 3.4GHz,
RHEL 7.1 ppc64le :

speedups
========
     t-swfdec-giant-steps  1383.09 ->  718.63  :  1.92x speedup
   t-gnome-system-monitor  1403.53 ->  918.77  :  1.53x speedup
              t-evolution  552.34  ->  415.24  :  1.33x speedup
      t-xfce4-terminal-a1  1573.97 ->  1351.46 :  1.16x speedup
      t-firefox-paintball  847.87  ->  734.50  :  1.15x speedup
      t-firefox-asteroids  565.99  ->  492.77  :  1.15x speedup
t-firefox-canvas-swscroll  1656.87 ->  1447.48 :  1.14x speedup
          t-midori-zoomed  724.73  ->  642.16  :  1.13x speedup
   t-firefox-planet-gnome  975.78  ->  911.92  :  1.07x speedup
          t-chromium-tabs  292.12  ->  274.74  :  1.06x speedup
     t-firefox-chalkboard  690.78  ->  653.93  :  1.06x speedup
      t-firefox-talos-gfx  1375.30 ->  1303.74 :  1.05x speedup
   t-firefox-canvas-alpha  1016.79 ->  967.24  :  1.05x speedup

Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Acked-by: Siarhei Siamashka <siarhei.siamashka@gmail.com>
1 file changed