pixman-filter: Fix several issues related to normalization

There are a few bugs in the current normalization code:

(1) The normalization is based on the sum of the *floating point*
    values generated by integral(). But in order to get the sum to be
    close to pixman_fixed_1, the sum of the rounded fixed point values
    should be used.

(2) The multiplications in the normalization loops often round the
    same way, so the residual error can be fairly large.

(3) The residual error is added to the sample located at index
    (width - width / 2), which is not the midpoint for odd widths (and
    for width 1 is in fact outside the array).

This patch fixes these issues by (1) using the sum of the fixed point
values as the total to divide by, (2) doing error diffusion in the
normalization loop, and (3) putting any residual error (which is now
guaranteed to be less than pixman_fixed_e) at the first sample, which
is the only one that didn't get any error diffused into it.

Signed-off-by: Søren Sandmann <soren.sandmann@gmail.com>
diff --git a/pixman/pixman-filter.c b/pixman/pixman-filter.c
index 11e7d0e..ee58045 100644
--- a/pixman/pixman-filter.c
+++ b/pixman/pixman-filter.c
@@ -247,7 +247,7 @@
         double frac = step / 2.0 + i * step;
 	pixman_fixed_t new_total;
         int x, x1, x2;
-	double total;
+	double total, e;
 
 	/* Sample convolution of reconstruction and sampling
 	 * filter. See rounding.txt regarding the rounding
@@ -278,24 +278,31 @@
 			      ihigh - ilow);
 	    }
 
-	    total += c;
-            *p++ = (pixman_fixed_t)(c * 65536.0 + 0.5);
+            *p = (pixman_fixed_t)floor (c * 65536.0 + 0.5);
+	    total += *p;
+	    p++;
         }
 
-	/* Normalize */
+	/* Normalize, with error diffusion */
 	p -= width;
-        total = 1 / total;
+        total = 65536.0 / total;
         new_total = 0;
+	e = 0.0;
 	for (x = x1; x < x2; ++x)
 	{
-	    pixman_fixed_t t = (*p) * total + 0.5;
+	    double v = (*p) * total + e;
+	    pixman_fixed_t t = floor (v + 0.5);
 
+	    e = v - t;
 	    new_total += t;
 	    *p++ = t;
 	}
 
-	if (new_total != pixman_fixed_1)
-	    *(p - width / 2) += (pixman_fixed_1 - new_total);
+	/* pixman_fixed_e's worth of error may remain; put it
+	 * at the first sample, since that is the only one that
+	 * hasn't had any error diffused into it.
+	 */
+	*(p - width) += pixman_fixed_1 - new_total;
     }
 }