Eliminate some bounds checks in the encoder.

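As per
https://groups.google.com/d/msg/golang-dev/jVP6h21OyL8/Syhfot9XBQAJ,
recent versions of the gc compiler can eliminate the bounds checks on the
individual byte loads in code like the following, because the re-slice on
the first line leaves b with a known length of 4: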

func load32(b []byte, i int32) uint32 {
  b = b[i : i+4 : len(b)] // One bounds check here covers the loads below.
  return uint32(b[0]) | uint32(b[1])<<8 | uint32(b[2])<<16 | uint32(b[3])<<24
}
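
For reference, a minimal self-contained sketch of the pattern. load32 is the
function above with the elided terms written out. load64 and the sample
buffer are illustrative assumptions and are not necessarily what this change
touches.

```go
package main

import "fmt"

// load32 reads a little-endian uint32 from b starting at offset i.
func load32(b []byte, i int32) uint32 {
	// The full slice expression is bounds-checked once. Afterwards the
	// compiler knows len(b) == 4, so b[0]..b[3] need no further checks.
	b = b[i : i+4 : len(b)]
	return uint32(b[0]) | uint32(b[1])<<8 | uint32(b[2])<<16 | uint32(b[3])<<24
}

// load64 is a hypothetical 8-byte variant of the same pattern, shown only
// to illustrate that the hint generalizes to wider loads.
func load64(b []byte, i int32) uint64 {
	b = b[i : i+8 : len(b)]
	return uint64(b[0]) | uint64(b[1])<<8 | uint64(b[2])<<16 | uint64(b[3])<<24 |
		uint64(b[4])<<32 | uint64(b[5])<<40 | uint64(b[6])<<48 | uint64(b[7])<<56
}

func main() {
	// Sample data, chosen so each byte is easy to spot in the output.
	buf := []byte{0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07, 0x08, 0x09}
	fmt.Printf("%#x\n", load32(buf, 1)) // bytes 02 03 04 05, little-endian
	fmt.Printf("%#x\n", load64(buf, 1)) // bytes 02 through 09, little-endian
}
```

Whether the checks are actually removed depends on the compiler version. One
way to look is to build with -gcflags=-d=ssa/check_bce/debug=1, which reports
the bounds checks that remain.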

benchmark                     old MB/s     new MB/s     speedup
BenchmarkWordsEncode1e1-8     5.78         5.77         1.00x
BenchmarkWordsEncode1e2-8     47.22        47.96        1.02x
BenchmarkWordsEncode1e3-8     183.53       190.33       1.04x
BenchmarkWordsEncode1e4-8     198.95       190.25       0.96x
BenchmarkWordsEncode1e5-8     144.60       150.65       1.04x
BenchmarkWordsEncode1e6-8     172.11       180.11       1.05x
BenchmarkRandomEncode-8       4547.98      4782.70      1.05x
Benchmark_ZFlat0-8            359.18       372.49       1.04x
Benchmark_ZFlat1-8            181.57       186.49       1.03x
Benchmark_ZFlat2-8            4566.75      4979.47      1.09x
Benchmark_ZFlat3-8            86.00        85.76        1.00x
Benchmark_ZFlat4-8            558.08       566.31       1.01x
Benchmark_ZFlat5-8            354.18       366.01       1.03x
Benchmark_ZFlat6-8            156.20       162.13       1.04x
Benchmark_ZFlat7-8            147.76       153.69       1.04x
Benchmark_ZFlat8-8            162.49       167.91       1.03x
Benchmark_ZFlat9-8            142.33       147.71       1.04x
Benchmark_ZFlat10-8           401.93       414.06       1.03x
Benchmark_ZFlat11-8           235.94       248.87       1.05x
1 file changed