Use the same encoding algorithm as C++ snappy.

When encoding the benchmark files, the output size is smaller:

len(in)  old_len(out)  new_len(out)  new/old_ratio  description
 102400         23488         22842           0.97  html
 702087        346345        335387           0.97  urls
 123093        123034        123034           1.00  jpg
    200           144           146           1.01  jpg_200
 102400         83786         83817           1.00  pdf
 409600         95095         92221           0.97  html4
 152089         91386         88017           0.96  txt1
 125179         80526         77525           0.96  txt2
 426754        244658        234392           0.96  txt3
 481861        331356        319097           0.96  txt4
 118588         24789         23295           0.94  pb
 184320         74129         69526           0.94  gaviota

On GOARCH=amd64, encoding throughput is also higher:

benchmark                    old MB/s  new MB/s  speedup
BenchmarkWordsEncode1e1-8      674.93    681.22    1.01x
BenchmarkWordsEncode1e2-8       47.92     49.91    1.04x
BenchmarkWordsEncode1e3-8      189.48    213.64    1.13x
BenchmarkWordsEncode1e4-8      193.17    245.31    1.27x
BenchmarkWordsEncode1e5-8      151.44    178.84    1.18x
BenchmarkWordsEncode1e6-8      180.63    203.74    1.13x
BenchmarkRandomEncode-8       4700.25   5711.91    1.22x
Benchmark_ZFlat0-8             372.12    422.42    1.14x
Benchmark_ZFlat1-8             187.62    270.16    1.44x
Benchmark_ZFlat2-8            4891.26   5542.08    1.13x
Benchmark_ZFlat3-8              86.16     92.53    1.07x
Benchmark_ZFlat4-8             570.31    963.51    1.69x
Benchmark_ZFlat5-8             366.84    418.91    1.14x
Benchmark_ZFlat6-8             164.18    182.67    1.11x
Benchmark_ZFlat7-8             155.23    175.64    1.13x
Benchmark_ZFlat8-8             169.62    193.08    1.14x
Benchmark_ZFlat9-8             149.43    168.62    1.13x
Benchmark_ZFlat10-8            412.63    497.87    1.21x
Benchmark_ZFlat11-8            247.98    269.43    1.09x
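
The MB/s figures above come from Go's standard benchmark machinery. As a hedged sketch (not the actual snappy code), a benchmark of this shape reports MB/s because it calls b.SetBytes; here `encode` is a hypothetical stand-in for the real Encode function:

```go
package main

import (
	"fmt"
	"testing"
)

// encode is a hypothetical stand-in for the real encoder; it just copies src.
func encode(dst, src []byte) []byte {
	return append(dst[:0], src...)
}

// benchEncode mirrors the shape of the benchmarks listed above.
// Calling b.SetBytes is what makes `go test -bench` report MB/s.
func benchEncode(b *testing.B) {
	src := make([]byte, 1e4)
	dst := make([]byte, len(src))
	b.SetBytes(int64(len(src)))
	for i := 0; i < b.N; i++ {
		encode(dst, src)
	}
}

func main() {
	// testing.Benchmark runs a benchmark outside `go test`.
	r := testing.Benchmark(benchEncode)
	fmt.Println("iterations:", r.N, "bytes/op recorded:", r.Bytes == 1e4)
}
```

In a real test file the function would be named with the Benchmark prefix (e.g. BenchmarkWordsEncode1e4) and run via `go test -bench=.`.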