Inline extendMatch for the noasm encoder.

This partially undoes 4f2f9a13 "Write the encoder's extendMatch in
asm". The undo can now be applied to the noasm case alone, because
encodeBlock (the function that calls extendMatch) is itself written
in asm.
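
As a rough illustration of what "inlining" means here, the sketch below
contrasts a helper call with the same loop written out at the call site.
It is not the code from this commit: the body of extendMatch is an
assumed pure-Go shape, and matchEndCall and matchEndInline are
hypothetical names used only for this example.

```go
package main

import "fmt"

// extendMatch is an assumed pure-Go form of the helper: starting from
// positions i < j, it advances both while the bytes agree and returns
// the final j.
func extendMatch(src []byte, i, j int) int {
	for ; j < len(src) && src[i] == src[j]; i, j = i+1, j+1 {
	}
	return j
}

// matchEndCall (hypothetical) finds the end of a match via the helper.
func matchEndCall(src []byte, candidate, s int) int {
	return extendMatch(src, candidate, s)
}

// matchEndInline (hypothetical) writes the same loop at the call site,
// so no function call is made per match.
func matchEndInline(src []byte, candidate, s int) int {
	for i := candidate; s < len(src) && src[i] == src[s]; i, s = i+1, s+1 {
	}
	return s
}

func main() {
	src := []byte("abcdabcdxy")
	// Both forms report where the match starting at 0 and 4 stops.
	fmt.Println(matchEndCall(src, 0, 4), matchEndInline(src, 0, 4)) // prints "8 8"
}
```

Both forms compute the same result. The difference is only whether the
hot loop pays for a function call, which is what the noasm benchmarks
below measure.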

With "go test -test.bench='Encode|ZFlat' -tags=noasm":
name              old speed      new speed      delta
WordsEncode1e1-8   676MB/s ± 1%   676MB/s ± 0%     ~     (p=0.841 n=5+5)
WordsEncode1e2-8  85.3MB/s ± 0%  87.5MB/s ± 1%   +2.50%  (p=0.008 n=5+5)
WordsEncode1e3-8   241MB/s ± 0%   258MB/s ± 0%   +7.33%  (p=0.008 n=5+5)
WordsEncode1e4-8   199MB/s ± 0%   245MB/s ± 0%  +23.15%  (p=0.008 n=5+5)
WordsEncode1e5-8   171MB/s ± 0%   186MB/s ± 0%   +8.57%  (p=0.008 n=5+5)
WordsEncode1e6-8   192MB/s ± 0%   211MB/s ± 0%   +9.51%  (p=0.008 n=5+5)
RandomEncode-8    13.1GB/s ± 2%  13.2GB/s ± 1%     ~     (p=0.690 n=5+5)
_ZFlat0-8          404MB/s ± 0%   431MB/s ± 0%   +6.84%  (p=0.008 n=5+5)
_ZFlat1-8          260MB/s ± 0%   277MB/s ± 0%   +6.46%  (p=0.008 n=5+5)
_ZFlat2-8         13.8GB/s ± 1%  13.8GB/s ± 2%     ~     (p=1.000 n=5+5)
_ZFlat3-8          170MB/s ± 1%   173MB/s ± 0%   +1.60%  (p=0.008 n=5+5)
_ZFlat4-8         2.94GB/s ± 5%  3.10GB/s ± 0%   +5.35%  (p=0.008 n=5+5)
_ZFlat5-8          397MB/s ± 1%   426MB/s ± 0%   +7.32%  (p=0.008 n=5+5)
_ZFlat6-8          175MB/s ± 2%   190MB/s ± 0%   +8.61%  (p=0.008 n=5+5)
_ZFlat7-8          169MB/s ± 0%   182MB/s ± 0%   +7.47%  (p=0.016 n=4+5)
_ZFlat8-8          184MB/s ± 3%   200MB/s ± 0%   +8.65%  (p=0.008 n=5+5)
_ZFlat9-8          163MB/s ± 0%   175MB/s ± 0%   +7.57%  (p=0.016 n=4+5)
_ZFlat10-8         481MB/s ± 0%   509MB/s ± 0%   +5.80%  (p=0.016 n=4+5)
_ZFlat11-8         254MB/s ± 0%   275MB/s ± 0%   +8.32%  (p=0.008 n=5+5)

For the record, after this commit, the comparison between the noasm
('old') and vanilla (i.e. with asm, 'new') encoder benchmarks, summing
up the last eight or so commits, is:
name              old speed      new speed       delta
WordsEncode1e1-8   676MB/s ± 0%    677MB/s ± 1%      ~     (p=0.310 n=5+5)
WordsEncode1e2-8  87.5MB/s ± 1%  428.3MB/s ± 0%  +389.71%  (p=0.008 n=5+5)
WordsEncode1e3-8   258MB/s ± 0%    446MB/s ± 1%   +72.67%  (p=0.008 n=5+5)
WordsEncode1e4-8   245MB/s ± 0%    316MB/s ± 0%   +28.94%  (p=0.008 n=5+5)
WordsEncode1e5-8   186MB/s ± 0%    269MB/s ± 0%   +44.86%  (p=0.008 n=5+5)
WordsEncode1e6-8   211MB/s ± 0%    314MB/s ± 1%   +48.84%  (p=0.008 n=5+5)
RandomEncode-8    13.2GB/s ± 1%   14.4GB/s ± 1%    +9.33%  (p=0.008 n=5+5)
_ZFlat0-8          431MB/s ± 0%    792MB/s ± 0%   +83.67%  (p=0.008 n=5+5)
_ZFlat1-8          277MB/s ± 0%    436MB/s ± 1%   +57.46%  (p=0.008 n=5+5)
_ZFlat2-8         13.8GB/s ± 2%   16.2GB/s ± 1%   +17.16%  (p=0.008 n=5+5)
_ZFlat3-8          173MB/s ± 0%    632MB/s ± 1%  +265.85%  (p=0.008 n=5+5)
_ZFlat4-8         3.10GB/s ± 0%   8.00GB/s ± 0%  +157.99%  (p=0.008 n=5+5)
_ZFlat5-8          426MB/s ± 0%    768MB/s ± 0%   +80.06%  (p=0.008 n=5+5)
_ZFlat6-8          190MB/s ± 0%    282MB/s ± 1%   +48.48%  (p=0.008 n=5+5)
_ZFlat7-8          182MB/s ± 0%    264MB/s ± 1%   +44.97%  (p=0.008 n=5+5)
_ZFlat8-8          200MB/s ± 0%    298MB/s ± 0%   +49.45%  (p=0.008 n=5+5)
_ZFlat9-8          175MB/s ± 0%    247MB/s ± 0%   +41.02%  (p=0.008 n=5+5)
_ZFlat10-8         509MB/s ± 0%   1027MB/s ± 0%  +101.72%  (p=0.008 n=5+5)
_ZFlat11-8         275MB/s ± 0%    411MB/s ± 0%   +49.57%  (p=0.008 n=5+5)
1 file changed