Inline extendMatch for the noasm encoder.

This partially undoes 4f2f9a13 "Write the encoder's extendMatch in
asm". The undo can now be applied to the noasm case alone, because
encodeBlock (the function that calls extendMatch) is itself written in
asm.
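
As a sketch of why the call and the inlined loop are interchangeable,
the program below pairs a standalone extendMatch with the loop this
commit writes out at the call site. The extendMatch body is an
assumption reconstructed from the inlined loop rather than copied from
the repository, and extendMatchInlined is a hypothetical wrapper that
exists only so the two forms can be compared:

```go
package main

import "fmt"

// extendMatch is an assumed pure Go form of the helper: starting from
// src[i] and src[j], it advances both indexes while the bytes agree and
// returns the final j. It assumes 0 <= i < j <= len(src).
func extendMatch(src []byte, i, j int) int {
	for ; j < len(src) && src[i] == src[j]; i, j = i+1, j+1 {
	}
	return j
}

// extendMatchInlined is a hypothetical wrapper around the loop that this
// commit inlines. It takes the start of the 4-byte match (candidate and s)
// and skips those 4 bytes itself, as the inlined code does.
func extendMatchInlined(src []byte, candidate, s int) int {
	s += 4
	for i := candidate + 4; s < len(src) && src[i] == src[s]; i, s = i+1, s+1 {
	}
	return s
}

func main() {
	// "abcdefg" at offset 0 repeats at offset 10, so the 4-byte match
	// "abcd" extends by 3 more bytes before 'h' and 'X' differ.
	src := []byte("abcdefgh--abcdefgXYZ")
	candidate, s := 0, 10
	fmt.Println(extendMatch(src, candidate+4, s+4), extendMatchInlined(src, candidate, s))
	// prints "17 17"
}
```

Both forms stop at the first mismatching byte or at len(src), so they
return the same index. The inlined form only removes the function call
from the hot loop.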

With "go test -test.bench='Encode|ZFlat' -tags=noasm":
name              old speed      new speed      delta
WordsEncode1e1-8   676MB/s ± 1%   676MB/s ± 0%     ~     (p=0.841 n=5+5)
WordsEncode1e2-8  85.3MB/s ± 0%  87.5MB/s ± 1%   +2.50%  (p=0.008 n=5+5)
WordsEncode1e3-8   241MB/s ± 0%   258MB/s ± 0%   +7.33%  (p=0.008 n=5+5)
WordsEncode1e4-8   199MB/s ± 0%   245MB/s ± 0%  +23.15%  (p=0.008 n=5+5)
WordsEncode1e5-8   171MB/s ± 0%   186MB/s ± 0%   +8.57%  (p=0.008 n=5+5)
WordsEncode1e6-8   192MB/s ± 0%   211MB/s ± 0%   +9.51%  (p=0.008 n=5+5)
RandomEncode-8    13.1GB/s ± 2%  13.2GB/s ± 1%     ~     (p=0.690 n=5+5)
_ZFlat0-8          404MB/s ± 0%   431MB/s ± 0%   +6.84%  (p=0.008 n=5+5)
_ZFlat1-8          260MB/s ± 0%   277MB/s ± 0%   +6.46%  (p=0.008 n=5+5)
_ZFlat2-8         13.8GB/s ± 1%  13.8GB/s ± 2%     ~     (p=1.000 n=5+5)
_ZFlat3-8          170MB/s ± 1%   173MB/s ± 0%   +1.60%  (p=0.008 n=5+5)
_ZFlat4-8         2.94GB/s ± 5%  3.10GB/s ± 0%   +5.35%  (p=0.008 n=5+5)
_ZFlat5-8          397MB/s ± 1%   426MB/s ± 0%   +7.32%  (p=0.008 n=5+5)
_ZFlat6-8          175MB/s ± 2%   190MB/s ± 0%   +8.61%  (p=0.008 n=5+5)
_ZFlat7-8          169MB/s ± 0%   182MB/s ± 0%   +7.47%  (p=0.016 n=4+5)
_ZFlat8-8          184MB/s ± 3%   200MB/s ± 0%   +8.65%  (p=0.008 n=5+5)
_ZFlat9-8          163MB/s ± 0%   175MB/s ± 0%   +7.57%  (p=0.016 n=4+5)
_ZFlat10-8         481MB/s ± 0%   509MB/s ± 0%   +5.80%  (p=0.016 n=4+5)
_ZFlat11-8         254MB/s ± 0%   275MB/s ± 0%   +8.32%  (p=0.008 n=5+5)

For the record, here is the comparison after this commit between the
noasm ('old') and vanilla (i.e. with asm, 'new') encoder benchmarks. It
sums up the effect of the last eight or so commits:
name              old speed      new speed       delta
WordsEncode1e1-8   676MB/s ± 0%    677MB/s ± 1%      ~     (p=0.310 n=5+5)
WordsEncode1e2-8  87.5MB/s ± 1%  428.3MB/s ± 0%  +389.71%  (p=0.008 n=5+5)
WordsEncode1e3-8   258MB/s ± 0%    446MB/s ± 1%   +72.67%  (p=0.008 n=5+5)
WordsEncode1e4-8   245MB/s ± 0%    316MB/s ± 0%   +28.94%  (p=0.008 n=5+5)
WordsEncode1e5-8   186MB/s ± 0%    269MB/s ± 0%   +44.86%  (p=0.008 n=5+5)
WordsEncode1e6-8   211MB/s ± 0%    314MB/s ± 1%   +48.84%  (p=0.008 n=5+5)
RandomEncode-8    13.2GB/s ± 1%   14.4GB/s ± 1%    +9.33%  (p=0.008 n=5+5)
_ZFlat0-8          431MB/s ± 0%    792MB/s ± 0%   +83.67%  (p=0.008 n=5+5)
_ZFlat1-8          277MB/s ± 0%    436MB/s ± 1%   +57.46%  (p=0.008 n=5+5)
_ZFlat2-8         13.8GB/s ± 2%   16.2GB/s ± 1%   +17.16%  (p=0.008 n=5+5)
_ZFlat3-8          173MB/s ± 0%    632MB/s ± 1%  +265.85%  (p=0.008 n=5+5)
_ZFlat4-8         3.10GB/s ± 0%   8.00GB/s ± 0%  +157.99%  (p=0.008 n=5+5)
_ZFlat5-8          426MB/s ± 0%    768MB/s ± 0%   +80.06%  (p=0.008 n=5+5)
_ZFlat6-8          190MB/s ± 0%    282MB/s ± 1%   +48.48%  (p=0.008 n=5+5)
_ZFlat7-8          182MB/s ± 0%    264MB/s ± 1%   +44.97%  (p=0.008 n=5+5)
_ZFlat8-8          200MB/s ± 0%    298MB/s ± 0%   +49.45%  (p=0.008 n=5+5)
_ZFlat9-8          175MB/s ± 0%    247MB/s ± 0%   +41.02%  (p=0.008 n=5+5)
_ZFlat10-8         509MB/s ± 0%   1027MB/s ± 0%  +101.72%  (p=0.008 n=5+5)
_ZFlat11-8         275MB/s ± 0%    411MB/s ± 0%   +49.57%  (p=0.008 n=5+5)
diff --git a/encode_other.go b/encode_other.go
index e626d55..dbcae90 100644
--- a/encode_other.go
+++ b/encode_other.go
@@ -195,8 +195,15 @@
 			// Invariant: we have a 4-byte match at s, and no need to emit any
 			// literal bytes prior to s.
 			base := s
+
 			// Extend the 4-byte match as long as possible.
-			s = extendMatch(src, candidate+4, s+4)
+			//
+			// This is an inlined version of:
+			//	s = extendMatch(src, candidate+4, s+4)
+			s += 4
+			for i := candidate + 4; s < len(src) && src[i] == src[s]; i, s = i+1, s+1 {
+			}
+
 			d += emitCopy(dst[d:], base-candidate, s-base)
 			nextEmit = s
 			if s >= sLimit {