Rearrange the emitCopy register allocation.

This minimizes the diff in a follow-up commit that does the manual inlining.
It's not intended as an optimization per se, but for the record:

name              old speed      new speed      delta
WordsEncode1e1-8   711MB/s ± 1%   700MB/s ± 1%  -1.64%  (p=0.000 n=9+10)
WordsEncode1e2-8   407MB/s ± 1%   430MB/s ± 0%  +5.57%  (p=0.000 n=10+10)
WordsEncode1e3-8   441MB/s ± 1%   447MB/s ± 0%  +1.52%  (p=0.000 n=8+8)
WordsEncode1e4-8   311MB/s ± 1%   322MB/s ± 0%  +3.69%  (p=0.000 n=9+10)
WordsEncode1e5-8   267MB/s ± 0%   267MB/s ± 1%  ~       (p=0.068 n=8+10)
WordsEncode1e6-8   312MB/s ± 1%   314MB/s ± 0%  +0.45%  (p=0.000 n=9+10)
RandomEncode-8    14.4GB/s ± 2%  14.4GB/s ± 2%  ~       (p=0.739 n=10+10)
_ZFlat0-8          792MB/s ± 1%   801MB/s ± 0%  +1.11%  (p=0.000 n=8+9)
_ZFlat1-8          435MB/s ± 1%   437MB/s ± 0%  ~       (p=0.857 n=9+10)
_ZFlat2-8         16.0GB/s ± 4%  16.3GB/s ± 1%  ~       (p=0.143 n=10+10)
_ZFlat3-8          613MB/s ± 0%   634MB/s ± 0%  +3.54%  (p=0.000 n=8+10)
_ZFlat4-8         7.96GB/s ± 1%  7.97GB/s ± 1%  ~       (p=0.829 n=8+10)
_ZFlat5-8          770MB/s ± 0%   773MB/s ± 0%  +0.33%  (p=0.000 n=8+9)
_ZFlat6-8          283MB/s ± 0%   283MB/s ± 0%  +0.13%  (p=0.043 n=8+9)
_ZFlat7-8          264MB/s ± 2%   265MB/s ± 0%  +0.61%  (p=0.000 n=9+9)
_ZFlat8-8          297MB/s ± 3%   299MB/s ± 0%  ~       (p=0.161 n=9+9)
_ZFlat9-8          247MB/s ± 1%   247MB/s ± 0%  ~       (p=0.465 n=8+9)
_ZFlat10-8        1.03GB/s ± 0%  1.05GB/s ± 1%  +1.75%  (p=0.000 n=9+9)
_ZFlat11-8         409MB/s ± 0%   412MB/s ± 0%  +0.64%  (p=0.000 n=8+8)
1 file changed