Optimize silk_NSQ_del_dec() for ARM NEON

The optimization is bit-exact with the C function.

This optimization speeds up the SILK encoder on NEON as follows.

Fixed-point:
Complexity 0-5:  0%
Complexity 6-7:  6%
Complexity 8-9: 10%
Complexity  10:  8%

Similar results were obtained with the floating-point build.

Change-Id: I11aa1d599ab2e9c6d63901542a079fb3165ad04f

Signed-off-by: Jean-Marc Valin <jmvalin@jmvalin.ca>
7 files changed