ARM32 NEON SIMD implementation of Huffman encoding

Full-color compression speedups relative to libjpeg-turbo 1.4.2:

800 MHz ARM Cortex-A9, iOS, 32-bit:  26-44% (avg. 32%)

Refer to #42 and #47 for discussion.

This commit also removes the unnecessary

    if (simd_support & JSIMD_ARM_NEON)

statements from the jsimd* algorithm functions.  Since the jsimd_can*()
functions check for the existence of NEON, the corresponding algorithm
functions will never be called if NEON isn't available.  Removing those
if statements improved performance across the board by a couple of
percent.
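
The pattern described above can be sketched as follows.  This is only an
illustration, not the code in this commit: every sketch_* name and the
flag value are hypothetical, a plain C loop stands in for the NEON
routine so the example runs anywhere, and the assert() is an assumption
of the sketch (a debug-only way to document the contract) rather than
something this commit adds.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_SIMD_NEON  0x10u  /* hypothetical capability flag */

static unsigned int sketch_simd_support = 0;  /* filled in once at init */
int sketch_last_path = -1;  /* 1 = "NEON" path, 0 = C path (demo only) */

/* Stand-in for run-time CPU feature detection. */
void sketch_init_simd(int has_neon)
{
  sketch_simd_support = has_neon ? SKETCH_SIMD_NEON : 0u;
}

/* Capability query: the only place that tests the NEON flag. */
int sketch_can_count_nonzero(void)
{
  return (sketch_simd_support & SKETCH_SIMD_NEON) != 0;
}

/* Algorithm function: no per-call flag test.  The caller must have
   consulted sketch_can_count_nonzero() first.  The assert compiles out
   when NDEBUG is defined, so release builds pay nothing for it. */
size_t sketch_count_nonzero_neon(const short *coefs, size_t n)
{
  size_t i, count = 0;

  assert(sketch_can_count_nonzero());
  sketch_last_path = 1;
  for (i = 0; i < n; i++)  /* placeholder for the real SIMD kernel */
    count += (coefs[i] != 0);
  return count;
}

/* Portable fallback. */
size_t sketch_count_nonzero_c(const short *coefs, size_t n)
{
  size_t i, count = 0;

  sketch_last_path = 0;
  for (i = 0; i < n; i++)
    count += (coefs[i] != 0);
  return count;
}

/* Caller: asks the capability query, then dispatches. */
size_t sketch_count_nonzero(const short *coefs, size_t n)
{
  if (sketch_can_count_nonzero())
    return sketch_count_nonzero_neon(coefs, n);
  return sketch_count_nonzero_c(coefs, n);
}
```

Because the flag is tested only in the capability query, the algorithm
function carries no branch of its own on the hot path.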

Based on:
https://github.com/mayeut/libjpeg-turbo/commit/fc023c880ce1d6c908fb78ccc25f5d5fd910ccc5

4 files changed
tree: 27d99cf089c94d4cd79a5cf9c696403fd6fd40e7