Make htmlCurrentChar always translate U+0000

The general assumption is that htmlCurrentChar only returns 0 if the
end of the input buffer is reached. The UTF-8 path already logged an
error if a zero byte U+0000 was found and returned a space character
instead. Make the ASCII code path do the same.

htmlParseTryOrFinish skips zero bytes at the beginning of a buffer, so
even if 0 was returned from htmlCurrentChar, the push parser would make
progress. But rescanning the input could cause performance problems.

The pull parser would abort parsing and now handles zero bytes in ASCII
mode the same way as the push parser or as in UTF-8 mode.

It would be better to return the replacement character U+FFFD instead,
but some of the client code assumes that the UTF-8 length of input and
output matches.
1 file changed
tree: f746054947ab6af7365eda14712099d5845f7022
  1. bakefile/
  2. doc/
  3. example/
  4. fuzz/
  5. include/
  6. macos/
  7. optim/
  8. os400/
  9. python/
  10. result/
  11. test/
  12. vms/
  13. VxWorks/
  14. win32/
  15. xstc/
  16. .gitattributes
  17. .gitignore
  18. .gitlab-ci.yml
  19. .travis.yml
  20. acinclude.m4
  21. AUTHORS
  22. autogen.sh
  23. buf.c
  24. buf.h
  25. build_glob.py
  26. c14n.c
  27. catalog.c
  28. ChangeLog
  29. check-relaxng-test-suite.py
  30. check-relaxng-test-suite2.py
  31. check-xinclude-test-suite.py
  32. check-xml-test-suite.py
  33. check-xsddata-test-suite.py
  34. chvalid.c
  35. chvalid.def
  36. CMakeLists.txt
  37. config.h.cmake.in
  38. configure.ac
  39. CONTRIBUTING
  40. Copyright
  41. dbgen.pl
  42. dbgenattr.pl
  43. debugXML.c
  44. dict.c
  45. DOCBparser.c
  46. elfgcchack.h
  47. enc.h
  48. encoding.c
  49. entities.c
  50. error.c
  51. genChRanges.py
  52. gentest.py
  53. genUnicode.py
  54. global.data
  55. globals.c
  56. hash.c
  57. HTMLparser.c
  58. HTMLtree.c
  59. INSTALL.libxml2
  60. legacy.c
  61. libxml-2.0-uninstalled.pc.in
  62. libxml-2.0.pc.in
  63. libxml.3
  64. libxml.h
  65. libxml.m4
  66. libxml.spec.in
  67. libxml2-config.cmake.cmake.in
  68. libxml2-config.cmake.in
  69. libxml2.doap
  70. libxml2.syms
  71. list.c
  72. MAINTAINERS
  73. Makefile.am
  74. Makefile.tests
  75. Makefile.win
  76. nanoftp.c
  77. nanohttp.c
  78. NEWS
  79. parser.c
  80. parserInternals.c
  81. pattern.c
  82. README
  83. README.cvs-commits
  84. README.tests
  85. README.zOS
  86. regressions.py
  87. regressions.xml
  88. relaxng.c
  89. rngparser.c
  90. runsuite.c
  91. runtest.c
  92. runxmlconf.c
  93. save.h
  94. SAX.c
  95. SAX2.c
  96. schematron.c
  97. testapi.c
  98. testAutomata.c
  99. testC14N.c
  100. testchar.c
  101. testdict.c
  102. testdso.c
  103. testHTML.c
  104. testlimits.c
  105. testModule.c
  106. testOOM.c
  107. testOOMlib.c
  108. testOOMlib.h
  109. testReader.c
  110. testrecurse.c
  111. testRegexp.c
  112. testRelax.c
  113. testSAX.c
  114. testSchemas.c
  115. testThreads.c
  116. testURI.c
  117. testXPath.c
  118. threads.c
  119. timsort.h
  120. TODO
  121. TODO_SCHEMAS
  122. tree.c
  123. trio.c
  124. trio.h
  125. triodef.h
  126. trionan.c
  127. trionan.h
  128. triop.h
  129. triostr.c
  130. triostr.h
  131. uri.c
  132. valid.c
  133. xinclude.c
  134. xlink.c
  135. xml2-config.1
  136. xml2-config.in
  137. xml2Conf.sh.in
  138. xmlcatalog.c
  139. xmlIO.c
  140. xmllint.c
  141. xmlmemory.c
  142. xmlmodule.c
  143. xmlreader.c
  144. xmlregexp.c
  145. xmlsave.c
  146. xmlschemas.c
  147. xmlschemastypes.c
  148. xmlstring.c
  149. xmlunicode.c
  150. xmlwriter.c
  151. xpath.c
  152. xpointer.c
  153. xzlib.c
  154. xzlib.h