Re: [xml] htmlParseFile fails to parse HTML file with UTF-8 BOM/ZWNBSP



On Mon, Jun 29, 2009 at 04:45:52PM +0000, David Vergnaud wrote:

Environment: libxml2.so.2.6.32 (09.02.05), gcc 4.3.1, SuSE 11.x

Hi everyone,
[...]

A related bug seems to have been reported concerning DTDs with a BOM; this one is about plain HTML files.

I'm just trying to parse a basic HTML file using the htmlParseFile function. If the file starts with a UTF-8
"BOM" (ZWNBSP: 0xEF 0xBB 0xBF), then executing the program gives the following output:

  1/ Could you try with a version which is not 4 years old?
     2.7.3 has been out for a number of months now!
  2/ What does xmllint --html --noout report with a recent version
     when given your input?
  3/ If the problem persists with a recent version, provide your test
     file as a mail attachment.
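
  When preparing the test file for 3/, it may help to confirm that it
  really starts with the three bytes quoted above (0xEF 0xBB 0xBF).
  What follows is only a minimal sketch in plain C, independent of
  libxml2: has_utf8_bom and skip_utf8_bom are hypothetical helper names
  made up for illustration, not libxml2 functions, and the sketch says
  nothing about how the parser itself handles the BOM.

```c
#include <stddef.h>
#include <string.h>

/* UTF-8 encoding of U+FEFF: the byte sequence quoted in the report. */
static const unsigned char UTF8_BOM[3] = { 0xEF, 0xBB, 0xBF };

/* Hypothetical helper: returns 1 if the buffer starts with the UTF-8 BOM,
 * 0 otherwise (including when the buffer is shorter than 3 bytes). */
int has_utf8_bom(const unsigned char *buf, size_t len)
{
    return len >= sizeof UTF8_BOM
        && memcmp(buf, UTF8_BOM, sizeof UTF8_BOM) == 0;
}

/* Hypothetical helper: returns a pointer just past the BOM if one is
 * present and shrinks *len accordingly; otherwise returns buf unchanged. */
const unsigned char *skip_utf8_bom(const unsigned char *buf, size_t *len)
{
    if (has_utf8_bom(buf, *len)) {
        *len -= sizeof UTF8_BOM;
        return buf + sizeof UTF8_BOM;
    }
    return buf;
}
```

  Reading the first few bytes of the file into a buffer and passing them
  to has_utf8_bom is enough to tell whether the attachment actually
  exercises the BOM case.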

Daniel

-- 
Daniel Veillard      | libxml Gnome XML XSLT toolkit  http://xmlsoft.org/
daniel veillard com  | Rpmfind RPM search engine http://rpmfind.net/
http://veillard.com/ | virtualization library  http://libvirt.org/


