Re: [xml] Crash when parsing bad HTML



On Thu, Aug 23, 2007 at 03:58:40AM -0400, Daniel Veillard wrote:
On Wed, Aug 22, 2007 at 07:16:51PM -0400, Pierre Belzile wrote:

   Hi,
   I'm using the HTML parser (htmlParseDocument) and my application
   segfaults when processing this document:
   <HTML>
   <PRE>
     Some text
   <?PRE>
   <HTML>
   The gdb traceback is:
   #0  0x0000002a96e3a4e4 in xmlSAX2ProcessingInstruction () from /usr/lib64/libxml2.so.2
   #1  0x0000002a96db9779 in htmlParseEntityRef () from /usr/lib64/libxml2.so.2
   #2  0x0000002a96dbb69a in htmlParseElement () from /usr/lib64/libxml2.so.2
   # ...
   Of course, it's bad HTML, but an exception would be more acceptable. I
   tried using the latest official version (2.6.29) and the problem
   occurs there too. Is there a patch somewhere? If not, a hint would be
   appreciated because I'm going to have to fix it.
   Cheers, Pierre

  I can't reproduce this with xmllint --html . You MUST provide the full
input document as an attachment. If you do that

  Also note that the document you gave has no entities, so the stack trace you
gave cannot correspond to what actually happened: htmlParseElement() never
calls htmlParseEntityRef() directly, and htmlParseEntityRef() does not
call xmlSAX2ProcessingInstruction(). In a nutshell, the stack trace
information is completely unusable!
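
  For readers following the thread, the reproduction attempt described
above can be sketched roughly as follows. This is only an approximation:
the file name bad.html is hypothetical, and the content is retyped from the
message body, so whitespace and encoding may differ from the original
input, which was never attached.

```shell
# Recreate the document as it appears in the report (an approximation,
# not the reporter's original file).
cat > bad.html <<'EOF'
<HTML>
<PRE>
  Some text
<?PRE>
<HTML>
EOF

# Run the HTML parser through xmllint, if it is installed.
if command -v xmllint >/dev/null 2>&1; then
    xmllint --html bad.html || echo "xmllint exited with status $?"
else
    echo "xmllint not found; install libxml2's command-line tools to try this"
fi
```

  If a crash does show up, running the same command under gdb against a
libxml2 build with debugging symbols should give a backtrace that can
actually be matched against the source.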

Daniel

-- 
Red Hat Virtualization group http://redhat.com/virtualization/
Daniel Veillard      | virtualization library  http://libvirt.org/
veillard redhat com  | libxml GNOME XML XSLT toolkit  http://xmlsoft.org/
http://veillard.com/ | Rpmfind RPM search engine  http://rpmfind.net/


