Re: [xml] HTML push parser fix for repeated start tags



On Sun, Jul 03, 2005 at 11:17:40PM +0100, James Bursa wrote:
> I found that the following document:
>
>   <td><td><!-- <a><b> -->
>
> was not parsing correctly and giving an error in the HTML push parser:
> [...]
> I tracked the problem down to the code around line 4773 of HTMLparser.c.
> The if statement appears to be intended to check if htmlParseStartTag()
> failed. It compares the tag name and depth with those before the call,
> and assumes that htmlParseStartTag() failed if they are equal. However,
> this situation occurs in the case above when the second <td> is being
> parsed. The depth is equal because a td start tag is defined to close
> any open td (in htmlStartClose).
>
> The result is that the parser is left in the wrong state for parsing the
> comment, and that's why the "invalid element name" error occurs.
>
> The attached patch fixes the bug by making htmlParseStartTag() return 0
> on success and -1 on error, and replacing the comparison of tag name and
> depth.
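
The failure mode described above can be reproduced in a toy model. This is only a sketch and not the libxml2 code: ToyCtxt, toy_current(), toy_start_tag() and toy_heuristic_failed() are hypothetical names invented for illustration, and the auto-close rule is reduced to "a new td closes an open td". It contrasts the "same name and depth as before" check with an explicit 0 / -1 return value.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_MAX_DEPTH 16

/* Hypothetical toy parser state: a stack of open element names. */
typedef struct {
    const char *names[TOY_MAX_DEPTH];
    int depth;
} ToyCtxt;

/* Innermost open element name, or NULL when nothing is open. */
const char *toy_current(const ToyCtxt *c)
{
    return c->depth > 0 ? c->names[c->depth - 1] : NULL;
}

/* Returns 0 on success and -1 on error; the stack is untouched on error.
 * Assumption for this sketch: a new "td" first closes an open "td",
 * mirroring the rule the thread attributes to htmlStartClose. */
int toy_start_tag(ToyCtxt *c, const char *name)
{
    assert(c != NULL);
    if (name == NULL || name[0] == '\0')
        return -1;                      /* invalid element name */

    int auto_close = c->depth > 0 && strcmp(name, "td") == 0 &&
                     strcmp(toy_current(c), "td") == 0;
    if (c->depth - auto_close >= TOY_MAX_DEPTH)
        return -1;                      /* stack full */

    c->depth -= auto_close;             /* implicitly close the open td */
    c->names[c->depth++] = name;        /* open the new element */
    return 0;
}

/* The heuristic under discussion: call the start-tag handler, then report
 * failure if the current name and depth are the same as before the call. */
int toy_heuristic_failed(ToyCtxt *c, const char *name)
{
    const char *before = toy_current(c);
    int depth_before = c->depth;

    (void) toy_start_tag(c, name);

    const char *after = toy_current(c);
    int same_name = (before == NULL && after == NULL) ||
                    (before != NULL && after != NULL &&
                     strcmp(before, after) == 0);
    return c->depth == depth_before && same_name;
}
```

With one td already open, toy_heuristic_failed(&ctxt, "td") reports a failure even though the second td was opened correctly, because the implicit close leaves both the name and the depth unchanged. toy_start_tag() returns 0 for the same input, which is why checking a return value avoids the false positive.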

  Excellent, analysis and patch look right on, applied and committed.
I added the test to the regression suite.

   thanks !

Daniel

-- 
Daniel Veillard      | Red Hat Desktop team http://redhat.com/
veillard redhat com  | libxml GNOME XML XSLT toolkit  http://xmlsoft.org/
http://veillard.com/ | Rpmfind RPM search engine http://rpmfind.net/


