Re: [xml] HTMLparser: comments in <style> element

On Thu, Apr 12, 2007 at 01:14:20PM +1000, Michael Day wrote:
> Hi Daniel,
>
> Here is the patch to stop htmlParseScript() interpreting <!-- as the
> start of a comment.

  Yeah, the patch is apparently quite simple. But it has non-negligible
side effects, as can be seen in test/HTML/doc3.htm around line 785:
there is a SCRIPT embedded in the middle of the document, and it uses
<!-- to escape some JavaScript, which does

  From a script perspective '</' immediately ends the current
element (cf. the comment embedded in the function and the related )

  So changing this is not a guaranteed gain in absolute terms. More
errors are going to be raised (and the regression tests will need to be fixed).
To some extent the <!-- is used to avoid errors in some environments,
and basically that's what the libxml2 parser was doing.
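
  To picture the trade-off, here is a minimal Python sketch. It is not
libxml2 code: script_end() and its skip_comments flag are hypothetical
names, and the sample input is made up for illustration rather than taken
from doc3.htm. It assumes the script content ends at the first '</'
followed by a letter, and optionally steps over a <!-- ... --> span first.

```python
def script_end(html: str, start: int, skip_comments: bool) -> int:
    """Return the index where raw SCRIPT content starting at `start` ends.

    Hypothetical helper for illustration only, not the libxml2 algorithm.
    The content ends at the first '</' followed by an ASCII letter. With
    skip_comments=True, a '<!--' ... '-->' span is stepped over, so a '</'
    inside it does not end the element.
    """
    i = start
    n = len(html)
    while i < n:
        if skip_comments and html.startswith("<!--", i):
            close = html.find("-->", i + 4)
            if close == -1:
                return n  # unterminated comment: the rest is script content
            i = close + 3
            continue
        if (html.startswith("</", i) and i + 2 < n
                and html[i + 2].isascii() and html[i + 2].isalpha()):
            return i
        i += 1
    return n


# Made-up input: a script that hides its body behind <!-- ... -->
doc = '<script><!--\ndocument.write("</b>");\n// --></script>'
start = len("<script>")

# Treating <!-- as a comment keeps the whole body together.
with_comment = doc[start:script_end(doc, start, True)]

# Ignoring <!-- means the '</b' inside the string literal ends the element.
without_comment = doc[start:script_end(doc, start, False)]
```

  In the second case the content is cut at the '</b' inside the string
literal, and whatever follows is handed back to the HTML parser as markup,
which is one way the extra errors can show up.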

  I'm not against the change, but I must raise the drawback publicly
too before applying it.

  I have only one personal comment: HTML parsing is pure hell, you just
cannot do it right, no matter how hard you try.


Red Hat Virtualization group
Daniel Veillard      | virtualization library
veillard redhat com  | libxml GNOME XML XSLT toolkit | Rpmfind RPM search engine
