Re: [xml] HTMLparser: comments in <style> element

On Mon, Apr 09, 2007 at 06:29:08PM +1000, Michael Day wrote:

Currently the HTML parser seems to incorrectly parse comments in the 
<style> element. For example:

h1 { color: red }

Because this is HTML, not XML, and the <style> element is CDATA, not PCDATA, 
the <!-- should be treated as text, not as the beginning of a comment. 
However, the HTML parser seems to treat it as an actual comment. 
Surprisingly, the HTML parser does not treat &amp; as an entity 
reference, so it does seem to be partially treating <style> as CDATA.
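
A minimal sketch of the CDATA rule described above, assuming the content
of a CDATA element ends at the first "</" followed by a letter and that
nothing else inside it is markup. The function name cdata_content_length
is hypothetical and is not part of libxml2; this is an illustration of the
expected behaviour, not the code in htmlParseScript().

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not part of libxml2): return the length of the
 * raw text content of a CDATA element such as <style>, given a pointer
 * to the first character after the start tag.
 *
 * Assumption: the content ends at the first "</" followed by an ASCII
 * letter. "<!--", "-->" and "&amp;" get no special treatment, so they
 * stay part of the text.
 */
static size_t cdata_content_length(const char *s)
{
    const char *p = s;

    assert(s != NULL);
    while ((p = strchr(p, '<')) != NULL) {
        if (p[1] == '/' && isalpha((unsigned char)p[2]))
            return (size_t)(p - s);   /* end tag found: content stops here */
        p++;                          /* any other '<' is plain text */
    }
    return strlen(s);                 /* no end tag: everything is content */
}
```

Given an illustrative input such as "<!-- h1 { color: red } --></style>",
the returned length covers the whole "<!-- h1 { color: red } -->" run, so
under this rule the comment delimiters would end up inside the text of
the <style> element rather than in a comment node.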

  See htmlParseScript() in HTMLparser.c: it does indeed treat <!-- as
the start of a comment.
says nothing about comments, so someone who knows the SGML specifics
of the topic would have to say; sorry, I never studied SGML. If you have a
pointer to a description explaining that comments are not to be interpreted
in CDATA, a patch should be easy to design.
But the whole thing is a pile of ad-hoc attempts at working around code
written 10+ years ago, and honestly I doubt there is any code possible
in libxml2 which will satisfy the zillions of different behaviours expected
by various tools, agents, etc.


Red Hat Virtualization group
Daniel Veillard      | virtualization library
veillard redhat com  | libxml GNOME XML XSLT toolkit | Rpmfind RPM search engine
