Re: [xml] HTML script/style parsing change in 2.6.28
- From: Michael Day <mikeday yeslogic com>
- To: "Edward Z. Yang" <edwardzyang thewritingpot com>
- Cc: xml gnome org
- Subject: Re: [xml] HTML script/style parsing change in 2.6.28
- Date: Sat, 16 Feb 2008 17:09:27 +1100
HTMLparser.c: change the way script/style are parsed to
not try to detect comments, reported by Mike Day (2.6.28)
That would be me.
But, despite my Google-fu, I couldn't find what exactly the change
entailed. Let's suppose we have the code:
In XML, this is a comment. In HTML, it isn't, as <script> and <style>
tags are unparsed CDATA in HTML.
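
To make the distinction concrete, here is a small sketch. It uses Python's standard-library html.parser as a stand-in, not libxml2 itself, and the sample markup is hypothetical (it is not the snippet from the original question). The point is only that an HTML parser hands the content of <script> over as text, delimiters included, while the same delimiters elsewhere produce a comment.

```python
from html.parser import HTMLParser


class EventCollector(HTMLParser):
    """Record text events and comment events separately."""

    def __init__(self):
        super().__init__()
        self.text = []
        self.comments = []

    def handle_data(self, data):
        self.text.append(data)

    def handle_comment(self, data):
        self.comments.append(data)


def parse(html):
    """Return (all text joined together, list of comment bodies)."""
    collector = EventCollector()
    collector.feed(html)
    collector.close()
    return "".join(collector.text), collector.comments


# Hypothetical input: a script whose content is wrapped in comment delimiters.
script_text, script_comments = parse("<script><!--\nalert('hi');\n//--></script>")

# The same delimiters outside <script>/<style> are a real comment.
para_text, para_comments = parse("<p><!-- note --></p>")
```

With this stand-in parser, script_comments comes back empty and the "<!--" ends up inside script_text, whereas para_comments holds the one comment.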
The habit of commenting out the content dates back to ancient browsers
that did not recognize these elements and would display the JavaScript
or CSS as literal text in the page.
Since no browsers do this any more, there is no point adding comments,
but millions of existing pages already have them.
1. Is the behavior, as I observed it, true to the intention of the change?
Yes, it makes the libxml2 HTML parser consistent with web browsers.
2. Is this behavior desirable? As it turns out, the new version returns
HTML comment delimiters should be ignored by the parser. If you check
the CSS spec, you'll see that it actually mentions how <!-- is ignored.
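
If your code needs the bare stylesheet text, one option is to drop the delimiters yourself after parsing. The following is a minimal sketch, assuming the only delimiters are a single wrapping pair at the very start and end of the <style> content; the function name is made up for illustration, and a real CSS tokenizer handles these tokens more generally.

```python
import re

# "<!--" at the very start and "-->" at the very end, allowing whitespace.
_OPEN_DELIMITER = re.compile(r"^\s*<!--")
_CLOSE_DELIMITER = re.compile(r"-->\s*$")


def strip_comment_delimiters(style_text):
    """Remove one wrapping <!-- ... --> pair from style content, if present."""
    without_open = _OPEN_DELIMITER.sub("", style_text, count=1)
    without_close = _CLOSE_DELIMITER.sub("", without_open, count=1)
    return without_close.strip()


wrapped = "<!--\nbody { color: red }\n-->"
plain = "p { margin: 0 }"
```

Here strip_comment_delimiters(wrapped) gives "body { color: red }", and content without delimiters passes through unchanged.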
3. Is it a good idea to do a libxml version sniff (2.6.28 or later) to
accommodate this behavior change?
No idea, entirely up to you :)
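
If you do decide to sniff, a sketch along these lines may be enough. It assumes you can get the libxml2 version as a dotted string (in C, for example, from the LIBXML_DOTTED_VERSION macro); how you obtain it from your binding is up to you, and the function names here are hypothetical. Comparing numeric tuples rather than strings avoids "2.10.0" sorting before "2.6.28".

```python
# First release with the script/style change discussed in this thread.
CHANGE_VERSION = (2, 6, 28)


def parse_version(version):
    """Turn a dotted version string such as '2.6.28' into a tuple of ints."""
    return tuple(int(part) for part in version.split("."))


def script_comments_kept_as_text(libxml_version):
    """True if this libxml2 version no longer detects comments in script/style."""
    return parse_version(libxml_version) >= CHANGE_VERSION
```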
Print XML with Prince!