Re: [xml] HTML script/style parsing change in 2.6.28
- From: Michael Day <mikeday yeslogic com>
- To: "Edward Z. Yang" <edwardzyang thewritingpot com>
- Cc: xml gnome org
- Subject: Re: [xml] HTML script/style parsing change in 2.6.28
- Date: Sat, 16 Feb 2008 17:09:27 +1100
Hi Edward,
> HTMLparser.c: change the way script/style are parsed to
> not try to detect comments, reported by Mike Day (2.6.28)
That would be me.
> But, despite my Google-fu, I couldn't find what exactly the change
> entailed. Let's suppose we have the code:
> <script><!--
> alert('Test!');
> // --></script>
In XML, this is a comment. In HTML, it isn't, because the content of
<script> and <style> elements is unparsed CDATA.
The habit of commenting out the content dates back to ancient browsers
that didn't recognise these elements, and would include the JavaScript
or CSS as literal text in the page.
Since no browsers do this any more, there is no point adding comments,
but millions of existing pages already have them.
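
To see the browser-style behaviour in isolation, here is a minimal sketch. It uses Python's standard html.parser rather than libxml2 (an assumption made only so the example is self-contained), and the ScriptProbe class and probe function are hypothetical names. The sample markup is the one from the question. The text inside <script> should arrive as character data, with no comment reported.

```python
from html.parser import HTMLParser


class ScriptProbe(HTMLParser):
    """Record whether <script> content arrives as text or as a comment."""

    def __init__(self):
        super().__init__()
        self.text_chunks = []  # character data seen by the parser
        self.comments = []     # comment bodies seen by the parser

    def handle_data(self, data):
        self.text_chunks.append(data)

    def handle_comment(self, data):
        self.comments.append(data)


def probe(markup):
    """Parse markup and return (all text joined together, list of comments)."""
    parser = ScriptProbe()
    parser.feed(markup)
    parser.close()
    return "".join(parser.text_chunks), parser.comments


# The sample from the question above.
SAMPLE = "<script><!--\nalert('Test!');\n// --></script>"
script_text, comments = probe(SAMPLE)
```

Here script_text still contains the <!-- and --> delimiters, which is the same shape of result the question describes for libxml2 2.6.28.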
> 1. Is the behavior, as I observed it, true to the intention of the change?
Yes, it makes the libxml2 HTML parser consistent with web browsers.
> 2. Is this behavior desirable? As it turns out, the new version returns
> *invalid* JavaScript (unless our js parser is smart enough to ignore a
> leading <!--)
HTML comment delimiters should be ignored by the script or style parser
that consumes the content. If you check the CSS spec, you'll see that it
actually mentions how <!-- is ignored.
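
If the downstream JavaScript parser does not skip these delimiters itself, one option is to strip the wrapper before handing the text over. The following is only a sketch: strip_comment_wrapper is a hypothetical helper, not part of libxml2, and it assumes the old hiding convention of a leading <!-- line and a trailing --> (optionally preceded by //).

```python
import re

# Leading "<!--" plus whatever follows it on the same line.
_LEADING = re.compile(r"\A\s*<!--[^\r\n]*(?:\r\n|\r|\n)?")
# Trailing "-->", optionally preceded by "//", at the very end of the text.
_TRAILING = re.compile(r"(?://)?[ \t]*-->\s*\Z")


def strip_comment_wrapper(script_text):
    """Remove an HTML comment wrapper from script content, if one is present."""
    text = _LEADING.sub("", script_text, count=1)
    text = _TRAILING.sub("", text, count=1)
    return text


cleaned = strip_comment_wrapper("<!--\nalert('Test!');\n// -->")
```

Text without the wrapper passes through unchanged, so the helper can be applied regardless of which libxml2 version produced the content.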
> 3. Is it a good idea to do a libxml version sniff (2.6.28 or later) to
> accommodate this behavior change?
No idea, entirely up to you :)
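
For anyone who does go the sniffing route, a rough sketch follows. It assumes the version is available as the integer libxml2 publishes as LIBXML_VERSION (major * 10000 + minor * 100 + micro, so 2.6.28 is 20628). The function names are hypothetical, and how you obtain the number depends on your binding.

```python
# First release with the new script/style parsing, per the changelog entry.
SCRIPT_PARSING_CHANGE = (2, 6, 28)


def decode_libxml_version(number):
    """Split a LIBXML_VERSION-style integer into (major, minor, micro)."""
    major, rest = divmod(number, 10000)
    minor, micro = divmod(rest, 100)
    return (major, minor, micro)


def script_comments_kept_as_text(number):
    """True if this libxml2 version leaves <!-- --> inside script/style text."""
    return decode_libxml_version(number) >= SCRIPT_PARSING_CHANGE
```

Comparing tuples instead of raw integers keeps the threshold readable, although comparing against 20628 directly would work just as well.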
Best regards,
Michael
--
Print XML with Prince!
http://www.princexml.com