Hi Daniel,

The invalid comment in wired.html is this:

<!------TRADES--------->

Because it contains an odd number of "--" sequences, the comment is not terminated according to the SGML rules. Web browsers parse this comment differently depending on whether they are parsing the document in standards-mode or quirks-mode.

I have attached an HTML document that demonstrates the issue. If you open it in Mozilla, it will be parsed in standards-mode because it has a DOCTYPE declaration. In this case the comment is not terminated and some of the document text is hidden. If you delete the DOCTYPE, the document is parsed in quirks-mode, the comment is terminated and the text is shown.

I cannot think of any way to detect comment termination that handles both cases correctly without adding a quirks-mode feature to the libxml HTMLparser; I see no other way to parse both old HTML and new HTML and get them both right.

Would it be reasonable for me to add a quirks-mode flag to the HTML parser that, for now, would only toggle the comment parsing behaviour?

Cheers,

Michael

-- 
Print XML with Prince!
http://www.princexml.com
There will be text after this paragraph in quirks mode.
This will be hidden in standards-mode.-->
There will be text before this paragraph in quirks mode.
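
To make the difference between the two modes concrete, here is a minimal Python sketch of the two termination rules discussed above. It is not libxml code: the function names `sgml_comment_end` and `quirks_comment_end` are hypothetical, and the SGML version is simplified (it toggles comment state on every "--" and ignores the rule that only whitespace may appear between comments), so treat it as an illustration under those assumptions rather than what any browser actually implements.

```python
def sgml_comment_end(text, start=0):
    """Simplified SGML rule: inside "<!" ... ">", every "--" toggles
    comment state, and ">" only ends the declaration when we are
    outside a comment. Returns the index just past ">" or -1."""
    if not text.startswith("<!--", start):
        raise ValueError("no comment declaration at this position")
    i = start + 2  # position just after "<!"
    in_comment = False
    while i < len(text):
        if text.startswith("--", i):
            in_comment = not in_comment
            i += 2
        elif not in_comment and text[i] == ">":
            return i + 1
        else:
            i += 1
    return -1  # ran off the end: comment never terminated


def quirks_comment_end(text, start=0):
    """Quirks-style rule: the comment ends at the first "-->" after
    the opening "<!--". Returns the index just past it or -1."""
    if not text.startswith("<!--", start):
        raise ValueError("no comment declaration at this position")
    end = text.find("-->", start + 4)
    return -1 if end == -1 else end + 3


if __name__ == "__main__":
    doc = "<!------TRADES--------->hidden text-->shown"
    print(repr(doc[sgml_comment_end(doc):]))    # text left after SGML rule
    print(repr(doc[quirks_comment_end(doc):]))  # text left after quirks rule
```

With the comment from wired.html on its own, the simplified SGML rule never finds a terminator, while the quirks rule ends the comment at the first "-->". A quirks-mode flag in the parser would essentially select between these two behaviours.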