I mentioned this a while back, but now I can demonstrate it using the
standard libxml2 utils, and I have a fix for it.

If you take the attached XML file (OK, it's RDF/XML, but that's not
important) and run it like this:

  $ xmllint --version
  xmllint: using libxml version 20607
     compiled with: DTDValid FTP HTTP HTML C14N Catalog XPath XPointer
     XInclude Iconv Unicode Regexps Automata Schemas

  $ xmllint bad-cdata-libxml.rdf
  bad-cdata-libxml.rdf:13: parser error : EntityRef: expecting ';'
  blah blah A&B
               ^

  $ xmllint --push bad-cdata-libxml.rdf
  bad-cdata-libxml.rdf:11: parser error : EntityRef: expecting ';'
  blah blah A&B
               ^

The reported line numbers differ by 2 (13 versus 11). The difference is
caused by the two newlines in the CDATA section not being counted in the
push parser.

The error is that the SKIP macro in parser.c does not count the newlines
that it crosses, which affects CDATA. The solution is to make a new SKIPL
macro that does count them, and to use it in the two places that need it.

Patch attached, working against CVS sources right now.

Dave
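
For anyone who wants to see the idea without opening the patch, here is a
minimal standalone sketch. It is not the attached patch and it does not use
libxml2's real data structures: the Cursor type and the skip() and
skip_lines() functions are hypothetical names made up for this example. It
only shows why advancing the input pointer without looking at the skipped
bytes loses line numbers, and how a line-aware variant avoids that.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-in for a parser's input position.
 * This is NOT libxml2's real input structure. */
typedef struct {
    const char *cur;  /* next unread byte */
    const char *end;  /* one past the last byte */
    int line;         /* 1-based line number */
    int col;          /* 1-based column number */
} Cursor;

static void cursor_init(Cursor *c, const char *text)
{
    assert(c != NULL && text != NULL);
    c->cur = text;
    c->end = text + strlen(text);
    c->line = 1;
    c->col = 1;
}

/* Plain skip: advances n bytes and only bumps the column.
 * Any newline among the skipped bytes is silently lost. */
static void skip(Cursor *c, size_t n)
{
    size_t left = (size_t)(c->end - c->cur);
    if (n > left)
        n = left;
    c->cur += n;
    c->col += (int)n;
}

/* Line-aware skip: inspects each skipped byte and updates
 * line and column when it crosses a newline. */
static void skip_lines(Cursor *c, size_t n)
{
    while (n-- > 0 && c->cur < c->end) {
        if (*c->cur == '\n') {
            c->line++;
            c->col = 1;
        } else {
            c->col++;
        }
        c->cur++;
    }
}
```

Skipping the same four bytes of "a\nb\nc&d" with both functions leaves the
two cursors at the same byte, but only skip_lines() reports line 3; skip()
still claims line 1. That is the same kind of off-by-two seen in the
xmllint output above.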
Attachment: bad-cdata-libxml.rdf (Description: Text Data)
Attachment: parser.c.patch (Description: Text document)