You need to pass XML_PARSE_NOCDATA | XML_PARSE_DTDATTR | XML_PARSE_NOENT
as the parser options for the XSLT and the XML to be sure to have a compliant
XPath data model representation in the tree.
I can't make sense of that code, and that is where the error is. Why use a push parser when you already have all the data in memory?
I copied it from our XML+CSS code, which parses incrementally. I can obviously simplify the XSLT case, since that won't be incremental. I can change that.
That will be way easier to maintain IMO.
Why try to fool
the parser about the encoding when libxml2 already implements the encoding
detection specified in appendix F of the XML specification? Also,
xmlCreatePushParserCtxt has encoding detection; you're redoing, in
a likely untested way, what libxml2 has done reliably for ages. By
forcing a cast to UTF-16 you're breaking the encoding detection,
you're hurting performance, and you're likely to also break the
conformance of the parser. Do not force a cast to UTF-16, it's really, really bad!
Beware too of the decoupling between the HTTP engine and the XML parser:
you must read http://www.w3.org/TR/REC-xml/#sec-guessing and RFC 3023,
because you will have to pass the parser the encoding as declared in the
Content-Type header.
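
To illustrate the hand-off from the HTTP layer, here is a rough sketch of extracting the charset parameter from a Content-Type value. The function content_type_charset is a hypothetical helper, not part of libxml2 or of the code under review, and it is deliberately simplified: no quoted-pair escapes, no whitespace around the equals sign.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy the charset parameter of a Content-Type
 * value into out. Returns 1 if a non-empty charset was found, else 0
 * (out is then the empty string). */
static int content_type_charset(const char *value, char *out, size_t out_size)
{
    static const char key[] = "charset=";

    assert(value != NULL && out != NULL && out_size > 0);
    out[0] = '\0';

    const char *p = strchr(value, ';');
    while (p != NULL) {
        p++;                                    /* skip the ';' */
        while (*p == ' ' || *p == '\t')         /* skip leading whitespace */
            p++;

        /* Case-insensitive match of the parameter name. */
        size_t i = 0;
        while (key[i] != '\0' && tolower((unsigned char)p[i]) == key[i])
            i++;

        if (key[i] == '\0') {
            p += i;
            int quoted = (*p == '"');
            if (quoted)
                p++;
            size_t n = 0;
            while (*p != '\0' && n + 1 < out_size) {
                if (quoted ? (*p == '"')
                           : (*p == ';' || *p == ' ' || *p == '\t'))
                    break;
                out[n++] = *p++;
            }
            out[n] = '\0';
            return n > 0;
        }
        p = strchr(p, ';');                     /* try the next parameter */
    }
    return 0;
}
```

When the helper returns 1, the result could be passed as the encoding argument of xmlReadMemory. When it returns 0, passing NULL leaves the detection to libxml2.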
I didn't write this particular code, but I believe that it was added to
fix a bug where XHTML with a BOM was not rendering correctly. I'll try
to get you more details so that we can figure out what's going on.
libxml2 will detect and use the BOM if present. But when you create
the push parser context you should pass down the first 4 bytes of the
entity. Again, this is all related to appendix F in the XML Rec. This is
tricky to get fully right, and I think bypassing the libxml2 code, which
implements that part of the spec as exactly as possible, is likely to break
the parser conformance.