I first examined the document with a binary editor. It appears to be correctly encoded.

The tests you recommended give the following results:
1) "xmllint": OK
2) "xmllint --stream": returns "failed to parse"

I have not tried the library on another platform (Linux).

Attached to this mail you will find the following documents (both encoded as UTF-16):
- shell.txt: a copy of the shell session
- testXmlReader.xml: the document parsed

Pierre

-----Original Message-----
From: Daniel Veillard [mailto:veillard redhat com]
Sent: Sunday, 22 June 2003 22:20
To: GARNIER Pierre
Cc: 'xml gnome org'
Subject: Re: [xml] Problems to parse UTF-16 encoded xml with libxml implementation of xmlReader

On Fri, Jun 20, 2003 at 05:31:23PM +0100, GARNIER Pierre wrote:
> When using xmlReader to parse an XML document encoded in UTF-16, the parser fails to read nodes. It seems that the document is not recognized as a UTF-16 encoded document.
The instance must have a problem. Have you checked it with xmllint?
> The document is in UTF-16 little endian.
That should work; there are tests in the regression suite for UTF-16.
> I resolved my problem by converting the document from UTF-16 to UTF-8 myself before parsing it.
That should not be needed.
> Is this the only solution? Is this a bad solution regarding performance? Is xmlReader supposed to parse only UTF-8 encoded XML?
No, No (libxml2 will convert internally), and No. Make 100% sure your XML file is correct: check it with xmllint. Then run xmllint --stream on it to check the reader interface. If one fails and not the other, send a copy. Otherwise you probably have a problem with the document.

Daniel

--
Daniel Veillard      | Red Hat Network https://rhn.redhat.com/
veillard redhat com  | libxml GNOME XML XSLT toolkit http://xmlsoft.org/
http://veillard.com/ | Rpmfind RPM search engine http://rpmfind.net/
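
For reference, the manual workaround mentioned in the quoted exchange (converting the document from UTF-16 to UTF-8 before parsing) might look roughly like the Python sketch below. This is not libxml2 code and not the code actually used in this thread: the function name utf16_xml_to_utf8 is hypothetical, and the sketch assumes the whole document fits in memory and starts with a byte order mark. As the reply above says, libxml2 converts internally, so this step should not normally be needed.

```python
import re


def utf16_xml_to_utf8(data: bytes) -> bytes:
    """Re-encode a UTF-16 XML document as UTF-8 (hypothetical helper)."""
    # The "utf-16" codec uses the byte order mark to pick the endianness
    # and drops it from the decoded text. Without a BOM it falls back to
    # the platform's native byte order, which is an assumption to verify.
    text = data.decode("utf-16")
    # Rewrite the encoding named in the XML declaration, if there is one,
    # so that it matches the new byte encoding.
    text = re.sub(
        r'^(<\?xml[^>]*?encoding=)(["\'])[^"\']*\2',
        r'\1\2UTF-8\2',
        text,
        count=1,
    )
    return text.encode("utf-8")
```

Leaving the declaration at encoding="UTF-16" after re-encoding would make the declared and actual encodings disagree, which is why the sketch rewrites it.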
Attachment:
shell.txt
Description: Text document
Attachment:
testXmlReader.xml
Description: Binary data
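
The binary-editor check described at the top of this message can be approximated with a few lines of Python that look at the byte order mark of the attached document. This is only a sketch: the function name sniff_utf16_bom is hypothetical, and it inspects the first bytes only, so it says nothing about whether the rest of the file is well-formed.

```python
import codecs


def sniff_utf16_bom(data: bytes) -> str:
    """Label the byte order mark at the start of data (hypothetical helper)."""
    # The UTF-32 little-endian BOM (FF FE 00 00) begins with the UTF-16
    # little-endian BOM (FF FE), so it has to be ruled out first.
    if data.startswith(codecs.BOM_UTF32_LE):
        return "UTF-32LE"
    if data.startswith(codecs.BOM_UTF16_LE):  # bytes FF FE
        return "UTF-16LE"
    if data.startswith(codecs.BOM_UTF16_BE):  # bytes FE FF
        return "UTF-16BE"
    return "no UTF-16 BOM"
```

To try it on the attachment, read the first four bytes of testXmlReader.xml in binary mode and pass them to the function. A little-endian UTF-16 document, as described in this thread, would be expected to report "UTF-16LE".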