RE: [xml] Problems to parse UTF-16 encoded xml with libxml implementation of xmlReader



I first examined the document with a binary editor. It appears to be
correctly encoded.
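
For anyone who wants to repeat that check without a binary editor, here is a
minimal sketch in Python. It is not part of libxml2, the function name
`sniff_bom` is hypothetical, and it only recognizes the UTF-8 and UTF-16 byte
order marks. A UTF-16 little-endian file is expected to start with the bytes
FF FE.

```python
import codecs


def sniff_bom(data: bytes) -> str:
    """Report which byte order mark, if any, starts the given bytes.

    Hypothetical helper for illustration: it looks only at the UTF-8 and
    UTF-16 marks and does not try to recognize UTF-32.
    """
    if data.startswith(codecs.BOM_UTF8):       # EF BB BF
        return "UTF-8"
    if data.startswith(codecs.BOM_UTF16_LE):   # FF FE
        return "UTF-16LE"
    if data.startswith(codecs.BOM_UTF16_BE):   # FE FF
        return "UTF-16BE"
    return "none"
```

To use it on a file, open the file in binary mode, read the first few bytes,
and pass them to `sniff_bom`.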

The tests you recommended give the following results:
1) "xmllint" : OK
2) "xmllint --stream" returns "failed to parse"
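
The two runs can be repeated from a script along these lines. This is only a
sketch: the helper names `xmllint_commands` and `run_checks` are hypothetical,
the `--noout` flag is my addition to suppress the dump of the document, and it
assumes that xmllint is on the PATH and exits with a non-zero status when
parsing fails.

```python
import subprocess


def xmllint_commands(path):
    """Build the two command lines: tree parser and reader (--stream)."""
    return {
        "tree": ["xmllint", "--noout", path],
        "stream": ["xmllint", "--noout", "--stream", path],
    }


def run_checks(path, runner=subprocess.run):
    """Run both modes and map each one to True when the exit status is 0.

    `runner` defaults to subprocess.run; it is a parameter only so that the
    function can be exercised without xmllint installed.
    """
    results = {}
    for mode, cmd in xmllint_commands(path).items():
        proc = runner(cmd, capture_output=True)
        results[mode] = proc.returncode == 0
    return results
```

Calling `run_checks("testXmlReader.xml")` returns a dictionary with one entry
per mode, so a document that passes in "tree" mode but fails in "stream" mode
is easy to spot.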

I have not tried the library on another platform (Linux).

Attached to this mail you will find the following documents (both encoded
as UTF-16):
  - shell.txt          : a copy of the shell interface
  - testXmlReader.xml  : the document parsed

Pierre

-----Original Message-----
From: Daniel Veillard [mailto:veillard redhat com]
Sent: Sunday, 22 June 2003 22:20
To: GARNIER Pierre
Cc: 'xml gnome org'
Subject: Re: [xml] Problems to parse UTF-16 encoded xml with libxml
implementation of xmlReader


On Fri, Jun 20, 2003 at 05:31:23PM +0100, GARNIER Pierre wrote:
> When using xmlReader in order to parse an xml document encoded in
> UTF-16 the parser fails to read nodes.
> It seems that the document is not recognized as a UTF-16 encoded
> document.

  The instance must have a problem, have you checked xmllint against it?

> The document is in UTF-16 little endian

  That should work, there are tests in the regression suite for UTF-16

> I resolved my problem by converting the document from UTF-16 encoding
> to UTF-8 encoding myself before parsing it.

  That should not be needed

> Is this the only solution? Is this a bad solution with regard to
> performance? Is xmlReader supposed to parse only UTF-8 encoded xml?

  No, No (libxml2 will convert internally), and No.
Make 100% sure your XML file is correct, check xmllint against it. Then
check xmllint --stream against it to check the reader interface. If one
fails and not the other, send a copy. Otherwise you probably have a problem
with the document.

Daniel

--
Daniel Veillard      | Red Hat Network https://rhn.redhat.com/
veillard redhat com  | libxml GNOME XML XSLT toolkit  http://xmlsoft.org/
http://veillard.com/ | Rpmfind RPM search engine http://rpmfind.net/

Attachment: shell.txt
Description: Text document

Attachment: testXmlReader.xml
Description: Binary data


