Re: [xml] Problems to parse UTF-16 encoded xml with libxml implementation of xmlReader



On Fri, Jun 20, 2003 at 05:31:23PM +0100, GARNIER Pierre wrote:
>   When using xmlReader to parse an XML document encoded in UTF-16,
> the parser fails to read nodes.
>   It seems that the document is not recognized as a UTF-16 encoded document.

  The instance must have a problem. Have you checked it with xmllint?

>   The document is in UTF-16 little endian.

  That should work; there are tests in the regression suite for UTF-16.

>   I resolved my problem by converting the document from UTF-16 to
> UTF-8 myself before parsing it.

  That should not be needed.

>   Is this the only solution? Is it a bad solution with regard to
> performance? Is xmlReader supposed to parse only UTF-8 encoded XML?

  No, No (libxml2 will convert internally), and No.
Make 100% sure your XML file is correct: run xmllint against it.
Then run xmllint --stream against it to check the reader interface.
If one fails and not the other, send a copy. Otherwise you probably
have a problem with the document.

Daniel

-- 
Daniel Veillard      | Red Hat Network https://rhn.redhat.com/
veillard redhat com  | libxml GNOME XML XSLT toolkit  http://xmlsoft.org/
http://veillard.com/ | Rpmfind RPM search engine http://rpmfind.net/
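
The reply above asks to make sure the file itself is correct before
suspecting the reader. As an independent sanity check outside libxml2,
one could inspect the byte order mark and confirm that a second parser
accepts the bytes. The following is only a minimal Python sketch under
stated assumptions: the names sniff_bom, check_document and SAMPLE are
hypothetical, the document is assumed to start with a BOM, and the parsing
is done by Python's expat-based ElementTree, not by libxml2. It complements
the xmllint and xmllint --stream checks rather than replacing them.

```python
import codecs
import xml.etree.ElementTree as ET


def sniff_bom(data: bytes):
    """Return the encoding implied by a leading byte order mark, or None."""
    # UTF-32 BOMs are deliberately ignored in this sketch.
    if data.startswith(codecs.BOM_UTF8):
        return "UTF-8"
    if data.startswith(codecs.BOM_UTF16_LE):
        return "UTF-16LE"
    if data.startswith(codecs.BOM_UTF16_BE):
        return "UTF-16BE"
    return None


def check_document(data: bytes):
    """Return (bom_encoding, root_tag).

    Raises xml.etree.ElementTree.ParseError if the bytes are not
    well-formed XML according to expat (not libxml2).
    """
    bom = sniff_bom(data)
    root = ET.fromstring(data)
    return bom, root.tag


# Hypothetical sample: a UTF-16 little-endian document with a BOM.
SAMPLE = codecs.BOM_UTF16_LE + (
    '<?xml version="1.0" encoding="UTF-16"?><doc><item>text</item></doc>'
).encode("utf-16-le")

if __name__ == "__main__":
    print(check_document(SAMPLE))
```

If sniff_bom returns None for a file that is supposed to be UTF-16, or the
parse raises an error, the document itself is a likely suspect, which is
the case the reply describes as "a problem with the document".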


