Re: [xml] How to determine document encoding
- From: Daniel Veillard <veillard redhat com>
- To: "Erik F. Andersen" <ea ascott dk>
- Cc: xml gnome org
- Subject: Re: [xml] How to determine document encoding
- Date: Mon, 24 Jan 2005 08:54:47 -0500
On Mon, Jan 24, 2005 at 02:17:17PM +0100, Erik F. Andersen wrote:
> I have a SOAP document that contains another SOAP document
> as a node value. When I extract the embedded SOAP document
> (xmlnode->children->content) this will always be in UTF-8 because that's
> how libxml2 encodes content internally.
Yes, definitely: all strings returned from the API are in UTF-8.
> My problem is now how to decode the contents so that I can load it
> via xmlParseDoc?
Use the xmlReadxxx APIs and pass the encoding explicitly. In general, prefer
the newer xmlReadxxx APIs over the older xmlParsexxx ones.
> In other words, how can I read the encoding attribute in <?xml...>
> prior to actually loading the document?
You should not do this; it is a very flawed design.
> I tried loading the UTF-8 encoded document and this can lead to some
> strange results because the document is actually ISO-8859-1 encoded
> in the first place. Of course I can just decode the document by calling
> UTF8Toisolat1 directly but this is not a very generic solution to my
> problem...
Drop the encoding declaration from the first line: the string you read from
the libxml2 API will be in UTF-8.
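
One possible way to drop the declared encoding is plain string editing before
the text is handed back to the parser. The sketch below is only an
illustration under stated assumptions: drop_encoding_decl is a hypothetical
helper, not a libxml2 function, and it assumes a NUL-terminated UTF-8 string
whose XML declaration, if any, starts at the very first byte.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper, not part of libxml2.
 * Returns a malloc'ed copy of `xml` with the encoding="..." part removed
 * from a leading XML declaration. If there is no declaration, or it has no
 * well-formed encoding part, the copy is identical to the input.
 * The caller frees the result with free().
 */
char *drop_encoding_decl(const char *xml)
{
    assert(xml != NULL);

    size_t len = strlen(xml);
    char *out = malloc(len + 1);
    if (out == NULL)
        return NULL;
    memcpy(out, xml, len + 1);

    /* An XML declaration must be the first thing in the document. */
    if (strncmp(out, "<?xml", 5) != 0 || !isspace((unsigned char)out[5]))
        return out;
    char *decl_end = strstr(out, "?>");
    if (decl_end == NULL)
        return out;

    /* Find the word "encoding" inside the declaration, preceded by space. */
    char *enc = NULL;
    char *p = out + 5;
    while ((p = strstr(p, "encoding")) != NULL && p < decl_end) {
        if (isspace((unsigned char)p[-1])) {
            enc = p;
            break;
        }
        p += 8;
    }
    if (enc == NULL)
        return out;

    /* Skip over: optional space, '=', optional space, quoted value. */
    char *q = enc + 8;
    while (q < decl_end && isspace((unsigned char)*q))
        q++;
    if (q >= decl_end || *q != '=')
        return out;
    q++;
    while (q < decl_end && isspace((unsigned char)*q))
        q++;
    if (q >= decl_end || (*q != '"' && *q != '\''))
        return out;
    char quote = *q++;
    while (q < decl_end && *q != quote)
        q++;
    if (q >= decl_end)
        return out;
    q++; /* step past the closing quote */

    /* Also remove the whitespace that preceded "encoding". */
    char *start = enc;
    while (start > out + 5 && isspace((unsigned char)start[-1]))
        start--;

    /* Close the gap, including the terminating NUL. */
    memmove(start, q, strlen(q) + 1);
    return out;
}
```

For instance, a string beginning with <?xml version="1.0" encoding="ISO-8859-1"?>
would come back beginning with <?xml version="1.0"?>, and the rest of the
text is left untouched.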
Daniel
--
Daniel Veillard | Red Hat Desktop team http://redhat.com/
veillard redhat com | libxml GNOME XML XSLT toolkit http://xmlsoft.org/
http://veillard.com/ | Rpmfind RPM search engine http://rpmfind.net/