Re: [xml] encoding
- From: Daniel Veillard <veillard redhat com>
- To: Malcolm Tredinnick <malcolm commsecure com au>
- Cc: xml gnome org
- Subject: Re: [xml] encoding
- Date: Tue, 22 Feb 2005 17:30:02 -0500
On Wed, Feb 23, 2005 at 09:26:09AM +1100, Malcolm Tredinnick wrote:
On Tue, 2005-02-22 at 23:26 +0200, Bar Gam wrote:
Hello
Hi :)
If I try to parse a document encoded in iso-8859-8 - should it be
converted to UTF-8, or is it supported and handled by the parser on
the fly? If the content should be converted (and deconverted) - what
method should be used in this case?
Provided the document encoding is correctly specified and you have
iconv support compiled in, the conversion to UTF-8 will be done for
you automatically as libxml2 parses the document.
If the document encoding is not specified in the XML declaration at the
top of the file (<?xml version="1.0" encoding="..."?>), there is a way
to pass it in directly when using the libxml API. This is needed because
documents served over HTTP, for example, can have their encoding
specified in the HTTP headers rather than in the document itself. But I
cannot remember the exact call off the top of my head.
You can see whether you have iconv support available by looking at the
output of 'xmllint --version'.
If iconv is not present, all the ISO-8859-[1-15] encodings are compiled into
the library by default, so unless you have a very specific setup the
conversion will be supported.
Anyway, if the encoding is not supported, per the spec it's a fatal error:
the parser fails immediately and delivers no data.
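
For the case where the conversion has to be done by hand, outside the
parser, here is a minimal sketch in C using POSIX iconv directly. It only
illustrates the kind of conversion libxml2 performs through iconv and is
not libxml2's own code. The helper name to_utf8 and its buffer-based
interface are made up for this example.

```c
#include <assert.h>
#include <iconv.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of libxml2): convert in_len bytes of `in`
 * from the encoding named from_enc to UTF-8. The result is written to
 * `out` (capacity out_cap) and NUL-terminated.
 * Returns the number of UTF-8 bytes written, or -1 on failure. */
long to_utf8(const char *from_enc, const char *in, size_t in_len,
             char *out, size_t out_cap)
{
    assert(from_enc != NULL && in != NULL && out != NULL);
    if (out_cap == 0)
        return -1;

    /* iconv_open takes the target encoding first, then the source. */
    iconv_t cd = iconv_open("UTF-8", from_enc);
    if (cd == (iconv_t)-1)
        return -1;                  /* encoding unknown to this iconv */

    char *src = (char *)in;         /* iconv() wants char **; input bytes are not modified */
    char *dst = out;
    size_t src_left = in_len;
    size_t dst_left = out_cap - 1;  /* keep one byte for the terminating NUL */

    size_t rc = iconv(cd, &src, &src_left, &dst, &dst_left);
    iconv_close(cd);
    if (rc == (size_t)-1)
        return -1;                  /* invalid input or output buffer too small */

    *dst = '\0';
    return (long)(dst - out);
}
```

On a system whose iconv knows ISO-8859-8, calling
to_utf8("ISO-8859-8", "\xE0", 1, buf, sizeof buf) should yield the two
bytes 0xD7 0x90, the UTF-8 form of U+05D0 (alef). On some platforms iconv
lives in a separate libiconv and needs -liconv at link time.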
Daniel
--
Daniel Veillard | Red Hat Desktop team http://redhat.com/
veillard redhat com | libxml GNOME XML XSLT toolkit http://xmlsoft.org/
http://veillard.com/ | Rpmfind RPM search engine http://rpmfind.net/