Re: [xml] encoding practice?
- From: Daniel Veillard <veillard redhat com>
- To: Hannu Krosing <hannu tm ee>
- Cc: Derek Holden <dsh2120 draper com>, xml gnome org
- Subject: Re: [xml] encoding practice?
- Date: Fri, 13 Dec 2002 10:41:54 -0500
On Fri, Dec 13, 2002 at 05:07:46PM +0000, Hannu Krosing wrote:
> > that any XML parser MUST support. And I would advise against UTF16 in general
> > because it forces extra conversion in most processing tools and in general
> > wastes spaces with useless zeroes...
>
> AFAIK this is only true for latin-based scripts. For most far-east

  "in general"

> scripts UTF-16 actually saves space by spending only 2 bytes per char
> instead of 3.
Even people using Japanese or Chinese usually don't go for
UTF-16 anyway. And most of the non-CDATA content will still waste
every other byte. Non-ASCII characters are really infrequent in
markup; that and indentation still make UTF-8 win in most documents.
I maintain that UTF-8 is the better choice in most cases.
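
To make the size argument concrete, here is a rough sketch in Python (not part of libxml2) that compares the encoded size of a bare run of Japanese text with the same text wrapped in a small, indented XML document. The sample document, its element names, and the helper `encoded_sizes` are made up for illustration; real documents will have a different markup-to-text ratio.

```python
def encoded_sizes(text):
    """Return the byte length of text in UTF-8 and in UTF-16 (without BOM)."""
    return {
        "utf-8": len(text.encode("utf-8")),
        "utf-16": len(text.encode("utf-16-le")),  # "-le" variant omits the BOM
    }


# Hypothetical sample: 8 characters that each take 3 bytes in UTF-8
# and 2 bytes in UTF-16.
cjk_text = "日本語のテキスト"

# The same text inside a small, indented document. The tags, attribute,
# and whitespace are all ASCII: 1 byte each in UTF-8, 2 bytes each in UTF-16.
doc = (
    '<?xml version="1.0"?>\n'
    "<catalog>\n"
    '  <item id="1">\n'
    "    <title>" + cjk_text + "</title>\n"
    "  </item>\n"
    "</catalog>\n"
)

print("text only:", encoded_sizes(cjk_text))
print("document: ", encoded_sizes(doc))
```

For the bare text UTF-16 is the smaller encoding, but once the ASCII markup and indentation outweigh the character data, UTF-8 comes out ahead.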
> BTW, is there any support for using UTF-16 internally in libxml2/libxslt?
It's an XML parser so yes, as I said conformance REQUIRES it. If a parser
can't grasp UTF-16 it's not a conformant XML parser, period.
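
As a small illustration of what accepting UTF-16 input looks like from the caller's side, the sketch below feeds a UTF-16 encoded document to Python's standard-library parser (`xml.etree.ElementTree`, which is built on expat, not libxml2). The sample document is invented for the example, and this only shows UTF-16 being read on input; it says nothing about how any parser stores text internally.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document, declared as UTF-16.
text = '<?xml version="1.0" encoding="UTF-16"?><greeting lang="ja">日本語</greeting>'

# Python's "utf-16" codec prepends a byte order mark, which lets the
# parser detect the encoding and byte order of the input.
data = text.encode("utf-16")

root = ET.fromstring(data)
print(root.tag, root.get("lang"), root.text)
```

The caller hands over raw bytes and gets decoded text back; the encoding detection happens inside the parser.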
Daniel
--
Daniel Veillard | Red Hat Network https://rhn.redhat.com/
veillard redhat com | libxml GNOME XML XSLT toolkit http://xmlsoft.org/
http://veillard.com/ | Rpmfind RPM search engine http://rpmfind.net/