Re: [xml] endianness problem in regression suite

From: Daniel Veillard <veillard redhat com>
To: Nicolai Langfeldt <janl linpro no>
Cc: xml gnome org
Subject: Re: [xml] endianness problem in regression suite
Date: Thu, 20 Nov 2003 04:46:44 -0500

On Thu, Nov 20, 2003 at 10:17:19AM +0100, Nicolai Langfeldt wrote:

> Building libxml2 on Solaris/Sparc.  In the tests I see a problem in the
> XML regression tests.  From Makefile.am:
>
> [...]
>
> Testing utf16bom.xml
> Binary files ./result/utf16bom.xml and result.utf16bom.xml differ
>
> I've run xmllint by hand on result/utf16bom.xml.  In emacs hexl-mode I
> see this in the original file
>
> 00000000: fffe 3c00 3f00 7800 6d00 6c00 2000 7600  ..<.?.x.m.l. .v.
> 00000010: 6500 7200 7300 6900 6f00 6e00 3d00 2200  e.r.s.i.o.n.=.".
>
> And this is the xmllint output:
>
> 00000000: feff 003c 003f 0078 006d 006c 0020 0076  ...<.?.x.m.l. .v
> 00000010: 0065 0072 0073 0069 006f 006e 003d 0022  .e.r.s.i.o.n.=."
>
> I'm told that feff == byte-order mark and that fffe == undefined
> character.


  Well, the file is flagged as UTF-16, which unfortunately has two variants,
one little-endian and the other big-endian. It seems the libxml2 UTF-16
serialization code uses the platform's native endianness instead of always
using little-endian. The files are still well-formed, but maybe this needs
to be fixed; xmlInitCharEncodingHandlers() already tests the endianness
of the architecture, and it's used for the UTF-8 to UTF-16 input conversion.
This could also be the behaviour of your iconv() library taking over from
libxml2's default UTF-16 routines; that would need to be checked under a
debugger.
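
To make the difference concrete, here is a minimal Python sketch (not
libxml2 code) built on the first twelve bytes of the two hex dumps quoted
above. The helper names bom_byte_order, serialize_native and
serialize_little_endian are hypothetical and only illustrate the two
serialization strategies; Python's "utf-16" codec is assumed to stand in for
a serializer that follows the platform byte order.

```python
import codecs
import sys

# First bytes of the two files, copied from the hex dumps in this thread.
EXPECTED = bytes.fromhex("fffe3c003f0078006d006c00")  # result/utf16bom.xml
PRODUCED = bytes.fromhex("feff003c003f0078006d006c")  # xmllint output on Sparc


def bom_byte_order(data):
    """Return 'little', 'big', or None according to the leading UTF-16 BOM."""
    if data.startswith(codecs.BOM_UTF16_LE):
        return "little"
    if data.startswith(codecs.BOM_UTF16_BE):
        return "big"
    return None


def serialize_native(text):
    """UTF-16 with BOM, in the byte order of the machine running this code."""
    return text.encode("utf-16")


def serialize_little_endian(text):
    """UTF-16 with BOM, always little-endian regardless of the platform."""
    return codecs.BOM_UTF16_LE + text.encode("utf-16-le")


if __name__ == "__main__":
    print("expected file:", bom_byte_order(EXPECTED))
    print("produced file:", bom_byte_order(PRODUCED))
    # Both byte sequences decode to the same characters.
    print(EXPECTED.decode("utf-16") == PRODUCED.decode("utf-16"))
    print("this machine:", sys.byteorder)
```

Both dumps decode to the same text, which is why the files stay well formed
while a byte-for-byte comparison in the regression suite fails. Only the
fixed-order variant gives the same bytes on every architecture.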

Daniel

-- 
Daniel Veillard      | Red Hat Network https://rhn.redhat.com/
veillard redhat com  | libxml GNOME XML XSLT toolkit  http://xmlsoft.org/
http://veillard.com/ | Rpmfind RPM search engine http://rpmfind.net/

