Re: [xml] conversion from wchar_t

It's important to note that wchar_t != UTF16. It is a native type that can even be a single byte (and is on some 8-bit platforms).

You can't blindly stick UTF16 into a wchar_t.  As you've found, wchar_t on Unix variants is UTF32 with native byte ordering.  On Windows it happens to be UTF16 with native byte ordering, so you can stuff UTF16LE directly into a wchar_t on an x86-based Windows machine without problems, but that won't work on an ARM or MIPS based CE machine.  Likewise, blindly copying UTF32LE into a wchar_t on Linux for x86 will work fine, but the same code won't work on a Linux machine with a big-endian MIPS processor.  It might work on ARM, as ARM will go both ways and I presume the Linux guys used LE byte ordering, but that's just a guess.
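
One way to sidestep both the width and the byte-order question is to read each wchar_t as a value rather than copying its bytes. The following is only a minimal sketch under stated assumptions: put_utf8 and wide_to_utf8 are hypothetical helper names (not libxml2 functions), a 2-byte wchar_t is assumed to hold UTF16 code units and a 4-byte one UTF32 code points, and no validation of unpaired surrogates or out-of-range values is attempted.

```c
#include <stddef.h>
#include <stdint.h>
#include <wchar.h>

/* Hypothetical helper: encode one code point as UTF-8 into out[0..3].
   Returns the number of bytes written (1 to 4). */
static size_t put_utf8(uint32_t cp, unsigned char *out)
{
    if (cp < 0x80) {
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp < 0x10000) {
        out[0] = (unsigned char)(0xE0 | (cp >> 12));
        out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (cp & 0x3F));
        return 3;
    }
    out[0] = (unsigned char)(0xF0 | (cp >> 18));
    out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
    out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
    out[3] = (unsigned char)(0x80 | (cp & 0x3F));
    return 4;
}

/* Hypothetical helper: convert a NUL-terminated wchar_t string to UTF-8.
   Returns the number of bytes written (excluding the terminating NUL),
   or (size_t)-1 if the output buffer of cap bytes is too small. */
size_t wide_to_utf8(const wchar_t *in, char *out, size_t cap)
{
    size_t n = 0;

    if (cap == 0)
        return (size_t)-1;

    while (*in) {
        uint32_t cp = (uint32_t)*in++;
        unsigned char tmp[4];
        size_t k, i;

        /* Only where wchar_t is 16 bits wide: join a surrogate pair. */
        if (sizeof(wchar_t) == 2 && cp >= 0xD800 && cp <= 0xDBFF) {
            uint32_t lo = (uint32_t)*in;
            if (lo >= 0xDC00 && lo <= 0xDFFF) {
                cp = 0x10000 + ((cp - 0xD800) << 10) + (lo - 0xDC00);
                in++;
            }
        }

        k = put_utf8(cp, tmp);
        if (n + k >= cap)           /* keep one byte for the NUL */
            return (size_t)-1;
        for (i = 0; i < k; i++)
            out[n++] = (char)tmp[i];
    }
    out[n] = '\0';
    return n;
}
```

Because the loop works on values, the same source should behave the same on little- and big-endian machines; the resulting UTF-8 buffer can then be handed to code that expects UTF-8.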

On Jan 13, 2011, at 6:44 PM, Bevan Collins wrote:

On Debian sizeof(wchar_t) is indeed 4. Thanks Jonah!

On Fri, Jan 14, 2011 at 12:39 PM, Jonah Petri <jpetri izotope com> wrote:
I can't say for sure - but check sizeof(wchar_t) on debian.  If it's 4, then that's your answer.  Just a hunch!
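
A rough sketch of that check, done at run time rather than by hand. The helper names (host_is_little_endian, wchar_encoding_name, wide_byte_length) are hypothetical, and the mapping assumes that a 2-byte wchar_t holds UTF-16 and a 4-byte one holds UTF-32, which is the usual case on Windows and Linux but is not guaranteed by the C standard. The returned strings are conventional iconv-style names; whether a given libxml2 build has a handler for each of them is also an assumption.

```c
#include <stddef.h>
#include <stdint.h>
#include <wchar.h>

/* Hypothetical helper: 1 on a little-endian host, 0 on a big-endian one. */
static int host_is_little_endian(void)
{
    const uint16_t probe = 1;
    return *(const unsigned char *)&probe == 1;
}

/* Hypothetical helper: encoding name matching the in-memory layout of
   wchar_t, or NULL if its width is neither 2 nor 4 bytes. */
const char *wchar_encoding_name(void)
{
    int le = host_is_little_endian();

    if (sizeof(wchar_t) == 2)
        return le ? "UTF-16LE" : "UTF-16BE";
    if (sizeof(wchar_t) == 4)
        return le ? "UTF-32LE" : "UTF-32BE";
    return NULL;
}

/* Hypothetical helper: byte length of a wide string, excluding the
   terminator. Uses sizeof(wchar_t) instead of a hard-coded 2. */
size_t wide_byte_length(const wchar_t *s)
{
    return wcslen(s) * sizeof(wchar_t);
}
```

The name could be passed to xmlFindCharEncodingHandler() in place of a fixed "UTF-16", and the byte length used where a buffer size is needed, provided the returned handler is checked for NULL.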


On Jan 13, 2011, at 6:17 PM, Bevan Collins wrote:

Please tell me what I am doing wrong:

#include <stdio.h>
#include <libxml/encoding.h>
#include <wchar.h>

int main()
{
    xmlCharEncodingHandlerPtr utf16Enc = xmlFindCharEncodingHandler("UTF-16");
    wchar_t* url = L"...";
    xmlBufferPtr in = xmlBufferCreateStatic(url, wcslen(url) * 2);
    xmlBufferPtr out = xmlBufferCreate();

    int rc = xmlCharEncInFunc(utf16Enc, out, in);
    printf("rc=%d url=%s\n", rc, (char*)out->content);

    return 0;
}
on Windows with version libxml2-2.7.7 I get:
rc=16 url=...

on i386 Debian with version libxml2-2.7.8 I get:
rc=16 url=h

It looks like on Debian it has simply copied the input buffer into the output buffer:
url[0] = 'h'
url[1] = '\0'
url[2] = 't'

xml mailing list, project page
xml gnome org
