Re: [xml] libxml + unicode



On Tue, Jul 25, 2006 at 11:14:03AM -0400, Mark Wyszomierski wrote:
Hi,

I'm trying to print some characters that were saved in a unicode format
(some special french accent characters). I was hoping the following would
work but I just get strange characters printed:

   xmlChar *key = xmlNodeListGetString(doc, curNode->xmlChildrenNode, 1);

  key will be in UTF-8. UTF-8 supports the full Unicode set, though XML
only allows a subset.

Read http://xmlsoft.org/encoding.html

   wprintf(L"Node text is: [%s]\n", key);

  I guess wprintf does not expect UTF-8 encoding.
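
  One possible way around this, as a minimal sketch rather than anything
libxml2 provides: decode the UTF-8 bytes into wide characters first and
print those with %ls. The helper name utf8_to_wide is hypothetical. It
assumes wchar_t can hold any code point (true with glibc, not on Windows,
where wchar_t is 16 bits), and it does not reject overlong sequences or
surrogates.

```c
#include <stddef.h>
#include <wchar.h>

/* Hypothetical helper, NOT part of libxml2: decode a NUL-terminated UTF-8
 * string into wide characters, one wchar_t per code point.
 * Assumes wchar_t is wide enough for any code point (e.g. 32-bit on glibc).
 * Returns the number of wide characters written (terminator excluded),
 * or (size_t)-1 on malformed input or when `cap` is too small. */
size_t utf8_to_wide(const unsigned char *in, wchar_t *out, size_t cap)
{
    size_t n = 0;

    if (cap == 0)
        return (size_t)-1;

    while (*in) {
        unsigned long cp;
        int extra;   /* number of continuation bytes still expected */

        if (*in < 0x80)                { cp = *in;        extra = 0; }
        else if ((*in & 0xE0) == 0xC0) { cp = *in & 0x1F; extra = 1; }
        else if ((*in & 0xF0) == 0xE0) { cp = *in & 0x0F; extra = 2; }
        else if ((*in & 0xF8) == 0xF0) { cp = *in & 0x07; extra = 3; }
        else return (size_t)-1;        /* stray continuation or invalid lead byte */
        in++;

        while (extra-- > 0) {
            if ((*in & 0xC0) != 0x80)
                return (size_t)-1;     /* truncated or broken sequence */
            cp = (cp << 6) | (*in++ & 0x3F);
        }

        if (n + 1 >= cap)
            return (size_t)-1;         /* keep room for the terminator */
        out[n++] = (wchar_t)cp;
    }
    out[n] = L'\0';
    return n;
}

/* Possible use with the snippet from the question (xmlChar is an
 * unsigned char in libxml2); call setlocale(LC_ALL, "") beforehand so the
 * wide output stream can encode non-ASCII characters:
 *
 *     wchar_t wide[256];
 *     if (utf8_to_wide(key, wide, 256) != (size_t)-1)
 *         wprintf(L"Node text is: [%ls]\n", wide);
 */
```

  Note that in standard C, %s in wprintf takes a char * multibyte string
in the current locale's encoding, while Microsoft's runtime treats %s in
wprintf as a wchar_t *, so %ls is the unambiguous choice for wide strings.
If the terminal already uses UTF-8, a plain printf("%s", key) may be
enough.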

   xmlFree(key);

My xml document looks like this:

<?xml version="1.0" encoding="UTF-8" ?>
        <test>woo: Ç Ç Ç</test>

  Which is actually completely irrelevant.

What do I have to do to get those special characters to be printed
correctly? In fact, now when I save my XML document as type UTF-8, even the
ASCII characters don't print correctly. (Printing ASCII characters when saved in
ASCII format works fine.)

  Since UTF-8 and ASCII coincide for the ASCII subset of Unicode, I don't
know what you're doing to print them, but you really have a problem.
Understanding the difference between Unicode, code points, and
encodings is really a requirement for any programmer these days:
  http://www.joelonsoftware.com/articles/Unicode.html
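
  To make the code point versus encoding distinction concrete, here is a
small sketch (the function name utf8_codepoints is made up for
illustration, and it assumes the input is valid UTF-8). It counts code
points by skipping continuation bytes, so for pure ASCII text the count
equals strlen(), while each accented character such as Ç (U+00C7) takes
two bytes but counts once.

```c
#include <stddef.h>

/* Hypothetical helper: count the Unicode code points in a NUL-terminated
 * UTF-8 string. Every byte that is not a continuation byte (10xxxxxx)
 * starts a new code point. Assumes the input is valid UTF-8. */
size_t utf8_codepoints(const char *s)
{
    size_t n = 0;

    for (; *s; s++)
        if (((unsigned char)*s & 0xC0) != 0x80)
            n++;
    return n;
}
```

  For the test document's text "woo: Ç Ç Ç" encoded as UTF-8, strlen()
reports 13 bytes while this function reports 10 code points; for "woo"
both report 3.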

Also read the other article pointed to, from Tim Bray; he tries to
explain those issues.

Daniel

-- 
Daniel Veillard      | Red Hat http://redhat.com/
veillard redhat com  | libxml GNOME XML XSLT toolkit  http://xmlsoft.org/
http://veillard.com/ | Rpmfind RPM search engine http://rpmfind.net/


