RE: [xml] French character encoding problem


Thanks for the prompt reply.

I already tried "ISO-8859-1" (and just tried again after reading your reply) and I still get the same result.

I have already read the encoding.html page a few times. According to that page, does specifying the 
encoding as ISO-8859-1 mean that one can put "Ç" in the xml file? What if they choose to generate Ç 
instead of the character? I actually just tried putting "Ç" in the xml file with encoding ISO-8859-1. 
xmlNodeGetContent() still returns "Ãî" instead.

Also, if xmllint is able to return the proper character, what am I missing that causes xmlNodeGetContent() 
not to?



-----Original Message-----
From: Daniel Veillard [mailto:veillard redhat com] 
Sent: Thursday, September 15, 2005 12:23 PM
To: Fred Fung
Cc: xml gnome org
Subject: Re: [xml] French character encoding problem

On Thu, Sep 15, 2005 at 12:17:34PM -0400, Fred Fung wrote:
We are using libxml version 2.0.0 on Red Hat Linux Enterprise version 2.4.9.
I have an xml file with the first line specifying the encoding scheme :
         <?xml version="1.0" encoding="LATIN1" ?>

  Using "LATIN1" is a very bad idea, it is absolutely not portable;
  encoding="ISO-8859-1" is the right way.

and one of the text nodes in the file is the following :

  Horror, uppercase tags !

After the document has been parsed via xmlParseFile( ) and xmlDocGetRootElement( ), a call to 
xmlNodeGetContent( ) returns "FRANÃîOIS" (a strlen of 9) instead of "FRANÇOIS".
Am I missing something in the C program to convert the encoded sequence to the original character ?
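
  A minimal sketch of what is probably going on, assuming the usual libxml2
  behaviour of keeping every string in UTF-8 internally, whatever encoding the
  file declares: the single character Ç comes back as the two bytes 0xC3 0x87,
  hence the strlen of 9, and the program has to convert it back if it wants
  ISO-8859-1. utf8_to_latin1() below is a hypothetical helper written only for
  this example, it is not a libxml2 function; libxml2 itself provides
  UTF8Toisolat1() in <libxml/encoding.h> for the same job.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not part of libxml2): convert a NUL-terminated UTF-8
 * string to ISO-8859-1. Returns the number of bytes written, not counting the
 * terminating NUL, or -1 if the input is malformed, contains a character
 * above U+00FF, or does not fit in `out`.
 */
int utf8_to_latin1(const unsigned char *in, unsigned char *out, size_t outsize)
{
    size_t n = 0;

    assert(in != NULL && out != NULL);
    if (outsize == 0)
        return -1;

    while (*in != 0) {
        unsigned char c = *in++;

        if (n + 1 >= outsize)
            return -1;          /* no room for this byte plus the NUL */

        if (c < 0x80) {
            out[n++] = c;       /* ASCII is identical in both encodings */
        } else if ((c == 0xC2 || c == 0xC3) && (*in & 0xC0) == 0x80) {
            /* Two-byte sequence: the only UTF-8 form of U+0080..U+00FF. */
            out[n++] = (unsigned char)(((c & 0x03) << 6) | (*in++ & 0x3F));
        } else {
            return -1;          /* malformed, or not representable in Latin-1 */
        }
    }
    out[n] = 0;
    return (int)n;
}
```

  Fed the nine-byte UTF-8 string "FRAN\xC3\x87OIS", the helper writes eight
  bytes, the fifth being 0xC7, which is Ç in ISO-8859-1.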

  Read the doc:


Daniel Veillard      | Red Hat Desktop team
veillard redhat com  | libxml GNOME XML XSLT toolkit | Rpmfind RPM search engine
