Re: FW: [xml] Problem with encoding in libxml.



On Fri, Dec 16, 2005 at 08:38:44AM +0530, Arun S K (RBIN/EDM3) * wrote:
Hi,

I simply used the libxml parser to
  1. parse a file (parse_file() in Perl),
  2. convert the resulting tree back to a string (toString()),
  3. write that string to another file.

The input file was UTF-8 encoded. It contained the character ß (sharp s), which a hex viewer
showed as the two bytes c3 9f. In the output file the encoding has changed: the character is now
the single byte df, which is the ISO-8859-1 (Latin-1) value for ß. I am attaching
the files as well.

Is this a bug in libxml2, or is something wrong on my side? How can this be avoided?
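
For reference, the two byte values reported above are what the same character looks like in two different encodings. The following is a minimal Python sketch of that relationship only; Python and the variable names are used purely for illustration, and the sketch assumes nothing about what parse_file() or toString() do internally.

```python
# U+00DF is LATIN SMALL LETTER SHARP S, the character from the report.
ch = "\u00df"

# The same character encoded two ways.
utf8_bytes = ch.encode("utf-8")         # two bytes, as seen in the input file
latin1_bytes = ch.encode("iso-8859-1")  # one byte, as seen in the output file

print(utf8_bytes.hex())    # c39f
print(latin1_bytes.hex())  # df

# Decoding the UTF-8 input and re-encoding it as ISO-8859-1
# turns the byte pair c3 9f into the single byte df.
roundtrip = utf8_bytes.decode("utf-8").encode("iso-8859-1")
print(roundtrip == latin1_bytes)  # True
```

So the character itself is preserved in the output; only the byte encoding used to write it differs.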

  Reproduce the problem with xmllint and I may consider this a bug in
libxml2. You are not using a libxml2 function call to do the serialization,
and I have no idea what the semantics of the toString() you are using are!

Daniel

-- 
Daniel Veillard      | Red Hat http://redhat.com/
veillard redhat com  | libxml GNOME XML XSLT toolkit  http://xmlsoft.org/
http://veillard.com/ | Rpmfind RPM search engine http://rpmfind.net/


