Re: FW: [xml] Problem with encoding in libxml.

At 8:38 AM +0530 12/16/05, Arun S K (RBIN/EDM3) * wrote:

I simply had the libxml parser:
  1. parse a file (parse_file() in Perl),
  2. convert the resulting tree back to a string (toString()),
  3. write that string to another file.

The input file was UTF-8 encoded. It contained the character ß (sharp s), which a hex viewer showed as the bytes c3 9f. In the output file, however, the encoding has changed: the character is written as the single byte df, which is the ISO-8859-1 (Latin-1) value for ß. I am attaching the files as well.

Is this a bug in libxml2, or is something wrong on my side? How can this be avoided?

FWIW, with libxml2 as built into XMLLib (a Scripting Addition for AppleScript), saving an XML document that includes c3 9f does write that same byte pattern (which our Macs display as "√ü" when the bytes are read as Mac OS Roman), not a replacement character.
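
For anyone who wants to check the byte patterns mentioned above, here is a minimal sketch in Python (not the Perl used in the original report, and it does not call libxml2 at all). It only shows how ß is encoded in UTF-8 and in ISO-8859-1, and what the UTF-8 bytes look like when misread as Mac OS Roman. The helper name describe() is made up for illustration.

```python
# Illustrative only: reproduces the byte patterns discussed in this thread.
# It does not use libxml2; it just encodes one character in two encodings.

SHARP_S = "\u00df"  # ß, LATIN SMALL LETTER SHARP S


def describe(ch: str) -> dict:
    """Return the hex byte patterns of ch in UTF-8 and ISO-8859-1."""
    return {
        "utf-8": ch.encode("utf-8").hex(" "),
        "iso-8859-1": ch.encode("iso-8859-1").hex(" "),
    }


patterns = describe(SHARP_S)
print(patterns)

# The same UTF-8 bytes, wrongly interpreted as single-byte Mac OS Roman text.
misread = SHARP_S.encode("utf-8").decode("mac_roman")
print(misread)
```

So an output file containing the single byte df for ß looks like it was written as ISO-8859-1 rather than UTF-8. This shows only the symptom; it does not say which step of the parse/toString()/write sequence changed the encoding.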

