FW: [xml] Problem with encoding in libxml.



Hi,

I simply used the libxml parser to:
  1. Parse a file (parse_file() in Perl).
  2. Convert the resulting tree back to a string (toString()).
  3. Write the string to another file.

The input file was UTF-8 encoded. It contained the character ß (the German sharp s), which a hex viewer
showed as the two bytes c3 9f. In the output file, however, the character has been replaced by the
single byte df, which is the ISO-8859-1 (Latin-1) code for ß. I am attaching the
files as well.
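
The byte change described above can be reproduced with a small sketch. It uses Python's standard-library xml.etree.ElementTree purely as a stand-in for the Perl binding: it is not libxml2, and its defaults may differ. The sample document and the helper name round_trip are made up for illustration. The only point is that the same character is written as c3 9f or as df depending on the output encoding the serializer is asked to produce.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample input: a one-element document stored as UTF-8 bytes.
SOURCE = "<NAME>test_1ß</NAME>".encode("utf-8")


def round_trip(xml_bytes: bytes, out_encoding: str) -> bytes:
    """Parse XML bytes, then serialize the tree again in out_encoding."""
    root = ET.fromstring(xml_bytes)                  # step 1: parse
    return ET.tostring(root, encoding=out_encoding)  # step 2: tree -> bytes


# Step 3 would write these bytes to a file opened in binary mode.
as_utf8 = round_trip(SOURCE, "utf-8")
as_latin1 = round_trip(SOURCE, "iso-8859-1")

# ß (U+00DF) is two bytes in UTF-8 and one byte in ISO-8859-1.
print("ß".encode("utf-8").hex(" "))       # c3 9f
print("ß".encode("iso-8859-1").hex(" "))  # df
```

If the output file contains df, one thing to check is which encoding the serialized string was produced in and whether it was re-encoded when it was written out.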

Is this a bug in libxml2, or is something wrong on my side? How can this be avoided?

Thanks and regards,
Arun.

-----Original Message-----
From: xml-bounces gnome org [mailto:xml-bounces gnome org] On Behalf Of Arun S K (RBIN/EDM3) *
Sent: Tuesday, 13. December 2005 3:55 PM
To: xml gnome org
Subject: [xml] Problem with encoding in libxml.

Sorry for the wrong subject in the earlier mail. The files are on a Windows machine.

Hi all,

I was trying to parse an XML file whose encoding is declared as UTF8, with the following header:
<?xml version="1.0" encoding="UTF8"?>

The document contains the character ß (the German sharp s). The parser aborts with the following message:
--------------------------------------------------------------------
:13: parser error : Input is not proper UTF-8, indicate encoding !
Bytes: 0x80 0x20 0x3C 0x2F
                                <NAME>test_1ß</NAME>
--------------------------------------------------------------------

Is ß not a valid UTF-8 character? How can this be corrected?
Could anybody please help me?
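
One way to narrow this down is to check the raw bytes of the file independently of the XML parser. The sketch below is a plain Python check, not part of libxml2, and the function name first_invalid_utf8 is made up for illustration. It reports the first byte that breaks UTF-8 decoding. A byte such as 0x80, the first one in the error message, can only appear as a continuation byte in UTF-8, so on its own it is rejected. That would be consistent with the file on disk not actually being UTF-8 despite its header, though that is an assumption about the attached file.

```python
def first_invalid_utf8(data: bytes):
    """Return (offset, byte) of the first byte that breaks UTF-8 decoding, or None."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as err:
        return err.start, data[err.start]
    return None


# ß stored as UTF-8 (c3 9f) decodes cleanly.
print(first_invalid_utf8("test_1ß".encode("utf-8")))   # None

# ß stored as the single Latin-1 byte df is not valid UTF-8 here.
print(first_invalid_utf8(b"test_1\xdf</NAME>"))        # (6, 223)

# The byte sequence quoted in the parser error starts with 0x80.
print(first_invalid_utf8(b"\x80 </"))                  # (0, 128)
```

To run it against a real file, read the file in binary mode, for example `open(path, "rb").read()`, and pass the result to the function.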

Thanks and regards,
Arun.
_______________________________________________
xml mailing list, project page  http://xmlsoft.org/
xml gnome org
http://mail.gnome.org/mailman/listinfo/xml




