[xml] Non-ASCII characters
- From: Dan Timis <timis rahul net>
- To: xml gnome org
- Subject: [xml] Non-ASCII characters
- Date: Sat, 20 Sep 2003 21:54:36 -0700
I am trying to save internal data from an application to xml, then
later parse the xml to restore the internal data. Some strings come
from external sources and I noticed that sometimes there are characters
above 0x7F. In one instance, a string had two characters: first 'x',
then 0xB2, which is "superscript two"; in other words, "x squared."
When I save to xml I get:
<string>x²</string>
When I use SAX to retrieve the string I get "x" then I get 0xC2 0xB2.
I know very little about Unicode, but I found that "superscript two" is
00B2 in the Unicode Latin-1 supplement. What is C2B2?
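For reference, C2 B2 is what the two-byte UTF-8 layout produces for code point 00B2. Below is a minimal sketch of that bit layout, assuming the bytes coming back from SAX are UTF-8. Python is used only for illustration, and the helper name utf8_two_byte is hypothetical, not part of any library.

```python
def utf8_two_byte(code_point: int) -> bytes:
    """Encode a code point in U+0080..U+07FF as two UTF-8 bytes."""
    if not 0x80 <= code_point <= 0x7FF:
        raise ValueError("two-byte UTF-8 covers only U+0080..U+07FF")
    lead = 0xC0 | (code_point >> 6)     # 110xxxxx: upper bits of the code point
    trail = 0x80 | (code_point & 0x3F)  # 10xxxxxx: lower six bits
    return bytes([lead, trail])


encoded = utf8_two_byte(0xB2)
print(encoded.hex())  # c2b2
```

The same layout gives a lead byte of 0xC3 for code points 00C0 through 00FF, so not every character above 0x7F arrives with a 0xC2 prefix.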
I fixed the problem by skipping the 0xC2 and appending the
character that follows it to the string, but it feels like a bad hack.
Is there a way to specify that I am working with 1-byte characters in
the full range 00-FF? Is there another way to solve this problem?
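One possible alternative to skipping 0xC2 is to convert the whole string back to one byte per character after parsing. The sketch below assumes the SAX callbacks deliver UTF-8 and that every character fits in the range 00-FF. The function name utf8_to_latin1 is hypothetical and written in Python purely for illustration; libxml2's encoding.h declares a C function, UTF8Toisolat1, that appears intended for the same job.

```python
def utf8_to_latin1(data: bytes) -> bytes:
    """Convert UTF-8 bytes to one byte per character in the range 00-FF."""
    out = bytearray()
    i = 0
    while i < len(data):
        b = data[i]
        if b < 0x80:
            # Plain ASCII passes through unchanged.
            out.append(b)
            i += 1
        elif (b in (0xC2, 0xC3) and i + 1 < len(data)
              and (data[i + 1] & 0xC0) == 0x80):
            # Lead byte 0xC2 or 0xC3 plus one continuation byte
            # covers code points 0080..00FF.
            out.append(((b & 0x03) << 6) | (data[i + 1] & 0x3F))
            i += 2
        else:
            # Anything else is malformed or does not fit in a single byte.
            raise ValueError(f"cannot represent byte sequence at offset {i}")
    return bytes(out)


print(utf8_to_latin1(b"x\xc2\xb2").hex())  # 78b2
```

Unlike dropping 0xC2, this also handles characters from 0xC0 to 0xFF, which arrive with a 0xC3 lead byte, and it reports characters that cannot be stored in one byte instead of silently corrupting them.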
Thanks,
Dan Timis