Re: [xml] Possible bug with iconv-less UTF8 to ISO-8859-15 conversion
- From: "Peter Jacobi" <pj walter-graphtek com>
- To: Mark Itzcovitz <mark itzcovitz ntlworld com>
- Cc: xml gnome org
- Subject: Re: [xml] Possible bug with iconv-less UTF8 to ISO-8859-15 conversion
- Date: Wed, 08 Sep 2004 22:28:59 +0200
Hi Mark, All,
I wrote this conversion routine some 13 months ago, and your criticism looks
valid, but OTOH the code did test O.K. at that time. Perhaps I made some
last-minute changes that broke the code.
I think the problem is about 32 lines down in UTF8ToISO8859x in encoding.c.
The line that reads
    if ((c & 0xC0) != 0xC0) {
should read
    if ((c & 0xC0) != 0x80) {
since the second byte of a UTF-8 sequence must be of the form 10bbbbbb. If I
make this change then my xmllint outputs the expected characters rather than
the values - that is, apart from the euro symbol, which I will look into
tomorrow.
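As a minimal sketch of why the mask comparison matters: the two function names below are hypothetical and are not taken from encoding.c; they only isolate the two versions of the test so they can be compared on the same byte.

```c
/* Hypothetical helpers for illustration only; not part of libxml2. */

/* Corrected test: a continuation byte has the form 10bbbbbb,
   so masking with 0xC0 must give 0x80. */
int is_utf8_continuation(unsigned char c)
{
    return (c & 0xC0) == 0x80;
}

/* The test as quoted from the current code: it treats a byte as an
   error unless its top two bits are 11, which no continuation byte has. */
int old_check_reports_error(unsigned char c)
{
    return (c & 0xC0) != 0xC0;
}
```

For example, U+00E9 is encoded as the bytes 0xC3 0xA9. The second byte, 0xA9, passes the corrected test but is reported as an error by the old comparison.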
There are also two lines of code further down, for three-byte sequences,
which I think need changing in the same way. They are:
    if ((c1 & 0xC0) != 0xC0) {
and
    if ((c2 & 0xC0) != 0xC0) {
Hopefully someone else can verify that I am on the right lines.
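To illustrate the three-byte case, here is a hedged sketch of a decoder that applies the 0x80 comparison to both trailing bytes. The function name and its return convention are assumptions made for this example; this is not the code in UTF8ToISO8859x.

```c
/* Hypothetical helper for illustration only; not part of libxml2.
   Decodes one three-byte UTF-8 sequence (1110bbbb 10bbbbbb 10bbbbbb)
   from in[0..2] into *out. Returns 0 on success, -1 on a malformed
   sequence. Overlong forms and surrogates are not checked here. */
int decode_utf8_3(const unsigned char *in, unsigned int *out)
{
    unsigned char d = in[0], c1 = in[1], c2 = in[2];

    if ((d & 0xF0) != 0xE0)      /* lead byte must be 1110bbbb */
        return -1;
    if ((c1 & 0xC0) != 0x80)     /* first trailing byte must be 10bbbbbb */
        return -1;
    if ((c2 & 0xC0) != 0x80)     /* second trailing byte must be 10bbbbbb */
        return -1;

    *out = ((unsigned int)(d & 0x0F) << 12)
         | ((unsigned int)(c1 & 0x3F) << 6)
         |  (unsigned int)(c2 & 0x3F);
    return 0;
}
```

The euro sign, U+20AC, is encoded as 0xE2 0x82 0xAC, so it goes through this three-byte path before it can be mapped to 0xA4 in ISO-8859-15.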
Please change these lines as well and test again. It should work then.
Sorry for all that,
Peter