Re: [xml] xmlCheckUTF8-problem (bugfix) [signed]
- From: "Julius Mittenzwei [c]" <julius muenchen-sued de>
- To: xml gnome org
- Subject: Re: [xml] xmlCheckUTF8-problem (bugfix) [signed]
- Date: Fri, 27 Aug 2004 18:20:29 +0200
Hi again,
I tried to trace the problem a bit.
A valid 2-byte UTF-8 character must look like this:
110xxxxx 10xxxxxx (http://de.wikipedia.org/wiki/UTF8)
I would suggest changing this line:
if ((c & 0xc0) != 0x80 || (utf[ix + 1] & 0xc0) != 0x80)
in
xmlstring.c
to
if ((c & 0xe0) != 0xc0 || ( utf[ix + 1] & 0xc0 ) != 0x80 )
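This ANDs "c" with 11100000 = 0xe0 to isolate the top 3 bits.
If the result is exactly 11000000 = 0xc0, the byte is guaranteed to start
with "110". The old test, (c & 0xc0) != 0x80, rejects every such lead byte,
because a byte starting with "110" masks to 0xc0 and never to 0x80.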
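To make the mask arithmetic concrete, here is a minimal sketch of the
2-byte check on its own. The helper name is_two_byte_seq is hypothetical
and not part of libxml2; it only mirrors the condition proposed above and
is not the actual xmlCheckUTF8() code.

```c
/* Hypothetical helper for illustration only, not a libxml2 function.
 * Returns 1 if s[0] and s[1] have the bit pattern 110xxxxx 10xxxxxx,
 * 0 otherwise. The caller must make sure s[0] is not the terminating
 * zero byte before s[1] is read. */
static int is_two_byte_seq(const unsigned char *s)
{
    if ((s[0] & 0xe0) != 0xc0)   /* lead byte must start with 110 */
        return 0;
    if ((s[1] & 0xc0) != 0x80)   /* continuation byte must start with 10 */
        return 0;
    return 1;
}
```

For "ö", which is 0xc3 0xb6 in UTF-8, the lead byte gives 0xc3 & 0xe0 = 0xc0
and the second byte gives 0xb6 & 0xc0 = 0x80, so the pair is accepted. This
sketch only checks the bit pattern; it does not reject overlong encodings
such as a 0xc0 or 0xc1 lead byte.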
Regards
/Julius
On 27.08.2004, at 15:53, Julius Mittenzwei [c] wrote:
>
Hi,
I just updated to libxml 2.6.12 and ran into problems with the function
xmlCheckUTF8().
This function returns false even if the string is valid UTF-8
and can easily be converted to ISO Latin-1 with the function
UTF8Toisolat1.
I'm not quite sure whether this is related to:
http://bugzilla.gnome.org/show_bug.cgi?id=148115
Any suggestions?
Thank you
/Julius
-----------------------------------------------------
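#include <stdio.h>
#include <stdlib.h>
#include <libxml/parser.h>

int main(void) {
    /* This file must be saved as UTF-8 so that "ö" is the bytes 0xC3 0xB6. */
    const xmlChar *utf = (const xmlChar *) "Köchin";
    int utflen = xmlStrlen(utf);
    /* Latin-1 output is never longer than the UTF-8 input; +1 for the 0 byte. */
    unsigned char *lat = malloc(utflen + 1);
    int latlen = utflen;  /* in: size of lat, out: number of bytes written */

    if (lat == NULL)
        return 1;
    if (xmlCheckUTF8(utf))
        printf("valid utf8\n");
    else
        printf("no valid utf8\n");
    /* On failure, fall back to an empty string instead of a partial result. */
    if (UTF8Toisolat1(lat, &latlen, utf, &utflen) < 0)
        latlen = 0;
    lat[latlen] = 0x00;
    printf("%s: %s -> %s\n", LIBXML_DOTTED_VERSION, utf, lat);
    free(lat);
    return 0;
}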
---------------------------------------------------
[chef bruce test]$ ./test
no valid utf8
2.6.12: Köchin -> Köchin
---------------------------------------------------
>