g_locale_to_utf8() chokes on 'ö'
- From: Chris Martin <c martin sheffield ac uk>
 
- To: gtk-app-devel-list <gtk-app-devel-list gnome org>
 
- Subject: g_locale_to_utf8() chokes on 'ö'
 
- Date: Wed, 25 Sep 2002 11:05:55 +0100
 
To get round this problem, here is a little function that converts a
string of bytes with the values 1 to 255, treating each byte as an
ISO-8859-1 character, into a UTF-8 string that g_utf8_validate() accepts:
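#include <ctype.h>
#include <string.h>
#include <glib.h>

/* Convert the ISO-8859-1 bytes in source to UTF-8 in dest.
 * dest must have room for 2 * source_lgth + 1 bytes.
 * Pass source_lgth == -1 if source is NUL-terminated. */
static void
utf8_cnvt(gchar * source, int source_lgth, gchar * dest)
{
  gchar *p, *q;
  int i;
  if (source_lgth == -1)
    source_lgth = strlen(source);
  for (p = source, q = dest, i = 0; i < source_lgth; p++, i++)
    if (isascii((guchar) *p)) {
      /* Bytes 0x01..0x7f are copied unchanged. */
      *q++ = *p;
    } else {
      /* Bytes 0x80..0xff become the two-byte form 110000yy 10xxxxxx. */
      *q++ = 0xc0 | (((guchar) (*p) >> 6) & 0x03);
      *q++ = 0x80 | ((guchar) (*p) & 0x3f);
    }
  *q = '\0';
}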
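The destination buffer has to allow for every input byte expanding to two
output bytes. As a rough sketch of that sizing rule, here is a plain C
variant that allocates the buffer itself. The name latin1_to_utf8_alloc is
made up for this illustration and is not part of GLib or of the function
above; it assumes a NUL-terminated ISO-8859-1 input.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, for illustration only: converts a NUL-terminated
 * ISO-8859-1 string to a newly allocated UTF-8 string.
 * Returns NULL if allocation fails; the caller frees the result. */
char *latin1_to_utf8_alloc(const char *source)
{
    size_t len = strlen(source);
    /* Worst case: each input byte becomes two bytes, plus the NUL. */
    unsigned char *dest = malloc(2 * len + 1);
    unsigned char *q = dest;
    size_t i;

    if (dest == NULL)
        return NULL;

    for (i = 0; i < len; i++) {
        unsigned char c = (unsigned char) source[i];
        if (c < 0x80) {
            /* ASCII: copied unchanged. */
            *q++ = c;
        } else {
            /* 0x80..0xff: two-byte form 110000yy 10xxxxxx. */
            *q++ = (unsigned char) (0xc0 | (c >> 6));
            *q++ = (unsigned char) (0x80 | (c & 0x3f));
        }
    }
    *q = '\0';
    return (char *) dest;
}
```

With this, the single byte 0xf6 ('ö' in ISO-8859-1) comes out as the two
bytes 0xc3 0xb6.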
The conversion comes from the comments in Naoto Takahashi's utf-8.el
(in emacs-21.1 and 21.2). The relevant bit of the comment is:
;; UTF-8 is defined in RFC 2279.  A sketch of the encoding is:
;;        scalar       |               utf-8
;;        value        | 1st byte  | 2nd byte  | 3rd byte
;; --------------------+-----------+-----------+----------
;; 0000 0000 0xxx xxxx | 0xxx xxxx |           |
;; 0000 0yyy yyxx xxxx | 110y yyyy | 10xx xxxx |
;; zzzz yyyy yyxx xxxx | 1110 zzzz | 10yy yyyy | 10xx xxxx
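For anyone who needs more than the 0 to 255 range, the three rows of that
table could be coded roughly as follows. This is only a sketch: the name
utf8_encode_bmp is invented here, it covers scalar values up to 0xFFFF as
in the table, and it rejects the surrogate range 0xD800 to 0xDFFF, which
is not valid in UTF-8.

```c
#include <stddef.h>

/* Hypothetical helper, for illustration only: encodes the scalar value cp
 * into out following the three rows of the table above.
 * Returns the number of bytes written (1 to 3), or 0 if cp is a
 * surrogate or lies above 0xFFFF. */
size_t utf8_encode_bmp(unsigned long cp, unsigned char out[3])
{
    if (cp < 0x80) {
        /* 0xxx xxxx */
        out[0] = (unsigned char) cp;
        return 1;
    }
    if (cp < 0x800) {
        /* 110y yyyy | 10xx xxxx */
        out[0] = (unsigned char) (0xc0 | (cp >> 6));
        out[1] = (unsigned char) (0x80 | (cp & 0x3f));
        return 2;
    }
    if (cp >= 0xd800 && cp <= 0xdfff)
        return 0;               /* surrogates are not scalar values */
    if (cp < 0x10000) {
        /* 1110 zzzz | 10yy yyyy | 10xx xxxx */
        out[0] = (unsigned char) (0xe0 | (cp >> 12));
        out[1] = (unsigned char) (0x80 | ((cp >> 6) & 0x3f));
        out[2] = (unsigned char) (0x80 | (cp & 0x3f));
        return 3;
    }
    return 0;
}
```

The function in this post only ever uses the first two rows, since its
input values never exceed 0xff.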
This mailing list has been very helpful; perhaps this will be a little
repayment.
There is no copyright on this code. Do with it what you will; offer it
for sale on eBay if you like ;)