[Evolution] [mogutan din or jp: A suggestion about decoding charsets]



Hi all, 

I'm using camel's camel-mime-utils.c in Pan and got this mail from a Pan
user today regarding i18n improvements.

cheers,
Charles

----- Forwarded message from Yamahata Kenichiro <mogutan din or jp> -----

Delivered-To: charles code rebelbase com
Delivered-To: superpimp org-pan superpimp org
Date: Mon, 18 Sep 2000 01:49:31 +0900
From: Yamahata Kenichiro <mogutan din or jp>
To: pan superpimp org
Subject: A suggestion about decoding charsets
X-Mailer: Sylpheed version 0.3.28 (GTK+ 1.2.8; Linux 2.2.16; i686)

Hi, there's a problem with the i18n of PAN for people like me who use
charsets common in Asia.

Because most of the charsets used in Asia are incompatible with UTF-8
(their character maps are completely different), we need to iconv strings to
the charset specified by the environment variable "LANG".

For example, the LANG variable in my environment looks like this:

ja_JP.eucJP

The string after the period names the default charset, so my default
charset is "eucJP".
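
A minimal sketch of that parsing in plain C, for illustration only.
charset_from_locale is a hypothetical helper name, not part of Pan or
camel. It also strips a trailing "@modifier" (as in
"de_DE.ISO-8859-15@euro") and falls back to "UTF-8" when the locale names
no charset; both of those choices are assumptions.

```c
#include <stddef.h>
#include <string.h>

/* Copy the charset part of a locale name such as "ja_JP.eucJP" into out.
 * Falls back to "UTF-8" (an assumed default) when locale is NULL or
 * carries no ".charset" part. Output is truncated to fit outlen.
 * Returns out. */
char *
charset_from_locale(const char *locale, char *out, size_t outlen)
{
	const char *start, *end;
	size_t len;

	if (outlen == 0)
		return out;

	/* The charset, if present, follows the first period. */
	start = locale ? strchr(locale, '.') : NULL;
	if (start == NULL || start[1] == '\0' || start[1] == '@') {
		strncpy(out, "UTF-8", outlen - 1);
		out[outlen - 1] = '\0';
		return out;
	}
	start++;

	/* Stop at an optional "@modifier" suffix. */
	end = strchr(start, '@');
	len = end ? (size_t)(end - start) : strlen(start);
	if (len >= outlen)
		len = outlen - 1;

	memcpy(out, start, len);
	out[len] = '\0';
	return out;
}
```

A caller would pass the value of LANG (or whatever the toolkit reports as
the current locale) together with a small stack buffer.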


I modified the code as shown below, and it worked well.

-----------------------------------------------------------------------------

*** camel-mime-utils.c.orig     Sun Sep 17 23:07:19 2000
--- camel-mime-utils.c  Mon Sep 18 00:33:45 2000
***************
*** 820,825 ****
--- 820,849 ----
        *in = inptr;
  }
  
+ static char *
+ get_current_charset()
+ {
+       gchar *locale_str;
+       gchar **split_str;
+       gchar *ret_str;
+ 
+       locale_str = gtk_set_locale();
+       if (!locale_str) {
+               return g_strdup("UTF-8");
+       } 
+       split_str = g_strsplit(locale_str, ".", 2);
+       
+       if (*(split_str + 1)) {
+               ret_str = g_strdup(*(split_str + 1));
+       }
+       else {
+               ret_str = g_strdup("UTF-8");
+       }
+       g_strfreev(split_str);
+       return ret_str;
+ }
+ 
+ 
  /* decode rfc 2047 encoded string segment */
  static char *
  rfc2047_decode_word(const char *in, int len)
***************
*** 867,872 ****
--- 891,897 ----
                }
                d(printf("The encoded length = %d\n", inlen));
                if (inlen>0) {
+                       char *charset;
                        /* yuck, all this snot is to setup iconv! */
                        tmplen = inptr-in-3;
                        encname = alloca(tmplen+1);
***************
*** 879,886 ****
                        outbase = alloca(outlen);
                        outbuf = outbase;
  
                        /* TODO: Should this cache iconv converters? */
!                       ic = unicode_iconv_open("UTF-8", encname);
                        if (ic != (unicode_iconv_t)-1) {
                                ret = unicode_iconv(ic, (const char **)&inbuf, &inlen, &outbuf, &outlen);
                                unicode_iconv_close(ic);
--- 904,914 ----
                        outbase = alloca(outlen);
                        outbuf = outbase;
  
+                       charset = get_current_charset();
+ 
                        /* TODO: Should this cache iconv converters? */
!                       ic = unicode_iconv_open(charset, encname);
!                       g_free(charset);
                        if (ic != (unicode_iconv_t)-1) {
                                ret = unicode_iconv(ic, (const char **)&inbuf, &inlen, &outbuf, &outlen);
                                unicode_iconv_close(ic);

-----------------------------------------------------------------------------

Also, we need to decode the message body from the charset specified in the
Content-Type header to the default charset.
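
A rough sketch of what such a body conversion could look like. It uses
the POSIX iconv() interface rather than the unicode_iconv_* calls in
camel-mime-utils.c, and convert_body is a hypothetical helper, not an
existing Pan or camel function. It assumes the caller has already parsed
the charset name out of the Content-Type header.

```c
#include <errno.h>
#include <iconv.h>
#include <stdlib.h>
#include <string.h>

/* Convert len bytes of in from from_charset to to_charset.
 * Returns a malloc'ed, NUL-terminated string that the caller must free,
 * or NULL if the charsets are unknown or the input cannot be converted. */
char *
convert_body(const char *in, size_t len,
	     const char *from_charset, const char *to_charset)
{
	iconv_t cd;
	size_t cap = len * 4 + 16;	/* initial guess, grown on E2BIG */
	size_t inleft = len, outleft, used, r;
	char *inp = (char *)in;		/* iconv() takes a non-const pointer */
	char *out, *outp, *grown;
	int flushed = 0;

	cd = iconv_open(to_charset, from_charset);
	if (cd == (iconv_t)-1)
		return NULL;

	out = malloc(cap + 1);
	if (out == NULL) {
		iconv_close(cd);
		return NULL;
	}
	outp = out;
	outleft = cap;

	while (!flushed) {
		if (inleft > 0) {
			r = iconv(cd, &inp, &inleft, &outp, &outleft);
		} else {
			/* Input consumed: emit any pending shift sequence. */
			r = iconv(cd, NULL, NULL, &outp, &outleft);
			if (r != (size_t)-1)
				flushed = 1;
		}
		if (r != (size_t)-1)
			continue;
		if (errno != E2BIG) {
			/* Invalid or truncated input for from_charset. */
			free(out);
			iconv_close(cd);
			return NULL;
		}
		/* Output buffer full: double it and carry on. */
		used = (size_t)(outp - out);
		cap *= 2;
		grown = realloc(out, cap + 1);
		if (grown == NULL) {
			free(out);
			iconv_close(cd);
			return NULL;
		}
		out = grown;
		outp = out + used;
		outleft = cap - used;
	}

	*outp = '\0';
	iconv_close(cd);
	return out;
}
```

The target charset would be the one taken from the locale, for example
convert_body(body, body_len, "ISO-2022-JP", "eucJP"). Charset names
accepted by iconv_open() vary between platforms, so a real implementation
would probably need a fallback when the name is not recognised.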

----
Yamahata Kenichiro  mogutan din or jp

----- End forwarded message -----



