Re: gtk bug or glibc locale bug?




Changwoo Ryu <cwryu@adam.kaist.ac.kr> writes:

> I recently upgraded several Debian packages---xfree86 with no
> -DX_LOCALE, wcsmbs patch from Debian-JP and Korean locale for wcsmbs.
> But after the upgrade, all my GTK+ programs didn't properly work with
> multibyte language.  GTK+ library has been compiled from the source.
> 
> I attached the fix.  The patch is a part of the GTK+ XIM improvement,
> <http://arch.comp.kyutech.ac.jp/~matsu/my_products/gtk/xim-1998.09.16.patch>
> 
> The problem was, the result of mblen ("\xc0", MB_CUR_MAX) was -1 in
> "C" locale.  But I believe it should be 1.  Is it glibc's (or
> wcsmbs's) bug?
>
> Anyway, this patch fixes the problem.  If noone complains, I'll commit
> this.

This is not really a correct patch.

The check you removed is supposed to detect that, under
stock glibc (I don't know about the Debian-modified glibc),
the mb* functions always deal in UTF-8, which isn't useful
for what GTK+ wants to do.

"\xc0" is not a valid UTF-8 string (0xC0 is a lead byte with no
continuation byte after it), hence the return of -1.
This tells GTK+: OK, the C library's multibyte functions
aren't useful, so treat everything as 1 byte.
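
For reference, the probe amounts to roughly the following. This is a
minimal standalone sketch, not the GTK+ source; the function name
probe_c_locale_single_byte is hypothetical, and the value it returns
depends on the C library in use.

```c
#include <assert.h>
#include <locale.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns 1 if mblen() treats the lone byte 0xC0
 * as a one-byte character in the "C" locale, 0 otherwise.
 * The caller's LC_CTYPE setting is restored before returning. */
int probe_c_locale_single_byte(void)
{
    const char *cur = setlocale(LC_CTYPE, NULL);
    char *saved;
    int usable;

    assert(cur != NULL);

    /* Copy the name: a later setlocale() call may overwrite it. */
    saved = malloc(strlen(cur) + 1);
    if (saved == NULL)
        return 0;
    strcpy(saved, cur);

    setlocale(LC_CTYPE, "C");
    mblen(NULL, 0);                      /* reset any shift state */
    usable = (mblen("\xc0", MB_CUR_MAX) == 1);

    setlocale(LC_CTYPE, saved);          /* put the old locale back */
    free(saved);
    return usable;
}
```

A result of 0 corresponds to the "treat everything as 1 byte" case
described above.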

Your patch, unfortunately, breaks the 1-byte locales in stock glibc:
encoded in UTF-8, a character from a 1-byte charset can take up to
2 bytes, so in such a locale MB_CUR_MAX == 2. Your MB_CUR_MAX != 1
test then sets gtk_use_mb to TRUE for a locale that is really 1 byte.
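
The arithmetic behind that can be checked with a small sketch. It
assumes the 1-byte charset is ISO-8859-1, where each byte value equals
its code point; both function names are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Number of bytes needed to encode code point cp in UTF-8.
 * Assumes cp is at most 0x10FFFF. */
size_t utf8_encoded_len(unsigned long cp)
{
    assert(cp <= 0x10FFFFUL);
    if (cp < 0x80UL)
        return 1;
    if (cp < 0x800UL)
        return 2;
    if (cp < 0x10000UL)
        return 3;
    return 4;
}

/* Longest UTF-8 encoding among the 256 ISO-8859-1 code points. */
size_t latin1_max_utf8_len(void)
{
    size_t max = 0;
    unsigned long cp;

    for (cp = 0; cp <= 0xFFUL; cp++) {
        size_t n = utf8_encoded_len(cp);
        if (n > max)
            max = n;
    }
    return max;
}
```

Bytes 0x00-0x7F stay at one byte, while 0x80-0xFF each need two, which
is where a maximum of 2 comes from.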

I think the correct thing to do, in the short term, is
to apply something like:

 ftp://ftp.gtk.org/pub/gtk/patches/gtk-a-higuti-980912-0.patch.gz

which switches the Entry and Text widgets over to using wide
characters. Locale-dependent variable-width encodings are just not
reliable. In the long term, Unicode is the right way to go.
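
To illustrate why a fixed-width representation is easier for an editing
widget, here is a minimal sketch of deleting one character from a wide
string. It is not taken from the patch above; wide_delete_char is a
hypothetical name, and the sketch assumes one wchar_t holds one
character.

```c
#include <assert.h>
#include <stddef.h>
#include <wchar.h>

/* Delete the character at index pos from a NUL-terminated wide string,
 * in place. Returns the new length. With one wchar_t per character,
 * a cursor position is a plain array index and no byte-length lookup
 * is needed. */
size_t wide_delete_char(wchar_t *text, size_t pos)
{
    size_t len = wcslen(text);

    assert(pos < len);

    /* Shift the tail, including the terminating L'\0', one slot left. */
    wmemmove(text + pos, text + pos + 1, len - pos);
    return len - 1;
}
```

With a variable-width encoding, the same operation first has to work
out how many bytes the character at the cursor occupies, which is
exactly where the mb* functions come in.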

                                        Owen
 
> ----------------------------------------------------------------------
> diff -u -r1.85 gtkmain.c
> --- gtkmain.c	1998/10/25 19:30:02	1.85
> +++ gtkmain.c	1998/11/05 13:36:49
> @@ -405,15 +405,18 @@
>    current_locale = g_strdup (setlocale (LC_CTYPE, NULL));
>  
>  #ifdef X_LOCALE
> +  /* with X_LOCALE, MB_CUR_MAX is always 4 regardless of the locale */
>    if ((strcmp (current_locale, "C")) && (strcmp (current_locale, "POSIX")))
>      gtk_use_mb = TRUE;
>    else
> +    gtk_use_mb = FALSE;
> +#else
> +  if ((strcmp (current_locale, "C")) && (strcmp (current_locale, "POSIX"))
> +      && MB_CUR_MAX != 1)
> +    gtk_use_mb = TRUE;
> +  else
> +    gtk_use_mb = FALSE;
>  #endif
> -    {
> -      setlocale (LC_CTYPE, "C");
> -      gtk_use_mb = (mblen ("\xc0", MB_CUR_MAX) == 1);
> -      setlocale (LC_CTYPE, current_locale);
> -    }
>  
>    g_free (current_locale);


