Re: gtk bug or glibc locale bug?




Changwoo Ryu <cwryu@adam.kaist.ac.kr> writes:

> Owen Taylor <otaylor@redhat.com> writes:
> 
> > The problem the code is supposed to detect, is that under 
> > stock glibc (I don't know about the Debian-modified glibc)
> > the mb* functions always deal in UTF-8, which isn't useful
> > for what GTK+ wants to do. 
> >
> > "0xc0\n" is not a valid UTF8 string, hence the return of -1.
> > This tells GTK+ - OK, the C library's multibyte functions
> > aren't useful, so treat everything as 1 byte.
> 
> Yes, I can understand.  (*sigh*)
> 
> The wcsmbs package is a workaround for multibyte users.  It puts
> "libwcsmbs.so" in /etc/ld.so.preload, and libwcsmbs.so redefines the
> mb* and wc* functions.  In this environment the mb* functions are not
> limited to the UTF-8 encoding.
> 
> > I think the correct thing to do, in the short term, is
> > to apply something like:
> > 
> >  ftp://ftp.gtk.org/pub/gtk/patches/gtk-a-higuti-980912-0.patch.gz
> 
> I just read the patch.  But the patch breaks binary compatibility.
> (We need to use GTK+ *now*.)

Hmmm, well, I'm not sure there will be another 1.0.x release at all
before 1.2, which will already break binary compatibility (along with
source compatibility in a few places). If you are going to be
using your own patches to GTK+, you are, of course, free to
use whatever you like.
 
> How about (MB_CUR_MAX >= 3) after the "\xc0" check?  It seems OK to
> me.  Is there any 1-byte locale whose MB_CUR_MAX >= 3, or multibyte
> locale whose MB_CUR_MAX < 3 ?
> 
> ----------------------------------------------------------------------
>        setlocale (LC_CTYPE, "C");
>        gtk_use_mb = (mblen ("\xc0", MB_CUR_MAX) == 1);
>        setlocale (LC_CTYPE, current_locale);
> +      if (! gtk_use_mb && (MB_CUR_MAX >= 3))
> +        gtk_use_mb = TRUE;
>      }
>  
>    g_free (current_locale);
> ----------------------------------------------------------------------
> 
> Please comment about this.

Hmmm, this is somewhat ugly, since a conformant C library
could always report MB_CUR_MAX as 1024 and not handle
multibyte characters at all, though it would work
on all machines I currently know of.
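
For concreteness, here is a rough standalone sketch of the combined
check, written outside of GTK+. The function name detect_use_mb() and
the use of plain malloc()/free() in place of the GLib helpers are
assumptions made for illustration only; this is not the actual GTK+
source, just the same logic as the quoted fragment plus the proposed
MB_CUR_MAX fallback.

```c
#include <locale.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not part of GTK+): returns 1 if the C library's
 * multibyte functions look usable for the current locale, 0 otherwise. */
static int detect_use_mb(void)
{
    int use_mb;
    char *saved = NULL;
    const char *current = setlocale(LC_CTYPE, NULL);

    /* Copy the locale name; the string returned by setlocale() may be
     * overwritten by the next setlocale() call. */
    if (current != NULL) {
        saved = malloc(strlen(current) + 1);
        if (saved != NULL)
            strcpy(saved, current);
    }

    /* In the "C" locale a lone 0xc0 byte should be a 1-byte character.
     * A library that always decodes UTF-8 rejects it and returns -1. */
    setlocale(LC_CTYPE, "C");
    use_mb = (mblen("\xc0", MB_CUR_MAX) == 1);

    /* Restore the caller's locale before looking at MB_CUR_MAX, since
     * MB_CUR_MAX depends on the current LC_CTYPE setting. */
    if (saved != NULL)
        setlocale(LC_CTYPE, saved);

    /* Proposed fallback: trust a locale that claims characters of 3 or
     * more bytes. This is a heuristic, not something the C standard
     * guarantees. */
    if (!use_mb && MB_CUR_MAX >= 3)
        use_mb = 1;

    free(saved);
    return use_mb;
}
```

A caller would normally run setlocale (LC_ALL, "") first and then
store the result of detect_use_mb() in a flag such as gtk_use_mb. The
fallback only helps where MB_CUR_MAX is honest about the locale, which
is exactly the weak spot described above.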

If the wide-character patches don't get in for 1.2, this
or something similar probably should be added. But for
now, perhaps this is better kept as a downstream patch.

Regards,
                                        Owen


