Re: gtk bug or glibc locale bug?
- From: Owen Taylor <otaylor redhat com>
- To: gtk-list redhat com
- Subject: Re: gtk bug or glibc locale bug?
- Date: 06 Nov 1998 09:31:20 -0500
Changwoo Ryu <cwryu@adam.kaist.ac.kr> writes:
> Owen Taylor <otaylor@redhat.com> writes:
>
> > > How about (MB_CUR_MAX >= 3) after the "\xc0" check? It seems OK to
> > > me. Is there any 1-byte locale whose MB_CUR_MAX >= 3, or multibyte
> > > locale whose MB_CUR_MAX < 3 ?
> > >
> > > ----------------------------------------------------------------------
> > > setlocale (LC_CTYPE, "C");
> > > gtk_use_mb = (mblen ("\xc0", MB_CUR_MAX) == 1);
> > > setlocale (LC_CTYPE, current_locale);
> > > + if (! gtk_use_mb && (MB_CUR_MAX >= 3))
> > > + gtk_use_mb = TRUE;
> > > }
> > >
> > > g_free (current_locale);
> > > ----------------------------------------------------------------------
> > >
> > > Please comment about this.
> >
> > Hmmm, this is somewhat ugly, since a conformant C library
> > could report MB_CUR_MAX as 1024 always and not handle
> > multibyte characters at all, though it would work
> > on all machines I know of currently.
>
> Mm.. But the \xc0 check is also ugly, isn't it?
True, no doubt about that.
> If the "C" locale is the same as US-ASCII, the mblen() result can be -1.
I'm sort of counting on the laziness of C library writers
here; it seems doubtful that they would have separate mb*
functions for 7-bit and 8-bit locales. The check really
should set the locale to "en_US" or something, but I didn't
want to rely on the existence of another locale.
> And a C
> library can pass the \xc0 check and not handle mb* functions at all.
>
> Why does the current code set the gtk_use_mb value from the "C" locale,
> not from the current locale? I think it is a bug to be fixed.
The question isn't whether the C library handles multi-byte locales;
the question is whether it will, for single-byte locales, correctly
report a length of 1.
GTK+ is currently completely innocent of all knowledge of what
locales are single-byte or multi-byte, so all it can do
is check in the one locale that it knows is single-byte, the
"C" locale.
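As a rough, self-contained sketch of that check (the helper name
c_locale_mblen_is_sane and the fixed-size buffer are made up for
illustration and are not GTK+ code), the idea is to save the current
LC_CTYPE locale, switch to "C", probe mblen() with the byte 0xC0, and
switch back:

```c
#include <locale.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not a GTK+ function): returns 1 if mblen() reports
 * a length of 1 for the byte 0xC0 in the "C" locale, 0 otherwise. */
int c_locale_mblen_is_sane(void)
{
    char saved[256];
    const char *current = setlocale(LC_CTYPE, NULL);
    int sane;

    /* Without a copy of the current locale name we could not restore it,
     * so give the conservative answer and leave the locale untouched. */
    if (current == NULL || strlen(current) >= sizeof saved)
        return 0;
    strcpy(saved, current);  /* a later setlocale() may overwrite `current` */

    setlocale(LC_CTYPE, "C");
    mblen(NULL, 0);                           /* reset any shift state */
    sane = (mblen("\xc0", MB_CUR_MAX) == 1);  /* the probe itself */
    setlocale(LC_CTYPE, saved);               /* restore the caller's locale */

    return sane;
}
```

Whether this returns 1 or 0 depends on the C library, which is the
whole point of probing at run time rather than assuming.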
If the current locale were encoded in, e.g., EUC-JP, a result of -1 for
mblen("\xc0") would be perfectly correct, so checking in the
current locale doesn't work.
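To see why the locale matters, one could run the same probe in an
arbitrary named locale and compare results. This is only a sketch:
probe_c0_in_locale and PROBE_UNAVAILABLE are hypothetical names, and a
locale name such as "ja_JP.eucJP" is an assumption that may not be
installed on a given machine, in which case the sentinel comes back.

```c
#include <locale.h>
#include <stdlib.h>
#include <string.h>

#define PROBE_UNAVAILABLE (-2)  /* hypothetical sentinel: locale not usable */

/* Hypothetical helper: returns what mblen() says about the lone byte 0xC0
 * in the named LC_CTYPE locale, or PROBE_UNAVAILABLE if that locale (or
 * the current locale name) cannot be used. */
int probe_c0_in_locale(const char *name)
{
    char saved[256];
    const char *current = setlocale(LC_CTYPE, NULL);
    int len;

    if (current == NULL || strlen(current) >= sizeof saved)
        return PROBE_UNAVAILABLE;
    strcpy(saved, current);

    /* On failure setlocale() leaves the locale unchanged, so there is
     * nothing to restore. */
    if (setlocale(LC_CTYPE, name) == NULL)
        return PROBE_UNAVAILABLE;

    mblen(NULL, 0);                    /* reset any shift state */
    len = mblen("\xc0", MB_CUR_MAX);   /* -1 means "not a valid character" */
    setlocale(LC_CTYPE, saved);

    return len;
}
```

A -1 from probe_c0_in_locale("ja_JP.eucJP") would be the library
behaving correctly, while a -1 from probe_c0_in_locale("C") is the
case the check is trying to detect.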
Regards,
Owen