Re: Gtk2, Locale::TextDomain and utf-8 locales



On Mon, 2006-05-15 at 12:02 +0300, Guido Flohr wrote:

[snip]

Why all the hassle? Why does libintl-perl not "respect" that utf-8 flag?
The answer is a little off-topic, so I will only summarize the
problem:

The gettext API does not allow you to portably find out the character
set of a string returned by gettext() (or ngettext, etc.).  It doesn't
even tell you whether a string has actually been translated or not.

On the other hand, the API allows you to enforce a certain output
character set by the use of bind_textdomain_codeset(), a relatively new
function.

Therefore, libintl-perl does the right thing(tm): Since the character
set of the output of gettext() and friends is unknown, the library turns
the utf-8 flag unconditionally off on these strings.  However, if you
have enforced a certain character set, you can override the library by
unconditionally turning the flag on (or use an even smarter filter).
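
One way such a "smarter filter" might look, as a minimal sketch and not libintl-perl's own code: the helper name flag_as_utf8 is hypothetical, and it assumes the translated string arrives as raw bytes after something like bind_textdomain_codeset($domain, 'utf-8'). It uses only Perl's built-in utf8::decode, which leaves the string untouched when the bytes are not well-formed UTF-8.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of libintl-perl): mark a byte string as
# UTF-8 only if it really is well-formed UTF-8, otherwise return it as-is.
sub flag_as_utf8 {
    my ($bytes) = @_;
    my $copy = $bytes;       # work on a copy, utf8::decode modifies in place
    utf8::decode($copy);     # returns false and changes nothing on bad input
    return $copy;
}

# Stand-in for a gettext() result: "Grüße" as raw UTF-8 bytes.
my $translated = "Gr\xc3\xbc\xc3\x9fe";

printf "%d\n", length($translated);                # 7 (bytes)
printf "%d\n", length(flag_as_utf8($translated));  # 5 (characters)
```

Blindly turning the flag on would accept broken byte sequences as well; checking first costs one pass over the string.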

A lot of hassle, but honestly, I don't understand why Gtk2 uses this
flag at all in the first place.  We make do perfectly well without it in
the C version, so why make a difference in Perl?

gtk+ requires all strings to be utf8, and widgets will croak if a string
is not valid utf8. All strings returned from gtk+ will be utf8 as well.

String operators in perl will not work correctly on utf8 encoded strings
if the utf8 flag is not set. Therefore we need to set the flag on
output. And seeing that we set the flag on output, we might as well let
perl handle the "upgrading" of strings to utf8 on input.
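
To illustrate the point about string operators, here is a small sketch using only core Perl and the Encode module. The sample string and the helper name describe are made up for the example. The same text is held once as raw bytes (flag off) and once as decoded characters (flag on), and length() gives different answers.

```perl
use strict;
use warnings;
use Encode qw(decode);

# "Grüße" as raw UTF-8 bytes, the way a C library hands it back.
my $bytes = "Gr\xc3\xbc\xc3\x9fe";

# Decode a copy so the original byte string is left alone.
my $chars = decode('UTF-8', my $tmp = $bytes);

# Illustrative helper: report the flag state and what length() sees.
sub describe {
    my ($s) = @_;
    return sprintf "flag=%s length=%d",
        (utf8::is_utf8($s) ? "on" : "off"), length $s;
}

print describe($bytes), "\n";   # flag=off length=7
print describe($chars), "\n";   # flag=on length=5
```

substr, reverse, regular expressions and friends are affected the same way: without the flag they operate on bytes, not characters.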

One might argue that the dual encoding setup in perl is a bad idea, but
that really doesn't matter. Bad idea or not, it's there. And perl will
break if you don't flag strings correctly. Obviously, this kind of
problem doesn't exist in C, as C doesn't have any string operators...

I hope my answer makes sense. Feel free to catch me on irc if it does
not.

./borup



