Re: glade localization woes



On Wed, Jun 16, 2004 at 13:02:58 +0200, Thierry Vignaud wrote:
Jan Hudec <bulb ucw cz> writes:

you need to call "some_module::bind_textdomain_codeset("mydomain",
'UTF8');" in order to get utf8 strings

That's a bug in the bindings. The binding of gettext SHOULD
*convert* the string to utf-8 from whatever gettext returns and mark
it as utf-8 string. After all, in other languages that only have
utf-8 strings it has to do the same.

no, read the gettext API: the program has to tell gettext in which
encoding it expects strings, else gettext assumes the program will use
the encoding specified by the current locale.

Yes. And since perl knows about unicode strings, it should always tell
gettext it wants unicode.

internally managing strings in utf8 does not imply my program shall
print unicode strings.

Who the HELL said that?! It's just like with python and tcl -- all
strings are unicode internally and the conversion is done on I/O.

this is bad.

It would be -- if it was that way. It isn't.

as a packager, i've already fixed enough buggy packages that either
print truncated strings under X11 or garbage on the console to not
want this.

simple example: translation of --help message ...

Yes, I DO want it in unicode (after all I do HAVE it in unicode, and
gettext never recodes the original...).

when one prints something on the console, it had better follow the
locale encoding, since the odds are high this will work smoothly.

Yes, the perlIO layer should take care of that -- and it does.

Well, I do not insist on having everything in unicode when in the end
you need everything in the locale encoding. Converting there and back
would be useless (though python and tcl do it!). But when you need the
strings sometimes in one charset and sometimes in another, it's better
to be consistent: have all strings in unicode internally and only
convert when you pass them out of your application. So the --help
message should be in unicode internally, gettext should return the
unicode version and the print routine should convert. Just see the
perlunicode, Encode and PerlIO manpages and the open and binmode
function entries in perlfunc.
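
As a rough sketch of what I mean by converting only on output: the
sample string and the two charsets below are picked purely for
illustration, bytes_written is a made-up helper name, and the
in-memory filehandle only stands in for a real terminal or file.

```perl
use strict;
use warnings;

# Internally the string is plain characters; U+00E9 is e-acute.
my $text = "caf\x{e9}";

# Print a string through a PerlIO encoding layer and return the bytes
# that actually leave the program. The in-memory filehandle stands in
# for a terminal or a file.
sub bytes_written {
    my ($charset, $string) = @_;
    my $buf = '';
    open(my $fh, ">:encoding($charset)", \$buf) or die "open: $!";
    print {$fh} $string;
    close($fh) or die "close: $!";
    return $buf;
}

my $latin1 = bytes_written('ISO-8859-1', $text);
my $utf8   = bytes_written('UTF-8',      $text);

printf "ISO-8859-1: %d bytes, UTF-8: %d bytes\n",
    length($latin1), length($utf8);
```

The same internal string comes out as 4 bytes in ISO-8859-1 and as 5
bytes in UTF-8; the program itself never has to care which one the
output side wants.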

Though, well, the I18N support in perl is lacking, unfinished and
broken. It's a horror to get perl to properly honor locales.

it's not.

The C one is lacking and broken too... The GTK one is far better (the
only thing I miss in GTK is a wrapper around bindtextdomain that would
also call bind_textdomain_codeset (with unicode, of course)).

lots of perl programs work smoothly with l10n.
mandrakelinux tools properly support 71 languages including bidi
support, CJK support, ...

There are. It's just a bit more work than it should be. I consider this
code uselessly verbose:

use utf8; # You must tell perl which charset the script is in...
use I18N::Langinfo qw(langinfo CODESET); # perl's nl_langinfo(3)
use Encode::Alias;
# Register "local" as an alias for the charset of the current locale.
BEGIN { define_alias(local => langinfo(CODESET)); }
binmode STDIN, ':raw :encoding(local)';
binmode STDOUT, ':raw :encoding(local)';
binmode STDERR, ':raw :encoding(local)';

I really think the above code should take no more than two statements.
So I don't say it's impossible, but I say it's not good.

b) the gtk+2 perl binding does not include any gettext support. it just
   expects you to pass it unicode strings.

   BUT gettext will give it strings encoded in the *locale encoding*
   (the value of `nl_langinfo (CODESET)', which depends on
   `LC_CTYPE')

   just RTFM !

That's OK -- for the C version. But not for the bindings. The bindings
can simply:

    * Use bind_textdomain_codeset to set utf-8 in the binding of
      bindtextdomain -- there is no point in not using unicode.
      (note: bind_textdomain_codeset is broken -- it should take an
      empty domain as "override for each and every domain", but it
      does not)

how can the gtk2-perl binding know which domain the app may want to
use?

If the app wants to use a domain, it must call bindtextdomain. The
bindtextdomain function in perl can call bindtextdomain AND
bind_textdomain_codeset on the C side.
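
Roughly like this, as a sketch only: bind_utf8_domain is a name I made
up, 'mydomain' and the directory are placeholders, and I am assuming
the CPAN Locale::gettext module in a version that provides
bind_textdomain_codeset. The module is loaded lazily, so the sketch
still runs (and binds nothing) where it is not installed.

```perl
use strict;
use warnings;

# Locale::gettext is a CPAN module, not core perl. Assumption: the
# installed version provides bind_textdomain_codeset.
my $have_gettext = eval { require Locale::gettext; 1 } ? 1 : 0;

# Hypothetical convenience wrapper: bind the message catalog directory
# and force UTF-8 output for that domain in one call.
# Returns 1 when both C-level calls were made, 0 otherwise.
sub bind_utf8_domain {
    my ($domain, $dir) = @_;
    return 0 unless $have_gettext
        && defined &Locale::gettext::bind_textdomain_codeset;
    Locale::gettext::bindtextdomain($domain, $dir);
    Locale::gettext::bind_textdomain_codeset($domain, 'UTF-8');
    return 1;
}

# Placeholder domain name and catalog directory.
my $bound = bind_utf8_domain('mydomain', '/usr/share/locale');
print $bound ? "domain bound with UTF-8 codeset\n"
             : "Locale::gettext not available, nothing bound\n";
```

The point is only that the application still names its domain itself;
the wrapper just adds the codeset call it would otherwise forget.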

what's more, i may want some messages to be displayed in old ISO code
pages because not all terminals are utf8 capable, so i may want a
domain not set to utf8, or i may not use utf8 even if gtk+ initializes
smoothly because the program finds it lacks something, ...

So do I.

as a packager, i've already fixed enough buggy packages that either
print truncated strings under X11 or garbage on the console to not
want this.

this has no place in perl-Gtk2, since you'd be enforcing a policy in
the toolkit. what's more, it would add a useless dependency on gettext
to perl-Gtk2.

Gtk2 already does depend on gettext!
Glade even asks for translation domain when you load the interface. But
it does not say it also requires you to bind_textdomain_codeset(utf-8)
for that domain -- and that is BROKEN.

last but not least, since Locale::Gettext has no knowledge of
gtk2-perl and cannot and shouldn't know whether gtk+2 is used or not,
there's no reason that this package should enforce utf8

You are right here. I would actually like to see gtk2 have its own
binding to gettext that would do the codeset binding. Then you can
choose: use the Locale::gettext implementation and get the strings in
the locale charset, or use the Gtk2 binding and get them in unicode.
Whichever you choose, it's convenient.

    * Mark return from gettext function as utf-8 string (it was forced
      to be one)

if you do the first point, this is useless.

*MARK* the return from gettext as a utf-8 string means that I assume it
is in utf-8 (because I called bind_textdomain_codeset) and now need to
tell perl about that fact.

if you do not, you are assuming that gettext will return utf8 strings,
which is wrong since the encoding may well be the locale one.
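
In perl terms the marking step looks something like the sketch below.
The byte string is a hand-written stand-in for what gettext would
return once the domain's codeset is bound to UTF-8, and I use
Encode::decode because it also validates the bytes; that is my choice
for the illustration, not necessarily what a binding would do inside.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Stand-in for the return value of gettext after the codeset was bound
# to UTF-8: the raw bytes of "cafe" with an e-acute (sample data).
my $bytes = "caf\xC3\xA9";

# Without this step perl sees five unrelated bytes. Decoding checks
# them and hands back a character string flagged as utf-8 internally.
my $text = decode('UTF-8', $bytes);

printf "bytes: %d, characters: %d, flagged: %s\n",
    length($bytes), length($text),
    utf8::is_utf8($text) ? 'yes' : 'no';
```

Until perl is told, length() and friends count 5 bytes; afterwards
they count 4 characters, which is what a toolkit expecting unicode
strings needs.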

last but not least, since Locale::Gettext has no knowledge of
gtk2-perl and cannot and shouldn't know whether gtk+2 is used or not,
there's no reason that this package should enforce utf8

You are right here. Perl works just fine with all strings in the locale
encoding. Touching Locale::gettext is not a good idea.

Perhaps we should have our own bindings in Glib-perl that will do just
this...

properly using gettext is just more sane

I just want Glib-perl to use gettext properly for me. It's not about
something that does not work. It's about things that are common enough
that a convenience wrapper for them should exist.

-------------------------------------------------------------------------------
                                                 Jan 'Bulb' Hudec <bulb ucw cz>



