On Wed, Jun 16, 2004 at 13:02:58 +0200, Thierry Vignaud wrote:
Jan Hudec <bulb ucw cz> writes:

you need to call "some_module::bind_textdomain_codeset("mydomain", 'UTF8');" in order to get utf8 strings

That's a bug in the bindings. The binding of gettext SHOULD *convert* the string to utf-8 from whatever gettext returns and mark it as utf-8 string. After all, in other languages that only have utf-8 strings it has to do the same.

no, read the gettext API: the program has to tell gettext in which encoding it expects strings, else gettext assumes the program will use the encoding specified by the current locale.

Yes. And since perl knows about unicode strings, it should always tell it wants unicode.

internally managing strings in utf8 does not imply my program shall print unicode strings.

Who the HELL said that?! It's just like with python and tcl -- all strings are unicode internally and the conversion is done on I/O.
this is bad.
It would be -- if it was that way. It isn't.
as a packager, i've already fixed enough buggy packages that either print truncated strings under X11 or garbage on the console to not want this. simple example: translation of --help message ...
Yes, I DO want it in unicode (after all I do HAVE it in unicode, and gettext never recodes the original...).
when one prints something on the console, it had better follow the locale encoding, since the odds are high this will work smoothly.
Yes, the perlIO layer should take care of that -- and it does. Well. I do not insist on having everything in unicode when, in the end, you need everything in the locale encoding. That would be useless converting there and back (though python and tcl do it!). But when you need the strings sometimes in one charset and sometimes in another, it's better to be consistent, have all strings in unicode internally and only convert when you pass them out of your application. So the --help message should be in unicode internally, gettext should return the unicode version and the print routine should convert. Just see the perlunicode, Encode and perlio manpages and the open and binmode function entries in perlfunc.
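To make the "convert only on I/O" point concrete, here is a minimal sketch. The sample message, the helper name encode_via_layer and the two charsets are made up for illustration, and in-memory filehandles stand in for terminals that expect different encodings:

```perl
use strict;
use warnings;
use utf8;    # this source file itself is written in UTF-8

# Inside the program the message is a Unicode (character) string.
my $msg = "Utilisation : café [options]\n";

# Hypothetical helper: print $text through an :encoding() layer into an
# in-memory file and return the bytes that would reach the outside world.
sub encode_via_layer {
    my ($text, $charset) = @_;
    my $bytes = '';
    open(my $fh, ">:encoding($charset)", \$bytes) or die "open: $!";
    print {$fh} $text;
    close($fh) or die "close: $!";
    return $bytes;
}

my $latin1 = encode_via_layer($msg, 'ISO-8859-1');
my $utf8   = encode_via_layer($msg, 'UTF-8');

# Same characters, different byte counts depending on the output charset.
printf "characters: %d, latin1 bytes: %d, utf8 bytes: %d\n",
    length($msg), length($latin1), length($utf8);
```

The string itself never changes; only the layer on the filehandle decides which bytes leave the program.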
Though, well, the I18N support in perl is lacking, unfinished and broken. It's a horror to get perl to properly honor locales.

it's not.

The C one is lacking and broken too... The GTK one is far better (the only thing I miss in GTK is a wrapper around bindtextdomain that would also call bind_textdomain_codeset (with unicode, of course)).
lots of perl programs work smoothly with l10n. mandrakelinux tools properly support 71 languages including bidi support, CJK support, ...
There are. It's just a bit more work than it should be. I consider this code uselessly verbose:

    use utf8; # You must tell perl which charset the script is in...
    use I18N::Langinfo qw(langinfo CODESET);
    use Encode::Alias;
    # Register "local" as an alias for the charset of the current locale.
    BEGIN { define_alias(local => langinfo(CODESET)); }
    binmode STDIN, ':raw :encoding(local)';
    binmode STDOUT, ':raw :encoding(local)';
    binmode STDERR, ':raw :encoding(local)';

I really think that the above code should be written in two statements. So I don't say it's impossible, but I say it's not good.
b) the gtk+2 perl binding does not include any gettext support. it just expects you to pass it unicode strings. BUT gettext will give it strings encoded in the *locale encoding* (the value of `nl_langinfo (CODESET)', which depends on the `LC_CTYPE'). just RTFM !

That's OK -- for the C version. But not for the bindings. The bindings can simply:

* Use bind_textdomain_codeset to set utf-8 in the binding of bindtextdomain -- there is no point in not using unicode. (note: bind_textdomain_codeset is broken -- it should take empty domain as "override for each and every domain", but it does not)

how can the gtk2-perl binding know which domain the app may want to use ?

If the app wants to use a domain, it must call bindtextdomain. The bindtextdomain function in perl can call bindtextdomain AND bind_textdomain_codeset on the C side.
what's more, i may want some messages to be displayed in old ISO code pages b/c not all terminals are utf8 capable, so i may want a domain not set to utf8, or i may not use utf8 even if gtk+ initializes smoothly b/c the program finds it lacks something, ...
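Such a wrapper could look roughly like the sketch below. The name bind_unicode_domain is made up, and the sketch assumes the CPAN module Locale::gettext is installed and provides both bindtextdomain and bind_textdomain_codeset; it shows the idea in pure perl rather than how any existing binding does it on the C side:

```perl
use strict;
use warnings;

# Hypothetical convenience wrapper: bind a message domain to a directory
# and, in the same call, ask gettext to hand back UTF-8 for that domain.
sub bind_unicode_domain {
    my ($domain, $dir) = @_;
    require Locale::gettext;    # CPAN module, assumed to be available
    Locale::gettext::bindtextdomain($domain, $dir);
    Locale::gettext::bind_textdomain_codeset($domain, 'UTF-8');
    return $domain;
}
```

An application would then call bind_unicode_domain('mydomain', '/usr/share/locale') once at startup instead of making the two calls itself; domains bound the ordinary way keep returning strings in the locale charset.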
So do I.
as a packager, i've already fixed enough buggy packages that either print truncated strings under X11 or garbage on the console to not want this. this has nothing to do in perl-Gtk2 since you're enforcing a policy into the toolkit. what's more, it would add a useless dependency in perl-Gtk2 on gettext.
Gtk2 already does depend on gettext! Glade even asks for translation domain when you load the interface. But it does not say it also requires you to bind_textdomain_codeset(utf-8) for that domain -- and that is BROKEN.
last but not least, since Locale::Gettext has no knowledge of gtk2-perl and cannot and shouldn't know whether gtk+2 is used or not, there's no reason that this package should enforce utf8
You are right here. I would actually like to see gtk2 have its own binding to gettext, one that would do the codeset binding. So you can choose -- use the Locale::gettext implementation and get the strings in the locale charset, or use the Gtk2 binding and get them in unicode. You can choose, and whichever you choose, it's convenient.

* Mark return from gettext function as utf-8 string (it was forced to be one)

if you do the first point, this is useless.

*MARK* return from gettext as utf-8 string means that I assume it is in utf-8 (because I called bind_textdomain_codeset) and now need to tell perl about the fact.
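A minimal sketch of that marking step, using the core Encode module. The byte string is hypothetical sample data standing in for what gettext would return once the domain's codeset has been bound to UTF-8:

```perl
use strict;
use warnings;
use Encode qw(decode);

# Stand-in for a gettext result: raw UTF-8 bytes that perl still treats
# as a plain byte string (hypothetical sample data).
my $raw = "caf\xc3\xa9";

# decode() interprets the bytes as UTF-8 and returns a character string,
# which is how perl is told "this is unicode".
my $text = decode('UTF-8', $raw);

printf "bytes: %d, characters: %d, flagged: %s\n",
    length($raw), length($text), (utf8::is_utf8($text) ? 'yes' : 'no');
```

Without this step perl would treat each byte as a separate character, and an :encoding() layer on output would encode the string a second time.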
if you do not, you are assuming that gettext will return utf8 strings, which is wrong since the encoding will eventually be the locale one. last but not least, since Locale::Gettext has no knowledge of gtk2-perl and cannot and shouldn't know whether gtk+2 is used or not, there's no reason that this package should enforce utf8
You are right here. Perl works just right with all strings in the locale encoding. Touching Locale::gettext is not a good idea.
Perhaps we should have our own bindings in Glib-perl that will do just this...

properly using gettext is just more sane

I just want Glib-perl to use gettext properly for me. It's not about anything that does not work. It's just about things that are common enough that a convenience wrapper for them should exist.

-------------------------------------------------------------------------------
Jan 'Bulb' Hudec <bulb ucw cz>