On Wed, Jun 16, 2004 at 10:52:00 +0200, Thierry Vignaud wrote:
> Jan Hudec <bulb ucw cz> writes:
>>>> It seems that there is a bug in gettext. Perl has two types of
>>>> strings -- octet streams and unicode. It's a mess, sometimes. Read
>>>> the perlunicode manpage to find out more about it. Now what really
>>>> should be done is forcing gettext to mark the string as a unicode
>>>> one if it is one.
>>>>> Is it anything I'm doing wrong, or is it a Perl Glade bug (which
>>>>> is what I'm inclined to believe)?
>>>> No. It's a gettext bug. It returns a unicode string, but does not
>>>> mark it as such. And all the glib-perl/gtk2-perl/gtk2-glade-perl
>>>> stuff honors that mark.
>>> gettext doesn't have to tag a perl string as utf8 since it's not
>>> aware of perl internals, and you cannot expect it to do so.
>> The binding is aware of the internals and should properly recode and
>> mark the string. The C interface to the gettext function returns just
>> character arrays.
> nice, but there's no gettext binding (at least not in perl-Gtk2).

I hoped you were talking about the interface that does exist -- Locale::gettext. I just looked at the source, though, and it does not seem to do anything about the unicode mess. So it may well not work right. Well, then just use the utf8:: utility functions. No need to invent all the CRAP.
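
A minimal sketch of the utf8:: approach, assuming the translated message arrives as UTF-8 encoded octets. The hard-coded string is only a stand-in for what gettext might return, and `mark_utf8` is a hypothetical helper name, not part of any module discussed in this thread.

```perl
use strict;
use warnings;

# Stand-in for a message returned by gettext: raw UTF-8 octets that Perl
# does not yet treat as a character string (assumption: the catalog is
# served as UTF-8).
my $octets = "\xC3\xA9t\xC3\xA9";    # the UTF-8 bytes of "été"

# Hypothetical helper: return a character string for UTF-8 octets.
sub mark_utf8 {
    my ($str) = @_;                  # work on a copy of the argument
    # utf8::decode converts in place; it returns false and leaves the
    # string alone if the octets are not well-formed UTF-8.
    utf8::decode($str) or warn "not valid UTF-8, left as octets\n";
    return $str;
}

my $text = mark_utf8($octets);
printf "%d octets, %d characters\n", length $octets, length $text;
# prints "5 octets, 3 characters"
```

The utf8:: functions are built into perl, so no `use utf8;` is needed just to call them.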

>>> you need to call
>>> "some_module::bind_textdomain_codeset("mydomain", 'UTF8');" in order
>>> to get utf8 strings
>> That's a bug in the bindings. The binding of gettext SHOULD *convert*
>> the string to utf-8 from whatever gettext returns and mark it as a
>> utf-8 string. After all, in other languages that only have utf-8
>> strings it has to do the same.
> no, read the gettext API: the program has to tell gettext in which
> encoding it expects strings, else gettext assumes the program will use
> the encoding specified by the current locale.

Yes. And since perl knows about unicode strings, it should always tell gettext it wants unicode. Though, well, the I18N support in perl is lacking, unfinished and broken. It's a horror to get perl to properly honor locales. Well, actually, we should have a Glib::I18N module taking care of it, because of how broken and inconsistent the support in perl is.

> then, if needed, gettext will do the conversion between the encoding
> used by the translator and the one expected by the program
>>> with this function defined in an xs as:
>>> ==================================================
>>> char *
>>> bind_textdomain_codeset(domainname, codeset)
>>>     char * domainname
>>>     char * codeset
>>> ==================================================
>> That's nice. But: 1) Undocumented
> just look at gettext doc

Sorry, found it now.

>> 2) Shouldn't be -- the bindings should set utf-8 behind the scenes.
> a) about common.pm, it had to cover both gtk+-1 and gtk+-2 since until
> a few months ago, we still had a few gtk+1 apps using these modules.

That explains a lot.

> b) the gtk+2 perl binding does not include any gettext support. it
> just expects you to pass it unicode strings. BUT gettext will give it
> strings encoded in the *locale encoding* (the value of
> `nl_langinfo (CODESET)', which depends on the `LC_CTYPE') just RTFM !

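
For illustration, a rough sketch of what "strings encoded in the locale encoding" means on the Perl side. It queries the codeset through the core I18N::Langinfo module and decodes octets with Encode. `decode_octets` is a hypothetical helper, and the fixed ISO-8859-1 example is an assumption chosen so the result does not depend on the machine's locale.

```perl
use strict;
use warnings;
use POSIX qw(setlocale LC_CTYPE);
use I18N::Langinfo qw(langinfo CODESET);
use Encode qw(decode);

# Hypothetical helper: turn octets in a named encoding into a character
# string. Encode croaks if it does not know $codeset.
sub decode_octets {
    my ($octets, $codeset) = @_;
    return decode($codeset, $octets);
}

setlocale(LC_CTYPE, '');                  # adopt LC_CTYPE from the environment
my $locale_codeset = langinfo(CODESET);   # Perl's view of nl_langinfo(CODESET)
print "locale codeset: $locale_codeset\n";

# Fixed example: the single octet 0xE9 is "é" in ISO-8859-1.
my $text = decode_octets("\xE9", 'ISO-8859-1');
printf "%d character, code point U+%04X\n", length $text, ord $text;
# second line printed: "1 character, code point U+00E9"
```

In a real program the second argument would be `$locale_codeset`, provided Encode recognizes the name the platform reports.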
That's OK -- for the C version. But not for the bindings. The bindings can simply:
 * Use bind_textdomain_codeset to set utf-8 in the binding of bindtextdomain -- there is no point in not using unicode. (note: bind_textdomain_codeset is broken -- it should take an empty domain as "override for each and every domain", but it does not)
 * Mark the return from the gettext function as a utf-8 string (it was forced to be one)
Perhaps we should have our own bindings in Glib-perl that do just this...
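
The two steps above could be sketched roughly as follows. This is only an illustration, not an existing Glib-perl API: `make_utf8_gettext` is a hypothetical name, and the raw binding functions are passed in as code references so the sketch runs without a message catalog. With a Locale::gettext version that provides them, one would presumably pass `\&Locale::gettext::bind_textdomain_codeset` and `\&Locale::gettext::dgettext` instead of the stubs.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical wrapper constructor. $bind_codeset and $lookup are code
# references to the raw binding functions, called as
# $bind_codeset->($domain, $codeset) and $lookup->($domain, $msgid).
sub make_utf8_gettext {
    my ($domain, $bind_codeset, $lookup) = @_;
    # Step 1: ask gettext to hand out this domain's messages as UTF-8.
    $bind_codeset->($domain, 'UTF-8');
    return sub {
        my ($msgid) = @_;
        # Step 2: decode the octets so Perl sees a character string.
        return decode('UTF-8', $lookup->($domain, $msgid));
    };
}

# Stub bindings (assumptions, standing in for the real XS functions).
my %catalog = ('Summer' => "\xC3\x89t\xC3\xA9");   # UTF-8 octets of "Été"
my $requested_codeset;
my $translate = make_utf8_gettext(
    'mydomain',
    sub { $requested_codeset = $_[1] },
    sub { $catalog{ $_[1] } // $_[1] },             # fall back to the msgid
);

my $msg = $translate->('Summer');
printf "%s: %d characters\n", $requested_codeset, length $msg;
# prints "UTF-8: 3 characters"
```

The codeset is bound per domain here because, as noted above, an empty domain is not treated as an override for every domain.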

>>> and then you've to tag strings as utf8 see N() implementation in
>>> common.pm from drakx installer and tools:
>> The c:: package does not exist. That's pretty useless. The code is
>> damn lot complicated.
> the c package is available in our cvs at
> http://cvs.mandrakesoft.com/cgi-bin/cvsweb.cgi/gi/perl-install/

You didn't mention it before...

-------------------------------------------------------------------------------
Jan 'Bulb' Hudec <bulb ucw cz>