Re: glade localization woes



On Wed, Jun 16, 2004 at 09:03:06 +0200, Thierry Vignaud wrote:
Jan Hudec <bulb ucw cz> writes:

It seems that there is a bug in gettext. Perl has two types of
strings -- octet streams and Unicode. It's a mess, sometimes. Read
the perlunicode manpage to find out more about it. Now what really
should be done is forcing gettext to mark the string as a Unicode one
if it is one.
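
To make the two kinds of string concrete, here is a minimal sketch using
only the core Encode module and utf8::is_utf8. The variable names and the
describe() helper are made up for illustration; the octets just stand in
for what a C library would hand back.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Octets as a C library would return them: "e acute" encoded in UTF-8
# is the two bytes 0xC3 0xA9. Perl treats this as a plain byte string.
my $octets = "\xC3\xA9";

# The same text decoded into a Perl character (Unicode) string.
my $chars = decode('UTF-8', $octets);

# Hypothetical helper: report the length and the internal utf8 mark.
sub describe {
    my ($s) = @_;
    return sprintf 'length=%d utf8_flag=%s',
        length($s), utf8::is_utf8($s) ? 'on' : 'off';
}

print describe($octets), "\n";   # length=2 utf8_flag=off
print describe($chars),  "\n";   # length=1 utf8_flag=on
```

The bytes in memory are the same in both cases; only the mark differs, and
that mark is what the gtk2-perl side looks at.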

Is it anything I'm doing wrong, or is it a Perl Glade bug (which
is what I'm inclined to believe)?

No. It's a gettext bug. It returns a Unicode string, but does not mark
it as such. And all the glib-perl/gtk2-perl/gtk2-glade-perl stuff
honors that mark.

gettext does not have to tag a Perl string as utf8, since it's not aware
of Perl internals, and you cannot expect it to do so.

The binding is aware of the internals and should properly recode and
mark the string. The C interface to the gettext function returns just
character arrays.

you need to call some_module::bind_textdomain_codeset("mydomain", 'UTF8');
in order to get utf8 strings

That's a bug in the bindings. The binding of gettext SHOULD *convert*
the string to utf-8 from whatever gettext returns and mark it as a utf-8
string. After all, in other languages that only have utf-8 strings it
has to do the same.
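
Roughly what I mean, as a sketch in pure Perl rather than XS. This is not
the code of any existing binding: make_gettext(), the %catalog hash and the
lookup callback are hypothetical stand-ins for the raw C-level call, and
only Encode::decode is a real API here.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical wrapper factory. $raw_lookup stands in for the C-level
# gettext call and returns plain octets in $codeset, the way a char *
# arrives in the binding.
sub make_gettext {
    my ($raw_lookup, $codeset) = @_;
    return sub {
        my ($msgid) = @_;
        my $octets = $raw_lookup->($msgid);
        # Convert from the catalog's codeset into a Perl character
        # string; decode() also sets the internal utf8 mark as needed.
        return decode($codeset, $octets);
    };
}

# Fake catalog holding ISO-8859-1 octets, for illustration only.
my %catalog = ('Summer' => "\xE9t\xE9");

my $gettext = make_gettext(
    sub { exists $catalog{$_[0]} ? $catalog{$_[0]} : $_[0] },
    'ISO-8859-1',
);

my $translated = $gettext->('Summer');
printf "length=%d utf8_flag=%s\n",
    length($translated), utf8::is_utf8($translated) ? 'on' : 'off';
```

The caller never sees the codeset or the mark; it just gets a string that
the Gtk2 bindings can pass on without guessing.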

with this function defined in an XS file as:

==================================================
char *
bind_textdomain_codeset(domainname, codeset)
   char * domainname
   char * codeset
==================================================

That's nice. But:
    1) Undocumented
    2) Shouldn't be necessary -- the bindings should set utf-8 behind
       the scenes.

and then you have to tag the strings as utf8

see the N() implementation in common.pm from the drakx installer and tools:

The c:: package does not exist. That's pretty useless.
The code is damn complicated.

package common; # $Id: common.pm,v 1.201 2004/05/26 13:40:44 prigaux Exp $

use MDK::Common;
use MDK::Common::System;
use diagnostics;
use strict;
use run_program;
use vars qw(@ISA @EXPORT $SECTORSIZE);

[...]

sub sprintf_fixutf8 {
    my $need_upgrade;
    $need_upgrade |= to_bool(c::is_tagged_utf8($_)) + 1 foreach @_;
    if ($need_upgrade == 3) { c::upgrade_utf8($_) foreach @_ };
    sprintf shift, @_;
}

sub N {
    $::one_message_has_been_translated ||= join(':', (caller(0))[1,2]); #- see ugtk2.pm
    my $s = shift @_; my $t = translate($s);
    sprintf_fixutf8 $t, @_;
}
sub N_ { $_[0] }

[...]

sub translate_real {
    my ($s) = @_;
    $s or return '';
    foreach (@::textdomains, 'libDrakX') {
      my $s2 = c::dgettext($_, $s);
      return $s2 if $s ne $s2;
    }
    $s;
}

sub translate {
    my $s = translate_real(@_);
    $::need_utf8_i18n and c::set_tagged_utf8($s);

    #- translation with context, kde-like 
    $s =~ s/^_:.*\n//;
    $s;
}


sub untranslate {
    my $s = shift || return;
    foreach (@_) { translate($_) eq $s and return $_ }
    die "untranslate failed";
}

[...]

-------------------------------------------------------------------------------
                                                 Jan 'Bulb' Hudec <bulb ucw cz>



