Hi,

Here is a patch that tries to detect the encoding of each source file and then calls xgettext with the matching encoding option (--from-code). Please test and review it: I have no idea about Perl, so my change is mainly a cut-and-paste fix.

I will also prepare a gettext >= 0.12 check, so that new intltool releases require it in order to deal with UTF-8 source files.

Some comments: I get the file encoding from the "file" command, so I am not able to determine the encoding of XML and yacc files. I am therefore assuming that XML files are UTF-8, and that any other file that is not reported as UTF-8, ISO* or XML is ASCII.

Comments, ideas?

I will apply this patch to my intltool installation so the status pages will start to use it.

Cheers.

P.S.: This patch is an evolution of http://deaddog.org/~maddog/intltool-update.patch

-- 
Carlos Perelló Marín
Debian GNU/Linux Sid (PowerPC)
Linux Registered User #121232
mailto:carlos pemas net || mailto:carlos gnome org
http://carlos.pemas.net
Valencia - Spain

? .intltool-update.in.in.swp
Index: intltool-update.in.in
===================================================================
RCS file: /cvs/gnome/intltool/intltool-update.in.in,v
retrieving revision 1.85
diff -u -w -r1.85 intltool-update.in.in
--- intltool-update.in.in	25 May 2003 23:06:05 -0000	1.85
+++ intltool-update.in.in	14 Jul 2003 18:26:45 -0000
@@ -55,6 +55,7 @@
 my $conf_file; # remove later
 my %varhash = ();
 my %po_files_by_lang = ();
+my $encoding = "ASCII";
 
 # Regular expressions to categorize file types.
 # FIXME: Please check if the following is correct
@@ -251,6 +252,38 @@
     return "gettext\/$gettext_type";
 }
 
+sub determine_code ($)
+{
+    my $comments = $_;
+    my $code = $_;
+    my $filename;
+    my $type;
+    my $gettext_code="ASCII"; # All files are ASCII by default
+
+    if ($comments =~ /^[^#]/)
+    {
+        $code =~ s/^\[[^\[].*]\s*//;
+        $filename = "../$code";
+        $type=`file $filename | cut -d ' ' -f 2`;
+        if ($? eq "0")
+        {
+            if (!($type =~ /^[^ISO]/) or !($type =~ /^[^UTF]/))
+            {
+                $gettext_code=$type;
+                chomp $gettext_code;
+            }
+            elsif (!($type =~ /^[^XML]/))
+            {
+                $gettext_code="UTF-8"; # We asume that .glade and other .xml files are UTF-8
+            }
+        }
+
+    }
+
+    return $gettext_code;
+}
+
+
 sub find_leftout_files
 {
     my (@buf_i18n_plain,
@@ -517,14 +550,31 @@
     while (<INFILE>)
     {
         chomp;
+
+        my $gettext_code;
+
         if (/\.($xml_extension|$ini_extension)$/ || /^\[/)
         {
             s/^\[.*]\s*//;
             print OUTFILE "$_.h\n";
+            $gettext_code = &determine_code ("$_.h");
         }
         else
         {
             print OUTFILE "$_\n";
+            $gettext_code = &determine_code ("$_");
+        }
+
+        if ($gettext_code ne "" and $encoding ne $gettext_code)
+        {
+            if ($encoding eq "ASCII")
+            {
+                $encoding=$gettext_code;
+            }
+            elsif ($gettext_code ne "ASCII")
+            {
+                print "Warning: You should use the same file encoding for all your project files, but you have $encoding and $gettext_code files ($_).\n";
+            }
         }
     }
 
@@ -534,6 +584,7 @@
     system ("xgettext", "--default-domain\=$MODULE",
             "--directory\=\.\.",
             "--add-comments",
+            "--from-code\=$encoding",
             "--keyword\=\_",
             "--keyword\=N\_",
             "--keyword\=U\_",
@@ -675,7 +726,7 @@
         }
     }
 
-    if ($str =~ /^(.*)\${?([A-Z]+)}?(.*)$/)
+    if ($str =~ /^(.*)\${?([A-Z_]+)}?(.*)$/)
     {
         my $rest = $3;
         my $untouched = $1;
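
A note on the classification step in determine_code above: a pattern such as /^[^ISO]/ is a negated character class, so those checks only look at the first character of the "file" output. Also, "file" typically reports "ISO-8859" without a part number, which xgettext may not accept as an encoding name. The following is a minimal Perl sketch of one possible way to tighten this; it is not part of the patch. The helper names classify_file_output, merge_encoding and detect_encoding are hypothetical, and mapping "ISO-8859" to ISO-8859-1 is an assumption that will be wrong for projects using another Latin variant.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not in the patch): map the description printed by
# `file -b` to an encoding name suitable for xgettext --from-code.
sub classify_file_output
{
    my ($description) = @_;

    return "UTF-8"      if $description =~ /\bUTF-8\b/;
    # ASSUMPTION: `file` gives no part number, so Latin-1 is guessed here.
    return "ISO-8859-1" if $description =~ /\bISO-8859\b/;
    # Same assumption as the patch: XML files are taken to be UTF-8.
    return "UTF-8"      if $description =~ /\bXML\b/;
    return "ASCII";     # everything else defaults to ASCII, as in the patch
}

# Hypothetical helper: fold one file's encoding into the project-wide value,
# following the rule in the patch (ASCII is compatible with any encoding).
sub merge_encoding
{
    my ($current, $new, $filename) = @_;

    return $current if ($new eq "ASCII" || $new eq $current);
    return $new     if $current eq "ASCII";

    warn "Warning: mixed encodings $current and $new ($filename)\n";
    return $current;
}

# Hypothetical helper: run `file -b` on one path. The list form of open
# avoids passing the file name through a shell.
sub detect_encoding
{
    my ($filename) = @_;

    open(my $fh, "-|", "file", "-b", $filename) or return "ASCII";
    my $description = <$fh>;
    close($fh);

    return "ASCII" unless defined $description;
    chomp $description;
    return classify_file_output($description);
}

# Small demonstration on canned descriptions, so it runs without `file`.
my $encoding = "ASCII";
for my $description ("ASCII C program text", "UTF-8 Unicode text", "XML document text")
{
    $encoding = merge_encoding($encoding, classify_file_output($description), "example");
}
print "$encoding\n";
```

Keeping the classification separate from the call to "file" makes the mapping easy to check on sample descriptions; in intltool-update itself, detect_encoding would be called once per POTFILES.in entry, with the result passed through merge_encoding before xgettext is run.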