Re: po files and charset field



Hello! [Basque: "Kaixo!"]

On Thu, Dec 07, 2000 at 07:27:31AM +0100, Karl Eichwalder wrote:

> > PS: What would be the path for strings not in po nor in sgml nor in xml;
> > namely the ones in *.desktop and *.directory with no charset information
> > at all ?
> 
> And don't forget the hints files (yes, they are also in XML).

I excluded XML because XML is flexible enough to declare its own encoding
(the exact way to do so may still have to be defined, but it can be done).
However, I agree that an XML string that doesn't declare its encoding should
be in UTF-8 too.
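For what it's worth, standard XML parsers already behave that way: the encoding comes from the `<?xml ... encoding="..."?>` declaration, and the XML spec says a document without one is assumed to be UTF-8. A quick sketch in Python, with an invented `<hint>` element standing in for the hints files:

```python
import xml.etree.ElementTree as ET

# An XML document may declare its own encoding explicitly...
latin1_doc = ('<?xml version="1.0" encoding="ISO-8859-1"?>'
              '<hint>caf\xe9</hint>').encode('iso-8859-1')
assert ET.fromstring(latin1_doc).text == 'caf\xe9'

# ...and when there is no declaration, the XML spec requires the parser
# to assume UTF-8 (or UTF-16, detected via a byte-order mark).
utf8_doc = '<hint>caf\xe9</hint>'.encode('utf-8')
assert ET.fromstring(utf8_doc).text == 'caf\xe9'
```

Both documents carry the same text; only the byte-level encoding differs, and the parser sorts it out from the declaration (or its absence).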
 
> > Should they be made all utf-8 (I can take the task to convert all those
> > files if needed),
> 
> I'd say yes.  But it's arguable whether author have to write them
> directly with an UTF-8 editor or whether is better to build the final
> .desktop etc. files on the fly from other encoded input files (XML
> prefered).

Personally, I think it would be nice to include the strings in the .po files;
then the .desktop/sndlist/etc. files could easily be generated with some
script, calling iconv (the command-line front end to glibc's gconv
conversion machinery) when needed.
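A minimal sketch of what such a generation script could do, in Python; the `make_desktop` helper, the file contents, and the language strings are all invented for illustration (a real version would pull the msgstr entries out of each language's .po file, using the charset named in its Content-Type header, and write the result out as UTF-8):

```python
def make_desktop(name_c, translations):
    """Return the text of a .desktop file with Name[lang] entries.

    `translations` maps a language code to (legacy_encoding, raw_bytes),
    mimicking strings taken from files in their original charsets.
    """
    lines = ["[Desktop Entry]", f"Name={name_c}"]
    for lang, (encoding, raw) in sorted(translations.items()):
        # decode from the legacy charset; the resulting str would be
        # written back out encoded as UTF-8
        lines.append(f"Name[{lang}]={raw.decode(encoding)}")
    return "\n".join(lines) + "\n"

desktop = make_desktop(
    "Text Editor",
    {
        "de": ("iso-8859-1", b"Texteditor"),
        "ru": ("koi8-r", "Текстовый редактор".encode("koi8-r")),
    },
)
print(desktop)
```

The point is just that every legacy charset funnels through one decode step, so the merged file comes out uniformly in UTF-8, as in the pipeline quoted below.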

> prog.cs.desktop (ISO-8859-2) --> iconv --+
> prog.de.desktop (ISO-8859-1) --> iconv --+
> prog.es.desktop (ISO-8859-1) --> iconv --+---> prog.desktop (UTF-8)
> prog.fr.desktop (ISO-8859-1) --> iconv --+
> prog.ru.desktop (koi8-r) ------> iconv --+

The problem is that those files are somewhat "hidden" from the translators;
they aren't as easy to spot as the .po files.
Also, having one file (the .po file) per module and per language makes things
much simpler: either a language is translated or it is not.
 
>> and for which release of Gnome should it be mandatory that they will
>> be all in utf-8 (with conversion to the real charset asked by the user
>> through LC_CTYPE if needed).
> 
> 2.0?  For systems lacking glibc 2.2 it might be necessary to provide a
> configuration option to install encoded files with one of the
> traditional encodings (of course, them will have to accept other
> restrictions).

The desktop files don't use gettext, do they? So glibc 2.2 or not is
irrelevant; what is needed is that the libc provide a good implementation
of iconv(), and that the functions which look up the texts in the desktop
and similar files make use of it. The hard part, IMHO, would be finding
out which charset the user wants.
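In C that would be nl_langinfo(CODESET) to find the user's charset, then iconv() to convert; the same logic can be sketched in Python (the `to_user_charset` name is mine, and the replacement policy for unmappable characters is just one possible choice):

```python
import locale

def to_user_charset(utf8_bytes):
    """Re-encode UTF-8 text into the charset the user's locale asks for.

    Mirrors what a C program would do with nl_langinfo(CODESET) and
    iconv(); characters the target charset lacks are replaced instead
    of aborting the whole conversion.
    """
    try:
        locale.setlocale(locale.LC_CTYPE, "")      # honour LC_ALL/LC_CTYPE/LANG
    except locale.Error:
        pass                                       # unknown locale: stay in "C"
    codeset = locale.nl_langinfo(locale.CODESET)   # e.g. "UTF-8", "ISO-8859-1"
    return utf8_bytes.decode("utf-8").encode(codeset, errors="replace")
```

ASCII text survives this round trip under any sane codeset; non-ASCII characters survive exactly when the user's charset can represent them.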

So, as soon as that functionality is provided, all desktop and similar files
can be converted, and the next release after that may assume that all the
texts are in UTF-8.

Maybe the translated texts could also be checked to see whether they are
legal UTF-8; if they are, no problem, and if not, assume they are in the
legacy encoding for that locale. That would allow a mix of UTF-8 and legacy
encodings in those files (or, more likely, most files in UTF-8 and some
others, belonging to packages not yet updated, in legacy encodings).
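That check is cheap, since UTF-8 is strict about which byte sequences are legal. A sketch (the `decode_mixed` name and the explicit fallback-charset parameter are mine); one caveat is that a few legacy byte sequences happen to be valid UTF-8 as well, so the heuristic can occasionally guess wrong:

```python
def decode_mixed(raw, legacy_encoding):
    """Decode a translated string that may be UTF-8 or a legacy charset.

    Bytes that form legal UTF-8 are taken at face value; anything else
    is assumed to be in the locale's traditional encoding.
    """
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        return raw.decode(legacy_encoding)

# Both spellings of the same string come out right: a lone 0xE9 byte is
# not legal UTF-8, so the Latin-1 version falls through to the fallback.
assert decode_mixed("caf\xe9".encode("utf-8"), "iso-8859-1") == "caf\xe9"
assert decode_mixed("caf\xe9".encode("iso-8859-1"), "iso-8859-1") == "caf\xe9"
```

This is exactly the tolerant reading that would let updated and not-yet-updated packages coexist during a transition.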

> 
> -- 
> work : ke@suse.de                          |                   ,__o
>      : http://www.suse.de/~ke/             |                 _-\_<,
> home : keichwa@gmx.net                     |                (*)/'(*)

-- 
May all go well with you, [Walloon: "Ki ça vos våye bén,"]
Pablo Saratxaga

http://www.srtxg.easynet.be/		PGP Key available, key ID: 0x8F0E4975



