Re: encoding/charset of *.desktop files



Jungshik Shin wrote:
>   I was very surprised to find that various encodings/charsets are mixed
> together in *.desktop files. For instance, the gnorpm.desktop file has a lot
> of Name and Comment entries in many different languages. What surprised
> me is that each entry has a different encoding/charset. That is, entries for
> fr, de, es use ISO-8859-1, entries for zh_CN use GB2312 (EUC-CN), and
> entries for ko use EUC-KR. Is this by design?  I guess not.

Actually it is. The .desktop standard was designed this way: many
different translations in the same file, but no information at all about
the charset(s) used. Yes, I guess the standard is very broken in that respect.
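
To make the problem concrete, here is a minimal sketch of what a converter
has to do today: guess a charset for each localized entry from its locale
tag and re-encode the line as UTF-8. The table ASSUMED_CHARSET and the
helper names are hypothetical; the mapping is only the one observed in
gnorpm.desktop above, not something the format guarantees. A wrong guess
either raises UnicodeDecodeError or silently produces garbage, and a line
that is already UTF-8 would be double-encoded.

```python
import re

# Assumed legacy charset per locale. The .desktop format records no charset,
# so this table is a guess based on the entries seen in gnorpm.desktop.
ASSUMED_CHARSET = {
    "fr": "iso-8859-1",
    "de": "iso-8859-1",
    "es": "iso-8859-1",
    "zh_CN": "gb2312",
    "ko": "euc-kr",
}

# Matches localized keys such as b"Name[ko]=" and captures the locale tag.
LOCALIZED_KEY = re.compile(rb"^[A-Za-z0-9-]+\[([^\]]+)\]\s*=")


def line_to_utf8(raw: bytes) -> bytes:
    """Re-encode one raw .desktop line as UTF-8 using the assumed table."""
    match = LOCALIZED_KEY.match(raw)
    if match is None:
        return raw  # unlocalized lines are left as they are
    locale = match.group(1).decode("ascii", "replace")
    charset = ASSUMED_CHARSET.get(locale)
    if charset is None:
        return raw  # unknown locale: leave untouched rather than guess
    return raw.decode(charset).encode("utf-8")


def convert(data: bytes) -> bytes:
    """Convert every line of a mixed-encoding .desktop file."""
    return b"\n".join(line_to_utf8(line) for line in data.split(b"\n"))
```

The fact that such a lookup table is needed at all is the core of the
complaint: the file itself gives a reader nothing to go on.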


> I'm wondering
> if there's any plan to clean up this 'mess' and convert them all to UTF-8.

Yes, this is supposed to be done for GNOME 2.0, afaik. But I've been
very wrong before...

Notice that many .desktop files in GNOME are automatically populated
with translations from the application's po files with the use of
xml-i18n-tools. But I think it was pretty much agreed upon that making
the use of UTF-8 in .desktop files mandatory is the only way to go in
the future.


> I tried to look up the gnome-i18n archive to find any message regarding
> this issue, but couldn't find any.

It has been discussed.
http://lists.gnome.org/archives/gnome-i18n/2000-December/msg00023.html
is the start of one thread about this.


>   If this has been discussed before, I'd like to be given a pointer
> and apologize for not being able to dig it up.
> 
>   Otherwise, I guess this
> has to be dealt with sometime before gnome 2.0 (if not before gnome
> 1.4).

GNOME 1.4 was already released this spring :)


> This is pretty important especially during the transition from
> legacy encodings (iso-8859-x, euc-jp, euc-kr, gb2312, big5, etc.) to
> UTF-8. During this transition period (and before that and long after
> that), there's no reliable one-to-one mapping between language and
> charset/encoding. In other words, unless GNOME settles on exclusive use of
> UTF-8 in *.desktop (or unless there's a definite fixed mapping between
> language and charset/encoding for entries in *.desktop), it can't be assumed
> that all [de] ([ko]) entries in *.desktop files are in ISO-8859-1 (EUC-KR).

I couldn't agree more. I think we should make the use of UTF-8 in
.desktop, .directory and .soundlist files mandatory as part of the 2.0
transition process.
If possible, a comment could also be added to converted files to
explain that they should be kept in UTF-8. But this is all just my
personal opinion.
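
As a rough illustration of that last idea, a minimal sketch follows. It
checks that a file decodes as UTF-8 and, if so, prepends a reminder
comment. The wording of UTF8_NOTE and the function names are hypothetical,
and the sketch assumes that lines starting with '#' are treated as
comments by whatever reads the file.

```python
# Hypothetical wording for the reminder; nothing mandates this exact text.
UTF8_NOTE = b"# This file must be kept in UTF-8.\n"


def is_utf8(data: bytes) -> bool:
    """Return True if the whole file decodes cleanly as UTF-8."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


def mark_as_utf8(data: bytes) -> bytes:
    """Prepend the reminder comment to an already converted file."""
    if not is_utf8(data):
        raise ValueError("file is not valid UTF-8; convert it first")
    if data.startswith(UTF8_NOTE):
        return data  # already marked, so do not add the note twice
    return UTF8_NOTE + data
```

A validity check like is_utf8 only says the bytes are well-formed UTF-8; it
cannot prove that a legacy-encoded string was converted correctly.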


Christian



> P.S. I'm not on gnome-i18n list and not even sure if this list is open
> or closed. If this list is open to non-subscribers, it'd be nice if you
> could CC me although I'll try to monitor the list using the list archive.

This list is open for everyone to subscribe to and for everyone to post to
(regardless of whether they are subscribed). You can subscribe at
http://mail.gnome.org/mailman/listinfo/gnome-i18n



