encoding/charset of *.desktop files



  Hello,

  I was very surprised to find that various encodings/charsets are mixed
together in *.desktop files. For instance, the gnorpm.desktop file has a
lot of Name and Comment entries in many different languages. What
surprised me is that the entries do not share a single encoding/charset:
entries for fr, de and es use ISO-8859-1, entries for zh_CN use GB2312
(EUC-CN), and entries for ko use EUC-KR. Is this by design? I guess not.
I'm wondering if there's any plan to clean up this 'mess' and convert
them all to UTF-8. I tried to search the gnome-i18n archive for any
message regarding this issue, but couldn't find any.
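
  To make the observation concrete, here is a minimal Python sketch of
how one might list the localized entries of a *.desktop file whose values
are not valid UTF-8. The function name non_utf8_entries and the regular
expression are my own illustration, not part of any GNOME tool, and the
check is only a heuristic: a legacy-encoded value could, in rare cases,
also happen to be valid UTF-8.

```python
import re

# Matches localized keys such as "Name[fr]=..." in the raw bytes of a file.
LOCALIZED = re.compile(rb"^([A-Za-z0-9-]+)\[([^\]]+)\]=(.*)$")

def non_utf8_entries(data: bytes):
    """Return (key, locale) pairs whose value does not decode as UTF-8."""
    bad = []
    for line in data.splitlines():
        m = LOCALIZED.match(line)
        if not m:
            continue  # group headers, unlocalized keys, comments, blanks
        key, locale, value = m.groups()
        try:
            value.decode("utf-8")
        except UnicodeDecodeError:
            bad.append((key.decode("ascii"), locale.decode("ascii", "replace")))
    return bad
```

  Reading the file in binary mode, e.g. open(path, "rb").read(), and
passing the result to this function gives the entries that would need
converting.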

  If this has been discussed before, I'd appreciate a pointer, and I
apologize for not being able to dig it up myself.

  Otherwise, I guess this has to be dealt with sometime before GNOME 2.0
(if not before GNOME 1.4). This is especially important during the
transition from legacy encodings (ISO-8859-x, EUC-JP, EUC-KR, GB2312,
Big5, etc.) to UTF-8. During this transition period (and before it and
long after it), there's no reliable one-to-one mapping between language
and charset/encoding. In other words, unless GNOME settles on exclusive
use of UTF-8 in *.desktop files (or unless there's a definite, fixed
mapping between language and charset/encoding for entries in *.desktop
files), it can't be assumed that all [de] entries are in ISO-8859-1 or
that all [ko] entries are in EUC-KR.
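
  As a rough illustration of what a one-time cleanup could look like,
the Python sketch below re-encodes localized entries to UTF-8. Note that
the ASSUMED_CHARSET table is exactly the kind of language-to-charset
guess I argue cannot be relied upon in general; it only mirrors what I
saw in gnorpm.desktop, and the names to_utf8 and ASSUMED_CHARSET are
hypothetical.

```python
import re

# ASSUMPTION: illustrative guesses only. No fixed language-to-charset
# mapping is defined for *.desktop files; these mirror gnorpm.desktop.
ASSUMED_CHARSET = {
    "fr": "iso-8859-1",
    "de": "iso-8859-1",
    "es": "iso-8859-1",
    "zh_CN": "gb2312",
    "ko": "euc-kr",
}

# Matches the "Key[locale]=" prefix of a localized entry in raw bytes.
LOCALIZED = re.compile(rb"^[A-Za-z0-9-]+\[([^\]]+)\]=")

def to_utf8(data: bytes) -> bytes:
    """Re-encode localized lines that are not already valid UTF-8."""
    out = []
    for line in data.split(b"\n"):
        m = LOCALIZED.match(line)
        if m:
            try:
                line.decode("utf-8")  # ASCII or already UTF-8: keep as-is
            except UnicodeDecodeError:
                locale = m.group(1).decode("ascii", "replace")
                charset = ASSUMED_CHARSET.get(locale)
                if charset is not None:
                    # Raises if the guess does not fit the bytes, which
                    # surfaces a wrong assumption instead of hiding it.
                    line = line.decode(charset).encode("utf-8")
        out.append(line)
    return b"\n".join(out)
```

  Lines for locales missing from the table are left untouched, so
anything the sketch cannot guess still has to be reviewed by hand.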

   Thank you in advance for your help/interest,

   Jungshik Shin

P.S. I'm not on the gnome-i18n list, and I'm not even sure whether the
list is open or closed. If it is open to non-subscribers, it'd be nice if
you could CC me, although I'll try to follow the list through its archive.




