Re: .desktop files and encodings



And the patch I meant to include with the last mail:

--- DESKTOP_ENTRY_STANDARD.orig	Mon Feb 26 10:00:52 2001
+++ DESKTOP_ENTRY_STANDARD	Mon Feb 26 11:57:16 2001
@@ -1,8 +1,8 @@
 ------------
 
 Desktop Entry Standard
-Version 0.9
-May 05, 1999
+Version 0.9.1
+Feb 26, 2001
 
 Preston Brown <pbrown kde org>
 Jonathan Blandford <jrb redhat com>
@@ -22,9 +22,9 @@
 1. Basic format of the file
 ---------------------------
 
-Thes desktop entry files should have an extension of ".desktop" or
+These desktop entry files should have an extension of ".desktop" or
 ".kdelnk".  ".kdelnk" is deprecated, and is only maintained for
-backwards compatibility.  Determining file type on basis fo extension
+backwards compatibility.  Determining file type on basis of extension
 makes determining the file type very easy and quick.  When no file
 extension is present, the desktop system should fall back to
 recognition via "magic detection."  Desktop entries which describe how
@@ -62,10 +62,12 @@
 2. Possible value types
 -----------------------
 
-The value types recognized are string, regular expression, boolean
-(encoded as the string true/false), and numeric.  Strings may have
-locale-specific characters included, which should be identified as
-part of the key, as described below.
+The value types recognized are string, localestring, regular expression,
+boolean (encoded as the string true/false), and numeric.
+
+The difference between string and localestring is that the value for
+a string key must contain only ASCII characters, while the value of a
+localestring key may contain text in a localized encoding. (See section 4.)
 
 Some keys can have multiple values; these should be separated by a
 semicolon.  Those keys which have several values should have a
@@ -74,13 +76,33 @@
 3. Recognized desktop entry keys
 --------------------------------
 
-Keys may be postfixed by [val], where val is the LOCALE type of the
-string or numeric entry.  The locale entries should match those of the
-standard C library.  Locales which specify a specify a specific
-country should fall back to just the language name if no entry is
-available, i.e if the locale is set to pt_BR, and a key with [pt] is
-available, it should be used.  When no [language] postfix is present,
-the C locale [C] is assumed.
+Keys may be postfixed by [<locale>], where <locale> is the LOCALE type
+of the entry.  <locale> must be of the form lang[_COUNTRY][.ENCODING],
+where either _COUNTRY or .ENCODING may be omitted. If a postfixed key
+occurs, the same key must also be present without the postfix.
+
+When reading in the desktop entry file, the value of the key is
+selected by matching the current POSIX locale for the LC_MESSAGES
+category against the <locale> postfixes of all occurrences of the key,
+with the .ENCODING part stripped. (The .ENCODING is used when the
+Encoding key for the desktop entry file is Legacy-Mixed; see section
+4.1.)
+
+The matching is done as follows: if the current value of LC_MESSAGES is
+<lang>_<country>.<encoding>@<modifier>, then, if a key for
+<lang>_<country> is present, it will be used. Otherwise, if a key for
+<lang> is present, it will be used. If both of these are missing, the
+required key without a locale specified is used.  The encoding and
+modifier from the LC_MESSAGES value are ignored.
+
+For example, if the current value of the LC_MESSAGES category
+is de_DE, and the desktop file includes:
+
+ Name=Foo
+ Name[de]=Foo auf Deutsch
+
+Then the value used for the name key will be 'Foo auf Deutsch'. However,
+if a value is specified for Name[de_DE], then that will be used instead.
 
 Case is significant.  The keys "Name" and "NAME" are not equivalent.
 The same holds for group names.  Key values are case sensitive as
@@ -101,18 +123,19 @@
 
 Key		Description					Value Type	REQ?	MUST?
 -------------------------------------------------------------------------------------------------
+Encoding        encoding of the desktop entry file              string          YES     YES
 Version		version of Desktop Entry Specification		numeric (4)	NO	YES
-Name		name of the entry, need not match binary name	string		YES	YES
+Name		name of the entry, need not match binary name	localestring	YES	YES
 Type		the type of desktop entry			string (1)	YES	YES
 FilePattern	a list of regular expressions to match against	regexp(s)	NO	NO
 		for a file manager to determine if this entry's
 		icon should be displayed. Usually simply the
 		name of the main executable and friends.
 TryExec		filename of a binary on disk used to determine	string		NO	NO
 		if the program is actually installed.  If not,
 		entry may not show in menus, etc.
 NoDisplay	whether not to display in menus, etc.		boolean		NO	NO
-Comment		descriptive comment				string		NO	YES
+Comment		descriptive comment				localestring	NO	YES
 Exec		program to execute, possibly with arguments	string		NO	YES
 Actions		additional actions possible, see MIME type	string(s)	NO	YES
 		discussion in section 5
@@ -127,7 +150,7 @@
 TerminalOptions	if the program runs in a terminal, any options	string		NO	NO
 		that should be passed to the terminal emulator
 		before actually executing the program
-SwallowTitle	if entry is swallowed onto the panel, this	string		NO	NO
+SwallowTitle	if entry is swallowed onto the panel, this	localestring	NO	NO
 		should be the title of window
 SwallowExec	program to exec if swallowed app is clicked	string		NO	NO
 
@@ -153,7 +176,7 @@
 		the order in which to display files
 
 
-URL		if entry is Link type, the URL to access	string		NO	YES
+URL		if entry is Link type, the URL to access	string    	NO	YES
 
 -------------------------------------------------------------------------------------------------
 
@@ -178,8 +201,108 @@
     If the version number is not present, a "pre-standard" desktop entry
     file is to be assumed.
 
+4. Character set encoding of the file
+-------------------------------------
+
+Desktop entry files are encoded as lines of 8-bit characters separated
+by LF characters. 
 
-4. List of valid Exec parameter variables
+Except for comments and values of type localestring, only ASCII 
+characters are permitted in the file:
+
+ - Key names must contain only the characters 'A-Za-z0-9-'
+ - Group names may contain all ASCII characters except for control 
+   characters and '[' and ']'.
+ - Values of type string may contain all ASCII characters except
+   for control characters.
+ - Values of type boolean must be either the string 'true' or 'false'.
+ - Numeric values must be a valid floating point number as recognized
+   by the %f specifier for scanf.
+
+Comment lines are uninterpreted and may contain any character 
+(except for LF). However, using UTF-8 for comment lines that
+contain characters not in ASCII is encouraged.
+
+The encoding for values of type localestring is determined by the
+Encoding field of the desktop entry. This field should always
+be present. (However, many legacy files may not include it.) 
+
+Only two values for Encoding are currently defined: 'UTF-8', and 
+'Legacy-Mixed', and desktop files must not use any other value.
+Implementations must support the UTF-8 encoding, and may choose
+to support Legacy-Mixed in addition. For this reason, authors
+of desktop files are encouraged to use the value 'UTF-8'.
+
+If the file specifies an unsupported encoding, the implementation
+should either ignore the file, or, if the user has requested a direct
+operation on the file (such as opening it for editing), display an
+appropriate error indication to the user.
+
+In the absence of an Encoding line, the implementation may choose
+to autodetect the encoding of the file by using such factors
+as:
+
+ - The location of the file on the filesystem
+ - Whether the contents of the file are valid UTF-8
+
+If the implementation does not perform such auto-detection, it should
+treat a file without an Encoding key in the same way as a file with an
+unsupported Encoding key.
+
+4.1. The Legacy-Mixed encoding
+------------------------------
+
+The Legacy-Mixed encoding corresponds to the traditional encoding
+of desktop files in older versions of the GNOME and KDE
+desktops. In this encoding, the encoding of each localestring key
+is determined by the locale tag for that key, if any. For keys
+without a locale tag, the value must contain only ASCII 
+characters.
+
+If the locale tag includes an .ENCODING part, then that determines
+the encoding for the line. Otherwise, the encoding is determined
+by the language, or language-country pair from the locale, according
+to the following table.
+
+Encoding              Tags
+========              ====
+
+ARMSCII-8 (*):        by
+BIG5:                 zh_TW.Big5
+CP1251:               be bg
+EUC-CN:               zh_CN.GB2312
+EUC-JP:               ja
+EUC-KR:               ko
+GEORGIAN-ACADEMY (*): ka_GE.georgianacademy
+GEORGIAN-PS (*):      ka
+ISO-8859-1:           br ca da de en es eu fi fr gl it nl wa no pt pt sv
+ISO-8859-2:           cs hr hu pl ro sk sl sq sr
+ISO-8859-3:           eo
+ISO-8859-5:           mk sp
+ISO-8859-7:           el
+ISO-8859-9:           tr
+ISO-8859-13:          lt lv mi
+ISO-8859-14:          ga cy
+ISO-8859-15:          et
+KOI8-R:               ru
+KOI8-U:               uk
+TCVN-5712 (*):        vi vi_VN.TCVN
+TIS-620:              th
+
+Encodings marked with a (*) are not currently supported by the GNU C
+Library; for this reason, implementations may choose to ignore lines
+in desktop files with the corresponding tags. Desktop files with
+these tags are currently rare or non-existent.
+
+The encoding here is listed according to its canonical name in the 
+GNU C Library's iconv facility. The more common tags found
+with an encoding part are listed here, so that implementors can
+verify that the correct encoding will be used. (In particular,
+note the mismatch between zh_CN.GB2312, and the canonical name
+EUC-CN.)
+
+
+5. List of valid Exec parameter variables
 -----------------------------------------
 
 Each "Exec" field may take a number of arguments which will be
@@ -221,7 +344,8 @@
 %v - the name of the Device entry in the desktop file
 
 
-5. Detailed discussion of supporting MIME types
+6. Detailed discussion of supporting MIME types
+-----------------------------------------------
 
 It is in every desktop's best interest to have thorough support for
 mime types.  The old /etc/mailcap and /etc/mime.types files are rather
@@ -282,7 +406,7 @@
 one chosen for handling the MIME type.
 
 
-5. Extending the format
+7. Extending the format
 -----------------------
 
 If the standard is to be amended with a new {key,value} pair which
@@ -300,7 +424,7 @@
 different yet similar environments.
 
 
-6. Example Desktop Entry File
+8. Example Desktop Entry File
 -----------------------------
 
 [Desktop Entry]
@@ -325,8 +449,3 @@
 Exec=fooview --edit %f
 Name=Foo Viewer (edit image)
 Icon=fooview-edit.png
-
----
-  Preston Brown                                    Systems Engineer
-  pbrown redhat com                                Red Hat Software, Inc.
-
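
For implementors, the key-matching rule that the patch adds to section 3
can be sketched roughly as below. This is only an illustration in Python
and not part of the patch: the function names (select_value,
strip_encoding) and the dictionary layout for parsed entries are
assumptions made for the example, and it presumes the file has already
been parsed into (key, locale tag) pairs.

```python
import re

# Splits an LC_MESSAGES value of the form
# <lang>[_<country>][.<encoding>][@<modifier>] into its parts.
_LOCALE_RE = re.compile(
    r"^(?P<lang>[^_.@]+)"
    r"(?:_(?P<country>[^.@]+))?"
    r"(?:\.(?P<encoding>[^@]+))?"
    r"(?:@(?P<modifier>.+))?$"
)


def strip_encoding(tag):
    """Drop the .ENCODING part of a key's locale tag ('zh_TW.Big5' -> 'zh_TW')."""
    return tag.split(".", 1)[0]


def select_value(entries, key, lc_messages):
    """Pick the value of `key` for the given LC_MESSAGES value.

    `entries` is a hypothetical parsed form of one group: a dict mapping
    (key, locale_tag_or_None) to the raw value. The entry without a
    locale tag is required by the patch, so it serves as the fallback.
    """
    candidates = []
    match = _LOCALE_RE.match(lc_messages)
    if match:
        lang = match.group("lang")
        country = match.group("country")
        if country:
            candidates.append(f"{lang}_{country}")  # try <lang>_<country> first
        candidates.append(lang)                     # then <lang> alone
    # The encoding and modifier of LC_MESSAGES are ignored on purpose.

    # Index the localized occurrences of the key, with .ENCODING stripped.
    by_tag = {}
    for (entry_key, tag), value in entries.items():
        if entry_key == key and tag is not None:
            by_tag.setdefault(strip_encoding(tag), value)

    for candidate in candidates:
        if candidate in by_tag:
            return by_tag[candidate]
    return entries[(key, None)]
```

With the Name example from section 3, calling select_value with an
LC_MESSAGES value of de_DE picks the Name[de] value, and would prefer
Name[de_DE] if that key were present. Decoding the selected value
according to the Encoding key is a separate step and is not shown here.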

