Encoding in g_filename_to_uri()
- From: Federico Mena Quintero <federico ximian com>
- To: GTK+ development mailing list <gtk-devel-list gnome org>
- Subject: Encoding in g_filename_to_uri()
- Date: Thu, 15 Apr 2004 19:17:16 -0500
Here is what I think is a bug. Do this:
0. Make sure you are using GtkFileSystemUnix.
1. export LANG=es_MX.ISO8859-1
2. export G_FILENAME_ENCODING=@locale
3. gedit
4. Save a file called áéíóú.
5. Hit File/Open
6. Select that file
7. Gedit will tell you that the file does not exist and ask whether you
would like to create it.
The filename on disk is 5 bytes long with non-ASCII characters, as it is
in ISO8859-1. If it had been created without G_FILENAME_ENCODING, it
would be 10 bytes long and in UTF-8.
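To make the byte counts concrete, here is a minimal sketch in plain C. It assumes the name is áéíóú, which is what the escapes %E1%E9%ED%F3%FA spell in ISO8859-1. The identifiers name_latin1, name_utf8 and filename_bytes are made up for illustration and are not part of glib.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* "áéíóú" in ISO8859-1: one byte per character. */
static const char name_latin1[] = "\xE1\xE9\xED\xF3\xFA";

/* The same five characters in UTF-8: two bytes per character. */
static const char name_utf8[] = "\xC3\xA1\xC3\xA9\xC3\xAD\xC3\xB3\xC3\xBA";

/* Length in bytes of a NUL-terminated filename, which is what the
 * filesystem stores; it knows nothing about characters. */
static size_t filename_bytes(const char *name)
{
    assert(name != NULL);
    return strlen(name);
}
```

Calling filename_bytes() on the two arrays gives the 5 and 10 bytes mentioned above.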
Internally, Gedit uses gtk_file_chooser_get_uris(), and then for each
one of those it calls gnome_vfs_uri_exists(). Note that this has
nothing to do with whether you are using GtkFileSystemUnix or
GtkFileSystemGnomeVFS; Gedit uses gnome-vfs itself.
gtk_file_chooser_get_uris() gets the list of internal GtkFilePath, which
for the unix backend are filenames in the local encoding, and converts
them to URIs using g_filename_to_uri().
However, g_filename_to_uri() does essentially this:
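/* Simplified: the real function also takes a hostname and a GError. */
char *
g_filename_to_uri (char *filename)
{
  char *utf8_filename;
  char *escaped;

  /* Convert from the filename encoding to UTF-8 ... */
  utf8_filename = g_filename_to_utf8 (filename);
  /* ... then percent-escape the UTF-8 bytes and prepend "file://". */
  escaped = g_escape_file_uri (utf8_filename);

  return escaped;
}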
g_filename_to_utf8() takes the local encoding for filenames and uses it
to convert the filename to UTF-8. So, our 5-byte filename from above
gets converted into a 10-byte UTF-8 string. Later, g_escape_file_uri()
turns this into a percent-escaped string and prepends "file://". The
end result is something like
file:///home/federico/%C3%A1%C3%A9%C3%AD%C3%B3%C3%BA
which is a valid URI, but does *not* refer to the filename above. I
think the result should be
file:///home/federico/%E1%E9%ED%F3%FA
That is, the hexadecimal representation of "áéíóú" in ISO8859-1.
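The difference between the two URIs comes only from which bytes get escaped. The following is a rough sketch of byte-wise percent-escaping in plain C. escape_path_bytes() is a hypothetical helper, not glib's g_escape_file_uri(): its set of unescaped characters is a conservative guess, and it leaves out the "file://" prefix.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Percent-escape every byte of path that is not an ASCII letter, an ASCII
 * digit, or one of "-._~/". Hypothetical helper for illustration only.
 * Returns 0 on success, -1 if out is too small. */
static int escape_path_bytes(const char *path, char *out, size_t out_size)
{
    static const char hex[] = "0123456789ABCDEF";
    size_t o = 0;

    assert(path != NULL && out != NULL && out_size > 0);

    for (const unsigned char *p = (const unsigned char *)path; *p != '\0'; p++) {
        int keep = (*p < 0x80 && isalnum(*p)) || strchr("-._~/", *p) != NULL;

        if (keep) {
            if (o + 1 >= out_size)
                return -1;          /* no room for byte plus NUL */
            out[o++] = (char)*p;
        } else {
            if (o + 3 >= out_size)
                return -1;          /* no room for "%XX" plus NUL */
            out[o++] = '%';
            out[o++] = hex[*p >> 4];
            out[o++] = hex[*p & 0x0F];
        }
    }
    out[o] = '\0';
    return 0;
}
```

Fed the on-disk ISO8859-1 bytes, this yields the %E1%E9%ED%F3%FA form; fed the UTF-8 conversion of the same name, it yields the %C3%A1... form.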
When gnome-vfs gets the URI to see if it exists, it decodes it and fails
to locate the file because the URI is encoded incorrectly in glib.
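The failed lookup can be illustrated with a small sketch, assuming the consumer simply percent-decodes the path and compares the resulting bytes with what is on disk. I have not checked that gnome-vfs does exactly this. hex_value(), unescape_path() and uri_path_matches_disk() are hypothetical names, not gnome-vfs or glib API.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Value of one hexadecimal digit, or -1 if c is not a hex digit. */
static int hex_value(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    return -1;
}

/* Decode "%XX" sequences into raw bytes. Returns 0 on success, -1 on a
 * malformed escape, an embedded NUL, or a too-small output buffer. */
static int unescape_path(const char *escaped, char *out, size_t out_size)
{
    size_t o = 0;

    assert(escaped != NULL && out != NULL && out_size > 0);

    for (const char *p = escaped; *p != '\0'; p++) {
        unsigned char byte = (unsigned char)*p;

        if (*p == '%') {
            int hi = hex_value(p[1]);
            int lo = (hi < 0) ? -1 : hex_value(p[2]);
            if (hi < 0 || lo < 0)
                return -1;
            byte = (unsigned char)(hi * 16 + lo);
            p += 2;
        }
        if (byte == '\0' || o + 1 >= out_size)
            return -1;
        out[o++] = (char)byte;
    }
    out[o] = '\0';
    return 0;
}

/* 1 if the decoded URI path is byte-for-byte the on-disk name, else 0. */
static int uri_path_matches_disk(const char *escaped, const char *disk_name)
{
    char decoded[256];

    if (unescape_path(escaped, decoded, sizeof decoded) != 0)
        return 0;
    return strcmp(decoded, disk_name) == 0;
}
```

With the on-disk name "\xE1\xE9\xED\xF3\xFA", the %E1%E9%ED%F3%FA form matches, while the %C3%A1... form decodes to ten different bytes and does not.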
I think g_filename_to_uri() should not call g_filename_to_utf8() and
just pass the filename to g_escape_file_uri().
Is this analysis correct?
Federico