Re: Filename encodings and GLib
- From: Alexander Larsson <alexl redhat com>
- To: Owen Taylor <otaylor redhat com>
- Cc: gtk-devel-list gnome org, tml iki fi
- Subject: Re: Filename encodings and GLib
- Date: 15 Oct 2003 10:52:21 +0200
On Tue, 2003-10-14 at 16:40, Owen Taylor wrote:
> On Tue, 2003-10-14 at 04:30, Alexander Larsson wrote:
> > On Mon, 2003-10-13 at 17:14, Owen Taylor wrote:
> > > This is in reference to:
> > >
> > > http://bugzilla.gnome.org/show_bug.cgi?id=101792
> > >
> > > Right now, the GLib model is that there are three forms for a filename:
> > >
> > > A) "System filename form" ... NUL terminated byte sequence,
> > > no interpretation for user display
> > > B) UTF-8 form.
> > > C) URI form
> > ...
> > > - We still haven't figured out whether URI's encode UTF-8
> > > filenames or system filenames. Nautilus and GLib, I believe,
> > > are inconsistent about this.
> > Yeah. Which is a bit unfortunate. The problem with URIs is of course
> > that we can't rely on the encoding of the filename in general, since the
> > file could be on e.g. a remote ftp server with unknown encoding. Another
> > issue is also that nautilus must be able to handle misencoded filenames
> > so they can be renamed to something correct.
> Note that for ftp, http, etc, it's a non-issue. The encoding is whatever
> the remote server picks; the extent to which we can interpret the
> octets as a human-readable string will depend on the relevant RFC's.
> Really, the only question is for URI's we are generating ourselves,
> and in particular, for file:// URIs.
> I don't think nautilus is particularly unique in needing to handle
> misencoded filenames. And if it *is* unique, that still doesn't mean
> it can use a different file:// scheme than the rest of the desktop.
> If we believe that the straight octet encoding is correct, then we should:
> A) Fix GLib to do the same
> B) Push this as a mini-spec on freedesktop.org
I haven't given this a lot of thought, but off the cuff I think straight
octet encoding is the right way. So maybe we should do this.
> The main problem with straight octet-encoding of filenames is that
> at best you can only guess how to display them to the user as anything
> other than the literal URI.
The same is true for system-encoded filenames. That's why we have
g_filename_to_utf8(). In fact, the basic reason we need octet encoded
file: uris is to have a one-to-one mapping between "system filename
form" and file: uris, since some apps use file: uris instead of
filenames as the base format.
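To make the one-to-one mapping concrete, here is a minimal sketch in Python rather than GLib's C API. The helper names (filename_to_file_uri, file_uri_to_filename, display_name) are hypothetical and are not GLib functions. The sketch assumes an empty host part in the URI, and it falls back to a lossy UTF-8 decode for display. It only illustrates that percent-encoding the raw octets round-trips exactly, while the display string remains a guess:

```python
from urllib.parse import quote, unquote_to_bytes


def filename_to_file_uri(raw: bytes) -> str:
    """Percent-encode the raw filename octets, with no charset conversion."""
    # quote() accepts bytes and escapes every octet outside its safe set,
    # so non-ASCII bytes become %XX regardless of what encoding they are in.
    return "file://" + quote(raw, safe="/")


def file_uri_to_filename(uri: str) -> bytes:
    """Invert filename_to_file_uri(), returning the original octets."""
    prefix = "file://"
    if not uri.startswith(prefix):
        raise ValueError("not a file: URI")
    return unquote_to_bytes(uri[len(prefix):])


def display_name(raw: bytes) -> str:
    """Best-effort text for the user; a guess, not a reversible mapping."""
    # Assumption: try UTF-8 and substitute U+FFFD for undecodable octets.
    return raw.decode("utf-8", errors="replace")


# A filename whose bytes are Latin-1, i.e. not valid UTF-8.
latin1_name = b"/home/alex/r\xe9sum\xe9.txt"
uri = filename_to_file_uri(latin1_name)
print(uri)
print(file_uri_to_filename(uri) == latin1_name)
print(display_name(latin1_name))
```

The URI survives the round trip even though the name is misencoded, which is what an application needs in order to rename such a file. Only the display step loses information.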
Of course, this leads to the question: what does a file: uri look like on
Windows? If we choose UTF-8 as the system filename encoding on win32, will
the file: uris be compatible with IE?
Alexander Larsson Red Hat, Inc
alexl redhat com alla lysator liu se
He's an immortal guerilla card sharp on the edge. She's a manipulative
extravagent barmaid with a flame-thrower. They fight crime!