Re: Can someone please comment on this short program



On Wed, 2005-12-14 at 14:29 +0100, dannym wrote:
> Hi,
> 
> On Wednesday, 14.12.2005, at 13:30 +0100, Alexander Larsson wrote:
> > On Wed, 2005-12-14 at 07:52 +0100, Matthias Kaeppler wrote:
> > 
> > > .. lots of stuff...
> > 
> > I'm not sure where the confusion is. I can't talk for the c++ bindings,
> > but for gnome-vfs itself the rule is:
> > 
> >   * file: URIs have no inherent encoding
> > 
> > None, nada, zilch. They are the filename as stored on disk, escaped into
> > a uri. As such, they are essentially escaped byte strings, the only
> > limit being that zero bytes and '/' bytes are not allowed. This is a
> > direct result of the fact that unix filenames have no encoding, and is
> > nothing we can do anything about in gnome-vfs.
> 
> I see... that is good to know :)
> 
> > 
> > In glib there are some functions for handling the encoding of filenames,
> > namely g_filename_to/from_utf8. These are mainly for displaying and
> > entering filenames in a UI, since the UI always uses utf8 for all
> > strings which might not be right for a filename. These functions use
> > G_BROKEN_FILENAMES and G_FILENAME_ENCODING env vars to determine what
> > encoding to use for the filenames. However, valid files can still have
> > names that are not in this encoding, and to be able to e.g. open or
> > rename such a file gnome-vfs must allow a uri to point to it, and such a
> > uri would not be in the encoding specified by G_FILENAME_ENCODING (nor
> > would it necessarily be utf8).
> 
> So basically "if you construct gnome_vfs_uri's, don't bother with
> g_filename_to/from_utf8 but just leave the filename alone and pass it
> as-is" ?

It depends a bit on where you get the filename string from. If you got
the string from e.g. readdir(), or read a filename from a file, you
know that the string is the exact byte string of the filename on disk.
You then construct a uri for it like this:

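char *uri_string;
GnomeVFSURI *uri;

/* pathname holds the raw on-disk bytes; this escapes them into a file: uri */
uri_string = gnome_vfs_get_uri_from_local_path (pathname);
uri = gnome_vfs_uri_new (uri_string);
g_free (uri_string);  /* free the temporary string, not the GnomeVFSURI */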

However, if you got the filename string from a normal UI element, e.g. a
GtkEntry (not a GtkFileChooser, I think that returns filename encoding),
it's always in utf8 encoding (as are all strings in a Gtk+ ui). So, you
must call g_filename_from_utf8 before gnome_vfs_get_uri_from_local_path.

It's also interesting to note that g_filename_to_utf8 can fail for a
filename on the disk, and that a trip through utf8 and back to filename
encoding is lossy. So, in a UI app you must always keep track of both
the "real filename" and the "display filename" which you use in the ui.
g_filename_display_name gives you the display name, which also handles
the case where the utf8 conversion failed.
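
To make the failure case concrete: a filename is just bytes, and not
every byte string is valid utf8. The following is only a sketch in plain
C, not what glib does internally; is_valid_utf8 is a made-up helper name
used for illustration.

```c
#include <stddef.h>

/* Hypothetical helper, for illustration only (not a glib function):
 * returns 1 if the NUL-terminated byte string s is well-formed UTF-8,
 * 0 otherwise. Rejects overlong forms, surrogates and values above
 * U+10FFFF. */
int is_valid_utf8(const char *s)
{
    const unsigned char *p = (const unsigned char *)s;

    while (*p != '\0') {
        unsigned long cp;   /* code point being decoded */
        unsigned long min;  /* smallest value allowed for this length */
        int extra;          /* continuation bytes still expected */

        if (*p < 0x80) {            /* plain ASCII byte */
            p++;
            continue;
        } else if ((*p & 0xE0) == 0xC0) {
            cp = *p & 0x1F; extra = 1; min = 0x80;
        } else if ((*p & 0xF0) == 0xE0) {
            cp = *p & 0x0F; extra = 2; min = 0x800;
        } else if ((*p & 0xF8) == 0xF0) {
            cp = *p & 0x07; extra = 3; min = 0x10000;
        } else {
            return 0;               /* stray continuation or bad lead byte */
        }
        p++;
        while (extra-- > 0) {
            if ((*p & 0xC0) != 0x80)
                return 0;           /* also catches a premature NUL */
            cp = (cp << 6) | (*p & 0x3F);
            p++;
        }
        if (cp < min || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
            return 0;
    }
    return 1;
}
```

With this, "caf\xc3\xa9.txt" (utf8 bytes) passes, while "caf\xe9.txt"
(the same name in ISO-8859-1) does not, even though both are perfectly
valid filenames on disk. That second kind of name is the one that needs
a separate display name.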

> > You just have to be very careful about your string encodings. I'd
> > recommend using hex escapes when specifying a non-ascii string in C
> > source code.
> 
> with append_file_name, does that do the escaping by its own (i.e. would
> it be escaped twice when following your suggestion) ? (and what about
> append_string, append_path)

gnome_vfs_uri_append_file_name() and gnome_vfs_uri_append_path() both
escape the string passed in. (i.e. you pass in filenames, not uris, and
filenames must always be escaped when put in a uri, so gnome-vfs does
that). gnome_vfs_uri_append_string() however, appends a piece of a uri,
and naturally, since uris are escaped, it's assumed to be escaped already
(and the docs say so).
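
The double-escaping worry can be shown with a small sketch in plain C.
escape_bytes below is a hypothetical helper, and its set of unescaped
characters is an assumption made for the example; it is not the actual
gnome-vfs escaping code.

```c
#include <stdlib.h>
#include <string.h>

/* Characters left as-is in this sketch (an assumption, not the
 * gnome-vfs rule set): ASCII letters, digits and "-_.~/". */
static int is_unreserved(unsigned char c)
{
    return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') ||
           (c >= '0' && c <= '9') ||
           c == '-' || c == '_' || c == '.' || c == '~' || c == '/';
}

/* Hypothetical helper: percent-escape every other byte of `in`.
 * Returns a malloc'd string the caller must free, or NULL on
 * allocation failure. */
char *escape_bytes(const char *in)
{
    static const char hex[] = "0123456789ABCDEF";
    const unsigned char *p = (const unsigned char *)in;
    char *out = malloc(3 * strlen(in) + 1);  /* worst case: all escaped */
    char *q = out;

    if (out == NULL)
        return NULL;
    for (; *p != '\0'; p++) {
        if (is_unreserved(*p)) {
            *q++ = (char)*p;
        } else {
            *q++ = '%';
            *q++ = hex[*p >> 4];
            *q++ = hex[*p & 0x0F];
        }
    }
    *q = '\0';
    return out;
}
```

Escaping the filename "my file.txt" once gives "my%20file.txt". Escaping
that result again gives "my%2520file.txt", which now names a different
file. This is why a function that escapes for you wants a filename, and
a function that appends a uri piece wants already-escaped input.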

However, this is a uri escaping thing. I was talking about C string
escaping, i.e. "\x40", to avoid problems with what encoding the source
file is in.
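
A short sketch of what hex escapes in C source look like; the variable
names and the example string are made up for illustration.

```c
#include <string.h>

/* "café" spelled with hex escapes, so the bytes in the compiled
 * program do not depend on the encoding of the source file. */
const char name_utf8[]   = "caf\xc3\xa9";  /* utf8: two bytes for the e-acute */
const char name_latin1[] = "caf\xe9";      /* ISO-8859-1: a single byte */

/* A hex escape consumes every hex digit that follows it, so split the
 * literal when the next character is itself a hex digit. */
const char at_abc[] = "\x40" "abc";        /* '@' followed by "abc" */

/* Byte length of the utf8 spelling (5, not 4). */
size_t name_utf8_length(void)
{
    return strlen(name_utf8);
}
```

The two spellings differ in length (5 bytes versus 4), which is exactly
the kind of difference that gets silently introduced when a non-ascii
literal is typed directly and the source file is later re-encoded.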

=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
 Alexander Larsson                                            Red Hat, Inc 
                   alexl redhat com    alla lysator liu se 
He's a bookish misogynist jungle king on the wrong side of the law. She's a 
sharp-shooting belly-dancing archaeologist who hides her beauty behind a pair 
of thick-framed spectacles. They fight crime! 



