Re: Unescaping uris



On Tue, 2006-09-12 at 23:28 +0200, Jesse van den Kieboom wrote:
> On Tuesday 12-09-2006 at 17:14 [timezone +0200], Alexander Larsson
> wrote:
> > > gnomevfs.unescape_string('file:///%E3%83%95.txt', '')
> > > -> 'file:///%E3%83%95.txt'
> > > 
> > > gnomevfs.unescape_string('sftp:///%E3%83%95.txt', '')
> > > -> 'sftp:///%E3%83%95.txt'
> > 
> > This looks like a bug in the python wrappers. The C functions return
> > "file:///フ.txt" and "sftp:///フ.txt".
> 
> Well, here it doesn't :( I tried it in C and it still doesn't properly
> unescape the %XX sequences when I use gnome_vfs_format_uri_for_display.
> I'm using ubuntu and have gnomevfs 2.16.0-0ubuntu1. Is this a bug in the
> latest gnomevfs?

Well, gnome_vfs_format_uri_for_display doesn't unescape URIs in general.
Unescaping is dangerous: it can cause a URI to be misinterpreted when you
try to parse it again.
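
To illustrate the reparsing hazard, here is a minimal sketch using Python's
standard urllib.parse rather than gnome-vfs itself. The URI is made up for
the example: an escaped '#' belongs to the file name, but once unescaped it
is read as the start of a fragment.

```python
from urllib.parse import unquote, urlsplit

# Hypothetical URI: the file name contains a literal '#', escaped as %23.
escaped = "file:///tmp/notes%231.txt"

# Parsing the escaped form keeps the whole file name in the path.
path_before = urlsplit(escaped).path

# Unescaping the whole URI first turns %23 into a real '#', which a
# parser then treats as the start of a fragment.
reparsed = urlsplit(unquote(escaped))

print(path_before)        # /tmp/notes%231.txt
print(reparsed.path)      # /tmp/notes
print(reparsed.fragment)  # 1.txt
```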

> I think I understand the differences now, thanks for explaining them to
> me. What I want to do is display file uris properly in gedit snippets
> (substituting environment variables like GEDIT_FILENAME and
> GEDIT_BASENAME). I think that what I need to use is
> format_uri_for_display; the only problem is that at the moment it is not
> working properly for non-file schemes.

It's not generally possible to unescape a URI and get a readable version
of it. You have no idea what the encoding of remote filenames is, and
even if you do, you can't guarantee that the unescaped strings are valid
in that encoding. The only thing that is guaranteed about URIs is that
they are valid ASCII (since non-ASCII is escaped).

If you're guaranteed never to roundtrip the string (i.e. try to parse
the resulting URI) and want to display it, I guess you could try
unescaping and verifying that it's valid UTF-8, which would help you on
some URIs. If you don't need to display the string (i.e. it doesn't have
to be UTF-8 or a known encoding) you can just unescape it.
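
The "unescape, then verify UTF-8" idea could look roughly like the sketch
below. It uses Python's standard library, not the gnomevfs bindings, and the
helper name display_name is hypothetical. Falling back to the escaped form
is one possible choice, not something gnome-vfs prescribes.

```python
from urllib.parse import unquote_to_bytes


def display_name(escaped_uri: str) -> str:
    """Return a display-only string; never parse the result as a URI."""
    # Unescape to raw bytes, since the remote encoding is unknown.
    raw = unquote_to_bytes(escaped_uri)
    try:
        # Strict decoding raises if the bytes are not valid UTF-8.
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # Fall back to the escaped form, which is plain ASCII.
        return escaped_uri


print(display_name("sftp:///%E3%83%95.txt"))  # sftp:///フ.txt
print(display_name("sftp:///caf%E9.txt"))     # sftp:///caf%E9.txt
```

The second URI carries a Latin-1 style byte (%E9) that is not valid UTF-8
on its own, so it is shown in its escaped form.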

You talk about things like FILENAME and BASENAME above, so you should
probably use gnome_vfs_uri_extract_short_name() (unescapes, no guarantee
of encoding) or gnome_vfs_uri_extract_short_path_name() (returns escaped
form, guaranteed to be ASCII).
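
As a rough analogue of that distinction (escaped, ASCII-safe name versus
unescaped name of unknown encoding), here is a sketch using only Python's
standard library. The two helper names and the host are hypothetical, and
this does not claim to reproduce the exact behaviour of the gnome-vfs
functions.

```python
import posixpath
from urllib.parse import unquote_to_bytes, urlsplit


def escaped_short_name(uri: str) -> str:
    """Last path segment, still escaped (plain ASCII for a valid URI)."""
    return posixpath.basename(urlsplit(uri).path)


def unescaped_short_name(uri: str) -> bytes:
    """Last path segment as raw bytes; the encoding is unknown."""
    # Split on '/' before unescaping so an escaped %2F stays in the name.
    return unquote_to_bytes(escaped_short_name(uri))


uri = "sftp://host/dir/%E3%83%95.txt"
print(escaped_short_name(uri))    # %E3%83%95.txt
print(unescaped_short_name(uri))  # b'\xe3\x83\x95.txt'
```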

=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
 Alexander Larsson                                            Red Hat, Inc 
                   alexl redhat com    alla lysator liu se 
He's a deeply religious coffee-fuelled messiah living undercover at Ringling 
Bros. Circus. She's a man-hating extravagant Valkyrie looking for love in all 
the wrong places. They fight crime! 



