Re: Unescaping uris



On Tue, 2006-09-12 at 11:47 +0200, Jesse van den Kieboom wrote:
> Hi,
> 
> gedit recently received a bug
> (http://bugzilla.gnome.org/show_bug.cgi?id=355477) about multibyte
> characters not being displayed properly.
> 
> I've been looking into the problem and I've encountered some things
> about unescaping that I don't really understand. As I understand there
> are functions to unescape uris and functions to format uris for display
> that all do approximately the same, but differ in a way I don't fully
> understand.
> 
> gnome_vfs_unescape_string
> gnome_vfs_unescape_string_for_display
> gnome_vfs_format_uri_for_display
> 
> What I'd like to know is what to use when, because they differ in
> behavior (especially for non file:// schemes). Some examples (python)
> with a file called フ.txt:
> 
> gnomevfs.unescape_string('file:///%E3%83%95.txt', '')
> -> 'file:///%E3%83%95.txt'
> 
> gnomevfs.unescape_string('sftp:///%E3%83%95.txt', '')
> -> 'sftp:///%E3%83%95.txt'

This looks like a bug in the Python wrappers. The C functions return
"file:///フ.txt" and "sftp:///フ.txt".


> In short, unescape_string_for_display seems to do about the same as
> format_uri_for_display, but format_uri_for_display removes the file
> scheme (which is what we are looking for in gedit). But, using
> format_uri_for_display on remote file schemes does not properly unescape
> the uri (which unescape_string_for_display does correctly). Is this a
> bug? If not, what's the rationale for not unescaping remote uris in
> format_uri_for_display. Should we use format_uri_for_display for local
> files and unescape_string_for_display on remote files? 
> 
> What I'd like to know is what the differences between the functions are
> and when to use what.

unescape_string and unescape_string_for_display do only one thing:
replace any substring of the form "%XX" with the byte whose hex value is
XX. The only difference with _for_display (really a bad name) is that
it's a bit lenient about the form. For instance, if the string contains
%00 or %P2 it still returns something, whereas unescape_string would
return NULL because the input was invalid.
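
To make the strict/lenient distinction concrete, here is a rough sketch in
plain Python. It is not the gnome-vfs code: the names unescape_strict and
unescape_lenient are made up for illustration, and having the lenient
variant copy a malformed "%XX" through unchanged is an assumption about
what "returns something" means.

```python
HEX_DIGITS = set(b"0123456789abcdefABCDEF")


def _decode_escape(data, i):
    """Return the byte value of a valid %XX escape at data[i], else None."""
    pair = data[i + 1:i + 3]
    if len(pair) != 2 or any(b not in HEX_DIGITS for b in pair):
        return None
    value = int(pair, 16)
    # %00 is treated as invalid, as described above.
    return value if value != 0 else None


def unescape_strict(escaped, illegal_characters=""):
    """Hypothetical strict variant: None stands in for the C NULL."""
    data = escaped.encode("utf-8")
    out = bytearray()
    i = 0
    while i < len(data):
        if data[i] == 0x25:  # '%'
            value = _decode_escape(data, i)
            if value is None or chr(value) in illegal_characters:
                return None
            out.append(value)
            i += 3
        else:
            out.append(data[i])
            i += 1
    return bytes(out)


def unescape_lenient(escaped):
    """Hypothetical lenient variant: malformed escapes are copied as-is."""
    data = escaped.encode("utf-8")
    out = bytearray()
    i = 0
    while i < len(data):
        value = _decode_escape(data, i) if data[i] == 0x25 else None
        if value is None:
            out.append(data[i])
            i += 1
        else:
            out.append(value)
            i += 3
    return bytes(out)
```

Both return raw bytes, since the unescaped result is not guaranteed to be
valid UTF-8; for the example file name,
unescape_strict('file:///%E3%83%95.txt').decode('utf-8') gives
'file:///フ.txt'.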

These functions are generally not all that useful by themselves, but are
useful e.g. when deconstructing URIs into parts or things like that.

format_uri_for_display(), however, is something that is useful for a
whole URI. It takes the URI and creates a UTF-8-validated string that
can be displayed in a UI, and it does the conversion in such a way that
if you call make_uri_from_input() on the result you are guaranteed to
get back the original URI. This is used, for instance, in the Nautilus
location entry.
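
The round-trip property can be sketched with a toy pair of functions in
plain Python. This is only an illustration of the idea, not the gnome-vfs
algorithm: format_for_display and uri_from_input are hypothetical names,
and the rules used here (show local files as a bare path, leave everything
else escaped, fall back to the escaped URI whenever the pretty form would
not convert back exactly) are assumptions.

```python
from urllib.parse import quote, unquote_to_bytes

FILE_PREFIX = "file://"


def uri_from_input(text):
    """Hypothetical inverse: turn a display string back into a URI."""
    if text.startswith("/"):
        # A bare absolute path is assumed to name a local file.
        return FILE_PREFIX + quote(text)
    # Anything else is assumed to already be a URI.
    return text


def format_for_display(uri):
    """Hypothetical display form that always converts back to `uri`."""
    if not uri.startswith(FILE_PREFIX + "/"):
        return uri  # non-local URIs are shown escaped in this toy
    raw = unquote_to_bytes(uri[len(FILE_PREFIX):])
    try:
        candidate = raw.decode("utf-8")
    except UnicodeDecodeError:
        return uri  # not valid UTF-8, so keep the escaped form
    # Only use the pretty form if it round-trips exactly.
    return candidate if uri_from_input(candidate) == uri else uri
```

With these definitions, 'file:///%E3%83%95.txt' is displayed as '/フ.txt',
and passing that back through uri_from_input() yields the original URI
again.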

I'm not sure exactly what you need to use for your application, because
you didn't really describe what you wanted to do.

=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
 Alexander Larsson                                            Red Hat, Inc 
                   alexl redhat com    alla lysator liu se 
He's a maverick drug-addicted paramedic with a secret. She's a tortured 
kleptomaniac safe cracker who dreams of becoming Elvis. They fight crime! 



