Re: Unescaping uris
- From: Alex Jones <alex weej com>
- To: gnome-vfs-list gnome org
- Subject: Re: Unescaping uris
- Date: Tue, 12 Sep 2006 11:42:00 +0100
Hi Jesse
A string such as "file:///フ.txt" is not a valid URI. It is a valid IRI
(Internationalized Resource Identifier), but supporting those opens up a
whole new can of worms (think "IDN"). For user convenience, we "display"
file URIs just like local paths, where URI percent-encoding is not
appropriate.
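
To make the URI/IRI distinction concrete, here is a minimal sketch using
Python 3's standard urllib.parse rather than gnome-vfs. The helper name
iri_to_uri is hypothetical, and it assumes the non-ASCII characters are to
be encoded as UTF-8, as in the escaped examples quoted below.

```python
from urllib.parse import quote, unquote

def iri_to_uri(iri):
    """Percent-encode the non-ASCII characters of an IRI as UTF-8 bytes.

    Hypothetical helper for illustration only; not part of gnome-vfs.
    """
    # Leave URI delimiters and '%' alone so existing escapes and the
    # scheme/path structure are preserved.
    return quote(iri, safe=":/?#[]@!$&'()*+,;=%")

iri = "file:///フ.txt"
uri = iri_to_uri(iri)
print(uri)           # file:///%E3%83%95.txt
print(unquote(uri))  # file:///フ.txt
```

The escaped form contains only ASCII and is a valid URI; the unescaped form
is what a user expects to see on screen.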
Use gnome_vfs_format_uri_for_display (gnomevfs.format_uri_for_display in
Python).
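
As a rough model of that behaviour, the sketch below reproduces only the
two outputs quoted further down: a file URI becomes an unescaped local
path, and anything else is returned untouched. It uses Python 3's
urllib.parse, the function name display_name is hypothetical, and the real
gnome-vfs function may well handle cases this does not.

```python
from urllib.parse import urlsplit, unquote

def display_name(uri):
    """Return a user-facing string for a URI (illustrative sketch only)."""
    parts = urlsplit(uri)
    if parts.scheme == "file":
        # Local file: drop the scheme and show the unescaped path.
        return unquote(parts.path)
    # Non-file scheme: keep the URI, escapes included, so it stays valid.
    return uri

print(display_name("file:///%E3%83%95.txt"))  # /フ.txt
print(display_name("sftp:///%E3%83%95.txt"))  # sftp:///%E3%83%95.txt
```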
On Tue, 2006-09-12 at 11:47 +0200, Jesse van den Kieboom wrote:
> Hi,
>
> gedit recently received a bug
> (http://bugzilla.gnome.org/show_bug.cgi?id=355477) about multibyte
> characters not being displayed properly.
>
> I've been looking into the problem and I've encountered some things
> about unescaping that I don't really understand. As I understand it, there
> are functions to unescape URIs and functions to format URIs for display
> that all do approximately the same thing, but differ in a way I don't fully
> understand.
>
> gnome_vfs_unescape_string
> gnome_vfs_unescape_string_for_display
> gnome_vfs_format_uri_for_display
>
> What I'd like to know is what to use when, because they differ in
> behavior (especially for non file:// schemes). Some examples (python)
> with a file called フ.txt:
>
> gnomevfs.unescape_string('file:///%E3%83%95.txt', '')
> -> 'file:///%E3%83%95.txt'
>
> gnomevfs.unescape_string('sftp:///%E3%83%95.txt', '')
> -> 'sftp:///%E3%83%95.txt'
>
> So, what does unescape_string actually do? I read that I shouldn't use
> it on full URIs. Okay, so what should I use on full URIs? Over to the
> display functions:
>
> gnomevfs.unescape_string_for_display('file:///%E3%83%95.txt')
> -> 'file:///\xe3\x83\x95.txt'
>
> gnomevfs.unescape_string_for_display('sftp:///%E3%83%95.txt')
> -> 'sftp:///\xe3\x83\x95.txt'
>
> Okay, so this one actually does what's expected, but what are these
> functions for in relation to format_uri_for_display:
>
> gnomevfs.format_uri_for_display('file:///%E3%83%95.txt')
> -> '/\xe3\x83\x95.txt'
>
> gnomevfs.format_uri_for_display('sftp:///%E3%83%95.txt')
> -> 'sftp:///%E3%83%95.txt'
>
>
> In short, unescape_string_for_display seems to do about the same as
> format_uri_for_display, but format_uri_for_display removes the file
> scheme (which is what we are looking for in gedit). But, using
> format_uri_for_display on remote file schemes does not properly unescape
> the uri (which unescape_string_for_display does correctly). Is this a
> bug? If not, what's the rationale for not unescaping remote uris in
> format_uri_for_display? Should we use format_uri_for_display for local
> files and unescape_string_for_display on remote files?
>
> What I'd like to know is what the differences between the functions are
> and when to use what.
>
> With kind regards,
>
>
--
Alex Jones <alex weej com>