Re: Can someone please comment on this short program



> Alexander Larsson wrote:
>> I'm not sure where the confusion is. I can't talk for the c++ bindings,
>> but for gnome-vfs itself the rule is:
>>
>>   * file: URIs have no inherent encoding
>>
>> None, nada, zilch. They are the filename as stored on disk, escaped into
>> a uri. As such, they are essentially escaped byte strings, the only
>> limit being that zero bytes and '/' bytes are not allowed. This is a
>> direct result of the fact that unix filenames have no encoding, and is
>> nothing we can do anything about in gnome-vfs.
>
> Yes, I actually think the problem lies in the C++ bindings, because as
> opposed to gnome_vfs_uri_new(), the construction of a Gnome::Vfs::Uri in
> gnome-vfsmm requires the string which is passed to Uri::create() to be
> encoded in UTF-8

I probably heard somewhere that all URIs are UTF-8 encoded. I didn't take
the time to look at the appropriate W3C specification, RFC, etc. I also
assumed that URIs are escaped.

In your test case, you are using the C++ wrapper for
gnome_vfs_get_uri_from_local_path(filepath).
I wonder what this does. Does it just add file:// at the start without
doing any escaping or encoding conversion?

http://developer.gnome.org/doc/API/gnome-vfs/gnome-vfs-gnome-vfs-utils.html#GNOME-VFS-GET-URI-FROM-LOCAL-PATH
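To make the question concrete, here is a minimal sketch of what a purely
byte-wise conversion would look like if the rule is "the filename as stored
on disk, escaped into a uri". This is only an illustration, not the gnome-vfs
implementation: the function name escape_local_path_sketch() is hypothetical,
and the exact set of characters that gnome-vfs leaves unescaped is an
assumption on my part.

```cpp
#include <string>

// Hypothetical helper, NOT the gnome-vfs implementation.
// Percent-escapes every byte of an absolute path that is not an ASCII
// letter, digit, '-', '_', '.', '~' or '/', and prepends "file://".
// No character-set conversion is attempted: bytes are escaped as they are.
std::string escape_local_path_sketch(const std::string& path)
{
    static const char hex[] = "0123456789ABCDEF";
    std::string uri = "file://";

    for (unsigned char c : path) {
        const bool keep =
            (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z') ||
            (c >= '0' && c <= '9') ||
            c == '-' || c == '_' || c == '.' || c == '~' || c == '/';

        if (keep) {
            uri += static_cast<char>(c);
        } else {
            // Escape the raw byte as %XX, whatever encoding it came from.
            uri += '%';
            uri += hex[c >> 4];
            uri += hex[c & 0x0F];
        }
    }
    return uri;
}
```

With a scheme like this, a filename stored in Latin-1 and the same name stored
in UTF-8 produce different URIs, and neither URI says which encoding was used.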

> (that's because create() takes a Glib::ustring, and if
> you pass an std::string in place of a Glib::ustring--such as the one
> returned from FileInfo::get_name()--a conversion attempt will be made).

I don't think that's true, but maybe I'm looking at the wrong part of the
Glib::ustring source. The ustring(std::string) constructor does a
simple assignment, with no conversion.

> If the string passed to create() is not valid UTF-8 however, an
> exception will be raised.
>
> On the other hand however, if the string used to initialize the
> Gnome::Vfs::Uri is valid UTF-8 (e.g. after a successful conversion using
> the glib functions), it is possible that after construction,
> Gnome::Vfs::Uri::uri_exists() will always return false /if/ the system
> uses a different encoding for the filename in question.
>
>> You just have to be very careful about your string encodings. I'd
>> recommend using hex escapes when specifying a non-ascii string in C
>> source code.
>
> it's not only about strings which are hardwired into the source code;
> the problem applies to all other strings as well.


Murray Cumming
murrayc murrayc com
www.murrayc.com
www.openismus.com



