Re: [gthumb-list] FileData: the untold story



Dr. Michael J. Chudobiak wrote:
Jef Driesen wrote:
Is there a reason for the existence of all those different
representations, other than historical ones? I think it would be a lot
easier (thus fewer bugs) to keep only one representation, and generate
the others on-the-fly when necessary. Or is the caching of the
alternative representations necessary for performance?

I want to drop path and name once the gio migration is complete. Sorry, I should have mentioned that!

utf8_path and local_path are the key fields that must stay, as you say.

utf8_name is just a const pointer into utf8_path, so it's no big deal.
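
To illustrate the "const pointer into utf8_path" idea, here is a minimal sketch in plain C. The helper name name_from_path is hypothetical and not taken from the gThumb sources; it only assumes that the path uses '/' as the directory separator.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper (not gThumb code): return the name part of a path
 * as a pointer into the same buffer. Nothing is allocated or copied. */
static const char *
name_from_path (const char *utf8_path)
{
    const char *slash;

    assert (utf8_path != NULL);

    /* The name starts right after the last directory separator. */
    slash = strrchr (utf8_path, '/');
    return (slash != NULL) ? slash + 1 : utf8_path;
}
```

The returned pointer is only valid as long as the path buffer itself is neither freed nor replaced, so the two have to be updated together.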

If we keep it, we also have to make sure it remains correct at all
times. If we generate it on-the-fly, that is not necessary, although that
might be more expensive depending on how often we need that piece of data.

Note that many of these problems could be eliminated by making the
contents of the struct private and providing accessor functions.
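
A rough sketch of what a private struct with accessor functions could look like, in plain C. All names here (DemoFileData and the demo_file_data_* functions) are made up for illustration and are not the gThumb API. Only one representation is stored, and the name is derived from it on every call, so it cannot go stale.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* In a real project only this forward declaration would go in the public
 * header; the struct body would stay in the .c file. */
typedef struct DemoFileData DemoFileData;

struct DemoFileData {
    char *utf8_path;    /* the single stored representation */
};

/* Duplicate a string with malloc; returns NULL on allocation failure. */
static char *
demo_strdup (const char *s)
{
    size_t len = strlen (s) + 1;
    char  *copy = malloc (len);

    if (copy != NULL)
        memcpy (copy, s, len);
    return copy;
}

DemoFileData *
demo_file_data_new (const char *utf8_path)
{
    DemoFileData *fd;

    assert (utf8_path != NULL);
    fd = malloc (sizeof *fd);
    if (fd == NULL)
        return NULL;
    fd->utf8_path = demo_strdup (utf8_path);
    if (fd->utf8_path == NULL) {
        free (fd);
        return NULL;
    }
    return fd;
}

void
demo_file_data_free (DemoFileData *fd)
{
    if (fd != NULL) {
        free (fd->utf8_path);
        free (fd);
    }
}

/* Replace the path. Returns 0 on success, -1 on allocation failure
 * (in which case the old path is kept). */
int
demo_file_data_set_path (DemoFileData *fd, const char *utf8_path)
{
    char *copy = demo_strdup (utf8_path);

    if (copy == NULL)
        return -1;
    free (fd->utf8_path);
    fd->utf8_path = copy;
    return 0;
}

const char *
demo_file_data_get_path (const DemoFileData *fd)
{
    return fd->utf8_path;
}

/* Generated on the fly from the path, so there is nothing to keep in sync. */
const char *
demo_file_data_get_name (const DemoFileData *fd)
{
    const char *slash = strrchr (fd->utf8_path, '/');

    return (slash != NULL) ? slash + 1 : fd->utf8_path;
}
```

Callers can no longer modify a field directly, so every change goes through demo_file_data_set_path. The cost is one strrchr per name lookup, which is the performance trade-off mentioned above.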

uri is handy for thumbnailing functions. I would like to keep it, just because I would probably forget to do the conversion otherwise. Maybe it could be renamed to thumb_uri or something, to discourage misuse of it.

Is this the uri of the thumbnail file? E.g. the uri of a thumbnail
located in ~/.thumbnails/.

I think the ones we should keep are the fd->utf8_path and fd->local_path
fields. Or maybe even better is to use GFiles everywhere? That would
have the benefit that we could pass them directly to all GIO functions.

Yes, we can easily add the gfile to the FileData struct if we find it useful and elegant (is that what you meant?). Native use of gfiles is the way of the future, but there is still a lot of filename handling required in gThumb.

Adding the gfile to the FileData struct is indeed what I meant.

Note that the filename encoding issue is quite complicated. Filenames as
stored in the filesystem are just byte strings (with or without a
particular charset encoding), while for display we always need a UTF-8
string. But the conversion is not always lossless, so you need to keep
both of them.

I think if we use utf8_path with gfile functions, and local_path with non-gfile functions, we're covered. Or am I wrong?

There are two issues here:

First there is the display vs filesystem encoding. Filenames as stored
in the filesystem do not need to be in a particular charset encoding.
Usually it's UTF-8 on modern unix systems, but that's not a requirement.
A filename can contain almost any byte (except for the null byte and the
directory separator). Since the result is not necessarily a valid UTF-8
(or other encoding) string, you can't display this string, but you can
still open() this file. (Of course a user won't be able to deal with
such a file in a useful way, because you can't display or type its
name.) So in general you should always keep the filesystem name and
convert to the display name, but not back.
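
As a sketch of that one-way conversion, the plain C code below builds a display name from the raw filesystem bytes: valid UTF-8 sequences are copied and every other byte is replaced with '?'. The function names are hypothetical, and the '?' replacement is just a choice for this example. In GLib-based code, g_filename_display_name() would be the natural thing to call instead. The raw name is never modified, so it can still be passed to open().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Return the length (1-4) of the valid UTF-8 sequence starting at s,
 * or 0 if the bytes at s do not form one. s must be NUL-terminated. */
static size_t
utf8_seq_len (const unsigned char *s)
{
    unsigned char c = s[0];
    unsigned long cp;
    size_t        len, i;

    if (c < 0x80)
        return 1;
    else if ((c & 0xE0) == 0xC0) { len = 2; cp = c & 0x1F; }
    else if ((c & 0xF0) == 0xE0) { len = 3; cp = c & 0x0F; }
    else if ((c & 0xF8) == 0xF0) { len = 4; cp = c & 0x07; }
    else
        return 0;   /* stray continuation byte or invalid lead byte */

    for (i = 1; i < len; i++) {
        /* A NUL terminator fails this test too, so we never read past it. */
        if ((s[i] & 0xC0) != 0x80)
            return 0;
        cp = (cp << 6) | (s[i] & 0x3F);
    }

    /* Reject overlong forms, surrogates and values above U+10FFFF. */
    if ((len == 2 && cp < 0x80) ||
        (len == 3 && cp < 0x800) ||
        (len == 4 && cp < 0x10000))
        return 0;
    if ((cp >= 0xD800 && cp <= 0xDFFF) || cp > 0x10FFFF)
        return 0;

    return len;
}

/* Write a displayable version of raw into out (at most out_size bytes,
 * including the terminator). Returns the number of bytes replaced by '?'.
 * The output is truncated if the buffer is too small. */
static size_t
make_display_name (const char *raw, char *out, size_t out_size)
{
    const unsigned char *s = (const unsigned char *) raw;
    size_t               o = 0, replaced = 0;

    assert (raw != NULL && out != NULL && out_size > 0);

    while (*s != '\0') {
        size_t len = utf8_seq_len (s);

        if (len == 0) {
            if (o + 1 >= out_size)
                break;
            out[o++] = '?';
            s++;
            replaced++;
        } else {
            if (o + len >= out_size)
                break;
            memcpy (out + o, s, len);
            o += len;
            s += len;
        }
    }
    out[o] = '\0';
    return replaced;
}
```

Because several different raw names can map to the same display name, the display name cannot be converted back reliably, which is why the filesystem name has to be kept alongside it.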

Second there is the issue with non-local files (anything that is not
accessed with the file:// scheme, like sftp). For those files you need
the corresponding local filename if you want to access them from code
(external libraries) that does not support those non-local files. That's
what the gvfs local mount point is for.

I think you are referring to the second case, while I'm talking more
about the first one. But both cases need to be dealt with. For instance,
I can create a filename in ISO-8859-1 encoding by starting eog with this
command

$ G_FILENAME_ENCODING=ISO-8859-1 eog

and choosing "Save As" with a filename that contains non-ASCII
characters. When I try to display this file in gthumb, it seems to work,
but with quite a few warnings about invalid UTF-8. With eog I only get
one warning.

BTW, eog is using GFile everywhere, and I think we should do that too. I checked the GFile implementation and, as I suspected, it uses the filesystem filename, and you can request the display name when necessary. When the filename cannot be converted to UTF-8 it is displayed incorrectly, but you can still access the file correctly.




