Re: Thinking about GtkFileSystem



On 13 Mar 2003, Owen Taylor wrote:

> I've started working on fleshing out the details of the 
> GtkFileChooser API, and one area that is definitely
> a bit less clear to me is the file-system backend
> that provides the glue between the user interface
> and different file system implementations.
> 
> This is going to be a semi-public API .. that is, the
> headers will be installed, but you'll have to #define
> something special to use them, and there won't be 
> strong guarantees of future compatibility.
> 
> I looked some at EggFileSystem, and a little bit at 
> GnomeVFS, but mostly this is just writing down stuff.
> 
> I'd really appreciate feedback from people with experience
> with GnomeVFS as to whether this looks reasonable
> to write a GnomeVFS backend for, and whether it will work
> in the slow-filesystem case.

It looks reasonable to me. I'll comment on some details below.

One thing that is totally left out of this is authentication. For any 
non-trivial filesystem implementation that does remote shares you will 
need to enter at least a username and password, and often more (domain, 
key phrases, etc.). Furthermore, it would be very painful for users if 
each app had to authenticate separately, as the user would have to type 
the username and password in each app reading files from a protected 
share. Any sane system must have a "daemon" process that remembers the 
entered passwords and pops up authentication dialogs when needed.

Now, how will this affect the API? Well, it means that any blocking I/O 
call could potentially block for a very long time, waiting for user input. 
If the file dialog just used blocking calls, this could stop the whole app 
from even repainting while the auth dialog is up. I think this means we 
have to go with an async I/O model. This ties in with some of the things 
discussed below.
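
As a rough sketch of what I mean by async (plain C, no GLib; every name 
here, such as FsQueue, fs_queue_submit and fs_queue_dispatch, is invented 
for illustration and is not part of the draft API): the caller hands over 
a callback and gets control back at once, and the callback runs later from 
the main loop, once the backend (and possibly the auth dialog) is done.

```c
#include <stddef.h>
#include <string.h>

/* All names here are hypothetical; none are part of the draft API. */
typedef void (*FsDoneFunc) (const char *uri, int error_code, void *user_data);

typedef struct {
  const char *uri;        /* borrowed; must outlive the request */
  FsDoneFunc  done;
  void       *user_data;
} FsRequest;

#define FS_MAX_PENDING 16

typedef struct {
  FsRequest pending[FS_MAX_PENDING];
  size_t    n_pending;
} FsQueue;

/* Queue a request and return immediately, without doing any I/O.
 * Returns 0 on success, -1 if the queue is full. */
int
fs_queue_submit (FsQueue *q, const char *uri, FsDoneFunc done, void *user_data)
{
  FsRequest *req;

  if (q->n_pending == FS_MAX_PENDING)
    return -1;

  req = &q->pending[q->n_pending++];
  req->uri = uri;
  req->done = done;
  req->user_data = user_data;
  return 0;
}

/* Meant to be called from the main loop once the backend has an answer
 * (for example after the auth daemon got a password). Completes every
 * queued request with error_code and returns how many callbacks ran. */
size_t
fs_queue_dispatch (FsQueue *q, int error_code)
{
  FsRequest batch[FS_MAX_PENDING];
  size_t i, n = q->n_pending;

  /* Copy first, so callbacks may submit new requests safely. */
  memcpy (batch, q->pending, n * sizeof (FsRequest));
  q->n_pending = 0;

  for (i = 0; i < n; i++)
    batch[i].done (batch[i].uri, error_code, batch[i].user_data);
  return n;
}

/* Example callback: counts completions in the int user_data points to. */
void
fs_count_done (const char *uri, int error_code, void *user_data)
{
  (void) uri;
  (void) error_code;
  *(int *) user_data += 1;
}
```

The point is only that fs_queue_submit never waits, so the app keeps 
repainting while the request is outstanding. A real backend would complete 
requests one at a time and report errors through GError.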

> How to represent a file/folder location
> =======================================
> 
> A couple of possibilities exist:
> 
>  - The URI as a string
> 
>  - An object. (Like EggFileSystemItem)
> 
> My preference here is to avoid the opaque objects and just use
> the URI strings. This could clearly lead to some inefficiency,
> but not having to convert back-and-forth is going to simplify
> the API a bunch.
> 
> It appears to me that the URI strings are going to need to be 
> uninterpreted ... something like a URI encoding a windows 
> filename can be non-obvious to traverse with string operations.
> 
> This means that the file system object will need a method
> something like:
> 
>   get_parent (uri);
> 
> as well as a list_children (uri);
 
I agree that the URI approach is probably better, although I know DV will 
scream at us for calling these things URIs.
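
To illustrate what "uninterpreted" means for callers, here is a small 
sketch in plain C. The FsBackend struct, toy_file_get_parent and fs_depth 
are all made-up names, and the toy backend only understands simple 
"file:///a/b" strings. The generic code never looks inside the URI; it 
only asks the backend for the parent.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical backend vtable; not the draft API. */
typedef struct {
  /* Returns a newly allocated parent URI, or NULL if uri is a root
   * (this toy version also returns NULL when allocation fails). */
  char *(*get_parent) (const char *uri);
} FsBackend;

/* Toy backend for "file:///a/b" style URIs without trailing slashes. */
char *
toy_file_get_parent (const char *uri)
{
  static const char root[] = "file:///";
  const size_t root_len = sizeof root - 1;
  const char *slash;
  size_t len;
  char *parent;

  if (strncmp (uri, root, root_len) != 0 || uri[root_len] == '\0')
    return NULL;                  /* not ours, or already the root */

  slash = strrchr (uri, '/');     /* never NULL: uri starts with root */
  len = (size_t) (slash - uri);
  if (len < root_len)
    len = root_len;               /* parent of file:///a is file:/// */

  parent = malloc (len + 1);
  if (parent == NULL)
    return NULL;
  memcpy (parent, uri, len);
  parent[len] = '\0';
  return parent;
}

/* Generic code: count the components below the root using only
 * get_parent(), never string operations on the URI itself. */
int
fs_depth (const FsBackend *fs, const char *uri)
{
  int depth = 0;
  char *cur = fs->get_parent (uri);

  while (cur != NULL)
    {
      char *next = fs->get_parent (cur);
      free (cur);
      cur = next;
      depth++;
    }
  return depth;
}
```

A backend for Windows filenames could implement get_parent completely 
differently and fs_depth would not need to change.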

> Filename encoding
> =================
> 
> A subject that's simply incredibly difficult, as we've discovered
> multiple times in the past.
> 
> Here's one possibility:
> 
>  - The URI strings are uninterpreted, except that they 
>    must be valid UTF-8, to allow display and printing. Practically 
>    speaking, they probably should always be in the 
>    ASCII subset of RFC 2396, but I don't think we'll have
>    any reason to enforce that in the code.
> 
>  - Every component (defined by calling get_parent() on
>    the URI repeatedly) has a display name that is UTF-8.
>    This display name is _not_ guaranteed to be unique
>    among children of the parent.
> 
>  - There is a function for creating a URI from a base
>    directory and UTF-8 child part.

This sounds fine to me, and is in fact probably as good as you can get.
 
> The difficult thing is interpreting things the user types
> into the file selector entry 'C:foo' 'http:www.gnome.org'
> 'http://www.gnome.org/My File', and so forth. Not sure
> what to do here except write some magic heuristics.

Yeah. I guess we could have some way for the filesystem implementations to 
tie in with this heuristic.

> Error handling
> ==============
> 
> Almost every operation needs to allow for errors because
> the underlying file system can change at any time.
> In general, we want to have friendly user strings for
> at least some of the errors, so using GError seems
> appropriate.
> 
> 
> The notification API
> ====================
> 
> Notification on individual files doesn't seem particularly useful ...
> if a folder is being displayed in the file selector, the system
> will typically want notification of changes to the directory
> or to any file in the directory. So, notification can be restricted
> to a folder-level granularity.
> 
> At that point, there are basically two options for the API:
> 
>  - Global signals on the file system object for all directories,
>    with methods to (ref-counted) monitor and unmonitor particular
>    folders.
> 
>  - Explicit monitor objects with signals just for one folder.
> 
> I suspect the latter choice is going to be more convenient.

I agree.
 
 
> Semantics of notification
> =========================
> 
> At first glance, it might seem desirable to have strict 
> consistency semantics:
> 
>  If a folder is being monitored, then if no change notifications
>  are received:
> 
>   A) All calls to list_children(folder_uri) will succeed
>   B) The results of two successive calls to list_children(folder_uri) 
>      will be identical
>   C) If a child is listed by list_children(folder_uri), then 
>      a call to get_info(child_uri) will succeed.
>   D) The results of any two calls to get_info(child_uri) will
>      be identical.
> 
>  (Would need to be elaborated to describe what happens when
>  notifications _are_ received.)
> 
> This is the level of consistency needed by GtkListStore. But
> such a high level of consistency isn't going to be found in
> any actual file system API... files can disappear at any
> point. So, implementing it would require the file system 
> object to actually keep a mirror of all the information about
> a monitored folder locally and only update it when sending
> change notifications.

This looks really, really hard to get, and even if you have it I'm not sure 
it's realistically useful. Take D for instance. Even if I do get a 
notification between two calls to get_info, there is no guarantee that I 
have entered the mainloop in between and received the FAM change event. Or 
do you mean it should cache the old values until the event has been 
dispatched?
 
> I think it's probably better to do this detailed mirroring in
> GUI code... one implementation possibility is that we might
> want to wrap a strongly consistent file system object around
> the real file system object to reduce the amount of error
> checking that has to be scattered through the code.

Yeah.
 
> Can incremental filling be piggybacked on top of notification?
> ============================================================
> 
> For remote network filesystems, incremental filling of 
> directories is interesting. Hopefully we can simply use the
> change-notification mechanism to accomplish this. 
> Considerations:
> 
>  - It means that we can only do incremental filling for
>    directories that are currently being monitored.
> 
>  - Relevant to the above discussion of notification 
>    semantics, in order to do incremental filling via
>    notification, you have to keep a complete local copy
>    of all 'interesting' information, just as you
>    do to provide strong guarantees about consistency.
> 
>    So, this would be an argument for doing the 
>    consistency creation in the file system rather than
>    in a wrapper ... you don't want to keep two entire
>    copies of the information around.

Now, this is interesting. The way FAM works is that when you start 
monitoring a directory you get a FAMExists event for each of the 
files in that directory (or a FAMDeleted if the directory doesn't exist).
I think the main reason for this is to avoid the various races you get 
between stat'ing/listing the directory and starting to monitor it. 

I think we could use a similar idea to get both async list_dir and to 
piggyback on the file notification API.
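
Roughly like this, as a plain C sketch. FolderMonitor, MonitorEvent and 
the functions below are invented names, not the draft API, and the 
end-of-listing marker is my own addition, modelled loosely on FAM's 
FAMEndExist. Starting a monitor replays the existing children as "exists" 
events, so the consumer builds its list from the same callback that later 
delivers real changes. Here the replay happens synchronously for brevity; 
a real backend would deliver it from the main loop.

```c
#include <stddef.h>

/* Hypothetical event kinds; not part of the draft API. */
typedef enum {
  MONITOR_FILE_EXISTS,   /* replayed for each file already in the folder */
  MONITOR_END_EXISTS,    /* the initial listing is complete */
  MONITOR_FILE_ADDED     /* a real change after the listing */
} MonitorEvent;

typedef void (*MonitorFunc) (MonitorEvent event, const char *uri,
                             void *user_data);

typedef struct {
  MonitorFunc func;
  void       *user_data;
} FolderMonitor;

/* Start monitoring: replay the current children, then mark the end. */
void
folder_monitor_start (FolderMonitor *monitor,
                      const char *const *children, size_t n_children,
                      MonitorFunc func, void *user_data)
{
  size_t i;

  monitor->func = func;
  monitor->user_data = user_data;

  for (i = 0; i < n_children; i++)
    func (MONITOR_FILE_EXISTS, children[i], user_data);
  func (MONITOR_END_EXISTS, NULL, user_data);
}

/* Called by the backend when a new file shows up later. */
void
folder_monitor_file_added (FolderMonitor *monitor, const char *uri)
{
  monitor->func (MONITOR_FILE_ADDED, uri, monitor->user_data);
}

/* Example consumer: the file dialog's view of the folder. */
typedef struct {
  size_t n_files;
  int    complete;   /* set once the initial listing has ended */
} Listing;

void
listing_on_event (MonitorEvent event, const char *uri, void *user_data)
{
  Listing *listing = user_data;

  (void) uri;
  switch (event)
    {
    case MONITOR_FILE_EXISTS:
    case MONITOR_FILE_ADDED:
      listing->n_files++;
      break;
    case MONITOR_END_EXISTS:
      listing->complete = 1;
      break;
    }
}
```

Since the listing and the change events arrive through one channel, there 
is no window between "list the directory" and "start monitoring" where a 
new file could be missed.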
 
> How to do the 'stat' operation?
> ===============================
> 
> There are various types of information that we want to get
> about files:
> 
>  Display name
>  Modification time (atime/ctime as well? I think not)
>  Icon
>  Mime Type
> 
> Getting them one by one if you need multiple items is expensive;
> there are basically two options, expensive in different ways
> 
>  - Get everything, and cache them.
>  - Retrieve each item separately.
>  
> It probably makes sense to have a single call similar to stat()
> that can get any combination of items.

I think just getting everything and caching it is gonna work fine, as 
long as we make sure to properly cache icon lookups so that the pixbufs 
for icons are shared.
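
For the icon part, something along these lines would do (plain C sketch; 
Icon is just a stand-in for a pixbuf, and IconCache, icon_cache_lookup and 
icon_cache_clear are made-up names). Every file with the same MIME type 
gets the same icon pointer, so the expensive load happens once per type.

```c
#include <stdlib.h>
#include <string.h>

/* Stand-in for a loaded pixbuf; hypothetical, for illustration only. */
typedef struct {
  char mime_type[64];
} Icon;

#define ICON_CACHE_MAX 32

typedef struct {
  Icon  *icons[ICON_CACHE_MAX];
  size_t n_icons;
  size_t n_loads;    /* how many times we had to "load" an icon */
} IconCache;

/* Return the shared icon for mime_type, loading it on first use.
 * Returns NULL if the cache is full, the name is too long, or
 * allocation fails. The cache owns the returned icon. */
Icon *
icon_cache_lookup (IconCache *cache, const char *mime_type)
{
  Icon *icon;
  size_t i;

  for (i = 0; i < cache->n_icons; i++)
    if (strcmp (cache->icons[i]->mime_type, mime_type) == 0)
      return cache->icons[i];

  if (cache->n_icons == ICON_CACHE_MAX
      || strlen (mime_type) >= sizeof icon->mime_type)
    return NULL;

  icon = malloc (sizeof *icon);
  if (icon == NULL)
    return NULL;
  strcpy (icon->mime_type, mime_type);   /* length checked above */

  cache->icons[cache->n_icons++] = icon;
  cache->n_loads++;
  return icon;
}

/* Free every cached icon and reset the entry count. */
void
icon_cache_clear (IconCache *cache)
{
  size_t i;

  for (i = 0; i < cache->n_icons; i++)
    free (cache->icons[i]);
  cache->n_icons = 0;
}
```

With real pixbufs the cache would hold a reference and hand out additional 
references, rather than owning the icons outright as it does here.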

Due to the auth issues discussed above maybe we should have an async stat.
I'm not sure how this would look. Perhaps the file_changed, file_added 
(and file_exists if we add that) signals on the monitor object could pass 
in a GtkFileInfo with the new data. 

> Draft API
> =========
> 
> typedef struct _GtkFileInfo        GtkFileInfo;
> 
> GtkFileInfo *gtk_file_info_new  (void);
> GtkFileInfo *gtk_file_info_copy (GtkFileInfo *info);
> void         gtk_file_info_free (GtkFileInfo *info);


Maybe a way to get a GtkFileInfoType mask for the FileInfo?
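
Something like the following is what I have in mind (plain C sketch; the 
InfoFields flags, FileInfo and the file_info_* functions are invented 
names, not the draft's GtkFileInfo or GtkFileInfoType). The info struct 
records which fields were actually filled in, and callers check the mask 
before trusting a field.

```c
#include <stddef.h>

/* Hypothetical field flags; the real GtkFileInfoType may differ. */
typedef enum {
  INFO_DISPLAY_NAME      = 1 << 0,
  INFO_MODIFICATION_TIME = 1 << 1,
  INFO_MIME_TYPE         = 1 << 2
} InfoFields;

typedef struct {
  unsigned    fields;             /* mask of the fields set below */
  const char *display_name;       /* borrowed string */
  long        modification_time;  /* seconds since the epoch */
} FileInfo;

/* The accessor suggested above: which fields does this info carry? */
unsigned
file_info_get_fields (const FileInfo *info)
{
  return info->fields;
}

/* Nonzero if every flag in 'fields' is present in the info. */
int
file_info_has (const FileInfo *info, unsigned fields)
{
  return (info->fields & fields) == fields;
}

void
file_info_set_display_name (FileInfo *info, const char *display_name)
{
  info->display_name = display_name;
  info->fields |= INFO_DISPLAY_NAME;
}

void
file_info_set_modification_time (FileInfo *info, long modification_time)
{
  info->modification_time = modification_time;
  info->fields |= INFO_MODIFICATION_TIME;
}
```

This way a backend that was only asked for the display name can return a 
partial info, and the caller can tell "not requested" apart from "empty".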
 
> struct GtkFolderMonitorIface
> {
>   GTypeInterface base_iface;
> 
>   /* Signals
>    */
>   void (*deleted)      (GtkFolderMonitor *monitor);
>   void (*file_added)   (GtkFolderMonitor *monitor,
> 		        const gchar      *uri);
>   void (*file_changed) (GtkFolderMonitor *monitor,
> 			const gchar      *uri);
>   void (*file_removed) (GtkFolderMonitor *monitor,
> 			const gchar      *uri);
> };

What about a "created" signal, for when you start monitoring something that 
doesn't exist yet? FAM goes into polled mode for active monitors on files 
that don't exist (since dnotify needs the file to exist to work).
 
 
> Miscellaneous questions about API
> =================================
> 
> - Supporting operations on a non-monitored directory is
>   going to add complexity to file system implementations.
>   Perhaps we should rename GtkFolderMonitor to GtkFileFolder
>   and move list_children() and get_info() to there. 

You might want to share folder data between several folder monitors 
though. If we add a "file_exists" signal to the monitor, that means the 
monitor has "state" for the user, and each user must create a private 
monitor object.
 
> - Is additional information needed in GtkFileInfo - 
>   is-symlink? atime? ctime? permissions?

It's easy to extend if we need more.

-- 
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
 Alexander Larsson                                            Red Hat, Inc 
                   alexl redhat com    alla lysator liu se 
He's a lonely playboy cyborg who knows the secret of the alien invasion. She's 
a beautiful blonde politician on the trail of a serial killer. They fight 
crime! 



