Re: GIO API review



On Tue, 2007-12-11 at 17:48 +0100, Michael Natterer wrote: 
> Hey everybody,
> 
> We've been doing a GIO API review in the last couple of days and
> here is the list of comments and issues we've come up with:
> 
> 
> General:
> ========
> 
> It seems GIO allows individual files to be included, this should be
> avoided like gobject does it:
> 
> #if !defined (__GLIB_GIO_H_INSIDE__) && !defined (GIO_COMPILATION)
> #error "Only <gio.h> can be included directly."
> #endif

Why? I know gobject does this, but I never understood why. It seems to
just make build times longer.

> A big issue is that GIO wastes namespaces massively:
>    g_app, GAsync, g_buffered, g_cancellable, g_content, g_data,
>    g_desktop, g_directory, g_drive, g_file, g_filename, g_filter,
>    g_icon, g_input, g_io, g_schedule, g_cancel, g_loadable,
>    g_local, g_memory, g_mount, g_native, g_seekable, g_simple,
>    g_themed, g_unix, g_vfs, g_volume, g_win32, g_push, g_pop
> 
> That's clearly too much and can be reduced. While these are not
> strictly "namespaces" (the namespace is g_), they partly clash with
> g_foos and g_bars we already have, or might need in the future. We
> stronlgy suggest to use a common g_io/GIO prefix for all
> functions/types in GIO.
> 
> GAsnc*        -> GIOAsync*
> G*Stream      -> GIOStream*
> GIcon         -> GIOIcon
> G*Icon        -> GIOIcon*
> GCancellable  -> GIOCancellable
> ...

I strongly oppose this. 

While gio is a separate library these are all symbols in the glib
module, and while it might be a problem at times we can handle any
conflict issues (since we control both libs and release them toghether).
So, I don't think the problem with this is that bad. I mean, gobject
doesn't call its symbols GObjectClosure, g_object_signal, GObjectValue,
etc, and this has not been a problem.

The negative aspect of such a change is that every symbol and name gets
longer, which is harder to read and harder to write. Also when you work
on a higher level you're forced to know which sub library of glib each
class is from (which you don't really care about, especially if these
types are exposed in higher level APIs). 

It also makes the symbol names weird. I mean GIOIcon? Is that in icon
somehow related to an io operation (no), or is it an generic icon type
(yes, that we hopefully will be using in a lot more APIs later)?

> Also, subclasses should probably append their name, not prepend it:
> 
> GFilterOutputStream -> GIOOutputStreamFilter
> GUnixOutputStream   -> GIOOutputStreamUnix
> ...
> 
> This makes the file and inheritence structure much clearer and would
> be consistent with stuff we already have (e.g. GtkTreeModelFilter,
> GtkTreeModelSort).

This was brought up on this list before during the initial gio reviews.
The original version in fact had things appended (at least at places, it
wasn't totally consistent). But it was all cleaned up to be consistent,
and after that discussion I picked the current model.

And I really think the current model is better. Its flows more natural
in human language and its what other programming languages do, Its also
easy to understand what name a class should have, while the postfix
model does not always make this obvious, especially with multiple levels
of inheritance:

GInputStream is a stream for input (shouldn't it then be GStreamInput?),
then we have an abstract subclass for this that adds some file specific
operations: GFileInputStream. What is the name of an implementation of
this for local files. In the prefix model its the easily readable
GLocalFileInputStream. What is it in the postfix model?
GInputStreamFileLocal? GFileInputStreamLocal? The first is the strictly
correct one i guess, but its a quite bad name, the second is slightly
more readable, but seems to imply that its parent is GFileInputStream,
which does not follow the postfix pattern.

> All progress callbacks should have a cancellable argument, otherwise
> you always need to pass the cancellable as user_data if your progress
> dialog has a cancel button.

While it certainly won't hurt to add it there I don't really see it as
much of an advantage either. You really want the cancel button to
immediately cancel the operation, not wait until the next progress
callback. Furthermore, many ops don't have a progress callback so you
need a cancellable object stored anyway. So, the natural thing is to
store the cancellable in the cancel dialog and then pass it into the
operation and use this for all cancellation. Then, when updating the
progress bar in the dialog cancellation is not involved at all.

> All parent instance members should be called "parent_instance" for
> consistency, "parent" is a far too common member name to be used
> like that.

Yeah, I'll change that.

> Nitpick: The number of padding elements in class structures varies, why?

I added a lot where it seemed likely things will be extended in the
future and less for things that weren't likely to change. This is mainly
to save memory. 

> GFile:
> ======
> 
> g_mount_for_location should probably be g_file_mount_for_location

g_mount_for_location is really not an operation on the particular file
you pass in. Its more of a global operation on the set of mounts the
mounts whatever mountpoint that contains the file you pass in. For
instance, if you pass in ftp://host.com/some/dir/file.txt this will
mount ftp://host.com/. 

I can see it sort of polluting the namespace though, so maybe we should
change it anyway. Opinions?

> Some async operations are not implemented:
>    _query_settable_attributes_async
>    _query_writable_namespaces_async
>    _delete_file_async
>    _trash_async
>    _make_directory_async
>    _make_symbolic_link_async
>    _copy_async
>    _move_async
> 
> Is there an ETA for these? Especially copy_async() and move_async()
> seem very important missing pieces.

These are all quite easy to add. I just haven't gotten around to it yet,
as nothing really needed it. Any enterprising soul wants to help with
this? Its a simple way to get into gio hacking.

> A General issue with GFile is that the name suggests that you are holding
> some kind of file handle, where you actually have nothing but a filename
> (or uri). This could be problematic since the API does not make clear
> which functions simply work on the filename representation in memory and
> which do actual I/O (read: which function is cheap and which is expensive).
>
> E.g. g_file_contains_file gives no hint about whether it's just doing
> string operations on the filename, or actual I/O.
> 
> Since GFile is merely an interface, it can probably not be stated generally
> which of the functions does I/O and which doesn't. It might however make
> sense to mention that in the docs for the built-in default implementation
> of local files.

I'm aware of this issue. However, I just haven't been able to come up
with a better name. There are however in the design a set of guarantees
of which operations cause i/o (in general, construction of GFile object
never does i/o), and I've tried to make it clearer what does i/o when i
could (for instance g_file_get_info was renamed g_file_query_info as
getters are not expected to do i/o). 

In general all functions that don't take a GCancallable don't do i/o.
But I should probably go through the docs and explicitly mark all the
operations that are guaranteed to not do I/O.

> GMountOperation:
> ================
> 
> GMountOperation -> GIOMountOperation
> 
> G_PASSWORD_FLAGS_ANON_SUPPORTED -> G_IO_PASSWORD_FLAGS_ANONYMOUS_SUPPORTED

I'll change this.

> Most of the API seems to deal with setting user/pass/choice in response to
> signals. Shouldn't there be one function to pass the entire set of response
> fields related to one signal instead of a combo of setters plus reply() ?

I guess there could be such calls, but the set of responses you should
set may depend on e.g. GPasswordFlags, so it will have a bunch of
optional args. I don't think its obviously a better API though (nor
obviously worse).

Most people will not use this API, instead they will just construct a
GtkMountOperation objects (which implements all required signals and
does UI for it), so I don't think its so important to change this.

> G*Monitor:
> ==========
> 
> GFileMonitor      -> GIOMonitorFile
> GDirectoryMonitor -> GIOMonitorDirectory
> 
> Wouldn't it make sense to have a common (perhaps abstract) base class here,
> or derive one from the other? The APIs are very similar at least.

They are very similar, but I'm not sure what advantage a common base
class would have. Its not likely that you pass around an object that is
either a file monitor or a directory monitor, so GMonitor would not be
used anywhere.

> GFileAttribute:
> ===============
> 
> GFileAttribute* -> GIOFileAttribute*
> 
> GFileAttributeValue: why is the union not a GValue?

GFileAttributeValue is smaller, and you might keep a lot of them around
(for a large directory say). Its also far less generic in that it has a
limited set of types that you can marshal over an IPC, which is required
for passing these around in GVfs.

> Our biggest issue is probably that the attributes are semantically
> very similar to object properties and should therefore follow the
> conventions introduced there:
> 
> Namespace should be separated from attribute name by '::', not ':', gobject
> does this for both properties and detailled signals, so GIO should name
> file attributes consistently.

Sure, i can do this.

> The use of '_' should be avoided, please use '-' instead. We put a
> great deal of effort into canonicalizing identifiers elsewhere (object
> property names and signal names), so GIO should follow that habit here
> for the sake of consistency.

I'll do this too.

> Also, there is no good reason to use abbreviations like "std" and "fs",
> we are not in the 60ies any more ;)
> 
> std -> standard
> fs  -> filesystem
> 
> std:type      -> standard::type
> std:is_hidden -> standard::is-hidden

standard::type is 14 bytes, std:type is 8, given 10 std:* attributes per
file and 1000 files in a directory, this means transfering 60000 more
bytes over dbus when reading this directory using gvfs. I'm not sure
this is a horrible number, as there are bound to be other slow parts
when involving gvfs (such as network traffic), but it might matter.

I don't mind changing this though. What do other people change?

> There also seems to be quite some duplication between GFileInfo and
> GFileAttributeValue for no good reason, such as:
> 
> g_file_info_get_attribute_string vs. g_file_attribute_value_get_string
> g_file_info_set_attribute_int32  vs. g_file_attribute_value_set_int32
> ...
> 
> Where these functions in GFileInfo just seem to create/get a
> GFileAttributeValue and call the corresponding API, in fact these seem
> to be the only calls to g_file_attribute_value_* inside GIO.

GFileAttributeValue was initially an implementation detail of GFileInfo,
but I made it publically availible so that i could implement
g_file_set_attribute() as a single vtable function instead of having one
for each type. Maybe this was the wrong move though, as its kinda ugly
and not used anywhere else. I could have set_attribute take a type + a
gpointer.

> It also seems like a good idea to provide a more generic API similar to
> g_object_get in order to get multiple values at the same time:
> 
> g_io_file_info_get_attributes (GIOFileInfo *info,
>                                 const gchar *first_attribute_name,
>                                 ...)

That would be nice.

> GCancellable:
> =============
> 
> GCancellable -> GIOCancellable
> 
> There is no need to break the common prefix for global things:
> 
> g_push_current_cancellable -> g_cancellable_push_current
> g_pop_current_cancellable  -> g_cancellable_pop_current

g_cancellable_push_as_current maybe?

> GIOScheduler:
> =============
> 
> Each function has a different namespace here, I suggest:
> 
> GIOSchedulerJobFunc
> GIOSchedulerDataFund
> g_io_scheduler_schedule_job
> g_io_scheduler_cancel_all_jobs
> g_io_scheduler_send_to_mainloop

Yeah, this seems like unnecessary pollution. I'll fix.

> Especially the "schedule" and "send_to_mainloop" functions are
> quite similar. Doesn't the first require a running main loop too?
> Either it's mis-naming or lack of documentation (or understanding on
> my side ;-)

I think it might be bad naming from my side. g_schedule_io_job()
schedules the run of a function in the threadpool.
g_io_job_send_to_mainloop() is used by the job to send something from
the job thread to the mainloop (oneway or blocking until its finished
running).

> GContentType:
> =============
> 
> Same namespace issues:
> 
> GContent* -> GIOContent*
> 
> g_content_type_guess:
>    problematic since not all contents can be guessed from file headers
>    (which is what this function probably does). Perhaps rather pass
>    a GFile or a GInputStream so the file's end is accessible too?
> 

I know that this is problematic, but some people still need this
functionallity. For instance, gedit needs to guess the type of the
document it has loaded for its syntax highlighting. And sometimes you
just have a name and want to guess what type it is (for instance when
creating a new document with a specific filename).

If you actually want to know the content type of a file you should not
use this, you should g_file_query_info the GFile for "std:content_type".




[Date Prev][Date Next]   [Thread Prev][Thread Next]   [Thread Index] [Date Index] [Author Index]