Re: Introspection binary format comments



On Fri, 2005-02-25 at 18:02 -0500, Owen Taylor wrote:
>I mentioned many of these to Matthias in person, but for the benefit
>of posterity and others who are interested:

Thanks for the writeup, Owen. I had already forgotten some of the
fine points when I got back to the computer...

> - Terminology needs some work. I don't think 'interface' can be
>   used for the things stored in 'entries', since that's confusing
>   with GInterface.

So the 'directory' contains 'entries', but what do these point to?
I.e., what is a suitable term for
function|callback|struct|boxed|object|interface|enum|flags|constant?
One option would be to just give up on meaningful terminology and
call them 'blobs'.

There is an even worse terminological confusion with respect to 'type'.
There are different 'blob types' (some of which I just mentioned),
there are GTypes and there are the parameter types described by 
the various TypeBlobs.
I guess one way to improve this a bit would be to rename the blob 
type fields to 'blob_type' and rename 'type_name' and 'type_init'
to 'gtype_name' and 'gtype_init'.

> - Does the format need hash tables in strategic spots to avoid
>   linear scans with strcmp() to find particular interfaces or
>   methods?

Did we come to a conclusion about this ? To give some context 
for the question, we assume that the names of entries are unique
inside the namespace, and the names of e.g. functions are unique
inside the interface they live in. We must expect clashes between
names of e.g. signals and functions in the same interface, though.
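
For reference, the linear scan in question would look roughly like the
sketch below. 'DirEntrySketch' and 'find_entry' are hypothetical names,
and the entry layout is an assumption; the sketch only relies on entry
names being unique inside the namespace.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical directory entry; not the actual on-disk layout. */
typedef struct {
  const char *name;    /* unique within the namespace */
  unsigned    offset;  /* where the blob for this entry lives */
} DirEntrySketch;

/* Linear scan with strcmp(); returns NULL if no entry has that name. */
static const DirEntrySketch *
find_entry (const DirEntrySketch *entries, size_t n_entries, const char *name)
{
  assert (name != NULL);
  for (size_t i = 0; i < n_entries; i++)
    if (strcmp (entries[i].name, name) == 0)
      return &entries[i];
  return NULL;
}
```

Because the names are unique, the first match is the only match. The
same scan over signals and functions of one interface would need to
know which kind it is looking for, since those names can clash.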

> - Does it actually need to be specified where the string pool
>   is in the metadata blob? Can't we just let people writing
>   the blob put the strings where they want?

I inherited this idea from the gmetadata prototype by Rob Melby.
I don't think there is a strong reason to keep the strings together 
in a pool, but it is important to keep them outside of the blobs,
in order to keep blob size fixed for most blob types.
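
A minimal sketch of what "strings outside of the blobs" buys us: the
blob stores only an offset, so its size does not depend on the length
of the name. 'ConstantBlobSketch' and 'blob_string' are hypothetical,
and the 32-bit offset width is an assumption.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical fixed-size blob: the name is an offset, not inline text. */
typedef struct {
  uint32_t name;   /* byte offset of a NUL-terminated string in the metadata */
  uint32_t value;  /* assumed payload, just to give the blob a second field */
} ConstantBlobSketch;

/* Resolve a string offset relative to the start of the metadata. */
static const char *
blob_string (const unsigned char *metadata, uint32_t offset)
{
  assert (metadata != NULL);
  return (const char *) (metadata + offset);
}
```

Nothing here cares whether the strings sit together in one pool or are
scattered between the blobs, as long as they are not inside them.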

> - It seems to me that we might want to use URL's instead of
>   short prefixes to uniquely identify namespaces.
>
>    http://pango.org/api/1.0/
>
>   We then could provide a short name for the namespace 'Pango'
>   in addition. This should avoid problems if two libraries on
>   the same system not used together use the same short namespace
>   and also deals with libraries with multiple major versions.

Makes sense. Let's try to avoid the URL vs URN debate on this 
one...

> - Are we really confident that the type information for a 
>   metadata blob will fit into 64k? Creating a metadata blob
>   for GTK+ probably will give us an answer for that ... if   
>   it's less than 32k, we are probably OK for just about
>   any conceivable library.

Not really confident, unfortunately. The blobs for types which 
are stored out-of-line are all >= 4 bytes, and we can't use the
first 256 indices, which leaves room for less than 16182 
different types. GTK+ currently exports ~2900 functions.

The main reason for storing the type information separately for
non-basic types was to keep the size of ArgBlobs small, since
they will make up the bulk of the metadata. If we have to go 
to 32bit offsets for types, then there is not much point in
storing the types separately, since most type descriptions 
will fit in 32 bits (only hashtables, arrays or lists of non-basic 
types and errors require more than that).

> - I don't really like packing all GType registered stuff
>   into GInterfaceBlob. It seems to me that it should
>   be split up - while there is a lot of commonality, there
>   are also a lot of differences. A common substructure
>   could be used to encapsulate the shared part.

My initial design had them all separate. I merged them 
together because the repository api felt a bit large when
these are represented by separate info structs. I'll
look at it again, maybe there is a way to split the blobs
without blowing up the repository api.

>   Also, it seems to me that stuff that isn't inherently
>   associated with a GType (enums, flags, boxed) should
>   have the GType as an optional part. So, the difference
>   between StructBlob and BoxedBlob could go away.

Nice idea.

> - For virtual functions, structure fields, boxed fields
>   finding offsets requires walking over the lists of fields 
>   and applying architecture dependent knowledge of struct
>   packing. Would it be better to just include the struct
>   offset into the metadata blob?
>
>   This would also allow us to omit private and reserved
>   fields from the data without compromising the ability
>   to find struct offsets.

I dropped the offsets at some point, because the .defs files
I looked at did not have them. I'll add them back. What is the
struct offset of a bitfield member, though ? 
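
To illustrate the bitfield problem: in C, offsetof() gives the byte
offset of an ordinary member, but it is not defined for bit-fields, and
compilers typically reject it. The sketch below uses a made-up struct
('WidgetSketch') and a made-up offset table; it is not taken from any
real header. Presumably a bit-field would need something like a byte
offset plus a bit position and width, but that is only a guess.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Made-up struct with ordinary members, bit-fields and a private member. */
typedef struct {
  int          width;
  int          height;
  unsigned int visible : 1;  /* bit-field: offsetof() is not defined for it */
  unsigned int mapped  : 1;
  void        *priv;         /* private, deliberately left out of the table */
} WidgetSketch;

/* Hypothetical per-field record as it could appear in the metadata. */
typedef struct {
  const char *name;
  size_t      offset;
} FieldOffsetSketch;

static const FieldOffsetSketch widget_fields[] = {
  { "width",  offsetof (WidgetSketch, width)  },
  { "height", offsetof (WidgetSketch, height) },
};

/* Look up a stored offset; returns (size_t) -1 for fields not listed. */
static size_t
field_offset (const char *name)
{
  assert (name != NULL);
  for (size_t i = 0; i < sizeof widget_fields / sizeof widget_fields[0]; i++)
    if (strcmp (widget_fields[i].name, name) == 0)
      return widget_fields[i].offset;
  return (size_t) -1;
}
```

With stored offsets, the private 'priv' member can simply be omitted
without affecting the lookup of the public fields.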

> - For ArgBlob, the use of receiver_owns_value for both in and
>   out is strange. We do forbid the in case in our APIs, but if
>   you want to allow that, maybe transfer_ownership would be
>   a better name.

Interestingly, I had that name earlier, but thought 
receiver_owns_value would be better since it is closer to 
the well-known caller_owns_value, but maybe not...

> - For ArgBlob, probably need a way to indicate that for out
>   parameters, NULL can be passed to mean "don't store"

Good point.
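
The C convention in question, as a minimal sketch with a hypothetical
'SizeSketch' type and getter: each out parameter is checked and only
written when the caller passed a non-NULL location.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical value type, only here to have something to return. */
typedef struct {
  int width;
  int height;
} SizeSketch;

/* Out parameters may be NULL, meaning "don't store this value". */
static void
size_sketch_get (const SizeSketch *size, int *width, int *height)
{
  assert (size != NULL);
  if (width != NULL)
    *width = size->width;
  if (height != NULL)
    *height = size->height;
}
```

A caller that only wants the width can then call
size_sketch_get (&size, &w, NULL). A binding needs the metadata to tell
it that passing NULL is allowed here.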

> - For SignalBlob, do we need a representation of collector
>   behavior? Generally this is impossible, but knowing when 
>   something is true-stops-emit might be useful.

Ah, I had a true_stops_emit flag in some earlier draft. Don't
know where that went...

> - FieldBlob uses SimpleTypeBlob. Could we have fields that
>   are arrays or lists?

At least the format can express it. Maybe we should restrict
it to basic types, for now ?

> - Do we need information about memory management for 
>   FieldBlob ... if I replace the value, do I need to free
>   the old one? Or do we restrict FieldBlob to things that
>   don't need memory management. (Think GtkTargetEntry
>   which has a string...)

Colin brought up the point that for caller-owns-value, we 
also need some memory management information, e.g. whether to call
g_free or g_strfreev. Does it make sense to associate that information 
with the type, or should it be provided separately ?
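
One way the "provided separately" option could look, as a rough sketch:
a small hypothetical enum stored next to the argument tells the caller
how to release the value. 'FreeKind' and 'release_value' are made up,
and plain free() stands in for g_free and g_strfreev so that the
example does not depend on GLib.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical release policy attached to a caller-owned value. */
typedef enum {
  FREE_NOTHING,  /* caller must not free the value */
  FREE_SIMPLE,   /* one allocation, as with g_free */
  FREE_STRV      /* NULL-terminated string array, as with g_strfreev */
} FreeKind;

/* Release a value according to its policy; returns the number of
 * allocations that were freed. */
static int
release_value (void *value, FreeKind kind)
{
  int n_freed = 0;

  if (value == NULL)
    return 0;

  switch (kind)
    {
    case FREE_NOTHING:
      break;
    case FREE_SIMPLE:
      free (value);
      n_freed = 1;
      break;
    case FREE_STRV:
      {
        char **strv = value;
        for (size_t i = 0; strv[i] != NULL; i++)
          {
            free (strv[i]);
            n_freed++;
          }
        free (strv);
        n_freed++;
      }
      break;
    default:
      assert (0 && "unknown FreeKind");
    }

  return n_freed;
}
```

If the policy were tied to the type instead, the switch would be on the
type description and the extra field would not be needed.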

> - Do all StructBlob types have constructors? Or, do we
>   want to support construction on the heap for things like
>   GdkRectangle or GtkTargetEntry?

Not sure. Do they? GdkRectangle does not.

Matthias



