Adding full introspection information (#139486)



Hi hackers,
this is intended as part I of the discussion, featuring a high-level
overview of the required functionality and the problems to solve. There
will be more strategic and planning stuff here, and fewer directly
technical issues that can be expressed in actual code. You can treat it
as a summary of discussion that happened here and there, mostly on IRC.
More concrete and to-the-point details will follow shortly, possibly
tomorrow.

WARNING: This post is rather long, not suitable for gonna-miss-my-bus-
in-5-minutes reading :)

So, here it goes:

As discussed in http://bugzilla.gnome.org/show_bug.cgi?id=139486 ,
GObject currently lacks certain aspects of introspection (and also of
reflection, but that's masked by the deficient introspection). In
particular, there's no decent way to discover, and subsequently call and
modify, the (v)methods of any given GObject.
This is the most serious obstacle to making language bindings first
class citizens in the GObject world. It impedes any development in
languages != C, and also contributes to the (bad and harmful) current
situation of Unix development, where the choice of development language
isn't a simple function of what you're most comfortable with, but
instead becomes a political decision with long-term consequences, and
each and every piece of software needs to be bound separately, giving a
"nice" N x M matrix of "libraries in language X you can use from
language Y". That's quite different from Win32 for example, where
binding is a basically unknown practice, and mostly you just use
whatever COM component is available. Sure, we all know COM is gross and
disgusting, but in this regard it gets its job done. What needs to be
done is to fill in the last remaining bits of support for language
bindings, thus getting rid of manual glue hell once and for all.

More specifically, the goals that should be accomplished by adding
introspection (hereafter called i12n, not to be confused with i18n) are
as follows:

1) The primary goal is to bring any language we have bindings for on par
with C. This means that for any given language there should eventually
exist exactly one set of bindings, for GObject, and nothing else. This
also means it should be possible to rewrite the whole of GTK+ in Python,
should anyone choose to do so, and not have apps notice anything (maybe
besides some slowdown :)

2) Also of primary interest is backwards compatibility. As this goes
into the stable 2.x series, it absolutely needs to stay compatible with
every existing GObject-using library, as well as any user of any said
library. It should be possible to mix-and-match old-style (custom glue)
bindings together with new-style generic bindings for libraries already
converted to take full advantage of the new i12n support.

3) Completeness and extensibility. Whatever the final form of the patch,
the newly introduced architecture needs to be future-proof. No "we can
do half of the cool things possible for now, and the rest maybe later".
If something is needed to fully implement any of the primary goals, it
should go in. We don't want to be stuck with "almost, but not quite
there" for another N years :). Also, the system should be extensible,
and possible for users to customize without changing the core.

4) Matching the other platforms available -- we have various other type
systems and runtimes, most notably .NET and the JVM; Python also has a
particularly expressive type system. These are capable entities with
rich functionality. Ideally it'd be possible to map GObject features 1:1
using the platform's idioms, and seamlessly expose GObject to the
platform, as well as the platform's facilities to GObject code.

5) Schedule -- Matthias has already marked the bug with 2.8 target
release, and that should really be the date of arrival

Ad 1. This part is the core of the new functionality. We need to
introduce the concept of a typelib, and create interfaces to query the
typelib for info about types. Truly making language bindings equal to C
as an implementation language requires a shift in principles -- .h files
are no longer the way of getting information about types; instead that
is the responsibility of the typelib. Headers, with appropriately
enhanced comment markup and scanning similar to that performed by
gtk-doc, will probably stay the primary source from which the typelib is
generated, as the bulk of our libs is written in C, but they will no
longer be the canonical source for bindings. Also, as we could very well
be generating headers from the typelib, not the other way around
(indeed, that will happen e.g. when using Python-implemented objects
from C), we need to make sure that we can recreate headers from the
typelib with (almost) byte-for-byte accuracy.

As Python and many other languages don't have a notion of headers, nor
any static type info whatsoever, the typelib needs to be well suited to
runtime patching and to information coming from multiple sources. The
most likely way of implementing that is some sort of TypelibChunks,
which together constitute the actual typelib as seen from the global
perspective. Individual chunks would be opaque in this scenario, with
the exact structure left for implementors to decide upon, and merged
into the high-level typelib by a sort-of linker.
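
To make the chunk idea a bit more concrete, here is a minimal sketch in
Python. Everything in it is hypothetical: TypelibChunk, Typelib, link()
and lookup() are names made up for illustration, not existing or
proposed GObject API, and chunks are modelled as plain dicts only to
keep the example short (the proposal leaves their structure opaque).

```python
class TypelibChunk:
    """One source of type information (hypothetical)."""

    def __init__(self, origin, types):
        self.origin = origin  # where the info came from
        self.types = types    # {type name: {method name: signature string}}


class Typelib:
    """Global view built by 'linking' chunks together (hypothetical)."""

    def __init__(self):
        self._types = {}

    def link(self, chunk):
        # Merge method tables. Later chunks may add to or patch earlier
        # ones, which is what runtime patching would need.
        for type_name, methods in chunk.types.items():
            self._types.setdefault(type_name, {}).update(methods)

    def lookup(self, type_name):
        # Return a copy so callers cannot mutate the linked view by accident.
        return dict(self._types[type_name])


static_chunk = TypelibChunk(
    "scanned headers", {"GtkWidget": {"show": "void (GtkWidget*)"}})
runtime_chunk = TypelibChunk(
    "python runtime", {"GtkWidget": {"show_all": "void (GtkWidget*)"}})

typelib = Typelib()
typelib.link(static_chunk)
typelib.link(runtime_chunk)
print(sorted(typelib.lookup("GtkWidget")))
```

The point is only that the query side sees one merged view, no matter
how many sources contributed to it or when they were linked in.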

Ad 2. The requirement of strict backward compat rules out every possible
approach that replaces the most constrained parts of the old system with
something more flexible and abstracted. In particular, whatever we do
about (v)methods, they need to stay perfectly valid C callables. This
restricts us to annotating the vtable with enough info to keep ourselves
informed, and able to move around smoothly. Also, we need to support the
legacy, oh-so-sophisticated method of overriding vmethods via a simple
my_object->vmethod = some_vmethod_implementation. As there is absolutely
no info given about the fact that a method is being overridden, we can't
do anything that relies on knowing which implementation is there
currently. In practice, that means bindings will have to generate thunks
at runtime, and stuff them where C function pointers are required. These
thunks would then marshal the C calling frame into a proper call to the
real implementation. For generating thunks, libffi should be good, as
AFAIK it is used by GCJ, the GCC Java compiler, which gives us hope of
something working :)
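
The thunk mechanism described above can be tried out today from Python,
since CPython's ctypes module is itself built on libffi. The sketch
below is only an illustration under assumptions: ShapeClass and its
"area" vmethod are made up and stand in for a real C class struct.

```python
import ctypes

# C-compatible signature of the vmethod: int (*area)(int width, int height)
AreaFunc = ctypes.CFUNCTYPE(ctypes.c_int, ctypes.c_int, ctypes.c_int)


class ShapeClass(ctypes.Structure):
    """Stand-in for a C class struct holding one vmethod pointer."""
    _fields_ = [("area", AreaFunc)]


def python_area(width, height):
    # The "real" implementation, living in the binding's language.
    return width * height


# Wrapping the Python callable makes ctypes build a closure via libffi:
# a real C function pointer that marshals the C call into a Python call.
# Keep a reference to it, or it may be garbage collected while in use.
thunk = AreaFunc(python_area)

klass = ShapeClass()
klass.area = thunk  # the legacy-style "klass->area = implementation"

# Calling through the struct field goes through the C function pointer.
result = klass.area(3, 4)
print(result)
```

Note that the struct itself learns nothing about who is behind the
pointer, which is exactly the situation the legacy override style
leaves us in.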

Another face of back-compat is the naming scheme -- as different
bindings made different decisions on how to map names, there could be
quite a bit of discrepancy here. For example, gtk_widget_show_all() is
named gtk.Widget.show_all() in PyGTK, and Gtk.Widget.ShowAll() in Gtk#.
In the end, it's the particular bindings' call to make sure things work,
but we need to give it some thought beforehand.
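
To show how mechanical the two mappings mentioned above are, here is a
naive Python sketch. The helper names are invented for this example, and
it assumes the caller already knows the namespace and type parts of the
C name; real bindings handle many more cases (multi-word type names such
as tree_view, for a start).

```python
def split_c_name(c_name, namespace, type_name):
    """Strip the 'gtk_widget_' style prefix and split the rest into words."""
    prefix = f"{namespace}_{type_name}_"
    if not c_name.startswith(prefix):
        raise ValueError(f"{c_name!r} does not start with {prefix!r}")
    return c_name[len(prefix):].split("_")


def pygtk_style(c_name, namespace, type_name):
    # gtk_widget_show_all -> gtk.Widget.show_all
    words = split_c_name(c_name, namespace, type_name)
    return f"{namespace}.{type_name.capitalize()}." + "_".join(words)


def gtksharp_style(c_name, namespace, type_name):
    # gtk_widget_show_all -> Gtk.Widget.ShowAll
    words = split_c_name(c_name, namespace, type_name)
    method = "".join(word.capitalize() for word in words)
    return f"{namespace.capitalize()}.{type_name.capitalize()}.{method}"


print(pygtk_style("gtk_widget_show_all", "gtk", "widget"))
print(gtksharp_style("gtk_widget_show_all", "gtk", "widget"))
```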

Ad 3. To put this simply -- no half-baked solutions, we're gonna be
stuck with it for the next couple of years. One particularly easy
extensibility feature I'd like to have is .NET-ish attributes support
(also called decorators) -- they're a really elegant way of solving the
traditional issue of conveying info that's not expressible using only
language constructs, but does not belong in the code either.
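
As a rough idea of what attribute support buys, here is how the same
thing looks with Python decorators. This is a sketch only: the
attribute() helper and the "since" and "transfer" keys are invented for
the example and are not part of any proposal.

```python
def attribute(**metadata):
    """Attach arbitrary key/value metadata to a function (hypothetical)."""
    def decorate(func):
        # Copy first so stacked decorators accumulate instead of clobbering.
        attrs = dict(getattr(func, "__attributes__", {}))
        attrs.update(metadata)
        func.__attributes__ = attrs
        return func
    return decorate


def attributes_of(func):
    """Return the metadata attached to a function, or an empty dict."""
    return dict(getattr(func, "__attributes__", {}))


@attribute(transfer="none")
@attribute(since="2.8")
def get_label(widget):
    # The code itself stays free of the extra info.
    return widget["label"]


print(attributes_of(get_label))
```

The function body says nothing about versions or ownership, yet tools
that care can still find that info next to the method it describes.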

Ad 4. Hmm, what can I say here -- they're there, they're going to stay.
The closer we can interoperate, the better. Also, having a rich platform
that can easily hook into the runtimes' capabilities could at least
partially remove the need to decide on any one of them, which has
generated much heated discussion in the past, and is no doubt going to
do so again.

Ad 5. Going into 2.8 means we should have something workable when 2.6
goes out, which is mid-December. That's not much time, but there's
already an initial implementation in the works. Also, we need at least
one set of bindings actively participating in testing the proposed
features. I hope to get someone from PyGTK in, as that's the binding set
I'm most interested in and familiar with.

Various semi-random thoughts:

- Should we have docstrings in? Mike (judging from experience with COM)
considered that a major misfeature, not used in practice, but I, being a
Python and Lisp user, feel exactly the opposite -- having inlined docs,
available whenever the type itself is available, is really, really
helpful, no matter how scarce and stripped-down these docs are. Of
course, having rich and complete docs is even better :)

- Should there be an API for calling introspected methods, or should we
just rely on generated thunks from libffi? Having it in GObject proper
would be easier on bindings, while doing that inside bindings with
libffi should be Easy Enough (TM), and would potentially relieve GObject
of an additional API to invent and support.

- I lied a bit about rewriting GTK+ in Python and not having apps notice
anything. There remains the problem of activating the implementation,
which is very different in Python than it is in C. I'm not sure that
should go right in with i12n; developing it as separate functionality
(at least conceptually, even if in the end it lands in GObject as well)
may be good.
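
On the question above of a call API versus doing it in the bindings:
the "inside bindings" variant can be sketched in a few lines of Python
with ctypes (which sits on top of libffi). The description format, the
type-name table and invoke() itself are all assumptions made up for this
example; the callee is a ctypes-made stand-in so that no shared library
is needed.

```python
import ctypes

# Hypothetical mapping from type names in a method description to ctypes.
CTYPES_BY_NAME = {"gint": ctypes.c_int, "gdouble": ctypes.c_double}


def invoke(address, description, *args):
    """Call the C function at 'address' using only its runtime description."""
    restype = CTYPES_BY_NAME[description["returns"]]
    argtypes = [CTYPES_BY_NAME[name] for name in description["args"]]
    prototype = ctypes.CFUNCTYPE(restype, *argtypes)
    # Instantiating a prototype with an integer yields a foreign function
    # object for that address.
    return prototype(address)(*args)


# Stand-in for a compiled C function: int add(int a, int b).
_AddFunc = ctypes.CFUNCTYPE(ctypes.c_int, ctypes.c_int, ctypes.c_int)
_add = _AddFunc(lambda a, b: a + b)  # keep alive while the address is used
add_address = ctypes.cast(_add, ctypes.c_void_p).value

description = {"returns": "gint", "args": ["gint", "gint"]}
print(invoke(add_address, description, 2, 3))
```

All the caller needs is an address and a description of the signature,
which is the part the typelib would have to supply either way.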

This should be enough to give an overview. If you have any comments,
please write.

Cheers,
Maciej

-- 
"Tautologizm to coś tautologicznego"
   Maciej Katafiasz <mnews2 wp pl>
         http://mathrick.org



