Re: [gtk-list] Re: GC and GTK+



Owen Taylor <otaylor@gtk.org> writes:

> Which means that you can have loops such as:
> 
>    LSO A  <====> GTKO A (parent)
>      |                   |
>      |                   |
>      ^                   v
>      |                   |
>      |                   |
>    LSO B  <====> GTKO B (child)
> 
> Which are simply impossible to detect without a unified GC scheme.
> This is why GTK has a manual destruction step that in
> non-pathalogical cases breaks enough links to prevent leaks.
> 
> But being 100% correct is impossible. Which really shows,
> that as you say, doing more than handling GTK's own
> memory management well (and providing hooks for language
> bindings) is probably too much.

I completely agree with that, but I'd say that we should at least try
to go as far as it makes sense.

I think we all agree that ref counting is not a real solution for the
GC problem.  But I still think that in the context of Gtk, and for its
own purposes, it is the right thing.  I don't think it is acceptable
to go all the way and implement some real-time, incremental,
generational, copying, locality-improving GC, as some language
implementation can afford to have.

I have so far only really mastered simple, non-incremental mark/sweep
collectors, but my impression is that going to an incremental GC
requires some support from the compiler or the OS (mprotect, etc to
implement read/write barriers).  This is too much for Gtk, I think.
Now, the Boehm collector might actually have solved all this, and I
actually have a very high opinion of it, but still.

My judgements could be complete off here, and we should at least try
the Boehm collector or make sure that Gtk works nicely with it (I
think it has some hooks that can be used to inject knowledge into the
collector that improves its performance.)

Anyway, what I would suggest as a viable approach is to make Gtk+ more
friendly to `tracing' collectors.  It should not implement such an
tracing collector itself, but it should provide enough hooks and
information to make it possible to combine the ref_counting of Gtk
with a tracing collector.

This means that Gtk needs to be able to enumerate all pointers to
GtkObjects that can possibly be part of an cycle.  This includes all
pointers from one GtkObject to other GtkObjects that are reflected in
the reference count of the other GtkObject and maybe others.  The good
thing about this is that we are allowed to err by missing some of
these pointers.  When we miss one that forms a cycle, then we get a
leak, but we do not collect non-garbage.  So, the more of the pointers
can be identified, the more closer to 100% correctness we get.

Additionally, Gtk needs to be able to enumerate all `foreign' objects
that have been handed to it (essentially all things that now have a
DestroyNotify).

As a first approximation to this (which isn't all that bad, I think),
I have implemented the following for guile-gtk: The set of traced
references is simply the set of children of a widget.  With this
information, guile-gtk is able to detect and collect the cycle
depicted above (and all other cycles that run over Scheme values and
GtkObject pointers that are traced).

Some details: only a subset of all GtkObjects are known to Guile.  The
known ones are called `internal' objects, the rest `external'.  The
reference count of an internal object is considered to be the sum of
`internal' references and `external' references.  An internal
reference is one that stems from an internal object and can be traced,
a `external' one comes from somewhere else.

Now, during the mark phase of its GC, Guile counts all internal
references for each internal object (it keeps a list of all internal
objects).  A internal object is only then marked when it has
`external' references.  The ref_count due to the proxy is taken into
account, of course.

For the example from above:

The list of internal objects is GTKO A, GTKO B.  GTKO A has no internal
references to it, GTKO B has one (from GTKO A).  When GTKO A has a ref
count of 1, it is not marked; likewise, when GTKO B has a reference
count of 2, it is not marked.  The two LSO aren't marked either,
because the GTKOs aren't marked.  Thus, both LSOs get collected during
the sweep phase and they take the GTKOs with them.

    LSO A  <====> GTKO A (parent)  1 internal reference (from LSO A)
      |                   |
      |                   |
      ^                   v
      |                   |
      |                   |
    LSO B  <====> GTKO B (child)   2 int refs (from LSO B and GTKO A)

Now, suppose that there is a reference to LSO A from another Scheme
object.  The tracing of the GTKO still decides not to make any of
them, but eventually (or before that) LSO A will be marked.  The
marking of a LSO will then proceed to mark all traced references of
its GTKO (without paying attention to any ref_counts).  So in the end
GTKO B will be marked and thus LSO B.  The cycle is preserved.

I think that only tracing children is already quite useful and we can
improve the cycle detection abilities easily by making more references
`traceable'.  This could be done by implementing more special case
knowledge into Guile-gtk, or by adding a generic tracing capability to
Gtk+.  Of course, making Gtk itself do the tracing is the Right Thing,
but we do not need to hurry with this to get a semi-decent cooperation
between Gtk's ref_counting and a tracing GC like Guile's.

The code is in

    http://www-nt.e-technik.uni-dortmund.de/m_mvo/guile-gtk-19980421.tar.gz

Be sure to read the NEWS for other goodies.



[Date Prev][Date Next]   [Thread Prev][Thread Next]   [Thread Index] [Date Index] [Author Index]