The bug hunt continues.




[ Elliot, please take note. ]

Hello everyone,

    So I have continued to search frantically for that Bonobo
UIHandler bug which has vexed me so much.  To refresh your collective
memories, the bug can be reproduced thusly:

    1. Update your ORBit and Bonobo.  Build, make install Bonobo.
    2. Run bonobo/samples/sample-container.
    3. Insert a paint component.
    4. Create two views of the component.
    5. Activate one.  Activate the next.
    6. Continue until you get a core dump.

On the surface, this looks like a bug in my code, because there is a
segfault in menu_toplevel_prune_compare_function.  In fact, the only
reason that this segfault is occurring is that an untested invariant
of the UIHandler code is being violated.

    Here's how it's supposed to work (get comfortable, this will take
a little while).  All the top-level menus and toolbar items are
managed by the top-level UIHandler object.  Each subdocument component
creates its own UIHandler object which communicates with the toplevel
in order to merge UI elements with the parent.

    When a subdocument is activated, it registers its UIHandler with
the top-level UIHandler and merges its menus.  When it is deactivated,
it unregisters its UIHandler.  The toplevel, upon receiving the
deregistration, accordingly deletes all the UI items which belong to
the containee which is unregistering itself.  This is done by
uih_toplevel_unregister_containee.

    The way that function works is fairly simple.  Let's just consider
the case of menus.  First, it iterates through all of the menus stored 
in its internal hash table.  It compares the CORBA Objref associated
with each of those menu items with the CORBA Objref of the
deregistering containee using CORBA_Object_is_equivalent().  If the
objrefs match, it adds that item to a list of items which are going to 
be deleted.  Let's call this the removal list.
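
    To make the removal-list step concrete, here is a minimal sketch.
It is not the Bonobo code: MenuItem, owner_is_equivalent and
build_removal_list are hypothetical names, a plain array stands in
for the hash table, and integer equality stands in for
CORBA_Object_is_equivalent().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the internal menu item structure. */
typedef struct {
    const char *path;   /* e.g. "/Paint/Brush" */
    int         owner;  /* stand-in for the owning UIHandler's objref */
} MenuItem;

/* Stand-in for CORBA_Object_is_equivalent(): plain integer equality. */
static int owner_is_equivalent(int a, int b)
{
    return a == b;
}

/*
 * Store a pointer to every item owned by `containee` in `removal`,
 * which must have room for `n_items` entries.  Returns the number of
 * pointers stored.
 */
static size_t build_removal_list(MenuItem *items, size_t n_items,
                                 int containee, MenuItem **removal)
{
    size_t count = 0;
    size_t i;

    assert(items != NULL && removal != NULL);

    for (i = 0; i < n_items; i++) {
        if (owner_is_equivalent(items[i].owner, containee))
            removal[count++] = &items[i];
    }
    return count;
}
```

Note that the list holds pointers into the item storage, not paths,
and that nothing in this loop checks whose reference it was handed:
it collects whatever matches.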

    Then, it sorts the list in descending order of depth (the deepest
elements come first in the list).  The reason for this is that, if you
delete an item which has children which are also on the removal list,
then deletion will fail badly when you reach those children.  This is
because the removal list is populated with pointers to internal menu
item data structures, not menu item paths.  Pointers are used because
there may be several items using the same path (items can override
each other, and the overridden items remain dormant until they are
popped off of the override queue by the deletion of all their
superiors).

    Some people find this a bit gross, and it is, but the UIHandler
solves a fairly complex problem and some of the code is complex as a
result.  The only other option, in this case, is to try to maintain a
list of root per-containee menu nodes, and this means:

    a. Yet another data structure to maintain.

    b. Some extremely obnoxious code to find root level containee menu
       nodes.

And so this is the best way.  Trust me.

    So the job of the (poorly-named)
menu_toplevel_prune_compare_function is to compare two menu items and
return the appropriate qsort-standard value depending on the relative
lengths of their paths.
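
    A comparison function of this kind might look roughly like the
sketch below.  This is an illustration, not the real
menu_toplevel_prune_compare_function: RemovalEntry, path_depth,
compare_deepest_first and sort_removal_list are hypothetical names,
and I am assuming that depth can be measured by counting the '/'
separators in a path.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for an internal menu item. */
typedef struct {
    const char *path;   /* e.g. "/Paint/Tools/Brush" */
} RemovalEntry;

/* Assumed depth measure: the number of '/' separators in the path. */
static int path_depth(const char *path)
{
    int depth = 0;

    assert(path != NULL);
    for (; *path != '\0'; path++) {
        if (*path == '/')
            depth++;
    }
    return depth;
}

/*
 * qsort() callback.  The array being sorted holds pointers to items,
 * so each argument is a pointer to such a pointer.  Returns a negative
 * value when the first item is deeper, so deeper items sort first.
 */
static int compare_deepest_first(const void *a, const void *b)
{
    const RemovalEntry *ia = *(const RemovalEntry * const *)a;
    const RemovalEntry *ib = *(const RemovalEntry * const *)b;
    int da = path_depth(ia->path);
    int db = path_depth(ib->path);

    return (db > da) - (db < da);
}

/* Sort a removal list so that children precede their parents. */
static void sort_removal_list(RemovalEntry **removal, size_t n)
{
    qsort(removal, n, sizeof *removal, compare_deepest_first);
}
```

Walking the sorted list front to back then reaches every child before
its parent.  qsort() does not promise any particular order among
items of equal depth.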

    The crash occurs in this function, because a menu item makes it
into the removal list which should *never* be there -- the root menu
item.  The root menu item's associated CORBA objref is the top-level
UIHandler's objref.  Remember that we construct the removal list by
comparing the menu item's objref with the objref of the containee
which is asking that it be removed.  The unregister-me CORBA method
looks like this:

    /**
     * unregister_containee:
     * @uih: The UIHandler of the containee which is being
     * unregistered
     *
     * This method is used by an embedded component to unregister
     * its GnomeUIHandler with the container's GnomeUIHandler.
     * The container will then
     */
    void unregister_containee (in UIHandler uih);

So the containee is expected to pass its own objref to the top-level
UIHandler.  The thing that's really screwy is, when I break on the
unregister_containee() impl, the value which is actually passed is the 
container's objref!  Something is clearly amiss.  Observe (and note
that __bonobo_orb is the *local* Orb):

Breakpoint 2, impl_unregister_containee (servant=0x80997d8, containee_uih=0x8099184, ev=0xbfffee98) at gnome-ui-handler.c:914
(gdb) p containee_uih->orb
$8 = 0x8091318
(gdb) p __bonobo_orb
$9 = 0x8091318
(gdb)  p containee_uih->orb.orb_identifier
$10 = 0x805a100 "orbit-local-orb"

On the client side, the code looks like this (read along with me in
gnome_ui_handler_unset_container):

    GNOME_UIHandler_unregister_containee (uih->top_level_uih,
		      gnome_object_corba_objref (GNOME_OBJECT (uih)),
                      &ev);

Here, 'uih' refers to the *local component's* UIHandler GtkObject, and 
gnome_object_corba_objref should therefore return the CORBA Objref
which corresponds to it.  Examining the situation with gdb again, we
see that:

(gdb) p uih->base->corba_objref->orb->orb_identifier
$7 = 0x80592e0 "orbit-local-orb"

And so everything appears to be in order on the client side.

    So the basic problem is that the objref which gets marshaled
across the CORBA connection is not the objref which gets received.
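
    One way to turn the untested invariant into a tested one would be
to refuse an unregistration request that names the top-level itself.
The sketch below only illustrates the idea: ObjRef and
unregister_is_sane are hypothetical, and integer equality again
stands in for CORBA_Object_is_equivalent().  It would not explain why
the wrong objref arrives, but it would stop the root menu item from
ever reaching the removal list.

```c
#include <assert.h>
#include <stdio.h>

typedef int ObjRef;  /* hypothetical stand-in for a CORBA objref */

/*
 * Returns 1 if the request may proceed, or 0 if the "containee"
 * reference is actually the top-level's own reference.
 */
static int unregister_is_sane(ObjRef toplevel, ObjRef containee)
{
    if (containee == toplevel) {
        fprintf(stderr,
                "unregister_containee: received the top-level's own objref\n");
        return 0;
    }
    return 1;
}

/* Small self-check showing both outcomes. */
static void check_unregister_examples(void)
{
    assert(unregister_is_sane(1, 2));   /* a genuine containee */
    assert(!unregister_is_sane(1, 1));  /* the case seen in gdb */
}
```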

    So as I see it, there is one of two reasons for this.  Either my
code is trampling on memory so badly that it's screwed ORBit up
somehow, or someone owes me several hours of my life back :-).  At
this point, I'm willing to bear the humiliation of being dead wrong to
get some help solving this problem.

Nat


