Re: GTK and ATK



On Tue, 2011-05-10 at 16:28 +0200, Benjamin Otte wrote:

> The TL;DR version is this:
> I think the problem is the fact that we support a separate API for
> accessibility.

Let me present my (very limited) understanding of how a11y works right
now.  This is for the benefit of gtk-devel-list; people on
gnome-accessibility-devel are obviously already well-informed about all
of this.

All the a11y code we have is essentially a solution to the problem of
mapping an unknown number of accessibility-specific dongles (screen
readers, braille readers, sip-and-puff controls, "debugging" shit like
accerciser and the old at-poke), to an unknown number of widgets and
widget toolkits.  We have:

    screen readers   --------\            /--- Firefox
    dasher  -----------------\\         //---- LibreOffice
    debugging software ----------- ? --------- GTK+ apps
    braille readers  --------//         \\---- Qt apps
    sip-and-puff control  ---/            \--- Your Momma's toolkit

This is a classic mapping problem.  You don't want to write N*M
implementations so that every screen reader can talk to Firefox,
LibreOffice, etc., and vice-versa.

Instead, you want a central abstraction so that you only have to write
(N+M) implementations.
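
To put numbers on that, here is a tiny sketch.  The function names are
made up for illustration, and the counts are just the five items on each
side of the diagram above, with one adapter counted per (left, right)
pair in the first case.

```python
def pairwise_adapters(n_assistive_techs: int, m_toolkits: int) -> int:
    """One adapter for every (assistive technology, toolkit) pair."""
    return n_assistive_techs * m_toolkits


def hub_adapters(n_assistive_techs: int, m_toolkits: int) -> int:
    """One adapter per participant, each speaking the shared abstraction."""
    return n_assistive_techs + m_toolkits


# The diagram lists 5 things on the left and 5 on the right.
print(pairwise_adapters(5, 5))  # 25
print(hub_adapters(5, 5))       # 10
```

The gap widens quickly as either side grows, which is the whole argument
for having something in the middle.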

at-spi is the central abstraction.  It lets you "navigate" a user
interface in very abstract terms, "read" what its various parts say, and
"write" to user-modifiable parts.  "What controls do you have?  Oh, is
that control a table?  Does that cell support an editable text
interface?  You say that there's an image there but the user can't see -
do you have a textual description for it?"  Stuff like that.

at-spi is that set of abstract interfaces described in D-Bus parlance.
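
As a rough illustration of that kind of "navigate and ask" conversation,
here is a sketch in plain Python.  This is not the at-spi API: the
Accessible class, its fields, and the role and interface names are all
hypothetical stand-ins.

```python
from dataclasses import dataclass, field


@dataclass
class Accessible:
    """Hypothetical stand-in for one node of an accessible UI tree."""
    name: str
    role: str                                      # e.g. "table", "image"
    description: str = ""                          # textual description, if any
    interfaces: set = field(default_factory=set)   # e.g. {"EditableText"}
    children: list = field(default_factory=list)


def describe(node: Accessible, depth: int = 0) -> list:
    """Walk the tree and answer the kinds of questions quoted above."""
    line = "  " * depth + f"{node.role} '{node.name}'"
    if "EditableText" in node.interfaces:
        line += " [editable]"
    if node.role == "image" and node.description:
        line += f" ({node.description})"
    lines = [line]
    for child in node.children:
        lines.extend(describe(child, depth + 1))
    return lines


cell = Accessible("A1", "cell", interfaces={"EditableText"})
table = Accessible("Budget", "table", children=[cell])
logo = Accessible("logo", "image", description="Company logo")
window = Accessible("Main", "window", children=[table, logo])

print("\n".join(describe(window)))
```

The point is only that the caller never needs to know which toolkit
built the window; it sees roles, interfaces, and descriptions.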

Right now, the at-spi2-core module is a C binding to those D-Bus
interfaces.  In C you call atspi_selection_get_selected_child() and it
does a D-Bus call for you.  That module also has a "registry", whose
purpose I don't fully understand, but I think it's so that things on the
left side of the diagram above can find things on the right, and
vice-versa.  Or something like that.

You also need the converse - something that receives the D-Bus requests,
unpacks the arguments, and marshals them to real code.  But marshals
them to... what?  Anything could be on the other side - Firefox, GTK+,
Qt, LibreOffice, etc.

For GTK+, there are two intermediate layers between the D-Bus glue and
GTK+ itself.

One is the layer that actually receives the D-Bus requests and marshals
them to more concrete code.  That layer is at-spi2-atk.
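
A minimal sketch of what such a receiving layer does, assuming a plain
dict as a stand-in for a D-Bus message.  No real D-Bus is involved, and
SelectionBackend and dispatch are hypothetical names, not anything from
at-spi2-atk.

```python
class SelectionBackend:
    """Hypothetical toolkit-side object; a real one would wrap a widget."""

    def __init__(self, selected):
        self._selected = list(selected)

    def get_selected_child(self, index):
        return self._selected[index]


def dispatch(backend, request):
    """Unpack a request and call the matching method on the backend.

    `request` is a plain dict standing in for a D-Bus message.
    """
    handlers = {
        ("Selection", "GetSelectedChild"): backend.get_selected_child,
    }
    key = (request["interface"], request["method"])
    if key not in handlers:
        raise LookupError(f"no handler for {key[0]}.{key[1]}")
    return handlers[key](*request["args"])


backend = SelectionBackend(["row 3", "row 7"])
request = {"interface": "Selection", "method": "GetSelectedChild", "args": [1]}
print(dispatch(backend, request))  # row 7
```

Everything toolkit-specific lives behind the backend object, so the
dispatching part could in principle be shared.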

The other layer is ATK.  As far as I can tell, ATK is just a
re-statement of the interfaces that at-spi gave us, but in GObject
terms.  Instead of a D-Bus interface called org.blahblah.a11y.Selection,
you have an AtkSelection which is a GTypeInterface.  Instead of a D-Bus
method called GetSelectedChild, you have a vtable slot in AtkSelection
called ref_selection.  It's not called the same as the D-Bus method
because... oh, I don't know, memory management details, or something.

But ATK is just interfaces and glue, not real code.  You want something
that will take a GtkLabel and expose it as an AtkText.  This adapter is
GAIL.  It's just a ton of code that takes GTK+'s ad-hoc, comfortable
structure, and plugs it to the bunch of abstract interfaces that come
from ATK (and that in turn come from at-spi).
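
The split between "abstract interface" and "adapter" can be sketched
like this.  It is only loosely modeled on the idea of exposing a label
as text: TextInterface, Label, and LabelTextAdapter are hypothetical
names, and the real ATK interfaces are C GTypeInterfaces with more
methods than shown here.

```python
from abc import ABC, abstractmethod


class TextInterface(ABC):
    """Abstract interface, playing the role ATK plays (hypothetical)."""

    @abstractmethod
    def get_character_count(self) -> int: ...

    @abstractmethod
    def get_text(self, start: int, end: int) -> str: ...


class Label:
    """Toolkit widget with its own ad-hoc API; knows nothing about a11y."""

    def __init__(self, caption: str):
        self.caption = caption


class LabelTextAdapter(TextInterface):
    """Glue in the role of GAIL: exposes a Label through TextInterface."""

    def __init__(self, label: Label):
        self._label = label

    def get_character_count(self) -> int:
        return len(self._label.caption)

    def get_text(self, start: int, end: int) -> str:
        return self._label.caption[start:end]


adapter = LabelTextAdapter(Label("Save changes?"))
print(adapter.get_character_count())  # 13
print(adapter.get_text(0, 4))         # Save
```

Note that the widget class itself is untouched; all the accessibility
knowledge sits in the adapter, which is exactly why it can drift.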

GAIL exists for historical reasons.  We didn't want to glue the
"experimental" a11y code directly to our "stable" GTK+ and the GNOME
libraries, so we put that glue in a separate module.

Applications with custom widgetry, like Evolution, usually implement the
ATK interfaces directly in their code - they don't have a separate
module like GAIL.


There are several problems with all of this.

Nobody likes to maintain glue code.  It's tedious to write and it breaks
in ways that are hard to test.  Keeping GAIL separate from GTK+ means
the two tend to drift out of sync.

ATK is "duplicated" interfaces.  It needs to be kept in sync with the
rather axiomatic interfaces provided by at-spi.  It has to deal with
messy details like GTK+'s reference counting (and who knows if at-spi is
amenable to that kind of detail).

Various other software has seen that ATK is a convenient place to plug
in, and instead of providing its own DBus<->toolkit layer, it instead
implements ATK interfaces in terms of calling its own toolkit, and calls
it a day.  This is not a problem in itself, but it makes ATK even more
of a point of failure.  (FIXME: is this accurate?
http://accessibility.kde.org/developer/atk.php suggests that this is
what Mozilla does.)


One problem may be that at-spi sometimes wants to be tightly-bound to
things that are far away from it, and that's bad design.

For example, consider the use of numerical indices in the Selection
interface.  When the innermost layer (GTK+ or any other toolkit, or a
custom widget) changes something in its UI, the numerical indices that
you got from at-spi may no longer be valid.  This means that you have to
write code to keep things in sync, and it gets messy real fast.  Maybe
you should get opaque, unchanging IDs for objects, so that if one of
them is deleted, the other references you have are still valid.
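
Here is a small sketch of the difference, using a made-up Container
class that hands out both positional indices and opaque IDs.  Nothing
here is at-spi API; it only shows why a remembered index goes stale
while an ID does not.

```python
import itertools


class Container:
    """Hypothetical widget container handing out both kinds of reference."""

    def __init__(self):
        self._ids = itertools.count(1)
        self._children = {}          # stable id -> name, insertion-ordered

    def add(self, name):
        child_id = next(self._ids)
        self._children[child_id] = name
        return child_id

    def remove(self, child_id):
        del self._children[child_id]

    def by_index(self, index):
        return list(self._children.values())[index]

    def by_id(self, child_id):
        return self._children.get(child_id)   # None if it was deleted


box = Container()
open_id, save_id, quit_id = (box.add(n) for n in ("Open", "Save", "Quit"))

save_index = 1                      # what an index-based client remembers
box.remove(open_id)                 # the UI changes underneath the client

print(box.by_index(save_index))     # Quit: the index now points elsewhere
print(box.by_id(save_id))           # Save: the id is still valid
```

With IDs, a deleted object simply stops resolving, and nothing else
needs to be renumbered or re-synced.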

Maybe at-spi needs to be more stateless.  I don't know exactly how that
would work, but maybe it's about letting you reconstruct your "view" of
an accessible application easily, in a couple of D-Bus calls, instead of
doing a million calls just to re-sync everything.  Take a widget
hierarchy and its available accessible interfaces, and push the entire
thing as a JSON blob?  Use paths to refer to widgets (or accessible
objects), like SomeApp/MainToolbar/SaveButton instead of querying each
sub-hierarchy by hand just to get a volatile object reference?  I don't
know.
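
To make the blob-plus-paths idea a bit more concrete, here is a sketch.
The tree layout, the field names, and the snapshot and resolve helpers
are all hypothetical; this is one possible shape, not a proposal for a
wire format.

```python
import json


def snapshot(node):
    """Serialize a whole subtree in one go (hypothetical wire format)."""
    return json.dumps(node, sort_keys=True)


def resolve(root, path):
    """Find a node by a path such as 'SomeApp/MainToolbar/SaveButton'."""
    names = path.split("/")
    if names[0] != root["name"]:
        return None
    node = root
    for name in names[1:]:
        node = next((c for c in node["children"] if c["name"] == name), None)
        if node is None:
            return None
    return node


tree = {
    "name": "SomeApp", "role": "application", "children": [
        {"name": "MainToolbar", "role": "toolbar", "children": [
            {"name": "SaveButton", "role": "button", "children": []},
        ]},
    ],
}

# A client rebuilds its view from one blob instead of many round trips.
client_view = json.loads(snapshot(tree))
print(resolve(client_view, "SomeApp/MainToolbar/SaveButton")["role"])  # button
```

A path that no longer exists just fails to resolve, which is easier to
handle than a reference that silently points at the wrong thing.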

Maybe the D-Bus stuff on the client side should be maintained directly
inside GTK+?  (Does anyone *call* ATK directly to implement an
accessibility dongle, or do they always go through D-Bus?  Why does ATK
need a public API - can't it just be basically the "structs" for the
GTypeInterfaces and as little boilerplate as needed to make those
interfaces implementable?)

There needs to be a gradient of tightly-coupled versus loosely-coupled,
with tighter being close to the real code, and looser in the upper
layers.  We need to find places where this is not the case.


Anyway, this is just an overview of the problem.  I hope it helps us
figure out what to do.

  Federico


