Coming to grips with the state of a11y in GNOME
- From: Matthias Clasen <matthias clasen gmail com>
- To: gnome-accessibility-devel gnome org, gtk-devel-list gnome org
- Cc: Ryan Lortie <desrt desrt ca>
- Subject: Coming to grips with the state of a11y in GNOME
- Date: Thu, 3 Nov 2011 16:13:09 -0400
I've been asked to present my thoughts on accessibility in GNOME,
again. After some discussions and some playing with orca in GNOME
3.2.1, here's what I have come up with. Again, these are largely my
opinions, and not meant as an attack.
The current state of accessibility in GNOME
Most of the accessibility features of GNOME 3 can be used without
turning on toolkit accessibility:
- High-contrast and large text are implemented via theming
- Zoom is entirely implemented in the shell
- The screen keyboard is implemented in the shell
- Visual alerts are implemented with a combination of AccessX,
XKB and the window manager
- The keyboard accessibility features (sticky/slow/bounce/mouse keys)
are all handled entirely on the X layer
Most of these features work reasonably well, if somewhat limited (zoom
and the screen keyboard could clearly benefit from more widget-level
information, such as focus tracking and entry type hints). The only
accessibility feature that really requires toolkit accessibility is
the screen reader.
The screen reader, orca, clearly is our 'flagship' AT. In my recent
experience with orca, it worked surprisingly well and spoke to me for
hours. The impression I got was much better than I had expected. In
earlier attempts (sometime during 3.1.x), it randomly stopped speaking
to me and I was unable to make it speak again.
Problems I've seen today include:
- gnome-shell crashes when orca is restarted
- interacting with treeviews while orca is running easily crashes the app
(just select a file in the file chooser in any app...)
- people have also seen gnome-terminal crash with orca
- Insufficient labelling in new UIs, e.g. gnome-shell is largely just
'panel' to orca...
- Lack of a11y support in 3rd party widgets (stock widgetry in
evolution is read just fine to me, but I don't hear any of the
'interesting' things: subjects, senders, email bodies...)
- I couldn't get a word out of firefox - I thought maybe my gtk2 a11y
stack was misconfigured, but openoffice did speak...
What are the problems ?
Toolkit accessibility (ie the loading of atk-bridge into GTK+ and
Clutter applications, as controlled by the toolkit-a11y gsetting) is
too broken. We cannot turn something on by default that lets you crash
any application in the file chooser. Turning it on by itself is mostly
harmless, but as soon as ATs are actively using it, things start to
We've tried to improve things by merging the gail module into GTK+
proper in GTK+ 3.2. But this effort hasn't gotten us to a place where
we can just enable a11y. The biggest remaining problem is the tree
view a11y code.
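For anyone who wants to check whether toolkit accessibility is currently
switched on, here is a minimal sketch in Python. It assumes the
'toolkit-a11y gsetting' mentioned above is the toolkit-accessibility key
in the org.gnome.desktop.interface schema, and that the gsettings command
line tool is installed; the function names are made up for illustration.

```python
import shutil
import subprocess
from typing import List, Optional

# Assumption: this schema/key pair is the "toolkit-a11y gsetting".
SCHEMA = "org.gnome.desktop.interface"
KEY = "toolkit-accessibility"


def gsettings_get_argv(schema: str = SCHEMA, key: str = KEY) -> List[str]:
    """Build the argv for reading one key with the gsettings tool."""
    return ["gsettings", "get", schema, key]


def toolkit_a11y_enabled() -> Optional[bool]:
    """Return True/False for the setting, or None if it cannot be read."""
    if shutil.which("gsettings") is None:
        return None  # gsettings is not installed on this machine
    result = subprocess.run(
        gsettings_get_argv(), capture_output=True, text=True
    )
    if result.returncode != 0:
        return None  # schema or key not available here
    # gsettings prints booleans as the literal words "true" / "false"
    return result.stdout.strip() == "true"


if __name__ == "__main__":
    print("toolkit accessibility:", toolkit_a11y_enabled())
```

This only reads the setting. It says nothing about whether an AT is
actually connected, which is the case where the crashes show up.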
There are multiple reasons why toolkit accessibility has been
problematic ever since. Most of these have been discussed on this list
in the past.
- atk is a big interface (the scope is essentially 'export the entire
widget tree, including all details of everything') and it contains
things that just don't belong here, such as key event filtering
- atk interfaces are only weakly specified (if at all) - no
information about expected object life-cycles, no information about
expected signal emissions and order
- the atk interfaces are implemented in a way that forces them to be
separate from the regular GTK+ apis - these apis were designed a long
time ago, and use GObject features in weird ways, or use obscure
GObject features when much more straightforward implementations would
- there are many layers between the application and the ATs: application
widgets - gail - atk-bridge - (a11y) bus - libatspi - ATs
- there are only very few consumers of this interface, and they are
not of interest to most desktop users and developers
What are our options ?
Possible remedies include:
- Change approach
Instead of 'special interface for the 1% of users that really need
it', go for something that is useful for everybody, then add the few
a11y extras that get you from 'can be used by 80%' to 90%, then 95%,
- Reduce scope
Look at what concrete ATs actually use, instead of clinging to some
overly broad inherited 'standard' interface.
My guess is that this will boil down to mainly text + focus.
- Drop layers
Consider implementing D-Bus interfaces in the toolkit itself, cut out
the bridge and other intermediate layers as far as possible.
- Broaden the audience
Candidates for non-a11y consumers of the interface include:
- input methods
- general-purpose voice control
Add a11y features to the desktop shell, instead of focusing solely on
- add scanning and other a11y features to the shell OSK
- voice control in the regular desktop
- touch-friendliness should have some synergy with a11y
Can we turn this into a concrete project ?
No finished proposal, just some starting points:
- Set concrete goals:
'We need a working screen reader that can be turned on and off at will'
'No a11y-induced crashes'
- Do research:
- What features are expected of a screen reader nowadays ?
- Study state-of-the-art offerings on Win32 / OS X (screen readers)
- Investigate if we can have a more limited API that serves the common
need of screen reading, zoom and screen keyboard: text, labelling
relations, focus tracking, change events.
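To make the last point a bit more tangible, here is a rough sketch in
Python of how small such an API could be. It is not a proposal for actual
GTK+ or D-Bus interfaces: the names Element, FocusTracker, spoken_name
and the two event names are all hypothetical, and it covers only the four
things listed above (text, labelling relations, focus tracking, change
events).

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Element:
    """Hypothetical accessible element: text plus one labelling relation."""
    text: str = ""
    labelled_by: Optional["Element"] = None

    def spoken_name(self) -> str:
        # Prefer the text of the labelling element, else our own text.
        if self.labelled_by is not None and self.labelled_by.text:
            return self.labelled_by.text
        return self.text


Listener = Callable[[str, Element], None]


class FocusTracker:
    """Hypothetical hub for focus tracking and change events."""

    def __init__(self) -> None:
        self._listeners: List[Listener] = []
        self.focused: Optional[Element] = None

    def subscribe(self, listener: Listener) -> None:
        self._listeners.append(listener)

    def _emit(self, event: str, element: Element) -> None:
        for listener in self._listeners:
            listener(event, element)

    def set_focus(self, element: Element) -> None:
        self.focused = element
        self._emit("focus-changed", element)

    def set_text(self, element: Element, text: str) -> None:
        element.text = text
        self._emit("text-changed", element)


if __name__ == "__main__":
    tracker = FocusTracker()
    # A screen reader, zoom or OSK would each be one subscriber.
    tracker.subscribe(lambda ev, el: print(ev, el.spoken_name()))
    entry = Element(labelled_by=Element(text="Name"))
    tracker.set_focus(entry)
    tracker.set_text(entry, "Ada")
```

A screen reader would speak spoken_name() on focus changes, zoom would
follow the focused element, and the screen keyboard would pop up when an
entry gets focus; all three consume the same two events.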
This is not an easy project. There are tough architectural issues
involved here. Just hiring somebody for a year with the task of 'fix
accessibility' is almost guaranteed to end in failure.