Re: Using LSR for Jambu On Screen Keyboard

Hi Steve,

Let's chat on the phone sometime this week after you've had time to digest this
email. I apologize up-front for the spelling and grammar errors that are strewn
throughout. I figured it was more important to get this info to you
quickly with minor mistakes rather than later in perfect form.

The biggest issue from where I currently stand seems to be the lack of Windows
support, which would require either a port of LSR or else moving Jambu to Linux first.

I think it's still undecided whether a port of the LSR code is needed, or
additional code plus more abstraction in some places. I think that's a key
concept which you should consider as you read the text below.

So I'm going to spend a little more time looking at LSR on Ubuntu
again at the risk of further delays. Could you possibly give me some
'get up to speed' quick hints and perhaps comment on my use and even
answer a few questions?

I'm going to take the shotgun blast approach first, and shower you with the
same list of links I sent one of our developers when he started learning about
LSR on his own. He managed to start submitting bug fixes to our Perks within a
week and a half, so maybe they'll prove useful to you as well. You probably
found all of these links already, but there may be one or two you missed.

* LSR on IRC: server:, channel: #lsr (We're in here all day
collaborating. You'll need to join us. I recommend xchat-gnome.)
* LSR homepage:
* LSR downloads:
* LSR getting started help: (Follow
 the instructions for getting the code from subversion and installing it.)
* LSR developer resources:
* LSR architecture workbook:
* LSR UI specification:
* LSR script writer resources (main page):
* LSR scripting tutorial: (Definitely read
 / practice this one if you haven't already.)

Here's my wish list for a platform

I'm going to try to point you to real LSR code to respond to your wishes. I'll
try to make up some snippets of code in some places too, but I'm not going to
guarantee I'll have the method names, parameter order, etc. exactly right off
the top of my head.

* Flexible access to all input devices - gestures across devices

From (ignore the L{}, that's a link in epydoc source comment format):

"Defines a class representing an input L{Gesture}, one or more device specific
input actions (e.g. keyboard key press, Braille button press, speech input
command, camera tracked gesture, etc.) performed simultaneously."


"Defines a class representing a collection of L{Gesture}s performed in sequence
on an L{AEInput} device."

In an LSR script (a module containing a class derived from the Perk class), I
can define a class that acts as an event handler: a subclass of Task. I can
bind an instance of that Task to any set of action codes that will appear in a
gesture.

The following is a purely educational example. Only the action code constants
and the device capability I'm requesting are keyboard specific. If all devices
list the same capability (e.g. keyboard) and define equivalent action code
constants (names, not values), I can use them interchangeably. If not, I can
always have multiple Perks keyed to the specific features of the device I want
to use (e.g. webcam versus touch switch). It really depends on how much work
you want to do at the device level versus at the Perk level.

class MyPerk(Perk.Perk):
    def init(self):
        # register an instance of DummyTask with the identifier 'print identity'
        self.registerTask(DummyTask('print identity'))
        # register another instance with identifier 'print identity again'
        self.registerTask(DummyTask('print identity again'))
        # get an input device by capability; first param is get by exact name
        kbd = self.getInputDevice(None, 'keyboard')
        # keyboard requires modifiers because it serves a dual purpose as system
        # input and LSR input; you wouldn't need this on other devices
        self.addInputModifiers(kbd, kbd.AEK_CAPS_LOCK)
        # bind to one keyboard command; the boolean param says whether I want to
        # consume the key or not; again, not useful except for system input devices
        self.registerCommand(kbd, 'print identity', False,
                             [kbd.AEK_CAPS_LOCK, kbd.AEK_A])
        # bind the same Task instance to another key
        self.registerCommand(kbd, 'print identity', False,
                             [kbd.AEK_CAPS_LOCK, kbd.AEK_B])
        # bind the other Task instance to a third key
        self.registerCommand(kbd, 'print identity again', False,
                             [kbd.AEK_CAPS_LOCK, kbd.AEK_Z])

class DummyTask(Task.InputTask):
    def execute(self, **kwargs):
        # prints the identifier name of this Task
        print self.getIdent()

Some things to note:

1. We have not finished support for gesture lists yet. For instance,
  Alt+Shift+X followed by CapsLock+3 could be a valid gesture list
  representing one input command. Some of the machinery exists, but not all of
  it. I'm not sure how important this advanced feature is to you.
2. We do support cycles on a gesture. We use this so that the first press of a
  hotkey triggers one function, a second press of the same hotkey without an
  intervening press of anything else triggers a different function,
  etc. Search for "cyclic" in
  for an example.
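To make the cycle idea concrete, here's a tiny standalone sketch (my own invented class, not LSR's actual cyclic machinery) of a task that advances through a list of actions on repeated presses and resets when another gesture intervenes:

```python
class CyclicTask:
    """Advance through actions on repeated presses of the same hotkey."""
    def __init__(self, actions):
        self.actions = actions
        self.index = 0

    def execute(self):
        # run the current action, then move to the next in the cycle
        result = self.actions[self.index]()
        self.index = (self.index + 1) % len(self.actions)
        return result

    def reset(self):
        # called when any other gesture interrupts the cycle
        self.index = 0

# first press reads the line, second spells it, third reads again, etc.
task = CyclicTask([lambda: 'read line', lambda: 'spell line'])
```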

* Good access to any application including via a11y APIs

* task tools
* free to import in perks

* Good graphics support, eg SVG and basic UI

We don't dictate what UI toolkit you use. One of our extension types is a
chooser. A chooser implements a very simple interface (programmatic, not
UI). The interface allows it to be loaded by a Perk (e.g. triggered by an event
or hotkey press), send signals to the Perk (e.g. some value change state, the
user activated something, the user closed a GUI dialog), and be controlled by
the Perk (e.g. Perk calls a method to update the GUI view).
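Here's a toy sketch of that kind of chooser contract; the class and method names are mine, not LSR's actual interface:

```python
class Chooser:
    """Toy chooser: loadable by a Perk, signals it, controlled by it."""
    def __init__(self):
        self.listeners = []
        self.model = None

    def connect(self, callback):
        # the loading Perk registers to receive signals from this chooser
        self.listeners.append(callback)

    def signal(self, name, value=None):
        # notify the Perk (value changed, item activated, dialog closed...)
        for cb in self.listeners:
            cb(name, value)

    def update(self, data):
        # the Perk drives the view; a real chooser would refresh its
        # toolkit widgets here (gtk in LSR today)
        self.model = data

events = []
c = Chooser()
c.connect(lambda name, value: events.append((name, value)))
c.update(['option A', 'option B'])   # Perk -> chooser
c.signal('activated', 'option A')    # chooser -> Perk
```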

* Good support for user customisation - preferably UI, declarative & script

We definitely have scripts. N scripts (Perks) can be layered on top of one
another, refer to tasks defined in other scripts, handle events in chain of
responsibility fashion, and advise tasks in other scripts (as in the elisp
advice facility, which allows one to chain behaviors before, after, or around
an existing function).

For instance, when Firefox starts, the user profile loads the following Perks
(assuming speech and Braille output):

FirefoxPerk (

In BasicSpeechPerk (BSP) we have a Task that handles focus events. It is very
basic, and simply announces the role of the object receiving focus. (Later
events trigger the announcement of the text at the caret, the selected item in
a list, etc.) FirefoxPerk defines its own focus Task. Because FirefoxPerk is
more specific, it sits higher than BasicSpeechPerk in the stack, and gets first
crack at any events received. In some cases, it is a no-op and lets the event
pass to lower Perks such as BSP. In other cases, it performs some action, but
still lets the event pass. In still other cases, it performs some actions, and
then consumes the event so no lower Perks get to see it. (These Perks are still
notified about the existence of the event so they can update any important
state variables, but generally produce no output in this case.)
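A stripped-down sketch of that chain-of-responsibility dispatch (invented names, not LSR's real event machinery):

```python
class FocusHandler:
    """One Perk's focus Task; consume=True stops lower Perks from seeing it."""
    def __init__(self, name, consume=False):
        self.name = name
        self.consume = consume

    def handle(self, event, log):
        log.append(self.name)  # this Perk saw (and acted on) the event
        return self.consume

def dispatch(event, stack, log):
    # walk the Perk stack from most to least specific
    for handler in stack:
        if handler.handle(event, log):
            break  # event consumed; lower Perks produce no output

log = []
stack = [FocusHandler('FirefoxPerk', consume=True),
         FocusHandler('BasicSpeechPerk')]
dispatch('focus', stack, log)
```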

In BasicSpeechPerk, we also have a Task named 'read new role' that
announces the role of the object at the current point of regard (POR),
i.e. the object on the screen to which the user is currently attending. This base
implementation speaks the role of the accessible at the POR if it is different
from the last one spoken.

This basic logic is not enough to give a thorough role announcement in Firefox,
however, as more specific role information (e.g. heading level 3) is stored in
object attributes. In FirefoxPerk, we define another Task called 'firefox
review role' and set this Task to execute in lieu of (or "around" in advice
terms) 'read new role' whenever that Task is executed by any part of the
system. In this more specific Task, we account for the role information in the
object attributes and make a more intelligent announcement. But we still invoke
the original "anchor" task to get the basic information.

This feature should sound a lot like inheritance with the ability to invoke
super class methods. It is very similar, but with one key difference: either
Task can change dynamically at runtime. There is no design time link between
the code in BasicSpeechPerk and FirefoxPerk. Another Perk might come along and
replace 'read new role' with its own implementation. The 'firefox review role'
Task in FirefoxPerk will merrily continue to function whenever this alternative
implementation is executed. Or, as another example, say the 'read new role'
Task is removed without a replacement. This does not generate an error when
FirefoxPerk tries to register its 'firefox review role' Task. It simply won't
ever run.
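The advice idea can be sketched in a few lines. This is a much-simplified toy with invented names, and unlike LSR's real machinery it captures the anchor at advise time rather than resolving it dynamically:

```python
tasks = {}  # identifier -> callable, standing in for Task registration

def register(ident, func):
    tasks[ident] = func

def advise_around(ident, wrapper):
    # replace the registered task with one that runs the wrapper, handing
    # it the original ("anchor") task so it can still be invoked
    anchor = tasks[ident]
    tasks[ident] = lambda: wrapper(anchor)

# BasicSpeechPerk registers the base role announcement
register('read new role', lambda: 'heading')
# FirefoxPerk advises around it, adding detail from object attributes
advise_around('read new role', lambda anchor: anchor() + ' level 3')
result = tasks['read new role']()
```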

User customization is possible, with settings made persistent on disk. Right
now, they're in Python pickle files, but we're not too keen on that. It's what
worked easiest and fastest at the start. But what system is used is abstracted
from how the settings are saved. So we could easily have a gconf backend in the
future on GNOME, and fall back on some other system when gconf isn't
available. We haven't invested time in this area to make more than one backend,
but I did try hard to localize use of pickle to just one or two modules when I
wrote the code the first time. At any rate, it probably doesn't affect you
drastically. Pickle files in a home or system directory work on any
platform. If you need something better, we can work it out. (I'll definitely
take code contributions. :))
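The idea of hiding the persistence mechanism behind one small interface can be sketched like so; the class and method names are mine, not LSR's:

```python
import io
import pickle

class PickleBackend:
    """Settings persistence; swap in gconf or anything else later."""
    def __init__(self):
        self.store = io.BytesIO()  # stands in for a file in the home dir

    def save(self, settings):
        # serialize the settings dict, as the pickle backend does today
        self.store = io.BytesIO(pickle.dumps(settings))

    def load(self):
        return pickle.loads(self.store.getvalue())

backend = PickleBackend()
backend.save({'speech.rate': 200, 'braille.enabled': True})
restored = backend.load()
```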

As far as the UI for customization, our settings dialog in LSR is just another
chooser extension controlled by a Perk. The Perk provides the chooser with
information about the settings of other Perks and devices, the contents of the
current profile, etc. The chooser populates a gtk dialog with that
information. Obviously, you couldn't reuse our gtk settings dialog on
Windows. But neither are you required to use it. You could easily come up with
your own settings dialog written in whatever toolkit you want. It would have as
much access to settings and the LSR core as our existing settings dialog. Both
are just extensions.

We have no concept of declarative scripts, but here's an interesting idea. Say
you write a Perk called DeclarativeScriptLoader. The function of this Perk is to
load and parse declarative scripts whenever a particular application is
seen. The statements in the script cause the DSL to call appropriate functions
in LSR to set up input bindings, create event handlers, etc. DSL could even
access the user profile to manage which declarative scripts he or she wants to
load, what applications they are keyed to, settings associated with those
scripts, etc.

Voila. You've added support for declarative scripting by writing an
extension. No need to make the LSR core aware of such a concept.
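Here's a rough sketch of what such a loader's parsing step might look like; the XML schema and function names are entirely invented for illustration:

```python
import xml.etree.ElementTree as ET

# an invented schema for a declarative script keyed to one application
SCRIPT = """
<perk app="firefox">
  <bind keys="CapsLock+H" task="read heading"/>
  <bind keys="CapsLock+L" task="read link"/>
</perk>
"""

def load_declarative(xml_text):
    # in a real loader Perk, each statement would turn into
    # registerTask/registerCommand calls against the LSR core
    root = ET.fromstring(xml_text)
    bindings = [(b.get('keys'), b.get('task')) for b in root.findall('bind')]
    return root.get('app'), bindings

app, bindings = load_declarative(SCRIPT)
```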

* Good development tools leading to fast development with good architecture

We have three monitor extensions that show raw events flowing into LSR,
internal LSR events and the execution of Tasks, and input / output data from
devices. We also have a DeveloperPerk which provides features such as runtime
script reloading and debugging output. If you needed other tools to help you
debug your LSR extensions, you could write them and add them to a developer
profile for yourself.

* Interpreted & interactive, Python or Javascript, some C++ OK


* Cross platform - Windows first, then Linux, then Mac

I'll answer this question by explaining how our accessibility API abstraction
layer works.

We have a Tools API which Perks use to access the platform accessibility API
(among other things). A call to the API to get the row index of a table cell at
a point of regard looks something like the following:

row = self.getAccRow(por)

Here's what the code in the getAccRow method looks like:

def getAccRow(self, por=None):
    por = por or self.task_por
    try:
        ai = IAccessibleInfo(por)
        return ai.getAccRow()
    except LookupError:
        raise PORError
    except NotImplementedError:
        return None

The reference to IAccessibleInfo names one of our internal interfaces. In this
case, we're using the info interface which defines methods for fetching
information. (We also have IAccessibleAction for manipulating objects,
IAccessibleNav for walking over objects, etc.) Wrapping a point of regard
object in this interface causes LSR to go off and look up an adapter object
that provides this interface for that POR.

In our Adapters/ATSPI/ module, we have a class that has the
following class attribute and static method:

class TableAccInfoAdapter(ContainerAccInfoAdapter):
    provides = [IAccessibleInfo]

    @staticmethod
    def when(por):
        acc = por.accessible
        r = acc.getRole()
        # make sure the role is a table
        if r != Constants.ROLE_TABLE:
            return False
        # make sure the table interface exists
        tab = Interfaces.ITable(acc)
        return True

At startup, this class is registered to provide the IAccessibleInfo interface
for all points of regard that pass the when() assertion (i.e. things with
ROLE_TABLE which support the Table interface). If the Perk provided a POR that
meets this condition, this adapter will be selected and returned by
IAccessibleInfo(por) in the Tools API. If not, the search will continue for
another matching adapter. If no adapter is found, a default is used. If no
default is registered, an exception is raised which is caught in the Tools
method, and yields a result of None to the Perk.

If an adapter is selected which does not provide getAccRow (i.e. the POR
doesn't point at an object that supports the concept of rows), a
NotImplementedError is raised, caught in the Tools API, and produces a result
of None for the Perk. If an adapter is selected which does provide getAccRow,
the result is returned to the Tools method, which passes the information to the
Perk (without any extra work in this particular example, but not in general).

So what we have is the following:

1. Perks that do not need to know which platform API is operating, only that
  some calls may fail if a query or command is not supported.
2. A Tools API which hides the weirdness of wrapping PORs in interfaces from
  the Perks.
3. A system for registering and locating adapters with arbitrarily complex
  expressions determining when it should be used to wrap a POR.
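A toy version of that registration-and-lookup scheme, with invented names and a plain dict standing in for a POR:

```python
# invented registry of (adapter class, interfaces it provides)
registry = []

def register_adapter(adapter, provides):
    registry.append((adapter, provides))

def lookup(interface, por):
    # return the first registered adapter that provides the interface
    # and whose when() predicate accepts this POR
    for adapter, provides in registry:
        if interface in provides and adapter.when(por):
            return adapter
    raise LookupError(interface)

class TableInfoAdapter:
    @staticmethod
    def when(por):
        # the predicate can be arbitrarily complex; here, a role check
        return por.get('role') == 'table'

    @staticmethod
    def getAccRow(por):
        return por['row']

register_adapter(TableInfoAdapter, ['IAccessibleInfo'])
adapter = lookup('IAccessibleInfo', {'role': 'table', 'row': 2})
```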

Does this system hide all platform specific information from extension writers?
No. It only hides the accessibility API. There are other points of contact to
consider:
1. The values returned from some Tools API calls are certain to differ on
  different platforms or even across toolkits on a platform. For instance,
  object attributes and text attributes are both weakly-defined strings.
2. pyLinAcc is our Python wrapper for AT-SPI. It's what the adapters in
  Adapters/ATSPI/ use to access information about accessible objects. This
  package serves one other purpose outside the adapters in LSR: handling
  accessibility events. The internal LSR events types are abstracted from
  AT-SPI, and the mapping from AT-SPI events to LSR events is abstracted away
  in the Adapters package. However, the EventManager class in LSR still calls
  pyLinAcc methods directly to register and unregister for events. pyLinAcc
  would need to be replaced with a Python binding for IA2+MSAA on Windows and
  UA on OS X. I'm not sure how the abstraction of which binding is currently
  in use would work in the EventManager yet, but I am certain that the
  Adapters can be used to define new event mappings from raw platform
  accessibility events to internal LSR events.
3. EventManager currently derives from the event dispatch class in
  pyLinAcc. This inheritance relation will be changed to composition when we
  start using pyatspi.
4. AccessEngine uses gobject and bonobo to register idle and timer callbacks
  and run the main program loop. On other platforms, equivalent functions need
  to be used. Nothing some try/excepts on import can't solve.
5. LSRMain does some gtk/GNOME specific initialization to ensure LSR itself is
  accessible, check that the desktop is accessible, etc. These checks need to
  be abstracted, or a different LSRMain script needs to be provided. Neither
  solution should be terribly difficult to implement, so better abstraction
  rather than a "fork" of LSRMain would be preferred.
6. All existing chooser and monitor extensions use gtk. If you want to reuse
  one of our chooser dialogs (e.g. our settings dialog, our debugging
  monitors), they may have to be ported to another toolkit if you want them
  to be fully accessible. (gtk isn't accessible on Windows as far as I know.)
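Point 4's try/except-on-import idea looks like this in practice; the fallback branch here is just an illustrative stand-in, not real LSR code:

```python
try:
    import gobject  # present on GNOME/Linux platforms
    add_idle = gobject.idle_add
except ImportError:
    # fall back to a trivial stand-in where gobject isn't available;
    # a real port would register with the native event loop instead
    _pending = []
    def add_idle(callback, *args):
        _pending.append((callback, args))
        return len(_pending)  # fake source id

# schedule a no-op callback; returns a source id on either path
handle = add_idle(lambda: False)
```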

* Easy deployment

We use autotools because it's required by GNOME and has good support for
internationalization. I bet you could use Python's distutils too, but I have no
idea how that would work with translations. I'm not familiar with how
translations work on other platforms. How do you compile a .po file to a .mo
which can be read by Python's or GNU's gettext on Windows?  Mac?
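For what it's worth, Python's stdlib gettext reads compiled .mo catalogs the same way on any platform, and with fallback=True a missing catalog simply degrades to the untranslated strings (the domain and directory below are made-up examples):

```python
import gettext

# look for locale/<lang>/LC_MESSAGES/lsr.mo; fall back if none is found
t = gettext.translation('lsr', localedir='locale', fallback=True)
_ = t.gettext

# with no catalog installed, the original string comes back unchanged
msg = _('Reading table row')
```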

I think cross platform packaging is a problem no matter which route you decide
to take with Python. py2exe is attractive, but it's Windows only. RPMs and DEBs are
nice, but they're for *nix. Yikes.

* Use common dev skills - attract developers

Seems like coding in Python partially fulfills this requirement. Having an
abstraction layer between extension developers and the platform accessibility
API should also reduce the learning curve.

* Stable and well supported - mature

We've been at LSR for about 2 years now. I think our architecture is pretty
mature, but not necessarily all of our screen reading functionality. We chose
to focus on the design first, and hoped it would lead to an easier time
scripting and implementing user features later. We're checking if we were right
or wrong with that gamble this year as we start scripting major apps like

* use open standards


* An existing developer community to tap into would be useful

We have a small community consisting of IBMers, contractors, university
students, and hopefully some Mozilla grantees in the near
future. Unfortunately, I still haven't had the time to codify all the info in
my head in written documentation.

So my questions are: how easily could I add APIs to the extensions
context, if that is the way to go? That would seem safest, and then I could
migrate to core if deemed suitable. I'm thinking I might need enhanced
input handling, though I think you have thought of much of that. Other
areas include graphics/widgets for creating the OSK and in-application
selection/highlight, MSAA & IA2 via comtypes, and perhaps even
declarative config via an XML parser.

I think I've answered most of these questions above. If not, ping me again.

What problems could I hit? Any show stoppers?

I can't think of anything off the top of my head. The worst case is that you
find a port of the core is lacking, and you can't solve your problem in an
extension. Even then, it's not a dead end because, hey, it's open source. We
can work together to improve the core.

How do *you* debug and develop?

* Accerciser: To see what the application is giving us
* WingIDE: For syntax highlighting, source tree browsing, class hierarchy
 browsing, etc.
* -l print: For simple print statement output
* -p developer: To watch how LSR is handling events in the monitors
 to determine why my Perk code is or is not getting executed correctly, why
 output is getting clipped, why input is not being received

How hard do you think a Windows port will be? It seems pretty firmly
tied to gnome/gtk/linux, so what and where are the points of contact,
and how difficult would they be to replace?

See section about cross platform above for points of contact.

I guess a plus for you would be more exposure for LSR, a possible
Windows port, and another person fiddling with the code and possibly
even answering questions ;-)

I'm all for it.

