Re: [g-a-devel] AT-SPI caching and D-Bus usage




Thanks for the reply. Comments below.

Unfortunately, there isn't any metric available showing how much is
gained by caching the most commonly used data. In that sense, I feel that
they made the usual mistake of optimizing early and basing the design on
that, without a real metric to justify it. Likewise, assuming that most of
the traffic comes from name/description/role is only half-true because, as
you say, in several cases Orca asks for new data when something changes.
But as I said, this is just a feeling.

Fwiw, I believe CodeThink did some profiling to determine what the most common method calls were at the time, although I can't find their analysis on-line--it doesn't seem to be where it was. In any case, Orca has changed since then, and I think that a good design should be able to adapt to such changes (to be able to add the ability to send attributes, for instance).

- It relies on applications sending notifications any time an object's
states or children change. If an application fails to do this, then
Orca will see stale data, unless it has instructed AT-SPI not to cache
certain kinds of data for the given application. It is sometimes
difficult for a toolkit implementor to send notifications for every
such change, and the result is that AT-SPI2 is made fragile in a way
that was not previously the case.
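
To make the failure mode above concrete, here is a toy model in plain C. It does not use AT-SPI or ATK at all: ToyObject, ToyCache and every toy_* function are made-up names for illustration, and the real cache is considerably more involved. It only shows that a cache invalidated by notifications returns stale data as soon as one notification is skipped, while an uncached read does not.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an application-side object; not an ATK type. */
typedef struct {
    int child_count;
} ToyObject;

/* Hypothetical stand-in for the AT-side cached copy of that object. */
typedef struct {
    const ToyObject *remote;  /* object being mirrored */
    int cached_child_count;   /* last value fetched */
    bool valid;               /* cleared when a notification arrives */
} ToyCache;

/* Models the round trip that (re)fills the cache. */
static void toy_cache_refresh(ToyCache *cache)
{
    assert(cache != NULL && cache->remote != NULL);
    cache->cached_child_count = cache->remote->child_count;
    cache->valid = true;
}

/* Models receipt of a children-changed notification. */
static void toy_cache_on_children_changed(ToyCache *cache)
{
    assert(cache != NULL);
    cache->valid = false;
}

/* Child count as the AT sees it; use_cache == false always re-fetches. */
static int toy_cache_get_child_count(ToyCache *cache, bool use_cache)
{
    if (!use_cache || !cache->valid)
        toy_cache_refresh(cache);
    return cache->cached_child_count;
}

/* The application adds a child and may or may not tell the listener. */
static void toy_object_add_child(ToyObject *obj, ToyCache *listener, bool notify)
{
    obj->child_count++;
    if (notify && listener != NULL)
        toy_cache_on_children_changed(listener);
}
```

If toy_object_add_child() is called with notify set to false, a cached read keeps returning the old count until something else invalidates the cache, which is the kind of staleness described above.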

Although it is true that the cache is fragile if some notifications are
missing, as I said, it is also true that those notifications are needed.
In any case, one additional problem is that it is hard (unless I am
missing something) to test at-spi without the cache (which would help
with bug triaging), and that the cache is fragile not only because of
missing notifications. In my experience, the cache also depends on having
a listener for a specific event (see bug 707578). So I would like to ask a
question: is it really possible to run at-spi without caching? I know
that it would lead to synchronous calls everywhere, but it would be good
to know, in order to investigate whether caching is really improving
at-spi2 performance or not.

You can either set ATSPI_NO_CACHE=1 in the environment, or do something like the following:

AtspiAccessible *desktop = atspi_get_desktop (0);
atspi_accessible_set_cache_mask (desktop, ATSPI_CACHE_NONE);
g_object_unref (desktop);

or its pyatspi equivalent.

In general, as far as I can see, the problem is that the cache tries to
hold the "most relevant" data, and the subset defined is really small. But
enlarging the subset to, let's say, the whole accessible would be really
complex and impractical, and would make the cache even more fragile. In
general, reducing the subset (removing the cache), as you propose later
with an alternative, seems to make sense. FWIW, that was also suggested by
Frederik at one of our ATK/AT-SPI hackfests (he had a lot of doubts
about the gain from the cache).

Yeah. From what I remember, he felt "forced" to send events to update the cache that he would prefer not to need to send.

I think that a good solution would be two-fold:

- Allow an event listener to request particular data be sent along
with an event. For instance, Orca could request that an object's
attributes be sent. The attributes would be cached probably for the
duration of the event callback. While we're modifying the API for
event listeners, it would be good to take a look at
https://bugzilla.gnome.org/show_bug.cgi?id=640440.

Well, I need Joanie's input there, but the large number of object
attributes currently exposed by some implementations (like Firefox)
should start to be reduced and replaced with proper API. It is true,
though, that attributes are still needed for text, on each (?) caret move.
So in that sense, the attributes that would ideally be needed are text
attributes, which are tied to start_offset/end_offset. That would make
specifying the particular data in just one event somewhat tricky.
Or we can just assume that text attributes would need a round-trip D-Bus
call.

The gnome-shell focus caret tracker does something similar on a caret move (check the current offset, then get the character extents at that offset). Currently, that requires two calls, but we could add API to allow the extents of the character at the caret to be returned in one call, or sent along with a caret-moved event. This would be especially useful for gnome-shell where it is a very good idea to avoid making synchronous calls.

- Add a function to allow various data to be fetched in one go. It
would take a callback to be called when the data is ready. For
instance, perhaps orca would want to know the roles, states, and
extents of all (non-transient) objects descending from a particular
accessible.

Shouldn't it be easier to just send the full accessible(s)? Except for
those with a lot of text, an accessible shouldn't be really big.

That's a good point; it might be just as easy to send all data that an application might want. The only downside I see to that is that some things might potentially take a long time to calculate for some atk implementors. One option could be to add some kind of api to atk to allow an implementor to decide what to send (we already have this for children in terms of MANAGES_DESCENDANTS and TRANSIENT states).

All of this data would be stored in a cache that would be valid for
the duration of the callback(?).

When you say cache, do you mean accessible objects? How can we ensure
that the data will be valid for the duration of the callback? As far as
I understand this proposal, it tries to add support for async calls, and
the callback will be called on the client side. So if the server side is
not blocked, what would prevent the data from having changed? In any
case, this issue is inherent in async calls.

Yeah, I can't think of a good way to prevent this, although, at the same time, it is already possible for data to change between the time an application responds to a synchronous D-Bus call and the time the AT receives the reply and processes it, so I don't think that we're adding a new race here. But then we need to decide if it should be possible for an application to keep such cached data alive--for the duration that an Orca flat review context is active, for instance. I think it would be better not to allow this, since otherwise the data could become stale. An AT could make the async call again to re-fetch the data if it wants to, anyway.

In the long term, this might be able to replace the current caching
mechanism. Ie, something like:

typedef void (*AtspiQueryCallback) (void *user_data);

void atspi_query_accessible (AtspiAccessible *obj,
                             const gchar **properties,
                             AtspiQueryCallback cb,
                             void *user_data,
                             gboolean allow_sync,
                             gboolean include_descendants,
                             GError **error);

Shouldn't the callback include the accessible as one of its
parameters? As far as I understand this call, it would allow getting
an object through an asynchronous call. As soon as the accessible is
available, the callback would be called, and the client could use it. In
that case, the very same accessible is relevant there.

Yeah, that sounds like a good idea. Thanks.
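
As a rough sketch of what passing the accessible to the callback might look like, here is a self-contained toy in plain C. None of these names exist in libatspi: ToyAccessible, ToyQueryCallback, ToyPendingQuery and the toy_* functions are placeholders, and the explicit toy_query_complete() call stands in for the asynchronous reply arriving later from the main loop.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for AtspiAccessible. */
typedef struct {
    const char *name;
} ToyAccessible;

/* Callback shape suggested above: the accessible comes first. */
typedef void (*ToyQueryCallback)(ToyAccessible *obj, void *user_data);

/* Bookkeeping for one outstanding query. */
typedef struct {
    ToyAccessible *obj;
    ToyQueryCallback cb;
    void *user_data;
    bool pending;
} ToyPendingQuery;

/* Starts a query; nothing is delivered until the reply "arrives". */
static void toy_query_accessible(ToyPendingQuery *query, ToyAccessible *obj,
                                 ToyQueryCallback cb, void *user_data)
{
    assert(query != NULL && obj != NULL && cb != NULL);
    query->obj = obj;
    query->cb = cb;
    query->user_data = user_data;
    query->pending = true;
}

/* Models the reply arriving: invokes the callback exactly once. */
static void toy_query_complete(ToyPendingQuery *query)
{
    assert(query != NULL);
    if (!query->pending)
        return;
    query->pending = false;
    query->cb(query->obj, query->user_data);
}

/* Example callback: records the name of the accessible it was handed. */
static void toy_record_name(ToyAccessible *obj, void *user_data)
{
    const char **seen = user_data;
    *seen = obj->name;
}
```

With this shape, one callback can serve queries on several objects without the accessible having to be smuggled in through user_data.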

If allow_sync is FALSE, then AT-SPI will throw an exception rather
than make a synchronous call. This could be used when it is desirable
to guarantee that no synchronous calls are made (probably anything
running inside gnome-shell would want this, since otherwise it can
deadlock if it queries an application which is in turn making a
non-AT-SPI-related D-Bus call to gnome-shell).

I don't get this. As far as I understand the proposal, the idea of this
new API is to provide an async call to get data. So when would a sync
call be made?

For instance, _updateFocus() in magnifier.js would have something like this:

Atspi.query_accessible (event.source, lang.bind(this, _updateFocusCb), false, false);

The logic would be moved to _updateFocusCb, and an exception would be thrown if _updateFocusCb were to make a synchronous call. If it is requesting data that it needs, then it should be added to the data sent by atspi_query_accessible() so that a synchronous call is not needed. This would guarantee that we are not making synchronous calls / creating the possibility for deadlocks.
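
To sketch the allow_sync idea in isolation, here is a toy in plain C with no AT-SPI dependency. All names (ToyQueryResult, ToyStatus, toy_get_role) are hypothetical, the "exception" is modelled as an error code (in libatspi it would presumably be reported through a GError), and the role value 42 is an arbitrary placeholder for whatever a round trip would return.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef enum {
    TOY_OK = 0,
    TOY_ERROR_SYNC_NOT_ALLOWED = 1
} ToyStatus;

/* Hypothetical view of an object as seen inside a query callback. */
typedef struct {
    int role;         /* meaningful only when has_role is true */
    bool has_role;    /* true if the role was sent with the query reply */
    bool allow_sync;  /* mirrors the allow_sync argument of the proposal */
    int sync_calls;   /* counts simulated blocking round trips */
} ToyQueryResult;

/* Returns the role from prefetched data when possible. Otherwise it
   either refuses (allow_sync == false) or simulates a blocking call. */
static ToyStatus toy_get_role(ToyQueryResult *result, int *role_out)
{
    assert(result != NULL && role_out != NULL);
    if (!result->has_role) {
        if (!result->allow_sync)
            return TOY_ERROR_SYNC_NOT_ALLOWED;
        result->sync_calls++;   /* stands in for a D-Bus round trip */
        result->role = 42;      /* arbitrary placeholder value */
        result->has_role = true;
    }
    *role_out = result->role;
    return TOY_OK;
}
```

The point is only that with allow_sync set to false, a request for data that was not prefetched fails loudly instead of quietly blocking, so the missing property shows up during development rather than as a deadlock.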

Also, maybe we want a parameter to specify the starting point--it
might be useful to be able to specify an ancestor with a given role as
a starting point? This gets into whether it might be useful to take a
string parameter specifying a predicate. It may be worth investigating
whether xpath would be helpful / whether there are libraries that
would be useful in terms of parsing it.

Well, if I understand this properly, this comes close to the
Collection interface, in the sense of providing a match filter and
getting only the accessibles we are interested in. Are you thinking of
something different?

That's true; we already have collection match rules, so they might be a good starting point at least, and it should be possible to re-use the code.

Thanks,
-Mike

