Re: freeze break request: fix control-center deadlock w/magnifier running



On Fri, Sep 20, 2013 at 4:51 PM, Mike Gorse <mike straddlethebox org> wrote:
Hi all,

Some background: Late in the cycle, some improvements to the magnifier were
committed to gnome-shell; the GSoC project of Magdalen Berns (BGO#647074).
Giovanni discovered a few days ago that starting gnome-control-center and
switching to the display panel now resulted in the desktop freezing for up
to 30 seconds. He then made a commit to gnome-shell so that it would not
start the magnifier unless it was actually used. Jasper felt that the
underlying bug should be fixed and tried to convince Giovanni to revert his
commit, although now it would require a freeze break request to do so, and
he has not made one--I don't know if he plans on doing so. The bug still
affects users of the magnifier; it doesn't affect all GNOME users unless
Giovanni's commit is reverted.

The problem is that gnome-control-center is making a synchronous D-Bus call
to gnome-shell to get information on the monitor, and the new gnome-shell
focus caret tracker responds to the focus changing by making a synchronous
D-Bus call to gnome-control-center. The processes deadlock waiting on each
other until the AT-SPI call times out.

I filed https://bugzilla.gnome.org/show_bug.cgi?id=708387 for this.

There are a few options for handling this, and what makes the most sense
might depend in part on whether the "don't start the magnifier unless it is
needed" commit gets reverted.

I've made patches for at-spi2-core and at-spi2-atk (attached, and added to
the bug) that fix the issue for me. They start to move in the direction that
I think we should take for 3.12 (reduce the need for synchronous D-Bus
calls--for instance, when sending an event, also send the information that
the AT will need in order to process the event, rather than make the AT
request the information). Ideally this should be configurable, but this
patch sends the information that gnome-shell's state-changed:focused handler
needs, avoiding the need for it to make calls over D-Bus. Having these
patches in 3.10 would be nice since they eliminate the deadlock completely
and don't cause the magnifier's calls to time out and fail, but they aren't
trivial patches.

Alternatively, gnome-shell could lower AT-SPI's timeout values (also
attached a patch to do this). This would make the AT-SPI calls time out
after a relatively short amount of time, mitigating the impact of the
deadlock, although it would not eliminate it (the number of users impacted
on this would depend on whether or not Giovanni's commit stays), and the
magnifier may behave incorrectly when switching to the display panel in
gnome-control-center. It would also increase the likelihood that a call
would time out if an app is busy, although I picked a timeout of 250 ms
fairly arbitrarily as a timeout that would have a relatively small impact in
the case of a deadlock.

Committing both sets of patches would also be an option in case there are
other deadlocks that we don't know about (I haven't tried to account for the
listener that is activated when the text caret moves, although I'm guessing
that that is less likely to result in a D-Bus query to gnome-shell than
would a focus change). We need to fix this more properly, but that will have
to wait for 3.12.

Hey Mike,

thanks for all this background information, that is really useful.

I can easily reproduce the 30 second hang with .92 - it is really
uncomfortable, because the compositor stops drawing - it feels like
the system is entirely frozen. We clearly need to do something for
3.10. I do think that your idea of sending relevant information along
with the events is the right fix in principle - avoiding roundtrips,
etc.

But looking at the non-trivial nature of the patches in multiple
modules, I would say that the one-line timeout reduction in the shell
is the safer choice for 3.10. I've tested that fix, and with it, the
behaviour is very much acceptable.

So, my suggestion is: use the one-line timeout fix in gnome-shell for
3.10, and do the proper fix in 3.11.


[Date Prev][Date Next]   [Thread Prev][Thread Next]   [Thread Index] [Date Index] [Author Index]