Re: [Evolution-hackers] EDS API calls hang when EDS dies - how to detect that?

From: Patrick Ohly <Patrick Ohly gmx de>
To: Evolution Hackers <evolution-hackers gnome org>
Subject: Re: [Evolution-hackers] EDS API calls hang when EDS dies - how to detect that?
Date: Sat, 03 Nov 2007 16:52:15 +0100

On Sun, 2007-09-23 at 22:21 +0200, Patrick Ohly wrote:
> Looking at the current libebook API documentation [1] I see a
> "backend-died" signal prototype. I don't remember having seen that
> before - was the signal or its documentation added recently?
> 
> I suppose I can call g_signal_connect(ebook, "backend-died", mycallback,
> NULL), then in mycallback print an error and abort the process, right?

I tried that, but although the evolution-data-server process is gone the
signal wasn't delivered.

My actual code is:
    g_signal_connect_after(m_calendar,
                           "backend-died",
                           G_CALLBACK(EvolutionSyncClient::fatalError),
                           (void *)"Evolution Data Server has died unexpectedly, database no longer available.");

I saw in the code that "backend_died" (BTW, why the underscore instead
of the hyphen used elsewhere?) is hooked to component_died in
e-component-listener.c, which in turn depends on
ORBit_small_listen_for_broken(). Somehow the callback given to
ORBit_small_listen_for_broken() is never called: I set a breakpoint
on it in a debugger and it is never hit.

[a bit later]

Tracing through it in the debugger I see that orbit2-2.14.7/linc2 is
responsible for monitoring the file descriptor. linc notices that the
link goes down (LINK_DISCONNECTED in link_connection_state_changed_T_R)
and it schedules an idle callback which would have raised the
necessary signals (g_idle_add (link_connection_broken_idle, cnx)).

The problem now is: this idle callback is not invoked.

[yet a bit later]

I stepped through g_idle_add(). The idle callback is attached to the
default context returned by g_main_context_default():
(gdb) p *default_main_context
$16 = {mutex = {runtime_mutex = 0x0, static_mutex = {
      pad = "\001\000\000\000\000\000\000\0005U\000\000\000\000\000\000\001\000\000\000\000\000\000", 
      dummy_double = 4.9406564584124654e-324, dummy_pointer = 0x1, dummy_long = 1}}, cond = 0x0, owner = 0x0, 
  owner_count = 0, waiters = 0x0, ref_count = 1, pending_dispatches = 0x8198010, timeout = 0, next_id = 5, 
  source_list = 0x81983a0, in_check_or_prepare = 0, poll_records = 0x8198020, n_poll_records = 2, 
  cached_poll_array = 0x0, cached_poll_array_size = 0, wake_up_pipe = {9, 10}, wake_up_rec = {fd = 9, events = 1, 
    revents = 0}, poll_waiting = 0, poll_changed = 1, poll_func = 0xb73fc7c0 <poll>, current_time = {tv_sec = 0, 
    tv_usec = 0}, time_is_current = 0}
(gdb) p default_main_context
$17 = (GMainContext *) 0x81982e8

This is not the context of the thread which runs the orbit loop:
g_main_context_iterate (context=0x81a0528, ...

So if I understand this correctly, the problem with the synchronous EDS
API calls is that no thread drives the default event loop unless the
application does that.

[trying it out in SyncEvolution]

Adding a background thread which runs an event loop in the default
context indeed caused the "backend-died" signal to be emitted. I thought
that maybe now the API functions would also return an error, but not so:
unless I terminate the process in my own "backend-died" signal handler
the process remains stuck in the synchronous API call.

Enough for now: I have a solution which is good enough for
SyncEvolution. I still wonder if and how the problem could be fixed in
the EDS libs, but this is out of scope for me - sorry!

-- 
Bye, Patrick Ohly
--  
Patrick Ohly gmx de
http://www.estamos.de/
