Re: ModemManager: MMCallbackInfo scheduled twice when device removed

From: Aleksander Morgado <aleksander lanedo com>
To: Dan Williams <dcbw redhat com>
Cc: "Network Manager \(Devel\)" <networkmanager-list gnome org>
Subject: Re: ModemManager: MMCallbackInfo scheduled twice when device removed
Date: Thu, 12 May 2011 11:48:37 +0200

> > 
> > Each MMCallbackInfo holds a weak reference to the MMModem to which the
> > AT command was sent. When the MMModem is destroyed,
> > mm-callback-info.c::modem_destroyed_cb() gets called and the modem
> > pointer in the callback info is reset to NULL, to avoid having a pointer
> > to an already disposed MMModem. See [1].
> > 
> > But, in addition to setting the modem pointer to NULL, an explicit error
> > is just set directly in the callback info, and it gets scheduled. The
> > problem here is that after destroying the modem, the ports it held also
> > get closed as part of the destruction, and during port closing there's
> > the task of finalizing all pending commands (see
> > mm-serial-port.c::mm_serial_port_close()), which is done by passing an
> > error to the specific response handler, and which will then (usually)
> > propagate that error to the MMCallbackInfo and schedule it right away.
> > 
> > That ends up in a memory leak (most response handlers will just set
> > info->error without assuming there may already be a GError set); plus a
> > critical in the second call to mm_callback_info_schedule() because a
> > callback info can't be scheduled twice.
> > 
> > I've been looking at this problem some time, and the best way to fix it
> > seems to be to just avoid setting the first error and scheduling in
> > modem_destroyed_cb():
> > 
> > diff --git a/src/mm-callback-info.c b/src/mm-callback-info.c
> > index 1986bb5..8f8ecca 100644
> > --- a/src/mm-callback-info.c
> > +++ b/src/mm-callback-info.c
> > @@ -53,13 +53,9 @@ modem_destroyed_cb (gpointer data, GObject
> > *destroyed)
> >  {
> >      MMCallbackInfo *info = data;
> >  
> > +    /* Just set the modem pointer to NULL, do not further process the
> > +     * callback info */
> >      info->modem = NULL;
> > -    if (!info->pending_id) {
> > -        info->error = g_error_new_literal (MM_MODEM_ERROR,
> > -                                           MM_MODEM_ERROR_REMOVED,
> > -                                           "The modem was removed.");
> > -        mm_callback_info_schedule (info);
> > -    }
> >  }
> > 
> > Comments?
> 
> How does that play into the mm_modem_check_removed() checks which I
> think are one of the more common places this error is looked for?  I
> think the motivation here was to be *sure* that any callback handler
> waiting for the result got an error.  The thing we want to ensure here
> is that if a D-Bus client of MM made a request when the modem was pulled
> out or crashed (which can happen pretty often) that the client gets a
> reply instead of a timed-out D-Bus method return error.  I'm sure
> there's room for cleanup here, but that was the original motivation.  Do
> we just need to audit the callback handlers to make sure they do the
> right thing with a ERROR_REMOVED?
> 

You're right, removing the error setting in modem_destroyed_cb() is not
the right thing to do, if we want to keep the ERROR_REMOVED logic
around.

I will try to explain the issue with an example flow of execution, and
suggest another approach below:

 (1) MM sends an AT command to the modem, like the periodic ones to get
signal strength.

 (2) Modem is disconnected, reply to the sent AT command was not
received yet, so it gets disconnected in the middle of a send-receive
operation.

 (3) MMModem object destruction starts. Due to the weak reference to the
modem in the MMCallbackInfo, modem_destroyed_cb() gets called.
info->modem is set to NULL, info->error is set to ERROR_REMOVED and the
callback info is scheduled.

 (4) During modem object destruction (in the same loop iteration as step
3!), ports in the modem also get destroyed. During port destruction, all
pending commands which didn't get reply are processed, and a SEND_FAILED
error is passed to the response-received callback. This callback then
uses mm_modem_check_removed() (well, actually not all use it), and as
info->modem is NULL, it will overwrite info->error with a newly created
ERROR_REMOVED error. Once the new error is set, the callback info is
scheduled, again.

So always ends up with a memleak and a re-schedule of the callback info.

>From what I see, there are some things to fix here then:

 (a) Not all response-received callbacks use mm_modem_check_removed(),
they should all use it.

 (b) The response-received callback should not overwrite info->error if
already set (avoiding memleak).

 (c) If ERROR_REMOVED is already set in the MMCallbackInfo, the
response-received callback should *not* re-schedule the callback info,
it should assume it was already scheduled. Another option would be to
modify mm_callback_info_schedule() to just allow being called twice,
silently returning without doing anything the second time it gets
called.

What do you think of this?

Cheers!

-- 
Aleksander

Follow-Ups:
- Re: ModemManager: MMCallbackInfo scheduled twice when device removed
  - From: Aleksander Morgado

References:
- ModemManager: MMCallbackInfo scheduled twice when device removed
  - From: Aleksander Morgado
- Re: ModemManager: MMCallbackInfo scheduled twice when device removed
  - From: Dan Williams

[Date Prev][Date Next] [Thread Prev][Thread Next] [Thread Index] [Date Index] [Author Index]