Re: Race condition while using usbnet with NM 0.7.1
- From: Dan Williams <dcbw redhat com>
- To: Ricardo Salveti de Araujo <ricardo salveti openbossa org>
- Cc: networkmanager-list gnome org
- Subject: Re: Race condition while using usbnet with NM 0.7.1
- Date: Mon, 13 Jul 2009 19:44:19 -0400
On Fri, 2009-07-10 at 20:32 -0300, Ricardo Salveti de Araujo wrote:
> Hi all,
>
> We're currently using NetworkManager in Mamona, a developer distro
> based on OpenEmbedded for Nokia tablet devices.
>
> Currently we're using NM 0.7 and we just created the packages for NM
> 0.7.1, but while testing it I ran into a race condition with usbnet.
Ha! I've been looking for that race off and on for a while. Thanks for
finding the root cause. Can you try out this commit from master? If it
works for you I'll also cherry-pick to 0.7.x.
commit 302c9fcbccf3ad945afbc3f58e42013045c6e352
Author: Dan Williams <dcbw redhat com>
Date:   Mon Jul 13 19:40:39 2009 -0400

    netlink: fix race that caused stale carrier state signals

    Found by Ricardo Salveti de Araujo <ricardo salveti openbossa org>

    The link cache was updated immediately, but the carrier state signals
    were emitted a lot later, when the cache data was already stale. So
    just update the cache at the same time we emit the signals. The
    carrier-state-request stuff wasn't originally converted to deferred
    for any netlink-specific reason, just to smooth the initial device
    creation process in NM.

Thanks!
Dan
> Here's the log of the Network Manager, while booting the device (with
> additional debug):
> <info> starting...
> <info> nm_netlink_monitor_open_connection()
> <info> nm_netlink_monitor_request_status() <- add the handler to the main loop
> <info> deferred_emit_carrier_state() <- consume the cache
> <info> netlink_object_message_handler() (lo) IFF_LOWER_UP
> <info> netlink_event_input()
> <info> netlink_object_message_handler() (lo) IFF_LOWER_DOWN
> <info> netlink_object_message_handler() (usb0) IFF_LOWER_DOWN
> <info> nm_netlink_monitor_request_status() <- add the handler to the main loop
> <info> (usb0): new Ethernet device (driver: 'ehci_udc')
> <info> (usb0): exported as /org/freedesktop/Hal/devices/net_7a_ce_13_55_f7_81
> <info> Trying to start the supplicant...
> <info> netlink_event_input()
> <info> (usb0): device state change: 1 -> 2
> <info> (usb0): bringing up device.
> <info> (usb0): preparing device.
> <info> (usb0): deactivating device (reason: 2).
> <info> Setting system hostname to 'localhost.localdomain' (no default device)
> <info> netlink_object_message_handler() (usb0) IFF_LOWER_UP
> <info> (usb0): carrier now ON (device state 2)
> <info> (usb0): device state change: 2 -> 3
> <info> netlink_event_input()
> <info> Trying to start the system settings daemon...
> <info> deferred_emit_carrier_state() <- consume the cache
> <info> netlink_object_message_handler() (lo) IFF_LOWER_UP
> <info> netlink_object_message_handler() (usb0) IFF_LOWER_DOWN
> <info> (usb0): carrier now OFF (device state 3)
> <info> (usb0): device state change: 3 -> 2
> <info> (usb0): deactivating device (reason: 40).
>
> The problem is that the device carrier status ends up OFF, while it
> should be ON so that NM can finish setting the IP address and leave
> the device ready to use.
>
> While trying to identify where the problem is, I found that the
> function deferred_emit_carrier_state (nm-netlink-monitor.c) takes
> longer than expected to be called, and between the nl_cache_refill
> and the actual message handler, NM brings the device up, setting the
> carrier status to ON. By the time deferred_emit_carrier_state is
> called by the main loop, the cache data is no longer valid, which
> sets the usb0 carrier status back to OFF.
>
> Because of this behavior, NM is not configuring the device as it
> should, and the interface remains up without any IP address.
>
> This does not happen every time, though. When NM brings up the device
> after deferred_emit_carrier_state has been called, everything works
> fine, which is why it looks like a race condition.
>
> The question is, what's the best way to fix this issue?
>
> I can see two directions: one is to check the cached data when
> getting a new event (such as usb0 being brought up), and the other is
> to call nl_cache_refill inside deferred_emit_carrier_state, which
> slightly changes the current behavior.
>
> As I still don't understand much of the NM code (I started reading it
> in depth today), I would like to know which solution I should work
> on, so I can send you a patch later.
>
> Thanks!
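
For illustration only, the ordering problem described in this thread can be
modeled outside NM with a toy main loop. This is a sketch, not NM code: the
Kernel and Monitor classes, the refill_on_emit flag and final_signal() are
all hypothetical names, and the deque merely stands in for main-loop idle
handlers. It only shows why snapshotting the cache at request time and
emitting later produces a stale final signal, and why refreshing the cache
inside the deferred handler does not.

```python
from collections import deque


class Kernel:
    """Hypothetical stand-in for the kernel's real link state."""

    def __init__(self):
        self.carrier = {"usb0": False}


class Monitor:
    """Toy model of a link monitor with a deferred carrier-state emit."""

    def __init__(self, kernel, refill_on_emit):
        self.kernel = kernel
        self.refill_on_emit = refill_on_emit
        self.cache = {}
        self.idle = deque()   # stand-in for main-loop idle handlers
        self.signals = []     # (iface, carrier_on) in emission order

    def refill_cache(self):
        # Snapshot the current kernel state into the cache.
        self.cache = dict(self.kernel.carrier)

    def request_status(self):
        if not self.refill_on_emit:
            # Racy ordering: snapshot now, emit from the snapshot later.
            self.refill_cache()
        self.idle.append(self.deferred_emit_carrier_state)

    def deferred_emit_carrier_state(self):
        if self.refill_on_emit:
            # Fixed ordering: refresh the cache at the moment of emission.
            self.refill_cache()
        for iface, on in self.cache.items():
            self.signals.append((iface, on))

    def link_event(self, iface, on):
        # A live link event: kernel state changes and is signalled at once.
        self.kernel.carrier[iface] = on
        self.signals.append((iface, on))

    def run_idle(self):
        while self.idle:
            self.idle.popleft()()


def final_signal(refill_on_emit):
    """Return the last carrier signal seen for the race scenario."""
    mon = Monitor(Kernel(), refill_on_emit)
    mon.request_status()           # usb0 carrier is still off here
    mon.link_event("usb0", True)   # device comes up before the idle handler
    mon.run_idle()                 # deferred handler finally runs
    return mon.signals[-1]


print(final_signal(refill_on_emit=False))
print(final_signal(refill_on_emit=True))
```

With refill_on_emit=False the last signal reports the carrier as off even
though the link is up, which matches the "carrier now OFF" line at the end
of the log above; with refill_on_emit=True the deferred handler reports the
current state.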