Race condition while using usbnet with NM 0.7.1



Hi all,

We're currently using NetworkManager in Mamona, a developer distro
based on OpenEmbedded for Nokia tablet devices.

We're on NM 0.7 and have just created packages for NM 0.7.1, but
while testing them I ran into what looks like a race condition with
usbnet.

Here's the NetworkManager log from booting the device (with
additional debug output):
<info>  starting...
<info>  nm_netlink_monitor_open_connection()
<info>  nm_netlink_monitor_request_status() <- add the handler to the main loop
<info>  deferred_emit_carrier_state() <- consume the cache
<info>  netlink_object_message_handler() (lo) IFF_LOWER_UP
<info>  netlink_event_input()
<info>  netlink_object_message_handler() (lo) IFF_LOWER_DOWN
<info>  netlink_object_message_handler() (usb0) IFF_LOWER_DOWN
<info>  nm_netlink_monitor_request_status() <- add the handler to the main loop
<info>  (usb0): new Ethernet device (driver: 'ehci_udc')
<info>  (usb0): exported as /org/freedesktop/Hal/devices/net_7a_ce_13_55_f7_81
<info>  Trying to start the supplicant...
<info>  netlink_event_input()
<info>  (usb0): device state change: 1 -> 2
<info>  (usb0): bringing up device.
<info>  (usb0): preparing device.
<info>  (usb0): deactivating device (reason: 2).
<info>  Setting system hostname to 'localhost.localdomain' (no default device)
<info>  netlink_object_message_handler() (usb0) IFF_LOWER_UP
<info>  (usb0): carrier now ON (device state 2)
<info>  (usb0): device state change: 2 -> 3
<info>  netlink_event_input()
<info>  Trying to start the system settings daemon...
<info>  deferred_emit_carrier_state() <- consume the cache
<info>  netlink_object_message_handler() (lo) IFF_LOWER_UP
<info>  netlink_object_message_handler() (usb0) IFF_LOWER_DOWN
<info>  (usb0): carrier now OFF (device state 3)
<info>  (usb0): device state change: 3 -> 2
<info>  (usb0): deactivating device (reason: 40).

The problem is that the usb0 carrier status ends up OFF when it should
be ON, so NM never finishes setting the IP address and the interface
is never ready to use.

While trying to identify the problem, I found that
deferred_emit_carrier_state (nm-netlink-monitor.c) is called later
than expected. Between the nl_cache_refill and the actual message
handler, NM brings the device up, which sets the carrier status to ON.
By the time the main loop calls deferred_emit_carrier_state, the
cached data is no longer valid, and replaying it sets the usb0 carrier
status back to OFF.
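
To make the ordering concrete, here is a toy model in Python. It is
only a sketch of the sequence described above, not NM code: ToyMonitor
and its methods are hypothetical stand-ins, and the only assumption is
that the deferred handler replays a snapshot taken earlier.

```python
# Toy model of the ordering problem. None of these names exist in
# NetworkManager; they only mirror the roles described in the text.
class ToyMonitor:
    def __init__(self, kernel_state):
        self.kernel = kernel_state   # what the kernel would report right now
        self.cache = {}              # snapshot taken by refill()
        self.carrier = {}            # what the device layer believes

    def refill(self):
        # Stand-in for nl_cache_refill: copy the current kernel view.
        self.cache = dict(self.kernel)

    def live_event(self, iface, up):
        # Stand-in for a netlink event handled as soon as it arrives.
        self.kernel[iface] = up
        self.carrier[iface] = up

    def deferred_emit(self):
        # Stand-in for deferred_emit_carrier_state: replay the snapshot.
        for iface, up in self.cache.items():
            self.carrier[iface] = up


def run_racy_order():
    mon = ToyMonitor({"lo": True, "usb0": False})
    mon.refill()                   # snapshot says usb0 carrier is OFF
    mon.live_event("usb0", True)   # device brought up, carrier ON
    mon.deferred_emit()            # stale snapshot replayed, carrier OFF
    return mon.carrier["usb0"]


def run_good_order():
    mon = ToyMonitor({"lo": True, "usb0": False})
    mon.refill()
    mon.deferred_emit()            # snapshot consumed while still accurate
    mon.live_event("usb0", True)   # device brought up afterwards
    return mon.carrier["usb0"]
```

run_racy_order() leaves usb0 with carrier OFF, while run_good_order()
leaves it ON; the only difference is when the deferred handler runs
relative to the live event.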

Because of this, NM does not configure the device as it should, and
the interface stays up without an IP address.

This doesn't happen every time, though. When NM brings the device up
after deferred_emit_carrier_state has been called, everything works
fine, which is why it looks like a race condition.

The question is, what's the best way to fix this issue?

I see two possible directions: one is to check the cached data when a
new event arrives (such as usb0 being brought up), and the other is to
call nl_cache_refill inside deferred_emit_carrier_state, which changes
the current behavior slightly.
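
As a rough illustration of the two directions, here is a toy model in
Python. It is a sketch under simplifying assumptions, not a patch: the
ToyMonitor class and the two option names (patch_cache_on_event,
refill_on_emit) are hypothetical and do not exist in NM.

```python
# Toy comparison of the two candidate fixes. All names are hypothetical.
class ToyMonitor:
    def __init__(self, kernel_state, patch_cache_on_event=False,
                 refill_on_emit=False):
        self.kernel = kernel_state
        self.cache = {}
        self.carrier = {}
        self.patch_cache_on_event = patch_cache_on_event
        self.refill_on_emit = refill_on_emit

    def refill(self):
        # Stand-in for nl_cache_refill: copy the current kernel view.
        self.cache = dict(self.kernel)

    def live_event(self, iface, up):
        self.kernel[iface] = up
        self.carrier[iface] = up
        # Direction 1: keep a pending snapshot in step with live events.
        if self.patch_cache_on_event and iface in self.cache:
            self.cache[iface] = up

    def deferred_emit(self):
        # Direction 2: take a fresh snapshot right before replaying it.
        if self.refill_on_emit:
            self.refill()
        for iface, up in self.cache.items():
            self.carrier[iface] = up


def final_usb0_carrier(**options):
    # Always run the problematic order: refill, live event, deferred emit.
    mon = ToyMonitor({"lo": True, "usb0": False}, **options)
    mon.refill()
    mon.live_event("usb0", True)
    mon.deferred_emit()
    return mon.carrier["usb0"]
```

In this model, final_usb0_carrier() ends with carrier OFF, while
passing either patch_cache_on_event=True or refill_on_emit=True ends
with carrier ON. The model says nothing about which option fits the
real code better.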

As I don't understand much of the NM code yet (I only started reading
it in depth today), I'd like to know which solution I should work on,
so I can send you a patch later.

Thanks!
-- 
Ricardo Salveti de Araujo

