supplicant interface scan state tracking



Hi Dan,

I'm finding it quite easy to reproduce a bug related to
nm_supplicant_interface_get_scanning()
but I'm not sure how to fix it.

The logic implemented in my OLPC mesh device so far is that if the
companion device is scanning, it postpones stage2 until the scanning has
finished.

It does this by monitoring a new "scanning" property on NMDeviceWifi
which is implemented based on nm_supplicant_interface_get_scanning().
http://dev.laptop.org/git/users/dsd/NetworkManager/commit/?h=olpc&id=111baac88318f4db467360fd9703f37ac0449023
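
For illustration, the wait logic boils down to roughly the following. This is a simplified standalone model, not the code from the patch; MeshModel, mesh_try_stage2() and mesh_companion_scanning_changed() are made-up names for this sketch only.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the mesh device's private state. */
typedef struct {
    bool stage2_pending;   /* stage 2 was postponed, waiting on the companion */
    bool stage2_started;   /* stage 2 has been kicked off */
} MeshModel;

/* Called when activation reaches stage 2: postpone it while the
 * companion device reports that it is scanning. */
static void mesh_try_stage2(MeshModel *mesh, bool companion_scanning)
{
    if (companion_scanning) {
        mesh->stage2_pending = true;
        return;
    }
    mesh->stage2_pending = false;
    mesh->stage2_started = true;
}

/* Called whenever the companion's "scanning" property changes:
 * retry stage 2 only if it was postponed earlier. */
static void mesh_companion_scanning_changed(MeshModel *mesh, bool companion_scanning)
{
    if (mesh->stage2_pending)
        mesh_try_stage2(mesh, companion_scanning);
}
```

In this model, stage 2 only ever starts once the companion's reported value goes to false, so the whole scheme depends on that value being accurate.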

Also, if a connection on eth0 is active when you activate a msh0
connection, msh0 moves eth0 into NM_DEVICE_STATE_DISCONNECTED and
disables autoconnect (via the mechanism in the patch I emailed earlier).

Anyway, I can now easily reproduce the following sequence of events
causing nm_supplicant_interface_get_scanning() to be less than truthful
and cause a deadlock:

to start with, an eth0 connection is activating:

<info>  Activation (eth0) Stage 2 of 5 (Device Configure) complete.
<info>  Config: set interface ap_scan to 1

at this point, inside NMDeviceOlpcMeshPrivate, "scanning" is TRUE and
con_state = SCANNING (I know this through some debug messages)

<info>  (eth0): supplicant connection state:  disconnected -> scanning

but I interrupt it here by starting a mesh connection

<info>  Activation (msh0) starting connection 'olpc-mesh-1'
<info>  (msh0): device state change: 3 -> 4 (reason 0)
<info>  Activation (msh0) Stage 1 of 5 (Device Prepare) scheduled...
<info>  Activation (msh0) Stage 1 of 5 (Device Prepare) started...

msh0 now disconnects eth0

<info>  (eth0): device state change: 5 -> 3 (reason 2)
<info>  (eth0): deactivating device (reason: 2).
<info>  Activation (msh0) Stage 1 of 5 (Device Prepare) complete.

At this point, another dbus signal comes in from wpa_supplicant so
"scanning" moves to FALSE. This wakes up msh0 device which calls
nm_supplicant_interface_get_scanning() to figure out the new state, but
this returns TRUE because con_state is still SCANNING, so msh0 does not
continue the connection process and everything stops.
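
To make the failure concrete, here is a minimal model of the behaviour I'm describing. It assumes that the getter reports scanning when either the "scanning" flag is set or con_state is SCANNING, which is what the sequence above suggests. SupplicantModel, ConState and model_get_scanning() are hypothetical names for this sketch, not the real NM types.

```c
#include <stdbool.h>

/* Hypothetical, reduced version of the supplicant connection state. */
typedef enum {
    CON_STATE_DISCONNECTED,
    CON_STATE_SCANNING
} ConState;

/* Hypothetical model of the two pieces of state involved. */
typedef struct {
    bool scanning;       /* updated by the scanning signal from wpa_supplicant */
    ConState con_state;  /* updated by connection state change signals */
} SupplicantModel;

/* Assumed behaviour: report scanning if either source says so. */
static bool model_get_scanning(const SupplicantModel *s)
{
    return s->scanning || s->con_state == CON_STATE_SCANNING;
}
```

Starting from scanning = true and con_state = CON_STATE_SCANNING, clearing only the scanning flag still leaves model_get_scanning() returning true. It goes false only once con_state also leaves SCANNING, and that is the update that never arrives in the sequence above.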

What confuses me a little here is that the supplicant is still alive and
running, even though there aren't any active connections. It did also
manage to raise a dbus signal indicating the termination of the scan
*after* NM sent the disconnection request, but it did not manage to
communicate any change in con_state. Also, I cannot connect to it with
wpa_cli to see if it is still in SCANNING state.

Thoughts?

Thanks,
Daniel



