[PATCH] core: fix DNS/default route race between IPv4/IPv6 activation



The race is caused by 0.9.4-era changes in how later-completing IP
versions commit their IP config asynchronously by scheduling their
ip_config_result() handler after the device is activated.  Because
nm_device_is_activating() looked at priv->act_source_id, which is
set by nm_device_activate_schedule_ipX_config_result(), it could
return TRUE even when the device was in the ACTIVATED state.

This was exacerbated by 0.9.10, which listens for external changes
to IP configuration and updates the device's IPConfig with them,
which triggered a premature policy update which checked
nm_device_is_activating() and got the wrong result.

The reporter has a dual-stack network with very quick responding
IPv6 router advertisements, no DHCPv6, and slow DHCPv4.  The
sequence of events was:

1) DHCPv4 started

2) IPv6 router advertisement is received and IPv6 is completed
very quickly.

3) Because IPv6 has completed, the device enters the ACTIVATED state

4) Some external IPv6 change comes in via netlink, which schedules
queued_ip_config_change() via an idle handler

5) DHCPv4 replies and dhcp4_state_changed() calls
nm_device_activate_schedule_ip4_config_result() because IPv4 has
not yet completed.  This sets priv->dev_ip4_config with the DHCP
IP4Config details, and then schedules
nm_device_activate_ip4_config_commit() and sets priv->act_source_id
in the process.

6) The scheduled queued_ip_config_change() from (4) runs and calls
ip4_config_merge_and_apply() to merge IPv4 configuration (including
priv->dev_ip4_config) into the combined device IP4Config.  (Note
that this requires that some external IPv4 address/route exist, but
that could have been added by anything at any point.)  This calls
nm_device_set_ip4_config() which diffs the new + old configs and
emits IP4_CONFIG_CHANGED because priv->dev_ip4_config is new.

7) nm-policy.c::device_ip4_config_changed() runs;
nm_device_is_activating () returns *TRUE* because of
the scheduled nm_device_activate_ip4_config_commit() from (5),
and DNS and the default route are not updated.  This is the key
failure point.

8) nm_device_activate_ip4_config_commit() from (5) runs and calls
ip4_config_merge_and_apply(), which calls nm_device_set_ip4_config()
again.  But this time, because nothing has actually changed, because
everything was already merged in step (6), no IP4_CONFIG_CHANGED
is emitted and nm-policy.c::device_ip4_config_changed() does not
run.  Thus DNS and the default route are still not updated.
---
 src/devices/nm-device.c | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/src/devices/nm-device.c b/src/devices/nm-device.c
index a38506f..afa243e 100644
--- a/src/devices/nm-device.c
+++ b/src/devices/nm-device.c
@@ -5088,18 +5088,21 @@ nm_device_is_activating (NMDevice *device)
 
        state = nm_device_get_state (device);
        if (state >= NM_DEVICE_STATE_PREPARE && state <= NM_DEVICE_STATE_SECONDARIES)
                return TRUE;
 
        /* There's a small race between the time when stage 1 is scheduled
         * and when the device actually sets STATE_PREPARE when the activation
-        * handler is actually run.  If there's an activation handler scheduled
-        * we're activating anyway.
+        * handler is actually run.  Catch that by checking for a scheduled
+        * activation handler in the DISCONNECTED state.
         */
-       return priv->act_source_id ? TRUE : FALSE;
+       if (state == NM_DEVICE_STATE_DISCONNECTED && priv->act_source_id)
+               return TRUE;
+
+       return FALSE;
 }
 
 /* IP Configuration stuff */
 
 NMDHCP4Config *
 nm_device_get_dhcp4_config (NMDevice *self)
 {
-- 
1.9.3




[Date Prev][Date Next]   [Thread Prev][Thread Next]   [Thread Index] [Date Index] [Author Index]