Re: BUG: NM only connects the third time

On Fri, 2007-11-02 at 21:35 +0100, Eric Brunet lps ens fr wrote:
> On Fri, Nov 02, 2007 at 07:19:05AM -0400, Dan Williams wrote:
> > NM ignores link changes during device activation.  This is for a number
> > of reasons.  The main ones are driver variance and the fact that, during
> > activation, the device may bounce up and down due to dhclient starting
> > up, or whatever.  Drivers are quite finicky; rtl8189 are quite notorious
> > for link bouncing.
> > 
> > This may have gotten better as drivers get better over time, and may no
> > longer be the appropriate behavior.  At a minimum, a link timer needs to
> > be started that would wait 5 seconds and smooth out link state before
> > terminating the activation.
> Ah yes, I understand that. It makes sense not to interrupt an ongoing
> connexion, just in case the link information is wrong. But I am still
> confused by the logs: after the line "Activation (eth0): cancelled",
> dhclient is still running. Is that expected? NM announces it is giving
> up but goes on trying just in case? And this dhclient is still running
> while NM is connecting through wlan0, and even after it managed to
> connect through wlan0!

Looks like a bug; although we just found an issue with Fedora 8 where
the network service would start a dhclient when the machine comes out of
suspend or a device gets hotplugged.
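
To make the link timer idea from the quoted text at the top more concrete,
here is a minimal sketch of that kind of smoothing. It is not NM code: the
class name, method names and the injected `now` timestamps are all
hypothetical, and the 5 second hold time is simply the figure mentioned above.
A raw link change is only committed once it has held for the whole hold
period, so a short bounce never reaches the activation logic.

```python
class LinkDebouncer:
    """Smooth raw carrier events (hypothetical helper, not part of NM)."""

    def __init__(self, initial_up, hold_seconds=5.0):
        self.hold_seconds = hold_seconds
        self.stable_up = initial_up   # state reported to the rest of the program
        self._changed_at = None       # when the raw state began to differ

    def on_raw_link(self, up, now):
        """Record a raw link event from the driver at time `now` (seconds)."""
        if up == self.stable_up:
            # The link bounced back before the timer expired: forget it.
            self._changed_at = None
        elif self._changed_at is None:
            # First event of a possible real change: start the timer.
            self._changed_at = now

    def poll(self, now):
        """Return the smoothed state, committing a change that has held."""
        if (self._changed_at is not None
                and now - self._changed_at >= self.hold_seconds):
            self.stable_up = not self.stable_up
            self._changed_at = None
        return self.stable_up
```

With something like this, the activation would only be torn down when
`poll()` reports the link as down, rather than on the first raw event.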

> Maybe it is the expected behaviour (let's just run dhclient on both
> interfaces and see who's winning), but then the messages in the logs are
> confusing and NM should not announce that it cancelled eth0.
> > > 11:06:53 NM: <info>  Activation (eth0) Beginning DHCP transaction.
> > > 11:06:53 NM: <info>  DHCP daemon state is now 12 (successfully started) for interface eth0
> > > 11:06:53 NM: <info>  Activation (eth0) Stage 3 of 5 (IP Configure Start) complete.
> > > 11:06:53 NM: <info>  nm-device-802-3-ethernet.c - link_deactivated_helper (129) device eth0 will set active link to FALSE
> > > 11:06:53 NM: <info>  nm-device.c - nm_device_set_active_link (596) device eth0 link state set to 0
> > > 11:06:53 NM: <info>  SWITCH: terminating current connection 'eth0' because it's no longer valid.
> > > 11:06:53 NM: <info>  Deactivating device eth0.
> > > 11:06:53 NM: <info>  Activation (eth0): cancelling...
> > > 11:06:53 NM: <info>  Activation (eth0) cancellation handler scheduled...
> > > 11:06:53 NM: <info>  Activation (eth0): waiting for device to cancel activation.
> > > 11:06:54 NM: <info>  Activation (eth0) cancellation handled.
> > > 11:06:54 NM: <info>  Activation (eth0): cancelled.
> > > 11:06:54 NM: nm_device_is_802_3_ethernet: assertion `dev != NULL' failed
> > > 11:06:54 NM: nm_device_is_802_11_wireless: assertion `dev != NULL' failed
> > > 11:06:54 dhclient: DHCPDISCOVER on eth0 to port 67 interval 6
> > > 11:07:00 dhclient: DHCPDISCOVER on eth0 to port 67 interval 10
> > > 11:07:10 dhclient: DHCPDISCOVER on eth0 to port 67 interval 13
> > > 11:07:23 dhclient: DHCPDISCOVER on eth0 to port 67 interval 21
> > > 11:07:30 NM: <info>  Updating allowed wireless network lists.
> > > 
> > > 	[... from here, NM switches to wlan0, launches a dhclient on the wifi and gets my connexion, but one can still find in the logs: ...]
> > > 
> > > 11:07:38 NM: <info>  Activation (wlan0) successful, device activated.
> > > 11:07:44 dhclient: DHCPDISCOVER on eth0 to port 67 interval 8
> > > 11:07:52 dhclient: DHCPDISCOVER on eth0 to port 67 interval 3
> > > 11:07:55 dhclient: No DHCPOFFERS received.
> But again, it is not very important.
> A last thing and I'll stop bothering you, a usability request.
> As I understand, NM keeps a memory of Access Points that the wifi card
> found around, and it doesn't drop them at once when the card stops
> reporting them, just in case they reappear.

Right; you don't get every AP in each scan for a number of reasons.
Therefore the scan list is a composite over 6 minutes of the APs that
are seen.  An AP will get dropped from the list if it hasn't been seen
in 6 minutes.  One thing I may do is to drop them earlier if there are a
_lot_ of APs in the scan list.  When you're on a train or in a car, you
can get a lot of transient scan results that need to be dropped quickly.
So if there are a lot of new APs found within a given time, it makes
sense to drop APs more quickly.
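
As a rough illustration of the above (a sketch only, not the actual NM
code), the composite list can be thought of as a map from AP to the time it
was last seen, pruned after each scan. The 6 minute figure comes from the
paragraph above; the class name, the `crowded_age` and `crowded_count`
values, and the use of BSSID strings as keys are assumptions made up for the
example, since the faster dropping is only an idea at this point.

```python
class ScanList:
    """Composite of recent scan results with age-based expiry (hypothetical)."""

    def __init__(self, max_age=360.0, crowded_age=60.0, crowded_count=20):
        self.max_age = max_age              # normal expiry: 6 minutes
        self.crowded_age = crowded_age      # assumed shorter expiry when crowded
        self.crowded_count = crowded_count  # assumed threshold for "a lot of APs"
        self._last_seen = {}                # BSSID -> time last seen (seconds)

    def merge_scan(self, bssids, now):
        """Fold one scan result into the composite list, then expire old APs."""
        for bssid in bssids:
            self._last_seen[bssid] = now
        self.prune(now)

    def prune(self, now):
        """Drop APs not seen recently; use the short limit if the list is big."""
        crowded = len(self._last_seen) > self.crowded_count
        limit = self.crowded_age if crowded else self.max_age
        self._last_seen = {
            bssid: seen
            for bssid, seen in self._last_seen.items()
            if now - seen < limit
        }

    def visible(self):
        """Return the APs currently in the composite list, sorted by BSSID."""
        return sorted(self._last_seen)
```

An AP missing from one scan stays listed as long as an earlier scan saw it
within the limit, which is the point of keeping a composite at all.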

> For this reason, it makes sense to stop the connexion before suspend and
> reestablish it after resume, because otherwise NM would stay connected to
> the AP it had before suspending, until it realized the AP is no longer
> around, and this could take about 30 seconds, without a valid connection.
> However, because of the suspend/wake mechanism, after each resume NM has
> to reestablish a connexion, which takes around 5 seconds. That's 5
> seconds guaranteed without connexion, to avoid the possibility of having
> 30 seconds without if the laptop moved during the suspend. I understand
> the logic, even if I am not sure it is a good deal.
> [I hope I have well understood the problem and am not talking rubbish]
> I was wondering if there was a way to make the best of both worlds: the
> suspend method of NM could be made a NOP, so that the connexion is not
> shut down at suspend. At resume time, when NM receives the wake method,
> it could check at once the existing networks around and try to determine,
> by comparing the previous list and the new list, if the laptop moved or
> not during the suspend. If it moved, get a new connection. If it didn't,
> keep the existing good connection.
> This way, a laptop moving during the suspend gets a 5 second delay on
> resume without connectivity (the time NM takes to connect to a new AP),
> and the laptop which didn't move instantly has a working connexion.

The problem here is that the card is powered down during suspend or
hibernate.  When the card is down, it loses the association, and you
have to go back through the entire connection process anyway, just to
associate, authenticate, and get an address back.

