Re: periodic_update(): Roamed ... - msg#00300
- From: Dan Williams <dcbw redhat com>
- To: Anastas Giokov <antzi antzi bg>
- Cc: networkmanager-list gnome org
- Subject: Re: periodic_update(): Roamed ... - msg#00300
- Date: Thu, 22 Mar 2012 11:52:02 -0500
On Tue, 2012-03-20 at 21:16 +0200, Anastas Giokov wrote:
> > On Tue, 2009-04-28 at 09:42 -0700, Howard Chu wrote:
> > > Dan Williams wrote:
> > > > On Fri, 2009-04-24 at 15:18 -0700, Howard Chu wrote:
> > > >> Dan Williams wrote:
> > > >>> On Mon, 2009-04-20 at 15:37 -0700, Howard Chu wrote:
> > > >>>> Howard Chu wrote:
> > > >>>>> This is probably more related to the ath9k driver, but I
> > wanted to
> > > >>>>> start here
> > > >>>>> in case anyone is familiar with it. I've been seeing this
> > for the past
> > > >>>>> couple
> > > >>>>> months, and I just now rebuilt NM fresh from git and it's
> > still
> > > >>>>> happening:
> > > >>>>
> > > >>>> I seem to have ruled out the driver; doing a kill -9 on
> > NetworkManager
> > > >>>> so it
> > > >>>> doesn't have the opportunity to tear down the connection on
> > exit, shows
> > > >>>> that
> > > >>>> the wifi connection works perfectly once NetworkManager is
> > gone. No
> > > >>>> disassociation messages in dmesg, no pauses in ssh sessions,
> > etc.
> > > >>>
> > > >>> Don't rule out the driver. Does the driver always return the
> > currently
> > > >>> associated AP in the scan list? If not, you might hit this.
> > And the
> > > >>> driver is being stupid, because of *course* the AP you're
> > currently
> > > >>> connected to should always be in the scan list, unless you're
> > no longer
> > > >>> connected to it.
> > > >>>
> > > >>> The code in NM grabs the SSID& BSSID on a 6 second timer, and
> > tries to
> > > >>> find that AP in the scan list. If it can't find it (because
> > the driver
> > > >>> didn't return that AP in the scan list) then it reports none.
> > > >>>
> > > >>> If that's your problem, it's a driver problem.
> > > >>
> > > >> How would I check this? Should it be obvious from "iwlist
> > scan" ? That
> > > >> returns
> > > >> the current AP along with the other visible ones.
> > > >>
> > > >> Also, reviewing the comments in bug 291760, this problem is not
> > just
> > > >> isolated
> > > >> to the ath9k driver. It's also being reported for ath5k, wl,
> > iwl3945,
> > > >> ipw2100,
> > > >> rtl8187, and b43, across multiple kernel and driver revisions.
> > As such it
> > > >> seems unlikely to be the drivers' fault.
> > > >
> > > > Depends; it might show up in that scan, or it might not. If you
> > can
> > > > reliably get it to show up every time, that's great. But until
> > 2.6.30,
> > > > mac80211-based drivers would not always return the current AP.
> > And some
> > > > of the older drivers don't either, though fullmac drivers are
> > more
> > > > likely to be OK there.
> > >
> > > If you already know for a fact that certain drivers are
> > incompatible with NM,
> > > it seems you should be documenting that in your release notes or
> > something.
> > > Or, you should be maintaining a list of tested known-to-work
> > drivers.
> > >
> > > > There is one window where NM wouldn't be able to find the AP; if
> > NM just
> > > > did a scan, and then the card reassociates to a different AP
> > that's not
> > > > in the scan list, and doesn't send an GIWSCAN event so that the
> > AP list
> > > > gets pulled (ipw2x00 do this, other drivers might not), then
> > when NM
> > > > goes to pull the BSSID off the card, the scan list doesn't
> > contain the
> > > > current AP. What NM should be doing here is to request that the
> > > > supplicant grab the scan list again when it sees a BSSID it
> > doesn't know
> > > > about, but that's somewhat complicated.
> > >
> > > There must be more cases than this, because there are no other APs
> > for my
> > > card
> > > to associate to. (They're all secured with WEP or WPA, and I only
> > have
> > > credentials for mine.) The only reason I ever see the card
> > reassociate at all
> > > is due to NM's scanning; with that patched out it just stays
> > associated.
> > >
> > > > If the driver doesn't return the frequency of the BSSID in the
> > scan
> > > > results, or that frequency doesn't match what the card reports
> > from
> > > > SIOCGIWFREQ, then NM also can come up with (none). Check that
> > the
> > > > information from an 'iwlist scan' for that BSSID matches what
> > 'iwconfig'
> > > > reports when associated to that specific AP.
> > > >
> > > > So in conclusion there are actual driver bugs; (a) not all
> > drivers scan
> > > > results contain the currently associated AP in every scan, and
> > (b) not
> > > > all drivers emit scan results events when they associate to a
> > new AP
> > > > that's not already in the scan list, and finally (c) some
> > drivers are
> > > > just busted and return wrong channel information.
> > >
> > > Pretty sure (c) is not the case here, the info from iwlist scan
> > and iwconfig
> > > all matches. (b) won't happen in my current environment, so I
> > can't test one
> > > way or another. (a) doesn't appear to happen when I look, but I
> > have no idea
> > > how many scans are needed before the symptom occurs.
> >
> > You can run NM with:
> >
> > NM_ACTIVE_AP_DEBUG=1 /usr/sbin/NetworkManager --no-daemon
> >
> > and get a lot more info about what's happening here and why the
> > active
> > AP isn't found in the scan list. Any chance you can do that and post
> > the results?
> >
> > Dan
> >
> Since I am having the same problem, with the wl driver, I did the test
> and found the following in the log:
>
> NetworkManager: <debug> [1332269638.001163] get_active_ap(): (eth1)
> BSSID: 00:14:bf:09:0b:86
> NetworkManager: <debug> [1332269638.001288] get_active_ap(): (eth1)
> SSID: 'Homenet'
> NetworkManager: <debug> [1332269638.001325] get_active_ap(): Pass
> #1
> NetworkManager: <debug> [1332269638.001355] get_active_ap(): AP:
> 'Homenet' 00:14:bf:09:0b:86
> NetworkManager: <debug> [1332269638.001424] get_active_ap():
> matched
> NetworkManager: <debug> [1332269644.002103] get_active_ap(): (eth1)
> BSSID: 00:14:bf:09:0b:86
> NetworkManager: <debug> [1332269644.002238] get_active_ap(): (eth1)
> SSID: 'Homenet'
> NetworkManager: <debug> [1332269644.002301] get_active_ap(): Pass
> #1
> NetworkManager: <debug> [1332269644.002330] get_active_ap(): AP:
> 'Homenet' 00:14:bf:09:0b:86
> NetworkManager: <debug> [1332269644.002401] get_active_ap():
> matched
> NetworkManager: <debug> [1332269650.002517] get_active_ap(): (eth1)
> BSSID: 00:14:bf:09:0b:86
> NetworkManager: <debug> [1332269650.002650] get_active_ap(): (eth1)
> SSID: 'Homenet'
> NetworkManager: <debug> [1332269650.002686] get_active_ap(): Pass
> #1
> NetworkManager: <debug> [1332269650.002716] get_active_ap(): AP:
> 'Homenet' 00:14:bf:09:0b:86
> NetworkManager: <debug> [1332269650.002789] get_active_ap():
> frequency mismatch (device 0, ap 2427)
> NetworkManager: <debug> [1332269650.002820] get_active_ap(): AP:
> (none) 38:22:9d:9c:b7:6f
> NetworkManager: <debug> [1332269650.002848] get_active_ap():
> BSSID mismatch
> ...
> NetworkManager: <debug> [1332269650.005566] get_active_ap(): AP:
> 'home_net' 38:22:9d:9c:b7:6e
> NetworkManager: <debug> [1332269650.005593] get_active_ap():
> BSSID mismatch
> NetworkManager: <debug> [1332269650.005616] get_active_ap(): No
> matching AP found.
> NetworkManager: <debug> [1332269650.005646] periodic_update(): Roamed
> from BSSID 00:14:BF:09:0B:86 (Homenet) to (none) ((none))
> NetworkManager: <debug> [1332269656.002429] get_active_ap(): (eth1)
> BSSID: 00:14:bf:09:0b:86
> NetworkManager: <debug> [1332269656.002855] get_active_ap(): (eth1)
> SSID: 'Homenet'
> NetworkManager: <debug> [1332269656.002903] get_active_ap(): Pass
> #1
> NetworkManager: <debug> [1332269656.002964] get_active_ap(): AP:
> 'Homenet' 00:14:bf:09:0b:86
> NetworkManager: <debug> [1332269656.003045] get_active_ap():
> matched
> NetworkManager: <debug> [1332269656.003102] periodic_update(): Roamed
> from BSSID (none) ((none)) to 00:14:BF:09:0B:86 (Homenet)
>
> I cannot understand where that "frequency mismatch (device 0, ap
> 2427)" comes from. My AP is a WRT54GL (802.11b/g) and my laptop
> device
It happens because the wl.o driver fails to return a correct frequency
when it's asked to provide it's operating frequency. NM already
contains a workaround specifically for the wl.o driver, but that
workaround depends on the driver returning something greater than 0.
I've pushed a workaround for this case that should hit 0.9.4. If
there's any possible way you can use an actual open driver from the
upstream kernel (b43 or bcma) that may provide a better experience.
Dan
> is the following:
>
> 07:00.0 Network controller: Broadcom Corporation Device 4353 (rev 01)
> Subsystem: Hewlett-Packard Company Device 1510
> Flags: bus master, fast devsel, latency 0, IRQ 17
> Memory at d3000000 (64-bit, non-prefetchable) [size=16K]
> Capabilities: [40] Power Management version 3
> Capabilities: [58] Vendor Specific Information <?>
> Capabilities: [48] Message Signalled Interrupts: Mask- 64bit+
> Queue=0/0 Enable-
> Capabilities: [d0] Express Endpoint, MSI 00
> Capabilities: [100] Advanced Error Reporting <?>
> Capabilities: [13c] Virtual Channel <?>
> Capabilities: [160] Device Serial Number 26-00-ed-ff-ff-82-00-00
> Capabilities: [16c] Power Budgeting <?>
> Kernel driver in use: wl
>
> Hope that will help solve the problem, because it is EXTREMELY
> annoying.
>
> > > It seems to me that blaming the driver is not particularly useful
> > unless you
> > > can provide a procedure or script to demonstrate the driver bugs.
> > In the
> > > meantime, that whole spectrum of drivers is out there and people
> > are trying
> > > to
> > > use them. And except for whatever NM's undocumented expectations,
> > those cards
> > > and drivers work fine. Since only NM causes problems, it's your
> > > responsibility
> > > to either help identify the problems so the driver writers can fix
> > them, or
> > > make NM work despite those problems. E.g., if you know that scans
> > return
> > > unreliable information, then *stop relying on the scan results*.
> > Clearly the
> > > driver can tell you if it's associated or not. Assuming that the
> > association
> > > is gone because the current AP doesn't show up in the current scan
> > list, when
> > > you know that scans can be incomplete, is stupid.
> > >
> > > Likewise, continual scanning seems to be counterproductive. The
> > impact it has
> > > on network throughput is significant:
> > >
> > > 64 bytes from 192.168.1.1: icmp_seq=7 ttl=64 time=1.57 ms
> > > 64 bytes from 192.168.1.1: icmp_seq=8 ttl=64 time=1.56 ms
> > > 64 bytes from 192.168.1.1: icmp_seq=9 ttl=64 time=4607 ms
> > > 64 bytes from 192.168.1.1: icmp_seq=10 ttl=64 time=3604 ms
> > > 64 bytes from 192.168.1.1: icmp_seq=11 ttl=64 time=2604 ms
> > > 64 bytes from 192.168.1.1: icmp_seq=12 ttl=64 time=1604 ms
> > > 64 bytes from 192.168.1.1: icmp_seq=13 ttl=64 time=604 ms
> > > 64 bytes from 192.168.1.1: icmp_seq=14 ttl=64 time=1.54 ms
> > > 64 bytes from 192.168.1.1: icmp_seq=15 ttl=64 time=1.54 ms
> > > 64 bytes from 192.168.1.1: icmp_seq=16 ttl=64 time=1.54 ms
> > >
> > > Obviously the usual ping time is ~1.5ms; "iwlist scan" slows that
> > down quite
> > > a
> > > lot. I'd rather wait 6 seconds *once* to find a new AP after I've
> > > legitimately
> > > lost an AP association, instead of waiting ~10 seconds every two
> > minutes
> > > rescanning for a list of APs that simply don't matter.
> > >
> > > At this point I've spent as much time as I can afford on it, and
> > my patched
> > > NM
> > > works for me.
> > >
> >
> > _______________________________________________
> > NetworkManager-list mailing list
> > NetworkManager-list@xxxxxxxxx
> > http://mail.gnome.org/mailman/listinfo/networkmanager-list
> Regards,
> Anastas
> _______________________________________________
> networkmanager-list mailing list
> networkmanager-list gnome org
> http://mail.gnome.org/mailman/listinfo/networkmanager-list
[
Date Prev][
Date Next] [
Thread Prev][
Thread Next]
[
Thread Index]
[
Date Index]
[
Author Index]