On Tue, 2009-04-28 at 09:42 -0700, Howard
Chu wrote:
> Dan Williams wrote:
> > On Fri, 2009-04-24 at 15:18 -0700, Howard Chu wrote:
> >> Dan Williams wrote:
> >>> On Mon, 2009-04-20 at 15:37 -0700, Howard Chu
wrote:
> >>>> Howard Chu wrote:
> >>>>> This is probably more related to the
ath9k driver, but I wanted to
> >>>>> start here
> >>>>> in case anyone is familiar with it.
I've been seeing this for the past
> >>>>> couple
> >>>>> months, and I just now rebuilt NM
fresh from git and it's still
> >>>>> happening:
> >>>>
> >>>> I seem to have ruled out the driver;
doing a kill -9 on NetworkManager
> >>>> so it
> >>>> doesn't have the opportunity to tear down
the connection on exit, shows
> >>>> that
> >>>> the wifi connection works perfectly once
NetworkManager is gone. No
> >>>> disassociation messages in dmesg, no
pauses in ssh sessions, etc.
> >>>
> >>> Don't rule out the driver. Does the driver
always return the currently
> >>> associated AP in the scan list? If not, you
might hit this. And the
> >>> driver is being stupid, because of *course*
the AP you're currently
> >>> connected to should always be in the scan
list, unless you're no longer
> >>> connected to it.
> >>>
> >>> The code in NM grabs the SSID& BSSID on a
6 second timer, and tries to
> >>> find that AP in the scan list. If it can't
find it (because the driver
> >>> didn't return that AP in the scan list) then
it reports none.
> >>>
> >>> If that's your problem, it's a driver
problem.
> >>
> >> How would I check this? Should it be obvious from
"iwlist scan" ? That
> >> returns
> >> the current AP along with the other visible ones.
> >>
> >> Also, reviewing the comments in bug 291760, this
problem is not just
> >> isolated
> >> to the ath9k driver. It's also being reported for
ath5k, wl, iwl3945,
> >> ipw2100,
> >> rtl8187, and b43, across multiple kernel and
driver revisions. As such it
> >> seems unlikely to be the drivers' fault.
> >
> > Depends; it might show up in that scan, or it might
not. If you can
> > reliably get it to show up every time, that's great.
But until 2.6.30,
> > mac80211-based drivers would not always return the
current AP. And some
> > of the older drivers don't either, though fullmac
drivers are more
> > likely to be OK there.
>
> If you already know for a fact that certain drivers are
incompatible with NM,
> it seems you should be documenting that in your release
notes or something.
> Or, you should be maintaining a list of tested
known-to-work drivers.
>
> > There is one window where NM wouldn't be able to find
the AP; if NM just
> > did a scan, and then the card reassociates to a
different AP that's not
> > in the scan list, and doesn't send an GIWSCAN event
so that the AP list
> > gets pulled (ipw2x00 do this, other drivers might
not), then when NM
> > goes to pull the BSSID off the card, the scan list
doesn't contain the
> > current AP. What NM should be doing here is to
request that the
> > supplicant grab the scan list again when it sees a
BSSID it doesn't know
> > about, but that's somewhat complicated.
>
> There must be more cases than this, because there are no
other APs for my
> card
> to associate to. (They're all secured with WEP or WPA, and
I only have
> credentials for mine.) The only reason I ever see the card
reassociate at all
> is due to NM's scanning; with that patched out it just
stays associated.
>
> > If the driver doesn't return the frequency of the
BSSID in the scan
> > results, or that frequency doesn't match what the
card reports from
> > SIOCGIWFREQ, then NM also can come up with (none).
Check that the
> > information from an 'iwlist scan' for that BSSID
matches what 'iwconfig'
> > reports when associated to that specific AP.
> >
> > So in conclusion there are actual driver bugs; (a)
not all drivers scan
> > results contain the currently associated AP in every
scan, and (b) not
> > all drivers emit scan results events when they
associate to a new AP
> > that's not already in the scan list, and finally (c)
some drivers are
> > just busted and return wrong channel information.
>
> Pretty sure (c) is not the case here, the info from iwlist
scan and iwconfig
> all matches. (b) won't happen in my current environment,
so I can't test one
> way or another. (a) doesn't appear to happen when I look,
but I have no idea
> how many scans are needed before the symptom occurs.
You can run NM with:
NM_ACTIVE_AP_DEBUG=1 /usr/sbin/NetworkManager --no-daemon
and get a lot more info about what's happening here and why the
active
AP isn't found in the scan list. Any chance you can do that and
post
the results?
Dan
Since I am having the same problem, with the wl driver, I did the
test and found the following in the log:
NetworkManager: <debug> [1332269638.001163] get_active_ap():
(eth1) BSSID: 00:14:bf:09:0b:86
NetworkManager: <debug> [1332269638.001288] get_active_ap():
(eth1) SSID: 'Homenet'
NetworkManager: <debug> [1332269638.001325] get_active_ap():
Pass #1
NetworkManager: <debug> [1332269638.001355]
get_active_ap(): AP: 'Homenet' 00:14:bf:09:0b:86
NetworkManager: <debug> [1332269638.001424]
get_active_ap(): matched
NetworkManager: <debug> [1332269644.002103] get_active_ap():
(eth1) BSSID: 00:14:bf:09:0b:86
NetworkManager: <debug> [1332269644.002238] get_active_ap():
(eth1) SSID: 'Homenet'
NetworkManager: <debug> [1332269644.002301] get_active_ap():
Pass #1
NetworkManager: <debug> [1332269644.002330]
get_active_ap(): AP: 'Homenet' 00:14:bf:09:0b:86
NetworkManager: <debug> [1332269644.002401]
get_active_ap(): matched
NetworkManager: <debug> [1332269650.002517] get_active_ap():
(eth1) BSSID: 00:14:bf:09:0b:86
NetworkManager: <debug> [1332269650.002650] get_active_ap():
(eth1) SSID: 'Homenet'
NetworkManager: <debug> [1332269650.002686] get_active_ap():
Pass #1
NetworkManager: <debug> [1332269650.002716]
get_active_ap(): AP: 'Homenet' 00:14:bf:09:0b:86
NetworkManager: <debug> [1332269650.002789]
get_active_ap(): frequency mismatch (device 0, ap 2427)
NetworkManager: <debug> [1332269650.002820]
get_active_ap(): AP: (none) 38:22:9d:9c:b7:6f
NetworkManager: <debug> [1332269650.002848]
get_active_ap(): BSSID mismatch
...
NetworkManager: <debug> [1332269650.005566]
get_active_ap(): AP: 'home_net' 38:22:9d:9c:b7:6e
NetworkManager: <debug> [1332269650.005593]
get_active_ap(): BSSID mismatch
NetworkManager: <debug> [1332269650.005616] get_active_ap():
No matching AP found.
NetworkManager: <debug> [1332269650.005646] periodic_update():
Roamed from BSSID 00:14:BF:09:0B:86 (Homenet) to (none) ((none))
NetworkManager: <debug> [1332269656.002429] get_active_ap():
(eth1) BSSID: 00:14:bf:09:0b:86
NetworkManager: <debug> [1332269656.002855] get_active_ap():
(eth1) SSID: 'Homenet'
NetworkManager: <debug> [1332269656.002903] get_active_ap():
Pass #1
NetworkManager: <debug> [1332269656.002964]
get_active_ap(): AP: 'Homenet' 00:14:bf:09:0b:86
NetworkManager: <debug> [1332269656.003045]
get_active_ap(): matched
NetworkManager: <debug> [1332269656.003102] periodic_update():
Roamed from BSSID (none) ((none)) to 00:14:BF:09:0B:86 (Homenet)
I cannot understand where that "frequency mismatch (device 0, ap
2427)" comes from. My AP is a WRT54GL (802.11b/g) and my laptop
device is the following:
07:00.0 Network controller: Broadcom Corporation Device 4353 (rev
01)
Subsystem: Hewlett-Packard Company Device 1510
Flags: bus master, fast devsel, latency 0, IRQ 17
Memory at d3000000 (64-bit, non-prefetchable) [size=16K]
Capabilities: [40] Power Management version 3
Capabilities: [58] Vendor Specific Information <?>
Capabilities: [48] Message Signalled Interrupts: Mask- 64bit+
Queue=0/0 Enable-
Capabilities: [d0] Express Endpoint, MSI 00
Capabilities: [100] Advanced Error Reporting <?>
Capabilities: [13c] Virtual Channel <?>
Capabilities: [160] Device Serial Number 26-00-ed-ff-ff-82-00-00
Capabilities: [16c] Power Budgeting <?>
Kernel driver in use: wl
Hope that will help solve the problem, because it is EXTREMELY
annoying.
> It seems to me that blaming the driver is not particularly
useful unless you
> can provide a procedure or script to demonstrate the
driver bugs. In the
> meantime, that whole spectrum of drivers is out there and
people are trying
> to
> use them. And except for whatever NM's undocumented
expectations, those cards
> and drivers work fine. Since only NM causes problems, it's
your
> responsibility
> to either help identify the problems so the driver writers
can fix them, or
> make NM work despite those problems. E.g., if you know
that scans return
> unreliable information, then *stop relying on the scan
results*. Clearly the
> driver can tell you if it's associated or not. Assuming
that the association
> is gone because the current AP doesn't show up in the
current scan list, when
> you know that scans can be incomplete, is stupid.
>
> Likewise, continual scanning seems to be
counterproductive. The impact it has
> on network throughput is significant:
>
> 64 bytes from 192.168.1.1: icmp_seq=7 ttl=64 time=1.57 ms
> 64 bytes from 192.168.1.1: icmp_seq=8 ttl=64 time=1.56 ms
> 64 bytes from 192.168.1.1: icmp_seq=9 ttl=64 time=4607 ms
> 64 bytes from 192.168.1.1: icmp_seq=10 ttl=64 time=3604 ms
> 64 bytes from 192.168.1.1: icmp_seq=11 ttl=64 time=2604 ms
> 64 bytes from 192.168.1.1: icmp_seq=12 ttl=64 time=1604 ms
> 64 bytes from 192.168.1.1: icmp_seq=13 ttl=64 time=604 ms
> 64 bytes from 192.168.1.1: icmp_seq=14 ttl=64 time=1.54 ms
> 64 bytes from 192.168.1.1: icmp_seq=15 ttl=64 time=1.54 ms
> 64 bytes from 192.168.1.1: icmp_seq=16 ttl=64 time=1.54 ms
>
> Obviously the usual ping time is ~1.5ms; "iwlist scan"
slows that down quite
> a
> lot. I'd rather wait 6 seconds *once* to find a new AP
after I've
> legitimately
> lost an AP association, instead of waiting ~10 seconds
every two minutes
> rescanning for a list of APs that simply don't matter.
>
> At this point I've spent as much time as I can afford on
it, and my patched
> NM
> works for me.
>
_______________________________________________
NetworkManager-list mailing list
NetworkManager-list@xxxxxxxxx
http://mail.gnome.org/mailman/listinfo/networkmanager-list
Regards,
Anastas
|