Ok, another patch attached. Same as last patch, but it comments out a bunch of the debug logging. It should point us in the right direction at least.

Dan

-----Original Message-----
From: Dan Williams [mailto:dcbw redhat com]
To: Matthew Starr; networkmanager-list gnome org
Subject: Re: Network Manager 1.0.X Wi-Fi Autoconnect Issues

On Thu, 2017-05-18 at 15:54 +0000, Matthew Starr wrote:

I have tried using NetworkManager 1.0.0 and 1.0.12 on an embedded device built with buildroot that has Ethernet (eth0), Wi-Fi client (mlan0), Wi-Fi Access Point (uap0), and Cellular interfaces (ttyACM0 and ppp0). The Wi-Fi AP (uap0) interface is ignored by Network Manager based on my NetworkManager.conf file.

I am able to boot the device and Network Manager will automatically configure and connect with Ethernet, Wi-Fi Client, and Cellular interfaces every time. If I move out of range of the Wi-Fi access point the device will disconnect, and if I move back into range in under an hour, NetworkManager will reestablish the connection. If I wait multiple hours before moving back into range of the Wi-Fi access point, Network Manager will not reestablish a connection automatically with the access point (I waited hours with the AP within range and visible in Wi-Fi scan results).

When Network Manager is not automatically reestablishing a connection to the access point, I can use nmcli to bring up the profile associated with the access point and it connects immediately.

Why is Network Manager not able to auto connect to a Wi-Fi AP after a longer period of time of not seeing the AP? Is there a timeout within Network Manager? Is this a bug?

On Thursday, May 18, 2017 1:31 PM, Dan Williams wrote:

Like you say, it does look like NM is trying to auto-activate the connection, but it's not doing it correctly. The most likely thing happening is that it does try to activate, but it's not able to find the "best" connection for the device. Somehow the existing WiFi connection profile isn't matching.

Can you run 'nmcli con show <name of connection you expect to start>'?

On Thu, 2017-05-18 at 18:43 +0000, Matthew Starr wrote:

Dan,

This issue has occurred on several different access points I have attempted to connect to, all from different vendors (Linksys, Ubiquiti, D-Link).

On Thursday, May 18, 2017 2:24 PM, Dan Williams wrote:

Ok, that doesn't elucidate anything. Are you able to apply a debugging patch to NetworkManager and rebuild it? Alternatively, you could use 'gdb' to step through the code and see where it's not proceeding with the activation in nm-policy.c.

Dan

On Thu, 2017-05-18 at 20:23 +0000, Matthew Starr wrote:

Some additional testing I just finished shows that version 1.6.2 exhibits the exact same behavior.

I am able to apply patches easily and rebuild. I could run gdb but it is not quite as easy on my current setup.

On Thursday, May 18, 2017 4:55 PM, Dan Williams wrote:

Which version do you prefer patches for?

Dan

On Thu, 2017-05-18 at 22:25 +0000, Matthew Starr wrote:

My more immediate need is with the 1.0.12 version, but I plan to do a release within the next 6 months with the 1.6.X or 1.8.X version.

On Wednesday, May 24, 2017 12:48 PM, Dan Williams wrote:

Patch against upstream 1.0.12 attached. Hopefully applies to your version. It should log the right stuff without touching the logging level or domains. Run with this patch, reproduce the issue, and let's see what the logs say.

Dan

On Wed, 2017-05-24 at 18:22 +0000, Matthew Starr wrote:

I was able to cleanly apply the patch, but shortly after starting Network Manager it seems to segfault (verified by using the -n option to not run as a daemon). I don't see any error messages in the logs. Here is where the logs left off:

On Wednesday, May 24, 2017 3:26 PM, Dan Williams wrote:

You've probably got some hidden-SSID APs, and I didn't account for that. Can you back out the previous patch, and try the latest attached one?

Dan

On Thu, 2017-05-25 at 13:06 +0000, Matthew Starr wrote:

With the new patch I was able to establish a connection, put the module running Network Manager in a Faraday cage for an hour, and then on removing it from the cage Network Manager connected successfully. This usually is not the case after an hour of not seeing the APs.

At this point I wanted to test again for a longer period, so I put the module back in the cage for an overnight test, and it appears that as soon as the module was isolated from all the APs, Network Manager crashed again. See the attached log for what was going on when it crashed at May 24 22:00:24.

I will try my setup again after a reboot to get Network Manager running again. Let me know if there is another patch you want me to apply to resolve the crashing issue.

On Thursday, May 25, 2017 12:49 PM, Dan Williams wrote:

Again my fault. Any place you see:

    g_free (tmp);

in the patch, replace that with:

    if (ssid)
        g_free (tmp);

Or back out the previous patch, and apply the attached one.

Dan

On Thu, 2017-05-25 at 22:00 +0000, Matthew Starr wrote:

It appears I cannot reproduce the issue with the patch you provided that includes the debug statements. The Wi-Fi reconnects within 1-2 minutes or less every time. I wonder if this is a timing issue, with the debug statements adding just enough delay to keep the issue from occurring.

Before the patch I could reproduce it almost every time on multiple devices running the same software after an hour of no Wi-Fi signal. The only change on the devices is the updated Network Manager with your patch.

I will continue to test after Memorial Day. Is there anything else you want me to check with the unpatched version, or any other patches you want me to try out?

On Thursday, May 25, 2017 11:25 PM, Dan Williams wrote:

Could you rebuild without the patch and test that version? E.g., to determine whether it's the debug patch making it work, or whether for some reason the rebuild is doing it.

I can also start removing log statements to reduce any potential timing issue.

Dan

On Tue, 2017-05-30 at 16:19 +0000, Matthew Starr wrote:

I rebuilt Network Manager without the debug patches, and after 1 hour of no Wi-Fi signal, Network Manager is not attempting to reconnect when the Wi-Fi AP is visible again (verified using "iw dev mlan0 scan").

If you have any other patches you want to try with log statements removed, I would be happy to test them.

-Matt
Attachment:
avail-debug4.patch
Description: Text Data