SARA-R4 unreliable LTE-M connection problem



Hello everyone,

I need some help to further debug a mobile broadband modem connection problem.

We are using a mikroe LTE IoT Click board [1] with a u-blox SARA-R410M-02B
cellular modem (LTE-M, NB-IoT) to connect custom embedded hardware, based on
an ARM SoC, to the internet. The LTE module is attached through the serial
UART only.

The current software stack runs on a custom ptxdist-based board support
package (BSP) with Linux kernel 4.9.201, ModemManager 1.16.6, NetworkManager
1.30.4, and pppd 2.4.9. I have full control over the software and can apply
and test patches if needed.

The provider is Deutsche Telekom (DT). We are using special SIM cards on
their so-called Business Smart Connect plan.

The symptoms are as follows: after a reboot of the whole system,
NetworkManager connects successfully. On the Linux side everything looks
fine: there is a ppp0 device with the correct IPv4 address, the routes and
resolv.conf look fine, `mmcli -m 0` shows the modem as connected, and neither
`journalctl -u ModemManager` nor `journalctl -u NetworkManager` shows anything
suspicious. We can also see the modem as connected in the dashboard provided
by DT [2].

However, we can't receive any data. :-/

After roughly 10 minutes (a varying time between 9 and 11 minutes) the
connection is dropped (LCP terminated by peer), and this happens every single
time. Sometimes we can send and receive data after the automatic reconnect,
but the reconnect is not always successful. :-/

See the NetworkManager journal output of such a disconnect:

        Feb 01 00:10:17 unit pppd[230]: LCP terminated by peer
        Feb 01 00:10:17 unit pppd[230]: nm-ppp-plugin: status 8 / phase 'network'
        Feb 01 00:10:17 unit NetworkManager[230]: LCP terminated by peer
        Feb 01 00:10:17 unit pppd[230]: Connect time 9.8 minutes.
        Feb 01 00:10:17 unit pppd[230]: nm-ppp-plugin: status 5 / phase 'establish'
        Feb 01 00:10:17 unit NetworkManager[230]: Connect time 9.8 minutes.
        Feb 01 00:10:17 unit NetworkManager[230]: Sent 22602 bytes, received 0 
bytes.
        Feb 01 00:10:17 unit pppd[230]: Sent 22602 bytes, received 0 bytes.
        Feb 01 00:10:17 unit NetworkManager[130]: <info>  [1612138217.8665] device 
(ppp0): state change: disconnected -> unmanaged (reason 'connection-assumed', 
sys-iface-state: 'external')
        Feb 01 00:10:20 unit pppd[230]: nm-ppp-plugin: status 11 / phase 
'disconnect'
        Feb 01 00:10:20 unit NetworkManager[230]: Connection terminated.
        Feb 01 00:10:20 unit pppd[230]: Connection terminated.
        Feb 01 00:10:21 unit pppd[230]: Modem hangup
        Feb 01 00:10:21 unit pppd[230]: nm-ppp-plugin: status 1 / phase 'dead'
        Feb 01 00:10:21 unit NetworkManager[230]: Modem hangup
        Feb 01 00:10:21 unit NetworkManager[130]: <info>  [1612138221.8974] device 
(ttymxc4): state change: activated -> failed (reason 'ip-config-unavailable', 
sys-iface-state: 'managed')
        Feb 01 00:10:21 unit pppd[230]: Exit.
        Feb 01 00:10:21 unit pppd[230]: nm-ppp-plugin: cleaning up
        Feb 01 00:10:21 unit NetworkManager[130]: <error> [1612138221.9571] kill 
child process 'pppd' (230): failed due to unexpected return value -1 by 
waitpid (No child processes, 10) after sending SIGTERM (15)
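
To compare disconnects across several runs, the pppd lines above could be
condensed with a small script. This is only a minimal sketch, assuming the
journal format shown above; the function name `summarize_session` and the
regular expressions are my own and not part of pppd or NetworkManager.

```python
import re
from typing import Dict, Iterable, Optional, Union

# These patterns match only pppd lines in the format shown in the log above.
TERMINATED = re.compile(r"pppd\[\d+\]: LCP terminated by peer")
CONNECT_TIME = re.compile(r"pppd\[\d+\]: Connect time ([\d.]+) minutes\.")
BYTE_COUNT = re.compile(r"pppd\[\d+\]: Sent (\d+) bytes, received (\d+) bytes\.")

Summary = Dict[str, Union[bool, Optional[float], Optional[int]]]


def summarize_session(lines: Iterable[str]) -> Summary:
    """Collect connect time and byte counters from one pppd session."""
    summary: Summary = {
        "terminated_by_peer": False,
        "minutes": None,
        "sent": None,
        "received": None,
    }
    for line in lines:
        if TERMINATED.search(line):
            summary["terminated_by_peer"] = True
        match = CONNECT_TIME.search(line)
        if match:
            summary["minutes"] = float(match.group(1))
        match = BYTE_COUNT.search(line)
        if match:
            summary["sent"] = int(match.group(1))
            summary["received"] = int(match.group(2))
    return summary
```

Feeding it the journal lines of each session should show whether the received
counter stays at 0 every time and how much the connect time varies.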

I'm currently struggling to debug the whole thing. I see at least four
interacting components (kernel, ModemManager, NetworkManager, pppd), and I'm
not sure where to start, but I think NetworkManager is worth a try.

I get logs as shown above, but I could not get NetworkManager to increase its
log level. I tried to set it in /etc/NetworkManager/NetworkManager.conf like
this:

        root@unit:~ cat /etc/NetworkManager/NetworkManager.conf
        [main]
        plugins=ifupdown,keyfile
        rc-manager=file
        
        [ifupdown]
        managed=false
        
        [logging]
        domains="MB:DEBUG,PPP:DEBUG"

The connection itself is defined like this:

        root@unit:~ cat /etc/NetworkManager/system-connections/gsm-ttymxc4
        [connection]
        id=gsm-ttymxc4
        type=gsm
        interface-name=ttymxc4
        permissions=
        autoconnect=yes
        autoconnect-retries=0
        
        [gsm]
        apn=iot.telekom.net
        
        [ipv4]
        dns-search=
        method=auto
        
        [ipv6]
        addr-gen-mode=stable-privacy
        dns-search=
        method=auto

I'm a little puzzled by this log message:

        Feb 01 00:59:34 unit NetworkManager[342]: <warn>  [1612141174.8636] config: 
invalid logging configuration: Unknown log level 'DEBUG"'

Can certain log levels be disabled by meson options at build time?
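
Another possibility, judging from the stray quote in 'DEBUG"', is that the
double quotes in my `domains=` line are taken literally and end up as part of
the level name. The following is a rough sketch of that idea only: it uses
Python's configparser as a stand-in (NetworkManager's real parser may behave
differently), the helper names are made up, and the level list is the six
names I know of from NetworkManager.conf(5).

```python
import configparser
from typing import List

# Level names as I know them from NetworkManager.conf(5); the list may be
# incomplete for other NetworkManager versions.
KNOWN_LEVELS = {"OFF", "ERR", "WARN", "INFO", "DEBUG", "TRACE"}


def read_domains(conf_text: str) -> str:
    """Return the raw [logging] domains value; quotes are not stripped."""
    parser = configparser.ConfigParser()
    parser.read_string(conf_text)
    return parser["logging"]["domains"]


def unknown_levels(domains_value: str) -> List[str]:
    """List per-domain levels that are not in KNOWN_LEVELS."""
    bad = []
    for item in domains_value.split(","):
        _domain, sep, level = item.strip().partition(":")
        # Treating levels as case-insensitive is an assumption of this sketch.
        if sep and level.upper() not in KNOWN_LEVELS:
            bad.append(level)
    return bad
```

With the quoted line from my config this sketch flags `DEBUG"`, which matches
the warning; with `domains=MB:DEBUG,PPP:DEBUG` (no quotes) it flags nothing.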

Maybe someone has some hints on how to dig deeper?

Greets
Alex

[1] https://www.mikroe.com/lte-iot-click
[2] https://business-portal-smart-connect.telekom.de/




