Re: What NetworkManager does to VPN devices?



On Tue, 2017-04-11 at 09:17 +1000, Dan Fruehauf wrote:
G'day,

I hope this post won't be too long. I've spent a few hours trying
to narrow down the problem so I can provide as much information as
possible without wasting anyone's time.

I started to debug why NetworkManager-ssh (which I maintain) does not
allow traffic through its interfaces (usually tun interfaces) and went
down a deep rabbit hole, which I'm not entirely sure has much to do
with NetworkManager-ssh at the moment.

What I'm trying to do:
 * Set up a "poor man's VPN", aka an SSH VPN, to a remote host (on AWS)
 * On my machine I should eventually have a tun device with
   172.16.40.2
 * On the server machine, a tun device with 172.16.40.1
 * Those two internal addresses should be reachable from one another

The steps that are usually necessary are:
 1. SSH to the remote machine (with tunnel creation parameters) and
    run an ifconfig command to configure the tun device
 2. Configure the tun device on the client host with ifconfig
    (ifconfig tun0 ...)
 3. Replace the default routes, etc.

So when NetworkManager-ssh does what it does, the end result is what
I expect, except that things don't work. Traffic can reach
172.16.40.1, but nothing can reach 172.16.40.2. Before you point to a
problem with the server routing tables, please read on.

I decided to just run the SSH command from the command line,
eliminating everything that NetworkManager does to the VPN
connection. What happens here is very unclear to me, but someone
might be able to explain it quickly. When running steps 1 & 2
quickly one after the other, the VPN is set up properly and
172.16.40.2 is reachable from the server side. When I say quickly, I
mean with no delay in between; see the log here
(https://paste.fedoraproject.org/paste/QgnyCz7hEuvuLiAiOhtTfl5M1UNdIGYhyRLivL9gydE=).
Then I can proceed to replacing the default route and so on. The VPN
works.

On the other hand, if I introduce a delay of, say, 5 seconds, it gives
NetworkManager enough time to do something I don't understand to the
tun0 device, which then renders it unreachable; see the log here
(https://paste.fedoraproject.org/paste/0~lxbpwNuIxGUvDwjokW6F5M1UNdIGYhyRLivL9gydE=).
I spaced the log out a bit where the 5-second gap appears.
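
For readers following along, the quick and the delayed variants of
steps 1 and 2 can be sketched roughly as below. This is only an
illustration, not the commands from the pasted logs: the host name,
the tun unit numbers and the ifconfig arguments are assumptions, and
only the two addresses come from the post. The script prints the
commands rather than running them, since the real ones need root on
both ends and tunnelling enabled on the server.

```shell
#!/bin/sh
# Print (do not run) the commands for steps 1 and 2, with an optional
# delay in seconds between them.
# ASSUMPTIONS: REMOTE is a hypothetical host, tun unit 0 is used on both
# sides, and the ifconfig arguments are illustrative. Only the two
# 172.16.40.x addresses are taken from the post.
REMOTE="${REMOTE:-root@vpn.example.com}"
LOCAL_IP=172.16.40.2
REMOTE_IP=172.16.40.1

plan_vpn() {
    delay="${1:-0}"
    # Step 1: ssh -w requests tun forwarding (local unit 0, remote unit 0);
    # the remote command configures the server-side device.
    echo "ssh -w 0:0 $REMOTE 'ifconfig tun0 $REMOTE_IP pointopoint $LOCAL_IP up' &"
    # Optional pause between the two steps (the "delayed" variant).
    if [ "$delay" -gt 0 ]; then
        echo "sleep $delay"
    fi
    # Step 2: configure the client-side device.
    echo "ifconfig tun0 $LOCAL_IP pointopoint $REMOTE_IP up"
}

plan_vpn 0   # quick variant
plan_vpn 5   # delayed variant
```

The only difference between the two plans is the sleep line, which is
the window in which NetworkManager gets to react to the new device.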

A few other things to note:
 * I'm running Fedora 24 (with kernel 4.9.9-100.fc24.x86_64 and
   NetworkManager 1.2.6-1)
 * SELinux is disabled (both on server and client)
 * firewalld is disabled
 * When sniffing traffic, all traffic reaches the server
   (172.16.40.1)
 * Traffic comes back on the SSH tunnel, but never reaches
   172.16.40.2 (verified both with strace and by sniffing)
 * Routing tables after the VPN is up are identical in both cases (ip
route show table main), except for metric, but I also tried to modify
it to be the same and it didn't help either

So what I'm looking for here is to understand *what* NetworkManager
does to the tun0 interface that renders it unreachable. And why do
things actually work if I configure the interface quickly, before
NetworkManager gets to do its thing?
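
One way to narrow this down would be to snapshot the device state in
both cases and diff the results. The sketch below is an assumption
about what is worth capturing, not a known answer: the function names
are made up for illustration, and it looks beyond the main routing
table (all tables and the policy rules) since the main table was
already compared.

```shell
#!/bin/sh
# List the commands used to capture the state of one interface.
# The selection of commands is an assumption about what may differ.
snapshot_cmds() {
    dev="$1"
    cat <<EOF
ip -d link show dev $dev
ip addr show dev $dev
ip route show table all
ip rule show
EOF
}

# Run each command and print its output under a header line, so two
# snapshots can be compared with diff.
snapshot() {
    snapshot_cmds "$1" | while read -r cmd; do
        echo "### $cmd"
        $cmd </dev/null 2>&1
    done
}
```

For example, `snapshot tun0 > quick.txt` in the working case,
`snapshot tun0 > delayed.txt` in the failing one, and then
`diff -u quick.txt delayed.txt`.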

My end goal is to fix NetworkManager-ssh to do what it should do,
however at the moment I'm puzzled as to why this simple scenario
doesn't work even outside of NetworkManager-ssh. Any help is more
than appreciated.


Hi Dan,


NM notices that a device tun0 appears and "assumes" the connection.
That process should not modify the interface, in order not to interfere
with whoever created and manages the device. It seems that doesn't work
well here.


It's not clear to me what NM does to interfere with the tun device. But
it would be interesting to see whether setting tun0 as unmanaged
avoids the problem. For example, add to NetworkManager.conf:
  [keyfile]
  unmanaged-devices=interface-name:tun0
then run `killall -SIGHUP NetworkManager`, reactivate the SSH VPN, and
check that tun0 shows as unmanaged in `nmcli device`.
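
The last check can also be scripted. This is a minimal sketch, assuming
nmcli's terse output mode with the DEVICE and STATE fields; the helper
name is made up for illustration.

```shell
#!/bin/sh
# Print the NetworkManager state of one device. Reads terse
# "DEVICE:STATE" lines, as produced by
#   nmcli -t -f DEVICE,STATE device
# from standard input; $1 is the interface name to look up.
device_state() {
    awk -F: -v dev="$1" '$1 == dev { print $2 }'
}
```

Usage: `nmcli -t -f DEVICE,STATE device | device_state tun0`. If the
setting took effect, this should print `unmanaged`.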


In the upcoming 1.8 release, NM was changed to improve the situation
here (https://bugzilla.gnome.org/show_bug.cgi?id=746440).
It would also be interesting to see how 1.8 behaves in this case.
It's actually very simple to build an RPM of upstream NM for Fedora, see
  https://wiki.gnome.org/Projects/NetworkManager/Hacking


best,
Thomas
