Re: NetworkManager behavior answers not found in docs



On Tue, 2018-10-16 at 14:40 +0200, Thomas HUMMEL wrote:


Hi,



I'm considering migrating from a CentOS 6 HPC cluster to a CentOS 7
one. For this purpose I've read quite a lot of doc (man, guides, ...)
about systemctl and NetworkManager to be fluent in their use.

Regarding NetworkManager, I've been at first confused by the induced
complexity or number of possible mixed old way/native use cases
brought by the "network-scripts" compatibility layer provided by the
ifcfg-rh plugin.

The compatibility layer doesn't need to concern you much. With it,

  - connection profiles are persisted as ifcfg-rh files
  - `ifup`/`ifdown` delegate to calling `nmcli connection up`

but there is no reason to care about that, and you can ignore
initscripts entirely. For example, the ifcfg-rh file format (see
`man nm-settings-ifcfg-rh`) would only concern you if you plan to edit
these files manually, which is more cumbersome than just using nmcli.


As a matter of fact, I experimented a lot to be sure of who did what
and how or when between network-scripts and NetworkManager.

Digging deeper, things got clearer. Still I'm not really sure about
the points described below.

Notes :

- I'm running NetworkManager-1.10.2-16.el7_5.x86_64 on a CentOS Linux
  release 7.5.1804 (Core)
- I'm using default settings, so this must be ifcfg-rh plugin first,
  then keyfile otherwise
- I'm talking only about ethernet connections/devices here


1. /run/NetworkManager/devices directory


I know NetworkManager exports connections as
/org/freedesktop/NetworkManager/Settings/<num> D-Bus objects.

What exactly is under the /run/NetworkManager/devices/<num> path ? The
UUIDs inside those files don't always match what nmcli shows me ?

This directory is the internal per-device state. It should not be
necessary to be concerned with it (also: it's neither stable nor public
API!). That said:

 - the number is the "ifindex" of the device, as known to the kernel and
visible in `ip link` output. This is the only real identifier for a
networking device in Linux, as the name and MAC address may change.
That is why we usually match a profile against a device via attributes
like "connection.interface-name" or "ethernet.mac-address". This
ifindex is not meaningful after reboot, and neither are the files under
/run (on RHEL/Fedora/CentOS this is a tmpfs mount and lost after reboot).

 - when NetworkManager is stopped, it writes some state there for each
device, in particular so that `systemctl restart NetworkManager` works
better, with as few changes as possible. One usually wouldn't ever
restart the NetworkManager daemon (unless there are good reasons, like a
package update), but when doing so, the aim is that connectivity is not
affected. Hence the state directory.

 - NetworkManager may also behave slightly differently depending on
whether it is started for the first time (for the current boot) or after
a `systemctl restart NetworkManager`. You see that in the logfile with

  NetworkManager (version ...) is starting... (after a restart)

and this is also determined based on the presence of this "devices" directory.
If you stop NetworkManager and delete the directory, NetworkManager will think
it is starting for the first time after boot. Again, the difference shouldn't
matter and you shouldn't rely on this.
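
To relate a number under /run/NetworkManager/devices to an interface
name, the kernel's own view in sysfs can be consulted. The following is
only a sketch: the function name `ifname_for_ifindex` and the `SYSFS_NET`
override are hypothetical, introduced here for illustration, and nothing
in it depends on NetworkManager's internal state files.

```shell
# Hypothetical helper: print the name of the interface whose kernel
# ifindex equals $1. SYSFS_NET is an illustrative override (handy for
# trying this against a scratch directory); it defaults to the real sysfs.
ifname_for_ifindex() {
    want=$1
    for dir in "${SYSFS_NET:-/sys/class/net}"/*; do
        # Skip entries that do not expose an ifindex attribute.
        [ -r "$dir/ifindex" ] || continue
        if [ "$(cat "$dir/ifindex")" = "$want" ]; then
            basename "$dir"
            return 0
        fi
    done
    # No interface with that ifindex exists (any more).
    return 1
}
```

For example, `ifname_for_ifindex 2` prints the name of the device that
`ip link` lists with index 2, if there is one.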


2. device vs connection


My understanding is that there's a clear distinction between a device
and a connection. To be more precise, that an active connection is a
set of settings (potentially persisted on disk on a connection file via
some plugin) "applied" to a device. The applied part is not very clear
to me :

I find the name "profile" clearer. But we often say "connection" or
"connection profile". Sometimes also "settings connection" to
distinguish it from "active connection".

An "active connection" is the data associated with the fact that a
connection profile is currently activated on a device. There is the
"device" ("networking interface"), the profile, and when you activate a
profile on a device, you also have an "active connection". The active
connection is usually not directly visible in nmcli, except with
`nmcli -f ACTIVE-PATH connection show`. And of course, the current
activation state with `nmcli -f NAME,STATE connection show`.

What's the difference between connecting a device (nmcli device
connect) and activating a connection (nmcli connection up) ? Maybe my
confusion comes from the fact that adding a connection automatically
connects it to a device ?

There is little difference:

  - `nmcli connection up "$PROFILE"` will find a suitable device
automatically.
  - `nmcli device connect "$DEVICE"` will find a suitable connection
automatically. Note this may create a new profile (see later).
  - `nmcli connection up "$PROFILE" ifname "$DEVICE"` explicitly selects
both the profile and the device.


As the profile contains the necessary settings for what to do with the
device, you cannot ~configure~ a device without a profile.


With NetworkManager there is no API for "configure an IP address on a
device". You can configure an IP address in a profile and then activate
it. That is different from what `ip addr add` does, which only
configures the address ad-hoc in the kernel.

Actually, we also have the term "apply" to indicate that a profile is
currently configured on a device.

The difference is, if you do

  nmcli connection up "$PROFILE"
  nmcli connection modify "$PROFILE" +ipv4.addresses 192.168.7.5/24

then the changes do not take effect (on the device) immediately. While
"$PROFILE" was and is still active on a device, the applied
configuration of the device is an internal copy of when it was
activated last. The changes only take effect after re-activating or re-
applying the profile.

-- note, the two properties "connection.metered" and "connection.zone"
do take effect immediately.

Re-activating can be done via `nmcli connection up` (or similar) and
goes through a full re-activation cycle (and temporarily disconnects
you). A more graceful way is `nmcli device reapply "$DEVICE"`, which
takes the changes and configures them on the device.

`nmcli device reapply` may also be useful if there are no actual
changes in the profile. For example, it will re-start DHCP and restore
the IP address configuration (if it was modified, for example via
iproute).


You see, when a profile is activated on a device, the original settings
were internally copied and those are "applied". And "nmcli device
reapply" just updates the "applied" clone to be the current profile and
does the changes.

Earlier I said that everything is profile based and you cannot add an IP
address without modifying a profile. Well, you can modify the
(invisible) "applied" connection and make volatile changes:

  nmcli device modify $DEVICE +ipv4.addresses ...
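
Putting the persistent variant together, a modify-then-reapply sequence
might look like the sketch below. The function name
`add_address_gracefully` and the `NMCLI` override are hypothetical and
only there so the sequence can be dry-run (for example with
`NMCLI=echo`); the two nmcli subcommands are the ones discussed above.

```shell
# Hypothetical helper: add an address to a profile and apply it to the
# device without a full re-activation cycle.
# NMCLI is an illustrative override; by default the real nmcli is called.
add_address_gracefully() {
    profile=$1 device=$2 address=$3
    # Persist the address in the profile; the device is not touched yet.
    "${NMCLI:-nmcli}" connection modify "$profile" +ipv4.addresses "$address" || return 1
    # Update the "applied" copy on the device from the current profile.
    "${NMCLI:-nmcli}" device reapply "$device"
}
```

Calling `add_address_gracefully "$PROFILE" "$DEVICE" 192.168.7.5/24`
assumes "$PROFILE" is currently active on "$DEVICE".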


3. connection autocreation

in what circumstances exactly does NetworkManager autocreate
connections ?

a) What I found as well is that nmcli connection add...ifname=xxx
seems to auto-connect the xxx device to the named interface thus
voiding the need to connect the device.

"autocreate" and "autoconnect" are two entirely different things.


Usually, NetworkManager (the daemon) does not automatically create
connection profiles. Profiles are usually created by a NetworkManager
client (nmcli, a GUI, ...) or via files (ifcfg, keyfile).

In your example above, `nmcli connection add` is an explicit (client)
action by you. No profile gets autocreated here.


There are a few exceptions for when profiles get created:

1) for Bluetooth PAN and Wi-Fi with iwd, profiles are generated. They
represent the underlying bluez/iwd "profile".

2) see `main.no-auto-default` in NetworkManager.conf. These "auto-
default" connections only apply to ethernet and are usually named
"Wired Connection 1". When NM starts, and sees an ethernet device but
has no profile for it, it will create an in-memory profile and go ahead
autoactivating it. This profile can be modified (and persisted to disk)
or deleted. In both cases, the profile won't be recreated the next
time, because the MAC address of the device is blacklisted in
/var/lib/NetworkManager/no-auto-default.state. The purpose of this is
that you can boot a machine without configuration, and NetworkManager
will create some suitable profiles automatically.

I don't like this automatism. But on your machine, you commonly don't
have this, because you create a persistent, suitable profile, which
prevents another auto-default profile from being created.

3) if you do

  nmcli device disconnect eth0
  ip addr add 192.168.7.5/24 dev eth0

NM now creates an in-memory connection "eth0". This means that
NetworkManager *does not* manage this device. The device was externally
configured by somebody, and this generated profile only indicates that
something is active. Such a connection gets autocreated but (depending
on how you look at it), NetworkManager does not really activate it. It
appears active, but NetworkManager does no configuration of the device.
Unfortunately, this is quite confusing and needs improvement.

4) `nmcli device connect $DEVICE` or clicking on a Wi-Fi network in
nm-applet may create a new profile and activate it. But I wouldn't call
that "autocreated". The client (you) initiated the creation of the
profile and activated it right away.



Regarding "autoconnect", that is a property of the connection profile
itself. If connection.autoconnect is false, then NetworkManager will
not automatically activate the profile. Otherwise, whenever
NetworkManager has a device which is currently disconnected but could
be activated, it will search whether there are suitable profiles and
connect them.
Note, that autoconnect can also be blocked for internal reasons. For
example, the following will all temporarily block autoconnect:

  - per-device: nmcli device set $DEVICE autoconnect no
  - per-device: nmcli device disconnect $DEVICE
  - per-profile: no secrets are provided (e.g. cancel the password 
    prompt)
  - per-profile: activation failed connection.autoconnect-retries
    times.

If you add a new profile with "connection.autoconnect" set to yes (the
default in most clients), it may immediately become suitable to
autoactivate on a device. Likewise, if you modify a profile, this
usually unblocks the autoactivation right away and it may start
autoactivating.
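
To see which profiles are candidates for autoactivation at all, the
AUTOCONNECT column of `nmcli connection show` can be filtered. This is a
rough sketch: the function name `list_autoconnect_profiles` and the
`NMCLI` override are hypothetical, and it only reflects the profile
property, not the temporary internal blocks listed above.

```shell
# Hypothetical helper: print the names of all profiles whose
# connection.autoconnect property is "yes".
# NMCLI is an illustrative override; by default the real nmcli is called.
list_autoconnect_profiles() {
    # Terse output is "AUTOCONNECT:NAME", one profile per line. Asking for
    # AUTOCONNECT first keeps the match simple; note that nmcli escapes
    # any ':' inside the name with a backslash in terse mode.
    "${NMCLI:-nmcli}" -t -f AUTOCONNECT,NAME connection show |
        sed -n 's/^yes://p'
}
```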


Oh, btw, recently we added the "connection.multi-connect" setting.
Usually, a profile can only be active on one device at a time. But with
"connection.multi-connect" it can simultaneously be activated on
multiple devices. Likewise, when a profile with
"connection.multi-connect=single" (the default) is already active, it
will not autoactivate again. And issuing another `nmcli connection up`
will deactivate it first.


Is device connect needed only for special cases, like when we set the
NM_UNMANAGED udev property to 1 or do I miss something (refer to
question 2 above)

You don't need to mark profiles as autoconnect=yes. If you don't, you
must always activate them explicitly.

Also, while for ethernet one usually only has one profile per device,
it is very possible to have multiple profiles for a device (that's in
particular common with Wi-Fi). "Autoconnect" prefers profiles which
were active last, so, in the face of multiple profiles that all
autoconnect, you quite possibly still need to explicitly select the
right one.


NM_UNMANAGED is again something different. A device can be configured
as unmanaged in multiple ways:

 - nmcli device set $DEVICE managed no
 - udev rules
 - device.managed in NetworkManager.conf
 - keyfile.unmanaged-devices in NetworkManager.conf
 - NM_CONTROLLED=no in ifcfg file
 - some device types are marked as unmanaged by default (or always).
 - a software device (e.g. bridge), which is not created by 
   NetworkManager and is down or has no IP addresses, is marked as 
   unmanaged.

If a device is unmanaged, NetworkManager does nothing with it. It won't
autocreate a profile for it, and it won't autoactivate anything.

Depending on the reasons why it's unmanaged, you still can let
NetworkManager take over, either via

  nmcli device set $DEVICE managed yes

or simply `nmcli con up` or `nmcli device connect`.
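
As an illustration of the keyfile.unmanaged-devices variant from the
list above, a drop-in file could be generated as sketched below. The
function name `write_unmanaged_dropin`, the file name, and the default
directory are assumptions made for this example; NetworkManager would
also have to re-read its configuration before the setting takes effect.

```shell
# Hypothetical helper: write a NetworkManager.conf drop-in that marks one
# interface as unmanaged via keyfile.unmanaged-devices.
# The default directory and the "90-unmanaged-" file name are assumptions.
write_unmanaged_dropin() {
    iface=$1
    confdir=${2:-/etc/NetworkManager/conf.d}
    printf '[keyfile]\nunmanaged-devices=interface-name:%s\n' "$iface" \
        > "$confdir/90-unmanaged-$iface.conf"
}
```

For a change that should not persist, `nmcli device set $DEVICE managed no`
remains the simpler choice.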



Is there a distinction between autocreation vs autoactivation of a 
connection (if yes in what cases) ?

yes. See above.


b) auto create and connect when restarted

I'm confused about a case where a connection is autocreated and
activated at NetworkManager restart while not at boot (twisted case I
must admit) :

Let's take the following use case as an example :

- a host with 2 ethernet devices : eth0 and eth1

- eth0 has a network-script like connection file
(/etc/sysconfig/network-scripts/ifcfg-eth0) and is NetworkManager
managed (either by default because no "NM_CONTROLLED=no" is stated or
explicitly with "NM_CONTROLLED=yes")
- eth1 has no connection file (neither ifcfg-rh style nor keyfile
style, that is nothing for eth1 neither in
/etc/sysconfig/network-scripts nor in
/etc/NetworkManager/system-connections)

- no NM_UNMANAGED=1 in udev

test 1 :

-> after boot, as expected eth0 is configured and up, nmcli shows a 
connection for it but eth1 has no connection. It has to be manually 
connected for that matter. If done, NetworkManager seems to auto 
magically CREATE the connection

I guess, on CentOS you have a package NetworkManager-config-server,
which installs a file /usr/lib/NetworkManager/conf.d/00-server.conf
with:

  [main]
  no-auto-default=*

Hence, no auto-default profiles (case 2 above) are created. You need
to create a profile for eth1 manually. That is very fine. I don't like
this automatism of auto-default, but whatever suits you.
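
To check whether such a setting is present on a given machine, the
configuration files can simply be searched. This is a sketch under
assumptions: the function name `find_no_auto_default` is hypothetical,
and the default search locations are the usual ones but may differ per
distribution.

```shell
# Hypothetical helper: list configuration lines that set no-auto-default.
# With no arguments it searches the usual locations (assumed defaults);
# otherwise it searches the given files or directories.
find_no_auto_default() {
    [ "$#" -gt 0 ] || set -- /usr/lib/NetworkManager/conf.d \
        /etc/NetworkManager/NetworkManager.conf \
        /etc/NetworkManager/conf.d
    # -r: descend into directories, -s: ignore missing paths,
    # -H: always print the file name in front of each match.
    grep -rsH '^[[:space:]]*no-auto-default[[:space:]]*=' "$@"
}
```

The function returns a non-zero status when nothing matches, which would
suggest that auto-default profiles are not disabled by configuration.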

Yes, if you then do

  $ nmcli device connect eth1

a new profile gets created. See above 4).




-> where is that magic described ? How does a simple connect CREATES a
connection. Man states that connect looks for a matching EXISTING
connection ?

man nmcli:

       connect ifname
           Connect the device. NetworkManager will try to find a suitable
           connection that will be activated. It will also consider
           connections that are not set to auto connect.

ok. It's not described. That's a documentation bug.


test 2 under the same conditions :

- reboot of the host
- only one connection (eth0) as expected
- create the foobar connection like this

     nmcli connection add type ethernet con-name foobar ifname eth1

-> as expected an ifcfg-rh format connection file has been created in
/etc/sysconfig/network-scripts/ifcfg-foobar and nmcli shows 2
connections : eth0 and foobar

- manually delete foobar connection file :
  rm /etc/sysconfig/network-scripts/ifcfg-foobar

-> as expected nmcli still shows foobar connection

- reload connection files via nmcli connection reload

-> as expected nmcli only shows eth0 connection

So far, everything seems to work as expected BUT :

if instead of nmcli connection reload we RESTART (not reload, but
restart) NetworkManager via systemctl restart NetworkManager, ANOTHER
(different UUID and eth1 instead of foobar for the name) connection is
auto created and auto activated

Yes, that is case 3). NM (re)starts and sees that the device is already
configured. It has no configuration for that device, so it assumes
somebody else was configuring it and does not touch it. This eth1
device is no longer managed by NetworkManager. For example, it does no
DHCP and your addresses will time out.


I cannot explain why and how this connection is created on restart
since no connection is auto created at boot.
Granted manually removing the connection file is tricky but even
then...

It's fine to remove the file, but you should probably call
`nmcli connection reload` afterwards. Or be prepared to handle the
fall-out.

It's also fine to restart NetworkManager (but don't, you seldom need
to!!), but here too: be prepared to fix the desired state afterwards
(if you did something like this).
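
The remove-then-reload sequence could be wrapped as in the following
sketch. The function name `remove_profile_file` and the `NMCLI` override
are hypothetical, added only so the sequence can be tried without a
running daemon.

```shell
# Hypothetical helper: delete a profile file by hand and tell
# NetworkManager to re-read its connection files right away.
# NMCLI is an illustrative override; by default the real nmcli is called.
remove_profile_file() {
    file=$1
    rm -- "$file" || return 1
    # Without the reload, nmcli keeps showing the profile that was
    # loaded from the file that has just been removed.
    "${NMCLI:-nmcli}" connection reload
}
```

Alternatively, `nmcli connection delete "$PROFILE"` removes the profile
through NetworkManager itself, so no separate reload is needed.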



Can you help me figure out those points which would complete a bit 
further my understanding of NetworkManager ?

Thanks

--
Thomas H.


best,
Thomas
