On Tue, 2018-10-16 at 14:40 +0200, Thomas HUMMEL wrote:

Hi,
I'm considering migrating from a CentOS 6 HPC cluster to a CentOS 7 one. For this purpose I've read quite a lot of documentation (man pages, guides, ...) about systemctl and NetworkManager to become fluent in their use. Regarding NetworkManager, I was at first confused by the complexity, that is, the number of possible mixed old-way/native use cases, brought by the "network-scripts" compatibility layer provided by the ifcfg-rh plugin.
The compatibility layer doesn't need to concern you much. With this,

  - connection profiles are persisted as ifcfg-rh files
  - `ifup`/`ifdown` delegate to calling `nmcli connection up`

but there is no reason to even care about that and you can ignore initscripts entirely. For example, the ifcfg-rh file format (`man nm-settings-ifcfg-rh`) would only concern you if you plan to edit these files manually, which is more cumbersome than just using nmcli.
As a matter of fact, I experimented a lot to be sure of who did what and how or when between network-scripts and NetworkManager. Digging deeper, things got clearer. Still, I'm not really sure about the points described below.

Notes:

  - I'm running NetworkManager-1.10.2-16.el7_5.x86_64 on CentOS Linux release 7.5.1804 (Core)
  - I'm using default settings, so this must be the ifcfg-rh plugin first, then keyfile otherwise
  - I'm talking only about ethernet connections/devices here

1. /run/NetworkManager/devices directory

I know NetworkManager exports connections as /org/freedesktop/NetworkManager/Settings/<num> D-Bus objects. What exactly is under the /run/NetworkManager/devices/<num> path? The UUIDs inside those files don't always match what nmcli shows me.
This directory is the internal per-device state. You should not need to be concerned with it (also: it is neither stable nor public API!). That said:

  - The number is the "ifindex" of the device, as known to the kernel and visible in `ip link` output. This is the only real identifier for a networking device in Linux, as the name and MAC address may change. That is why we usually match a profile against a device via attributes like "connection.interface-name" or "ethernet.mac-address". The ifindex is not meaningful after reboot, and neither are any files under /run (on RHEL/Fedora/CentOS this is a tmpfs mount and lost after reboot).

  - When NetworkManager is stopped, it writes some state there for each device, in particular so that `systemctl restart NetworkManager` works better, with as few changes as possible. One usually wouldn't ever restart the NetworkManager daemon (unless there are good reasons, like a package update), but when doing so, the aim is that connectivity is not affected. Hence the state directory.

  - NetworkManager may also behave slightly differently depending on whether it is started for the first time (for the current boot) or after a `systemctl restart NetworkManager`. You see that in the logfile with

        NetworkManager (version ...) is starting... (after a restart)

    and this is also determined based on the presence of this "devices" directory. If you stop NetworkManager and delete the directory, NetworkManager will think it is starting for the first time after boot. Again, the difference shouldn't matter and you shouldn't rely on it.
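If you are curious which interface a given number under /run/NetworkManager/devices belongs to, you can look it up by hand. This is only a minimal sketch, assuming the usual Linux sysfs layout where /sys/class/net/<name>/ifindex holds the kernel ifindex. The function name `ifname_for_ifindex` and its optional second argument (an alternative sysfs directory, handy for trying it out) are made up for illustration and are not part of NetworkManager:

```shell
# Print the interface name that currently owns a given kernel ifindex.
# $1: ifindex (the number used as file name under /run/NetworkManager/devices)
# $2: sysfs network class directory (assumption: defaults to /sys/class/net)
# The body runs in a subshell, so "exit" only ends the function.
ifname_for_ifindex() (
    want=$1
    root=${2:-/sys/class/net}
    for dir in "$root"/*; do
        # Skip entries that have no readable ifindex attribute.
        [ -r "$dir/ifindex" ] || continue
        if [ "$(cat "$dir/ifindex")" = "$want" ]; then
            basename "$dir"
            exit 0
        fi
    done
    # No device with that ifindex exists right now.
    exit 1
)
```

For example, `ifname_for_ifindex 2` should print the same name that `ip link` lists for index 2; a non-zero exit status means no such device exists at the moment.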
2. device vs connection

My understanding is that there's a clear distinction between a device and a connection. To be more precise, that an active connection is a set of settings (potentially persisted on disk in a connection file via some plugin) "applied" to a device. The "applied" part is not very clear to me:
I find the name "profile" clearer. But we often say "connection" or "connection profile", and sometimes "settings connection" to distinguish it from "active connection". An "active connection" is the data associated with the fact that a connection profile is currently activated on a device. So there is the "device" ("networking interface"), the profile, and, when you activate a profile on a device, also an "active connection". The active connection is usually not directly visible in nmcli, except with `nmcli -f ACTIVE-PATH connection show`, and of course the current activation state with `nmcli -f NAME,STATE connection show`.
What's the difference between connecting a device (nmcli device connect) and activating a connection (nmcli connection up) ? Maybe my confusion comes from the fact that adding a connection automatically connects it to a device ?
There is little difference:

  - `nmcli connection up "$PROFILE"` will find a suitable device automatically.
  - `nmcli device connect "$DEVICE"` will find a suitable connection automatically. Note this may create a new profile (see later).
  - `nmcli connection up "$PROFILE" ifname "$DEVICE"` explicitly selects both the profile and the device.

As the profile contains the necessary settings for what to do with the device, you cannot ~configure~ a device without a profile. With NetworkManager there is no API for "configure an IP address on a device". You can add an IP address to a profile and then activate it. That is different from what `ip addr add` does, which only configures the address ad hoc in the kernel.

Actually, we also have the term "apply" to indicate that a profile is currently configured on a device. The difference is, if you do

    nmcli connection up "$PROFILE"
    nmcli connection modify "$PROFILE" +ipv4.addresses 192.168.7.5/24

then the changes do not take effect (on the device) immediately. While "$PROFILE" was and still is active on a device, the applied configuration of the device is an internal copy made when the profile was last activated. The changes only take effect after re-activating or re-applying the profile. (Note: the two properties "connection.metered" and "connection.zone" do take effect immediately.)

Re-activating can be done via `nmcli connection up` (or similar) and goes through a full re-activation cycle (and temporarily disconnects you). A more graceful way is `nmcli device reapply "$DEVICE"`, which takes the changes and configures them on the device.

`nmcli device reapply` may also be useful if there are no actual changes in the device. For example, it will re-start DHCP and restore the IP address configuration (if it was modified, for example via iproute).

You see, when a profile is activated on a device, the original settings were internally copied and those are "applied".
And `nmcli device reapply` just updates the "applied" clone to be the current profile and makes the changes.

Before, I said everything is profile based and you cannot add an IP address without modifying a profile. Well, you can modify the (invisible) "applied" connection and make volatile changes:

    nmcli device modify $DEVICE +ipv4.addresses ...
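To make the "modify the profile, then reapply" sequence concrete, here is a small sketch. It only uses the two nmcli commands named above; the wrapper function `add_address_and_reapply` and the `NMCLI` override variable are hypothetical helpers of mine, added so the sequence can be tried without touching a real device:

```shell
# Add an address to a profile and push the change to the device without
# a full re-activation cycle.
# $1: profile name, $2: device name, $3: address in CIDR notation
# NMCLI can be overridden (e.g. NMCLI="echo nmcli") to only print the commands.
add_address_and_reapply() {
    # Edits the profile only; the applied configuration on the device
    # is still the copy made at activation time.
    ${NMCLI:-nmcli} connection modify "$1" +ipv4.addresses "$3" || return
    # Takes the current profile as the new applied configuration and
    # configures the device accordingly.
    ${NMCLI:-nmcli} device reapply "$2"
}
```

Running it as `NMCLI="echo nmcli" add_address_and_reapply foobar eth1 192.168.7.5/24` just prints the two commands in order. Without the override it needs a running NetworkManager and a profile that is active on that device.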
3. connection autocreation

In what circumstances exactly does NetworkManager autocreate connections?

a) What I found as well is that nmcli connection add...ifname=xxx seems to auto-connect the xxx device to the named interface, thus voiding the need to connect the device.
"autocreate" and "autoconnect" are two entirely different things.

Usually, NetworkManager (the daemon) does not automatically create connection profiles. Profiles are usually created by a NetworkManager client (GUI?) or via files (ifcfg, keyfile). In your example above, `nmcli connection add` is an explicit (client) action by you. No profile gets autocreated here.

There are a few exceptions for when profiles get created:

1) For Bluetooth PAN and Wi-Fi iwd, profiles are generated. They represent the underlying bluez/iwd "profile".

2) See `main.no-auto-default` in NetworkManager.conf. These "auto-default" connections only apply to ethernet and are usually named "Wired Connection 1". When NM starts and sees an ethernet device but has no profile for it, it will create an in-memory profile and go ahead autoactivating it. This profile can be modified (and persisted to disk) or deleted. In both cases, the profile won't be recreated the next time, because the MAC address of the device is blacklisted in /var/lib/NetworkManager/no-auto-default.state. The purpose of this is that you can boot a machine without configuration, and NetworkManager will create some suitable profiles automatically. I don't like this automatism. But on your machine, you commonly don't have this, because you create a persistent, suitable profile, which prevents another auto-default profile from being created.

3) If you do

    nmcli device disconnect eth0
    ip addr add 192.168.7.5/24 dev eth0

NM now creates an in-memory connection "eth0". This means that NetworkManager *does not* manage this device. The device was externally configured by somebody, and this generated profile only indicates that something is active. Such a connection gets autocreated but (depending on how you look at it), NetworkManager does not really activate it. It appears active, but NetworkManager does no configuration of the device. Unfortunately, this is quite confusing and needs improvement.
4) `nmcli device connect $DEVICE` or clicking on a Wi-Fi network in nm-applet may create a new profile and activate it. But I wouldn't call that "autocreated". The client (you) initiated the creation of the profile and activated it right away.

Regarding "autoconnect", that is a property of the connection profile itself. If connection.autoconnect is false, then NetworkManager will not automatically activate the profile. Otherwise, whenever NetworkManager has a device which is currently disconnected but could be activated, it will search whether there are suitable profiles and connect them.

Note that autoconnect can also be blocked for internal reasons. For example, the following will all temporarily block autoconnect:

  - per-device: nmcli device set $DEVICE autoconnect no
  - per-device: nmcli device disconnect $DEVICE
  - per-profile: no secrets are provided (e.g. cancel the password prompt)
  - per-profile: activation failed connection.autoconnect-retries times.

If you add a new profile with "connection.autoconnect" yes (the default in most clients), it may immediately become suitable to autoactivate on a device. Likewise, if you modify a profile, this usually will unblock the autoactivation right away and it may start autoactivating.

Oh, btw, recently we added the "connection.multi-connect" setting. Usually, a profile can only be active on one device at a time. But with "connection.multi-connect" it can be activated on multiple devices simultaneously. Likewise, when a profile with "connection.multi-connect=single" (the default) is already active, it will not autoactivate again. And issuing another `nmcli connection up` will deactivate it first.
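Since autoconnect is just a property of the profile, you can read it back from a script. The following is a rough sketch, assuming nmcli's terse output (`-t`) prints the property as a single `connection.autoconnect:yes` or `connection.autoconnect:no` line; the two function names are made up for illustration:

```shell
# Extract the value from terse nmcli output such as
# "connection.autoconnect:yes" read on standard input.
autoconnect_value() {
    sed -n 's/^connection\.autoconnect://p'
}

# Print the autoconnect setting of a profile.
# $1: profile name (needs a running NetworkManager)
profile_autoconnect() {
    nmcli -t -f connection.autoconnect connection show "$1" | autoconnect_value
}
```

Note that this only reports the property stored in the profile. It says nothing about the temporary, internal blocking listed above.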
Is device connect needed only for special cases, like when we set the NM_UNMANAGED udev property to 1 or do I miss something (refer to question 2 above)
You don't need to mark profiles as autoconnect=yes. In such a case, you must always explicitly activate them. Also, while for ethernet one usually only has one profile per device, it is very possible to have multiple profiles for a device (that's in particular common with Wi-Fi). "Autoconnect" prefers profiles which were active last, so, in the face of multiple profiles that all autoconnect, you quite possibly still need to explicitly select the right one.

NM_UNMANAGED is again something different. A device can be configured as unmanaged in multiple ways:

  - nmcli device set $DEVICE managed no
  - udev rules
  - device.managed in NetworkManager.conf
  - keyfile.unmanaged-devices in NetworkManager.conf
  - NM_CONTROLLED=no in the ifcfg file
  - some device types are marked as unmanaged by default (or always).
  - a software device (e.g. bridge) which is not created by NetworkManager and is down or has no IP addresses is marked as unmanaged.

If a device is unmanaged, NetworkManager does nothing with it. It won't autocreate a profile for it, and it won't autoactivate anything. Depending on the reasons why it's unmanaged, you can still let NetworkManager take over, either via

    nmcli device set $DEVICE managed yes

or simply `nmcli con up` or `nmcli device connect`.
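To see at a glance which devices NetworkManager currently leaves alone, you can filter the device list. A small sketch, assuming the terse form `nmcli -t -f DEVICE,STATE device status` prints one `DEVICE:STATE` line per device and that the state reads "unmanaged" (English/C locale); the function name `unmanaged_devices` is hypothetical:

```shell
# Read terse "DEVICE:STATE" lines on standard input and print the names
# of the devices whose state is "unmanaged".
unmanaged_devices() {
    sed -n 's/:unmanaged$//p'
}
```

Typical use would be `LC_ALL=C nmcli -t -f DEVICE,STATE device status | unmanaged_devices`. It does not tell you why a device is unmanaged; for that you still have to go through the list above.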
Is there a distinction between autocreation vs autoactivation of a connection (if yes in what cases) ?
Yes. See above.
b) auto create and connect when restarted

I'm confused about a case where a connection is autocreated and activated at NetworkManager restart while not at boot (twisted case, I must admit).

Let's take the following use case as an example:

  - a host with 2 ethernet devices: eth0 and eth1
  - eth0 has a network-script like connection file (/etc/sysconfig/network-scripts/ifcfg-eth0) and is NetworkManager managed (either by default because no "NM_CONTROLLED=no" is stated or explicitly with "NM_CONTROLLED=yes")
  - eth1 has no connection file (neither ifcfg-rh style nor keyfile style, that is, nothing for eth1 in either /etc/sysconfig/network-scripts or /etc/NetworkManager/system-connections)
  - no NM_UNMANAGED=1 in udev

test 1:

-> after boot, as expected eth0 is configured and up, nmcli shows a connection for it but eth1 has no connection. It has to be manually connected for that matter. If done, NetworkManager seems to automagically CREATE the connection
I guess on CentOS you have the package NetworkManager-config-server, which installs a file /usr/lib/NetworkManager/conf.d/00-server.conf with:

    [main]
    no-auto-default=*

Hence, no auto-default profiles (case 2 above) are created. You need to create a profile for eth1 manually. That is very fine. I don't like this automatism of auto-default, but whatever suits you.

Yes, if you then do

    $ nmcli device connect eth1

a new profile gets created. See 4) above.
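If you want to check where no-auto-default is set on your machine, something along these lines should do. It is only a sketch: the default directory list is my assumption about the usual configuration locations, and the function name `find_no_auto_default` is made up:

```shell
# Print "file:line" for every configuration file that sets no-auto-default.
# Arguments: directories to search (assumption: without arguments, the
# usual NetworkManager configuration directories are searched).
find_no_auto_default() {
    [ "$#" -gt 0 ] || set -- /usr/lib/NetworkManager/conf.d \
        /etc/NetworkManager /etc/NetworkManager/conf.d
    for dir in "$@"; do
        for file in "$dir"/*.conf; do
            # Skips missing directories and unmatched globs.
            [ -f "$file" ] || continue
            grep -H -E '^[[:space:]]*no-auto-default[[:space:]]*=' "$file"
        done
    done
    return 0
}
```

`NetworkManager --print-config` shows the merged configuration the daemon would use, which is the more authoritative view; the grep above only tells you which file a setting comes from.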
-> Where is that magic described? How does a simple connect CREATE a connection? The man page states that connect looks for a matching EXISTING connection.
man nmcli:

    connect ifname
        Connect the device. NetworkManager will try to find a suitable connection that will be activated. It will also consider connections that are not set to auto connect.

OK, it's not described. That's a documentation bug.
test 2, under the same conditions:

  - reboot of the host
  - only one connection (eth0), as expected
  - create the foobar connection like this:

        nmcli connection add type ethernet con-name foobar ifname eth1

    -> as expected, an ifcfg-rh format connection file has been created in /etc/sysconfig/network-scripts/ifcfg-foobar and nmcli shows 2 connections: eth0 and foobar
  - manually delete the foobar connection file:

        rm /etc/sysconfig/network-scripts/ifcfg-foobar

    -> as expected, nmcli still shows the foobar connection
  - reload connection files via nmcli connection reload

    -> as expected, nmcli only shows the eth0 connection

So far, everything seems to work as expected, BUT: if instead of nmcli connection reload we RESTART (not reload, but restart) NetworkManager via systemctl restart NetworkManager, ANOTHER connection (different UUID, and named eth1 instead of foobar) is auto created and auto activated
Yes, that is case 3). NM (re)starts and sees that the device is already configured. It has no configuration for that device, so it assumes somebody else was configuring it and does not touch it. This eth1 device is no longer managed by NetworkManager. For example, it does no DHCP and your addresses will time out.
I cannot explain why and how this connection is created on restart since no connection is auto created at boot. Granted manually removing the connection file is tricky but even then...
It's fine to remove the file, but you should probably call `nmcli connection reload` afterwards, or be prepared to handle the fallout. It's also fine to restart NetworkManager (but don't, you seldom need to!), but here too: be prepared to fix up the desired state afterwards (if you did something like you did here).
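In script form, "remove the file, then reload" could look roughly like this. The function name `remove_profile_file` and the `NMCLI` override variable are hypothetical; only `rm` and `nmcli connection reload` come from the discussion above:

```shell
# Remove a profile file by hand and make NetworkManager re-read its
# profiles from disk right away.
# $1: path of the profile file, e.g. an ifcfg file
# NMCLI can be overridden (e.g. NMCLI=echo) to try this without NetworkManager.
remove_profile_file() {
    # Stop here if the file could not be removed.
    rm -- "$1" || return
    ${NMCLI:-nmcli} connection reload
}
```

For instance, `remove_profile_file /etc/sysconfig/network-scripts/ifcfg-foobar` covers the foobar case from test 2. The profile disappears from nmcli without a daemon restart, so the "externally configured" case 3) never comes into play.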
Can you help me figure out those points which would complete a bit further my understanding of NetworkManager ? Thanks -- Thomas H.
best, Thomas