Re: Why NM seems to behave differently in initrd from in real root?



On Thu, 2021-10-07 at 22:12 +0800, Coiby Xu via networkmanager-list
wrote:
Hi NM developers,

This is Coiby from the Red Hat Kernel Debug team who is responsible for
Fedora/RHEL's kexec-tools. Currently, kexec-tools parses ifcfg-* or
.nmconnection files to build dracut cmdline parameters like ip= to set
up the kdump initrd network, which is tedious and error-prone. Recently,
I have been implementing a different approach, which is to set up the
kdump initrd network by copying connection profiles from the real root
to the initrd directly. However, one unexpected thing is that NM seems
to behave differently in the initrd than in the real root, and the same
connection profiles copied from the real root lead to different results
in the kdump initrd. So is there a general reason why NM behaves
differently in the initrd and the real root? Is it a better approach for
kexec-tools to set up the kdump initrd network by copying connection
profiles from the real root to the kdump initrd? It would be appreciated
if NM developers could provide answers or comments on these questions,
since you are the experts on this type of problem.

NetworkManager should behave very similarly in the real root and in
initrd, which is probably the point of NetworkManager in initrd in the
first place: to do the same everywhere.

The points you brought up are special cases and configuration issues,
or even missing features, and we can find ways to make those use cases
work better. I replied to the rhbz and the upstream issue, if that helps.


Copying connection profiles seems like a good idea. But the real
problem is that you are writing a non-interactive tool, which is
confronted with some profiles on disk, and then automatically needs to
do the right thing. That is not possible in all cases, even though we
have an API to parse the profile files more conveniently.

For example, if there are two profiles on disk that are both set to
autoconnect on the same device, then a generic, non-interactive tool
cannot understand which one to prefer or what even to do about that.
That holds regardless of whether you copy profiles or parse them and
synthesize new ones. The real solution is that the user must have a
configuration that works for your tool in the first place.
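
To make the two-profiles example concrete, here is a minimal Python
sketch of a check such a tool could run before copying profiles. It is
only an illustration under stated assumptions, not how NetworkManager or
kexec-tools decides anything: it assumes keyfile-format profiles whose
[connection] section carries interface-name and autoconnect (treated as
true when absent), it uses Python's configparser as a rough stand-in for
a real keyfile parser, and it ignores profiles that are not bound to an
interface name. The function names are hypothetical.

```python
import configparser
from collections import defaultdict


def parse_keyfile(text):
    """Parse keyfile-style text; configparser only approximates the format."""
    parser = configparser.ConfigParser(
        interpolation=None,        # keep '%' characters literal
        strict=False,              # tolerate repeated keys
        delimiters=("=",),
        comment_prefixes=("#",),
    )
    parser.optionxform = str       # keep key names case-sensitive
    parser.read_string(text)
    return parser


def autoconnect_conflicts(profiles):
    """Map interface name -> profile names that would all autoconnect on it.

    `profiles` maps a profile name to its keyfile text. Only interfaces
    claimed by more than one autoconnecting profile are returned.
    """
    by_iface = defaultdict(list)
    for name, text in profiles.items():
        parser = parse_keyfile(text)
        if not parser.has_section("connection"):
            continue
        conn = parser["connection"]
        iface = conn.get("interface-name")
        # Assumption: a missing autoconnect key means autoconnect is enabled.
        autoconnect = conn.get("autoconnect", "true").strip().lower()
        if iface and autoconnect in ("true", "1"):
            by_iface[iface].append(name)
    return {i: sorted(n) for i, n in by_iface.items() if len(n) > 1}
```

A non-empty result only tells the tool that the configuration is
ambiguous. It cannot tell the tool which profile the user meant, so the
reasonable reaction is to report the conflict and let the user fix it.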


For the details of how NM behaves differently in the kdump initrd, I've
reported some of the inconsistent behaviours as bugs [1] [2].
connection.wait-device-timeout=6000 and connection.autoconnect=false
can be used to work around [1] and [2] respectively, so that the same
connections can be brought up in the initrd.

I replied to both. Hope that helps. Let's discuss there.
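
For reference, the two workarounds quoted above could be applied to a
copied profile roughly as in the following Python sketch. This is a
hedged illustration, not what kexec-tools does: it assumes the profile
is in keyfile format and that the properties map to the keys
wait-device-timeout and autoconnect in the [connection] section. The
function name and the sample profile are hypothetical, and configparser
drops comments when it rewrites the file.

```python
import configparser
import io


def apply_connection_overrides(keyfile_text, overrides):
    """Return keyfile text with `overrides` set in the [connection] section."""
    parser = configparser.ConfigParser(
        interpolation=None,        # keep '%' characters literal
        strict=False,              # tolerate repeated keys
        delimiters=("=",),
        comment_prefixes=("#",),
    )
    parser.optionxform = str       # keep key names case-sensitive
    parser.read_string(keyfile_text)
    if not parser.has_section("connection"):
        parser.add_section("connection")
    for key, value in overrides.items():
        parser.set("connection", key, str(value))
    out = io.StringIO()
    # Write "key=value" without spaces around the delimiter.
    parser.write(out, space_around_delimiters=False)
    return out.getvalue()


if __name__ == "__main__":
    # Hypothetical profile, used only to show the call.
    sample = "[connection]\nid=team0\ntype=team\ninterface-name=team0\n"
    print(apply_connection_overrides(sample, {"wait-device-timeout": 6000}))
```

Which profile should get which override depends on the bug being worked
around, so the overrides are passed in rather than hard-coded.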


A third issue, for which I haven't found a workaround, is the case of a
bridging network over a VLAN network over a teaming network: I create a
teaming network interface that is used as the parent interface of a VLAN
interface, which is in turn a slave interface of a network bridge. The
problem is that the network bridge sometimes gets an IP address
belonging to the VLAN subnet and sometimes does not. Btw, the third
issue was found on a physical machine and can't be reproduced on a VM.

Hard to say. Seems like a bug? Please report it, so we can track it and
discuss in detail.


I've tested the modified kexec-tools [3] by setting up different
networks, including the aforementioned bridging network over VLAN
network over teaming network. Other tests include bridging network over
physical interface/bonding network/teaming network/VLAN network, VLAN
network over physical interface/bonding network/teaming network, etc.
All tests have passed on a VM. Except for the bridging network over VLAN
network over teaming network, the tests have also passed on one physical
machine.

That sounds promising.

But I'm not sure if they are sufficient, considering there are
machine-specific issues like the znet network device. Any suggestion is
welcome.

There are countless combinations. It's great that you invest time in a
considerable amount of testing!

znet seems difficult. I commented about that on [2].




[1] https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/issues/803
[2] https://bugzilla.redhat.com/show_bug.cgi?id=2007563
[3] https://src.fedoraproject.org/fork/coiby/rpms/kexec-tools/commits/direct_nm



best,
Thomas



