Re: Problem I've been having with dhclient dhcp in Fedora27



On Tue, 2018-01-09 at 12:06 -0500, dpreed deepplum com wrote:
Thanks, Thomas for the hints.
 
I include two attachments that should clarify what is going on. The
output of 'nmcli con show', and then the trace-level log of two
cases, with **************** lines inserted at the beginning of each.
 
The first case, which is what was happening to me that I didn't
understand, is the trace for the two commands 'nmcli con down br0;
nmcli con up br0'.
 
The second case, which apparently works fine, is the trace for a
slightly different way of doing roughly the same thing 'nmcli con
down br0; nmcli con up bridge-slave-eno1'.
 
Your email made it clear that the second case is preferred as a way
to bring up a bridge. In the second case, the bridge acquires the MAC
address of the slave being brought up. In the first case, the bridge
gets a new random MAC address.
 
I'm still not clear on why or whether the result of the first case is
"right". But I get why I need to bring up the connection with the
slave interface to get what I want.
 
Note: because br0 is the source of DHCP requests, the first case's
result really confuses things, and makes br0 unable to get an IP
address after 'nmcli con up br0', whereas the second case renews the
IP address properly.
 
I'm no expert in what NetworkManager *should* do, but it seemed
logical to me that the first case should have noticed the slave and
brought it up first (just as happens in the second case), and done
nothing about br0 until the slave was fully up. Then the rule "choose
the MAC address of the first slave" would have kicked in, and both
cases would have produced equivalent results.
 
Thank you very much for clarifying this order-dependency on bridge
connections. I wonder if there's somewhere it can be documented (like
the Networking Guide of Fedora/RHEL). It would save others some
confusion. I've never contributed to documentation, but I'd be happy
to write something as a draft and send it somewhere.

Hi,

In the first case, you only activate the master br0.
This may not activate any slaves, depending on your configuration.
You end up with a bridge device with no slaves attached.
The fact that the MAC address is unspecified doesn't matter here.
The problem is that the bridge has no carrier unless slaves are
attached. You cannot meaningfully do DHCP unless you have carrier.
You would also see:
  $ nmcli connection up br0
  Connection successfully activated (master waiting for slaves) (D-Bus active path: /org/freedesktop/NetworkManager/ActiveConnection/10)
and
  $ nmcli device 
  nm-bridge  bridge    connecting (getting IP configuration)  br0

Static IPv4 addresses would avoid that problem, because you can
configure them without carrier. Static IPv6 addresses still have that
problem, because you cannot do duplicate address detection without
carrier.
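
To check this on a running system, here is a minimal shell sketch that
reads the carrier flag and the attached ports from sysfs. It assumes the
usual /sys/class/net layout; the function names and the SYSFS_NET
override are made up for illustration and are not part of NetworkManager.

```shell
# Default sysfs location; SYSFS_NET is a hypothetical override for dry runs.
SYSFS_NET="${SYSFS_NET:-/sys/class/net}"

# Succeeds only if the kernel reports carrier on the interface.
# Reading "carrier" can fail while the link is down or the interface
# does not exist; both cases are treated here as "no carrier".
has_carrier() {
    [ "$(cat "$SYSFS_NET/$1/carrier" 2>/dev/null)" = "1" ]
}

# Prints the ports (slaves) currently attached to a bridge, one per line.
bridge_ports() {
    ls "$SYSFS_NET/$1/brif" 2>/dev/null
}

# Reports whether DHCP on the bridge has a chance of succeeding.
dhcp_ready() {
    if has_carrier "$1" && [ -n "$(bridge_ports "$1")" ]; then
        echo "$1: carrier present, DHCP can proceed"
    else
        echo "$1: no carrier or no slaves, DHCP will stall"
        return 1
    fi
}
```

Running `dhcp_ready br0` right after `nmcli con up br0` would be expected
to report the stalled case whenever no slave got attached.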


To fix that, either:

- configure the bridge master with "connection.autoconnect-slaves=yes"

- always make sure to also activate at least one slave. That means
  activating the slaves manually. Note that:

  $ nmcli con up bridge-slave-eno1
    is sufficient, because activating a slave always ensures the
    master is active as well.

  $ nmcli con up br0 && nmcli con up bridge-slave-eno1
    works too, but is redundant and slower

  $ nmcli con up bridge-slave-eno1 && nmcli con up br0
    is wrong, because after the first command, br0 is already
    fully up. So, issuing another `con up` brings br0 down again
    to re-activate it -- and you end up with no slaves again.
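
The working variants above could be wrapped roughly like this. It is only
a sketch: the function names are hypothetical, and the profile names are
whatever `nmcli con show` lists on your system. The value 1 is the
numeric form of "yes" for connection.autoconnect-slaves.

```shell
# Restart a bridge by taking the master down and re-activating it through
# one of its slave profiles, so the bridge never ends up without ports.
restart_bridge_via_slave() {
    master="$1"
    slave="$2"
    nmcli connection down "$master"
    # Activating the slave also activates the master.
    nmcli connection up "$slave"
}

# Alternative: ask NetworkManager to bring up the slaves whenever the
# master profile itself is activated.
enable_autoconnect_slaves() {
    nmcli connection modify "$1" connection.autoconnect-slaves 1
}
```

With the profiles from this thread that would be
`restart_bridge_via_slave br0 bridge-slave-eno1`, or a one-time
`enable_autoconnect_slaves br0`.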


best,
Thomas

 
 
-----Original Message-----
From: "Thomas Haller" <thaller redhat com>
Sent: Tuesday, January 9, 2018 12:34am
To: "dpreed deepplum com" <dpreed deepplum com>, networkmanager-list@gnome.org
Subject: Re: Problem I've been having with dhclient dhcp in Fedora27

On Sat, 2018-01-06 at 17:23 -0500, dpreed deepplum com wrote:
Help needed to understand if this is a bug/feature/weirdness:

I've made some progress diagnosing this problem with losing DHCP
connectivity. I've got it reproducible by a simple command: 'nmcli
con down br0; nmcli con up br0' fails to get a DHCP lease in the "up"
case.

It seems to be the way that NM handles a bridge connection. When
Fedora boots, it comes up with the bridge (br0) using the same MAC
address as the slave (eno1), which is the hardware MAC address of the
wired card. However, if you do 'nmcli con down br0; nmcli con up
br0', the br0 device now has a randomly generated MAC address.

This is a little weird. I suspect I can work around my specific
problem by giving the br0 device a fixed ether.mac-address. However,
I don't know if that is the right thing for others to do in my
situation.

In fact, there is little info about bridge management behavior in NM
docs I can find, so it's not obvious what is the "correct" behavior
of an NM-managed bridge connection.

Should NM be giving the bridge its MAC address from the slave device
the first time? Makes sense, though it's a little unclear what the
"default" should be.

And should the second and later times use "random addresses"?

Seems like there may be two different pieces of NM code that do the
same function of bringing up the interface, but which are not
consistent with each other.

Anyway, I'd like to know what is right.

Hi,

Your previously sent logfile has no level=TRACE logging enabled, so
it's not clear what's happening.
See https://cgit.freedesktop.org/NetworkManager/NetworkManager/tree/contrib/fedora/rpm/NetworkManager.conf
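
As a rough sketch of how to capture such a log, assuming a systemd-based
system where NetworkManager logs to the journal: the level can also be
raised at runtime with `nmcli general logging` instead of editing
NetworkManager.conf. The function names and the output path are
hypothetical.

```shell
# Raise NetworkManager's log level at runtime for all logging domains.
# Assumption: a runtime change is enough here; it does not touch
# NetworkManager.conf.
enable_nm_trace() {
    nmcli general logging level TRACE domains ALL
}

# Dump the NetworkManager journal of the current boot into a file.
save_nm_log() {
    journalctl -u NetworkManager -b --no-pager > "$1"
}
```

A typical run would be `enable_nm_trace`, then reproducing the problem,
then `save_nm_log /tmp/nm-trace.log`.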

You might set a fixed MAC address via "bridge.mac-address" and
"ethernet.cloned-mac-address". The first property is used when creating
the bridge interface, the second later when activating.
In 1.10, "bridge.mac-address" got deprecated, because it's obviously
redundant.
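
A minimal sketch of pinning the address, assuming you want the bridge to
reuse the current MAC of the wired device as reported by sysfs. The
function names and the SYSFS_NET override are hypothetical; only the
nmcli property name comes from the text above.

```shell
# Default sysfs location; SYSFS_NET is a hypothetical override for dry runs.
SYSFS_NET="${SYSFS_NET:-/sys/class/net}"

# Prints the current MAC address of a network device as reported by sysfs.
device_mac() {
    cat "$SYSFS_NET/$1/address"
}

# Copies the MAC of the wired device onto the bridge profile, so the
# bridge keeps the same address regardless of activation order.
pin_bridge_mac() {
    profile="$1"
    device="$2"
    # Give up without touching the profile if the device is unknown.
    mac="$(device_mac "$device")" || return 1
    nmcli connection modify "$profile" ethernet.cloned-mac-address "$mac"
}
```

For the setup in this thread the call would be `pin_bridge_mac br0 eno1`;
the change applies the next time the profile is activated.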

Anyway, in general, if you activate a master device alone with `nmcli
connection up` (be it bridge, bond, or team), then you only activate
the master alone. There is also a "connection.autoconnect-slaves"
property that aims to bring up available slaves. So whether any
slaves are attached is unclear. But quite possibly, no slaves are
attached.

Note that the sequence:
nmcli connection up "$SLAVE"
nmcli connection up "$MASTER"
is wrong, because activating the slave already brings up the master as
well, so activating the master again results in a disconnect of the
slave.
Either do
nmcli connection up "$MASTER"
nmcli connection up "$SLAVE"
or just

nmcli connection up "$SLAVE"

As you set neither "bridge.mac-address" nor
"ethernet.cloned-mac-address", the MAC address of a master device
without slaves is randomly assigned by the kernel.
If the bridge's MAC address is unset (from the kernel's point of view),
the kernel will assign the MAC address of the first slave that attaches.
So, the MAC address changes. Usually, that shouldn't matter, because as
long as there are no slaves, the master's MAC address isn't very
useful.

Long story short, please send a log file to see what's happening.


Thanks,
Thomas



-----Original Message-----
From: "dpreed deepplum com" <dpreed deepplum com>
Sent: Thursday, January 4, 2018 3:59pm
To: networkmanager-list gnome org
Subject: Problem I've been having with dhclient dhcp in Fedora27

I've been having problems with NetworkManager dhcp on my Fedora27
Workstation (desktop, wired). Note, because I'm using VMs on that
workstation, the interface is a bridge (br0 with slave eno1).

What seems to happen is this:
Workstation wakes up from sleeping, reactivates connection.
dhclient issues 4 DHCPDISCOVER tries, none of which get a response.
The interface state changes "unknown-->timeout->done", and the DHCPv4
transaction is cancelled.
A restart is scheduled for 120 seconds later.
The restart 120 seconds later succeeds with DHCPDISCOVER,
DHCPREQUEST, DHCPOFFER, DHCPACK.

All the other machines served by my DHCP server have no problems at
all; however, they are MacBooks, various storage servers, and
RaspberryPis.

It seems to be that NetworkManager somehow interferes with the
DHCPDISCOVER after the workstation wakes up.

The attached log file shows this sequence of events.

(I'm wondering if there is a timing issue because the first dhclient
call is issued when eno1 is in the "unavailable" state, and before it
is captured as a slave to br0)
_______________________________________________
networkmanager-list mailing list
networkmanager-list gnome org
https://mail.gnome.org/mailman/listinfo/networkmanager-list



