Hi Maria,
On Sat, May 25, 2024 at 12:40:31AM +0100, Maria Blackmore via BitFolk Users wrote:
> > I will do some more investigation of this failure mode but in light of
> > doing away with bonding being the direction we are already going, I
> > don't think I want to alter how bonding is done on what will soon be a
> > legacy setup.
>
> Shouldn't this failure mode have been caught by LACPDUs?

Hmm, I'm not sure.
Here's the configuration:
auto bond0
iface bond0 inet static
    bond_mode 1
    slaves eth0 eth1
    bond_miimon 100
$ cat /proc/net/bonding/bond0
Ethernet Channel Bonding Driver: v5.10.0-0.deb10.16-amd64
Bonding Mode: fault-tolerance (active-backup)
Primary Slave: None
Currently Active Slave: eth1
MII Status: up
MII Polling Interval (ms): 100
Up Delay (ms): 0
Down Delay (ms): 0
Peer Notification Delay (ms): 0
Slave Interface: eth0
MII Status: up
Speed: 1000 Mbps
Duplex: full
Link Failure Count: 0
Permanent HW addr: 0c:c4:7a:76:c6:8c
Slave queue ID: 0
Slave Interface: eth1
MII Status: up
Speed: 1000 Mbps
Duplex: full
Link Failure Count: 1
Permanent HW addr: 0c:c4:7a:76:c6:8d
Slave queue ID: 0
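As an aside, the bonding sysfs attributes are another way to confirm what
monitoring the bond is actually doing; a quick sketch, with the values I'd
expect from the config above shown as comments rather than real captured
output:

$ cat /sys/class/net/bond0/bonding/mode          # expect "active-backup 1"
$ cat /sys/class/net/bond0/bonding/miimon        # expect "100"
$ cat /sys/class/net/bond0/bonding/arp_interval  # expect "0" (no ARP monitoring, only MII polling)
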
But during the incident the MII Status remained up (and "ip link" also
showed the interface as up). With tcpdump I could see LACP traffic arriving
on eth1, but no traffic at all going out of it.
This is an old post, but as far as I can see nothing much has improved
since then (unless you count BPF…) for finding out what is going on with
LACP:
https://serverfault.com/a/823865/71071
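For anyone who wants to watch the LACPDUs themselves: they are Slow
Protocols frames, EtherType 0x8809, sent to the 01:80:c2:00:00:02 multicast
address, so something along these lines will show them per slave (a sketch,
not the exact command I ran):

# Capture LACPDUs on one slave; -e prints the link-level header and
# -v has tcpdump decode the LACP actor/partner information.
$ tcpdump -envi eth1 ether proto 0x8809
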
But also, as far as I understand it, mode 1 (active-backup) doesn't require
any LACP support from the switches, so presumably it doesn't care about
LACPDUs at all and only acts on link state?
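
If we did want active-backup to notice a "link up but nothing getting
through" failure, the bonding driver's ARP monitor tests actual
reachability rather than just carrier. A rough sketch only, not something
I'm proposing for the legacy setup: the target address below is a
placeholder (it would need to be the gateway or some other always-reachable
host on the same segment), and as I understand it the ARP monitor is used
instead of miimon rather than alongside it.

auto bond0
iface bond0 inet static
    bond-mode active-backup
    bond-slaves eth0 eth1
    # ARP-probe the target every 1000 ms; if the probes stop getting
    # through on the active slave it is marked down and the bond fails
    # over, even though carrier stays up
    bond-arp-interval 1000
    bond-arp-ip-target 192.0.2.1
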
Thanks,
Andy
--
https://bitfolk.com/ -- No-nonsense VPS hosting