Hi,
Between about 20:25Z and ~20:50Z today host "Jack" lost all
networking. All of the VMs on it became unreachable.
It seems to have been some sort of kernel driver bug in the
Ethernet module: the interface was "stuck", not passing traffic,
but still showed as up.
The hosts have bonded network interfaces to protect against switch
failure, but as the interface stayed up, the link was not considered
failed. Also, the bonds are in active-backup mode and the
currently-active interface was the one that was stuck, so all
traffic was trying to go that way.
Networking was restored by setting the link down and up again.
Traffic started to flow again, BGP sessions re-established and all
was fine again.
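For reference, bouncing the stuck member can be done with iproute2 (run
as root; the interface name here is a made-up example):

```
# Hypothetical member interface name. Forcing the link down makes the
# bond treat that member as failed; bringing it back up re-negotiates
# the link and lets traffic flow again.
ip link set dev enp1s0f0 down
ip link set dev enp1s0f0 up
```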
We could look into some sort of link keepalive method on the bonded
interfaces as opposed to just relying on link state, but we have
already decided to move away from bonded networking in favour of
separate BGP sessions on each interface. That is how the next new
servers will be deployed; they will not have network bonding. We
have not yet tackled moving existing servers to this setup.
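If we did go down the keepalive route, the bonding driver's ARP monitor
is the usual mechanism: the bond periodically ARPs a real next hop
rather than trusting carrier state alone, so a link that is "up" but
not passing traffic gets failed over. A sketch in Debian ifupdown
style, with made-up interface names and addresses:

```
# /etc/network/interfaces sketch (ifenslave); names and addresses are
# hypothetical. Note the ARP monitor replaces miimon (link-state)
# monitoring; the two are not used together.
auto bond0
iface bond0 inet static
    address 192.0.2.10/24
    gateway 192.0.2.1
    bond-slaves enp1s0f0 enp1s0f1
    bond-mode active-backup
    bond-arp-interval 1000
    bond-arp-ip-target 192.0.2.1
```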
If we had been in this situation without bonding, I think we would
have fared better: there would have been a short blip while one
BGP session went down, but the other would have remained, and we'd
be left with some alerting and me scratching my head wondering why an
interface that is up doesn't pass traffic.
I will do some more investigation of this failure mode but in light
of doing away with bonding being the direction we are already going,
I don't think I want to alter how bonding is done on what will soon
be a legacy setup.
Thanks,
Andy
--
https://bitfolk.com/ -- No-nonsense VPS hosting
>
> I will do some more investigation of this failure mode but in light of
> doing away with bonding being the direction we are already going, I don't
> think I want to alter how bonding is done on what will soon be a legacy
> setup.
Shouldn't this failure mode have been caught by LACPDUs?
--
Maria Blackmore
I am responsible for several VPSes, here and elsewhere. Five of them are
running Ubuntu 22.04 and three are running Debian.
The script I use to update them checks, at the end, for the existence of
/var/run/reboot-required
If it finds it, it offers to reboot the VPS.
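The check described above amounts to something like the following
sketch (FLAG is a hypothetical variable so the path can be overridden;
on a real system it is /var/run/reboot-required, created by Ubuntu's
update-notifier hooks when a package such as a new kernel requests a
reboot):

```shell
#!/bin/sh
# Hypothetical end-of-update check. FLAG defaults to the real path but
# can be pointed elsewhere for testing.
FLAG="${FLAG:-/var/run/reboot-required}"

if [ -f "$FLAG" ]; then
    echo "Reboot required on $(hostname):"
    cat "$FLAG"
    # A real script would prompt here before running "reboot".
else
    echo "No reboot required."
fi
```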
It does this happily on all but one VPS, one of the Ubuntu ones here. The
Ubuntu version of apt-get on all of the Ubuntu ones recognises that a
reboot is required after a kernel update etc. and will pop up a message
saying so, but on this single machine the file doesn't seem to exist
afterwards.
I have no idea why not. Anyone got any ideas?
Ian