Hi,
Between about 20:25Z and ~20:50Z today host "Jack" lost all
networking. All of the VMs on it became unreachable.
It seems to have been some sort of kernel driver bug in the
Ethernet module as it was "stuck" not passing traffic but the
interface still showed as up.
The hosts have bonded network interfaces to protect against switch
failure, but because the interface stayed up it was not considered
failed. The bond is also in active-backup mode and the currently-active
interface was the one that was stuck, so all traffic was still being
sent that way.
Networking was restored by setting the link down and up again.
Traffic started to flow again, BGP sessions re-established and all
was fine again.
We could look into some sort of link keepalive method on the bonded
interfaces as opposed to just relying on link state, but we have
already decided to move away from bonded networking in favour of
separate BGP sessions on each interface. That is how the next new
servers will be deployed; they will not have network bonding. We
have not yet tackled moving existing servers to this setup.
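To illustrate the keepalive idea, here is a minimal Python sketch of the
decision logic only. It is not how the Linux bonding driver is
implemented or configured (the driver has its own ARP monitoring
options, arp_interval and arp_ip_target, as an alternative to link-state
monitoring). The LinkStatus type, its fields, the interface names and
the 3 second timeout are all assumptions made up for the example; some
external probe is assumed to record when each interface last saw a
reply.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class LinkStatus:
    """State for one bonded interface (hypothetical fields)."""
    name: str
    carrier_up: bool         # what plain link state reports
    last_reply_age_s: float  # seconds since the last keepalive reply


def is_usable(link: LinkStatus, timeout_s: float = 3.0) -> bool:
    """Usable only if carrier is up AND a reply was seen recently."""
    return link.carrier_up and link.last_reply_age_s <= timeout_s


def pick_active(links: List[LinkStatus], timeout_s: float = 3.0) -> Optional[str]:
    """Return the name of the first usable link, or None if there is none."""
    for link in links:
        if is_usable(link, timeout_s):
            return link.name
    return None
```

With this rule an interface that is "up" but has not passed traffic for
two minutes would be skipped in favour of the backup, which is the case
that link state alone missed here.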
If we had been in the situation without bonding I think we would
have fared better here: there would have been a short blip while one
BGP session went down, but the other would remain and we'd be left
with some alerting and me scratching my head wondering why an
interface that is up doesn't pass traffic.
I will do some more investigation of this failure mode, but since we
are already moving away from bonding, I don't think I want to alter
how bonding is done on what will soon be a legacy setup.
Thanks,
Andy
--
https://bitfolk.com/ -- No-nonsense VPS hosting
***This has been solved; posting for others who may experience it.***
Late yesterday evening I returned home and had a desperate message from the
secretary of the local cancer support charity for whom I provide email
facilities and web support. The email server seemed to be refusing emails
in or out.
I investigated and saw that the VPS had run out of storage space. My first
act was to try to uninstall some programs that have been there since Adam
was a boy. But apt would not play ball. No storage space meant it could not
do it.
So first I contacted Andy to request some more storage space, but he
would not see it, of course, till the morning.
Next I trawled through manually, hunting for larger text or data files I
could happily lose. Interestingly, the fail2ban logs and the latest archive
of them were very large, so they were sacrificed. The mailbox that is used
to collect confirmation of DKIM was bursting at the seams, so those all
went. I will change the config so that I won't get confirmations any more;
6 months of it working seems to be sufficient. Various other gains were
made which should have provided enough space, but nothing was happening and
I was falling asleep at the keyboard, so I gave up until bright and early
today.
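For anyone doing the same hunt, it can be scripted. What follows is a
minimal Python sketch, not what was actually run here; the function name
largest_files and the default of ten results are assumptions for the
example. It uses only the standard library, so it should still run when
apt cannot install anything.

```python
import heapq
import os


def largest_files(root, count=10):
    """Return up to `count` (size_in_bytes, path) pairs, largest first."""
    sizes = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            path = os.path.join(dirpath, filename)
            try:
                # lstat measures a symlink itself rather than its target
                sizes.append((os.lstat(path).st_size, path))
            except OSError:
                # File vanished or is unreadable; skip it
                continue
    return heapq.nlargest(count, sizes)


if __name__ == "__main__":
    for size, path in largest_files("/var/log"):
        print(f"{size:>12}  {path}")
```

Pointing it at /var/log is only a guess at where the bulk usually
lives; any directory can be passed instead.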
This morning Andy had already activated extra storage for me (as well as
giving loads of tips), but it needed activation at my end, which required
installing parted which I could not do as apt could not access any storage.
So I decided to try rebooting (always the first thing recommended). It went
down OK but would not come back up. Dead as a dodo. Undaunted, I fired up a
Xen terminal and rebooted from there. It worked a dream, and the recovered
space was there, so I could now use apt to remove old kernel modules and do
an autoremove, install parted, get the new storage recognised and start on
the slow process of tidying up the disc space and so on. I fully intend to
ask Andy for the Icinga monitoring service so I don't get this again.
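As a stopgap until proper monitoring is in place, a disk-space check is
easy to sketch in Python. This is not Icinga and not how BitFolk's
monitoring works; the function names and the 90% threshold are
assumptions for the example, and it would need to be run regularly
(from cron, say) to be of any use.

```python
import shutil


def percent_used(path="/"):
    """Return the percentage of the filesystem holding `path` that is in use."""
    usage = shutil.disk_usage(path)
    if usage.total == 0:
        return 0.0
    return 100.0 * usage.used / usage.total


def is_nearly_full(path="/", threshold=90.0):
    """True when usage on `path` has reached `threshold` percent."""
    return percent_used(path) >= threshold


if __name__ == "__main__":
    if is_nearly_full("/"):
        print(f"WARNING: / is {percent_used('/'):.1f}% full")
```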
What a fun day, but a day of learning.
Keith
--
Leighton Linslade Cancer Support Group <https://leighton-linslade-csg.org>
CVE-2024-6387 details a flaw in OpenSSH that could *potentially* give an
attacker a root shell in "6-8 hours".
It's not in MITRE yet, but Qualys have named it "regreSSHion" and you can
read about it on their site.
There's an updated package in Debian already, but it looks like the
information's still embargoed (even the openssh package changelog is
404ing), so I can only *assume* they've fixed it but can't tell anyone yet
(it wasn't on security.debian.org just now either).
This is probably an update you don't want to be sleeping on.
Hi,
An unauthenticated remote root exploit has been discovered in OpenSSH,
including in versions shipped by Debian stable and newer, and most
other up-to-date Linux distributions.
https://security-tracker.debian.org/tracker/CVE-2024-6387
Please make sure you have applied the necessary upgrades.
If for some reason you are unable to apply an upgrade, the issue can
be mitigated by setting LoginGraceTime to 0 in /etc/ssh/sshd_config.
This will make it easier for people to tie up all connection slots,
denying access to legitimate connections, but does avoid the remote
root exploit.
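To check whether the mitigation is present in a config file, something
like the following Python sketch may help. The function name
login_grace_time is made up for the example, and the parsing is
deliberately simplistic: it ignores Include files and Match blocks, so
the output of "sshd -T" remains the authoritative view of the effective
setting.

```python
def login_grace_time(config_text):
    """Return the first LoginGraceTime value found, or None if it is unset.

    sshd keywords are case-insensitive and the first value obtained wins,
    so the search stops at the first uncommented match.
    """
    for raw_line in config_text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue
        parts = line.split(None, 1)
        if len(parts) == 2 and parts[0].lower() == "logingracetime":
            return parts[1].strip()
    return None


if __name__ == "__main__":
    with open("/etc/ssh/sshd_config") as fh:
        print(login_grace_time(fh.read()))
```

A result of None means the file does not set the option at all, in
which case sshd falls back to its built-in default rather than 0.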
Thanks,
Andy
--
https://bitfolk.com/ -- No-nonsense VPS hosting
Hi folks
I'm running mail-in-a-box on a Bitfolk VPS.
https://mailinabox.email/
It's making the following complaint:
"This box's reverse DNS is currently aquitaine.richardskingdom.net
(IPv4) and 2001-ba8-1f1-f037-0-0-0-2.autov6rev.bitfolk.space (IPv6), but
it should be aquitaine.richardskingdom.net. Your ISP or cloud provider
will have instructions on setting up reverse DNS for this box."
This is with reverse DNS set to "automatic" in the Bitfolk panel.
The only other panel option seems to be to delegate the reverse IPv6
zones to my name server.
I'm using the mail-in-a-box built-in name server, however, and
delegating to that produces the following result:
"This box's reverse DNS is currently aquitaine.richardskingdom.net
(IPv4) and [Not Set] (IPv6), but it should be
aquitaine.richardskingdom.net ..."
I infer that the mail-in-a-box name server is not setting reverse IPv6
records for itself.
There doesn't appear to be a way to tell mail-in-a-box to set the
reverse DNS correctly via its GUI - there is no option to add a custom
PTR record (other record types can be added).
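For reference, the PTR record in question would live under ip6.arpa with
the address written out nibble by nibble in reverse. The Python sketch
below works out that name with the standard ipaddress module. The
address 2001:ba8:1f1:f037::2 is an assumption read off the automatic
hostname above, and the helper names are invented for the example;
whether mail-in-a-box can be made to serve such a record is a separate
question that this does not answer.

```python
import ipaddress


def ptr_name(address):
    """Return the reverse-DNS owner name (ip6.arpa or in-addr.arpa) for an IP."""
    return ipaddress.ip_address(address).reverse_pointer


def ptr_record(address, hostname):
    """Format a zone-file style PTR line for `address` pointing at `hostname`."""
    return f"{ptr_name(address)}. IN PTR {hostname.rstrip('.')}."


if __name__ == "__main__":
    # Address assumed from the autov6rev hostname quoted above
    print(ptr_record("2001:ba8:1f1:f037::2", "aquitaine.richardskingdom.net"))
```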
I don't know what name server software is running under the hood, and
I'm loath to make config changes except via the GUI in case they get
overwritten when mail-in-a-box updates.
Can anyone advise me how to set my IPv6 reverse DNS to
aquitaine.richardskingdom.net?
I should note the mail server works (I am sending this message through
it) so this is only to make the error message go away (and possibly to
get IPv6 mail transportation working correctly).
If this sounds like an ignorant / nonsense request, congratulations, you
have detected successfully that I have no idea what I'm doing with IPv6...
Thanks in advance
Richard.