Hi,
A customer running multiple Ubuntu 24.04 VPSes has reported problems
with a recent grub package update which gives this error:
grub-install: warning: this GPT partition label contains no BIOS Boot Partition; embedding won't be possible.
and then fails to complete the update, leaving dpkg in an unhappy state.
This is the first report we have seen of this. I am about to try to
replicate it. Is anyone else experiencing it?
I have a working theory that grub has become more strict and when it is
installed on a disk with a GPT (rather than a legacy MBR) it wants to
see an actual partition of code type EF02 "BIOS boot partition" rather
than simply the 4MiB of empty space we have been leaving at the start of
your xvda disk.
If that theory is correct then:
- It may be tricky to fix for existing VPSes
- It's an unfortunate change to introduce during an LTS release (i.e.
this worked when 24.04 was released)
But for now if you are affected I would just like you to get in touch
with me off-list.
While this is irritating and possibly awkward to fix, I don't think it
will end up as a critical issue as we don't actually need grub installed
to boot your VPS, only a grub.cfg that looks correct. It's just that the
easiest way to get that is to properly install grub.
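
If you would like to see whether a disk already has a partition of that
type, something like the following rough sketch may help. It is not a
tested fix and not anything grub itself runs. It assumes lsblk from
util-linux is present and can report NAME and PARTTYPE as JSON, and that
your disk is /dev/xvda. The function names are made up for illustration.
The GUID is the standard GPT type for a BIOS boot partition (what gdisk
calls EF02).

```python
import json
import subprocess

# Standard GPT partition type GUID for "BIOS boot partition" (gdisk code EF02).
BIOS_BOOT_GUID = "21686148-6449-6e6f-744e-656564454649"


def has_bios_boot_partition(lsblk_tree):
    """Return True if any node in lsblk's JSON tree has the BIOS boot type."""
    def walk(nodes):
        for node in nodes:
            # parttype is null for whole disks and "0x83"-style on MBR disks.
            parttype = (node.get("parttype") or "").lower()
            if parttype == BIOS_BOOT_GUID:
                return True
            if walk(node.get("children") or []):
                return True
        return False

    return walk(lsblk_tree.get("blockdevices") or [])


def check_disk(device="/dev/xvda"):
    """Hypothetical helper: ask lsblk about one disk (device is an assumption)."""
    result = subprocess.run(
        ["lsblk", "--json", "-o", "NAME,PARTTYPE", device],
        check=True, capture_output=True, text=True,
    )
    return has_bios_boot_partition(json.loads(result.stdout))
```

Running check_disk() inside an affected VPS would be expected to return
False if the theory above is right, since there is only empty space and
no actual partition at the start of the disk.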
Thanks,
Andy
--
https://bitfolk.com/ -- No-nonsense VPS hosting
Hi,
It was pointed out to us that the HTTPS checks on our monitoring system
were only checking for a valid TLS certificate, not for a success code
from the URL. For example, serving a completely secure 503 error page
would result in an "OK" check result.
This morning at around 09:55 we fixed that so that the HTTPS checks are
really checking the status code of the URL supplied. This has caused a
few new alerts to start being sent to people.
By fixing that, TLS certificate validity is now NOT being checked. We
will shortly add an additional check for this. You don't have to do
anything.
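
To illustrate the difference between the two kinds of check, here is a
minimal sketch in Python. It is not our actual monitoring code. The
function names, the choice of 2xx/3xx as "success" and the result
strings are all assumptions made for the example. The point is only
that a valid certificate and a successful status code are separate
questions, and a check should fail if either one fails.

```python
import ssl
import urllib.error
import urllib.request


def evaluate(cert_valid, status_code):
    """Combine both results; either one failing fails the whole check."""
    if not cert_valid:
        return "CRITICAL: TLS certificate not valid"
    # Treating 2xx and 3xx as success is an assumption for this sketch.
    if status_code is None or not 200 <= status_code < 400:
        return f"CRITICAL: HTTP status {status_code}"
    return "OK"


def check_https(url, timeout=10):
    """Fetch url, verifying the certificate and reading the status code."""
    context = ssl.create_default_context()  # verifies chain and hostname
    try:
        with urllib.request.urlopen(url, timeout=timeout, context=context) as resp:
            return evaluate(True, resp.status)
    except urllib.error.HTTPError as err:
        # The TLS handshake succeeded but the server sent an error status.
        return evaluate(True, err.code)
    except urllib.error.URLError as err:
        if isinstance(err.reason, ssl.SSLCertVerificationError):
            return evaluate(False, None)
        return f"CRITICAL: {err.reason}"
```

With this, the "completely secure 503 error page" case corresponds to
evaluate(True, 503), which is reported as a failure rather than "OK".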
HTTPS and many other checks through our monitoring system are available
free upon request.
https://tools.bitfolk.com/wiki/Monitoring
Thanks,
Andy
--
https://bitfolk.com/ -- No-nonsense VPS hosting
Hi,
At approximately 00:03Z we started receiving alerts about various
services not responding, and it was determined that host talisker was
having some problems with its storage.
There were lots of errors being spewed into the kernel log from the SAS
controller's driver, mostly of a timeout variety, and none of the drives
attached to it were responding. A number of its MD RAID arrays fell
apart as a result and IO errors would have been seen inside your virtual
machines.
I did try a few things around resetting the controller but nothing
worked so at around 00:35 I had to forcibly kill all running VPSes and
reboot the host, which happened at about 00:29.
The host talisker booted without incident and all its RAID arrays synced
up. By around 00:39 all customer VPSes should have booted, and all those
we have monitoring for did show as up by then.
Due to abruptly losing access to storage, some data in memory will have
been lost, but hopefully apps are aware of that. I do not think any
reads or writes were corrupted so I don't think there should be any
filesystem corruption. If you are seeing any problems and your VPS is
actually on talisker then you should first have a look at your Xen
Shell console.
Apologies for the disruption. We will keep an eye on talisker to gain
some assurance that this was a one-off event.
Thanks,
Andy
--
https://bitfolk.com/ -- No-nonsense VPS hosting