Hi,
This is your one-week-to-go reminder that there will be scheduled
maintenance on four servers starting at 23:00 BST (22:00Z) on
Thursday 27 May 2021. The four servers affected will be:
- elephant.bitfolk.com
- limoncello.bitfolk.com
- snaps.bitfolk.com
- talisker.bitfolk.com
The maintenance window is 3 hours long but we expect the work to
take less than 30 minutes per server.
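If you want to convert the window to UTC or your own time zone, the
arithmetic can be sketched in Python as below. This is only an
illustration: BST is modelled as a fixed UTC+1 offset, and the variable
and function names are made up for the example.

```python
from datetime import datetime, timedelta, timezone

# Assumption for this sketch: BST is treated as a fixed UTC+1 offset.
BST = timezone(timedelta(hours=1), "BST")

# Start of the window as announced: 23:00 BST on Thursday 27 May 2021.
window_start = datetime(2021, 5, 27, 23, 0, tzinfo=BST)
# The maintenance window is 3 hours long.
window_end = window_start + timedelta(hours=3)


def to_utc(dt):
    """Convert an aware datetime to UTC."""
    return dt.astimezone(timezone.utc)


print("Window start (UTC):", to_utc(window_start).isoformat())
print("Window end   (UTC):", to_utc(window_end).isoformat())
```

Swapping timezone.utc for another tzinfo gives the window in your local
time instead.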
A direct email has also been sent out to contacts for all customers
on those servers.
If you cannot tolerate the ~30 minutes of downtime at that time
please reply to the email to open a support ticket asking for your
service to be moved to another server. That will take place at a time
of your choosing in the next week, but please ask early if you need it.
There are further details here:
https://tools.bitfolk.com/wiki/Maintenance/2021-05-Re-racking
Please note that we are still moving customers around due to rolling
software upgrades on our servers. That is unrelated to this work,
but right now customers are being moved off of snaps.bitfolk.com.
Possibly that server will be emptied before this work takes place.
The above wiki page tells you how to work out which server your
service is on.
Any other questions, please do let us know!
Cheers,
Andy
--
https://bitfolk.com/ -- No-nonsense VPS hosting
Hi,
Tonight from around 21:00 BST onwards we started getting alerts and
support tickets regarding customer services on host
jack.bitfolk.com.
I had a look and unfortunately it appears to be a re-occurrence of
previous issues regarding stalled IO:
https://lists.bitfolk.com/lurker/message/20210425.071102.9d9a1cc5.en.html
https://lists.bitfolk.com/lurker/message/20210220.032844.00dc9600.en.html
https://lists.bitfolk.com/lurker/message/20201116.003514.25278824.en.html
Things were largely unresponsive so I had to forcibly reboot the
server. Customer services were all booted or in the process of
booting by about 21:53.
What we know so far:
- It must be a software issue, not a hardware issue, as it's
happened on multiple servers of different specifications.
- It's only happening with servers that we've upgraded to Xen
version 4.12.x.
- It's going to be really difficult to track down because there can
be months between occurrences.
Each time this has occurred I've made some change that I'd hoped
would lead to a solution, but I've now tried all the easy things and
so all that remains is to do another software upgrade.
I think we're going to have to build some packages for Xen 4.14 and
install that on a test host and see how it goes. The difficulty is
that even if it seems to work, we'll never really know, because we
could just be in one of the long periods of time where the problem
is not triggered. Clearly once we have seemingly-working packages we
can't leave them spinning for 6 months just to reassure ourselves of
that.
I am also unsure about whether it is a good idea to force additional
downtimes on you in order to upgrade servers to 4.14.x when I don't
even know yet if that will fix the issue. What I can do is have the
upgrade ready and then if/when the issue re-occurs do the upgrade
then, so it boots into that.
Anyway, all I can say is that this is a really unfortunate state of
affairs that obviously I'm not happy with and I'm doing all that I
can to resolve it. These outages are unacceptable and rest assured
they are aggravating me more than anyone else.
Thanks,
Andy Smith
BitFolk Ltd
Hi,
At approximately 17:28 BST we started receiving numerous alerts for
server "macallan" and customer services on it. Upon investigation I
was unable to connect to the IPMI console of the server.
I got in contact with the colo provider who quickly realised that
they were doing work in that rack and had knocked out the power
cable for this server.
The server started booting around 17:35 and all customer VMs had
booted by 17:47.
We use locking power cables in our servers to try to minimise this
sort of thing, but they only lock at one end - the server end. The
server's power cord had come loose at the other end.
"macallan" is one of our older servers which has a single power supply
unit. To mitigate that risk it plugs into an automatic transfer
switch so that its single PSU continues to receive power even if one
of the rack's two PDUs or power feeds fails. Unfortunately that does
not protect it against its single power cord coming out of the ATS.
We have started a hardware refresh and the new spec servers do have
dual PSUs which should help to avoid things like this in future.
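The difference between the two setups can be sketched as a toy model in
Python. It is a deliberate simplification, not a description of the
actual wiring: the function names are invented for the example, and the
dual-PSU case assumes each PSU has its own cord to a different power
feed, which is not stated above.

```python
def single_psu_with_ats_up(feed_a, feed_b, ats_cord):
    """Single-PSU server behind an ATS.

    The ATS can draw from either feed, but the server depends on the
    one cord between the ATS and its PSU being plugged in at both ends.
    """
    return (feed_a or feed_b) and ats_cord


def dual_psu_up(feed_a, cord_a, feed_b, cord_b):
    """Dual-PSU server (assumed: one cord per PSU, one PSU per feed)."""
    return (feed_a and cord_a) or (feed_b and cord_b)


# One feed fails: both designs stay up.
print(single_psu_with_ats_up(feed_a=False, feed_b=True, ats_cord=True))
print(dual_psu_up(feed_a=False, cord_a=True, feed_b=True, cord_b=True))

# One cord is knocked out: only the dual-PSU design stays up.
print(single_psu_with_ats_up(feed_a=True, feed_b=True, ats_cord=False))
print(dual_psu_up(feed_a=True, cord_a=False, feed_b=True, cord_b=True))
```

In this model the single cord out of the ATS is the remaining single
point of failure, which matches what happened here.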
Please accept my apologies for this disruption.
Andy Smith
BitFolk Ltd
Hi,
TL;DR: There are 21 serious security vulnerabilities recently
published for the Exim mail server, 10 of which are remotely
triggerable. Anyone running Exim needs to patch it ASAP or risk
having their server automatically root compromised as soon as an
exploit is cooked up. Which may have happened already.
Details: https://lwn.net/Articles/855282/
We don't usually post about other vendors' security issues on the
announce@ list but I'm making an exception for this one because Exim
is installed by default on all versions of Debian, and more than
60% of BitFolk customers use some version of Debian.
If you're running Exim you need to upgrade it immediately. Package
updates have already been posted for Debian 9 and 10
(stretch/oldstable and buster/stable). The last time this sort of
thing happened with Exim several customers were automatically
compromised. As it's a root level compromise, if it happens to you
then you will never be sure exactly what was done to your server.
You might end up needing to reinstall it.
Most hosts, unless they are acting as a server listed in one or more
domains' MX records, do not need to be remotely accessible on port
25. If that's the case for you then you would be well advised to
reconfigure Exim to only listen on localhost. Though there are still
11 other vulnerabilities that local users could exploit. At least
you'd only get rooted by a friend, right?
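On Debian's packaged Exim the listening addresses are normally set with
dc_local_interfaces in /etc/exim4/update-exim4.conf.conf, but check
your own configuration as setups vary. To verify from outside whether
port 25 is still reachable, something like the following Python sketch
can be run from another machine. The function name is made up for the
example and the hostname in the comment is a placeholder.

```python
import socket


def port_is_open(host, port=25, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name lookup failures.
        return False


# Example (placeholder hostname, run from a machine other than the VPS):
#   port_is_open("yourvps.example.com")
# A True result means something is still accepting connections on
# port 25 from the outside.
```

Note that a False result only shows the port was unreachable from where
you ran the check; a firewall in between would give the same answer.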
An exploit hasn't been published yet but that doesn't mean that one
doesn't exist, and now that the source changes are public it should
be fairly easy for developers to work out how to do it.
Some of the bugs go back to 2004 so basically every Exim install is
at risk. If you are running a release of Debian prior to version 9
(stretch) then it's out of security support and may not ever see an
updated package for this, so you need to strongly consider turning
off any Exim server and doing an OS upgrade before you turn it back
on.
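If you are not sure which Debian release a host is running, the major
version can usually be read from /etc/debian_version. A minimal Python
sketch follows; the function names are invented for the example, and
the threshold of 9 simply reflects the releases mentioned above as
having updated packages at the time of writing.

```python
def debian_major_version(text):
    """Return the major version from /etc/debian_version content.

    Returns None when the content is not numeric, e.g. on testing or
    unstable where the file holds a codename such as "bullseye/sid".
    """
    first = text.strip().split(".")[0]
    return int(first) if first.isdigit() else None


def needs_os_upgrade(text, oldest_supported=9):
    """True if older than the oldest supported release, None if unknown."""
    major = debian_major_version(text)
    if major is None:
        return None
    return major < oldest_supported


# Example usage on a Debian host:
#   with open("/etc/debian_version") as f:
#       print(needs_os_upgrade(f.read()))
```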
If you need help, you could reply to this and seek help from other
customers, or BitFolk can help you as a consultancy service, but you
probably don't want to pay consultancy prices and in any moderately
complicated setup our approach is going to be an OS upgrade anyway.
Email support(a)bitfolk.com to discuss if you are still interested in that.
Best of luck with the upgrading!
Andy