Hi,
In the last couple of hours I unfortunately had to reboot host
"talisker" including full shutdown & boot of all the VMs on it.
It seems from logs that problems started at approximately 01:00. The
first alerts came in at 01:22 when customers started trying to
reboot their VMs. Symptoms for customers were stalled tasks, VMs
unable to shut down properly, and VMs unable to boot again after
being forcibly shut down.
I spent some time trying to investigate but it wasn't making things
any better so by about 02:30 I decided to issue a reboot. Customer
VMs were all back up and running by about 02:45.
I am continuing to investigate the root cause and am keeping a
close eye on things.
Apologies for the disruption this will have caused you.
Thanks,
Andy
--
https://bitfolk.com/ -- No-nonsense VPS hosting
Hi,
==TL;DR version==
You can now perform a mostly-automated install of CentOS 8.x from
our Xen Shell:
https://tools.bitfolk.com/wiki/Using_the_self-serve_net_installer
xen shell> install centos_8
==Full version==
Installing CentOS 8 at BitFolk has previously only been possible by
booting the Rescue VM and doing it in a chroot:
https://tools.bitfolk.com/wiki/Installing_CentOS_8
This is because, as of CentOS 8, Red Hat decided to disable support
for PV and PVH mode Xen guests in all of their kernels, even though
the upstream Linux kernel enables that support by default.
Thanks to some work by Jon Fautley¹, who hacked together a modified
installer kernel and initrd for CentOS and RHEL, we were able to
boot the installer anyway, so a more normal install experience is
now possible.
It is still necessary for CentOS users to switch to the kernel-ml
kernel package from ElRepo, so our installer does that for you.
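For reference, the manual steps the installer automates look
roughly like this (a sketch based on ElRepo's published
instructions; verify the release package URL against their site
before running anything):

```shell
# Import ElRepo's signing key and install the repo release package
# for CentOS 8 (URL per ElRepo's documentation; check it first)
rpm --import https://www.elrepo.org/RPM-GPG-KEY-elrepo.org
dnf -y install https://www.elrepo.org/elrepo-release-8.el8.elrepo.noarch.rpm

# Install the mainline kernel from the elrepo-kernel repository
dnf -y --enablerepo=elrepo-kernel install kernel-ml

# Make the newly installed kernel the default boot entry
grub2-set-default 0
```

After a reboot you should be running the kernel-ml mainline kernel,
which retains the PV/PVH Xen guest support that the stock CentOS 8
kernel drops.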
===But isn't CentOS 8 dead?===
Red Hat recently moved the EOL date for CentOS 8 forward from 2029
to 31 December 2021. After that point, existing CentOS 8 users would
need to switch to CentOS Stream or some other distribution.
We would like to support CentOS Stream, as well as RHEL and perhaps
one of the more popular CentOS replacements (e.g. Rocky Linux)
should they ever make a release. This work was necessary for that.
===Should I install CentOS 8?===
Probably not, given its short remaining lifespan, unless you plan
to switch it to CentOS Stream 8 or RHEL 8 later.
If you do, we'd like to know how you get on with our installer; it
has only received light testing so far.
CentOS 7 is still security supported by the CentOS Linux project
until 30 June 2024.
===What is CentOS Stream?===
I'm not going to try to explain Red Hat's full product lineup. As
far as I understand it, CentOS Stream is a rolling release, i.e.
constantly updated, containing packages that are about to go into
the corresponding RHEL release. Red Hat does not recommend it for
production use.
Red Hat's announcement is here:
https://www.redhat.com/en/blog/faq-centos-stream-updates
===Why are you considering offering RHEL?===
As of 1 February 2021 Red Hat is allowing its free Red Hat Developer
subscription to have up to 16 active servers:
https://www.redhat.com/en/blog/new-year-new-red-hat-enterprise-linux-progra…
It should be possible for us to support this soon, though with the
same caveat that it will likely be necessary to use the kernel-ml
package from ElRepo.
===Other questions?===
Please do ask if there's anything else.
Cheers,
Andy
¹ https://guv.cloud/
--
https://bitfolk.com/ -- No-nonsense VPS hosting
Hi,
At about 0000Z we started receiving alerts of packet loss and
began investigating. The problem was found to be internal, between hosts
"clockwork" and "limoncello" only. That is, everything on both hosts
was reachable from outside our network and also from inside as long
as it wasn't between those two hosts.
As there is a monitoring node on "limoncello", a number of alerts
were sent out about customer services on "clockwork" that it
considered down; those services were only unreachable from
"limoncello" (and vice versa), not actually down.
I tracked the issue to one of the two bonded switch ports for
"clockwork"; bringing that interface down and up again appears to
have cleared it. That happened at about 0045Z.
If the problem reoccurs we can down the interface and have it run on
one interface until the port or switch can be changed. If the
problem is actually in the NIC of the server itself things will be
more tricky, but we'll cross that bridge if we come to it.
Cheers,
Andy
--
https://bitfolk.com/ -- No-nonsense VPS hosting
Hi,
At around 00:15Z we started receiving alerts regarding some servers
on host "elephant". Looking at the machine's console it was
reporting errors with its SAS controller, and was generally
unresponsive to anything requiring block IO, so I had no choice but
to power cycle it.
On boot I couldn't find any issue with its SAS controller, and it
was able to find all its storage devices and seemingly boot
normally. The last few customer VPSes have finished booting as I
type this.
I will keep an eye on things for the next few hours and let you know
about further actions. Please accept my apologies for the
disruption.
This is unrelated to the problems with "elephant" last month which
were tracked down to a kernel bug.
Cheers,
Andy
--
https://bitfolk.com/ -- No-nonsense VPS hosting
Hi,
A few days ago someone asked if we would match a 5% discount that
another hosting company offered to developers of significant open
source projects. After thinking about it for a bit I decided we
would.
So, if you are a registered Debian/Ubuntu Developer, Fedora
maintainer, BSD committer etc etc please feel free to email
support(a)bitfolk.com and ask if you qualify. When you do, please
provide some sort of link to your work so we can verify it.
More information here:
https://tools.bitfolk.com/wiki/Developer_discount
Cheers,
Andy
--
https://bitfolk.com/ -- No-nonsense VPS hosting
Hi,
On Fri, Oct 23, 2020 at 11:46:11AM +0000, Andy Smith wrote:
> On Fri, Oct 23, 2020 at 11:19:21AM +0000, Andy Smith wrote:
> > I'm trying to isolate the issue to one particular VM because if a
> > guest can crash the host then it's a bug in the hypervisor and
> > just moving guests around won't solve the problem.
>
> I can't find it. As we have had problems with elephant before I'm
> going to assume hardware problem and start moving customer VMs to
> other hosts.
While moving customer VMs to other hosts, booting one of them caused
server "macallan" to crash in exactly the same way. So, I am ruling
out hardware issues with "elephant".
By preventing this particular VM from booting I was able to boot all
of the other VMs on "macallan". I have some hope that it is just
this one VM that is tickling a particularly nasty bug.
I am now going to try starting the remaining VMs on "elephant".
If that is successful I will then take the suspect VM to test
hardware to see if I can further reproduce.
I am confused because I am sure I tried reverting last weekend's
hypervisor upgrade to the previous version while investigating
matters on "elephant", yet it still crashed. Possibly I made a
mistake (e.g. booted with wrong hypervisor).
Also, everything obviously booted up fine last weekend when I did
the maintenance so possibly this customer has found a new and
unrelated bug.
The best case at this point is that I can reproduce the problem with
just that one VM, report it, get it fixed and then have to reboot
everything to deploy the fix.
Cheers,
Andy
--
https://bitfolk.com/ -- No-nonsense VPS hosting
Hi,
Server "elephant" unexpectedly crashed, then crashed twice more
shortly after rebooting, before all VPSes had finished starting. It
is now crashing every time it tries to boot VPSes. I suspected a
bug in the last round of XSA patches so reverted to the previous
hypervisor, but the problem persists. We had an issue with
"elephant" not so long ago, so it might be a hardware fault (though
there are no logs to back this up).
Still investigating, sorry.
--
https://bitfolk.com/ -- No-nonsense VPS hosting
Hello,
Unfortunately - and annoyingly only a month since the last lot -
some serious security bugs have been discovered in the Xen
hypervisor and fixes for these have now been pre-disclosed, with an
embargo that ends at 1200Z on 20 October 2020.
As a result we will need to apply these fixes and reboot everything
before that time. We are likely to do this in the early hours of the
morning UK time, on 17, 18 and 19 October.
In the next few days individual emails will be sent out confirming
to you which hour long maintenance window your services are in. The
times will be in UTC; please note that UK is currently observing
daylight savings and as such is currently at UTC+1. We expect the
work to take between 15 and 30 minutes per bare metal host.
If you have opted in to suspend and restore¹ then your VM will be
suspended to storage and restored again after the host it is on is
rebooted. Otherwise your VM will be cleanly shut down and booted
again later.
If you cannot tolerate the downtime then please contact
support(a)bitfolk.com. We may be able to migrate² you to
already-patched hardware before the regular maintenance starts. You
can expect a few tens of seconds of pausing in that case. This
process uses suspend&restore so has the same caveats.
It is disappointing to have another round of security reboots 28
days after the last lot, though before that there was a gap of about
330 days. Still, as there are security implications we have no
choice in the matter.
Cheers,
Andy
¹ https://tools.bitfolk.com/wiki/Suspend_and_restore
² https://tools.bitfolk.com/wiki/Suspend_and_restore#Migration
--
https://bitfolk.com/ -- No-nonsense VPS hosting
Hi,
A reminder that maintenance is scheduled for the early hours (UK
time) of 17, 18 and 19 October.
Irritatingly, this may end up having to be postponed. One of the
patches has problems and the vendor is still working on that. If
they come up with something in the next few hours I will still have
time to test it appropriately, but if they don't then I won't and
we'll have to postpone this maintenance for one week.
Please assume it is going ahead unless you are notified otherwise.
You should have all received a direct email telling you the hour
long maintenance window that each of your VMs is in. If you can't
find it please check your spam folders etc; it was sent on 7
October.
If you still can't find it, work out which host machine you're on¹,
and then the maintenance windows are:
elephant 2020-10-17 00:00
hen 2020-10-18 02:00
hobgoblin 2020-10-18 01:00
jack 2020-10-19 00:00
leffe     2020-10-19 01:00
macallan 2020-10-17 02:00
paradox 2020-10-18 00:00
snaps 2020-10-19 02:00
talisker 2020-10-17 03:00
These times are all in UTC so add 1 hour for UK time (BST).
Cheers,
Andy
¹ This is listed on https://panel.bitfolk.com/ and is also evident
from resolving <accountname>.console.bitfolk.com in DNS, e.g.:
$ host ruminant.console.bitfolk.com
ruminant.console.bitfolk.com is an alias for console.leffe.bitfolk.com.
console.leffe.bitfolk.com is an alias for leffe.bitfolk.com.
leffe.bitfolk.com has address 85.119.80.22
leffe.bitfolk.com has IPv6 address 2001:ba8:0:1f1::d
----- Forwarded message from Andy Smith <andy(a)bitfolk.com> -----
Date: Wed, 7 Oct 2020 09:20:29 +0000
From: Andy Smith <andy(a)bitfolk.com>
To: announce(a)lists.bitfolk.com
Subject: [bitfolk] Reboots will be necessary to address security issues, probably early hours 17/18/19
October
User-Agent: Mutt/1.5.23 (2014-03-12)
Reply-To: users(a)lists.bitfolk.com
Hello,
Unfortunately - and annoyingly only a month since the last lot -
some serious security bugs have been discovered in the Xen
hypervisor and fixes for these have now been pre-disclosed, with an
embargo that ends at 1200Z on 20 October 2020.
As a result we will need to apply these fixes and reboot everything
before that time. We are likely to do this in the early hours of the
morning UK time, on 17, 18 and 19 October.
In the next few days individual emails will be sent out confirming
to you which hour long maintenance window your services are in. The
times will be in UTC; please note that UK is currently observing
daylight savings and as such is currently at UTC+1. We expect the
work to take between 15 and 30 minutes per bare metal host.
If you have opted in to suspend and restore¹ then your VM will be
suspended to storage and restored again after the host it is on is
rebooted. Otherwise your VM will be cleanly shut down and booted
again later.
If you cannot tolerate the downtime then please contact
support(a)bitfolk.com. We may be able to migrate² you to
already-patched hardware before the regular maintenance starts. You
can expect a few tens of seconds of pausing in that case. This
process uses suspend&restore so has the same caveats.
It is disappointing to have another round of security reboots 28
days after the last lot, though before that there was a gap of about
330 days. Still, as there are security implications we have no
choice in the matter.
Cheers,
Andy
¹ https://tools.bitfolk.com/wiki/Suspend_and_restore
² https://tools.bitfolk.com/wiki/Suspend_and_restore#Migration
--
https://bitfolk.com/ -- No-nonsense VPS hosting
----- End forwarded message -----
Hello,
Unfortunately some serious security bugs have been discovered in the
Xen hypervisor and fixes for these have now been pre-disclosed, with
an embargo that ends at 1200Z on 22 September 2020.
As a result we will need to apply these fixes and reboot everything
before that time. We are likely to do this in the early hours of the
morning UK time, on 19, 20 and 21 September.
In the next few days individual emails will be sent out confirming
to you which hour long maintenance window your services are in. The
times will be in UTC; please note that UK is currently observing
daylight savings and as such is currently at UTC+1. We expect the
work to take between 15 and 30 minutes per bare metal host.
If you have opted in to suspend and restore¹ then your VM will be
suspended to storage and restored again after the host it is on is
rebooted. Otherwise your VM will be cleanly shut down and booted
again later.
If you cannot tolerate the downtime then please contact
support(a)bitfolk.com. We may be able to almost-live migrate you to
already-patched hardware before the regular maintenance starts. You
can expect a few tens of seconds of pausing in that case. This is
still a somewhat experimental process and also requires you to opt
in to suspend and restore.
Cheers,
Andy
¹ https://tools.bitfolk.com/wiki/Suspend_and_restore
--
https://bitfolk.com/ -- No-nonsense VPS hosting