Hi,
At about 19:30Z we started receiving alerts for customer services on
server "limoncello".
On investigation it quickly became apparent that this was the
intermittent "I/O stall" problem we've been seeing on all servers
and have been grappling with for months now.
All I could do was power cycle the server.
My current line of investigation is to upgrade both the hypervisor
and the kernel whenever this happens. So far it hasn't recurred on
any of the servers where that has been done, though the sometimes
months-long gap between incidents means it's not possible to be
sure.
Although this last happened 16 days ago, that was on a different
server ("jack").
With the upgrades done, the server was rebooted once more and at
about 20:28Z customer VMs started booting. This was complete by
about 20:45Z.
Cheers,
Andy
--
https://bitfolk.com/ -- No-nonsense VPS hosting