Hi,
On Fri, Oct 23, 2020 at 11:46:11AM +0000, Andy Smith wrote:
> On Fri, Oct 23, 2020 at 11:19:21AM +0000, Andy Smith wrote:
> > I'm trying to isolate the issue to one particular VM because if a
> > guest can crash the host then it's a bug in the hypervisor and
> > just moving guests around won't solve the problem.
>
> I can't find it. As we have had problems with elephant before I'm
> going to assume hardware problem and start moving customer VMs to
> other hosts.

While moving customer VMs to other hosts, booting one of them caused
server "macallan" to crash in exactly the same way. So, I am ruling
out hardware issues with "elephant".
By preventing this particular VM from booting I was able to boot all
of the other VMs on "macallan". I have some hope that it is just
this one VM that is tickling a particularly nasty bug.
I am now going to try starting the remaining VMs on "elephant".
If that is successful I will then move the suspect VM to test
hardware to see if I can reproduce the crash there.
I am confused because I am sure I tried reverting last weekend's
hypervisor upgrade to the previous version while investigating
matters on "elephant", yet it still crashed. Possibly I made a
mistake (e.g. booted with the wrong hypervisor version).
Also, everything booted up fine last weekend when I did the
maintenance, so possibly this customer has found a new and
unrelated bug.
The best case at this point is that I can reproduce the problem with
just that one VM, report it, get it fixed and then have to reboot
everything to deploy the fix.
Cheers,
Andy
--
https://bitfolk.com/ -- No-nonsense VPS hosting