On 5 September 2016 at 04:15, Andy Smith <andy(a)bitfolk.com> wrote:
On Thu, Aug 25, 2016 at 09:11:51PM +0000, Andy Smith wrote:
By now you should have all received notification of the scheduled
maintenance that will be taking place in the early hours of the
morning (UK time) on 2016-09-02, 03 and 05.
Today's maintenance is now completed and went without incident.
Thanks for your patience in this matter. It is unfortunate that we
have had two serious security issues come to light within 6 weeks of
each other.
As mentioned in previous similar events, when we have to reboot a
host we prefer to have all customer VPSes shut down and then boot
again.
Xen and Linux do contain support for suspend and restore, so in
theory it would be possible to suspend customer VPSes to storage and
then restore them again when the host has been booted. To the VPS
and processes inside it this would look like time stood still.
Network connections would most likely drop.
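For illustration only, here is a rough sketch of what that suspend and
restore cycle could look like on a host using Xen's xl toolstack. It
only builds the `xl save` and `xl restore` command lines and runs
nothing. The domain names, the state directory and the ".save" file
suffix are all made up for the example, and it is an assumption that
the host uses xl at all; this is not a description of how any
particular host is actually managed.

```python
import shlex
from pathlib import PurePosixPath


def state_file(name, state_dir):
    """Path of the (hypothetical) file a domain's saved state goes to."""
    return str(PurePosixPath(state_dir) / f"{name}.save")


def suspend_commands(domains, state_dir):
    """Build one `xl save <domain> <file>` command per domain.

    By default `xl save` leaves the domain no longer running once its
    state has been written out, which is what we want before a host reboot.
    """
    return [["xl", "save", name, state_file(name, state_dir)]
            for name in domains]


def restore_commands(domains, state_dir):
    """Build one `xl restore <file>` command per domain, for after the
    host has come back up."""
    return [["xl", "restore", state_file(name, state_dir)]
            for name in domains]


if __name__ == "__main__":
    # Hypothetical domain names and state directory.
    domains = ["vps-example1", "vps-example2"]
    state_dir = "/var/lib/xen/save"
    for cmd in suspend_commands(domains, state_dir):
        print(shlex.join(cmd))
    print("# ... reboot the host here ...")
    for cmd in restore_commands(domains, state_dir):
        print(shlex.join(cmd))
```

Printing the commands instead of executing them keeps the sketch safe
to run anywhere; a real procedure would also need to cope with a
restore that fails.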
Although this is much less disruptive than a shutdown and boot, it
doesn't always work. Some older Linux kernels don't restore
correctly, leaving the VPS in a hung state (requires destroy and
boot again).
Some applications get massively confused by the clock getting stuck.
I have experienced that with pacemaker (clustering software); I
could not get the restored node to rejoin the cluster without
stopping the cluster on every node and starting it again, which was
much more disruptive than just rebooting the VPS would have
been, given it was in an HA cluster anyway!
This sounds like it *might* be a Pacemaker bug, or (more likely)
potentially a worthwhile enhancement; I suggest that you send a mail
to the mailing list describing the use case:
http://clusterlabs.org/mailman/listinfo/users
They might say "sorry, we can't support that", or who knows, they
might say "nice idea, thanks!" Either way, I know several of the
Pacemaker folks and can promise that they are super friendly and
helpful :-)