Hello,
On Thu, Jan 19, 2012 at 10:09:24AM +0000, Andy Smith wrote:
> Provided that does fix the problem, host president.bitfolk.com will
> be similarly rebooted the following day, Tuesday 21st February, also
> at 2200Z.
This is not going well. Ever since president was rebooted last night
it has suffered from poor IO performance:
http://tools.bitfolk.com/cacti/graph_2639.html
I've been working through the night to try to find the root cause,
but am not having much luck.
There are no higher demands being made of the node compared to earlier
on:
http://tools.bitfolk.com/cacti/graph_2640.html
and as far as I can see, the RAID controller is happy with every
disk and with its battery and write cache.
At the moment I am running a reinitialise on the RAID array which is
going to take a couple of hours. Once that completes, if it hasn't
improved matters then I will be rebooting the node again.
There are a couple of settings I think should be changed, and changing
them requires a reboot, although I can't explain why their current
values would make any difference now compared to the last ~250
days. It is a bit of a shot in the dark, admittedly.
If that doesn't work, I shall be on my way to the datacentre to
replace the RAID controller from spares.
I'm afraid it looks like this is going to cut well into UK working
hours today. Please accept my apologies for the disruption; it has
been completely unexpected.
Cheers,
Andy
--
http://bitfolk.com/ -- No-nonsense VPS hosting