Hi Andy,

On Sat, 1 Dec 2018 at 21:55, Andy Smith <andy@bitfolk.com> wrote:
Hi Luke,

On Sat, Dec 01, 2018 at 09:28:32PM +0000, Luke Taylor wrote:
> Over the past week or so my Apache installation has an issue where it
> just.. stops responding and requests just time out

Is only apache affected? i.e. are other things on the VPS responsive
during this time?

Currently, only Apache from what I can tell: I can SSH in just fine and other services seem to be running as expected, and the VPS is performant with practically no CPU load. The issues started earlier this week with the MySQL server not responding; dmesg showed that it was being terminated by the OOM killer because of high memory usage by both MySQL and Apache, so I tweaked their settings to limit memory usage, which seemed to solve that issue. I'm not sure whether it's related. In this case, as Apache won't respond, I haven't noticed any MySQL connection errors. Next time it happens I'll try to connect to MySQL via the command line.
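
For anyone following along, here is a rough sketch of how the kernel log can be scanned for OOM kills like the ones dmesg showed me. The function name `find_oom_kills` is made up for illustration, and the "Killed process <pid> (<name>)" wording the regex looks for is an assumption: the exact message differs between kernel versions, so treat this as a starting point rather than a reliable detector.

```python
import re
import subprocess

# Matches the kernel's "Killed process <pid> (<name>)" report line.
# ASSUMPTION: this wording varies between kernel versions.
KILLED_RE = re.compile(r"Killed process (\d+) \(([^)]+)\)")


def find_oom_kills(log_text):
    """Return a list of (pid, process_name) tuples, one per OOM kill found."""
    return [(int(pid), name) for pid, name in KILLED_RE.findall(log_text)]


if __name__ == "__main__":
    # Reading the kernel log may require root on some systems.
    result = subprocess.run(
        ["dmesg"], stdout=subprocess.PIPE, universal_newlines=True
    )
    for pid, name in find_oom_kills(result.stdout):
        print(pid, name)
```

If this prints nothing the next time Apache hangs, that would suggest the OOM killer is not involved this time around.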
 

> until the server is restarted, 12 hours later same thing happens,
> lather rinse repeat.

When you say "the server is restarted", do you mean the apache
server, or the entire VPS?

Both yield the same result; however, I was referring to "sudo service apache2 restart".
 

> Not sure how else to diagnose the issue :(

There is definitely a bug in Xen which was exposed by the most
recent security patches; that bug most probably only affects 64-bit
guest and can be avoided by upgrading your kernel.

As it involves memory corruption it's hard to predict how that would
manifest itself. If there is a kernel upgrade available then I
recommend you take it.

Do you currently have a file
/sys/devices/system/cpu/vulnerabilities/l1tf ? Is there a kernel
upgrade available? If you upgrade, does that file appear?

Yes, I do have that file, and it contains "Mitigation: PTE Inversion". I only recently upgraded the kernel to 4.4.0-139-generic and installed all available updates a few days ago.
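
In case it helps others on the list check the same thing, below is a minimal sketch that dumps every entry under the vulnerabilities directory mentioned above, not just l1tf. The helper name `read_vulnerabilities` is hypothetical; the only thing taken from this thread is the sysfs path, and on a kernel without these files the function simply returns an empty dict.

```python
import os

# sysfs directory that holds the l1tf file discussed in this thread.
VULN_DIR = "/sys/devices/system/cpu/vulnerabilities"


def read_vulnerabilities(directory=VULN_DIR):
    """Return {filename: status}; an empty dict if the directory is absent."""
    if not os.path.isdir(directory):
        return {}
    result = {}
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        if os.path.isfile(path):
            with open(path) as fh:
                result[name] = fh.read().strip()
    return result


if __name__ == "__main__":
    for name, status in read_vulnerabilities().items():
        print("{}: {}".format(name, status))
```

An empty result after a kernel upgrade would suggest the running kernel does not report these mitigations.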


As far as I am aware, on Ubuntu 16.04 package linux-image-generic
version 4.4.0.133.139 contains the L1TF fix.

If that doesn't help, I can move your VPS to a test host where the
problematic patch is disabled, and that would then rule out whether
it is that. If your problem persists while your VPS is on that host
then it would be something I'm currently unaware of and would be
suspicious of something inside your VPS. But the timing is
suspicious.

If Apache dies again I'll do a little more investigation and contact you off-list to discuss moving to a test host.
 
Thanks for your help.

Cheers,
Luke