[I tried sending this at roughly 3am this morning, and forgot yet again that replying to
this list by default replies to the sender.]
My subconscious woke me up just a minute or two after I got a Pingdom alert that the site
was down. I wish my mind would just let me sleep properly!
This has now happened at 2:29 on the 22nd, 2:19 on the 24th and 2:44 on the 25th.
Coincidence? I think not. My feeling is that an application is being attacked. The only
thing that could come from within is an errant Drupal cron job, but as of a few hours ago
those run at the start of the hour (staggered), and so would not have brought the site
down towards the end of the hour.
Now that I’m awake right when this has happened, I can’t log in to the server via SSH. I
would normally raise an emergency ticket with the hosting company [1], but given the
history, I’m almost interested to see if it is back up for me in a few hours.
On 25 Jul 2014, at 01:10, Ian <ian(a)lovingboth.com> wrote:
I may be misunderstanding something, but why not run WP with the same
Apache? (Apart from the way it would currently bring down everything else!)
I didn’t really want to take this hosting on, but it was a personal favour for a very
loyal client that I do D7 for. They rebuilt their WP site after their previous one was
hacked.
I just don’t trust WP, and it seemed right to be wary of exposing my D7 vhosts to a WP
vhost that was a known target.
Do you think it's crashing because of the load before anything gets written to the
access logs?
No, it’s not crashing. When I have been able to log in in the past, the system has been
full of Apache processes doing nothing, with nothing in the logs. This is what made me
think Slowloris. It seems to me there are loads of connections never being allowed to
finish.
See the wiki article on WordPress and use a fail2ban jail that looks for any access to
wp-login.php and bans the IP address for more than a handful of accesses in a few minutes.
If it's only legitimately accessed from known whitelisted addresses, you can set it to ban
on a single access.
I think that is the next step, yes.
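To make the threshold rule concrete, here is a minimal Python sketch of the logic such a jail applies. It is not fail2ban itself and not its configuration. It assumes the access log is in Apache's combined format, and the function name, the default of 5 hits in 5 minutes and the IP addresses in the usage note are all made up for illustration.

```python
import re
from collections import defaultdict, deque
from datetime import datetime, timedelta

# Client IP, timestamp and request path from a combined-format log line
# (the log format is an assumption, not something taken from the server).
LINE_RE = re.compile(r'^(\S+) \S+ \S+ \[([^\]]+)\] "\S+ (\S+)')


def ips_to_ban(lines, max_hits=5, window=timedelta(minutes=5), whitelist=()):
    """Return the IPs with more than max_hits wp-login.php requests in any window."""
    recent = defaultdict(deque)  # ip -> timestamps of its recent login hits
    banned = set()
    for line in lines:
        match = LINE_RE.match(line)
        if not match:
            continue
        ip, stamp, path = match.groups()
        if ip in whitelist or "wp-login.php" not in path:
            continue
        when = datetime.strptime(stamp, "%d/%b/%Y:%H:%M:%S %z")
        hits = recent[ip]
        hits.append(when)
        # Forget hits that have fallen out of the sliding window.
        while when - hits[0] > window:
            hits.popleft()
        if len(hits) > max_hits:
            banned.add(ip)
    return banned
```

Calling it with `max_hits=0` and a whitelist of the known good addresses gives the "ban on a single access" behaviour described above, for example `ips_to_ban(open(path), max_hits=0, whitelist={"198.51.100.7"})`, where `path` is wherever the WP vhost writes its access log.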
Any thoughts?
What would you do differently?
Have a cron job that checks if the second Apache is running and, if not,
starts it again.
(Just to be clear, I’m now running a single Apache with mpm_itk. I didn’t say it in the
past, but I’d also tried nginx/php-fpm, and I had to keep restarting the php-fpm handler
in that case, too!)
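For reference, the watchdog idea could be sketched in Python roughly as follows. The process name `apache2` and the `service apache2 start` command are assumptions (Debian-style naming) and would need adjusting for the actual box. The `run` parameter exists only so the logic can be exercised without touching a real service.

```python
import subprocess

# Assumed Debian-style names; adjust for the real system.
CHECK_CMD = ["pgrep", "-x", "apache2"]        # exits 0 if a matching process exists
START_CMD = ["service", "apache2", "start"]


def ensure_running(check_cmd, start_cmd, run=subprocess.run):
    """Run check_cmd; if it exits non-zero, run start_cmd.

    Returns True when a start was attempted, False when the check passed.
    """
    # Discard pgrep's PID listing so cron does not mail it on every run.
    if run(check_cmd, stdout=subprocess.DEVNULL).returncode == 0:
        return False
    run(start_cmd)
    return True


def main():
    # Deliberately not called here; call main() at the bottom of the
    # script once the commands above have been checked by hand.
    if ensure_running(CHECK_CMD, START_CMD):
        print("Apache was not running; start attempted")
```

Run from cron every minute or so, this only papers over the outage: when the problem is idle connections holding every worker, the process check will still pass, so it is no substitute for finding the cause.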
Stay up until 2am and have a look at what's happening :)
:)
Time to go back to sleep before the little person wakes me up, again!
B
[1] They have fairly aggressive firewall settings, so I can’t be sure if the machine is
genuinely uncontactable, or if they have triggered a temporary ban. And I can’t see the
Nagios emails until the machine is reachable.