This alerting replaces the manual process of me being sent
excerpts of log files that say a customer's backups are failing,
then opening a support ticket with the customer to make them
aware.
If you receive this alert then your backups are definitely not
happening.
The alert looks like this:
From: nagios@???
Subject: ** PROBLEM alert - backup0.bitfolk.com/Backup age youraccount is CRITICAL **

***** Nagios *****

Notification Type: PROBLEM

Service: Backup age youraccount
Host: backup0.bitfolk.com
Address: 85.119.80.240
State: CRITICAL
Date/Time: Tue Jan 31 12:37:38 UTC 2012

Additional Info:

FILE_AGE CRITICAL: /data/backup/rsnapshot.6-7-4-6/hourly.0/85.119.82.121 is 16842912 seconds old and 4096 bytes
(Those who haven't had a successful backup run in the last couple
of days will see a huge number of seconds listed there, because
the backup system was only recently modified to record the time of
last successful contact.)
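If you want to turn that number of seconds into something more readable, here is a minimal Python sketch. It assumes the "Additional Info" line always has the shape shown in the sample alert above; the regular expression and the function name parse_file_age are made up for illustration and are not part of the backup system.

```python
import re
from datetime import timedelta

# Assumed shape of the "Additional Info" line, taken from the sample alert.
FILE_AGE_RE = re.compile(
    r"FILE_AGE (?P<state>\w+): (?P<path>\S+) is (?P<age>\d+) seconds old"
)


def parse_file_age(line):
    """Return (state, path, age) from a FILE_AGE line, or None if it doesn't match."""
    match = FILE_AGE_RE.search(line)
    if match is None:
        return None
    age = timedelta(seconds=int(match.group("age")))
    return match.group("state"), match.group("path"), age


if __name__ == "__main__":
    sample = (
        "FILE_AGE CRITICAL: /data/backup/rsnapshot.6-7-4-6/hourly.0/"
        "85.119.82.121 is 16842912 seconds old and 4096 bytes"
    )
    state, path, age = parse_file_age(sample)
    print(state, path, age)
```

For the sample alert the age comes out as 194 days, 22:35:12, which is the sort of huge figure described above.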
- Backup space usage
This checks that your total backup space usage is not approaching
your current quota. The thresholds are 95% for a warning and 99%
for a critical.
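To make the thresholds concrete, here is a rough Python sketch of that kind of check. It is not the actual plugin: the function name classify_usage is hypothetical, and treating a value exactly on a threshold as tripping it is my assumption. The exit codes (0 OK, 1 WARNING, 2 CRITICAL) are the standard Nagios plugin ones, and the message format is copied from the sample alert.

```python
# Thresholds as stated above: 95% for a warning, 99% for a critical.
WARN_PERCENT = 95.0
CRIT_PERCENT = 99.0

# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL = 0, 1, 2


def classify_usage(used_mib, quota_mib):
    """Return (exit_code, message) for the given usage and quota in MiB."""
    percent = 100 * used_mib / quota_mib
    if percent >= CRIT_PERCENT:      # assumption: inclusive comparison
        code, label = CRITICAL, "CRITICAL"
    elif percent >= WARN_PERCENT:    # assumption: inclusive comparison
        code, label = WARNING, "WARNING"
    else:
        code, label = OK, "OK"
    message = f"{label} {percent:.2f}% ({used_mib}/{quota_mib}MiB) used"
    return code, message


if __name__ == "__main__":
    print(classify_usage(394, 400))
```

With 394 MiB used of a 400 MiB quota this gives a WARNING at 98.50%. Nothing caps the percentage at 100, so usage over quota just reports as CRITICAL with a figure above 100%.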
This alerting replaces the manual process of me being warned about
customers who are exceeding their quota and then opening support
tickets with them to discuss what they want to do about it².
If you receive this alert then your backups are still happening,
but you're in danger of using (or are already using) more than the
agreed space. If you exceed your quota then we may disable your
backups, which would eventually cause the above backup age alert
to fire.
Please note that we can't update the measurement of how much space
you're using very often. Backup directories contain hundreds of
millions of files, many of which are copies of each other, but
it's not possible to tell which without looking at each one.
Adding it all up takes quite a long time and stresses disk IO quite
badly, so at the moment we update the usage figures once a day at
best.
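To show why the adding up is expensive, here is a small Python sketch. It assumes the "copies" are hard links to the same data, which is how rsnapshot-style snapshot directories usually work; the function name unique_usage_bytes is hypothetical and this is not the script we actually run. The point is that every file has to be stat'ed before you know whether it is a new file or another name for one already counted.

```python
import os


def unique_usage_bytes(root):
    """Sum apparent file sizes under root, counting each hard-linked file once."""
    seen = set()   # (device, inode) pairs already counted
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            # lstat so that symlinks are not followed out of the tree
            st = os.lstat(os.path.join(dirpath, name))
            key = (st.st_dev, st.st_ino)
            if key in seen:
                continue   # another name for data we have already counted
            seen.add(key)
            total += st.st_size
    return total


if __name__ == "__main__":
    print(unique_usage_bytes("."))
```

The set of inodes seen grows with the number of distinct files, and the walk itself touches every directory entry, so on a tree with hundreds of millions of entries this is slow and hard on the disks.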
Also please note that since this is a backup system, deleting
files on your VPS does not immediately result in using less disk
space for backups. It would be a rather pointless backup system if
it threw away deleted files immediately. :) Anything that gets
backed up is going to be kept for as long as your chosen backup
schedule dictates, e.g. 6 months by default. If you have blown
your quota by accidentally allowing large amounts of data to be
backed up, you are still going to have to contact support to get
it deleted.
The alert looks like this:
From: nagios@???
Subject: ** PROBLEM alert - backup2.bitfolk.com/Backup space usage youraccount is WARNING **

***** Nagios *****

Notification Type: PROBLEM

Service: Backup space usage youraccount
Host: backup2.bitfolk.com
Address: 85.119.80.230
State: WARNING

Date/Time: Tue Jan 31 15:52:28 UTC 2012

Additional Info:

WARNING 98.50% (394/400MiB) used
It is possible for the usage to go above 100% because we do allow
you to go over your quota for short periods of time.
Cheers,
Andy
¹ "Successful" as in, rsync connected to your host, did some stuff
and then exited with a non-error exit code. It does not
necessarily mean that what you think should be backed up is being
backed up. As with any backup solution you need to assure yourself
on a regular basis that it's doing what you expect.
² Generally one or more of:
- Buy some more disk space for backups
- Back up fewer files
- Back up less often
-- 
http://bitfolk.com/ -- No-nonsense VPS hosting