Re: [bitfolk] What to do when a customer's backups go above 100%

Author: Nigel Barker
Date:
To: users
Subject: Re: [bitfolk] What to do when a customer's backups go above 100%

Hi Andy

Personally I'd send a warning at 80%, a critical at 90% and fail any backups
that would take usage beyond 100%.
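
Something like the following is what I have in mind. It is only a rough
Python sketch: the function names, the idea of passing in a projected size
for the next run, and the default percentages are my own assumptions, not
anything BitFolk actually runs.

```python
def classify_usage(used_bytes, quota_bytes, warn_pct=80.0, crit_pct=90.0):
    """Map backup usage to an alert level (hypothetical helper).

    Returns 'ok', 'warning', 'critical' or 'over_quota'.
    """
    if quota_bytes <= 0:
        raise ValueError("quota_bytes must be positive")
    pct = 100.0 * used_bytes / quota_bytes
    if pct > 100.0:
        return "over_quota"
    if pct >= crit_pct:
        return "critical"
    if pct >= warn_pct:
        return "warning"
    return "ok"


def should_run_backup(used_bytes, projected_extra_bytes, quota_bytes):
    """Refuse a run that would take usage beyond 100% of the quota.

    projected_extra_bytes is an assumed estimate of what the next run adds.
    """
    return used_bytes + projected_extra_bytes <= quota_bytes
```

The thresholds are parameters so that each customer could pick their own
warning level if that turns out to be wanted.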

If you could give users the ability to delete either an entire backup, or
all backups of a specific file/directory, then it is their problem.

I'm not personally fond of the idea of billing people whose backups grow
beyond their limit; it might require a check of the T&Cs they agreed to.
Whatever you choose, it certainly shouldn't load you with work when a
customer uses more than they've paid for!

Nige

> Hi Rodrigo,
>
> On Mon, Dec 30, 2013 at 12:44:59PM -0200, Rodrigo Campos wrote:
>> On Monday, December 30, 2013, Andy Smith wrote:
>> > - Nagios sends warnings when that usage goes above 95%, sends
>> > critical alerts if it goes above 100%
>>
>>
>> Critical on 100% is maybe too late?
>
> Having just checked it is actually 95% for warning and 99% for
> critical.
>
>> I would say maybe ~90% can be critical, as you are clearly running out of
>> space and won't be able to backup anymore.
>
> That is a fair point although the words "warning" and "critical" are
> at the moment just words used in template text in the alert and
> there is therefore no significance between them except what the
> recipient reads.
>
> Also doubtless different people will consider different percentages
> to be what they want.
>
> There isn't a concept of "running out of space" at the moment - if
> you go above 100% then your backups still work. You just eventually
> get asked by me to pay for more space or have some stuff deleted.
>
> Perhaps it is best if the critical alerts stay at 99% and I allow
> the warning percentage to be configurable.
>
>> Or if the size of the last backup times 7 (like in a week you won't be able
>> to backup anymore) is more than x%, then critical. But maybe is a pita to
>> get the size of the last backup?
>
> It doesn't really work like this as there isn't a concept of "size
> of last backup" - only files that change are backed up, so if you
> had ten 1GB files that did not change since last time then the usage
> would be 10GB even though there are two sets of backups.
>
> If one file changed then both versions would be stored, so the usage
> across both sets of backups would be 11GB. So, there is a
> *differential* of 1GB per backup run, and it is true that I could
> take note of this and compare it to how much space is left then
> guess how many of these backup runs would fit given the same amount
> of diffs every time.
>
> That is really complicated though and I'm not convinced there'd be
> very much value in this compared to just the used percentage.
>
>> If you have the size of the last backup, is it possible to add a check to
>> see if the current backup is X% more than the last one?
>>
>> This seems to me, that I'm totally inexperienced and never dealt with this,
>> that can detect early when something got backed up when it shouldn't?
>
> While possible, these just sound like more alerts that people are not
> going to be very interested in. For those who do use the backups
> service, do you feel that a simple percent used alert isn't good
> enough and you need to know about rates of change?
>
>> But in any case, the most reasonable thing to do for me is to abort the
>> next backups until there is free space.
>
> I'm not sure that is reasonable, and I will explain why below.
>
>> > Note that although "just suspend the customer's backups as soon as
>> > they go past 100%" initially sounds like a good idea, it may not be
>> > as it prevents the customer from removing whatever it was they
>> > backed up that they didn't mean to, i.e. fixing it themselves.
>>
>> Sorry, don't follow you here :-S
>
> The backups are incremental. They aren't just X amount of files
> times Y backup points. It's X amount of files plus the amount of
> changes over a configurable time period that in the default case is
> 6 months but some people have it set to 12 months or more.
>
> The default backup schedule looks like this:
>
> - Once every four hours, keep 6.
> - Once every day, keep 7.
> - Once every week, keep 4.
> - Once every month, keep 6.
>
> This means that (without you contacting support to ask for stuff to
> be deleted out of backups), once a file is backed up, it isn't going
> away for 6 months. Even if you delete it off your disk.
>
> e.g., you create:
>
> /var/tmp/dvd_rip
>
> of 8GB or whatever and it gets backed up, so it's now accessible
> via:
>
> /srv/backups/hourly.0/var/tmp/dvd_rip
>
> Noticing your backup space usage went up by 8GB you delete
> /var/tmp/dvd_rip or otherwise mask it from being backed up.
>
> The file doesn't disappear out of your backups though. At the next
> run it'll be accessible as:
>
> /srv/backups/hourly.1/var/tmp/dvd_rip
>
> and tomorrow it will be:
>
> /srv/backups/daily.0/var/tmp/dvd_rip
>
> and so on.
>
> By now you're probably wondering where I am going with this since it
> doesn't explain how a customer can take some action to reduce the
> space their backups use, in fact all I have done is explain how a
> customer CAN'T fix it.
>
> Well the thing is that at every backup run the oldest iteration is
> being deleted, so on the 6th daily run hourly.5 is being deleted and
> on the 6th monthly run monthly.5 is being deleted.
>
> Therefore if you identify things that have been backed up for a long
> time but which don't actually need to be, you can delete them from
> your disk or else mask them from being backed up, and as they age
> out they won't take up disk space any more.
>
> An example might be the files in /var/log/ which change all the
> time so at every hourly run you will back up a new set of them. If
> you decide that you don't want a backup of them every four hours
> then you might mask them from being backed up. This will have
> immediate effect with the next backup run since that is a set of
> logs that got aged out of hourly.5 and never appeared in hourly.0.
>
> I do take your point though, because there is nothing stopping
> anyone doing the above well before 100% is reached. What I just
> described is also a fairly rare case - the normal cause of suddenly
> going past 100% is mistakenly letting some big transient file be
> backed up, and there's currently no way for the customer to fix that
> by themselves.
>
> At the moment there is no negative effect from going past 100%
> except that I will write to you and ask you to sort it out. So I
> could be wrong about suspending their backups being an unreasonable
> thing to do.
>
> I had suggested the option of "you will automatically order more
> disk and be charged for it" as one possible negative consequence,
> and it appeals to me because it's very simple!
>
> You suggest an alternative negative consequence of "once usage goes
> above 100%, suspend backups". That is also fairly simple, and has
> the advantage that no one gets a bill that they don't intend to pay.
> It has the downside that the customer's backups now will never get
> re-enabled unless they contact me to buy more disk or ask me to
> delete things.
>
> Which of these makes the most sense?
>
> Should both options exist for people to choose between?
>
> If I implemented a way (from the Panel) to nuke the most recent set
> of backups then would that make the "suspend" option the best one as
> the customer can still fix it themselves?
>
> That is, upon receiving the critical alert that they had now used
> more than 100% of backup space and their backups have been disabled,
> they determine that this is because something large got backed up
> that shouldn't have been backed up. They could then go to the Panel
> and delete the most recent backup run, and then backups start
> working again at the next run, all of this without needing to submit
> support tickets and without being charged any extra.
>
> Now I have typed it out, that does sound rather more friendly than
> sending people bills.
>
> Cheers,
> Andy
>
> --
> http://bitfolk.com/ -- No-nonsense VPS hosting
> _______________________________________________
> users mailing list
> users@???
> https://lists.bitfolk.com/mailman/listinfo/users
>
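
For what it's worth, the differential arithmetic Andy describes in the
quoted text (10GB of usage, then 11GB after one 1GB file changes, so 1GB
per run) could be sketched roughly as below. The function name, the usage
history list and the 20GB quota in the usage note are all assumptions of
mine for illustration. It also ignores old backup sets ageing out, which
would bring usage back down, so treat the result as a crude guess.

```python
def runs_until_full(usage_history_gb, quota_gb):
    """Guess how many more backup runs fit in the quota (hypothetical helper).

    usage_history_gb: total backup usage recorded after each run, oldest first.
    Assumes every future run adds the average differential seen so far.
    Returns None when there is no history or no growth to extrapolate from.
    """
    if len(usage_history_gb) < 2:
        return None
    # Differential per run = change in total usage between consecutive runs.
    diffs = [later - earlier
             for earlier, later in zip(usage_history_gb, usage_history_gb[1:])]
    avg_diff = sum(diffs) / len(diffs)
    if avg_diff <= 0:
        return None
    free_gb = quota_gb - usage_history_gb[-1]
    return max(0, int(free_gb // avg_diff))
```

With Andy's numbers and an assumed 20GB quota, runs_until_full([10, 11], 20)
works out to 9 more runs. Whether that is worth alerting on, compared with
a plain percentage, is exactly the question he raises.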
