Hi Andy,
I know it's probably a fair amount of work, but I would advocate
exposing some sort of basic interface that provides a list of files and
allows users to select one to delete. Such an interface would then go
through all snapshots and delete any file with the same path, checksum
and date.
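
As a minimal sketch of that idea (hypothetical, not an existing BitFolk tool): the function names below are made up, and the code assumes all snapshots sit as sibling directories under one root and that "date" means the file's modification time. It picks a file in one snapshot and finds every snapshot copy with the same relative path, SHA-256 checksum and mtime.

```python
import hashlib
import os
from pathlib import Path


def file_fingerprint(path):
    """Return (sha256 hex digest, mtime in whole seconds) for a regular file."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest(), int(os.stat(path).st_mtime)


def find_matching_copies(snapshot_root, reference_snapshot, rel_path):
    """List every snapshot's copy of rel_path that matches the reference copy."""
    root = Path(snapshot_root)
    wanted = file_fingerprint(root / reference_snapshot / rel_path)
    seen = {}  # (device, inode) -> fingerprint, so hardlinked copies are hashed once
    matches = []
    for snapshot in sorted(p for p in root.iterdir() if p.is_dir()):
        candidate = snapshot / rel_path
        # Skip snapshots where the path is absent or is not a plain file.
        if candidate.is_symlink() or not candidate.is_file():
            continue
        st = candidate.stat()
        key = (st.st_dev, st.st_ino)
        if key not in seen:
            seen[key] = file_fingerprint(candidate)
        if seen[key] == wanted:
            matches.append(candidate)
    return matches


def delete_matching_copies(snapshot_root, reference_snapshot, rel_path, dry_run=True):
    """Unlink all matching copies; with dry_run=True only report them."""
    matches = find_matching_copies(snapshot_root, reference_snapshot, rel_path)
    if not dry_run:
        for path in matches:
            path.unlink()
    return matches
```

The dry-run default is deliberate: the customer would see the list of copies first and confirm before anything is removed, and nobody on the support side has to look at file contents.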
The solution you propose isn't really ideal IMO, as it means that months
of backups might have to be deleted to remove a single massive file - but
I understand the need for you to avoid manually looking inside backups or
providing write access to them.
Regards,
Robert
On 2010-05-28, Andy Smith wrote:
Hello,
If you don't make use of the BitFolk backup service then you might
want to skip this.
Those who have backups set up have dedicated some of their disk
space to the backups. The stuff they asked to be backed up is backed
up and multiple levels of snapshots are simulated using hardlinks.
Access is via read-only NFSv3.
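
To illustrate why hardlinked snapshots behave this way, here is a small sketch (the function name is hypothetical, and it measures apparent file size rather than allocated blocks). A file that is hardlinked into several snapshots is stored once, and its space is only released when the last link is removed.

```python
import os


def unique_bytes(top):
    """Total apparent size under top, counting each hardlinked inode only once."""
    seen = set()
    total = 0
    for dirpath, _dirnames, filenames in os.walk(top):
        for name in filenames:
            st = os.lstat(os.path.join(dirpath, name))
            key = (st.st_dev, st.st_ino)
            if key not in seen:
                # First time this inode is seen: count its size.
                seen.add(key)
                total += st.st_size
    return total
```

Running this over the snapshot root before and after removing a file from a single snapshot would show no change as long as any other snapshot still links to the same inode.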
I provide only read-only NFS access because:
a) I don't want people to be able to corrupt their backups; and
b) I don't want people using it for general-purpose file storage.
Unfortunately, from time to time people do end up having things backed
up that they don't want backed up: a very large set of files that were
only temporarily needed, for example. Once content has been backed up
it can be difficult to get rid of, because the entire point of having
multiple levels of snapshots is that deleting the original data won't
delete it from all of the historical snapshots. In many cases we are
talking about 6 months of storage here.
The problem comes when the amount of stuff backed up exceeds the
amount of disk space set aside for backups, and the customer wants
things removed from the backups in order to bring them back under
quota. This is directly at odds with my desire for them not to have
write access to their backups.
I also have a stronger desire to not have to poke about in people's
data, though. [1]
Unless anyone can think of any clever compromises, how about this:
I'll delete entire snapshots for you on request.
If you back up MASSIVE_FILE and then a day later delete it,
its presence in snapshots might be like this:
    /hourly.0   not present
    /hourly.1   present
    /hourly.2   present
    .
    .
    /daily.0    present
    /daily.1    not present
If I deleted every snapshot between hourly.1 and daily.0
inclusive then hourly.0 would become a delta to daily.1,
neither of which would include MASSIVE_FILE, thus greatly
reducing disk space usage.
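
The selection step could be sketched roughly as follows. This is only an illustration: the function name is hypothetical, and it assumes the caller supplies the snapshot names ordered from newest to oldest. It returns the contiguous run of snapshots from the newest one containing the file to the oldest one containing it, inclusive.

```python
from pathlib import Path


def deletion_range(snapshot_root, ordered_names, rel_path):
    """Return the snapshots from the newest to the oldest one holding rel_path."""
    root = Path(snapshot_root)
    present = [
        index
        for index, name in enumerate(ordered_names)
        if (root / name / rel_path).exists()
    ]
    if not present:
        return []  # the file is in no snapshot, so nothing needs deleting
    # Slice is inclusive of both the first and the last snapshot holding the file.
    return ordered_names[present[0]:present[-1] + 1]
```

The function only tests whether a path exists; it never opens or reads the file, so each snapshot can still be treated as an opaque blob.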
This has the advantage that I don't have to poke about in
your files, since a whole snapshot can be treated as an opaque blob
of data for my purposes. It also could be automated reasonably
easily. The obvious downside is that it's a pretty blunt tool; if
the customer leaves MASSIVE_FILE being backed up for a long time
then potentially all their backups will need to be nuked.
Thoughts?
Cheers,
Andy
[1] "hi support, I have accidentally backed up 42GiB of extreme
stoat porn onto your backup server, please can you go in and
delete anything that looks like that so my backups can work
again, thanks."
--
http://bitfolk.com/ -- No-nonsense VPS hosting
Q. How many mathematicians does it take to change a light bulb?
A. Only one - who gives it to six Californians, thereby reducing the problem
to an earlier joke.