Individual snapshot disk usage report added to panel/backups


Andy Smith

24 Mar 2011

11:02 a.m.

Hi,

We've added a report of each snapshot's disk usage to:

    https://panel.bitfolk.com/backups/

This will make it easier for you to spot which snapshot introduced lots of files, for example.

Cheers,
Andy

-- 
http://bitfolk.com/ -- No-nonsense VPS hosting

_______________________________________________
announce mailing list
announce(a)lists.bitfolk.com
https://lists.bitfolk.com/mailman/listinfo/announce


Adam Spiers

24 Mar 2011

11:26 a.m.

On 24 March 2011 11:02, Andy Smith <andy(a)bitfolk.com> wrote:

...

We've added a report of each snapshot's disk usage to: https://panel.bitfolk.com/backups/ This will make it easier for you to spot which snapshot introduced lots of files, for example.

This is great - thanks! However I'm struggling to understand the numbers shown. On my VPS I see hourly snapshots consuming 2.5GB - what does this mean exactly? Presumably not that my VPS is churning that much data per hour, because it should be idle most of the time AFAIK.

Lee Gent

11:33 a.m.

...

On my VPS I see hourly snapshots consuming 2.5GB - what does this mean exactly?

That's how much of *your* files are available in the backup; this is more like a 'virtual' size since the actual size is made up of a series of (perhaps) small deltas applied to your initial backup. Since Andy/rsync only cares about the sizes of the initial backup + all the deltas to date, this isn't how much space is taken up in your backup allowance.

Cheers,
L

Andy Smith

11:38 a.m.


Hi Adam, On Thu, Mar 24, 2011 at 11:26:53AM +0000, Adam Spiers wrote:

...

You refer to the differential report, so there are ~2.5GiB of diffs per snapshot. Since the per-snapshot report shows each snapshot coming in at ~9.5GiB that means you have the same amount of files but with ~2.5GiB changing content every time.

Bear in mind that any time a file changes (even if it's just metadata), a copy of both the old and new version will be stored in their entirety.

I can have a look and see what I can make of it if you like. I'd rather not look at customer data without permission.

If you want to look yourself, look for files that change metadata (mtime, ownership) all the time, like if they're being checked out of version control or downloaded from somewhere without preserving mtimes. Something like that.

Normally you'd use rsnapshot-diff on two of the snapshot directories to see what has changed, but I don't think that'll work with NFS since it probably relies on detecting the hardlinks.

Cheers,
Andy

-- 
http://bitfolk.com/ -- No-nonsense VPS hosting
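The gap between the per-snapshot figure and the differential figure can be reproduced by hand on any set of hardlinked snapshots. This is a minimal sketch, assuming GNU du (which counts a hardlinked file only once per invocation) and hypothetical snapshot directory names; the function name snapshot_diff_usage is made up for illustration and is not part of the panel:

```shell
#!/bin/sh
# Print disk usage in KiB for each snapshot directory given, in order.
# Within one invocation GNU du counts a hardlinked file only once, so the
# first directory shows its full size and each later one shows only the
# data it does not share with the directories listed before it.
snapshot_diff_usage() {
    du -sk -- "$@"
}

# Hypothetical usage (directory names are assumptions):
#   snapshot_diff_usage hourly.1 hourly.0
```

Running du separately on each directory gives the "full" size instead, since nothing is shared within a single-directory invocation.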

Lee Gent

11:42 a.m.

...

Oh right. What Andy said, then ;-) --L

Adam Spiers

12:02 p.m.

On 24 March 2011 11:38, Andy Smith <andy(a)bitfolk.com> wrote:

...

Hi Adam, On Thu, Mar 24, 2011 at 11:26:53AM +0000, Adam Spiers wrote:

Wow, OK - something's not right for sure, as there shouldn't be anywhere near that much churn on my VPS.

...

Bear in mind that any time a file changes (even if it's just metadata), a copy of both the old and new version will be stored in their entirety.

By metadata presumably you mean inode updates? So I can search for churn via find -ctime? In that case, could I expect remounting partitions with the noatime or relatime options to drastically reduce the size of incrementals?

    http://en.wikipedia.org/wiki/Atime_(Unix)#ctime
    http://lwn.net/Articles/244829/

...

I can have a look and see what I can make of it if you like. I'd rather not look at customer data without permission. If you want to look yourself, look for files that change metadata (mtime, ownership) all the time, like if they're being checked out of version control or downloaded from somewhere without preserving mtimes. Something like that.

Ahah. I think I may have located the culprits: huge log files.

    -rw-r--r-- 1 root     root      67797975 Mar 24 11:47 /home/adam/web/adamspiers.org/logs/error.log
    -rw-r--r-- 1 root     root     141526573 Mar 24 11:55 /var/www/redmine/log/mod_rails/error.log
    -rw-r--r-- 1 www-data www-data 314066833 Mar 24 11:55 /var/www/redmine/log/production.log
    -rw-r--r-- 1 www-data www-data 895519982 Mar 24 11:49 /var/www/wordpress/wp-content/debug.log
    -rw-r--r-- 1 root     root     899657028 Mar 24 11:56 /home/adam/web/adamspiers.org/logs/access.log

If I start rotating these regularly, then I could expect the incrementals to significantly shrink, right?

For the benefit of anyone else who wishes to sanity check their churn, I did something like this:

    backup_dirs="/boot /etc /home /initrd /root /srv /usr/local /var"
    find $backup_dirs -ctime -2 >& /tmp/find_-ctime_-2.out
    xargs ls -ld < /tmp/find_-ctime_-2.out | sort -n -k5

If there's a better way then please do share it.
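One possible variant of the churn check above copes with spaces in file names and avoids the temporary file. It is only a sketch, assuming GNU find (for -printf); the function name recent_churn is hypothetical:

```shell
#!/bin/sh
# List regular files whose inode changed within the last DAYS days,
# one per line as "size-in-bytes<TAB>path", smallest first.
# Requires GNU find for -printf. File names containing newlines will
# still confuse the line-based sort.
recent_churn() {
    days=$1
    shift
    find "$@" -type f -ctime "-$days" -printf '%s\t%p\n' | sort -n
}

# Hypothetical usage, showing the ten largest recently changed files:
#   recent_churn 2 /etc /home /var | tail -n 10
```

Restricting the search to regular files keeps directories, whose ctime changes whenever an entry is added or removed, out of the list.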

Andy Smith

12:11 p.m.


Hi Adam, On Thu, Mar 24, 2011 at 12:02:26PM +0000, Adam Spiers wrote:

...

Not sure. It's rsync's:

    --link-dest=DIR         hardlink to files in DIR when unchanged

behaviour, so whatever rsync considers "changed". atime is not going to matter (otherwise everything would be backed up, always), but mtime I expect will, as will owner and group.
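A small experiment is an easy way to see what rsync treats as changed. The sketch below assumes rsync is installed and uses hypothetical directory names; take_snapshot is an illustrative helper, not how the BitFolk backups are actually invoked:

```shell
#!/bin/sh
# Copy SRC into NEW, hardlinking any file that matches its counterpart in
# the previous snapshot PREV in both content and preserved attributes.
# PREV should be an absolute path: rsync resolves a relative --link-dest
# against the destination directory.
take_snapshot() {
    src=$1
    prev=$2
    new=$3
    rsync -a --link-dest="$prev" "$src/" "$new/"
}

# Hypothetical usage:
#   take_snapshot /srv/data /backups/hourly.1 /backups/hourly.0
# Afterwards, [ /backups/hourly.0/FILE -ef /backups/hourly.1/FILE ] is true
# only for files that rsync hardlinked, i.e. considered unchanged.
```

Touching a file in the source, so that only its mtime changes, before the second run should be enough to make rsync store a fresh copy instead of a hardlink.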

...

Ahah. I think I may have located the culprits: huge log files.

...

If I start rotating these regularly, then I could expect the incrementals to significantly shrink, right?

Yes, because far fewer of the files would change.

Cheers,
Andy

-- 
http://bitfolk.com/ -- No-nonsense VPS hosting

"I remember the first time I made love. Perhaps it was not love exactly but I made it and it still works." -- The League Against Tedium
