I found the same on mine a while ago. Fixing it reduced a multi-GB hourly diff to a few MB, which, including the snapshot, came to ~1/5 of the previous size.

--
Deanna Earley

----- Reply message -----
From: "Adam Spiers" <bitfolk@adamspiers.org>
Date: Thu, Mar 24, 2011 12:02
Subject: [bitfolk] Individual snapshot disk usage report added to panel/backups
To: "Andy Smith" <andy@bitfolk.com>
Cc: <users@lists.bitfolk.com>


On 24 March 2011 11:38, Andy Smith <andy@bitfolk.com> wrote:
> Hi Adam,
>
> On Thu, Mar 24, 2011 at 11:26:53AM +0000, Adam Spiers wrote:
>> This is great - thanks!  However I'm struggling to understand the
>> numbers shown.  On my VPS I see hourly snapshots consuming 2.5GB -
>> what does this mean exactly?  Presumably not that my VPS is churning
>> that much data per hour, because it should be idle most of the time
>> AFAIK.
>
> You refer to the differential report, so there are ~2.5GiB of diffs
> per snapshot. Since the per-snapshot report shows each snapshot
> coming in at ~9.5GiB that means you have the same amount of files
> but with ~2.5GiB changing content every time.

Wow, OK - something's not right for sure, as there shouldn't be
anywhere near that much churn on my VPS.

> Bear in mind that any time a file changes (even if it's just
> metadata), a copy of both the old and new version will be stored in
> their entirety.

By metadata presumably you mean inode updates?  So I can search for
churn via find -ctime?  In that case, could I expect remounting
partitions with the noatime or relatime options to drastically reduce
the size of incrementals?

http://en.wikipedia.org/wiki/Atime_(Unix)#ctime
http://lwn.net/Articles/244829/

> I can have a look and see what I can make of it if you like. I'd
> rather not look at customer data without permission.
>
> If you want to look yourself, look for files that change metadata
> (mtime, ownership) all the time, like if they're being checked out
> of version control or downloaded from somewhere without preserving
> mtimes. Something like that.

Ahah.  I think I may have located the culprits: huge log files.

-rw-r--r--  1 root     root      67797975 Mar 24 11:47 /home/adam/web/adamspiers.org/logs/error.log
-rw-r--r--  1 root     root     141526573 Mar 24 11:55 /var/www/redmine/log/mod_rails/error.log
-rw-r--r--  1 www-data www-data 314066833 Mar 24 11:55 /var/www/redmine/log/production.log
-rw-r--r--  1 www-data www-data 895519982 Mar 24 11:49 /var/www/wordpress/wp-content/debug.log
-rw-r--r--  1 root     root     899657028 Mar 24 11:56 /home/adam/web/adamspiers.org/logs/access.log

If I start rotating these regularly, then I could expect the
incrementals to significantly shrink, right?
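
A minimal sketch of what such rotation could look like with logrotate, assuming it is installed and that the applications tolerate copytruncate. The file name /etc/logrotate.d/churny-logs and the daily/7-day retention are illustrative assumptions, not something from this thread; the two paths are taken from the listing above.

```
# /etc/logrotate.d/churny-logs (hypothetical file name)
/var/www/redmine/log/production.log
/var/www/wordpress/wp-content/debug.log {
    daily            # rotate once a day (assumed frequency)
    rotate 7         # keep seven rotated files (assumed retention)
    dateext          # date-stamped names, so rotated files are not renamed again
    compress         # gzip rotated files
    delaycompress    # leave the most recent rotated file uncompressed
    missingok        # do not complain if a log is absent
    notifempty       # skip rotation when the log is empty
    copytruncate     # copy then truncate, for apps that keep the file open
}
```

Whether this shrinks the incrementals depends on how the backup treats renamed and compressed files, so it is worth checking the differential report again afterwards.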

For the benefit of anyone else who wishes to sanity check their churn,
I did something like this:

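 # Directories covered by the backup (left unquoted below so it splits into words)
 backup_dirs="/boot /etc /home /initrd /root /srv /usr/local /var"
 # Anything whose inode changed in the last 2 days; NUL-separated, errors discarded
 find $backup_dirs -ctime -2 -print0 2>/dev/null > /tmp/find_-ctime_-2.out
 # Long listing sorted by size (5th column), largest last
 xargs -0 ls -ld < /tmp/find_-ctime_-2.out | sort -n -k5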

If there's a better way then please do share it.
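
One possible variant of the find/xargs approach above, offered as a sketch rather than a definitive answer: it skips the temporary file and asks find to print the size itself. The function name churn_report is made up for illustration, and -printf is a GNU find extension, so this assumes GNU findutils.

```shell
# churn_report DAYS DIR...
# List regular files whose inode changed within the last DAYS days,
# smallest first, one "size<TAB>path" pair per line.
# (churn_report is a hypothetical name; -printf requires GNU find.)
churn_report() {
    days=$1
    shift
    # %s is the size in bytes, %p the path; permission errors are discarded.
    find "$@" -type f -ctime "-$days" -printf '%s\t%p\n' 2>/dev/null | sort -n
}
```

For example, "churn_report 2 /etc /home /var | tail -n 20" would show the twenty largest recently changed files under those directories.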

_______________________________________________
users mailing list
users@lists.bitfolk.com
https://lists.bitfolk.com/mailman/listinfo/users