Re: du with a cache for read-only dirs
From: Pádraig Brady
Subject: Re: du with a cache for read-only dirs
Date: Thu, 23 Jun 2016 17:38:09 +0100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.3.0

On 23/06/16 17:03, Brian J. Murrell wrote:
> Hi.
>
> I wonder if anyone has considered or has any patches floating around to
> speed up du of a tree of (mostly all) read-only dirs by caching the
> information from any read-only dirs.
>
> My use case is a tree of dirs where each dir is a backup of a bunch of
> machines using hard links across the dirs to synthesize full backups
> with what is really incremental information. I.e. rsync --link-dest as
> implemented by a tool called rsnapshot.
>
> Frequently I want to see what the space delta is between any number of
> those backups and trawling the whole filesystem for any dirs that have
> already been trawled just doesn't seem to make sense. Why not write
> out what is known about that dir into a file that du can then just load
> back the next time it is asked to run?
>
> It is clearly a corner case, indeed. But I wonder if anyone has gone
> down this path and has code to share before I try to find the time to
> write some myself.

I don't think this logic belongs within du itself.
A wrapper that reads such cached values when they are available would probably be better.
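
A minimal sketch of such a wrapper, in Python, might look like the
following. Everything in it is an assumption for illustration rather than
an existing tool: the cache location (~/.du-cache.json), the helper names,
and above all the rule that a directory counts as read-only when no write
permission bit is set on the directory itself. It also trusts that nothing
beneath such a directory ever changes, which the permission bits alone do
not guarantee.

```python
import json
import os
import stat
import subprocess
import sys

# Hypothetical cache location; any writable path would do.
CACHE_FILE = os.path.expanduser("~/.du-cache.json")


def is_read_only(path):
    """True if no write permission bit is set on the directory itself."""
    mode = os.stat(path).st_mode
    return not mode & (stat.S_IWUSR | stat.S_IWGRP | stat.S_IWOTH)


def du_kib(path):
    """Ask du for the total of one tree, in 1024-byte units."""
    out = subprocess.run(["du", "-sk", "--", path], check=True,
                         capture_output=True, text=True).stdout
    return int(out.split()[0])


def cached_du(path, cache, measure=du_kib):
    """Return the size of path, reusing a cached value for read-only dirs."""
    path = os.path.realpath(path)
    if not is_read_only(path):
        # Writable trees may change at any time, so always re-measure.
        return measure(path)
    if path not in cache:
        cache[path] = measure(path)
    return cache[path]


def load_cache(filename=CACHE_FILE):
    """Load the cache, treating a missing or corrupt file as empty."""
    try:
        with open(filename) as f:
            return json.load(f)
    except (FileNotFoundError, json.JSONDecodeError):
        return {}


def save_cache(cache, filename=CACHE_FILE):
    """Write the cache back out as JSON."""
    with open(filename, "w") as f:
        json.dump(cache, f)


if __name__ == "__main__" and sys.argv[1:]:
    cache = load_cache()
    for directory in sys.argv[1:]:
        print(f"{cached_du(directory, cache)}\t{directory}")
    save_cache(cache)
```

Note that each directory is measured by its own du run, so a file
hard-linked into several snapshots is counted once per snapshot. A single
"du a b" invocation would count it only once, so the cached totals are not
directly usable for computing the space delta between backups.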
A more general approach, which I mentioned before in the context of file checksums,
would be extended attributes that the system automatically clears on write,
in which one could cache checksums, sizes, etc.

cheers,
Pádraig