From: Randall Nortman
Subject: Re: [rdiff-backup-users] Discussion of poll winning feature: repository editing
Date: Sun, 1 Jan 2006 08:22:42 -0500
User-agent: Mutt/1.5.9i

On Sun, Jan 01, 2006 at 12:25:57AM -0600, Ben Escoto wrote:
> For the people that voted for repository editing, what did you want
> exactly?  Here's the first thing that crossed my mind:  you have a
> repository at /repo and maybe a directory like /repo/dir that is
> taking up too much space.  You could run something like:
>       rdiff-backup-editor delete /repo/dir

That would be very useful.  I have had a need of this on many
occasions when I decide (for space or performance reasons) to exclude
something from the backup.  But adding it to the exclude list doesn't
remove the history of it from the archive.  On archives with long
history (and I would ideally like some of my archives to keep history
forever), this means that you can never really reclaim that space.
Make sure it works on individual files as well as directories -- I
would suggest allowing the same syntax currently used in
include/exclude lists, including globbing.
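
A minimal sketch of how such a selection might work, in Python.  It
assumes the include/exclude convention in which `*` and `?` do not
cross a `/` while `**` does, and in which a pattern matching a
directory also selects everything beneath it.  The names
`glob_to_regex` and `select_for_deletion` are hypothetical, not part of
rdiff-backup, and the code only picks paths from a list; it does not
touch a repository.

```python
import re


def glob_to_regex(pattern):
    """Translate a shell-style glob into an anchored regular expression.

    Assumed semantics: '**' matches any string, '*' matches any string
    without '/', and '?' matches one character other than '/'.
    """
    out = []
    i = 0
    while i < len(pattern):
        if pattern.startswith("**", i):
            out.append(".*")
            i += 2
        elif pattern[i] == "*":
            out.append("[^/]*")
            i += 1
        elif pattern[i] == "?":
            out.append("[^/]")
            i += 1
        else:
            out.append(re.escape(pattern[i]))
            i += 1
    return re.compile("".join(out) + r"\Z")


def select_for_deletion(paths, patterns):
    """Return the paths matched by any pattern, in their original order."""
    regexes = [glob_to_regex(p) for p in patterns]
    selected = []
    for path in paths:
        parts = path.split("/")
        # A pattern that matches a parent directory selects the path too.
        prefixes = ["/".join(parts[:n]) for n in range(1, len(parts) + 1)]
        prefixes = [pre for pre in prefixes if pre]
        if any(r.match(pre) for r in regexes for pre in prefixes):
            selected.append(path)
    return selected
```

With paths given relative to the repository root, a pattern such as
`dir` would select the whole directory tree, while `**.log` would
select individual log files wherever they live.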

> that would delete /repo/dir and remove all history of it from the
> repository.  It might also be possible to do something like:
>       rdiff-backup-editor move /repo/dir /repo/newname
> which would move /repo/dir to /repo/newname, and alter all the history
> so that all the changes that took place in /repo/dir now seem like
> they took place in /repo/newname.  Or what else did you have in mind?

You would only want to do that to mirror a similar move in the source
filesystem, but rdiff-backup will pick up that move anyway.  The one
advantage of doing it as above would be to save the space taken up by
the gzipped copy of the file in its old location.  I suppose that if
you're moving big files or directory trees around, that could be a lot
of space, but you also have to remember to coordinate the repository
change with the filesystem change.  But you lose the record of the
rename, which means you will no longer be able to restore to the
original state -- unless the repository remembers when the move took
place and restores appropriately, as the more modern version control
systems do.  Overall, I'd rank this a much lower priority than the
delete feature.

I'd like to echo the requests to delete particular increments (i.e.,
merge increments).  This could be used if some part of the source
filesystem temporarily grew very large but you don't really want to
save that data.  You could then just delete/merge all the increments
that were made during that period.  But more importantly, it can be
used to implement the traditional daily/weekly/monthly backup
retention schemes.  For example, you may want to keep daily increments
for a month, then weekly increments for 6 months, then monthly
increments indefinitely.  This could be implemented as a script
(outside of rdiff-backup) which deletes/merges the daily increments
into weeklies once they are more than a month old, then merges
weeklies into monthlies.  For filesystems with lots of changes, this
could save a lot of space.
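
A rough sketch of the selection half of such a script, in Python.  It
only decides which increment timestamps survive; the actual
delete/merge step is the feature being requested and is not shown.  The
function name `increments_to_keep`, the 30-day and 180-day cutoffs, and
the choice to keep the newest increment in each day, week, or month are
all assumptions made for illustration.

```python
from datetime import datetime, timedelta


def increments_to_keep(times, now, daily_days=30, weekly_days=180):
    """Return the sorted subset of increment timestamps to retain.

    Up to daily_days old: one increment per calendar day.
    Up to weekly_days old: one increment per ISO week.
    Older than that: one increment per calendar month.
    """
    keep = {}
    for t in sorted(times):
        age = now - t
        if age <= timedelta(days=daily_days):
            key = ("daily", t.date())
        elif age <= timedelta(days=weekly_days):
            iso = t.isocalendar()
            key = ("weekly", iso[0], iso[1])  # ISO year, ISO week number
        else:
            key = ("monthly", t.year, t.month)
        # Timestamps arrive oldest first, so the newest one in each
        # bucket overwrites the earlier ones and is the one retained.
        keep[key] = t
    return sorted(keep.values())
```

Everything in `times` that is not in the returned list would be a
candidate for merging into its neighbour.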

And one more idea -- splitting/merging repositories.  I currently back
up different parts of my filesystems into different repositories,
mostly because I want different backup frequencies and retention
policies.  For example, I back up mail every 15 minutes and keep
increments only for 30 days, but I back up /usr only once a day and
keep it for at least a year.  I split up the filesystem using a set of
fairly complex include/exclude rules that define each repository.
Sometimes, I might want to change my mind about how things are divided
up, so it would be useful to be able to merge two repositories
together, or split one repository into two, presumably by using
include/exclude lists to define what should remain in the repository
and what should be split out into another repository.  This ranks as a
"nice to have" feature for me, but once the other repository editing
features are in place, this might turn out to be easy to implement.

> Also more generally, why do you want to edit an existing repository?
> Is it to save disk space?

Yup, mostly to save disk space, but also to fix mistakes (such as
backing up something you didn't mean to, which might include sensitive
data -- passwords, etc.).

