
Re: [Gnu-arch-users] arch lkml


From: Eric W. Biederman
Subject: Re: [Gnu-arch-users] arch lkml
Date: 10 Dec 2003 10:41:36 -0700
User-agent: Gnus/5.0808 (Gnus v5.8.8) Emacs/21.2

Miles Bader <address@hidden> writes:

> On Mon, Dec 08, 2003 at 10:25:24AM -0700, Eric W. Biederman wrote:
> 
> > The one very obvious potential issue I see with arch as it currently stands
> > is that it does not use one of the more sophisticated storage formats
> > for storing deltas.  Which means as your archive size increases the work can
> > increase.   I think with a different backend format cacherev would not
> > be necessary.  But I may be wrong.
> 
> Please give details.
> 
> Arch's archive (repository) format is actually very good for a certain set of
> operations, and less good for others -- and this is true of every repository
> format.  The question is whether the things it's good at are the right things
> or not (for instance, it's extremely efficient for doing commits, and for
> doing incremental updates).
> 
> One thing _is_ clear: the current representation is a _huge_ win if
> something goes wrong, because it doesn't use a big-binary-blob (or multiple
> smaller-binary-blobs) for storage; everything is in clear, common formats.

Hmm.  Gzipped tar files are binary blobs too, just in a common format.

As I understand the literature, recent work on version control has
used a variation on the gzip format for storing multiple versions.
The idea is that you compress the first file as normal.  For the
second and subsequent files, you look back into your archive (which
you are simply appending to) and use the previous text as context
for compression.  This makes both appends and random access fast,
and it also happens to work for arbitrary binary files.
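
To make the idea concrete, here is a rough sketch in Python.  It is
not how any particular version control system stores its data; it
only uses zlib's preset dictionary feature (the zdict argument) to
let later entries refer back to earlier text.  The class name
AppendOnlyArchive and the choice to use only the first entry as the
dictionary are my own assumptions, made so that reading any entry
needs at most two decompressions.

```python
import zlib

WINDOW = 32 * 1024  # deflate can only look back 32 KiB, so cap the dictionary


class AppendOnlyArchive:
    """Hypothetical append-only store (illustration only).

    Entry 0 is compressed normally.  Every later entry is compressed
    with the tail of entry 0 as a preset dictionary.
    """

    def __init__(self):
        self.blob = bytearray()  # the archive itself; only ever appended to
        self.index = []          # (offset, length) of each compressed entry

    def _raw(self, i):
        offset, length = self.index[i]
        return bytes(self.blob[offset:offset + length])

    def read(self, i):
        """Random access: decompress entry i without touching entries 1..i-1."""
        if i == 0:
            return zlib.decompress(self._raw(0))
        base = self.read(0)
        d = zlib.decompressobj(zdict=base[-WINDOW:])
        return d.decompress(self._raw(i)) + d.flush()

    def append(self, data):
        """Compress data against entry 0 (if any) and append it to the blob."""
        if not self.index:
            packed = zlib.compress(data)
        else:
            base = self.read(0)
            c = zlib.compressobj(zdict=base[-WINDOW:])
            packed = c.compress(data) + c.flush()
        self.index.append((len(self.blob), len(packed)))
        self.blob += packed
        return len(self.index) - 1
```

A real format would presumably look back at more than the first
entry and would store the index inside the archive; both are left
out here to keep the sketch short.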

Thinking about the implications of this, I wonder why no one has built
a general purpose archiver with a structure like that.  It should
give roughly the same compression ratio as tar | gzip while still
giving you the random access properties of zip.
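
The trade-off can be measured with a small experiment.  The sketch
below is only an approximation: it uses plain zlib streams to stand
in for the two layouts (one stream per file, as zip does, versus one
stream over the concatenation, as tar | gzip does) and ignores
container overhead.  The function name compare_sizes is made up for
this example.

```python
import zlib


def compare_sizes(files):
    """Return (per_file, solid) compressed sizes in bytes.

    per_file: each file compressed on its own (zip-like, random access).
    solid:    all files compressed as one stream (tar | gzip-like).
    """
    per_file = sum(len(zlib.compress(f)) for f in files)
    solid = len(zlib.compress(b"".join(files)))
    return per_file, solid
```

For a set of similar files that fit within the 32 KiB deflate window,
the solid stream should come out smaller, because later files can
reuse text from earlier ones.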

So it looks like the very practical thing to do is to build a general
purpose, compressed, random access file archiver, which can be used
for backups or file archiving, whichever makes most sense, and then
come back and look at improving arch.

Eric



