Re: [Gnu-arch-users] Libraries and changesets

From: Tom Lord
Subject: Re: [Gnu-arch-users] Libraries and changesets
Date: Wed, 31 Dec 2003 10:06:49 -0800 (PST)

    > From: Aaron Bentley <address@hidden>

    > I'll see how it goes.  I didn't want to complain though, I'd rather look
    > for positive alternatives.  Archives seem very efficient.  Compressed
    > revlibs would also save wads of space-- I've seen tla--devo--1.2 shrink
    > by an order of magnitude as a bzip2ed tar.

    > But I'm new to this problem domain, so I'm asking "why" a lot.  Sorry
    > about that-- it probably gets old fast.

Not at all.  This is all part of the "canon" of factoids about arch,
and other people benefit too when we review it on the list.

BTW, do you keep local mirrors of my archive or other remote archives
that you use?  In combination with an aggressively pruned revlib, they
will give you pretty much the space performance that you'd expect from
a "compressed revlib" or a "revlib stored as changesets".  If you
prune your revlib relatively sanely (e.g., getting rid of only old
revisions you are pretty sure you don't need any more) this will also
have excellent speed.

But on to details:

    > Sure, you'd only have doubling the second time you got a
    > revision.  But a sustained growth rate of 20M per star-merge
    > looks unappealing.

(And you go on to mention wanting to star-merge from me frequently --
in other words, I'm assuming that your number is for tla--devo--1.2.)

Let me assume that you're using an ext* file system -- each directory
you create will likely consume 4K (a very small number of directories
might be slightly larger).

tla--devo--1.2 has a bit under 250 directories in it.  (This will
shrink considerably when log compression commands are added but that's
another story...)

Each new revlib entry, _before_ any changes are applied (i.e., if it
is just a literal copy of an ancestor revision) should add a bit
under 1M to your revlib.   There is an exception to that rule:  the
very first ancestor of a particular revision will cost the full size
of the tree -- somewhere between 10M and 20M most likely.
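
The "bit under 1M" figure follows from the directory count alone,
presumably because unchanged files in a new entry are shared with the
ancestor entry rather than copied.  A minimal sketch of that
arithmetic, assuming exactly 4K per directory and rounding the
directory count up to 250 (the names below are illustrative, not part
of arch):

```python
# Rough per-revision directory overhead, using the figures quoted above.
DIR_BLOCK_BYTES = 4 * 1024   # assumed cost of one directory on an ext* file system
TREE_DIRECTORIES = 250       # "a bit under 250" directories in tla--devo--1.2


def directory_overhead_bytes(directories: int,
                             block_bytes: int = DIR_BLOCK_BYTES) -> int:
    """Space taken by the directories alone in one revlib entry."""
    return directories * block_bytes


overhead = directory_overhead_bytes(TREE_DIRECTORIES)
print(f"{overhead / (1024 * 1024):.2f} MiB per revision")  # prints "0.98 MiB per revision"
```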

If nothing else in particular is going on, then a typical star-merge
will want to add two new revisions to your revlib.   The first time
you do this, paying the cost of full copies, 20M growth sounds about
right.

But after that, the overhead should be closer to 2M plus the size of
any files changed in the new revisions.  As just a slightly
pessimistic guess, if you really are star-merging from me frequently,
a 5M growth rate per star-merge sounds about right.  If you do no
revlib pruning,
if development of tla is stunningly rapid, if you star-merge 5 times a
week -- the pessimistic guess is that it will take you most of a year
to fill 1G.
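
The "most of a year" estimate can be checked with a few lines.  This
is only a sketch of the arithmetic above: it assumes 1G means 1024M
and ignores the one-time full-copy cost, and the function name is
made up for illustration.

```python
def weeks_to_fill(capacity_mb: float,
                  growth_mb_per_merge: float,
                  merges_per_week: float) -> float:
    """Weeks until an unpruned revlib reaches capacity_mb."""
    return capacity_mb / (growth_mb_per_merge * merges_per_week)


# 5M per star-merge, 5 star-merges a week, 1G of disk.
weeks = weeks_to_fill(1024, 5, 5)
print(f"{weeks:.0f} weeks")  # prints "41 weeks"
```

At 25M a week that is roughly 41 weeks, which matches the guess of
most of a year.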

And there's absolutely no good reason, if all you're doing is
"tracking the latest" and maybe doing a little hacking, to never prune
your revlib.  By tossing old revisions, you can keep your revlib
_at_a_constant_size_ and continue star-merging indefinitely.
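
One simple pruning policy that gives a constant size is "keep the
newest N revisions of each version".  The sketch below only decides
which revisions to drop; it deliberately does not call tla, and the
function name and the patch-N revision names are assumptions for
illustration.  Actually removing the entries is left to whatever
command or script you use for that.

```python
from typing import List


def revisions_to_prune(revisions: List[str], keep: int) -> List[str]:
    """Return the revisions to drop so that only the newest `keep` remain.

    `revisions` must be ordered oldest to newest.
    """
    if keep < 0:
        raise ValueError("keep must be non-negative")
    # Everything before the last `keep` entries is a candidate for removal.
    return revisions[:max(len(revisions) - keep, 0)]


# Hypothetical library contents for one version, oldest first.
library = [f"patch-{n}" for n in range(1, 11)]
doomed = revisions_to_prune(library, keep=3)
print(doomed[0], "through", doomed[-1])  # prints "patch-1 through patch-7"
```

Run after each star-merge, a policy like this keeps the number of
revlib entries fixed no matter how long you keep tracking.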

    > Fair enough.  There are other costs, though.  I've had to clean
    > up my home directory several times to have enough space for
    > Arch, so for me, there's been a frustration cost.  I probably
    > will install another hard disk, but I'd prefer not to need to.

I feel your pain but on the other hand, 

(a) it should be easy to automate the heck out of pruning your revlib

(b) depending on how comfortable you are with the idea, consider
    using the --link options to `get' and `buildcfg' for your
    project trees

(c) i think that the expected case really is, by far, going to be 
    that storage budgets are:

      some constant K * rate of development activity * project size

    that revlib costs for _unpruned_ revlibs are 

      some constant L * rate of development activity * project size

    and that 

        L << K          (<< meaning "is much less than")

(d) revlib costs for _pruned_ revlibs are fixed:

        some small constant M * project size

    The necessary cost to you for using revlibs to track all of my
    projects should be a few hours labor up front to write the shell
    scripts to prune plus --- oh, i dunno -- perhaps $10 or $20 (and
    falling) per disk-lifetime.

    > > When I first added revlibs, around two years ago, my worst-case
    > > projection was that my fairly dinky disk would be full by now.   In
    > > fact, I've used about 10% of it for revlibs.

    > It's also possible that we have different usage patterns.  I'd like to
    > star-merge your tree fairly frequently.

Most likely my revlib growth rate will be slightly higher than yours
because I periodically star-merge from entirely new sources -- I pay
that "full copy" penalty more often.

