gnu-arch-users

Re: [Gnu-arch-users] patch-log sizes


From: Tom Lord
Subject: Re: [Gnu-arch-users] patch-log sizes
Date: Tue, 16 Dec 2003 14:51:14 -0800 (PST)

    > From: Robert Collins <address@hidden>

    > I just imported the automake CVS tree's HEAD branch into arch (@
    > http://people.initd.org/automake).

    > Without a pristine, the {arch} dir takes up 16M, and the project tree
    > including the {arch} dir takes up 25M.

    > I suspect some users may find this disconcerting or worse.

    > Not sure what can be done sensibly about this... but I thought I'd raise
    > it as a discussion point... if only so that we can put it in the FAQ.

    > Oh, and the primary reason the dir is so big is that there are 3763
    > ~400 byte files using up 4.0K each on disk. Without that overhead, it's
    > only 3.3M in {arch}, and 2.6M of that in the logs subtree. The entire
    > project tree in that case is 8.2M - still a noticeable overhead, but 25%
    > is better than 48%.

So, yes -- there are two issues here.

One is: consider whether or not you should complain upstream to your
filesystem developers.  That 4k overhead for small files is sorely
tempting these days, of course -- except that persistent data
structures consisting of many small files are just so damn handy that
a typical >4x size penalty for each file is not likely to _ever_ make
sense.  (Worse, that penalty is certain to hurt disk latency, likely
to waste I/O bandwidth, and, if nothing else, makes the caching in
the kernel harder to implement reasonably.  So near as I can tell,
its one big win (well, I assume this is related but it might not be)
is that "rm -rf" goes real fast on these file systems.)
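
As a rough check on the figures quoted above, the slack from
block-granular allocation can be estimated as follows.  This is only a
sketch: the function names are made up for illustration, the 400-byte
size is the approximate average from the quoted message, and it
assumes every file occupies a whole number of 4096-byte blocks,
ignoring directories, inodes, and filesystems that pack small files.

```python
def allocated_bytes(size, block=4096):
    """Bytes occupied on disk if data is stored in whole blocks.

    Assumption: a non-empty file always uses ceil(size / block) blocks.
    """
    if size <= 0:
        return 0
    return -(-size // block) * block  # ceiling division, then scale


def slack_report(n_files, avg_size, block=4096):
    """Return (logical, allocated, slack) in bytes for n_files equal-sized files."""
    logical = n_files * avg_size
    allocated = n_files * allocated_bytes(avg_size, block)
    return logical, allocated, allocated - logical


if __name__ == "__main__":
    # Figures taken from the quoted message: 3763 files of ~400 bytes each.
    logical, allocated, slack = slack_report(3763, 400)
    print(f"logical:   {logical / 2**20:.1f} MiB")
    print(f"allocated: {allocated / 2**20:.1f} MiB")
    print(f"slack:     {slack / 2**20:.1f} MiB")
```

Under those assumptions the logs alone come to roughly 14.7 MiB on
disk for roughly 1.4 MiB of data, which is the same order as the gap
reported in the quoted message.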

The other is log pruning.  Do I understand that you've made a tree
that carries around several years' worth of patch log history?  If that
development had been done natively in arch, odds are that along
the way there would have been some discarding of old logs (perhaps
stashing a ChangeLog of what's being discarded, for human readers).
You might want your gateway program to emulate that.
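
The pruning idea can be sketched in a few lines.  This is a
hypothetical illustration, not arch's own mechanism or on-disk
layout: `prune_logs`, the (name, summary) pairs, and the one-line
ChangeLog format are all assumptions made for the example.

```python
def prune_logs(logs, keep):
    """Split patch logs into those to keep and a ChangeLog of the rest.

    logs: list of (patch_name, summary) tuples, oldest first (hypothetical shape).
    keep: number of most recent logs to retain.
    Returns (kept_logs, changelog_text), where changelog_text has one
    line per discarded log so human readers still see the history.
    """
    if keep < 0:
        raise ValueError("keep must be non-negative")
    cut = max(len(logs) - keep, 0)
    discarded, kept = logs[:cut], logs[cut:]
    changelog = "".join(f"{name}: {summary}\n" for name, summary in discarded)
    return kept, changelog


if __name__ == "__main__":
    history = [
        ("patch-1", "initial import"),
        ("patch-2", "fix build"),
        ("patch-3", "update docs"),
    ]
    kept, changelog = prune_logs(history, keep=1)
    print(kept)
    print(changelog, end="")
```

A gateway program could apply the same split at import time, writing
the ChangeLog text into the tree and carrying only the retained logs.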

(Speaking of which, it's my plan to do some log pruning as I make the
--2004 archives and start on the 1.2 line of arch.)

-t



