gnu-arch-users

[Gnu-arch-users] Re: [PATCH] arch speedups on big trees


From: Miles Bader
Subject: [Gnu-arch-users] Re: [PATCH] arch speedups on big trees
Date: 07 Jan 2004 14:38:53 +0900

Chris Mason <address@hidden> writes:
> Yes, there's a space wasting issue, but arch has that in general with id
> files already.

... unless you use taglines.

> > Why not just have one big `signature database' (preferably not `big'
> > in reality of course :-) that includes both inode and pathname
> > information, and make sure it's always kept as up to date as
> > possible by all operations?
>
> I guess I went into this assuming there was a reason arch didn't already
> use an embedded database for ids (Tom's personal taste?).  The reverse
> mapping is really just a database style index into the source tree. 
> There's not much semantic difference between having id files and all
> relevant info stored in a database.

There's a big difference; a `signature database' like I'm talking about
is just an optimization:

  * It's allowed to be out-of-date or nonexistent -- and you have to
    deal with these cases.  However, for non-whole-tree operations like
    changeset application, this sort of up-to-date verification is still
    only per-file, so much, much faster than statting the whole tree.

  * It's `central' (and perhaps monolithic, e.g., a single file or
    indexed db file), and so is a lot easier to grok and update
    efficiently, but doesn't have the automatically-sticks-
    close-to-the-source properties of taglines or explicit tags.

  * It contains lots of useful information (inode contents) that isn't,
    strictly speaking, part of the id-tag, and is at best `advisory'.

I.e., just like inode-sigs today, or indeed like your reverse database.
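
To make the `just an optimization' point concrete, here is a minimal
sketch in Python (not arch's actual code).  Every name in it is
hypothetical: `SignatureCache`, `inode_sig`, and the `read_tag` callback,
which stands in for whatever really extracts an id-tag from a tagline or
explicit id file.  The inode signature is treated as advisory: a lookup
costs one stat of one file, and the real tag is re-read only when the
cached signature is missing or stale.

```python
import os
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass(frozen=True)
class Entry:
    sig: tuple  # advisory inode info, not part of the id-tag itself
    tag: str    # the id-tag cached for this path


def inode_sig(path: str) -> tuple:
    """Cheap per-file signature built from a single stat call."""
    st = os.stat(path)
    return (st.st_ino, st.st_size, st.st_mtime_ns)


class SignatureCache:
    """Hypothetical cache: may be empty or stale at any time."""

    def __init__(self, read_tag: Callable[[str], str]):
        self.read_tag = read_tag              # assumed: the slow, authoritative path
        self.by_path: Dict[str, Entry] = {}   # pathname -> Entry
        self.by_tag: Dict[str, str] = {}      # reverse index: tag -> pathname

    def forget(self, path: str) -> None:
        entry = self.by_path.pop(path, None)
        if entry is not None and self.by_tag.get(entry.tag) == path:
            del self.by_tag[entry.tag]

    def lookup(self, path: str) -> Optional[str]:
        """Return the tag for path, revalidating with one stat."""
        try:
            sig = inode_sig(path)
        except FileNotFoundError:
            self.forget(path)
            return None
        entry = self.by_path.get(path)
        if entry is None or entry.sig != sig:
            # Missing or out-of-date: fall back to the real source of truth.
            self.forget(path)
            entry = Entry(sig, self.read_tag(path))
            self.by_path[path] = entry
            self.by_tag[entry.tag] = path
        return entry.tag

    def path_for_tag(self, tag: str) -> Optional[str]:
        """Reverse lookup; the answer is revalidated before it is trusted."""
        path = self.by_tag.get(tag)
        if path is not None and self.lookup(path) != tag:
            return None
        return path
```

Throwing the whole cache away is always safe in this sketch; the only
cost is that the next lookups call `read_tag` again.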

Really, since you're going to the trouble of updating cached info during
changeset-application _anyway_, why not keep a single
inode-info-tag-pathname database (indexed however you want), and update
it whenever you can?

For `targeted updates', e.g. changeset application or limited commits,
you'd just validate and update those files/tags you were interested in;
anything doing a whole-tree inventory would validate/update the whole
tree (which it already has to do).  The fact that this info would now
be up-to-date after applying a bunch of changesets would be a _huge_
win because it would make future whole-tree changeset calculation much
faster (because the inode data would be a lot closer to reality).
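
The targeted versus whole-tree split might be sketched like this in
Python.  Again this is only an illustration under assumptions, not
arch's implementation: the database is a plain dict from pathname to
(inode signature, id-tag), and `refresh`, `refresh_tree`, and the
`read_tag` callback are all hypothetical names.  Both functions return
how many tags had to be re-read the slow way, which is the cost that
keeping the database current is meant to reduce.

```python
import os
from typing import Callable, Dict, Iterable, Tuple

# Hypothetical layout: pathname -> (inode signature, id-tag)
Db = Dict[str, Tuple[tuple, str]]


def _sig(path: str) -> tuple:
    st = os.stat(path)
    return (st.st_ino, st.st_size, st.st_mtime_ns)


def refresh(db: Db, paths: Iterable[str],
            read_tag: Callable[[str], str]) -> int:
    """Targeted update: validate only the given paths.

    Returns the number of files whose tag had to be re-read.
    """
    reread = 0
    for path in paths:
        try:
            sig = _sig(path)
        except FileNotFoundError:
            db.pop(path, None)  # file is gone, drop its entry
            continue
        cached = db.get(path)
        if cached is None or cached[0] != sig:
            db[path] = (sig, read_tag(path))
            reread += 1
    return reread


def refresh_tree(db: Db, root: str,
                 read_tag: Callable[[str], str]) -> int:
    """Whole-tree inventory: walk everything, but re-read only stale files."""
    on_disk = [os.path.join(dirpath, name)
               for dirpath, _dirs, files in os.walk(root)
               for name in files]
    for gone in set(db) - set(on_disk):
        del db[gone]
    return refresh(db, on_disk, read_tag)
```

If changeset application calls `refresh` on just the files it touched,
a later `refresh_tree` still stats every file but finds nothing stale
among them, so it never needs to fall back to `read_tag` for those
files.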

-Miles
-- 
((lambda (x) (list x x)) (lambda (x) (list x x)))



