Re: [Gnu-arch-users] [PATCH] arch speedups on big trees


From: Tom Lord
Subject: Re: [Gnu-arch-users] [PATCH] arch speedups on big trees
Date: Wed, 28 Jan 2004 08:42:00 -0800 (PST)


    > From: Chris Mason <address@hidden>

    > The main thing I'm waiting for at this point is details from Tom
    > on where he wants the design of the code to go.

Right.  So, you wrote up a description:


    > Basically the changes have 3 major components, and some tweaks.

    > 1) Maintain a reverse mapping of ids to the objects that own
    > them.  This lives in project_root/{arch}/++id-mapping, one file
    > per id.  The name of the file is the id in string form, and the
    > contents of the file are the path for the object owning it.  The
    > reverse mapping allows apply_changeset to inventory only the
    > files involved in the changeset being applied.

    > Most of the speed improvements come here, so I'm hoping I can
    > convince you it's a good idea.

I don't see how such a mapping can possibly work.  That is, the step
that says "maintain a reverse mapping [...]" sounds to me like "use
magic."

The format of the mapping, and whether it is forwards or backwards,
doesn't matter at all.   If you can maintain that mapping accurately,
that is the same thing as being able to maintain a complete inventory
(in any format).   You might as well just say that {arch}/++inventory
will always contain the current tree inventory.

How can you possibly maintain that inventory, though?   For example,
if I edit a file and add a tagline, the inventory is out of date.   If
I delete, rename, or create a file without using `tla add' and similar
commands, then the inventory is out of date.
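To make the objection concrete, here is a minimal sketch in Python (not
tla code).  It assumes a simplified id scheme in which a file's id is
whatever follows "arch-tag: " on some line; read_id, scan_inventory and
demo are hypothetical names used only for illustration.  It saves a
mapping, changes the tree without telling the tool, and shows that only
a fresh full scan notices:

```python
import os
import tempfile

TAG = "arch-tag: "  # simplified stand-in for a tagline id (assumption)


def read_id(path):
    """Return the tagline id found in a file, or None if it has none."""
    with open(path, encoding="utf-8") as f:
        for line in f:
            if TAG in line:
                # Keep only the id itself, dropping any comment closer.
                return line.split(TAG, 1)[1].split()[0]
    return None


def scan_inventory(root):
    """Full tree walk: map each id to the relative path that owns it."""
    mapping = {}
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            full = os.path.join(dirpath, name)
            file_id = read_id(full)
            if file_id is not None:
                mapping[file_id] = os.path.relpath(full, root)
    return mapping


def demo():
    """Return (cached, fresh) mappings around out-of-band tree changes."""
    with tempfile.TemporaryDirectory() as root:
        with open(os.path.join(root, "a.c"), "w", encoding="utf-8") as f:
            f.write("/* arch-tag: id-1 */\n")
        cached = scan_inventory(root)  # the saved mapping

        # Changes made behind the tool's back: a plain rename and a new
        # file that carries its own tagline.
        os.rename(os.path.join(root, "a.c"), os.path.join(root, "b.c"))
        with open(os.path.join(root, "new.c"), "w", encoding="utf-8") as f:
            f.write("/* arch-tag: id-2 */\n")

        fresh = scan_inventory(root)
        return cached, fresh
```

After demo() the cached mapping still points id-1 at a.c and has no
entry for id-2.  Nothing short of the second full walk reveals that,
which is the cost the saved mapping was supposed to avoid.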

There may be some _minor_ advantage to caching and maintaining an
inventory _internally_ in some circumstances (for example, to carry
over from one changeset application to the next when replaying several
in a row), but even that will be so hard to get right that I question
its utility.  (What happens if, for example, someone uses an extended
version of patch that can change the tree in ways tla doesn't know
about?)


    > 2) add --link and --replace for tla add-pristine.  Having a hard linked
    > pristine tree makes commits faster, since the commit updates the
    > pristine tree as the last step.  The replace option lets you update an
    > existing pristine tree to a higher patch level without having to
    > inventory it again, it can make a big difference during star-merge.

    > I seem to remember a post where you talked about pristine trees being
    > dead, in my mind they are basically a private library.  It might be a
    > good idea later on to generalize them as such.

Pristine trees already have an inventory cache and are reused
implicitly in some circumstances.   If the logic of those features
isn't working for some case, that should be fixed, but I don't think
that new options to add-pristine are the right way to do it.

Generalizing/replacing pristines to make them more literally a
tree-specific revision library strikes me as a much more practical
idea.   What do you think of this idea:

1) Permit a "special" element in library paths that means "the library
   in my current project tree".    Tla can create the library
   directory on demand (whenever it asks for the library path from
   within a given project tree).

2) By default, such libraries should be greedy and sparse.

3) It might be worth considering an option to make libraries "sliding"
   which means that new trees are formed from old by re-use rather
   than by linking.   This would be tricky to get right and not safe
   for concurrent use.   Perhaps it could be combined with a locking
   protocol for library revisions.
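
Point (1) could look roughly like the following Python sketch.  This is
not tla code and not a settled design: the "=tree-library" element, the
{arch}/++tree-library location and the resolve_library_path function
are all names invented here for illustration.

```python
import os

# Both of these names are hypothetical, chosen only for this sketch.
TREE_LIBRARY = "=tree-library"
TREE_LIBRARY_DIR = os.path.join("{arch}", "++tree-library")


def resolve_library_path(elements, tree_root):
    """Expand the special element into a library inside tree_root.

    Ordinary elements pass through untouched.  The tree-local library
    directory is created on demand, the first time the path is asked
    for from within the project tree.
    """
    resolved = []
    for element in elements:
        if element == TREE_LIBRARY:
            local = os.path.join(tree_root, TREE_LIBRARY_DIR)
            os.makedirs(local, exist_ok=True)
            resolved.append(local)
        else:
            resolved.append(element)
    return resolved
```

The same configured path then works from any project tree, and each
tree gets its own private library without a separate setup step.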


    > 3) Avoid inode signatures for everything except library revisions. 
    > Since taking an inode signature involves a whole tree inventory, we
    > should only take them when we know we're going to read them at least
    > twice before snapping them again.  Otherwise, the inode sig is a net
    > loss in speed

I use what-changed pretty frequently.   I _think_ I read my
non-library inode signatures more than twice, on average.
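
For readers who haven't looked at that code, the trade-off can be
sketched in Python as below.  It assumes a signature records roughly
(inode number, mtime, size) per file; the real on-disk format is not
shown here, and snap_signature and maybe_changed are hypothetical
names.

```python
import os


def snap_signature(root):
    """Walk the whole tree once, recording cheap stat data per file."""
    sig = {}
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            full = os.path.join(dirpath, name)
            st = os.lstat(full)
            rel = os.path.relpath(full, root)
            sig[rel] = (st.st_ino, st.st_mtime_ns, st.st_size)
    return sig


def maybe_changed(root, old_sig):
    """Return the paths whose stat data differs from the old signature.

    Only these paths need their contents compared; every other file is
    presumed unchanged without being read.
    """
    new_sig = snap_signature(root)
    paths = set(old_sig) | set(new_sig)
    return sorted(p for p in paths if old_sig.get(p) != new_sig.get(p))
```

Snapping costs one full walk; each later read saves content
comparisons on every file whose stat data still matches.  So the
break-even point depends on how often the signature is read between
snaps, which is the number in dispute above.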

-t



