[Top][All Lists]

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Gnu-arch-users] Re: [BUG] FEATURE PLANS: "perfect" summary deltas

From: Aaron Bentley
Subject: Re: [Gnu-arch-users] Re: [BUG] FEATURE PLANS: "perfect" summary deltas
Date: Sat, 19 Jun 2004 11:48:47 -0400
User-agent: Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.6b) Gecko/20031205 Thunderbird/0.4

Tom Lord wrote:

The general way to write the trace-gathering algorithm should
be something like this:

Here's some notes that I think improve it:

      forwards-starter = the revision i want to build;

      backwards-starter = the nearest namespace-descendent i have
                          of that;

      summary-starter = the corresponding summary revision;

paths-in-progress = the set of traces {forwards-starter, backwards-starter, summary-starter};

      winning-paths = the empty set;

for the changes below, make this:
        winning-path = no path

      while (paths-in-progress is not the empty set)
        remaining-paths-in-progress = the empty set;

        t = pick one trace from paths-in-progress;

for each way of continuing t, bringing it closer to the revision we want
            make t' which is that way of extending t

            if t' reaches the goal, add it to winning-paths
            otherwise, add it to remaining-paths-in-progress

Actually, I think we should have only one winning path, so

if t' reaches the goal, and is better than the winning path, replace the winning path with t'. Otherwise, discard t'.


        paths-in-progress = remaining-paths-in-progress;

And here, I think we need some more judgements:

paths-in-progress = remaning-paths-in-progress that are superior to winning-path

Otherwise, you tend to get runaway behaviour, e.g. where tla traces back to the initial import, despite having the direct descendant in a revlib.

          paths-in-progress = paths-in-progress that can continue

There's a certain amount of guesswork involved here: we guess that the namespace-previous is an ancestor and that the namespace-next is a descendant, but we don't know the truth until the archive scans tell us. So it's possible that some traces will never reach the goal.


      use the "best" of winning-paths;

So here, you just have one winning path, already determined to be the best.


Your code should be able to generalize to that easily and _that_,
unlike my earlier mistake, will provide what I promised greek0.

Yep, and what I just proposed would improve on the current backbuilder, even in the absense of summary deltas.

    > Here's an alternative deterministic approach:
    > Take both the build paths (from step 4. above).  For each of them:

I like the way mine generalizes better.  It's got that step of

        for each way of continuing [the path] t

You're working at a higher level of abstraction now. Practically anything could be slotted into "for each way of continuing [the path] t". This is polymorphic-style behavior.

I also thought at first that you meant each run through the loop could have more paths-in-progress than the last, since there could be multiple ways of continuting a path (ancestry's a tree, after all). Is that what you meant?

whereas your approach would handle _only_ summary deltas.

Yep. It was meant to be a concrete proof that using summary deltas wouldn't lead to complex, non-deterministic behaviour. Your algorithm should also have a backwards-summary-starter, otherwise (unlike mine) it won't support backbuilding.

Plus I still don't agree that adding a new free-form subdir for
whatever deltas people care to add is really all that useful an idea.

Just because the flexability's there doesn't mean it has to be used. If nothing else, the dirctory-of-deltas approach will require fewer server roundtrips. Just one directory listing is needed to determine what deltas are available. Your solution requires

- listing categories
- reading version variables in a patchlog
- verifying that the patchlog isn't out of date
- reading 1 patchlog for every backwards patch.

(btw, if we ever redesign the archive format, I'd like to have a single directory for each kind of file: one for continuations, one for imports, one for patches, etc. That would make version scanning a much faster operation.)

Another advantage is that a deltas directory is more human-readable.

Setting parameters on my-style of summary deltas can always get you
close to the best you could do with aribtrary deltas

Would there be settings to eliminate summaries with revision ordinals of less than eight? I thought we'd only be able to eliminate previous phases.

and with less
fuss and fewer unanswered questions.

Well, this is an evolution of the ideas we discussed a while back about smart servers. The DELTA command proposed back then would provide one or more changesets which could be applied to produce the revision.

The problem is, that assumes the server knows the best way for a particular client to produce a revision. I think the right way is to break it up into a LIST-DELTAS command, that lists deltas that the server is willing provide. (This may need to contain a token, so a smart server doesn't wind up overpromising and underdelivering.) Then, the DELTA command can be used to retrieve the individual changesets.

You can see how easy that would be to implement with a directory of deltas.

Arbitrary deltas are
flexability that nobody really needs.

Actually, deltas seem like cacherevs to me. Just as useful as an optimization technique, and perhaps even more volatile. Personally, I'd like to decide when to summarize based on the aggregate size of my un-summarized revisions. Or I might always provide a delta-from-cacherev-to-latest, which would be deleted when the next revision was committed.


reply via email to

[Prev in Thread] Current Thread [Next in Thread]