[Top][All Lists]

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Gnu-arch-users] Re: situations where cached revisions are not so go

From: Jason McCarty
Subject: Re: [Gnu-arch-users] Re: situations where cached revisions are not so good
Date: Sun, 28 Sep 2003 15:50:32 -0400
User-agent: Mutt/1.5.4i

Robert Collins wrote:
> Note that your algorithm is a cut down (read broken :])
> shortest-path-first finder. If you consider the problem a SPF case
> (which it is) then non transitive summaries would be no issue: the
> algorithm would continue probing the cheapest path out until a match is
> found or the cost exceeds that of another candidate. Neatly, this allows
> adding reverse patching in the future. Also, if the cost includes patch
> transmission time, this handles choosing between cached revisions,
> individual changesets and summaries.

As Tom pointed out, SPF is problematic in our case since we can't know
or even reliably estimate path costs in advance. If you want to find the
truly optimal solution and a really short one doesn't present itself
immediately, then you must traverse almost the entire ancestry searching
for summary-deltas, and comparing their sizes (note that delta size
isn't even a great cost metric for CPU time; it seems to be more
strongly dependent on total-tree-size and number of patches applied).

Furthermore, you're no longer looking for a single path. You may have
multiple pristines and revlib entries (from multiple versions, due to
continuations) on hand, and you want to find the shortest path to
_any_one_ of them, which just multiplies the amount of traversal you
need to do. You almost need to build a spanning tree to find the optimal

Now, if an index of summaries and sizes was stored for each version, I'd
agree that SPF would be the best solution, since all of the traversal
would be local instead of remote. But then you have to ask whether the
cost of downloading the index is worth it?

I believe that a near-optimal layout of summaries for SPF will also be
near-optimal for my algorithm, with the difference that mine won't have
to perform nearly so much remote traversal, and has no need for an index
to achive good results.

Reverse patching wouldn't really complicate SPF any, but it would also
only give a 2x speedup on average, so is it worthwhile vs. my simpler

> Oh, and lastly: SPF is a well known, many times coded, generic solution,
> as opposed to a custom roll our own solution to the problem. All we need
> is a traversal cost algorithm for a link, and that is needed regardless.

A generic solution which IMO is unsuited to this problem, at least
until the cost metric is very cheap to calculate, which it currently


reply via email to

[Prev in Thread] Current Thread [Next in Thread]