[Top][All Lists]

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

[Monotone-devel] [RFC] DAG-based revision refinement

From: Timothy Brownawell
Subject: [Monotone-devel] [RFC] DAG-based revision refinement
Date: Fri, 08 Sep 2006 18:36:01 -0500

There's been some minor thought given to having a refinement scheme for
revisions (for netsync) that makes use of the DAG structure to do better
than the merkle refinement we do now.

Merkle refinement is O(d*log n), d being the size of the difference
between the sets. It also has a problem where the subset of the revision
graph being synced needs to be agreed on beforehand.

What I'm suggesting below I think will be be O(log n) generally, and
significantly better for the common case where any one dev only touches
a few of the graph heads. Additionally, mismatched sync sets won't cause
huge retransmissions like they do now.

(Currently, pushing a nvm.newbranch can re-send all of nvm up to the
branch point, if a bad sync pattern (such as just the new branch) is
used. This should fix that.)

The change in O() won't matter that much, since revision traffic is also
linear in d and is much bigger. What will probably matter more is the
better handling of bad sync patterns.

Certs and keys will still need to use merkle refinement. However, this
should be delayed until *after* rev refinement is complete, so we know
exactly which revs we need to include certs/keys from. (specifically we
only have to ask about certs on shared revs, and we want to ask about
certs on *all* shared revs, even if they don't match the sync pattern)


matching a pair of DAGs, specifically a pair of monotone revision graphs

A "comparison" is: "I have rev XXX, do you also have it?"

After every comparison:
        * each side marks as shared all revs that it has that are
          ancestors (inclusive) of a known-shared rev
        * each side marks for send all revs that it has that are
          descendants (inclusive) of a known-unshared rev

1) compare heads
        Generally, an individual dev won't be active on all branches.
        So, the client will still have many heads that were fetched
        from the server. (maybe note that these are heads, so any
        descendants the other side has can be immediately marked for
2) compare roots (maybe)
        If it is assumed that each history has several heads, then any
        common roots would have likely been eliminated in (1). So check
        any remaining roots before going random, as these are likely to
        not be shared.
3) compare random revisions until all is known
        Or maybe try to split the revision graph into N+1 equal chunks,
        where N is the number of revs checked at once (checking several
        at once due to round-trip latency).

Should be about O(log n), but significantly better in the common case.

Only ask about revs that are in your sync set, but reply for revs you
have regardless of whether they're in the sync set.
        This would avoid the problem where, say, pushing a new feature
        branch with a glob that *doesn't* include the base branch will
        re-send large portions of the base branch.

Free (experimental) public monotone hosting:

reply via email to

[Prev in Thread] Current Thread [Next in Thread]