monotone-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

[Monotone-devel] partial pull #2 - gaps instead of a single horizon


From: Markus Schiltknecht
Subject: [Monotone-devel] partial pull #2 - gaps instead of a single horizon
Date: Fri, 25 May 2007 10:51:10 +0200
User-agent: Icedove 1.5.0.10 (X11/20070329)

...continuing my thoughts on partial pull.

I came to think about that single horizon. Why do we limit to only one horizon? That poses the problem, that on subsequent partial-pulls which move the horizon, monotone would have to throw away revision data it has already fetched. That seems wasteful. Can't we have multiple horizons, i.e. gaps of missing revisions?

Let's see: if we want gaps, we have to keep track of the beginning and ending of the gap. A sentinel, as in partial pull, already covers the beginning. Let's just add the ending information to it. To do the same as partial pull currently does, we'd just add the magic rev id [] as the ending.

If a client does multiple partial-pulls and finally decides to fetch the whole repository, that would save him from transferring some revisions. But more importantly, I can see use for gaps to solve other problems. Namely legal problems, where you want to stop serving certain revisions, but still want to have an intact repository. Another possibility gaps would open are 'hot' servers, which serve only the last few months worth of revisions, but serve those very quickly as they aren't burdened with years of history (i.e. resulting in much smaller merkle tries).

I've had a look at implementing those gaps. I think we could simply build rosters for complete gaps and process normally from there on. I would even add the necessary (parent, child) tuples in revision_ancestry for the gaps. So you could still build an ancestry tree. Just one which has revisions in it, which stand for a whole gap, i.e. multiple (missing) revisions. For mtn log or mtn annotate, you wouldn't have to care about bailing out as soon as you hit the horizon, but you'd just get a special revision_t, which stands for a whole set of missing revisions. If you ask me, the issue gets a tad simpler, as soon as you start to think of gaps instead of a (single) horizon.

Of course, gaps also don't come for free. The hardest problem probably is, that a gap can perfectly well have more than two ending revisions, i.e. ancestors. Our current roster and merge code doesn't support that. (Hm, I remember having raised that issue before...) We could either add support for more than two ancestors or work around the issue by letting monotone insert multiple cascaded sentinels, sort of a binary tree. I still prefer the first option, but the second is probably easier to do and doesn't pose problems with the mark merge algorithm.

While talking about merging: in the presence of gaps, merging is not always possible. Of course, you'd have to have the two revisions you want to merge - we need their rosters. Additionally, we need the common ancestor. As long as that does not lie within a gap, we can do the merge. Even if there are gaps in between, since we have a valid roster for the gap. (In case of a partial pull, where the gap ends at the magic revision id [], that could probably be the common ancestor. Of course we should prevent a merge in that case, too.)

We could easily implement partial pull using gaps, but to gain the other features, we would also have to adjust netsync to be able to serve partial repositories. AFAIK, partial pull currently doesn't plan on supporting that.

What gaps clearly wouldn't solve is, wanting to remove only parts of a revision - which admittedly is much more common than wanting to remove a whole revision.


What do you think? Does that make sense? Is it worth implementing gaps? Do you foresee other problems?


To be continued...

Markus





reply via email to

[Prev in Thread] Current Thread [Next in Thread]