From: Zack Weinberg
Subject: [Monotone-devel] Re: branch review for net.venge.monotone.multihead
Date: Fri, 7 Jul 2006 01:08:02 -0700

On 7/7/06, Nathaniel Smith <address@hidden> wrote:
+-- Check in ce first so that the old dumb "in whatever order
+-- get_branch_heads returns" algorithm will do it wrong.

^^ The get_branch_heads returns a std::set<revision_id>, which is
always sorted lexicographically (because std::set is a sorted
tree-based structure, and revision_id::operator< sorts
lexicographically).  So the order in which things are committed
doesn't actually affect anything.

I stand corrected.  I was under the impression that set<> was unsorted
but preserved insertion order, and that in this case we therefore got
chronological order.
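
(For the archives, a minimal sketch of that point, with plain strings
standing in for revision_id, whose operator< compares the hex encoding:
iteration order over a std::set depends only on operator<, never on
insertion order.)

  #include <iostream>
  #include <set>
  #include <string>

  int main()
  {
    std::set<std::string> heads;

    // Insert in "commit order"...
    heads.insert("c3f1");
    heads.insert("0a9e");
    heads.insert("b742");

    // ...but iteration comes out lexicographically sorted:
    // 0a9e, b742, c3f1.
    for (std::set<std::string>::const_iterator i = heads.begin();
         i != heads.end(); ++i)
      std::cout << *i << "\n";
    return 0;
  }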

          rid_map_iter target
            = heads_for_ancestor.insert(std::make_pair(ancestor,
                                                       set<revision_id>()))
              .first;

          I(target->first == ancestor);

          target->second.insert(*i);
          target->second.insert(*j);
^^ this code seems like it would be clearer as [...]

I think you missed the point of having a set<> on the RHS.  It's
there precisely to deal with the case you called out at the end, where
a single ancestor is shared by more than two heads.  I'm not sure I
coded this bit right -- what I was trying to write was something like
this:

if ancestor not in heads_for_ancestor:
    heads_for_ancestor[ancestor] = {}
heads_for_ancestor[ancestor][i] = 1
heads_for_ancestor[ancestor][j] = 1

Does that help?
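
In C++ that intent could be written as a sketch like the following
(note_merge_candidate is just a made-up name for illustration, and
revision_id is faked with std::string so the fragment stands alone):

  #include <map>
  #include <set>
  #include <string>

  typedef std::string revision_id;   // stand-in for monotone's revision_id
  typedef std::map<revision_id, std::set<revision_id> > ancestor_map;

  void note_merge_candidate(ancestor_map & heads_for_ancestor,
                            revision_id const & ancestor,
                            revision_id const & left,
                            revision_id const & right)
  {
    // operator[] default-constructs an empty set the first time a given
    // ancestor is seen; later pairs sharing that ancestor accumulate
    // into the same set instead of being dropped.
    std::set<revision_id> & heads = heads_for_ancestor[ancestor];
    heads.insert(left);
    heads.insert(right);
  }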

[...]
This overall structure of passes, each containing arbitrarily many
merges, seems unnecessarily baroque, and adds additional code paths
that may contain bugs.  (It is not at all obvious, for instance, what
happens when there are multiple candidate merge pairs that share the
same merge ancestor.)

It's _supposed_ to fall back to the old behavior in that situation...
The inner loop there is in fact the old loop, word for word, and so it
_should_ do what it did before, just on a restricted set of merge
candidates.  Thus, for instance:

 Or what about:

    A
   / \
  B   C
 / \ / \
D   E   F

Here it looks like it will merge (D, E) and (E, F) in one pass, which
is a bit silly; merging (D, E) and then the result of that with F, or
vice versa, would be better.

I'm 99% sure it does merge (D,E) and then merge(m(D,E), F), or vice versa.

I think it would be better to just go through the loop, each time
picking _some_ pair of heads that is as good as any to merge next,
merging them, and then going through again from the top.  Simpler to
reason about, and easier to describe to the user what's going on as
well (the second number in the current "merge %d / %d" message is
quite meaningless to a user).

I'm a little worried about the cost of recalculating all the tables
and stuff on each iteration, though arguably if it's a problem, you
have bigger problems.

Also, this makes it easy to skip the whole machinery when there are
only two heads, which is probably a nice speed win in the common case.

Good thought.
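
Roughly the loop shape I read you as proposing, I think (sketch only:
pick_next_merge_pair and merge_two_revisions are made-up names, and
get_branch_heads is shown with a simplified signature):

  #include <set>
  #include <string>
  #include <utility>

  typedef std::string revision_id;   // stand-in for monotone's revision_id

  // Hypothetical helpers with simplified signatures -- not the real
  // interfaces.
  std::set<revision_id> get_branch_heads(std::string const & branch);
  std::pair<revision_id, revision_id>
  pick_next_merge_pair(std::set<revision_id> const & heads);
  void merge_two_revisions(revision_id const & left,
                           revision_id const & right);

  void merge_branch(std::string const & branch)
  {
    for (;;)
      {
        std::set<revision_id> heads = get_branch_heads(branch);
        if (heads.size() < 2)
          break;                      // nothing left to merge

        if (heads.size() == 2)
          {
            // Common case: exactly two heads, skip the whole machinery.
            std::set<revision_id>::const_iterator i = heads.begin();
            revision_id left = *i;
            ++i;
            merge_two_revisions(left, *i);
            continue;
          }

        // Pick some pair that is as good as any, merge it, and recompute
        // everything from scratch on the next trip around the loop.
        std::pair<revision_id, revision_id> p = pick_next_merge_pair(heads);
        merge_two_revisions(p.first, p.second);
      }
  }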

In the current code, it's also very unclear what happens in the case:
   A
  /|\
 B C D
In fact, tracing through, it looks like what happens is that each pair
(B, C), (B, D), (C, D) is inserted into the heads_for_ancestor map,
but because it is a map and not a multimap, whichever pair is seen
first gets entered, and the rest get silently dropped on the floor
(because they are all entered under the same key).

I explained this above...
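
Concretely, for the B/C/D case: the pairs (B,C), (B,D), (C,D) all share
ancestor A, so they all land in the one set keyed by A rather than being
dropped.  Quick check (strings standing in for revision ids):

  #include <iostream>
  #include <map>
  #include <set>
  #include <string>

  int main()
  {
    std::map<std::string, std::set<std::string> > heads_for_ancestor;

    // All three candidate pairs have the same ancestor, A.
    const char * pairs[3][2] = { {"B", "C"}, {"B", "D"}, {"C", "D"} };
    for (int k = 0; k < 3; ++k)
      {
        std::set<std::string> & heads = heads_for_ancestor["A"];
        heads.insert(pairs[k][0]);
        heads.insert(pairs[k][1]);
      }

    // One key, one merged set: A -> { B, C, D }.
    std::cout << heads_for_ancestor["A"].size() << "\n";   // prints 3
    return 0;
  }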

Tangentially, did you see the IRC backlog of my discussion with Graydon
over the weekend about the possibility of trying all possible merge
orders?  What do you think of that idea?

zw



