[Monotone-devel] Deterministic *-merge

monotone-devel
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Monotone-devel] Deterministic *-merge

From:	Nathaniel J. Smith
Subject:	[Monotone-devel] Deterministic *-merge
Date:	Fri, 12 Jan 2007 03:00:59 -0800
User-agent:	Mutt/1.2.5.1i
So, I've been thinking -- always dangerous -- about merging again,
originally motivated by the discussion about the operational
transformation properties, and in particular the discovery that
*-merge is _not_ associative.  The example is:

  a*  b*
  |\ / \
  | X   \
  |/ \   \
  c*  a   b
            
Here, if we do merge(merge(c, a), b), we get a clean merge:

  a*  b*
  |\ / \
  | X   \
  |/ \   \
  c*  a   b
   \ /   /
    c   /
     \ /
      c

But if we do merge(c, merge(a, b)) we get a conflict!:

  a*  b*
  |\ / \
  | X   \
  |/ \   \
  c*  a   b
       \ / 
     conflict!

In fact, it isn't even very easy to state the associativity property
for merge algorithms, because it just isn't clear what to say about
conflicts.  In general, conflicts make merging (especially repeated
merging) annoying to analyze, because it's always a special case -- at
each step, you may either get a value out, or you may invoke the
special go-talk-to-the-user logic.  This makes sense given the actual
use of merge algorithms, as operations run one at a time,
interactively by users -- so pretty much nothing in this note is
actually relevant to practical deployment of mergers.  But it turns
out that one can tweak the formalism a bit to get a purely
deterministic *-merge, one that always returns some kind of value --
and the resulting object is dramatically simpler to analyze, without
any loss in correctness.  (The new ideas in this analysis probably
apply more generally, which someone out there may find helpful, as
e.g. conflicts are what cause all the problems with patch theory.)

The result is a fully commutative and associative (and idempotent)
version of multi-*-merge.  These properties are interesting because if
you have them, then suddenly it is meaningful to talk about the merge
of an arbitrary collection of nodes, because all merge orders give the
same result.  This may be particularly interesting if you are crazy
enough to try to create algorithms that call a merger automatically as
a subroutine, where no user is available.  (I have gradually been
driven that crazy -- monotone-devel readers should see some followup
about this soon, hopefully.)

So to summarize, this note:
  -- describes a new formalism for dealing with conflicts
  -- uses this to derive a version of multi-*-merge that is
     analytically slick
  -- and shows that this version is fully associative.

Enough context and background.  Where am I going with this?

Deterministic merging
=====================

Okay, so the basic goal for now is to define a version of
multi-*-merge that always gives some value out.

There are two key ideas we need -- one elementary, and one, perhaps,
non-obvious.

Key idea 1: We need to be able to talk about conflicts _somehow_, but
  we're requiring that our merger always return a value... so there's
  not much we can do except make up a new _first-class value_ that
  represents "this merge was a conflict".  Basically, add one new
  entity to the set of legal scalar values, distinct from all other
  values.  I'll write it #.  So for example, here's a history graph in
  which a merge resulted in a conflict:

      a*
     / \
    b*  c*
     \ /
      #

Key idea 2: Despite what I just said, we should not think of conflicts
  as "just another value, distinct from all other values".  We should
  think of them like a wildcard -- "there is a value here, but I don't
  know what it is".  In particular, a conflict is a value that is
  somehow more general than any of the particular candidate values
  that conflicted.

  I'm probably not explaining this well.  Hopefully it'll become clear
  exactly what this means later on.

Also, let's make a simplifying assumption.  We've so far always
designed mergers with the assumption that they have to take in any
crazy arbitrary history graph that can be written down, and do
something with it.  But for now let's ignore all the excellent reasons
why we've always done that, and let ourselves book tickets to magical
mathematics land, where we can make pretty simplifying assumptions and
no-one will complain.  In this case, the assumption I want to make is
that our merger is only required to handle histories in which all
_previous_ mergers were done using the exact same merger.  If a node
has two parents x and y, then the value of that node is merge(x, y).
No exceptions.  Let's call a history DAG with this property an
"admissable" graph, and I'm only going to define deterministic *-merge
on admissable graphs.

Deterministic *-merge
=====================

All, right, here's the algorithm.

Marking
-------

Just like in the original *-merge writeups, the first thing to do is
to write up how we mark the history graph.  The rules get a lot
simpler:

   a*       graph roots are always marked

   a   a
   |   |    single-parent nodes are marked iff the value changes.
   a   b*


   x   y
    \ /     merge nodes are never marked
     z

Yes, that really says "merge nodes are never marked".  There are two
ways to justify this.

One justification for this comes from the *-merge user model.  *'s,
according to the user model's interpretation, are to be assigned
exactly when a user takes some action.  Deterministic *-merge is,
well, deterministic, which users are not.  (I could start a digression
about free will here, but I think most hackers will be perfectly
comfortable conceptualizing users as entropy sources, so let's move
on.)  Because we only have to worry about admissable graphs, we know
that the value of every merge node was determined by the merger, so,
there you go.

The other way to think about this is based around "key idea 2".
There are basically two cases in traditional algorithmic *-merge where
a merge node gets marked.  One is where the user did something totally
random with the merge, like they overrode the algorithm entirely so
you have:
   a   b
    \ /
     c*
That can't happen in an admissable graph, so never mind that.  The
_interesting_ case is that any time the merge generated a conflict,
the result of that merge gets marked.  With traditional *-merge, the
conflict has already been resolved before you see the graph, so this
looks like:
   a*  b*
    \ /
     b*
But with deterministic *-merge you just get the conflict:
   a*  b*
    \ /
     #
Okay, so here's the punch line: # is like a wildcard.  The reason the
# does not get marked is that this is an _accidental clean merge_ --
a and b are both included by #, i.e., both sides have a #, so it's an
accidental clean merge.  And accidental clean merges are not marked.
So there.

Merging
-------

Once we have a marked graph, we can perform the actual merge.  How
should we do this?  Well, we're in magical mathematics land, let's
define it in the way that makes all the math as easy as possible.
Now, if you look back at the original unique-*-merge writeup, the very
first lemma was: for all nodes R, value(*(R)) = value(R).  Then all
the proofs were built on this.  And then if you look at the
multi-*-merge writeup, the first lemma was the obvious generalization:
for all nodes R, for all elements S of *(R), value(S) = value(R).  And
then all the original proofs basically went through unchanged once we
had the new lemma.  So that suggests this is a nice, foundational
property that makes our life easy.  Let's use it to define
deterministic *-merge -- but with a wrinkle.

This is where "key idea 2" comes in again.  Let's define an
equivalence relation ~, as:
   for all x and y that are not equal to #, x ~ y iff x = y.
   for all x, # ~ x is always true.
Or in words: every normal value is "similar" to itself, plus, # is
similar to _everything_.

Now the appropriate statement of the lemma is:
  for all nodes R, for all elements S of *(R), we should have:
     value(S) ~ value(R)

This is already obviously true for non-merge nodes.  We want it to be
true for merge nodes too.  The easy way is to simply use it as the
definition of merge(A, B).  If every node in *(A) union *(B) has the
same value, then cool -- we can just make merge(A, B) that value.  If
there are different values, then we have only one option -- we must
define merge(A, B) to be #, because # is the only thing that is
similar to multiple different values.

[For the even more math inclined, we can generalize this to say we are
actually going to extend our space of possible values to contain a
whole lattice (with unique maximal element) of conflict types.  Just
adding # is the minimal extension, and gives us a lattice like:

       #
      /|\
     a b c

But one could imagine adding multiple conflict values, like, say:

     a,b,c
    /  |  \
 a,b  a,c  b,c    
   |\ / \ /|
   | X   X |
   |/ \ / \|
   a   b   c

which lattice is perfectly familiar to set theoreticians of all
stripes, it's just the standard partial ordering of sets by inclusion.
Such a lattice might be interesting because instead of just having a
generic "something went wrong" value, conflict values would actually
record all the actual candidates that were under consideration.  I
can't think of what that'd be useful for, but it seems like it might
potentially come in handy in some circumstances or another.  Anyway,
once we have all this, we can just write:

   merge(A, B) := sup(value(*(A) union *(B)))

i.e., the value of the merge is just the least upper bound of all the
mark values.

Of course, if we are doing binary merging, then the rich conflict
extension is exactly equivalent to the trivial one, and that happens
to be the case I think I'm most interested in, but hey, maybe someone
will find this useful sometime.]

Relation to traditional/algorithmic/nondeterministic *-merge
------------------------------------------------------------

Okay, so we have a merge algorithm that only works on special
carefully constructed graphs.  That lets us radically simplify the
marking and merging rules, cool.  But what relation does that have to
the real world?

The interesting thing is, it is totally trivial to map traditional
*-merge in all it's three-times-as-long-to-write-the-formulas glory
directly into deterministic *-merge.  Here is an alternative
definition of traditional *-merge, over arbitrary graphs:
  1) Take the arbitrary graph, and at each 2-parent node, insert a
     dummy node.  Set that node equal to the deterministic merge of
     the parents:
       A   B           A   B
        \ /    ===>     \ /
         C           merge(A, B)
                         |
                         C
     Obviously, this turns the arbitrary graph into an admissable
     graph.
  2) run deterministic *-merge on this graph.

That's all.  If you look carefully, you'll see that all the complex
marking rules in the original *-merge stuff just fall out
automatically from this -- a node in the original graph would have
been marked by classic *-merge iff the corresponding node in the
admissable graph gets marked by deterministic *-merge.

Well, I think that's kind of cool, anyway.  (Nerd.)

Example 1
=========

Way past time for examples.  Let's work through that original example
of non-associativity.

We first have to make the input graph admissable, by inserting a dummy
merge node.  What value should this node have?  Well, we know that its
mark set will be {a1, b1}, and we need
   merge(a1, b1) ~ a
   merge(a1, b1) ~ b
and the only value that is similar to both a and b is #.  So the value
is #:

  a1* b1*          a1* b1*
  |\ / \           |\ / \
  | X   \   ====>  | X   \
  |/ \   \         |/ \   \
  c*  a2  b2       #   |   |
                   |   |   |
                   c*  a2  b2

Then there are the two ways this can be merged:

  a1* b1*      a1* b1*
  |\ / \       |\ / \
  | X   \      | X   \
  |/ \   \     |/ \   \
  #   |   |    #   |   |
  |   |   |    |   |   |
  c*  a2  b2   c*  a2  b2
   \ /   /      \   \ /
    c   /        \   #
     \ /          \ /
      c            c

The left diagram is just the same as in the original -- if we merge in
that order, there are no conflicts, the c just wins everything,
because the mark set at each point is just {c} and we don't need #.
The right diagram is different, though -- now, when we merge in the
alternative order, there is again a conflict between the a and the b.
But, if we ignore that and continue merging anyway, we find that that
_the conflict itself loses_ to the c -- its mark set is {a1, b1} and
the c has already beaten both of those -- and overall we end up with a
clean merge to c.  So we see that this merge is now associative after
all.


Math
====

I'm not going to bother proving all the basic stuff again -- unlike
the old *-merge definitions, we actually don't need any proofs to show
that this algorithm is well-defined, since we no longer have a bunch
of possible cases only some of which are allowed.  And it's obvious
that it doesn't do anything too haywire semantically, since it always
does the same thing as traditional *-merge (just via different means).

What's interesting to prove are those properties I mentioned at the
beginning: commutivity, associativity, idempotency.

Fortunately, the proofs are all like 1 line long.

Theorem (Commutivity, AKA TP1): merge(A, B) = merge(B, A)
Proof: this reduces to the claim that
    sup(*(A) union *(B)) = sup(*(B) union *(A))
  which is obvious since union is commutative.

Theorem (Associativity, AKA TP2):
  merge(A, merge(B, C)) = merge(merge(A, B), C)
Proof:
  The two sides of the equation correspond to the two graphs:

    A   B   C     A   B   C  
     \ /   /       \   \ /   
      D   /         \   F
       \ /           \ /     
        E             G

  And we want to show that value(E) = value(G).  But the value of E
  and G depends only on *(E) and *(G), and *() depends only on the
  ancestor set of its argument.  The only difference between the
  ancestor sets of E and G is that E has {D, E} included and G has {F,
  G} included.  But D, E, F, G are not marked (because -- this is the
  main step of the proof -- merges are never marked), so they don't
  matter anyway.  So *(E) = *(G), this is actually trivial.

Theorem (Idempotency): If A is an ancestor of B, then
  merge(A, B) = value(B).
Proof:
    ancestors(merge(A, B)) = {merge(A, B)} union ancestors(B) union ancestors(A)
  (by definition), but ancestors(A) is a subset of ancestors(B),
  by assumption.  So this is just {merge(A, B)} union ancestors(B),
  and merge(A, B) isn't marked, so *(merge(A, B)) = *(B), this is
  trivial too.

Tada.

Example 2 (super bonus edition)
===============================

A more wacky example is:

      a     
     / \    
    b*  b*  
   / \ / \  
  c*  b   c*

Here we have two people who independently set the value to b, which
then makes an accidental clean merge.  Then two other people come
along and independently overwrite the b's with c's.  Because *-merge
makes the decision to be conservative about implicit convergence, if
either of the c's is merged with the b, it generates a conflict --
because someone made a decision to create the b in two different
contexts, but the c came from someone overruling that decision in only
one context:

      a             a       
     / \           / \      
    b*  b*        b*  b*    
   / \ / \       / \ / \    
  c*  b   c*    c*  b   c*  
   \ /               \ /
    #                 #    

In fact, there is a somewhat convoluted way to merge this that happens
to generate no conflicts...:

      a     
     / \    
    b*  b*  
   / \ / \  
  c*  b   c*
   \   \ /
    \   X
     \ / |
      c  |
       \ |
        \|
         c

But, magically, with deterministic *-merge, all orders work the same
-- it even turns out to be possible to merge two conflicts and get out
a non-conflict (!):

      a             a             a
     / \           / \           / \
    b*  b*        b*  b*        b*  b*
   / \ / \       / \ / \       / \ / \
  c*  b   c*    c*  b   c*    c*  b   c*
   \ /   /       \   \ /       \ / \ /
    #   /         \   #         #   #
     \ /           \ /           \ /
      c             c             c

Is that lower-right diagram not a thing of beauty?

Cheers,
-- Nathaniel
[Prev in Thread]
Current Thread
[Next in Thread]
[Monotone-devel] Deterministic *-merge, Nathaniel J. Smith <=
- Re: [Monotone-devel] Deterministic *-merge, Oren Ben-Kiki, 2007/01/12
  - [Monotone-devel] Re: Deterministic *-merge, Timothy Brownawell, 2007/01/12
    - [Monotone-devel] Re: Deterministic *-merge, Oren Ben-Kiki, 2007/01/12
    - [Monotone-devel] Re: Deterministic *-merge, Timothy Brownawell, 2007/01/12
    - [Monotone-devel] Re: Deterministic *-merge, Oren Ben-Kiki, 2007/01/12
    - Re: [Monotone-devel] Re: Deterministic *-merge, Nathaniel J. Smith, 2007/01/12
    - Re: [Monotone-devel] Re: Deterministic *-merge, Oren Ben-Kiki, 2007/01/13
    - [Monotone-devel] Re: Deterministic *-merge, Lapo Luchini, 2007/01/23
- Re: [Monotone-devel] Deterministic *-merge, Justin Patrin, 2007/01/12
  - Re: [Monotone-devel] Deterministic *-merge, Justin Patrin, 2007/01/12
Prev by Date: [Monotone-devel] Re: Summit update -- 1 month to go!
Next by Date: [Monotone-devel] [PATCH] mtn automate diff
Previous by thread: [Monotone-devel] ViewMTN suggestions and bug reports
Next by thread: Re: [Monotone-devel] Deterministic *-merge
Index(es):
- Date
- Thread