monotone-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Monotone-devel] Mark merge for existence


From: Markus Schiltknecht
Subject: Re: [Monotone-devel] Mark merge for existence
Date: Thu, 15 Nov 2007 14:17:57 +0100
User-agent: Icedove 1.5.0.14pre (X11/20071018)

Hi,

William Uther wrote:
(which does not track merges). And try to get merging right for it.

Um, if it does not track merges, how do you "get merging right"?

Sorry, I was unclear here. There are two different cases: one is the sync vs async copy. The other problem you are trying to solve as well is keeping track of deleted and undeleted files, and merging that correctly.

Under async copy, I understand copying of a file, where the copy remains at the copied state and is independent of the origin from that point on. That's what I meant with 'not tracking merges'. A sync copy, which would track merges would get all the changes to the origin merged in as well, including getting deleted when the origin gets deleted. Thus, a sync copy would certainly not help the die die die merge problem.

But for undelete, or async copy, we'd need to be able to merge revisions which resurrected the same file at different revisions, for example. With async copy, I'm imaging that an undelete would look like that:

REVISINO(node id -> filename)
"changelog":

                 A (1 -> fileA)
                 "initial import"
                  /            \
                 /              \
          B (1 -> fileA,      D(1 -> fileA,
             2 -> fileB)        3 -> fileA.resurrected)
          "adding fileB"      "copy of fileA for undelete"
               |                     |
               |                     |
          C (2 -> fileB)             |
          "deleted fileA"            |
                    \               /
                     \             /
                      \           /
                     E (2 -> fileB,
                        3 -> fileA.resurrected)
                     "clean merge"
                           |
                           |
                      F (2 -> fileB,
                         3 -> fileA)
                      "rename to original filename"


In this example, fileA got deleted in revision C. To resurrect the file from revision A, monotone would have to add three revisions, which might be a disadvantage. OTOH, revisions aren't expensive.

As you see, we could use the existing merge algorithm. If another user would have simultaneously undeleted fileA from revision B (instead of rev A as happened already above), resulting in these additional revisions:

   B  ->   G (1 -> fileA,
              2 -> fileB,
              4 -> fileA.resurrected)
           "copy of fileA for undelete"
                    |
                    |
   C  ->   H (2 -> fileB,
              4 -> fileA.resurrected)
           "clean merge"
                    |
                    |
           I (2 -> fileB,
              4 -> fileA)
           "rename to original filename"


We'd end up with revisions F and I, which could not be merged because rev F wants node id 3 to mean 'fileA', but rev I wants node id 4 to mean 'fileA'. All of the existing merge machinery could handle this already.

I can see how not tracking merges at all might work: simply introduce a new node and a simple pointer so that when you track history the logs continue from the new node to the old one (I assume this is what your diagram was showing).

Yup, logs would have to continue. I.e. in the above example, a log on fileA, starting from rev F, would include these revisions:

  F, E, D (the copy, switching node id), A

Note that it would not include revisions B, nor does the fileA include its changes. That's why it's an async copy.

I have no objections to that at all. (And you could add that as well as what I'm suggesting.)

However, if you try to modify it to get merging working then I think you travel down the path to Operational Transformation (OT) style merging (See also DARCS). This could work well, and I think would make for interesting text merge capabilities too, but is a huge amount of work to add it and get it right.

Uh.. I'll have to read on that, thanks for the pointer. However, I don't (yet) see why I'd need another merge algorithm at all.

Can you expand on you concept of 'existince marks'?

Nothing complex. I just want to use mark-merge for existence. When a user explicitly adds or un-deletes a file, it is marked because there is a user driven existence change. When a user deletes a file, it is marked because there is a user driven existence change. When two revs are merged, for each node you use mark merge to decide the value of the existence bool. If true then the node exists in the merged revision. If false then the node is deleted in the merged revision. If conflicted then you ask the user to resolve things (delete or un-delete on either side as appropriate) and merge again.

At the moment this is a little tricky because we only keep marks for nodes which currently exist. This is good for storage size - it doesn't grow without bound. This is bad for mark-merge - you need those marks to do the merge.

I think I know how we can efficiently reconstruct those marks during the merge. If the node exists on both sides then there isn't a problem - the marks exist on both sides. If the node doesn't exist on either side then there isn't a problem - the node wont exist in the children and you don't need the marks. If the node only exists on one side then you need to reconstruct the marks on the side that doesn't have them. I think I know how to do that relatively efficiently - find the latest cut (in the graph-theoretic sense) in the common ancestors where the marks exist in each revision that makes up the cut, and reconstruct the marks from there. (And I've found a wonderful method to find that cut efficiently, but this margin is too small to contain it.)

Hm.. to me that sounds like circumventing rosters, which doesn't sound too good. But hey, you are saying it's efficient, so please go ahead!

At this stage I just want to modify mtn so that it is keeping existence marks when the node exists. They will end up being trivial marks at the moment - the revision where the node was born will be the only marked revision for existence. But I can add things incrementally from there...

Understood.

Regards

Markus




reply via email to

[Prev in Thread] Current Thread [Next in Thread]