monotone-devel

Re: [Monotone-devel] cvs import


From: Markus Schiltknecht
Subject: Re: [Monotone-devel] cvs import
Date: Thu, 14 Sep 2006 12:28:49 +0200
User-agent: Thunderbird 1.5.0.5 (X11/20060812)

Hi,

Michael Haggerty wrote:
The main problem with converting CVS repositories is its unreliable
timestamps.  Sometimes they are off by a few minutes; that would be no
problem for your algorithm.  But they might be off by hours (maybe a
timezone was set incorrectly), and it is not unusual to have a server
with a bad battery that resets its time to Jan 1 1970 after each reboot
for a while before somebody notices it.  Timestamps that are too far in
the future are probably rarer, but also occur.  CVS timestamps are
simply not to be trusted.

The best hope for correcting timestamp problems is pooling information
across files.  For example, you might have the following case:

  1   2
  |   |
  A   Z
  |
  B
  :
  Y
  |
  Z

where A..Y have correct timestamps but Z has an incorrect timestamp far
in the past.  It is clear from the dependency graph that Z was committed
after Y, and by implication revision Z of file 2 was committed at the
same time.  But your algorithm would grab revision Z of file 2 first,
even before revision A of file 1.
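To make the failure concrete, here is a minimal sketch that assumes a purely timestamp-ordered import. The file names, labels, and timestamps are made up for illustration, and the plain sort is only a stand-in for the algorithm under discussion, not its actual implementation.

```python
# Hypothetical file commits as (file, revision_label, timestamp).
# A..Y carry sane timestamps; Z was recorded after a clock reset.
commits = [
    ("file1", "A", 1000),
    ("file1", "B", 2000),
    ("file1", "Y", 3000),
    ("file1", "Z", 0),
    ("file2", "Z", 0),
]

# Ordering by timestamp alone ignores the per-file dependency chain.
by_time = [label for _, label, _ in sorted(commits, key=lambda c: c[2])]
print(by_time)
```

Z sorts ahead of A here, although file 1's own history says Z came after Y.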

But you could use another method to determine what to commit first: one that takes only the dependency graph into account.

The simplest variant would be:

1. randomly choose a commit (or take the one with the lowest timestamp
   for a mostly good starter)

2. collect the other files' commits which seem to belong to the same
   revision (for me, a revision is a set of files, as in monotone. I
   don't know what terms you use here, probably we should define a
   set of terms to discuss such issues and avoid confusion.)

3. check if any of those file commits conflict in the dependency graph.
   I.e. in your example above, file 1 would also find a commit Z, but
   it conflicts with A, B, ... and Y.

   If there are conflicts, take the first one in your graph (A) and
   repeat from step 2 with that commit. Otherwise continue.

4. You now have the 'next' revision to commit (next in the dependency
   graph sense).
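The four steps above could be sketched roughly as follows. This is only an illustration under assumptions: the names `next_revision` and `import_order` are hypothetical, each file commit is reduced to a `(key, timestamp)` pair, and the key stands in for whatever makes commits "seem to belong to the same revision" (for example author plus log message). The cycle check is an addition of the sketch; without it the restart in step 3 could loop forever on contradictory per-file orderings.

```python
from typing import Dict, List, Tuple

# A file-level commit: (revision_key, timestamp). The key is an assumed
# stand-in for "seems to belong to the same revision".
FileCommit = Tuple[str, int]


def next_revision(pending: Dict[str, List[FileCommit]]) -> Tuple[str, List[str]]:
    """Return (key, files) for the next revision in dependency order.

    `pending` maps each file to its not-yet-imported commits, oldest
    first according to that file's own history.
    """
    # Step 1: start from the commit with the lowest timestamp.
    candidates = [(ts, key) for commits in pending.values() for key, ts in commits]
    if not candidates:
        raise ValueError("nothing left to import")
    key = min(candidates)[1]
    seen = set()
    while True:
        if key in seen:
            raise ValueError(f"dependency cycle involving {key!r}")
        seen.add(key)
        # Step 2: in every file, find the first commit carrying this key.
        members = {}
        for path, commits in pending.items():
            for i, (k, _) in enumerate(commits):
                if k == key:
                    members[path] = i
                    break
        # Step 3: a member that is not at the head of its file conflicts
        # with the earlier commits; restart from the first of those.
        blocked = sorted(path for path, i in members.items() if i > 0)
        if not blocked:
            # Step 4: every member is a head, so this is the next revision.
            return key, sorted(members)
        key = pending[blocked[0]][0][0]


def import_order(history: Dict[str, List[FileCommit]]) -> List[str]:
    """Repeat next_revision until every file commit has been consumed."""
    pending = {path: list(commits) for path, commits in history.items()}
    order = []
    while any(pending.values()):
        key, paths = next_revision(pending)
        for path in paths:
            pending[path].pop(0)  # all members sit at index 0 here
        order.append(key)
    return order


# The example from above: Z carries a bogus timestamp far in the past.
history = {
    "file1": [("A", 1000), ("B", 2000), ("Y", 3000), ("Z", 0)],
    "file2": [("Z", 0)],
}
print(import_order(history))
```

The timestamp only picks the starting point; the order itself comes from the per-file chains, so Z is emitted after Y despite its timestamp.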


With such an algorithm, you won't rely on the timestamps, but only on the dependencies. Thus, what other advantages would the blob method have?

Tags and branches do not have any timestamps at all in CVS.  (You can
sometimes put bounds on the timestamps: a branch must have been created
after the version from which it sprouts, and before the first commit on
the branch (if there ever was a commit on the branch).)  And it is not
possible to distinguish whether two branches/tags sprouted from the same
revision of a file or whether one sprouted from the other.  So a
date-based method has to work hard to get tags and branches correct.

But in the above way, none of it would be timestamp based. You could, as you do in your blob method, insert tag and branch 'events', which would be dependent on a commit event of a certain file. You would then not get a 'revision' in step 4 above, but a branch or tag.
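As a rough illustration of treating tags and branches as events in the same graph, the sketch below uses Python's standard `graphlib.TopologicalSorter` (Python 3.9 or later). The event names are invented for the example, and a plain topological sort is a stand-in here, not the blob method itself.

```python
from graphlib import TopologicalSorter

# Each event maps to the set of events it depends on. No timestamps are
# involved: a tag or branch event simply depends on the file commit it
# was created from. All names are hypothetical.
depends_on = {
    "commit:B": {"commit:A"},
    "tag:REL_1": {"commit:B"},
    "branch:stable": {"commit:B"},
    "commit:C": {"commit:B"},
    "commit:stable-1": {"branch:stable"},
}

order = list(TopologicalSorter(depends_on).static_order())
print(order)
```

Any order this produces places the tag and the branch after commit B, and the first commit on the branch after the branch event, which matches the bounds described in the quoted paragraph.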

(Don't get me wrong, I think the blob method is better, because I suspect importing a CVS repository can't be that simple. But I lack proof of that.)

Regards

Markus



