Re: [lmi] Converting a proprietary svn repository to git

From: Vadim Zeitlin
Subject: Re: [lmi] Converting a proprietary svn repository to git
Date: Sat, 27 Feb 2016 16:23:10 +0100

On Sat, 27 Feb 2016 02:11:51 +0000 Greg Chicares <address@hidden> wrote:

GC> Then I used 'msgfilter-rev2sha':
GC>   cd rev2sha
GC>   PATH=$PATH:/home/greg/tainted/migration git filter-branch --msg-filter msgfilter-rev2sha --tag-name-filter cat -- --date-order --all
GC> [apparently git really does require the script to be on PATH, but
GC> I'm averse to using superuser powers when I can avoid it]
GC> and I got:
GC>   Rewrite 4ba7e0eea0dda84722d5719ee55a09ab9102832a (1/237)Must run from git svn repository.
GC>   msg filter failed: msgfilter-rev2sha
GC> But I am running from a git repository:
GC>   /home/greg/tainted/migration/rev2sha[0]$ls -A
GC>   .git  data  src  test
GC> The error message is issued here in 'msgfilter-rev2sha':
GC>   die "Must run from git svn repository.\n" unless -d 
GC> and I seem to have a "git svn repository" that lacks remotes/svn :

 Yes, clearly this sanity check isn't robust enough :-( Probably just
checking for .git/svn/refs/remotes should be sufficient...
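
 A minimal sketch of such a check, written in shell rather than in the Perl
of the actual script. The function name is_git_svn_repo is hypothetical, and
the only thing it tests is the directory suggested above, so it does not
depend on how the svn remotes happen to be named:

```shell
# Hypothetical helper: succeed if the given directory (default: the current
# directory) contains the git-svn metadata directory suggested above.
is_git_svn_repo() {
    [ -d "${1:-.}/.git/svn/refs/remotes" ]
}

# Usage: complain when run outside a git-svn repository.
if ! is_git_svn_repo; then
    echo "Must run from git svn repository." >&2
fi
```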

GC> bringing my entire knowledge of perl to bear, I forge boldly ahead:
GC> /home/greg/tainted/migration/rev2sha[0]$sed -i ../msgfilter-rev2sha -e 's/^die/# die/'

... so this looks like a good enough workaround.

GC> git filter-branch --msg-filter msgfilter-rev2sha --tag-name-filter cat -- --date-order --all
GC> Rewrite 248c5530142dde7ad67fcad348fcbd38ba6c9895 (57/237)fatal: ambiguous argument 'svn/trunk': unknown revision or path not in the working tree.
GC> Use '--' to separate paths from revisions
GC> rev-list --first-parent --pretty=medium svn/trunk: command returned error: 

 But the problem is, of course, that the script hasn't been tested with svn
repositories using non-standard layouts either, so it fails to work in
this case. I could probably fix this if you're interested, but it would be
really helpful to have a copy of your repository to test it with. Would
this be possible?

GC> Let's compare the '--no-metadata' and '--msg-filter msgfilter-rev2sha' 
GC> All the differences are in .git/ , and they seem to be just binary;
GC> the contents of {data/ src/ test/} are identical. I think I can conclude
GC> that for this migration 'msgfilter-rev2sha' isn't beneficial.

 Err, I am not sure how you reach this conclusion, even if it may well be
correct. All msgfilter-rev2sha does is update the references to svn
revisions in the repository history to the corresponding git commits, so
it's never going to result in any changes to the repository contents
themselves; it only works on metadata.

 To see whether it's beneficial or not you should use "git log --grep=..."
with the regular expression at the end of the script. As you're not going
to have any 5 digit revision numbers (with only 237 revisions in total),
and as it's not a problem to get some false positives here, it should be
enough to run

        git log -E --grep='(r|rev\s*|revision\s*)([1-9][0-9]*)'

and check if there are any references you would like to replace.
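
 To get a feeling for what this pattern catches without touching a
repository, the same extended regular expression can be tried on a few
lines of text with grep. This is only a sketch: the sample commit subjects
are made up, and \s is spelled as the POSIX class [[:space:]] so that any
grep -E accepts it.

```shell
# Pattern from the command above, with \s written as [[:space:]].
pattern='(r|rev[[:space:]]*|revision[[:space:]]*)([1-9][0-9]*)'

# Made-up commit subjects, for illustration only.
sample_messages() {
    cat <<'EOF'
Fix regression introduced in r42
Revert the change made in revision 117
Update copyright year
See rev 5 for the rationale
EOF
}

# Print only the subjects containing something that looks like a
# reference to an svn revision.
sample_messages | grep -E "$pattern"
```

 Only the "Update copyright year" line is filtered out here. Note that git
log treats --grep patterns as basic regular expressions unless -E is given,
so the grouping and alternation in this pattern need that option to work as
intended.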

GC> What really bothers me is the git documentation:
GC> https://git-scm.com/docs/git-svn
GC> | This option [--no-metadata] is NOT recommended ...
GC> | consider git-filter-branch[1] instead.

 Well, to be fair to the documentation it does explain _why_ it is not
recommended in the "..." part: "as it makes it difficult to track down old
references to SVN revision numbers in existing documentation, bug reports
and archives".

GC> https://git-scm.com/docs/git-filter-branch
GC> | Please do not use this command if you do not know the full implications,
GC> | and avoid using it anyway

 This quote is also quite misleading as it continues with ", if a simple
single commit would suffice to fix your problem." and it would indeed be a
bad idea to use filter-branch in such a case. Of course, a single commit
wouldn't help with anything in the situation at hand, so this warning
doesn't apply here.

GC> One way is NOT recommended, and the other is to be avoided.

 Git documentation is written in a rather informal style, and personally I
prefer it to the more prescriptive style of traditional man pages: I
appreciate being told not only what I can do but also why doing it may or
may not be a good idea. And I think it does a rather good job of it in the 2
cases highlighted above if you look at the full sentences and not just the
quoted parts.

GC> I'll just use '--no-metadata'.

 This is probably still the right decision as with so few commits there are
unlikely to be many revision references in them. And if there are a couple
you'd like to change, you can always use "git rebase -i" to easily do it
interactively. Of course, such history rewriting can only be done before
making the repository public (which is another thing the git-filter-branch
man page warns you about), so you need to do it before sharing the repository
with others or not at all.

