Re: [lmi] Converting a proprietary svn repository to git

From: Greg Chicares
Subject: Re: [lmi] Converting a proprietary svn repository to git
Date: Mon, 29 Feb 2016 22:29:29 +0000
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Icedove/38.5.0

On 2016-02-29 19:18, Vadim Zeitlin wrote:
> On Mon, 29 Feb 2016 18:58:35 +0000 Greg Chicares <address@hidden> wrote:
> GC> Because this is a one-way, permanent conversion, I think I'll stop trying
> GC> to make it work on two different platforms, and just take the
> GC> successfully-converted new git repository from GNU/Linux and move it to msw.
>  Yes, sure, I wasn't certain why you wanted to do it under Cygwin too, but
> if there is no need for the conversion process to work under both, just
> running it once is quite enough.

I figured that if it were very easy to do the migration on each machine,
that would save the agony of sending it through email. And I did want to
see that 'git log -1' would produce the same hash on both systems (it did).
It also seemed prudent to exercise git on different operating systems, and
now I know of at least one incompatibility (due to git version rather than
OS--but still that's useful to know). And I'm certainly not going to try
sending a 27MB file in email, so I do have to make the bare repo work in
two different worlds.
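
To illustrate why an identical hash on both systems is a meaningful check: git names each object by a digest of its type, size, and content, so equal ids imply equal content whatever the platform. The sketch below reproduces that computation for a blob only, assuming the SHA-1 object format; the helper name git_blob_id is hypothetical. A commit id (what 'git log -1' shows) is computed in the same way over the commit object, which in turn refers to tree and parent ids.

```python
import hashlib

def git_blob_id(content: bytes) -> str:
    """Return the SHA-1 id git assigns to a blob with this content."""
    # Git hashes a short header ("blob <size>\0") followed by the raw bytes.
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()

# The same bytes give the same id on any OS or git version.
print(git_blob_id(b"hello\n"))
```

If git is at hand, 'git hash-object' on a file containing the same bytes should print the same id.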

> GC> And I guess that answers the question in the email I just sent: given
> GC> 
> GC>   svn --> git (A) 67MB --> bare git (B) 2.7MB --> (C) final git 28MB
> GC>      clone             push                  clone
>  The difference between (A) and (C) is intriguing, git-svn does have some
> overhead but it's usually relatively small. I admit I don't know what can
> explain it... The difference between (B) and (C) seems to indicate that
> the repository consists of text files which compress very well, but this is
> not really surprising.

Size in MB:

(A) 66.9
  data 12.2
  src   1.4
  test 11.2
  .git 42.1
    objects 41.9 (7196 items)

(B)  2.7
  objects 2.7

(C) 27.6
  data 12.2
  src   1.4
  test 11.2
  .git  2.8
    objects 2.7 (4 items)

Therefore, (B) vs. (C) is exactly as you describe, and in fact the
checked-out contents are all text files with extreme redundancy. For
example, test/ contains xml files with many <cell> elements (which
follow 'cell.rnc'), under which most subelements are identical or empty.
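
The listing above also shows where the (A)/(C) gap lies. A small arithmetic check, using only the figures already given (the variable names are just for illustration), suggests that the checked-out files are the same size in both and the whole difference sits under .git:

```python
# Sizes in MB, copied from the listing above.
a = {"data": 12.2, "src": 1.4, "test": 11.2, ".git": 42.1}
c = {"data": 12.2, "src": 1.4, "test": 11.2, ".git": 2.8}

total_a = round(sum(a.values()), 1)          # total for (A)
total_c = round(sum(c.values()), 1)          # total for (C)
gap_total = round(total_a - total_c, 1)      # overall (A) - (C)
gap_git = round(a[".git"] - c[".git"], 1)    # .git-only difference

# True when data, src and test have the same size in (A) and (C).
worktree_equal = all(a[k] == c[k] for k in ("data", "src", "test"))

print(total_a, total_c, gap_total, gap_git, worktree_equal)
```

The component sizes add up to the reported totals, and the overall gap equals the .git gap, so the working trees contribute nothing to the difference.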

As for (A) vs. (C), perhaps the 237 historical commits were each packed
separately, and then (due to the extreme redundancy in the underlying
files) repacking saved a great deal of space.
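
The item counts in the listing (7196 under objects in (A), 4 in (C)) are at least consistent with that guess: git compresses each loose object with zlib on its own, whereas a packfile can store similar objects as deltas against one another. The sketch below is only a rough stand-in for that effect, not git's actual pack algorithm. It uses plain zlib on synthetic data, and both the make_revision helper and the XML-like content are made up for illustration. It compares compressing 237 near-identical revisions separately with compressing them as one stream.

```python
import zlib

def make_revision(n: int) -> bytes:
    """Build a synthetic, highly redundant XML-like file.

    Revision n differs from the others in a single line.
    """
    cells = ["<cell><a>1</a><b></b><c>0</c></cell>\n"] * 40
    cells[n % len(cells)] = f"<cell><a>{n}</a><b></b><c>0</c></cell>\n"
    return "".join(cells).encode("ascii")

# 237 is the commit count mentioned above; the content is invented.
revisions = [make_revision(n) for n in range(237)]

# Each revision compressed on its own, roughly like loose objects.
separate = sum(len(zlib.compress(r)) for r in revisions)

# All revisions in one stream, so redundancy across revisions is exploited.
together = len(zlib.compress(b"".join(revisions)))

print(separate, together)
```

On data this redundant, the combined stream comes out considerably smaller than the sum of the separately compressed pieces, which is the direction of the (A) vs. (C) difference.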

Might it be that the newer version of git on Cygwin makes (A) smaller?
That information may not be very important, but I have it immediately
available via 'du -sb' on both systems:
  66895327 Debian
  65779859 Cygwin
so there's a slight difference due to git versions, but not nearly
enough to explain the size discrepancy between (A) and (C) above.
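
For anyone repeating the measurement without 'du', a minimal sketch follows. It only approximates 'du -sb': it sums the apparent sizes of files, whereas du also counts the directory entries themselves, so its totals can be somewhat larger. The function name apparent_file_bytes is hypothetical.

```python
import os

def apparent_file_bytes(root: str) -> int:
    """Sum the apparent sizes, in bytes, of all files below root."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            # lstat: measure a symlink itself, not its target.
            total += os.lstat(os.path.join(dirpath, name)).st_size
    return total

# The two figures reported above, compared directly.
debian, cygwin = 66895327, 65779859
difference = debian - cygwin
print(difference, round(100 * difference / debian, 1))
```

The two systems differ by 1115468 bytes, under 2% of the Debian total.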
