Re: Moving to git

From: Thomas Schwinge
Subject: Re: Moving to git
Date: Sun, 11 Jan 2009 12:50:58 +0100
User-agent: Mutt/1.5.11


On Fri, Jan 09, 2009 at 09:38:20AM +0100, olafBuddenhagen@gmx.net wrote:
> On Sun, Jan 04, 2009 at 12:05:07AM +0100, Thomas Schwinge wrote:
> > Only convert GNU Mach's gnumach-1-branch, GNU MIG's HEAD, GNU Hurd's
> > HEAD.
> > 
> >     With the exception of the GNU Mach Xen branch and the Hurd GSoC
> >     branches, these are the only branches that see active development.
> So? No need to drop dead branches -- they can still be interesting for
> reference.

The old CVS repositories will of course remain available for history
inspection.  I see the new git repositories mostly from a
forward-looking perspective.

I'm not yet convinced that we really need these old, unused branches in
the fresh git repositories.  How often do you look things up in the GNU
Mach HEAD branch (as compared to our current gnumach-1-branch, where all
the work is done)?  Or in the Hurd's miles-orphaned-changes branch or the

If you people strongly want to have all available history present in the
new git repositories, then I can arrange for that to happen, of course.
I wouldn't do that, however.

> It's not like dropping them helps with anything...

But keeping them doesn't help with anything either, in my opinion.  And why provide
legacy branches that no one uses and that will only confuse people?

> >     For the GNU Mach Xen branch, I'd like Samuel to tell when that one
> >     is ready for being merged into the main GNU Mach 1 branch and then
> >     I intend to do that merge as one big aggregated ``blob'' (i.e.,
> >     without preserving the individual development, testing, debugging,
> >     etc. commits).  The same holds for the GSoC branches.
> Don't do that. The whole point of a revision control system is to
> preserve history... A "blob" commit is unmanageable.

Did you actually have a look at the individual commits on the Xen branch?
Again, I see no effective use in preserving them.

For the GSoC branches, the situation is indeed a different one.  But
there are (roughly) only a handful of commits, so I'd rather replay these
manually on top of the new master branch instead of fiddling with the
merges that I had done between the CVS branches.  (I don't know if
git-cvsimport and cvsps are smart enough to untangle that.)  And didn't
(some of) the GSoCers work in git branches anyway?  Wouldn't we be able
to import those directly (rebasing them onto our master branch, of
course)?

> If you think the history of the branch(es) is too messy, you can of
> course start a new branch, say xen-cleanup. This new branch should still
> contain a series of individual changes though, even if they don't
> reflect actual development history.

Are you going to do that work of cleaning things up?

> (The latter is the standard practice in Linux development, BTW.)

I'm aware of that.

> > Exclude all automatically regeneratable and Debian package maintenance
> > files from the conversion.
> > 
> >     These files are no longer present in current checkouts (general
> >     change of policy), but they do occupy a non-insignificant amount
> >     of space in revision history, and are not interesting with respect
> >     to preserving their history.  (They might be interesting if
> >     someone indeed wants to build an old version, but this is a
> >     corner-case that can be worked around easily.)
> > 
> >     The files will simply be excluded from the conversion.  ChangeLog
> >     entries referencing them will not be changed in flight (too
> >     time-consuming for no net benefit).  Instead a follow-up clean-up
> >     patch will weed out all ChangeLog entries referencing them.
> Sounds like a lot of additional work and potential confusion, for very
> little benefit...

It's as easy as ``rm -i gnumach/{,debian/}Attic/*'' before doing the
conversion.  And who should be confused by what?  Does a commit message

    configure.in: Whatever.
    configure: Regenerate.

... and there being no `configure' file at such a check-out (who would
check something like that out, anyways?) really confuse our target
audience?  Let's stay realistic, please.

> > Split Hurd modules into separate repositories.
> I stand by what I said on this topic before: *If* we decide to make such
> a change, it should be done independently of the Git migration.

Indeed it should be done independently, and I say that it should be done
*before* the Git migration, so that we start with unencumbered git

> It would hold up the migration; it would mix a purely technical action
> with fundamental decisions.

Yes, but we are not in a hurry with the migration.  Let's rather use this
chance to do this properly.

> >     Rationale: split as far as it's still making sense.  There is no
> >     reason to have an integer hashing library, a pthread
> >     implementation, an ext2 file system interpreter, libc amendments,
> >     Hurd interfaces definition files, a library for providing a
> >     uniform interface to Mach ports, etc. in the same repository.
> Is there a reason to keep them separate?...

Yes.  The simple rule of manageable complexity.  Or why are there
separate repositories for GCC, Emacs, the Linux kernel, X.org, and the
Intel math lib for IA64?  Is it, at least in theory, easier to find a
maintainer for libtrivfs or for the whole set of Hurd libraries and

> >     libihash and libpthread are shared between Hurd and Viengoos.
> I agree that these should be split out

Then, why just these two?

> but probably also not during the git migration...

Before doing that, yes.

> >     Checking the state after having done a whole-repository conversion
> >     yields several change sets that span files in more than one of the
> >     new modules.
> Indeed, it's not possible to properly disentangle the modules
> retrospectively. So we *have* to keep the original history, even if we
> really want the split to happen ultimately.

It is the *very vast* majority of commits that do *not* touch several

> This is also a reason why the split should happen only after the git
> conversion.

Believe me, *I* am the one who has had a look at the relevant changesets
and judged from what I saw there.  Did you have a look yourself?  If you
want to have a look, do a conversion (with a fuzz factor of perhaps 300s)
and then this:

    $ git log --name-only --pretty=format:'commit %H %cd' |
        awk 'BEGIN { c = "INVALID"; dirs["INVALID"] = 0; };
          /^$/ { next; };
          /^commit / {
            if (length(dirs) > 1) {
              printf "%s --", c;
              for (dir in dirs) { printf " %s (%d)", dir, dirs[dir]; }
              printf "\n";
            }
            delete dirs; c = $0; next; };
          /\// { OFS = FS; FS = "/"; $0 = $0; dirs[$1]++; FS = OFS; next; };
          { dirs["ROOT"]++; };' |
        while read a b c; do git show "$b"; done | less
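
For anyone who finds the one-liner above hard to follow, here is a
more readable sketch of the same check: list every commit that touches
files in more than one top-level directory.  The function name
multi_dir_commits is made up for this illustration.  It reads the log
from standard input, leaves out the %cd date for simplicity, and lists
directories in order of first appearance.  It also reports the last
commit in the log from an END block, which the one-liner above does not
appear to do.

```shell
# Hypothetical helper: print commits that touch more than one
# top-level directory.  Expects on stdin the output of
#     git log --name-only --pretty=format:'commit %H'
multi_dir_commits() {
    awk '
        # Print the pending commit if it touched more than one
        # directory, then reset the per-commit state.
        function report(    i) {
            if (n > 1) {
                printf "%s --", hash
                for (i = 1; i <= n; i++)
                    printf " %s (%d)", order[i], count[order[i]]
                printf "\n"
            }
            for (i = 1; i <= n; i++)
                delete count[order[i]]
            n = 0
        }
        /^$/       { next }
        /^commit / { report(); hash = $2; next }
        {
            # Everything before the first slash is the top-level
            # directory; files without a slash are counted as ROOT.
            slash = index($0, "/")
            dir = (slash > 0) ? substr($0, 1, slash - 1) : "ROOT"
            if (!(dir in count))
                order[++n] = dir
            count[dir]++
        }
        END { report() }
    '
}
```

Piping the git log command shown in the comment into multi_dir_commits
prints one line per offending commit: the hash, then each directory
with the number of files touched in it.  The hashes can then be fed to
git show as in the pipeline above.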

> >       * A few others are for interface changes and follow-up
> >       adjustment in the interface-using modules (libihash rewrite, for
> >       example). (Likewise for build system enhancements or changes, as
> >       adding uselocale for libthreads, or adding libncursesw for
> >       utils/console-ncurses.c, for example.)  Or adding a driver for
> >       streamio devices and adding a stanza for these in the MAKEDEV
> >       script at the same time.  Also, there are a few (notable!)
> >       interface changes, where the aggregated documentation in `doc/'
> >       has been updated together with committing the interface change.
> >       Likewise for changes where the top-level TODO or tasks file has
> >       been updated together with committing a change.  All these
> >       changes will be broken up.  Future interface changes will be
> >       done using some sort of versioning.
> This is the main reason why I'm not convinced the split is a good idea
> at all.

Above you just agreed with splitting libihash out of the tree.  And
libihash was one of the precious few examples where this paragraph is
relevant after all.

And then, would you really want a separate libihash repository (that is,
separated *after* the conversion) to contain the whole libdiskfs (etc.)
history?  Sure, it no longer matters in terms of pure complexity (disk
space, processing time), but we do have some sense for aesthetics as
well, don't we?

> We would need to start some proper versioning, which is quite a
> pain.

Proper versioning is a standard technique these days.  I don't see any
problems.  Where do you see problems?

> And what would the benefit be? It's not like we ever release these
> modules separately... (At least not in the foreseeable future.)

Why not?  I see this as one main advantage: I'd be much more comfortable
with releasing version 0.4 of libnetfs instead of releasing version 0.4
of the GNU Hurd.  Or finding a maintainer for libnetfs.

