Re: [Gnu-arch-users] Re: give us a hand with arch

From: Andrea Arcangeli
Subject: Re: [Gnu-arch-users] Re: give us a hand with arch
Date: Sun, 28 Sep 2003 15:08:20 +0200
User-agent: Mutt/1.4.1i

On Sun, Sep 28, 2003 at 07:11:30AM -0400, Miles Bader wrote:
> On Sun, Sep 28, 2003 at 12:12:49PM +0200, Andrea Arcangeli wrote:
> > Sorry but I disagree here and that's the very reason the tag-id are
> > metadata and they're not data, and they provide zero value to the
> > project in the long run.
> Actually they provide quite a bit of value.  I have no idea, why you seem so
> intent on moaning about them (are you afraid they _will_ get used for linux?).

Can you answer a very simple question?

Tell me what advantages and disadvantages I would have by adding taglines
to all files, compared to explicit tagging, with strict commits. If you find a
single advantage I will add them. The problem is you can't, because there
aren't any. There are only disadvantages; this is a tangible fact.

You know strict commits force me to do something along the lines of `tla
add/move/delete-tag`. So far so good.

disadvantages of taglines with strict commits enabled:

1) lose the math guarantee of no clashes (which can instead be
   easily provided with the explicit method)
2) pollute the code
3) annoy your eyes and mine with metadata
4) having to bother choosing the tagline
5) having to check over time that there is no clash in the local tree
   (according to your email arch is already checking for this, that
   means wasting some minor cpu time)
6) waste cpu time as well to regexp all the 500M of 2.6 kernel data
   searching for those tags
7) having to check over time that there is no clash with parallel trees
   not yet synced that arch can't know about


advantages of taglines with strict commits enabled:

0) None. I have to use strict commits, so I can't avoid `tla add/delete/move-tag`.

I hope this explains why tagline isn't going to ever hit the linux

I could be very wrong, but only if you can achieve at least one of the
two things below:

1) you find a single advantage that I missed that outweighs the long
   list of tangible disadvantages I listed
2) you convince me that strict commits aren't the way to go, but I'm very
   convinced they _are_ the safest and best way to do revision control.
   Adding and moving files is sooo infrequent that it's perfectly
   acceptable to run `tla move-tag`; strict commits are a feature,
   not a disadvantage.

So please let's concentrate on this, and let's forget the other
discussions on this matter. This part is what really matters, i.e. the
fact that taglines provide only and exclusively disadvantages if
you need strict commits like I do.
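
As a rough illustration of disadvantages 5) to 7), here is a minimal sketch
of what clash checking for taglines involves. It is not how arch implements
it: the `arch-tag:` pattern, the function name `find_tagline_clashes` and the
sample file contents are all assumptions made for the example. The point is
only that every file has to be scanned, and that two files carrying the same
human-chosen id can only be detected once both are in the same tree.

```python
import re
from collections import defaultdict

# Assumed tagline form: a comment line containing "arch-tag: <id>".
# The id is taken as the rest of the line, with surrounding blanks trimmed.
TAGLINE = re.compile(r"arch-tag:[ \t]*(\S.*?)[ \t]*$", re.MULTILINE)


def find_tagline_clashes(files):
    """Return {id: [paths]} for every id claimed by more than one file.

    `files` maps a path to the file's text. Every file is scanned with the
    regexp, which is the per-tree cost mentioned in the list above.
    """
    owners = defaultdict(list)
    for path, text in sorted(files.items()):
        match = TAGLINE.search(text)
        if match:
            owners[match.group(1)].append(path)
    return {tag: paths for tag, paths in owners.items() if len(paths) > 1}


# Hypothetical tree: a file was copied and its tagline was not changed.
tree = {
    "fs/buffer.c": "// arch-tag: buffer cache\nint x;\n",
    "fs/vfs/vfs_buffers.c": "int y;\n// arch-tag: buffer cache\n",
    "mm/swap_state.c": "// arch-tag: swap state\n",
}
clashes = find_tagline_clashes(tree)
```

A check like this only sees the files it is given, so a clash with a parallel
tree that has not been merged yet stays invisible until the merge.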

> > Today swap_state.c is swap_state.c, tomorrow
> > swap_state.c could be nvidia.c just because I did `move-tag swap_state.c
> > nvidia.c`. then it's just worthless to remember that was swap_state.c.
> Then _don't do that_.  If you create a _new file_, then give it a _new id_.

The thing will move over time: first fs/buffer.c, then a new dir
fs/vfs could be created and it will be fs/vfs/vfs_buffers.c, then it
gets split, etc. etc. I don't want obsolete info to stick with the
code. If anything, a random id would be better, so you just don't risk
focusing on an old concept.

> On the other hand, if you move a file somewhere, use the _same id_.  You can
> certainly think up weird edge cases where it's not clear what to do, but that
> doesn't really matter; 95% of the time, the `identity' of a file is an
> entirely intuitive concept, across renames, content changes, whatever.

And for this 95% you don't need to know the ID, because the filename will
tell you what's inside the file anyway. For the remaining 5%, _only_
the filename will give you the right hint about which file it is (the
id will be misplaced, so it had better be a random ID, not anything
meaningful to a human, so it doesn't generate confusion).

I mean, it's only the filename, not the id, that matters in terms of
data, the actual data, not the year-old changeset that is no longer
relevant after the last guy rewrote the subsystem.

> Really, as far as I can see, your only really defensible point is that you
> think they're ugly.  That's your right, but don't be surprised when it fails

See above: the list of disadvantages and the list of advantages. I
definitely don't want to deal with that long list of disadvantages for
not a single good reason.

As I stated emails ago: by the time you admit you need strict
commits, arch shouldn't bother asking you about, or even knowing about, any
tagging-mode concept. By that time you should only be presented with
explicit mode with a math safe algorithm, since it's possible to
implement that.
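
One way such an explicit, clash-free scheme might look is sketched below. This
is a toy model under stated assumptions, not arch's implementation and not
necessarily the algorithm meant above: the class `ExplicitIds`, its methods
and the id format are hypothetical, and uniqueness across trees relies on the
assumption that every tree has its own unique name. Ids are allocated by the
tool rather than chosen by a person, and they live outside the source files.

```python
class ExplicitIds:
    """Toy model of explicit tagging: ids are stored outside the files."""

    def __init__(self, tree_name):
        self.tree_name = tree_name  # assumed to be unique per tree
        self.counter = 0            # only ever grows, so ids are never reused
        self.ids = {}               # path -> id

    def add(self, path):
        """Rough equivalent of add-tag: allocate a fresh id for a new file."""
        if path in self.ids:
            raise ValueError(f"{path} already has an id")
        self.counter += 1
        self.ids[path] = f"{self.tree_name}/{self.counter}"
        return self.ids[path]

    def move(self, old, new):
        """Rough equivalent of move-tag: the id follows the file."""
        if new in self.ids:
            raise ValueError(f"{new} already has an id")
        self.ids[new] = self.ids.pop(old)

    def delete(self, path):
        """Rough equivalent of delete-tag: drop the id, never recycle it."""
        del self.ids[path]


# Hypothetical usage, following the fs/buffer.c example from this thread.
tree = ExplicitIds("andrea-2.6")
first = tree.add("fs/buffer.c")
tree.move("fs/buffer.c", "fs/vfs/vfs_buffers.c")
second = tree.add("fs/buffer.c")  # a new file at the old path gets a new id
```

Because the id is a tree name plus a counter, it carries no obsolete meaning
when the file is renamed or rewritten, and nothing has to be scanned for in
the source.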

I'm quite sure that in linux we can't maintain the huge regexp; it can't be
updated in sync every time a new driver is merged into mainline.

And even if we try to enforce a huge regexp policy, there will always be
cases where humans make mistakes. It's much harder to make mistakes with `tla
add-tag`: if you don't know `tla add-tag`, your code won't go into
revision control in the first place. If you don't know about a policy
and you don't use arch yourself, you can still send a patch to Linus
adding a file that breaks the conventions. And Linus certainly wants to
have a tool that tells him when something breaks the conventions; this
is what an SCM has to be _good_ for. We need that tool anyway, be it
arch or something in front of arch where we have to tell it explicitly
"put this into the arch workdir because this is really code, not garbage".

This is only a guess, I can't read Linus's brain ;), but that's at least
my own opinion on the matter. Again I could be wrong, but I'd like you
to answer the disadvantage/advantage list.
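
The kind of check meant by "a tool that tells him when something breaks the
conventions" could be sketched roughly as follows. This is an assumption-laden
illustration, not arch's actual commit logic: the function names
`strict_commit_check` and `commit` and the sample paths are made up for the
example. A strict commit simply refuses to proceed while the set of files in
the tree and the set of explicitly tagged files disagree.

```python
def strict_commit_check(tree_files, tagged_files):
    """Compare the files in the tree against the explicitly tagged files.

    Returns (untagged, missing): files present without an id, and ids
    whose file has disappeared without a delete-tag or move-tag.
    """
    tree, tagged = set(tree_files), set(tagged_files)
    return sorted(tree - tagged), sorted(tagged - tree)


def commit(tree_files, tagged_files):
    """Refuse to commit unless every file has been explicitly accounted for."""
    untagged, missing = strict_commit_check(tree_files, tagged_files)
    if untagged or missing:
        raise RuntimeError(f"untagged: {untagged}, missing: {missing}")
    return "committed"


# Hypothetical case: a patch added a driver file that nobody tagged.
in_tree = ["fs/buffer.c", "drivers/net/newcard.c"]
tagged = ["fs/buffer.c"]
untagged, missing = strict_commit_check(in_tree, tagged)
```

Here the untagged driver is reported instead of being silently picked up or
silently ignored, so somebody has to decide whether it is really code.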

And no, I don't mind having to execute tla
add-tag/delete-tag/move-tag. I _want_ to run it; that's a _feature_ for
me, not a disadvantage. So I can't buy that as a disadvantage; that's an
advantage from my point of view.

> far better than most of the _other_ meta-data that gets stuck in linux source
> files [and yes, there's a lot of it]).

I don't doubt we have some old metadata in there. Maybe you mean the old
$Id$ for cvs? That's more data than metadata: it tells us the last date
the file was updated and who did that. These days with bitkeeper I
guess it's obsolete too and would better be deleted; I doubt
bitkeeper has an option to update the cvs metadata. In fact I would like
arch to have the same $Id$ too. It's nice to have if you're not an
arch user but you download the tar.gz, or simply because you're reading
/usr/src/linux, or because you don't have internet connectivity and
you only have the tar.gz on the CD. That is data, not metadata. CVS
doesn't depend on it for anything; in fact, if anything, it often causes
trouble when you import (i.e. if I want to use cvs for linux I have to
-ko it immediately during import).

The arch-tag instead would provide zero info to the developer who is
writing the code, except possibly an ancestor name of the file. But if
the developer wants to know about that, he should call the arch
equivalent of `cvsps -f` (a feature arch still needs); that info isn't
supposed to be provided inline in the source without an explicit request.

Don't get me wrong, I'm fine with you liking taglines better, if you appreciate
non-strict commits and you're all very careful about the regexp and
you don't need to merge drivers from third parties. That gives you the
advantage of not having to call add-tag and friends, but you can't
expect all projects to live without strict commits, and for
those projects the tagging-mode is a no brainer: the math safe "explicit"
one, and never tagline, which only and exclusively provides disadvantages.

Now if you can invalidate all my points or find a flaw in my reasoning, I'd
be really happy to add and advocate taglines everywhere.

Andrea