Re: [Gnu-arch-users] Re: give us a hand with arch

From: Andrea Arcangeli
Subject: Re: [Gnu-arch-users] Re: give us a hand with arch
Date: Sat, 27 Sep 2003 19:54:47 +0200
User-agent: Mutt/1.4.1i

On Sat, Sep 27, 2003 at 09:31:04AM -0700, Tom Lord wrote:
>     > From: Andrea Arcangeli <address@hidden>
>
>     > I think the tagline isn't even polite w.r.t. tree maintainers, that's
>     > pollution spreading in the sourcecode for the sole purpose of not
>     > wanting to call add-tag/delete-tag/move-tag, which is a very good
>     > practice anyways to be able to mark everything not tagged as
>     > unrecognized and ignoring it during the checkin after spawning a
>     > warning.
>
> Another point of view says that the idea of an inventory id is a
> usefully universal one that transcends the issues of revision control.
> For example, inventory ids are a prerequisite for mkpatch/dopatch
> functionality, and that functionality seems useful to me even where
> no revision control system is used at all.
>
> Inventory ids tell you (and your tools) something useful about the
> structure of a source tree and how that structure compares to a
> related tree, regardless of what revision control (if any) is being
> used.
>
> Thinking of embedded tags as _meta_data misses the point.  They're
> _data_.

They're definitely not data.

The only reason they exist is to track the evolution of the project,
which is something metadata should do. The data has no knowledge of the
history of the project, nor can it provide any tracking

Data is the information in a linux-2.4.22.tar.gz package; metadata is
what is stored inside the arch archive.

Storing part of the "tracking" code mixed with the data is certainly not

The reason this is obviously not data is that a user downloading the
pure-data linux-2.4.22.tar.gz package will get absolutely zero benefit
from the tagline embedded in the source; for this use the tagline is
pure _pollution_. Furthermore, this user will read the code, see this
pollution, and delete it, because it's not data and so it's useless to
him. It's only useful to people who care about tracking the kernel with
a revision control system, not to him. You certainly don't want to lose
the tracking code because of that.
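
To make the failure mode concrete, here is a rough Python sketch. It
assumes the tagline is a single comment line of the form
"arch-tag: <id>" and that the id contains no whitespace; the id value
and both helper names are made up for illustration, and this is not how
arch itself parses taglines.

```python
import re

# Assumed tagline shape: a comment line containing "arch-tag: <id>",
# where <id> has no whitespace. The real rules may be looser.
TAGLINE = re.compile(r"arch-tag:\s*(\S+)")

def find_tagline_id(text):
    """Return the id from the first line carrying a tagline, or None."""
    for line in text.splitlines():
        match = TAGLINE.search(line)
        if match:
            return match.group(1)
    return None

def strip_taglines(text):
    """Drop every tagline, as a tarball user cleaning up the code might."""
    kept = [line for line in text.splitlines() if "arch-tag:" not in line]
    return "\n".join(kept) + "\n"

# Hypothetical source file with an embedded tagline (the id is invented).
source = (
    "int main(void)\n"
    "{\n"
    "\treturn 0;\n"
    "}\n"
    "/* arch-tag: 3f1c2a9e-example-id */\n"
)

cleaned = strip_taglines(source)
```

Here find_tagline_id(source) finds the id, while
find_tagline_id(cleaned) returns None: the code is untouched, but the
tracking information went away with the comment.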

Another proof that this is definitely metadata is that any other project
could embed its own tracking metadata as well, and again it would
pollute the real data even more, up to the point where the metadata is
90% and the data is 10%. The same applies to the emacs hooks at the end
of some files; there are very few in the kernel since our coding style
is quite homogeneous these days. If all editors required that, we could
run into the same space consumption problem, where the metadata is more
than the data.

Now the interesting question: am I right that the tagline already has
further benefits compared to the explicit tag method (besides not
having to run an explicit add-tag/delete-tag/move-tag)? I understood
this is the case, and as such the technical gap should be filled. That
will make the explicit method even easier than using the tagline mode,
since people won't need to generate arch-tags by hand like they have to
do right now with the tagline.

Yet another obvious reason this is metadata is that if I added a
secondary stream to the same inode in the filesystem, explicitly meant
for userspace metadata, you could store the metadata separately from the
data, and you would give the user the same property but w/o polluting
the data (i.e. the metadata wouldn't end up in the linux-2.4.22.tar.gz
that the people who only care about the data download). This is exactly
what you really need, in fact: you need the metadata stream to avoid the
add-tag/delete-tag/move-tag. The data pollution is simply a temporary
workaround for the lack of multiple streams in the inode. LD_PRELOAD
could emulate it in theory.
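
As a rough approximation of that idea, Linux extended attributes
already behave a bit like a second stream: they sit on the inode, so
they follow a plain rename and stay out of the file contents. The
Python sketch below assumes a filesystem with user xattrs enabled; the
attribute name "user.inventory-id", the id value, and all function
names are made up for illustration, and nothing here is something arch
does today.

```python
import os
import tempfile

# Hypothetical attribute name; "user." is the Linux namespace for
# unprivileged extended attributes.
ATTR = "user.inventory-id"

def set_inventory_id(path, inventory_id):
    """Attach the id to the inode; return False if xattrs are unavailable."""
    if not hasattr(os, "setxattr"):      # os.setxattr exists only on Linux
        return False
    try:
        os.setxattr(path, ATTR, inventory_id.encode("utf-8"))
    except OSError:                      # filesystem without user xattrs
        return False
    return True

def get_inventory_id(path):
    """Return the id stored on the inode, or None if there is none."""
    if not hasattr(os, "getxattr"):
        return None
    try:
        return os.getxattr(path, ATTR).decode("utf-8")
    except OSError:
        return None

def demo():
    """Rename a tagged file; return (stored, id_after_rename, contents)."""
    with tempfile.TemporaryDirectory() as tmp:
        old = os.path.join(tmp, "sched.c")
        new = os.path.join(tmp, "scheduler.c")
        with open(old, "w") as f:
            f.write("int x;\n")
        stored = set_inventory_id(old, "example-id-1")
        os.rename(old, new)              # plain mv, no move-tag step
        with open(new) as f:
            contents = f.read()
        return stored, get_inventory_id(new), contents
```

Where the attribute can be stored, the id is still readable after the
rename and the file contents are byte-for-byte the pure data. Whether
the attribute survives packaging depends on the archiver and its
options, which is a separate problem from the one discussed here.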

I've nothing against the tagline; it may be the handiest method for some
projects, but I wouldn't encourage its usage in large projects, where
it's fundamental to avoid pollution and keep the code clean and well
separated from any possible metadata (no matter if the metadata is from
an editor or a revision control system).

Andrea