Re: [Gnu-arch-users] the state of the union

From: Andrew Suffield
Subject: Re: [Gnu-arch-users] the state of the union
Date: Wed, 18 Aug 2004 03:32:58 +0100
User-agent: Mutt/1.5.6+20040803i

On Tue, Aug 17, 2004 at 07:19:46PM -0700, Tom Lord wrote:
>     > From: Andrew Suffield <address@hidden>
>     > >   In the past, efforts to unify the changeset mechanisms of the
>     > >   various projects failed.  The primary sticking point seems to have
>     > >   been disagreement about the necessity and desirability of inventory
>     > >   tags.
>     > Ironically, one of the "major" points on my semi-formal list of things
>     > I'd like to see in the next changeset format, is to orient changesets
>     > around logical file identity ("inventory tags") rather than
>     > filenames. 
> I'm unclear what you mean by "orient" here.
> Changesets have a pretty clear abstract structure.   The individual
> file diffs happen to be stored in the tar bundles by filename, but the
> file id -> patch mapping is utterly trivial to compute from that plus 
> the index files.   How in the world could changesets be more
> "oriented around logical file identity?"   

Without spending too long on this, the metaphor is briefly:

The operational set is a group of logical files. Every logical file
has a permanent and immutable identifier (the inventory id). Every
logical file also has:

 - a type (file/symlink/directory)
 - a data body (for files and symlinks, not for directories)
 - a filename
 - a logical file identifier which represents the parent directory
   (NULL/undef/whatever for the root directory)
 - some permission bits
 - (etc)
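
As a rough sketch only, that record could be written down like this in
Python. The class and field names are hypothetical (nothing in arch is
called this), and the permission default is an arbitrary placeholder:

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class FileType(Enum):
    FILE = "file"
    SYMLINK = "symlink"
    DIRECTORY = "directory"


@dataclass
class LogicalFile:
    # Permanent inventory id; it stays the same across renames and moves.
    inventory_id: str
    type: FileType
    # Name within the parent directory, not a full path.
    name: str
    # Inventory id of the parent directory; None for the root directory.
    parent: Optional[str]
    # File contents or symlink target; None for directories.
    body: Optional[bytes] = None
    # Placeholder default, purely illustrative.
    permissions: int = 0o644


root = LogicalFile("id-root", FileType.DIRECTORY, "", None)
readme = LogicalFile("id-readme", FileType.FILE, "README", "id-root", b"hello\n")
```

Note that no full path is stored anywhere; a path is something you derive
by following parent ids.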

Each of these individual items has changes stored in a changeset in
some manner. The data body for regular files is obvious and
well-understood. The others are obvious but not so well-recognised;
the filename and parent directory should (but actually currently don't
in all cases) behave exactly like the body of a symlink - store 'orig'
and 'mod', apply with
"if current == orig, current := mod else conflict".
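
A minimal sketch of that rule in Python. The names `Conflict`,
`apply_field` and `apply_changes` are made up for illustration, and
checking every field before changing any of them is my assumption, not
something the rule itself requires:

```python
class Conflict(Exception):
    """Raised when a field's current value is not the recorded 'orig'."""


def apply_field(current, orig, mod):
    # The whole rule: if current == orig, current := mod, else conflict.
    if current == orig:
        return mod
    raise Conflict(f"expected {orig!r}, found {current!r}")


def apply_changes(fields, changes):
    """Apply {field: (orig, mod)} to a dict of current field values.

    Returns a new dict; the input is left untouched if anything conflicts.
    """
    result = dict(fields)
    for field, (orig, mod) in changes.items():
        result[field] = apply_field(fields[field], orig, mod)
    return result


current = {"name": "foo.c", "parent": "id-src"}
renamed = apply_changes(current, {"name": ("foo.c", "bar.c")})
```

The same function serves for the filename, the parent directory and a
symlink target, which is the point of treating them uniformly.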

With this as the data structure, all the operations become very simple
to specify precisely. A moderate number of currently open bugs also
get swept away at the same time without complicating the process of
changeset application. Good examples are "moving the root of one
project into a subdirectory of another", and "adding a file, when its
parent directory has since been renamed".
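
To illustrate the second example, here is a small Python sketch with
hypothetical ids and a made-up `path_of` helper. Because the added file
names its parent by inventory id rather than by path, a rename of that
directory in the meantime makes no difference:

```python
def path_of(tree, file_id):
    """Derive a path by walking parent ids up to the root.

    `tree` maps inventory id -> (name, parent id); the root's parent is None.
    """
    parts = []
    name, parent = tree[file_id]
    while parent is not None:
        parts.append(name)
        name, parent = tree[parent]
    return "/".join(reversed(parts))


tree = {"id-root": ("", None), "id-dir": ("src", "id-root")}

# The directory has been renamed locally since the changeset was made.
tree["id-dir"] = ("lib", "id-root")

# The changeset adds a file whose parent is recorded as "id-dir".
tree["id-file"] = ("main.c", "id-dir")

print(path_of(tree, "id-file"))  # lib/main.c
```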

All of this information can be constructed from current changesets
(almost, there's a bug open about parent directories). However, this
is the "natural" representation, which is what you want to be working
on in a program - so we could save some trouble and just store it that
way in the first place.

Off the top of my head, here's a completely arbitrary way it *could* be done
(not the best, I don't think, I just pulled it out of the air; we
really need to do this starting from objectives):

For each file $id, the directory files/$id/ in the changeset
represents it. This directory contains:

 - (a file 'body-patch') or (a file orig-body and a file mod-body) or (a
   file orig-target and a file mod-target) or (), depending on the
   logical file type.
 - (a file 'orig-name') and (a file 'mod-name')
 - (a file 'orig-parent') and (a file 'mod-parent')
 - (etc)

For each file $id, the symlink tree/$full_path/$filename links to
files/$id/ - so you have what looks like a partially populated project
tree, but made up of symlinks to file-changesets. (This is an index,
for things that want a filename->id mapping, craftily arranged so that
it's also handy during manual changeset modification).
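
For concreteness, a Python sketch that writes one such entry to disk.
`write_entry` and its arguments are hypothetical, and I've assumed that
the orig/mod files are simply named 'orig-<field>' and 'mod-<field>'. It
sidesteps two questions the layout above leaves open: whether the tree/
path is the orig or the mod path, and what happens when a directory and
a file inside it both appear in the same changeset:

```python
import os


def write_entry(changeset_dir, file_id, full_path, fields):
    """Write files/<id>/ and the tree/<full_path> symlink pointing at it.

    `fields` maps a field name such as "name" or "parent" to an
    (orig, mod) pair of strings.
    """
    entry_dir = os.path.join(changeset_dir, "files", file_id)
    os.makedirs(entry_dir, exist_ok=True)
    for field, (orig, mod) in fields.items():
        for prefix, value in (("orig", orig), ("mod", mod)):
            with open(os.path.join(entry_dir, f"{prefix}-{field}"), "w") as f:
                f.write(value)

    link = os.path.join(changeset_dir, "tree", full_path)
    os.makedirs(os.path.dirname(link), exist_ok=True)
    # Relative target, so the changeset directory can be moved as a whole.
    target = os.path.relpath(entry_dir, os.path.dirname(link))
    os.symlink(target, link)
    return link
```

Calling `write_entry(d, "id-1", "src/main.c", {"name": ("main.c", "main.cc")})`
leaves `d/tree/src/main.c` as a symlink to `d/files/id-1/`, which holds
`orig-name` and `mod-name`.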

None of this is enough to make it worthwhile to change the changeset
format - you can do it all in-core with the current format (or
something very close) and just arrange your internal data structures
that way. But if we're changing it *anyway*, then we have the
opportunity to simplify arch implementations a fair bit and also make
changesets easier to understand.

  .''`.  ** Debian GNU/Linux ** | Andrew Suffield
 : :' : |
 `. `'                          |
   `-             -><-          |
