Re: [Gnu-arch-users] Re: arch with 'special files'

From: Jan Hudec
Subject: Re: [Gnu-arch-users] Re: arch with 'special files'
Date: Tue, 5 Apr 2005 15:34:25 +0200
User-agent: Mutt/1.5.8i

On Tue, Apr 05, 2005 at 11:03:28 +0200, address@hidden wrote:
> On Tue, Apr 05, 2005 at 09:07:12AM +0200, Jan Hudec wrote:
> > On Mon, Apr 04, 2005 at 02:03:56 +0200, Robert Widhopf-Fenk wrote:
> > > On Friday, April 1, 2005 at 02:38:56, Josh England wrote:
> > > > It seems like metadata could be conceptually broken up into two
> > > > types. There is 'first-order' metadata [...]
> [...]
> > Well, there is still a bit special case of "meta-data used by
> > diff/patch". And which ones that are has to be hard-coded.
> > 
> > There would be a diff algorithm, and a list of filters to apply.
> > However, there is a problem if the filter is not fully reversible [...]
> [...]
> Yup. This metadata business doesn't solve everything. Especially with

The metadata business does not solve *anything*. The only thing it can
do is prepare the ground for actual solutions. And there is no point in
implementing it until you know what the solutions will actually use.

> diff/patch you get no end of possibilities ;-)
> A while ago, e.g. the idea was pushed around to have file-type specific
> diff/patch. Of course, this file type would most appropriately be expressed
> via metadata... or not?

Yes. But "metadata" is too broad a term to be useful. We need to say
more about how it should behave.
E.g. many formats can be detected by some kind of magic number. And there,
a metadatum saying "files with magic number X should get treatment Y" is
a lot more useful than listing those files...!
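
To make the "magic number X gets treatment Y" idea concrete, here is a
minimal sketch in Python. The byte signatures are the real ones for
these formats, but the rule table, the treatment names and the function
treatment_for() are all made up for illustration; nothing like this
exists in arch.

```python
# Hypothetical rule table: magic-number prefix -> name of a treatment.
# The treatment names are invented; only the byte signatures are real.
MAGIC_RULES = [
    (b"\x89PNG\r\n\x1a\n", "image-diff"),    # PNG signature
    (b"\xff\xd8\xff", "binary-replace"),     # JPEG (lossy, so no inexact patch)
    (b"gimp xcf ", "layer-diff"),            # XCF (Gimp format)
    (b"<?xml", "tree-diff"),                 # XML with a declaration
]

DEFAULT_TREATMENT = "text-diff"


def treatment_for(head: bytes) -> str:
    """Return the treatment name for a file, given its first few bytes."""
    for magic, treatment in MAGIC_RULES:
        if head.startswith(magic):
            return treatment
    return DEFAULT_TREATMENT
```

One rule then covers every file of that format in the tree, which is
the point: nobody has to list the files one by one.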

> Inexact patching for jpeg images anyone? Or for (ugh!) XML RSS files[1]? Or...

I believe there is some kind of xml-diff, that compares the trees, not
the text ;-). One such is built into openoffice...

I fear inexact patching of jpegs won't work, because they are lossy. But
pngs... And for eg. xcf (Gimp format), I can even imagine a _useful_

And even if it's not inexact patching, instead of two versions, you can
store one version and a difference. And knowing the nature of the data
can make this more efficient.
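
A rough sketch of "one version and a difference", in Python with the
standard difflib module. It assumes the data has already been split
into comparable units (lines here); that splitting step is where
knowledge of the format would come in. The names make_delta() and
apply_delta() are illustrative only.

```python
from difflib import SequenceMatcher


def make_delta(old, new):
    """Describe `new` as copies of ranges of `old` plus literal new units."""
    matcher = SequenceMatcher(None, old, new, autojunk=False)
    delta = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            # Unchanged run: store only the range, not the data.
            delta.append(("copy", i1, i2))
        elif tag in ("replace", "insert"):
            # New material has to be stored literally.
            delta.append(("data", list(new[j1:j2])))
        # "delete" needs no entry: the deleted range is simply never copied.
    return delta


def apply_delta(old, delta):
    """Rebuild the new version from the stored old version and the delta."""
    out = []
    for op in delta:
        if op[0] == "copy":
            out.extend(old[op[1]:op[2]])
        else:
            out.extend(op[1])
    return out
```

Only the old version and the delta need to be kept; the better the
units match the structure of the format, the smaller the "data" parts
tend to be.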

> > We could minimalize this problem by implementing these filters line-wise
> > in diffutils. Sounds complicated enough?
> It does indeed.

In fact I think I could implement this in perl in finite time. I am not
sure about a C or C++ version though...

> [...]
> > > The svn-book has a section on "Why properties?" and IMHO it
> > > makes some sense ...
> > 
> > That section does in fact advocate the Reiser's files-as-directories.
> > But in the working copy, the properties are stored separately anyway...
> > 
> > The thing they advocate in the beginning of that chapter is really an
> > argument for the Reiser's files-as-directories, but subversion does not
> > (and cannot) do that for the client (The server does it).
> > 
> > And it has to be said, that even the files-as-directories don't really
> > solve much, because they need all the applications that manipulate them
> > be aware of the extra semantics (ie. you can't copy them as normal
> > files...)
> *This* is one of the things we could solve quite elegantly with file
> ids. As the files get moved around, Arch can keep track of them quite
> nicely and keep the link to the (not natively managed) attributes.

Not really. Because the nature of the ids and the nature of the metadata
are the same. So if the id is internal, the metadata can be as well (and
indeed they should!) and we have just created a file format with room for
metadata (for which we don't need arch). And if the id is external, arch
has exactly the same problem tracking the id as anything else has
tracking the metadata -- someone else can mess it up.

In fact, the ids _are_ an advantage for storing arch-defined metadata.
This way, if user intervention is required to add the metadata, the user
only adds the id and the rest is kept by arch behind the scenes.

> For each OS[2] there be a set of hook scripts to do the mapping magic
> on check-out and check-in (those not necessarily being considered part
> of Arch -- or better: Arch determines which metadata it explicitly
> cares for).

Hm, it's not that easy. There will have to be a set of data types
(content, type, permissions, ...) that will be recorded for each file,
and a set of procedures to diff and patch them. This set should be
extensible and might be different on each platform. However, all the
standard mappings would have to be built in.

Note that for security reasons arch must not run archive-provided
scripts, so the diff algorithm specification has to be flexible enough
to be actually useful.
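
For illustration only, a sketch in Python of what "a set of data types
with built-in diff and patch procedures" might look like. Everything
here (the HANDLERS table, the Conflict exception, the function names)
is hypothetical and not a description of arch. The point it shows is
that a changeset can only *name* a data type; the code that handles it
always comes from the built-in table, never from the archive.

```python
class Conflict(Exception):
    """The current value is not the one the delta was made against."""


def diff_scalar(old, new):
    # None means "unchanged"; otherwise remember both sides.
    return None if old == new else (old, new)


def patch_scalar(current, delta):
    if delta is None:
        return current
    expected, new = delta
    if current != expected:
        raise Conflict(f"expected {expected!r}, found {current!r}")
    return new


# Built-in table: data type name -> (diff procedure, patch procedure).
# A platform could extend this table, but an archive cannot.
HANDLERS = {
    "type": (diff_scalar, patch_scalar),
    "permissions": (diff_scalar, patch_scalar),
}


def diff_record(old, new):
    """Diff every known data type of one file's record."""
    return {key: HANDLERS[key][0](old[key], new[key]) for key in HANDLERS}


def patch_record(current, delta):
    """Apply a delta, refusing any data type without a built-in handler."""
    unknown = set(delta) - set(HANDLERS)
    if unknown:
        raise ValueError(f"no built-in handler for: {sorted(unknown)}")
    result = dict(current)
    for key, part in delta.items():
        result[key] = HANDLERS[key][1](current[key], part)
    return result
```

File content would be one more entry in the table, with a real diff
algorithm behind it instead of the scalar comparison used here.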

                                                 Jan 'Bulb' Hudec 
