Re: [Gnu-arch-users] Encoding handling proposal

From: Michael Poole
Subject: Re: [Gnu-arch-users] Encoding handling proposal
Date: 05 Sep 2004 21:11:55 -0400
User-agent: Gnus/5.09 (Gnus v5.9.0) Emacs/21.3

Marcus Sundman writes:

> On Monday 06 September 2004 01:42, Michael Poole wrote:
> > Except for the Auto-Filter idea, similar issues have come up in the
> > past in the context of arch's inventory of files.  I think that is the
> > natural place for arch to know about file formats -- whether the
> > inventory uses MIME content-types or something else.
> Sorry, I'm not very familiar with the internals of arch. Do you mean that 
> each inventory entry should get a third field, 'Content-Type'? Or do you 
> mean that one should have an additional inventory entry for each file, and 
> then somehow link the two? Or do you mean something completely different?

As you hint, the current inventory format is not suitable for storing
a wide variety of metadata, but the inventory function could become
part of a more general metadata framework.  If that happens, your
original proposal is an easy way to tell arch the format of each file.
As a less earth-shaking alternative, a list of metadata tags could be
added to each source pattern in =tagging-method.  One heuristic for
choosing a hook is to assign full points for list entries that match
exactly, partial points for wildcard matches, and use the
highest-scoring hook.
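
To make the scoring heuristic concrete, here is a rough Python sketch.
It is only an illustration under my own assumptions, not anything arch
implements: the "key=value" tag syntax, the hook names, the point
values (2 for an exact match, 1 for a wildcard match), and the rule
that a hook is skipped when any of its entries fails to match are all
hypothetical.

```python
from fnmatch import fnmatchcase

# Assumed point values; the proposal only says "full" and "partial".
FULL, PARTIAL = 2, 1


def score_hook(hook_tags, file_tags):
    """Score one hook against a file's tags; None means it does not apply."""
    total = 0
    for entry in hook_tags:
        if entry in file_tags:
            total += FULL            # exact match on a list entry
        elif any(fnmatchcase(tag, entry) for tag in file_tags):
            total += PARTIAL         # wildcard match
        else:
            return None              # assumption: every entry must match
    return total


def pick_hook(hooks, file_tags):
    """Return the name of the highest-scoring applicable hook, or None."""
    best_name, best_score = None, -1
    for name, hook_tags in hooks.items():
        s = score_hook(hook_tags, file_tags)
        if s is not None and s > best_score:
            best_name, best_score = name, s
    return best_name


# Hypothetical hooks, each with the tag list it wants to see.
HOOKS = {
    "line-diff": ["charset=*"],
    "xml-diff": ["charset=*", "format=xml"],
    "latin1-xml-diff": ["charset=iso-8859-1", "format=xml"],
}
```

With these hooks, a file tagged "charset=iso-8859-1" and "format=xml"
goes to the most specific hook, while a UTF-8 XML file falls back to
the generic "xml-diff" hook.  Ties go to whichever hook is listed
first, which a real design would want to define more carefully.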

My preferred method, though, is to let the version tree define new
category names in addition to "source" [1].  Each custom category
would have its own hooks for post-get, pre-commit, diff and merge
actions, but would otherwise be treated the same as "source" files.
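
As a rough illustration of the custom-category idea, the Python sketch
below models each category as a table of hooks for the four actions.
Everything in it is hypothetical: the hook names, the "xml-latin1"
category, and in particular the assumption that a category which does
not override an action falls back to the "source" behavior.

```python
# The four actions named above.
ACTIONS = ("post-get", "pre-commit", "diff", "merge")

# Hypothetical built-in behavior for plain "source" files.
SOURCE_DEFAULTS = {action: "builtin-" + action for action in ACTIONS}


def make_category(overrides):
    """Build a hook table: "source" defaults plus per-category overrides."""
    unknown = set(overrides) - set(ACTIONS)
    if unknown:
        raise ValueError("unknown actions: %s" % sorted(unknown))
    hooks = dict(SOURCE_DEFAULTS)
    hooks.update(overrides)
    return hooks


# Categories a version tree might define in addition to "source".
CATEGORIES = {
    "source": make_category({}),
    "xml-latin1": make_category({"diff": "xml-diff", "merge": "xml-merge"}),
}


def hook_for(category, action):
    """Look up the hook that handles an action for a file's category."""
    return CATEGORIES[category][action]
```

Here hook_for("xml-latin1", "diff") picks the custom hook, while
hook_for("xml-latin1", "post-get") still gets the "source" behavior.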

Custom categories are simpler than the metadata methods, but they
suffer from combinatorial explosion if there are many sensible hybrid
formats.  For example, you might have three character codings for
different files with either line-by-line or SGML/XML-based diffs.
That would require 3*2 = 6 custom categories but only 3+2 = 5 metadata
rules.  The gap widens quickly as the numbers grow, but I suspect
there will not be many cases where it is a big problem.  If it is, the
category name could always be written in the form "iso-8859-1/xml".
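
The arithmetic, and the compound naming scheme, can be sketched in a
few lines of Python.  Apart from "iso-8859-1" and "xml", the coding
and diff-style names are placeholders I picked for the example, and
the helper functions are hypothetical.

```python
from itertools import product

# Placeholder values: three character codings, two diff styles.
CODINGS = ["iso-8859-1", "utf-8", "shift_jis"]
DIFF_STYLES = ["line", "xml"]


def category_names(codings, diff_styles):
    """One custom category per (coding, diff style) pair."""
    return ["%s/%s" % (c, d) for c, d in product(codings, diff_styles)]


def metadata_rule_count(codings, diff_styles):
    """One metadata rule per coding plus one per diff style."""
    return len(codings) + len(diff_styles)


def split_category(name):
    """Split a compound name like "iso-8859-1/xml" into its two parts."""
    coding, sep, diff_style = name.partition("/")
    if not sep:
        raise ValueError("not a compound category: %r" % name)
    return coding, diff_style
```

For the lists above, category_names() yields 6 names and
metadata_rule_count() gives 5.  With 10 codings and 4 diff styles the
counts would be 40 and 14, which is the growth I mean.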


