Re: [Gnu-arch-users] Re: How does arch/tla handle encodings?

From: Jeremy Shaw
Subject: Re: [Gnu-arch-users] Re: How does arch/tla handle encodings?
Date: Sat, 28 Aug 2004 13:31:11 -0700
User-agent: Wanderlust/2.11.30 (Wonderwall) SEMI/1.14.6 (Maruoka) FLIM/1.14.6 (Marutamachi) APEL/10.6 Emacs/21.3 (i386-pc-linux-gnu) MULE/5.0 (SAKAKI)

At 27 Aug 2004 20:31:23 -0400,
Michael Poole wrote:
> Marcus Sundman writes:
> You are kidding, right?  You want arch to impose a particular solution
> for a rather rare problem on users when there is no consistent or
> clear technical way to solve the problem?

I do not think it is correct to call this a *rare* problem that does
not need to be solved. I heard the same excuse back when tla did not
support spaces in filenames, but the reality is, KDE has filenames
with spaces in them, filenames that contain characters other than
ASCII 0-127, and files whose contents have all sorts of
encodings. Mozilla also uses spaces in filenames, and certainly has to
deal with internationalization and encoding issues. These types of
problems may not affect the projects you care about, but failing to
address issues affecting some of the most visible FOSS projects around
seems like a sure plan for failure.

> Many people whine about systems that do not take special care of some
> character encoding scheme (or set of schemes).  None of them seem to
> know how (for example) tla and emacs and Apache should communicate to
> each other the information of encoding scheme for a text file.  How do
> you think it should be done in a way that does not break when "normal"
> applications copy or move the file?

Tla already has *very important meta-data* that it has to track --
namely explicit inventory ids. And it already breaks if normal
applications copy or move the file. So this is not really anything

> In other words, I disagree that the problem can be fixed at all (much
> less easily) by merely adding support for per-file metadata to arch.

The 'best' solution involves the co-operation of all the levels. For
example, one possible solution might involve meta-data at the file
system level, a standard for storing encoding information in the
meta-data, support for handling the meta data and encoding information
by the GNU tools like tar, patch, diff, etc. It will literally be
years before all the pieces are in place -- but the pieces are already
falling into place. For example, looking at the Linux kernel's xattr
support, Reiser4's file-as-directory meta-data, and Microsoft's
WinFS (and, of course, good old MacOS), it is clear that future
filesystems will have meta-data one way or another.

I am positive that Tom Lord wants tla to have the best Unicode and
encoding support it can in the long run -- and that will involve
waiting for (or helping) the rest of the pieces to be developed. But,
that does not have to preclude a reasonable work-around in the

It sounds to me like an extension of the explicit id architecture so
that it could handle additional meta-data, plus a hook here and there,
might be a useful solution to a number of problems, including the
encoding problem. Perfect? No. But it might be a pragmatic solution
until the rest of the pieces are ready.
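
To make the idea concrete, here is a minimal sketch in Python of
per-file meta-data kept in a sidecar file next to the source file,
loosely modeled on how explicit ids live in .arch-ids directories.
The .arch-meta directory name, the JSON format, and every function
name are hypothetical; none of this is an existing tla feature. Like
explicit ids, it would break if a normal application copied or moved
the file without its sidecar.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical directory name, chosen to resemble tla's .arch-ids.
META_DIR = ".arch-meta"


def meta_path(file_path: Path) -> Path:
    """Return the sidecar path that would hold meta-data for file_path."""
    return file_path.parent / META_DIR / (file_path.name + ".meta")


def set_meta(file_path: Path, key: str, value: str) -> None:
    """Record one key/value pair, creating the sidecar if needed."""
    sidecar = meta_path(file_path)
    sidecar.parent.mkdir(exist_ok=True)
    data = json.loads(sidecar.read_text()) if sidecar.exists() else {}
    data[key] = value
    sidecar.write_text(json.dumps(data, sort_keys=True, indent=2))


def get_meta(file_path: Path, key: str, default=None):
    """Look up one key; return default if the sidecar or key is missing."""
    sidecar = meta_path(file_path)
    if not sidecar.exists():
        return default
    return json.loads(sidecar.read_text()).get(key, default)


def read_text_with_meta(file_path: Path, fallback: str = "utf-8") -> str:
    """Decode a file using its recorded encoding, or fallback if none."""
    encoding = get_meta(file_path, "encoding", fallback)
    return file_path.read_bytes().decode(encoding)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "README.de"
        path.write_bytes("Grüße".encode("iso-8859-1"))
        set_meta(path, "encoding", "iso-8859-1")
        print(read_text_with_meta(path))
```

A hook that runs a diff or merge tool could call something like
read_text_with_meta() to learn the encoding instead of guessing it.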

And, now that I think about it, it might actually be pretty neat if I
could attach arbitrary meta data to any file. For example, that might
be a place to indicate that 'file b' was derived from 'file a', a
feature people have requested numerous times. Or, as tla builds the
tree, every time it patches a file, it could optionally add an entry to
the file's meta data indicating the patchset, so that you could
quickly find all the patches that touch a specific file (though, I
suspect revision libraries are a better way to do that). It might help
with porting tla to filesystems that have issues like
case-insensitivity.  The point is, solving the problem does not *have*
to be a hack for some *rare* circumstances. Instead, it may be an
insight into some ways that tla can be made more flexible.

We should be asking ourselves: how does this problem fit into the
larger context of things, and how can solving this problem make tla
better for everyone?

Jeremy Shaw.
