Re: [Gnu-arch-users] Re: How does arch/tla handle encodings?

From: Marcus Sundman
Subject: Re: [Gnu-arch-users] Re: How does arch/tla handle encodings?
Date: Sat, 28 Aug 2004 04:18:54 +0300
User-agent: KMail/1.7

On Saturday 28 August 2004 03:35, Robin Green wrote:
> On Sat, Aug 28, 2004 at 01:56:20AM +0300, Marcus Sundman wrote:
> > However, for this problem to go away completely it needs
> > to be fixed in _all_ systems, including arch. When a piece of text is
> > sent around as bytes _no_ link in the chain may throw away the encoding
> > metadata.
> If you want that property

Umm.. what property? That text files remain text files instead of turning 
into raw byte blobs? Yeah, I really do want that property.

> isn't the most sensible solution to put the encoding metadata _inside_ the
> file, like xml does?

Purists generally hate XML's solution. Theoretically speaking it's 
wrong because you would have to interpret at least the beginning of the 
file to get information on how to interpret the file, thus creating a 
circular dependency. Practically speaking it's wrong since it 
severely limits what encodings can be used, since the file would have to 
contain a byte sequence equivalent to a string like '<?xml version="1.0" 
encoding="utf-8"?>' encoded in ANSI X3.4-1986 (ASCII).
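
To make the practical objection concrete, here is a minimal Python sketch. It assumes a naive reader that only recognizes the declaration when it appears in its ASCII byte form; the helper names are hypothetical, and real XML parsers use more elaborate detection than this. It shows that the same declaration text becomes a different byte sequence under encodings that are not ASCII-compatible, so such a reader cannot find it.

```python
def declaration(encoding_name: str) -> str:
    """Build the text of an XML declaration naming the given encoding."""
    return f'<?xml version="1.0" encoding="{encoding_name}"?>'


def starts_with_ascii_declaration(data: bytes) -> bool:
    """Naive check (an assumption of this sketch): look for the
    ASCII bytes of '<?xml' at the very start of the file."""
    return data.startswith(b"<?xml")


# Encode the declaration with three codecs from the Python standard library.
samples = {
    "utf-8": declaration("utf-8").encode("utf-8"),          # ASCII-compatible
    "utf-16-le": declaration("utf-16").encode("utf-16-le"),  # 2 bytes per char
    "cp037": declaration("cp037").encode("cp037"),           # EBCDIC variant
}

for codec, data in samples.items():
    found = starts_with_ascii_declaration(data)
    print(f"{codec:10} first bytes={data[:5].hex()} ascii-declaration={found}")
```

Only the ASCII-compatible encoding passes the check; for the others the reader would already need to know the encoding before it could read the line that announces it.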

That said, I'm personally not completely against this approach, but I 
haven't given it much thought. However, only a few formats (anything besides 
sgml?) support this system. E.g., if you want a text file to contain only 
the string "hello world" then there is no way for you to use this approach.

> Transcoding need not be a goal of a revision control system, since you
> can just transcode files to and from the working directory with a
> separate utility.

I have never said that transcoding has to be done by a CMS/RCS. However, the 
system has to support this, at least by not throwing away the encoding 
metadata.

After giving it a lot of thought (quite a while ago), I concluded that I 
would personally prefer a general filter plug-in system in the CMS/RCS. 
This way the logic can be standardized and centralized, moving the burden 
(and the responsibility) of setting up the filters from each developer to 
the project leader. This way you also won't have issues with different 
people using different platforms and/or clients. (Anyhow, this is only my 
personal opinion, and I wouldn't want to impose it on others.)
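
A rough Python sketch of what such a centralized filter setup could look like follows. Everything in it is hypothetical: the `Filter` class, the `PROJECT_FILTERS` list, and `apply_filters` are invented for illustration and are not part of arch/tla or any other existing system. The idea it illustrates is that the filter table is defined once for the project, and every client applies the same table on commit and on checkout.

```python
import fnmatch
from dataclasses import dataclass
from typing import Callable, List

Transform = Callable[[bytes], bytes]


@dataclass
class Filter:
    """Hypothetical filter: a path pattern plus one transform per direction."""
    pattern: str
    to_repo: Transform       # applied when committing
    to_worktree: Transform   # applied when checking out


def transcode(src: str, dst: str) -> Transform:
    """Return a transform that re-encodes text from codec src to codec dst."""
    return lambda data: data.decode(src).encode(dst)


# Assumed project-wide configuration, maintained by the project leader:
# *.txt files are stored as UTF-8 and checked out as Latin-1.
PROJECT_FILTERS: List[Filter] = [
    Filter("*.txt", transcode("latin-1", "utf-8"), transcode("utf-8", "latin-1")),
]


def apply_filters(path: str, data: bytes, direction: str) -> bytes:
    """Run every filter whose pattern matches path, in the given direction
    ('to_repo' or 'to_worktree'). Unmatched files pass through unchanged."""
    if direction not in ("to_repo", "to_worktree"):
        raise ValueError(f"unknown direction: {direction}")
    for flt in PROJECT_FILTERS:
        if fnmatch.fnmatchcase(path, flt.pattern):
            data = getattr(flt, direction)(data)
    return data


working_copy = "caf\u00e9".encode("latin-1")
stored = apply_filters("notes.txt", working_copy, "to_repo")
restored = apply_filters("notes.txt", stored, "to_worktree")
print(stored, restored)
```

Because the table lives with the project rather than with each developer, the same bytes end up in the repository regardless of which platform or client made the commit.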

- Marcus Sundman
