Re: [Gnu-arch-users] Re: Encoding handling proposal

From: Marcus Sundman
Subject: Re: [Gnu-arch-users] Re: Encoding handling proposal
Date: Wed, 1 Sep 2004 13:46:12 +0300
User-agent: KMail/1.7

On Monday 30 August 2004 20:34, Stefan Monnier wrote:
> > B) "Content-Type" should be a mandatory metadata string attribute.
> In keeping with the "enforce naming convention" policy of Arch, I guess
> that we could just use a mime.types file to map extensions to content
> types.

Such a default might be acceptable, but it cannot be the only mechanism. Two
files with the same extension might have different encodings, e.g.
"text/plain; charset=utf-8" and "text/plain; charset=iso-8859-1".
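A minimal sketch of what I mean, assuming a per-file override table sits on top
of the extension-based default. The `overrides` table, the `content_type`
function and the default charset are all hypothetical names for illustration,
not anything Arch provides; only the `mimetypes` module is real (Python
standard library).

```python
import mimetypes

# Hypothetical per-file metadata; a real system would store this with the
# file's other inventory metadata rather than in a dict.
overrides = {
    "docs/notes-old.txt": "text/plain; charset=iso-8859-1",
}

def content_type(path, overrides, default_charset="utf-8"):
    """Return the Content-Type for path: explicit override first,
    then the extension-based guess, then a generic binary type."""
    if path in overrides:
        return overrides[path]
    guessed, _encoding = mimetypes.guess_type(path)
    if guessed is None:
        return "application/octet-stream"
    if guessed.startswith("text/"):
        # The extension alone cannot tell us the charset, so this is
        # only an assumed default.
        return f"{guessed}; charset={default_charset}"
    return guessed
```

The point is only that the extension lookup is the last resort for text files,
since it cannot distinguish the two files mentioned above.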

> The various type-specific diff algorithms are only ways to optimize
> changeset size and help merging, but they should all work correctly on
> arbitrary binary files.

There are other reasons to use type-specific diff tools, such as being able
to sensibly compare two OOo Writer documents, two audio files, or two
images.
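As a rough sketch, assuming diff tools are looked up by media type with a
generic byte comparison as the fallback for everything else. The registry
`DIFF_PLUGINS` and the functions `text_diff`, `binary_diff` and `diff_for` are
made-up names, and a real binary fallback would produce an actual delta rather
than a one-line notice.

```python
import difflib

def text_diff(old: bytes, new: bytes, charset: str = "utf-8") -> str:
    """Line-based unified diff for text content."""
    a = old.decode(charset).splitlines(keepends=True)
    b = new.decode(charset).splitlines(keepends=True)
    return "".join(difflib.unified_diff(a, b, "old", "new"))

def binary_diff(old: bytes, new: bytes) -> str:
    """Fallback that works on arbitrary bytes (placeholder for a real delta)."""
    return "" if old == new else "binary files differ\n"

# Hypothetical registry mapping media types to diff plugins.
DIFF_PLUGINS = {
    "text/plain": text_diff,
}

def diff_for(content_type: str):
    """Pick a diff plugin by media type, ignoring parameters like charset."""
    media_type = content_type.split(";", 1)[0].strip().lower()
    return DIFF_PLUGINS.get(media_type, binary_diff)
```

A plugin for Writer documents or images would simply be another entry in the
registry; anything without an entry still gets the generic fallback.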

> > D) There should be a filter/plugin architecture to enable a transcoding
> > of files on input and output based on their content-types and user
> > settings and user-provided parameters.
> How is a utf-8 going to be transcoded into latin-1 without loss?

1) If all characters in the file belong to the "latin-1" character
repertoire then there is no problem.
2) If there is a way to escape characters, such as in Java source code, then
that may be used.
3) If it's OK to use the utf-8 file instead of a "latin-1" file then display 
a warning and do that, otherwise raise an error.
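The three cases above could be sketched roughly like this. Everything here is
an assumption for illustration: the function names, the `escape` and
`allow_utf8_fallback` parameters, and the choice of Java-style \uXXXX escapes
as the example escaping scheme. Note that case 2 is only lossless if whatever
reads the file understands the escapes.

```python
import warnings

class TranscodeError(Exception):
    """Raised when no acceptable output can be produced."""

def java_escape(text: str) -> str:
    """Replace characters outside latin-1 with Java-style \\uXXXX escapes
    (characters outside the BMP become a surrogate pair)."""
    out = []
    for ch in text:
        if ord(ch) < 256:
            out.append(ch)
        else:
            units = ch.encode("utf-16-be")
            for i in range(0, len(units), 2):
                out.append("\\u%04x" % int.from_bytes(units[i:i + 2], "big"))
    return "".join(out)

def utf8_to_latin1(data: bytes, escape=None, allow_utf8_fallback=False):
    """Return (bytes, charset) following the three cases in order."""
    text = data.decode("utf-8")
    try:
        # Case 1: every character is in the latin-1 repertoire.
        return text.encode("latin-1"), "latin-1"
    except UnicodeEncodeError:
        pass
    if escape is not None:
        # Case 2: the file format has an escape mechanism.
        return escape(text).encode("latin-1"), "latin-1"
    if allow_utf8_fallback:
        # Case 3: hand out the utf-8 file instead, with a warning.
        warnings.warn("file is not representable in latin-1; keeping utf-8")
        return data, "utf-8"
    raise TranscodeError("cannot transcode to latin-1 without loss")
```

For example, `utf8_to_latin1("€".encode("utf-8"), escape=java_escape)` takes
the second branch, while the same call without `escape` and without the
fallback raises the error.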

> > E) Utilities such as "diff", "merge" and "annotate" (aka "blame")
> > should be provided by plugins mapped to content-types.
> As mentioned by someone else, such type-specific algorithms (at least
> when used for in-archive-changesets) should be "standard" within the
> user community.

I don't see a need for that, as long as the patch file format is "standard". 
(Having a standard patch file format shouldn't prevent users from also 
using some non-standard patch format for some files, if they actually need 
to. The price of doing that is to not be able to collaborate easily with 
other projects. That is sometimes a price you're willing to pay, and 
sometimes not.)

- Marcus Sundman
