Re: [Gnu-arch-users] Encoding handling proposal

From: Marcus Sundman
Subject: Re: [Gnu-arch-users] Encoding handling proposal
Date: Sun, 29 Aug 2004 22:14:38 +0300
User-agent: KMail/1.7

On Sunday 29 August 2004 21:13, John Meinel wrote:
> Marcus Sundman wrote:
> > D) There should be a filter/plugin architecture to enable a transcoding
> > of files on input and output based on their content-types and user
> > settings and user-provided parameters.
> >
> > E) Utilities such as "diff", "merge" and "annotate" (aka "blame")
> > should be provided by plugins mapped to content-types.
> You definitely have some interesting proposals here. One thing to watch
> out for, though... Once we stop having one type of diff (say an xdelta
> diff for binary files, and another type for xml files, etc.), how do we
> make sure (or at least help) everyone has all of these programs?

We don't. It's entirely optional. You'd only have to provide suitable 
fallbacks. E.g., all text/* files could have the standard line based diff 
as fallback and the rest could have a binary diff as fallback.
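
To make the fallback idea concrete, here is a minimal sketch in Python. Everything in it (the PLUGINS registry, pick_diff, line_diff, binary_diff) is a hypothetical name made up for illustration; nothing like this exists in tla today. It only shows the lookup order: a plugin registered for the exact content-type first, then the line-based diff for text/*, then a binary diff for everything else.

```python
import difflib
from typing import Callable, Dict

# A diff tool takes two byte blobs and returns a human-readable report.
DiffFn = Callable[[bytes, bytes], str]


def line_diff(old: bytes, new: bytes) -> str:
    """Fallback for text/*: an ordinary line-based unified diff."""
    old_lines = old.decode("utf-8", errors="replace").splitlines(keepends=True)
    new_lines = new.decode("utf-8", errors="replace").splitlines(keepends=True)
    return "".join(difflib.unified_diff(old_lines, new_lines, "old", "new"))


def binary_diff(old: bytes, new: bytes) -> str:
    """Fallback for everything else: treat the files as byte blobs."""
    if old == new:
        return ""
    return f"Binary files differ ({len(old)} -> {len(new)} bytes)\n"


# Hypothetical registry: content-type -> optional, locally installed diff plugin.
PLUGINS: Dict[str, DiffFn] = {}


def pick_diff(content_type: str) -> DiffFn:
    """Use a specific plugin if one is installed, otherwise fall back."""
    if content_type in PLUGINS:
        return PLUGINS[content_type]
    if content_type.startswith("text/"):
        return line_diff
    return binary_diff
```

With an empty registry, pick_diff("application/vnd.sun.xml.writer") simply returns binary_diff, so a missing plugin only costs you a nicer diff display.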

> If I *don't* have the xmldiff/xmlpatch program, then it is likely that I
> won't be able to checkout a project that used them. As I would doubt the
> format for the .patch file will be the same as diff/patch. Also, what
> about versions, is xmldiff 1.0 compatible with xmlpatch 2.0? (1 year ago
> I checked it in, but now I'm getting it back).

I don't see a reason why the same patch system couldn't be used for all file 
types, regardless of what diff tool was used. I think supporting pluggable 
patch file formats would be a very bad idea, precisely because of the 
issues you've raised.

> Will there be "blessed" diff/transcode programs? Will it only be the
> ones that are bundled inside of tla?

What do you mean by "blessed"? I don't see a need for treating different 
implementations differently. However, if tla comes bundled with some 
filters I guess people won't bother writing alternatives to any of _those_ 
unless they are really bad.

> I'm not sure about your statement that files are typically stored in the
> "local" encoding. The editors I use (gvim, scintilla) allow me to
> specify the encoding. (Admittedly it's mostly latin-1, or utf-8, or
> utf-16). So in that situation, when I write out a file, if I try to check
> it into arch, then I have to worry about telling arch *not* to use the
> local encoding.

It's the "local" encoding unless specified otherwise. That's why you'd have 
to have per-project, per-user, per-module and/or per-filetype defaults. Or 
you could simply default to "Auto-Filter: false" if you want.
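
As a rough sketch of how such layered defaults could be resolved (Python; the function name and the idea of passing the layers in precedence order are my assumptions, and I'm deliberately not fixing which layer should win over which):

```python
import locale
from typing import Optional


def resolve_encoding(explicit: Optional[str], *default_layers: Optional[str]) -> str:
    """Pick the encoding for a file.

    `explicit` is an encoding specified for the file itself. `default_layers`
    are the configured defaults (per-filetype, per-module, per-user,
    per-project, ...) in whatever precedence order is chosen; the order is an
    assumption left to the caller. If nothing is set, use the local encoding.
    """
    if explicit:
        return explicit
    for layer in default_layers:
        if layer:
            return layer
    # Nothing specified anywhere: it's the "local" encoding.
    return locale.getpreferredencoding(False)
```

For example, resolve_encoding(None, None, "utf-8") picks the first default that is actually set, and resolve_encoding(None) falls through to the local encoding.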

> I know one of your reasons for wanting encoding to be included is so you
> can keep the "official" repository in the official encoding. One way to
> do that is to put a person in there. So people are allowed to work on
> any repository they want, but only a few people commit to the "official"
> one, and they are all knowledgeable about watching out for file encoding
> issues.

And it'd be great to be able to put a filter there checking that the UTF-8 
files that you are committing actually conform to the UTF-8 specs.
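
Such a check could be very small. A sketch in Python, assuming a hypothetical commit-time hook that gets the file contents as bytes (the function name check_utf8 is made up for illustration; the strict decoder is Python's own, not anything in tla):

```python
from typing import Optional


def check_utf8(data: bytes) -> Optional[str]:
    """Return None if `data` is well-formed UTF-8, else an error message."""
    try:
        # Strict decoding rejects malformed sequences, overlong forms
        # and encoded surrogates.
        data.decode("utf-8")
    except UnicodeDecodeError as err:
        return f"invalid UTF-8 at byte offset {err.start}: {err.reason}"
    return None
```

A filter on the "official" archive would refuse the commit whenever check_utf8 returns a message, e.g. for a latin-1 file that was mislabelled as UTF-8.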

> I think Tom designed hackerlab such that you deal with characters, and
> never know how many bytes/codepoints/etc is used underneath.

Yeah, hackerlab sounds *really* good.
Any idea when those nowhere-yet pages will be there? ;-)

> > E) E.g. if two files with the content-type
> > "application/vnd.sun.xml.writer" are diffed the system should use a
> > diff plugin that knows how to interpret Writer
> > documents. If no such plugin is found it defaults to the standard diff
> > which regards the files as byte blobs.
> This is where the problem with plugins exists. On *my* machine, I have
> the application/vnd.sun.xml.writer diff program. You don't have it on
> *your* machine. You can no longer read my archive.

Sure you can. The only difference is that you can diff the files in a decent 
diff program whereas I would just get my standard binary diff tool if I 
tried to diff those OOo documents.

> > Notice that there is no distinction between "text files" and "binary
> > files". The same system that converts between different text encodings
> > might just as well be used to convert between different "raw" audio
> > formats. Just add the appropriate plugin/filter and you're set.
> Interesting idea, but I have to wonder if it is what you would really
> want.

Yes, because that distinction isn't really there in the first place. They 
are all just binary files in different formats. Some types of files are 
easier to do line based diffs on and some are harder. None is impossible. 
(Now, all you people who are about to object to this point furiously, 
please think about it a bit first. So far I've seen many people object to 
this, but AFAIK none who hasn't subsequently changed their mind.)

- Marcus Sundman
