Re: [Gnu-arch-users] How does arch/tla handle encodings?

From: John Meinel
Subject: Re: [Gnu-arch-users] How does arch/tla handle encodings?
Date: Fri, 27 Aug 2004 11:59:46 -0500
User-agent: Mozilla Thunderbird 0.7 (Windows/20040616)

In most cases, I think it just depends on what the other tools do. Most
of the time, arch doesn't really care; it just relies on diff to decide
how to treat a file. I'm guessing that standard GNU diff treats a UTF-16
file as binary (since for English text every other byte is null).
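
To illustrate the guess above, here is a minimal Python sketch of a null-byte heuristic. The function name `looks_binary` and the 4096-byte sample size are made up for this example, and it only approximates the kind of check a diff tool might perform; it is not GNU diff's actual code.

```python
def looks_binary(data: bytes, sample_size: int = 4096) -> bool:
    """Hypothetical heuristic: call the data binary if the leading
    sample contains a NUL byte. An approximation for illustration only."""
    return b"\x00" in data[:sample_size]


text = "plain English text\n"
utf8_bytes = text.encode("utf-8")       # no NUL bytes for ASCII-range text
utf16_bytes = text.encode("utf-16-le")  # every other byte is NUL here

print(looks_binary(utf8_bytes))
print(looks_binary(utf16_bytes))
```

Under this heuristic the UTF-8 copy passes as text and the UTF-16 copy is flagged as binary, which matches the behavior I am guessing at.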

You can argue for having a better diff tool, but I don't think the
revision control system itself should care too much about what is in the
files.

I'm thinking that you don't really want tla to say "Oh, you're using
UTF-8, let me translate all of the UTF-16 files into your native encoding."

Because more than likely you need UTF-16 for whatever is using the code,
and translating is *not* what you want.

There would be some niceties. For example, if tla stored everything in
one format (like UTF-8), then when someone changed a file's encoding
locally and checked it in, tla could re-encode the file before diffing,
and your diff would not include a lot of spurious changes caused only by
the encoding. (Kind of like what CVS does when converting line endings
on win32.)

If you want that behavior, you can write an arch hook that converts all
text files into UTF-8 encoding before they get checked in.
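
As a rough sketch of the conversion step such a hook could run, assuming the files to convert carry a UTF-16 byte order mark: the function name `normalize_to_utf8` is hypothetical, and how the script gets wired into tla's hook mechanism is deliberately not shown, since that depends on your setup.

```python
import codecs


def normalize_to_utf8(data: bytes) -> bytes:
    """Hypothetical helper: re-encode BOM-marked UTF-16 data as UTF-8.

    Anything without a UTF-16 BOM is returned unchanged, so files that
    are already UTF-8 (or genuinely binary) are left alone.
    """
    # The UTF-32 LE BOM begins with the UTF-16 LE BOM, so rule it out first.
    if data.startswith((codecs.BOM_UTF32_LE, codecs.BOM_UTF32_BE)):
        return data
    if data.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        # The "utf-16" codec reads the BOM to pick the byte order and drops it.
        return data.decode("utf-16").encode("utf-8")
    return data


original = "h\u00e9llo\n"
utf16_copy = original.encode("utf-16")  # Python prepends a BOM here
print(normalize_to_utf8(utf16_copy) == original.encode("utf-8"))
```

UTF-16 files without a BOM would slip through this untouched, so a real hook would need some other way to know which files to convert.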

But I don't think that should be the stock tla behavior.


PS> Does GNU diff handle UTF-8, or is it ASCII/some encoding only?

Vaclav Haisman wrote:

| I agree with Marcus. File's encoding is imho metadata as much as
| are. For example how does tla/arch treat UTF-16 files? As text or as
| binary files? The ability to specify an encoding should be present.
| Vaclav Haisman
