Re: 8859 unification and Emacs' ChangeLog files

From: Kai Großjohann
Subject: Re: 8859 unification and Emacs' ChangeLog files
Date: Thu, 03 Apr 2003 18:50:16 +0200
User-agent: Gnus/5.090018 (Oort Gnus v0.18) Emacs/21.3.50 (gnu/linux)

Kenichi Handa <address@hidden> writes:

> In article <address@hidden>, address@hidden (Kai Großjohann) writes:
>> I've learned the hard way that unify-8859-on-decoding-mode mangles
>> Emacs' ChangeLog files.  Now Simon (see Cc) has shown me that it is
>> possible to turn unification off for certain encodings.
>> Do you think it might be good to turn it off for iso-2022-7bit?
> [...]
>> Simon's Lisp was: (coding-system-put 'iso-2022-7bit
>> 'translation-table-for-decode (make-translation-table))
> Yes, it works because it overrides the translation table
> created by unify-8859-on-decoding-mode.
> But, it results in that a Latin-2 char read by iso-2022-7bit
> is different from what read by iso-latin-2.  I don't think
> such a change is a good idea.

Well, without unification, a character such as ä read from a Latin-1
file is different from the same character read from a Latin-2 file.

>> Maybe that would make it possible to turn unify-8859-on-decoding-mode
>> on by default?
> It will stop unibyte<->multibyte automatic conversion in any
> single byte lang. env. (e.g. Latin-X, Greek) except for
> Latin-1.

What do you mean by "it"?  Do you mean unify-8859-on-decoding-mode
generally, or do you mean unify-8859-on-decoding-mode with Simon's
Lisp applied?

> I think unify-8859-on-decoding-mode is still only for those
> people who knows the consequence of the command well.

I know that unify-8859-on-decoding-mode is harmful for Emacs'
ChangeLog files, because they distinguish between the various
iso-8859 charsets.

But if all Emacs developers enable unify-8859-on-decoding-mode, then
this is not a problem anymore.  Or the ChangeLog files could be
stored as UTF-8.

Of course, there might be other files that contain characters from
several iso-8859 charsets.  Is it right to say that these files must
be encoded in iso-2022 or in emacs-mule, because no other encodings
distinguish between Latin-1 ä and Latin-2 ä, say?
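
To illustrate what "no other encodings distinguish" means, here is a
minimal sketch in Python rather than Emacs Lisp.  It only shows the
Unicode view of the three charsets and is an assumption-laden stand-in:
it says nothing about how Emacs represents these characters internally.
The byte 0xE4 is ä in Latin-1, Latin-2 and Latin-9, and once decoded to
Unicode the charset it came from can no longer be recovered.

```python
# Byte 0xE4 is "ä" in ISO-8859-1, ISO-8859-2 and ISO-8859-15.
raw = b"\xe4"
latin1 = raw.decode("iso8859-1")
latin2 = raw.decode("iso8859-2")
latin9 = raw.decode("iso8859-15")

# Unicode has one code point for this letter (U+00E4), so all three
# decoded strings compare equal: the source charset is not recorded.
same_char = latin1 == latin2 == latin9 == "\u00e4"

# Re-encoding as UTF-8 therefore yields a single byte sequence,
# whichever 8859 charset the character was originally read from.
utf8_forms = {s.encode("utf-8") for s in (latin1, latin2, latin9)}

print(same_char, utf8_forms)
```

A file stored as UTF-8 would thus carry one ä only, which is the
unified behaviour; keeping the two ä's apart needs an encoding that
records the charset, as iso-2022 does with its escape sequences.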

If I understand correctly, Emacs 22 will automatically unify all
iso-8859 charsets, so the same ChangeLog problem will occur there,
right?  Or is Emacs 22 going to enable distinguishing between Latin-1
ä and Latin-2 ä for iso-2022 files somehow?

It seems that Europeans are really interested in more 8859
unification.  Now that unify-8859-on-encoding-mode is on by default,
they are somewhat happier.  But even with this enabled, people in a
Latin-9 locale can't search UTF-8 files well: when they hit the ä key,
they are searching for a Latin-9 ä whereas there is a Latin-1 ä in the
buffer.  Or take me, personally: I use the german-prefix input method
which produces Latin-1 characters, but I'm running in a Latin-9 locale
so that the buffers contain the `wrong' characters.

At the moment, I tell them that I don't turn on
unify-8859-on-decoding-mode because then I can't edit the Emacs
ChangeLog files anymore.

If anything in this message sounds weird, that might be because I
fail to fully grok the problem.  I apologize.  I'm looking at it from
a European point of view, so I might miss important points.
A preposition is not a good thing to end a sentence with.
