emacs-pretest-bug

Re: `.newsrc.eld' saves chinese group name in wrong coding


From: Stefan Monnier
Subject: Re: `.newsrc.eld' saves chinese group name in wrong coding
Date: Tue, 24 Oct 2006 14:14:05 -0400
User-agent: Gnus/5.11 (Gnus v5.11) Emacs/22.0.50 (gnu/linux)

>> It works correctly, provided the characters in that string can be
>> expressed in the unibyte buffer.

>     But which characters can be expressed is poorly specified.  E.g. Tell me
>     which chars can be expressed in a unibyte buffer in a BIG5 locale?

> Mentioning the locale is somewhat of a red herring, since what controls
> this conversion is (effectively) nonascii-insert-offset.

The nonascii-insert-offset and nonascii-translation-table are AFAIK
initialized differently depending on the locale (and/or language
environment), and users typically don't fiddle with them directly but
via their locale setting instead.

> Mentioning BIG5 is a second red herring.  You can't represent Chinese
> in 8-bit characters, but that is not Emacs' fault.

Code which implicitly converts text from multibyte to unibyte (and vice
versa), using nonascii-*, will presumably be used in all kinds of locales,
including BIG5 ones.  So knowing what happens in this case is
still relevant.

> Do you think that we need to document nonascii-insert-offset more
> prominently?  If so, where else should we talk about it?

No, I think we should kill it instead and declare erroneous any code which
tries to use it.  It made sense in Emacs-20 when the multibyte support was
weaker, but nowadays it just encourages sloppy code which breaks down in
different language environments.

>> If people generally agree it would be better to signal an error,
>> we could do that.  However, that would cause trouble trying to use
>> M-y to move past multibyte entries in the kill ring to reach the
>> unibyte entry you really want.

>     When the insertion is a user-level operation, the elisp code should make
>     sure to manually do the encoding/decoding, using e.g. the default file
>     coding-system.

> I don't understand -- could you be more specific?

C-y/M-y use `insert' somewhere internally.  My suggestion is to make
`insert' signal an error when asked to insert a multibyte string in a
unibyte buffer.  This doesn't mean that C-y/M-y should propagate
this error.
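
To make the suggestion concrete, here is a rough sketch in Emacs Lisp.  The
names `my-strict-insert', `my-yank-string', `my-default-coding' and the error
symbol `my-multibyte-in-unibyte' are made up for illustration; none of them
exist in Emacs, and the real check would presumably live in the C code of
`insert'.  Falling back to utf-8 when no default file coding-system is set
is also just an assumption.

```elisp
;; Hypothetical error symbol, defined the old-fashioned way so the
;; sketch does not depend on `define-error'.
(put 'my-multibyte-in-unibyte 'error-conditions
     '(my-multibyte-in-unibyte error))
(put 'my-multibyte-in-unibyte 'error-message
     "Multibyte string inserted in unibyte buffer")

(defun my-strict-insert (string)
  "Insert STRING at point, refusing any implicit multibyte->unibyte conversion.
Hypothetical stand-in for an `insert' that signals instead of converting."
  (if (and (multibyte-string-p string)
           (not enable-multibyte-characters))
      (signal 'my-multibyte-in-unibyte (list string))
    (insert string)))

(defun my-default-coding ()
  "Return the default file coding-system, or utf-8 if none is set.
The utf-8 fallback is an assumption made for this sketch."
  (or (default-value 'buffer-file-coding-system) 'utf-8))

(defun my-yank-string (string)
  "User-level insertion of STRING.
Catch the error from `my-strict-insert' and encode STRING explicitly,
rather than propagating the error to the user."
  (condition-case nil
      (my-strict-insert string)
    (my-multibyte-in-unibyte
     ;; `encode-coding-string' returns a unibyte string, which can be
     ;; inserted in a unibyte buffer without any implicit conversion.
     (insert (encode-coding-string string (my-default-coding))))))
```

The point is only that the low-level primitive stays strict, while the
user-level command decides explicitly which coding-system to use.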


        Stefan



