emacs-bug-tracker
[debbugs-tracker] bug#4051: closed (Character Soup)


From: GNU bug Tracking System
Subject: [debbugs-tracker] bug#4051: closed (Character Soup)
Date: Thu, 11 Feb 2016 20:40:03 +0000

Your message dated Thu, 11 Feb 2016 20:38:58 +0000
with message-id <address@hidden>
and subject line Re: bug#4051: Character Soup
has caused the debbugs.gnu.org bug report #4051,
regarding Character Soup
to be marked as done.

(If you believe you have received this mail in error, please contact
address@hidden)


-- 
4051: http://debbugs.gnu.org/cgi/bugreport.cgi?bug=4051
GNU Bug Tracking System
Contact address@hidden with problems
--- Begin Message ---
Subject: Character Soup
Date: Thu, 06 Aug 2009 00:09:20 +0300
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/23.1.50 (x86_64-pc-linux-gnu)

The coding system for the buffer with the Latin-1 character á in
the Cyrillic KOI8 language environment is detected as Chinese gb2312.
How funny!

I noticed this while reporting bug#4037, which message.el sent with
charset=gb2312.  Mail readers display that message incorrectly due to
the ugly fonts associated with gb2312 (this is a separate problem).

I think it would be more natural to encode this as Latin-1 (in this
particular case) or, more generally, as UTF-8, the universal encoding
designed specifically for mixing different scripts.

The easiest way to reproduce this problem:

  1. emacs -Q
  2. C-x RET l Cyrillic-KOI8
  3. C-x 8 ' a
  4. C-x C-s
  5. File to save in: /tmp/file

After that the prompt says:

  Select coding system (default chinese-iso-8bit): 

and the buffer `*Warning*' contains:

  These default coding systems were tried to encode text
  in the buffer `file':
    (cyrillic-koi8-unix (192 . 225))
  However, each of them encountered characters it couldn't encode:
    cyrillic-koi8-unix cannot encode these: á

  Click on a character (or switch to this window by `C-x o'
  and select the characters by RET) to jump to the place it appears,
  where `C-u C-x =' will give information about it.

  Select one of the safe coding systems listed below,
  or cancel the writing with C-g and edit the buffer
     to remove or modify the problematic characters,
  or specify any other coding system (and risk losing
     the problematic characters).

    gb2312 utf-8 euc-jis-2004 euc-jp windows-1258 viscii
    iso-2022-jp-2004 cp862 iso-8859-16 hp-roman8 next mac-roman cp437
    cp865 cp861 cp860 cp858 cp857 cp852 cp850 windows-1254 windows-1252
    windows-1250 iso-8859-15 iso-8859-14 iso-8859-10 iso-8859-9
    iso-8859-4 iso-8859-3 iso-8859-2 gb18030 gbk hz-gb-2312 utf-7
    iso-8859-1 utf-16 utf-16be-with-signature utf-16le-with-signature
    utf-16be utf-16le iso-2022-7bit utf-8-auto utf-8-with-signature
    eucjp-ms vietnamese-tcvn vietnamese-viqr vietnamese-vscii
    japanese-shift-jis-2004 japanese-iso-7bit-1978-irv ibm1047
    utf-7-imap utf-8-emacs

I already figured out how to fix this problem for message.el using:

  (setq mm-coding-system-priorities (cons 'utf-8 mm-coding-system-priorities))

But as shown by the test case above, this is a general problem.

-- 
Juri Linkov
http://www.jurta.org/emacs/


--- End Message ---
--- Begin Message ---
Subject: Re: bug#4051: Character Soup
Date: Thu, 11 Feb 2016 20:38:58 +0000
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/25.0.90 (darwin)

Juri Linkov <address@hidden> writes:

> The coding system for the buffer with the Latin-1 character á in
> the Cyrillic KOI8 language environment is detected as Chinese gb2312.
> How funny!

Hi, sorry nobody's got back to you about this before now. It seems that
gb2312 isn't chosen because the text is detected as Chinese, but simply
because it is the first encoding Emacs finds that can encode the buffer
correctly.
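
As a rough illustration of that "first one that works" behaviour, here
is a small Python sketch. It is only an analogy using Python's own
codecs, not Emacs's actual selection code. The `first_capable` helper
and the candidate order are assumptions made up for this example; the
real order inside Emacs depends on its coding system priorities.

```python
def first_capable(text, candidates):
    """Return the first codec in `candidates` that can encode `text`, else None."""
    for name in candidates:
        try:
            text.encode(name)
        except UnicodeEncodeError:
            # This codec has no mapping for some character; try the next one.
            continue
        return name
    return None


# Hypothetical candidate order, chosen only to mirror the report:
# the language environment's codec first, then whatever comes next.
candidates = ["koi8_r", "gb2312", "utf-8", "latin-1"]

# KOI8-R has no mapping for "á", so the search falls through to the next entry.
default_pick = first_capable("á", candidates)

# Putting utf-8 at the front, in the spirit of the mm-coding-system-priorities
# workaround from the original report, changes which codec wins.
utf8_first_pick = first_capable("á", ["utf-8"] + candidates)

print(default_pick, utf8_first_pick)
```

With Python's bundled codecs, the first pick should be gb2312 (it has
a mapping for á among its pinyin letters) and the second utf-8, so the
result depends on the order of the candidates rather than on any guess
about the language of the text.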

There's some more discussion of this here:

https://debbugs.gnu.org/cgi/bugreport.cgi?bug=22436

I'm going to close this bug report, but please reopen it if you're
unhappy.
-- 
Alan Third


--- End Message ---
