Re: An iso-8859-6 cannot be saved

From: Peter Dyballa
Subject: Re: An iso-8859-6 cannot be saved
Date: Fri, 22 Sep 2006 01:02:37 +0200

On 21.09.2006 at 04:27, Kenichi Handa wrote:

In article <address@hidden>, Peter Dyballa <address@hidden> writes:
My test file starts with: ;;; -*- coding: iso-8859-6; -*-

The mode-line starts with -6:

GNU Emacs 22.0.50 was started with -Q

When I try to save it I get in the minibuffer:

Selected encoding mule-utf-8-unix disagrees with iso-8859-6-unix
specified by file contents.  Really save (else edit coding cookies
and try again)? (yes or no)

Then it's saved in UTF-8 and the mode-line changes to -u:. In another
editor (Smultron) I can load the file in ISO 8859-6 encoding and see
that its original encoding was changed to something like UTF-8 (two
octets where there was only one before).

iso-8859-6 is an Arabic charset.  Didn't the buffer contain
a character that can't be encoded by iso-8859-6?

There seem to be even more bugs ...

Those characters that are displayed as boxes are described incorrectly (oct 244, 254, 255, 273, 277-322). For example:

          character: ْ (333618, #o1213462, #x51732, U+0652)
            charset: mule-unicode-0100-24ff (Unicode characters of the range U+0100..U+24FF.)
         code point: #x2E #x32
             syntax: w  which means: word
           category: b:Arabic
        buffer code: #x9C #xF4 #xAE #xB2
          file code: #xD2 (encoded by coding system iso-8859-6-unix)
            display: by this font (glyph code)
                     -B&H-LucidaTypewriter-Medium-R-Normal-Sans-10-100-75-75-M-60-ISO10646-1 (#x652)
        ;   oct   dec   hex    UCS2    UTF-8
        ْ = 322 = 210 = D2 = U+0632 =    D8 B2 : ARABIC LETTER ZAIN

Notice that GNU Emacs 22.0.50 reports U+0652 for file code #xD2, which is incorrect. In ISO 8859-6 the correct slot for D2 is U+0632.
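
As an independent cross-check of the table row above, here is a minimal Python sketch. It assumes Python's built-in "iso8859_6" codec follows the standard ISO 8859-6 mapping; the variable names are only illustrative and nothing here comes from Emacs itself.

```python
import unicodedata

# Octal 322 = decimal 210 = hex D2, the file code Emacs reports.
raw = bytes([0o322])

# Decode the single octet with Python's ISO 8859-6 codec (assumed to
# implement the standard mapping table).
char = raw.decode("iso8859_6")
utf8 = char.encode("utf-8")

print("oct %o = dec %d = hex %02X" % (raw[0], raw[0], raw[0]))
print("code point: U+%04X %s" % (ord(char), unicodedata.name(char)))
print("UTF-8 octets:", " ".join("%02X" % b for b in utf8))

# Reverse direction: which ISO 8859-6 octet does U+0652 belong to?
sukun = "\u0652".encode("iso8859_6")
print("U+0652 %s -> file code %02X" % (unicodedata.name("\u0652"), sukun[0]))
```

If the codec behaves as assumed, D2 decodes to U+0632 and U+0652 belongs to a different octet (F2), which matches the table above and not the C-u C-x = output.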

The range of oct 323-332, 340-362 is displayed as \<the oct value>. C-u C-x = shows for these, for example:

          character: ” (211, #o323, #xd3)
            charset: eight-bit-graphic (8-bit graphic char (0xA0..0xFF))
         code point: #xD3
             syntax:    which means: whitespace
        buffer code: #xD3
          file code: not encodable by coding system iso-8859-6-unix
            display: by this font (glyph code)
                     -B&H-LucidaTypewriter-Medium-R-Normal-Sans-10-100-75-75-M-60-ISO10646-1 (#xFFFD)

Am I assuming the wrong ISO 8859-6 encoding?
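
One way to cross-check that question outside Emacs is sketched below in Python. It assumes the built-in "iso8859_6" codec reflects the standard ISO 8859-6 table; `decode_or_none` is a hypothetical helper written for this mail, not an existing API.

```python
def decode_or_none(octet, codec="iso8859_6"):
    """Return the character for a single octet, or None if the codec
    has no mapping for it (hypothetical helper for this check)."""
    try:
        return bytes([octet]).decode(codec)
    except UnicodeDecodeError:
        return None

# The two octal ranges that Emacs shows as \<the oct value>.
ranges = [(0o323, 0o332), (0o340, 0o362)]

undefined = []
for low, high in ranges:
    for octet in range(low, high + 1):
        char = decode_or_none(octet)
        if char is None:
            undefined.append(octet)
        else:
            print("oct %o = hex %02X = U+%04X" % (octet, octet, ord(char)))

print("octets without a mapping:", ["%o" % o for o in undefined])
```

With the standard table, every octet in both ranges should have a mapping, which would suggest the encoding assumption is fine and the "not encodable" message comes from how the buffer was decoded.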



Without vi there is only GNU Emacs
