Re: GNU Emacs 22.0.50 fails to find ä in different ISO Latin encodings

From: Peter Dyballa
Subject: Re: GNU Emacs 22.0.50 fails to find ä in different ISO Latin encodings
Date: Fri, 22 Sep 2006 11:06:03 +0200

On 22.09.2006 at 02:44, Miles Bader wrote:

> Peter Dyballa <address@hidden> writes:
>> Anyway, what also does not work is: C-s C-q <a non-ASCII, i.e. greater
>> than 177 octal, code>. For those with really small keyboards this is the
>> (almost?) only chance to find some of the x times 64 K characters in
>> Unicode ...
>
> Eh?  It works for me:
>
> E.g., the Emacs 22 character code of "字" is octal 0156772.
>
> If I enter C-s C-q 0156772 (followed by some other char to terminate the
> octal code), it correctly adds that character to the search string (and
> finds it in the buffer).

OK, I did not check the "higher" Unicode regions or a UTF-8 encoded buffer, and I did not enter numbers so long that I cannot compute them; I was still in my simple ISO 8859-X test files. (Your example works for me too in a UTF-8 encoded buffer.) After launching GNU Emacs 22.0.50 with -Q, the phenomenon seems to be that input like

        C-s C-q <[23][0-7][0-7]> RET

is interpreted as an attempt to "name" (point to) an ISO 8859-1 encoded character. For example:

C-s C-q 245 in an ISO 8859-16 buffer does not find "„" (U+201E); the minibuffer tells me that "¥" (\245 in ISO 8859-1) cannot be found.

C-s C-q 241 RET searches for ¡.
C-s C-q 242 RET searches for ¢.
C-s C-q 243 RET searches for £.
C-s C-q 244 RET searches for ¤ (CURRENCY SIGN, U+00A4).

Evaluating (unify-8859-on-decoding-mode t) does not change this specific behaviour.
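
The mismatch can be reproduced outside Emacs by decoding the same byte under both charsets. What follows is only a sketch in Python, not Emacs Lisp, and it says nothing about how Emacs handles the input internally; it relies on Python's standard `iso8859_1` and `iso8859_16` codecs, and the helper name `decode_octal` is made up for illustration:

```python
def decode_octal(octal_digits: str, charset: str) -> str:
    """Decode the single byte written as octal digits in the named charset."""
    byte_value = int(octal_digits, 8)
    return bytes([byte_value]).decode(charset)


# The same octal codes as in the C-s C-q examples above.
for digits in ("241", "242", "243", "244", "245"):
    in_latin1 = decode_octal(digits, "iso8859_1")
    in_latin10 = decode_octal(digits, "iso8859_16")
    print(
        f"\\{digits}: ISO 8859-1 gives U+{ord(in_latin1):04X}, "
        f"ISO 8859-16 gives U+{ord(in_latin10):04X}"
    )
```

For \245 this yields U+00A5 under ISO 8859-1 and U+201E under ISO 8859-16, which matches what the minibuffer reports versus what the buffer actually contains.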

What is the formula for mapping octal 0156772 to a Unicode slot/position? Octal 0156772 is DDFA in hex, which differs from 5B57, 字's position in Unicode. Or: how can I find the octal value for a given Unicode slot (U+ABCD)? There is probably some function for this purpose ...
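
The base conversions themselves are easy to double-check. This Python sketch only confirms the arithmetic quoted above; it does not attempt to reproduce Emacs 22's internal character numbering, and the variable names are illustrative:

```python
# The octal code quoted above for 字 in Emacs 22.
emacs_code = int("0156772", 8)

# The Unicode code point of the same character.
unicode_code = ord("字")

print(f"Emacs 22 code: octal {emacs_code:o} = hex {emacs_code:X}")
print(f"Unicode:       octal {unicode_code:o} = hex {unicode_code:X}")
```

The two numbers are not related by a simple change of base, so the mapping has to come from Emacs's own character tables rather than from arithmetic on the digits.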



