Re: Fwd: Problem with non-bmp unicode

From: Kenichi Handa
Subject: Re: Fwd: Problem with non-bmp unicode
Date: Sun, 12 Nov 2006 11:32:58 +0900
User-agent: SEMI/1.14.3 (Ushinoya) FLIM/1.14.2 (Yagi-Nishiguchi) APEL/10.2 Emacs/22.0.50 (i686-pc-linux-gnu) MULE/5.0 (SAKAKI)

In article <address@hidden>, Jérôme Marant <address@hidden> writes:

> Do you have any clue about this?

Sorry for the late response on this thread.

> Subject: Problem with non-bmp unicode
> Date: mercredi 08 novembre 2006 09:26
> An UTF-8 file (attached) with these three characters:
> U+0022 U+00010380 U+0022
> shows with "emacs -nw":
> "\360\220\216\200"
> which is not usable at all. The file displays correctly if I cat it.

> I tried a bunch of other characters outside the BMP, all of which
> fail in the same way. Characters in the BMP work nicely.
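
For reference, the four octal escapes quoted above are the raw UTF-8 bytes of U+10380, shown one by one. The short Python sketch below is only an illustration (the variable names are made up here, and it says nothing about how Emacs decodes the file internally); it encodes the character and prints each byte the way "emacs -nw" displayed it.

```python
# Encode U+10380 (the non-BMP character from the report) as UTF-8 and
# format each byte as a three-digit octal escape.
ch = chr(0x10380)
utf8_bytes = ch.encode("utf-8")          # four bytes: F0 90 8E 80
octal_escapes = "".join("\\%03o" % b for b in utf8_bytes)

# Code points above U+FFFF lie outside the BMP.
is_bmp = ord(ch) <= 0xFFFF

print(octal_escapes)  # \360\220\216\200
print(is_bmp)         # False
```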

Emacs 22 still doesn't support Unicode characters outside the BMP.
If you really need to handle them, please use the CVS branch

> Apparently, emacs 22 shows a question mark instead of "\360\220\216\200"
> but trying to delete the question mark character with backspace turns it into
> "\360\220\216".

This is explained in a comment in utf-8.el:

;; We compose the untranslatable sequences into a single character,
;; and move point to the next character.
;; This is infelicitous for editing, because there's currently no
;; mechanism for treating compositions as atomic, but is OK for
;; display.  They are composed to U+FFFD with help-echo which
;; indicates the unicodes they represent.  This function GCs too much.

I tried to fix this editing problem by using the
modification-hooks text property, but couldn't get a
good result.
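
As a rough illustration of why the leftover "\360\220\216" cannot be shown as a character: once the last byte is deleted, the remaining three bytes are an incomplete UTF-8 sequence. The Python sketch below uses Python's own decoder as an analogy only; it is not how utf-8.el works, and the variable names are hypothetical.

```python
full = b"\xf0\x90\x8e\x80"   # UTF-8 for U+10380
truncated = full[:-1]        # the bytes left after deleting one byte

# A strict decoder rejects the incomplete sequence.
try:
    truncated.decode("utf-8")
    strict_ok = True
except UnicodeDecodeError:
    strict_ok = False

# A lenient decoder substitutes U+FFFD, the replacement character.
replaced = truncated.decode("utf-8", errors="replace")

print(strict_ok)             # False
print("\ufffd" in replaced)  # True
```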

Kenichi Handa
