Re: Unicode character read representation

From: Kenichi Handa
Subject: Re: Unicode character read representation
Date: Tue, 24 Feb 2009 20:14:14 +0900

In article <address@hidden>, Chong Yidong <address@hidden> writes:

> From objects.texi in the Lisp manual:
>   `\U00NNNNNN' represents the character whose Unicode code point is
>   `U+NNNNNN', if such a character is supported by Emacs.  If the
>   corresponding character is not supported, Emacs signals an error.

> Are there any Unicode code points not supported by Emacs, or is this
> sentence obsolete?

Not completely obsolete, but it should be reworded.

First, #x0..#x3FFFFF are all valid Emacs character codes.

Among the U+NNNNNN values, some are valid Unicode code points
designated as "noncharacters" (e.g. U+FFFE, U+FFFF), some are
not Unicode code points at all but are still valid Emacs
character codes (U+110000..U+3FFFFF), and some are invalid
both as Unicode code points and as Emacs character codes
(U+400000 and over).
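
To make the three cases concrete, here is a rough sketch in Python
(not Emacs code).  It assumes the Unicode code space ends at U+10FFFF
and that noncharacters are U+FDD0..U+FDEF plus the last two code
points of each plane.  The names `classify` and `is_noncharacter` are
made up for this illustration.

```python
MAX_UNICODE = 0x10FFFF     # last code point defined by Unicode
MAX_EMACS_CHAR = 0x3FFFFF  # last valid Emacs character code (see above)


def is_noncharacter(cp):
    """Return True for the 66 Unicode noncharacters."""
    if 0xFDD0 <= cp <= 0xFDEF:
        return True
    # U+xxFFFE and U+xxFFFF in every plane, e.g. U+FFFE, U+1FFFF.
    return cp <= MAX_UNICODE and (cp & 0xFFFE) == 0xFFFE


def classify(cp):
    """Classify a non-negative integer the way the message describes."""
    if cp < 0:
        raise ValueError("code must be non-negative")
    if cp > MAX_EMACS_CHAR:
        return "invalid"        # neither Unicode nor an Emacs character
    if cp > MAX_UNICODE:
        return "emacs-only"     # valid Emacs code, not a Unicode code point
    if is_noncharacter(cp):
        return "noncharacter"   # valid Unicode code point, but a noncharacter
    return "unicode"
```

For instance, `classify(0xFFFE)` falls in the "noncharacter" case and
`classify(0x400000)` in the "invalid" case.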

Currently Emacs signals an error only for U+400000 and over,
and I'm not sure how strictly we should interpret the
\U.. notation.
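
The two possible readings could be sketched like this, again in Python
rather than in the actual reader code.  The function `read_u_escape`
and its `strict` flag are hypothetical.  The lenient mode mirrors the
current behavior described above (error only above #x3FFFFF), and the
strict mode assumes we would reject anything beyond U+10FFFF.

```python
import re

MAX_UNICODE = 0x10FFFF     # end of the Unicode code space
MAX_EMACS_CHAR = 0x3FFFFF  # end of the Emacs character code space


def read_u_escape(text, strict=False):
    """Parse a `\\U00NNNNNN' style escape and return its integer code.

    strict=False: accept any valid Emacs character code.
    strict=True:  accept only Unicode code points (hypothetical policy).
    """
    m = re.fullmatch(r"\\U([0-9A-Fa-f]{8})", text)
    if m is None:
        raise ValueError("not a \\U escape with 8 hex digits")
    cp = int(m.group(1), 16)
    limit = MAX_UNICODE if strict else MAX_EMACS_CHAR
    if cp > limit:
        raise ValueError(f"#x{cp:X} is out of range")
    return cp
```

With this sketch, `read_u_escape(r"\U00200000")` succeeds in the
lenient mode but raises an error when `strict=True`.  Whether
noncharacters such as U+FFFE should also be rejected is left open here.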

Kenichi Handa

