Re: Possible UTF-8 CJK Regressions in Terminal Emulators

From: Dave Love
Subject: Re: Possible UTF-8 CJK Regressions in Terminal Emulators
Date: Tue, 08 Jun 2004 19:02:07 +0100

Kenichi Handa <address@hidden> writes:

>> Absolutely!  Then we can say "utf-8 is (almost) completely
>> supported"...  I think this is a very important thing.
> I think "completely" is still too strong even with preceding
> "(almost)".

I know what you mean, but I think that's the sort of thing that
encourages the established user confusion over encoding issues.

UTF-8 per se is fully supported up to some limit on the code point.
(I hope that's as large as the Emacs 22 maximum code point, but I
don't remember.)  Whether or not a valid Unicode code point can be
decoded into a character Emacs can actually encode/display/input
properly is a different matter, and the feature should affect all
relevant CCL coding systems, especially UTF-16.
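
To illustrate the distinction between "decodes as valid UTF-8" and
"falls inside the BMP", here is a rough sketch in Python.  It has
nothing to do with how Emacs or its CCL decoders work internally;
the function name classify_utf8 and the sample string are made up
for the example, and it says nothing about whether a given
character can be displayed or input.

```python
def classify_utf8(data):
    """Decode UTF-8 bytes and label each code point by plane.

    Hypothetical helper for illustration only.  Malformed input
    raises UnicodeDecodeError from bytes.decode().
    """
    text = data.decode("utf-8")
    result = []
    for ch in text:
        cp = ord(ch)
        # The BMP is U+0000..U+FFFF; anything above is supplementary.
        plane = "BMP" if cp <= 0xFFFF else "supplementary"
        result.append(("U+%04X" % cp, plane))
    return result


if __name__ == "__main__":
    # One ASCII letter, one BMP CJK ideograph, and one ideograph
    # from CJK Extension B, which lies outside the BMP.
    sample = "a\u4e2d\U00020000".encode("utf-8")
    for code_point, plane in classify_utf8(sample):
        print(code_point, plane)
```

All three characters decode without error, but only the first two
are BMP characters, which is why "UTF-8 support" and "BMP support"
are separate claims.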

> Perhaps "utf-8 support is fairly good" or
> "Unicode BMP support is fairly good".

The latter is much better.  (Exceptions include at least: various
complex scripts, much of the CJK space (little used?), reliable
display of CJK e.g. with XFree86 10646-encoded fonts, locale support
(including customization of the font encodings preferred), and BIDI.)

