Re: HELLO changes

emacs-pretest-bug

From:	Kenichi Handa
Subject:	Re: HELLO changes
Date:	Tue, 28 Oct 2003 21:30:17 +0900 (JST)
User-agent:	SEMI/1.14.3 (Ushinoya) FLIM/1.14.2 (Yagi-Nishiguchi) APEL/10.2 Emacs/21.3 (sparc-sun-solaris2.6) MULE/5.0 (SAKAKI)

In article <address@hidden>, Richard Stallman <address@hidden> writes:
>>  That is rather cryptic and I can't draw a conclusion from that.  Could
>>  you say how they ARE treated now?

>     I don't think this is getting anywhere (except, I think, illustrating
>     the confusion I wanted to avoid).

> If you want to convince me that that change is desirable,
> it behooves you to explain it clearly.

In Emacs, we can call a character a "unicode character" if
it can be encoded by an encoding scheme that Unicode
defines (e.g. utf-8).  So, all characters belonging to
ascii, latin-iso8859-1, mule-unicode-xxx-yyy, and
eight-bit-control are "unicode characters".

In addition, in unify-8859-on-encoding-mode, many
characters belonging to other character sets
(e.g. latin-iso8859-2) BECOME "unicode characters" because
they are mapped to latin-iso8859-1 or mule-unicode-xxx-yyy
on encoding in this mode.

In addition, in utf-translate-cjk-mode, many CJK
characters also BECOME "unicode characters" because they can
be encoded by UTF-8 in this mode.
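
To make the rule above concrete, here is a rough sketch in Python (not
Emacs code).  It models only the charset-level rule described in this
mail.  The function name, the two flag parameters, and the example sets
are hypothetical, and japanese-jisx0208 is just an assumed example of a
CJK charset.  The real modes work per character, so "many" characters
of a charset qualify, not necessarily all of them.

```python
# Toy model only: charset names are plain strings, and the mode-dependent
# sets below are illustrative examples, not Emacs's real translation tables.
ALWAYS_UNICODE = {"ascii", "latin-iso8859-1", "eight-bit-control"}
UNIFY_8859_EXAMPLES = {"latin-iso8859-2"}   # the example named in this mail
CJK_EXAMPLES = {"japanese-jisx0208"}        # assumed example of a CJK charset


def is_unicode_character(charset, unify_8859_on_encoding=False,
                         utf_translate_cjk=False):
    """Return True if characters of CHARSET count as "unicode characters".

    The answer depends on which modes are enabled, not only on the charset.
    """
    # Charsets that are always encodable by a Unicode encoding scheme.
    if charset in ALWAYS_UNICODE or charset.startswith("mule-unicode-"):
        return True
    # Charsets that become encodable only in unify-8859-on-encoding-mode.
    if unify_8859_on_encoding and charset in UNIFY_8859_EXAMPLES:
        return True
    # Charsets that become encodable only in utf-translate-cjk-mode.
    if utf_translate_cjk and charset in CJK_EXAMPLES:
        return True
    return False
```

The same charset gives different answers depending on the flags, which
is why the character code alone cannot settle the question.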

Thus, as Dave wrote, most characters in the HELLO file are
"unicode characters".

I think the underlying meaning of this mail of Dave's:

> It isn't meaningful to say XXX is a `Unicode character' because I
> typed it with the TeX input method, but YYY isn't because I typed it
> with latin-9-prefix (assuming they don't get canonicalized on input).

is that we can't answer the question "Is this a unicode
character or not?" only by looking at the Emacs character code.

So, the line "A short test for Unicode characters:" in the
current HELLO file is a very confusing one.  It implies that
the characters in the previous lines are not Unicode
characters.

Richard Stallman <address@hidden> writes:
> Doesn't the current Emacs represent them with different codes,
> treating them as different charsets?  That was the case until
> recently, I think.

Right.  Emacs represents characters that have the same
Unicode code point in multiple ways.  But it is also
possible to display them with the same glyph of the same font
by setting up a fontset properly.  In that case, the
lines under "A short test ..." are useless.

But I too think that it's too early to remove those lines.
By default, Emacs uses different fonts for characters in
latin-iso8859-X and mule-unicode-xxxx-yyyy.  With those
lines, we can easily see whether the fonts currently assigned
to mule-unicode-xxxx-yyyy are working.

So, my suggestion is to change the line "A short test ..."
to "A short test for characters represented by the character
set mule-unicode-0100-24ff".  Making
"mule-unicode-0100-24ff" clickable to run
describe-character-set may also be useful.

---
Ken'ichi HANDA
address@hidden
