Re: Several serious problems


From: Kenichi Handa
Subject: Re: Several serious problems
Date: Thu, 29 Aug 2002 22:25:25 +0900 (JST)
User-agent: SEMI/1.14.3 (Ushinoya) FLIM/1.14.2 (Yagi-Nishiguchi) APEL/10.2 Emacs/21.1.30 (sparc-sun-solaris2.6) MULE/5.0 (SAKAKI)

In article <address@hidden>,
  Dave Love <address@hidden> writes:
> As far as I know, what's installed in the trunk behaves correctly, but
> I'm not using that code

Why aren't you using that code?  Does that mean you have
changed some of it locally?

> and I don't know if I'd hear about real
> problems with it (as opposed to imagined problems).  It should all be
> things you have said are OK or I'm sure you will think are OK, but I
> may have overlooked something.  However, it could use work for CJK, in
> particular; there's a fixme in utf-8, and there could be additional
> interconversion tables for CJK charsets as well as a way of
> customizing the character preferences in utf-8-subst.el, and probably
> other things.

I noticed those `fixme's.  Yes, it would be better to resolve
all of them, but for the moment I want to concentrate on
fixing the problem in RC.

>>  I've thought that the current codes were
>>  the same one as what Dave had, but the above statement of
>>  Dave's tells that it's not.

> Well, now I check, utf-8.el in the RC branch seems to be as I left it,
> which is what rms (I think) told me to do.  As far as I can tell, its
> safe-charsets property is correct,

The safe-charsets property of utf-8 in RC is this:

ascii eight-bit-control eight-bit-graphic latin-iso8859-1
mule-unicode-0100-24ff mule-unicode-2500-33ff
mule-unicode-e000-ffff ethiopic tibetan thai-tis620
katakana-jisx0201 ipa chinese-sisheng lao
vietnamese-viscii-lower vietnamese-viscii-upper

It doesn't contain latin-iso8859-[23...].
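
The membership check above can be sketched in Emacs Lisp.  This is
only an illustration: the variable and function names are
hypothetical, and the list is copied by hand from this message rather
than read from a running Emacs.  In a live session one would
presumably query it with (coding-system-get 'utf-8 'safe-charsets).

```elisp
;; The safe-charsets list of utf-8 in RC, copied from this message.
;; `my-rc-utf-8-safe-charsets' is a hypothetical name for illustration.
(defvar my-rc-utf-8-safe-charsets
  '(ascii eight-bit-control eight-bit-graphic latin-iso8859-1
    mule-unicode-0100-24ff mule-unicode-2500-33ff
    mule-unicode-e000-ffff ethiopic tibetan thai-tis620
    katakana-jisx0201 ipa chinese-sisheng lao
    vietnamese-viscii-lower vietnamese-viscii-upper)
  "Charsets listed as safe for utf-8 in RC, as quoted in this message.")

;; Hypothetical helper, not part of Emacs.
(defun my-charset-listed-p (charset safe-charsets)
  "Return t if CHARSET is covered by SAFE-CHARSETS.
SAFE-CHARSETS is a list of charset symbols, or t meaning that
every charset is considered safe."
  (or (eq safe-charsets t)
      (and (memq charset safe-charsets) t)))

(my-charset-listed-p 'latin-iso8859-1 my-rc-utf-8-safe-charsets) ; => t
(my-charset-listed-p 'latin-iso8859-2 my-rc-utf-8-safe-charsets) ; => nil
```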

> and I don't understand what the complaint is about.  When
> I couldn't check, I assumed someone had modified it
> incorrectly, but there's no sign of that in CVS.

The complaint is that the coding-system utf-8 can't encode
latin-2 characters in RC even if loadup.el has these lines:

(load "international/ucs-tables")
(ucs-unify-8859 'encode-only)

The reason is, as far as I can see, that the ccl program
`ccl-encode-mule-utf-8' doesn't have this line near its
head:

           (translate-character ucs-mule-to-mule-unicode r0 r1))

So, even if we set up the translation table
`ucs-mule-to-mule-unicode' at loadup time, it is not used in
utf-8.
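
One way to probe the symptom might look like the sketch below.  It is
an assumption-laden illustration, not a test from the Emacs sources:
the helper name is hypothetical, the chosen character (position 0xB1
of ISO 8859-2) is arbitrary, and the result depends on the Emacs
version and on whether the translation table is actually consulted,
so no expected output is given.

```elisp
;; Hypothetical helper, not part of Emacs.
(defun my-probe-utf-8-latin-2 ()
  "Probe how `utf-8' treats one latin-2 character.
Return a list (SAFE ENCODED).  SAFE is t if `utf-8' is among the
coding systems Emacs reports as able to encode the character.
ENCODED is the string produced by encoding it with `utf-8'."
  (let* ((ch (make-char 'latin-iso8859-2 #x31)) ; 7-bit code for 0xB1
         (str (char-to-string ch)))
    (list (and (memq 'utf-8 (find-coding-systems-string str)) t)
          (encode-coding-string str 'utf-8))))

;; Evaluate (my-probe-utf-8-latin-2) in the Emacs under test and
;; compare the two elements between RC and HEAD.
```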

>>  Could someone tell me why are they different in HEAD and RC,
>>  and why are they different from what Dave have written?

> Most changes aren't in RC since I was only allowed to add (a version
> of) ucs-tables, not changing the default behaviour, so people could
> turn on (partial) character translation themselves.  It doesn't affect
> utf-8 or any other ccl coding systems because they don't use the
> translation table (although the useful extra coding systems in
> code-pages.el aren't included either, so I think only koi,
> alternativnyj and mac-roman are affected).

Hmmm, I think I now understand the situation in RC.  It can
unify charsets among iso-8859-X, but utf-8 can't encode
iso-8859-X (intentionally), correct?

Richard, is that what you asked Dave to install for RC?

I think RC should also allow utf-8 to encode 8859-X
correctly, as in HEAD.  I see no harm in it.

> I think I unilaterally added some other things (a utf-8 language
> environment and utf-16.el?) since they addressed somewhat misleading
> entries in PROBLEMS and the arguments against the Unicode support are
> either demonstrably wrong or spurious IMNSHO.

I don't object to that.  But I found one problem with utf-16.
It seems that utf-16-le/be can handle 8859-X correctly
because of this line in ccl-encode-mule-utf-16-le/be,
      (translate-character ucs-mule-to-mule-unicode r0 r1)
but the safe-charsets property lists only these:
      ascii
      eight-bit-control
      latin-iso8859-1
      mule-unicode-0100-24ff
      mule-unicode-2500-33ff
      mule-unicode-e000-ffff
Thus, utf-16-le/be can't be regarded as safe coding systems
for the 8859-X charsets.

> I'm afraid I've had enough of all this,

Yah, you have done an excellent hack!  When I implemented
the translation table stuff, I didn't expect that it could be
used this thoroughly.

> and I doubt it's worth more effort anyhow.  Especially
> after all the FUD about them, the Mule additions probably
> won't get used much unless they're the default, even by
> i18n people, unfortunately.

I thought the point of including ucs-tables etc. in RC was at
least to make unify-on-encoding the default, INCLUDING for utf-8.

---
Ken'ichi HANDA
address@hidden
