Re: utf-8 input under X11
From: Kenichi Handa
Subject: Re: utf-8 input under X11
Date: Mon, 29 Oct 2001 10:30:22 +0900 (JST)
Sorry for the late reply on this thread. I couldn't reach
the source code during the weekend.
gerd.moellmann@t-online.de (Gerd Moellmann) writes:
> David Monniaux <David.Monniaux@ens.fr> writes:
>> Le Vendredi 26 Octobre 2001 17:25, Gerd Moellmann a écrit :
>> > Then we're indeed on the right track, but I guess I'll need Kenichi
>> > for debugging decode_coding. Could you please print the value
>> > of locale-coding-system?
>>
>> nil
> That's certainly no good to decode UTF-8 :-).
I found at least one apparent bug in xterm.c. If
locale-coding-system is nil, we execute this code (at line
10573):
    if (/* If the event is not from XIM, */
        event.xkey.keycode != 0
        /* or the current locale doesn't request
           decoding of the input data, ... */
        || coding.type == coding_type_raw_text
        || coding.type == coding_type_no_conversion)
      {
        /* ... we can use the input data as is. */
        nchars = nbytes;
      }
thus the byte sequence returned by XmbLookupString is not
decoded. But, in the for loop at line 10601, we have this
code:
    for (i = 0; i < nbytes; i += len)
      {
        c = STRING_CHAR_AND_LENGTH (copy_bufptr + i,
                                    nbytes - i, len);
Here, STRING_CHAR_AND_LENGTH should not be used on an undecoded
byte sequence. If it is, the resulting values of C and LEN are
unpredictable. So, at least, the above line should be something like:
    if (nbytes == nchars)
      c = copy_bufptr[i], len = 1;
    else
      c = STRING_CHAR_AND_LENGTH (copy_bufptr + i, nbytes - i, len);
Could you change the code as above, and try again without
setting locale-coding-system to utf-8?
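To make the intent of that change concrete, here is a minimal standalone sketch of the same branching. It is not the xterm.c code: decode_one is a hypothetical stand-in for STRING_CHAR_AND_LENGTH and decodes UTF-8 purely for illustration (Emacs's internal multibyte form is different), and collect_chars and its parameters are names invented for this example.

```c
#include <stddef.h>

/* Hypothetical stand-in for STRING_CHAR_AND_LENGTH.  Decodes one
   character of up to three bytes from P (AVAIL bytes available, at
   least one), stores the number of bytes consumed in *LEN, and
   returns the character code.  UTF-8 is used only for illustration;
   a malformed or truncated sequence is returned as one raw byte.  */
int
decode_one (const unsigned char *p, size_t avail, size_t *len)
{
  if ((p[0] & 0xE0) == 0xC0 && avail >= 2 && (p[1] & 0xC0) == 0x80)
    {
      *len = 2;
      return ((p[0] & 0x1F) << 6) | (p[1] & 0x3F);
    }
  if ((p[0] & 0xF0) == 0xE0 && avail >= 3
      && (p[1] & 0xC0) == 0x80 && (p[2] & 0xC0) == 0x80)
    {
      *len = 3;
      return ((p[0] & 0x0F) << 12) | ((p[1] & 0x3F) << 6) | (p[2] & 0x3F);
    }
  *len = 1;
  return p[0];
}

/* Store the character codes found in BUF (NBYTES bytes) into OUT,
   which must have room for NBYTES entries, and return how many were
   stored.  When NBYTES == NCHARS the data was not decoded, so every
   byte is taken as one character and the decoder is never called.  */
size_t
collect_chars (const unsigned char *buf, size_t nbytes, size_t nchars,
               int *out)
{
  size_t i, len, n = 0;

  for (i = 0; i < nbytes; i += len)
    {
      if (nbytes == nchars)
        {
          out[n++] = buf[i];
          len = 1;
        }
      else
        out[n++] = decode_one (buf + i, nbytes - i, &len);
    }
  return n;
}
```

Note that nbytes == nchars is also true for decoded text that happens to be pure ASCII, but in that case both branches yield the same codes, so the test is harmless there.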
>> When I start emacs with LC_ALL=fr_FR, locale-coding-system is
>> iso-latin-1.
> Hm.
>> I now did experiments setting locale-coding-system to 'utf-8. The
>> results are pretty much interesting. I don't have to
>> (set-keyboard-coding-system 'utf-8)
>> - <Multi_key o e> gives oe ligature (correct)
>> - <Multi_key O E> gives oe ligature (correct)
>> - <Multi_key E => gives the Euro sign (correct)
>> - AltGr-E abandons instead of giving the Euro sign
>> - the Russian characters give garbage
For the last two cases, could you tell me the exact byte
sequence returned in COPY_BUFPTR by XmbLookupString?
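One way to capture those bytes might be a small helper like the one below, called on the buffer right after the lookup and printed to stderr or inspected in a debugger. This is only a sketch: format_bytes_hex is a hypothetical name, not an existing function in Emacs or Xlib, and where to call it is left to whoever is debugging.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical debugging helper: write the first NBYTES bytes of BUF
   into OUT as space-separated two-digit hex values, never writing
   more than OUTSIZE bytes including the terminating NUL.  Returns
   the number of input bytes that were formatted.  */
size_t
format_bytes_hex (const unsigned char *buf, size_t nbytes,
                  char *out, size_t outsize)
{
  size_t i, used = 0;

  if (outsize == 0)
    return 0;
  out[0] = '\0';

  /* Each byte needs at most three characters plus the NUL.  */
  for (i = 0; i < nbytes && used + 3 < outsize; i++)
    used += (size_t) snprintf (out + used, outsize - used,
                               i ? " %02X" : "%02X", buf[i]);
  return i;
}
```

For reference, the UTF-8 encoding of the Euro sign (U+20AC) would show up as "E2 82 AC" in such a dump; anything else for that key would point to where the input goes wrong.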
---
Ken'ichi HANDA
handa@etl.go.jp