From: Kenichi Handa
Subject: Re: utf-8 input under X11
Date: Mon, 29 Oct 2001 10:30:22 +0900 (JST)

Sorry for the late reply on this thread.  I couldn't reach
the source code during the weekend.

address@hidden (Gerd Moellmann) writes:
> David Monniaux <address@hidden> writes:
>>  On Friday, 26 October 2001 at 17:25, Gerd Moellmann wrote:
>>  > Then we're indeed on the right track, but I guess I'll need Kenichi
>>  > for debugging decode_coding.  Could you please print the value
>>  > of locale-coding-system?
>>  nil

> That's certainly no good to decode UTF-8 :-).

I found at least one apparent bug in xterm.c.  If
locale-coding-system is nil, we execute this code (at line

                          if (/* If the event is not from XIM, */
                              event.xkey.keycode != 0
                              /* or the current locale doesn't request
                                 decoding of the input data, ... */
                              || coding.type == coding_type_raw_text
                              || coding.type == coding_type_no_conversion)
                              /* ... we can use the input data as is.  */
                              nchars = nbytes;

thus the byte sequence returned by XmbLookupString is not
decoded.  But in the for loop at line 10601, we have this:

                          for (i = 0; i < nbytes; i += len)
                              c = STRING_CHAR_AND_LENGTH (copy_bufptr + i,
                                                          nbytes - i, len);

Here, STRING_CHAR_AND_LENGTH should not be used on an undecoded
byte sequence.  If it is, the resulting values of C and LEN are
unpredictable.

So, at least, the above line should be something like:

if (nbytes == nchars)
  /* Undecoded input: treat each byte as one character.  */
  c = copy_bufptr[i], len = 1;
else
  c = STRING_CHAR_AND_LENGTH (copy_bufptr + i, nbytes - i, len);

Could you change the code as above, and try again without
setting locale-coding-system to utf-8?
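
To make the intent of that change concrete, here is a small
standalone sketch of the same branching, outside of Emacs.  The names
decode_one and collect_chars are hypothetical, and decode_one is only
a UTF-8 stand-in for STRING_CHAR_AND_LENGTH (the real macro works on
Emacs' internal multibyte form, not on UTF-8), so please read it as an
illustration of the idea rather than as the actual xterm.c code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for STRING_CHAR_AND_LENGTH: decode one UTF-8
   sequence starting at P, with AVAIL bytes available.  Stores the
   number of bytes consumed in *LEN.  Invalid or truncated input is
   passed through as a single byte.  */
static int
decode_one (const unsigned char *p, size_t avail, int *len)
{
  unsigned c = p[0];
  int n = (c < 0x80 ? 1
           : (c & 0xE0) == 0xC0 ? 2
           : (c & 0xF0) == 0xE0 ? 3
           : (c & 0xF8) == 0xF0 ? 4 : 0);

  *len = 1;
  if (n <= 1 || (size_t) n > avail)
    return (int) c;

  c &= 0xFFu >> (n + 1);        /* keep the payload bits of the lead byte */
  for (int k = 1; k < n; k++)
    {
      if ((p[k] & 0xC0) != 0x80)
        return (int) p[0];      /* bad continuation byte */
      c = (c << 6) | (p[k] & 0x3Fu);
    }
  *len = n;
  return (int) c;
}

/* Collect the characters found in BUF into OUT and return how many
   were stored.  OUT must have room for NBYTES entries.  When NBYTES
   equals NCHARS the data was not decoded, so every byte is taken as
   one character; otherwise the multibyte decoder is used.  */
static size_t
collect_chars (const unsigned char *buf, size_t nbytes, size_t nchars,
               int *out)
{
  size_t count = 0;
  int len;

  assert (buf != NULL && out != NULL);
  for (size_t i = 0; i < nbytes; i += (size_t) len)
    {
      int c;
      if (nbytes == nchars)
        c = buf[i], len = 1;
      else
        c = decode_one (buf + i, nbytes - i, &len);
      out[count++] = c;
    }
  return count;
}
```

With a single undecoded byte 0xE9 and nbytes == nchars == 1, the
byte is stored unchanged; with the two bytes 0xC3 0xA9 and
nchars == 1, the decoder is used and one character is stored.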

>>  When I start emacs with LC_ALL=fr_FR, locale-coding-system is 
>>  iso-latin-1.

> Hm.

>>  I now did experiments setting locale-coding-system to 'utf-8. The 
>>  results are pretty much interesting. I don't have to
>>  (set-keyboard-coding-system 'utf-8)
>>  - <Multi_key o e> gives oe ligature (correct)
>>  - <Multi_key O E> gives oe ligature (correct)
>>  - <Multi_key E => gives the Euro sign (correct)
>>  - AltGr-E abandons instead of giving the Euro sign
>>  - the Russian characters give garbage

For the last two cases, could you tell me the exact byte
sequence returned in COPY_BUFPTR by XmbLookupString?
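
In case it helps with collecting those bytes, a helper along these
lines could be used to format a buffer as hex.  This is only a sketch:
format_hex is a hypothetical name, not an existing Emacs function, and
where exactly to call it in xterm.c is left to you.

```c
#include <stdio.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical debugging helper: write the NBYTES bytes of BUF into
   DST as space-separated lowercase hex, e.g. "e2 82 ac".  DST is
   always NUL-terminated when DSTSIZE > 0.  Returns the number of
   characters written, not counting the NUL; output is cut short if
   DST is too small.  */
static size_t
format_hex (const unsigned char *buf, size_t nbytes,
            char *dst, size_t dstsize)
{
  size_t pos = 0;

  if (dstsize == 0)
    return 0;
  dst[0] = '\0';
  for (size_t i = 0; i < nbytes; i++)
    {
      /* Each byte needs at most 3 characters plus the final NUL.  */
      if (pos + 4 > dstsize)
        break;
      pos += (size_t) snprintf (dst + pos, dstsize - pos,
                                i ? " %02x" : "%02x", buf[i]);
    }
  return pos;
}
```

Calling it on COPY_BUFPTR with NBYTES right after XmbLookupString
returns, and printing the result to stderr, should be enough.  For
reference, a Euro sign encoded in UTF-8 would show up as the three
bytes e2 82 ac.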

Ken'ichi HANDA