Unicode and Chinese

From: Dan Grayson
Subject: Unicode and Chinese
Date: Thu, 22 Feb 2007 13:59:23 -0600 (CST)

Please describe exactly what actions triggered the bug
and the precise symptoms of the bug:

    Start Emacs.
    Open a shell buffer with M-x shell.
    Use C-x RET p (set-buffer-process-coding-system) to set both of its
    coding systems to utf-8.
    Type the command
             echo ""
         but don't press return yet.
    Use C-x RET C-\ (set-input-method) to set the input method to chinese-py.
    Type "ni" inside the string.  That puts a common Chinese character there,
    the word for "you".
    Now press return and observe that the Chinese character is not echoed.
    Now execute
    with M-: (eval-expression).
    Now scroll up and enter that line again and this time it will work.
    Repeat the same experiment with
           echo -n "" | od -X
    to see that it's an encoding problem.

    Here is what the shell interaction will look like (except I've typed the
    Chinese character in there just now):

    Before loading the tables:

        % echo -n "你" | od -X
        0000000 00bdbfef

    After loading the tables:

        % echo -n "你" | od -X
        0000000 00a0bde4
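
    To make the two dumps easier to read, here is a minimal Python sketch
    (not part of Emacs) that turns each od -X word back into byte order and
    decodes it as UTF-8.  It assumes the little-endian i686 host named in
    this report, where od -X prints each group of 4 bytes as one hex
    integer and pads the 3-byte input with a zero byte.  The helper name
    od_word_to_bytes is made up for illustration.

```python
import struct


def od_word_to_bytes(word_hex: str) -> bytes:
    """Return the 4 bytes behind one `od -X` word, in file order.

    Assumes a little-endian host, so the least significant byte of the
    printed integer is the first byte of the input.
    """
    return struct.pack("<I", int(word_hex, 16))


# Words copied from the two dumps above; strip the zero byte od added as padding.
before = od_word_to_bytes("00bdbfef").rstrip(b"\x00")
after = od_word_to_bytes("00a0bde4").rstrip(b"\x00")

# Show the raw bytes and the character each sequence decodes to.
print(before.hex(), repr(before.decode("utf-8")))
print(after.hex(), repr(after.decode("utf-8")))
```

    The first word is the bytes ef bf bd, which is the UTF-8 encoding of
    U+FFFD (the replacement character); the second is e4 bd a0, the UTF-8
    encoding of the intended character, U+4F60.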

    My work-around, discovered after lots of trial and error: set the
    language environment to Chinese-GB first.

    That work-around is not very intuitive.  Shouldn't the tables get
    loaded the first time utf-8 encoding is done?


In GNU Emacs (i686-pc-linux-gnu, X toolkit, Xaw3d scroll bars)
 of 2007-02-20 on rhodium
X server distributor `The X.Org Foundation', version 11.0.60900000
configured using `configure  '--with-xpm' '--with-jpeg' '--with-tiff' 
'--with-gif' '--with-png' '--with-pop' 'CC=gcc' '--prefix=/capybara''

Important settings:
  value of $LC_ALL: nil
  value of $LC_COLLATE: nil
  value of $LC_CTYPE: nil
  value of $LC_MESSAGES: nil
  value of $LC_MONETARY: nil
  value of $LC_NUMERIC: nil
  value of $LC_TIME: nil
  value of $LANG: C
  locale-coding-system: nil
  default-enable-multibyte-characters: t

Major mode: Lisp Interaction

Minor modes in effect:
  tooltip-mode: t
  tool-bar-mode: t
  mouse-wheel-mode: t
  menu-bar-mode: t
  file-name-shadow-mode: t
  global-font-lock-mode: t
  font-lock-mode: t
  blink-cursor-mode: t
  unify-8859-on-encoding-mode: t
  utf-translate-cjk-mode: t
  auto-compression-mode: t
  line-number-mode: t

Recent input:
C-y M-< C-x C-k C-g C-g M-x C-g M-x r e p o r SPC <tab> 

Recent messages:
(emacs -q)
For information about the GNU Project and its goals, type C-h C-p. [2 times]
Mark set [2 times]
Loading kmacro...done
Quit [2 times]
Loading emacsbug...
Loading regexp-opt...done
Loading emacsbug...done

