Re: More Cyrillic vs UTF-8

From: Simon Josefsson
Subject: Re: More Cyrillic vs UTF-8
Date: Sat, 26 Apr 2003 13:54:34 +0200
User-agent: Gnus/5.090019 (Oort Gnus v0.19) Emacs/21.3.50 (gnu/linux)

Kenichi Handa <address@hidden> writes:

> In article <address@hidden>, Simon Josefsson <address@hidden> writes:
>> (Same configuration as last mail)
>> Cut'n'paste the following string into a new file and save it:
>> Горбачев
>> UTF-8 isn't shown as an option, and indeed selecting UTF-8 destroys
>> the data.  Doesn't Emacs CVS support the entire Unicode repertoire?
>> (The string above, encoded as shift_jis, is, according to od -x:
>> 0000000 4384 8084 8284 7184 7084 8984 7584 7284)
> Those characters belong to the charset japanese-jisx0208,
> and the current Emacs still can't encode them into UTF-8.
> How did you get such characters?

That may be interesting in itself.  Go to
http://www.nns.ru/persons/gorbach.html using galeon (or mozilla, I
think).  Cut'n'paste the first word and yank it into Emacs.  It looks
single-width in galeon, but when yanked into Emacs it becomes
double-width.  Yanking it into xterm or gnome-terminal doesn't change
the string; it still looks single-width.  Save the HTML file and open
it in Emacs as a koi8 file (note that Emacs doesn't auto-detect it as
koi8, so you have to do that manually), and it is single-width there
too.

I guess it is the Emacs X cut'n'paste code that somehow turns the
string into double-width Japanese characters.
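
For anyone who wants to check the od -x dump quoted above without
Emacs, here is a minimal Python sketch.  It assumes the dump was taken
on a little-endian machine (so od -x shows each pair of bytes swapped)
and relies on Python's standard shift_jis codec; the helper name
od_words_to_bytes is made up for this example.

```python
def od_words_to_bytes(words: str) -> bytes:
    """Turn the 16-bit hex words printed by `od -x` back into raw bytes.

    Assumes a little-endian host, where od prints the second byte of
    each pair first (e.g. bytes 84 43 are shown as the word 4384).
    """
    out = bytearray()
    for word in words.split():
        out += int(word, 16).to_bytes(2, "little")
    return bytes(out)


# The words from the od -x output in the original report (offset column dropped).
DUMP = "4384 8084 8284 7184 7084 8984 7584 7284"

raw = od_words_to_bytes(DUMP)      # the bytes as they were stored in the file
text = raw.decode("shift_jis")     # interpret them as Shift_JIS (JIS X 0208 Cyrillic row)
utf8 = text.encode("utf-8")        # the same characters re-encoded as UTF-8

if __name__ == "__main__":
    print(raw.hex(" "))
    print(text)
    print(utf8.hex(" "))
```

Each character takes two bytes in either encoding, so the data itself
converts cleanly once the characters are treated as Unicode Cyrillic
rather than as members of a Japanese charset.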