Re: More Cyrillic vs UTF-8

emacs-devel

From:	Simon Josefsson
Subject:	Re: More Cyrillic vs UTF-8
Date:	Sat, 26 Apr 2003 13:54:34 +0200
User-agent:	Gnus/5.090019 (Oort Gnus v0.19) Emacs/21.3.50 (gnu/linux)

Kenichi Handa <address@hidden> writes:

> In article <address@hidden>, Simon Josefsson <address@hidden> writes:
>> (Same configuration as last mail)
>> Cut'n'paste the following string into a new file and save it:
>
>> Горбачев
>
>> UTF-8 isn't shown as an option, and indeed selecting UTF-8 destroys
>> the data.  Doesn't Emacs CVS support the entire Unicode repertoire?
>
>> (The string above, encoded as shift_jis, is, according to od -x:
>> 0000000 4384 8084 8284 7184 7084 8984 7584 7284)
>
> Those characters belongs to the charset japanese-jisx0208,
> and the current Emacs still can't encode them into UTF-8.
>
> How did you get such characters?

That may be interesting in itself.  Go to
http://www.nns.ru/persons/gorbach.html using galeon (or mozilla, I
think).  Cut'n'paste the first word and yank it into Emacs.  It looks
single-width in galeon, but when yanked into Emacs it becomes
double-width.  Yanking it into xterm or gnome-terminal doesn't change
the string; it still looks single-width.  Save the HTML file and open
it in Emacs as a koi8 file (note that Emacs doesn't auto-detect it as
koi8, so you have to do that manually), and it is single-width there
too.

I guess it is the Emacs X cut'n'paste code that somehow turns the
string into double-width Japanese characters.
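
As a cross-check on the od -x dump quoted above, here is a minimal
Python sketch that turns those words back into bytes and decodes them.
It assumes od ran on a little-endian machine (so each 16-bit word is
printed byte-swapped) and it uses Python's own shift_jis codec, so it
only illustrates the byte values; it says nothing about how Emacs
represents the charsets internally.  The names OD_WORDS and
od_words_to_bytes are made up for this example.

```python
# Words as printed by `od -x` in the quoted message (offset column dropped).
OD_WORDS = "4384 8084 8284 7184 7084 8984 7584 7284"


def od_words_to_bytes(words: str) -> bytes:
    """Undo the 16-bit grouping of od -x, assuming a little-endian host."""
    return b"".join(int(w, 16).to_bytes(2, "little") for w in words.split())


raw = od_words_to_bytes(OD_WORDS)   # Shift_JIS byte stream
text = raw.decode("shift_jis")      # Cyrillic letters from the JIS X 0208 repertoire
utf8 = text.encode("utf-8")         # the same letters re-encoded as UTF-8

print(raw.hex(" "))
print(text)
print(utf8.hex(" "))
```

If the assumptions hold, the decoded text is the same Cyrillic word as
in the original report, which fits the explanation that the yanked
characters were double-byte japanese-jisx0208 ones rather than
single-byte koi8 ones.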
