Re: [Unicode-2] `read' always returns multibyte symbol

From: Kenichi Handa
Subject: Re: [Unicode-2] `read' always returns multibyte symbol
Date: Thu, 15 Nov 2007 23:41:23 +0900
User-agent: SEMI/1.14.3 (Ushinoya) FLIM/1.14.2 (Yagi-Nishiguchi) APEL/10.2 Emacs/23.0.60 (i686-pc-linux-gnu) MULE/6.0 (HANACHIRUSATO)

In article <address@hidden>, Katsumi Yamaoka <address@hidden> writes:

> > If "modifies" means that 8-bit bytes are converted to
> > multibyte characters the way string-as-multibyte does, that
> > is the expected behaviour.

> What I observed was different.  The group name "テスト" is
> encoded by utf-8 by the nntp server into:

> "\343\203\206\343\202\271\343\203\210"

> After it is transferred to Gnus, in the nntp process buffer it is
> modified into:

> "\343\203XY\343\203\210"

> Where X is (make-char 'greek-iso8859-7 99)
>   and Y is (make-char 'latin-iso8859-2 57).

That is exactly what string-as-multibyte does.  \206\343 and
\202\271 are valid multibyte forms in the current Emacs, so
they are treated as multibyte characters.
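
The byte-level effect can be modelled outside Emacs.  The sketch
below is in Python rather than Lisp because the Lisp results depend
on the Emacs version.  It is only an illustration, not Emacs's actual
code: it assumes that a leading byte followed by one byte in the
range 0xA0..0xFF forms a two-byte character whose code is the second
byte with the high bit cleared.  The table lists only the two leading
bytes implied by X and Y above, and the names LEADING_BYTES and
as_multibyte are hypothetical.

```python
# Simplified model of how string-as-multibyte reinterprets raw bytes
# under the pre-Unicode internal encoding (a sketch, not Emacs code).

# Leading byte -> charset; only the two charsets seen in this thread.
LEADING_BYTES = {
    0o206: "greek-iso8859-7",
    0o202: "latin-iso8859-2",
}

# The UTF-8 bytes of the group name, written with the same octal escapes.
RAW = b"\343\203\206\343\202\271\343\203\210"


def as_multibyte(data: bytes) -> list:
    """Return a list of raw bytes (int) and (charset, code) pairs."""
    out = []
    i = 0
    while i < len(data):
        lead = data[i]
        nxt = data[i + 1] if i + 1 < len(data) else None
        if lead in LEADING_BYTES and nxt is not None and nxt >= 0xA0:
            # Valid two-byte form: the pair becomes one character.
            out.append((LEADING_BYTES[lead], nxt & 0x7F))
            i += 2
        else:
            # Anything else stays a lone 8-bit byte.
            out.append(lead)
            i += 1
    return out


print(as_multibyte(RAW))
```

Under these assumptions \206\343 comes out as
("greek-iso8859-7", 99) and \202\271 as ("latin-iso8859-2", 57),
while the other five bytes stay as they are.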

> Since Gnus treats a group name as a unibyte string, finally it
> is made into:

> "\343\203\343\271\343\203\210"

It seems that Gnus turns "\343\203XY\343\203\210" back into a
unibyte string by converting it with string-make-unibyte.
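
Why that conversion loses two bytes can be sketched as follows.  This
is a Python model, not the real implementation.  It assumes that each
multibyte character collapses to a single byte made of its code with
the high bit set again, which is what the observed result implies
(99 becomes \343, 57 becomes \271).  The list layout and the name
make_unibyte are hypothetical.

```python
# Model of the lossy collapse attributed to string-make-unibyte.
# Items are raw bytes (int) or (charset, code) pairs.
MULTIBYTE = [
    0o343, 0o203,
    ("greek-iso8859-7", 99),  # X
    ("latin-iso8859-2", 57),  # Y
    0o343, 0o203, 0o210,
]


def make_unibyte(chars) -> bytes:
    """Collapse each item to one byte, dropping the charset information."""
    out = bytearray()
    for item in chars:
        if isinstance(item, tuple):
            _charset, code = item
            out.append(code | 0x80)  # assumed rule: restore the high bit
        else:
            out.append(item)
    return bytes(out)


result = make_unibyte(MULTIBYTE)
print("".join(f"\\{b:03o}" for b in result))
```

This prints \343\203\343\271\343\203\210: seven bytes instead of the
original nine, because the leading bytes \206 and \202 are dropped.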

Please try this:

 (string-make-unibyte
  (string-as-multibyte "\343\203\206\343\202\271\343\203\210"))

You'll get the above result, ... yes, very weird.

On the other hand,

 (string-as-unibyte
  (string-as-multibyte "\343\203\206\343\202\271\343\203\210"))
 =>  "\343\203\206\343\202\271\343\203\210"

> > I long ago proposed a facility that turns on the
> > multibyteness of a buffer while converting 8-bit bytes to
> > multibyte characters as string-to-multibyte does, but
> > it was not accepted.

> But modern Emacsen do so, don't they?


Kenichi Handa
