Re: [Unicode-2] `read' always returns multibyte symbol
From: Kenichi Handa
Subject: Re: [Unicode-2] `read' always returns multibyte symbol
Date: Thu, 15 Nov 2007 23:41:23 +0900
User-agent: SEMI/1.14.3 (Ushinoya) FLIM/1.14.2 (Yagi-Nishiguchi) APEL/10.2 Emacs/23.0.60 (i686-pc-linux-gnu) MULE/6.0 (HANACHIRUSATO)
In article <address@hidden>, Katsumi Yamaoka <address@hidden> writes:
> > If "modifies" means that 8-bit bytes are converted to
> > multibyte characters the way string-as-multibyte does, that
> > is expected behaviour.
> What I observed was different. The group name "テスト" is
> encoded in utf-8 by the nntp server into:
> "\343\203\206\343\202\271\343\203\210"
> After it is transferred to Gnus, in the nntp process buffer it is
> modified into:
> "\343\203XY\343\203\210"
> Where X is (make-char 'greek-iso8859-7 99)
> and Y is (make-char 'latin-iso8859-2 57).
That is exactly what string-as-multibyte does. The byte pairs
\206\343 and \202\271 are valid multibyte forms in the current
Emacs, so each pair is treated as one multibyte character.
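The byte arithmetic behind this can be checked directly. The sketch below is only an illustration: `demo-group-octets` is a hypothetical name, and the claim that the character code is the second byte with its high bit cleared is inferred from the numbers 99 and 57 quoted above, not taken from the Emacs sources.

```elisp
;; Hypothetical name, used only in this sketch.
(defconst demo-group-octets "\343\203\206\343\202\271\343\203\210"
  "The octets sent by the nntp server, as a unibyte string.")

(multibyte-string-p demo-group-octets)           ; => nil
(decode-coding-string demo-group-octets 'utf-8)  ; => "テスト"

;; Second byte of each swallowed pair, with its high bit cleared
;; (an assumption that matches the codes quoted for X and Y):
(logand #o343 #x7f)  ; => 99, the code given for X (greek-iso8859-7)
(logand #o271 #x7f)  ; => 57, the code given for Y (latin-iso8859-2)
```

So the seven octets are three well-formed utf-8 characters, but the middle four happen to line up as two leading-byte/trailing-byte pairs of the old internal encoding.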
> Since Gnus treats a group name as a unibyte string, finally it
> is made into:
> "\343\203\343\271\343\203\210"
It seems that Gnus turns "\343\203XY\343\203\210" into a
unibyte string by converting it with string-make-unibyte.
Please try this:
(string-make-unibyte
 (string-as-multibyte "\343\203\206\343\202\271\343\203\210"))
You'll get the above result, ... yes, very weird.
On the other hand,
(string-as-unibyte
 (string-as-multibyte "\343\203\206\343\202\271\343\203\210"))
 => "\343\203\206\343\202\271\343\203\210"
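For contrast, string-to-multibyte maps every 8-bit byte to its own raw-byte character instead of pairing bytes up, so no octet is swallowed. A minimal sketch, assuming Emacs 23 or later, where string-to-unibyte is available for the reverse direction:

```elisp
(let* ((octets "\343\203\206\343\202\271\343\203\210")
       ;; Each byte becomes one raw-byte character; nothing is merged.
       (multi (string-to-multibyte octets)))
  (list (multibyte-string-p multi)                 ; t
        (length multi)                             ; 7, one per octet
        ;; Converting back recovers the original octets unchanged.
        (equal (string-to-unibyte multi) octets))) ; t
```

With this conversion the group name survives a trip through a multibyte string and can still be decoded as utf-8 afterwards.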
> > I long ago proposed a facility that turns on the
> > multibyteness of a buffer while converting 8-bit bytes to
> > multibyte characters the way string-to-multibyte does, but
> > it was not accepted.
> But modern Emacsen do so, don't they?
No.
---
Kenichi Handa
address@hidden