emacs-devel

Re: char-to-string


From: Kenichi Handa
Subject: Re: char-to-string
Date: Fri, 9 Feb 2001 20:29:30 +0900 (JST)
User-agent: SEMI/1.14.3 (Ushinoya) FLIM/1.14.2 (Yagi-Nishiguchi) APEL/10.2 Emacs/21.0.97 (sparc-sun-solaris2.6) MULE/5.0 (SAKAKI)

"Eli Zaretskii" <address@hidden> writes:
>>  > Won't this have potentially bad effects on eight-bit-*
>>  > character support, e.g. when searching for them in a
>>  > buffer?
>>  
>>  Do you have in your mind some concrete example?

> No.  I just thought about the possibility that some primitive might
> convert unibyte characters such as \200 into multibyte eight-bit-*
> characters (to provide the behavior users expect).  If such cases do
> exist, they might conflict with this change.

I don't know what kind of conflict you are afraid of.

The primitives that silently did such a conversion were
char-to-string and string.  With the change, they no longer
do that.  Functions such as format and concat never did
that conversion in the first place.

Ex: (multibyte-string-p (format "%c" ?\200)) => nil
    (multibyte-string-p (concat '(?\200))) => nil

Now the behaviours of char-to-string and string are
consistent with them.

Do you know any other functions that make a string from a
character code?  If so, we must check their behaviour too.

>>  I think all of these should produce the same lisp string:
>>    "\200\343"
>>    (concat "\200" "\343")
>>    (concat (char-to-string ?\200) (char-to-string ?\343))
>>    (concat '(?\200 ?\343))
>>    (string ?\200 ?\343)
>>    (mapconcat 'char-to-string '(?\200 ?\343) "")
>>  and should work the same way on inserting, searching, etc.

> Yes, but I remember that some of the primitives silently convert
> between unibyte and multibyte because users expect that.  Isn't that
> the case?

The conversion that users expect is, for instance, of this
kind (assuming a Latin-1 language environment):
   (concat "\300" " is À (Latin1 A-grave)") => "À is À (Latin1 A-grave)"
In this case, the unibyte string "\300" is converted to "À"
to keep the semantics of the character, not its code.

---
Ken'ichi HANDA
address@hidden


