emacs-devel


From: Stefan Monnier
Subject: Re: unibyte<->multibyte conversion [Re: Emacs-diffs Digest, Vol 2, Issue 28]
Date: Wed, 22 Jan 2003 09:12:49 -0500

>     While we're at it, how about making string-as-multibyte obsolete ?
> 
> It is not obsolete--there are reasons to use it.

But it can be replaced by a call to decode-coding-string, so it is
not indispensable.

>     I think avoiding string-FOO-multibyte and using decode-coding-string
>     instead would make things a lot more clear.
> 
> I don't see any advantage in the change.

Here is the reason why we should discourage the use of unibyte<->multibyte
conversions and recommend coding/decoding instead:

There is a lot of confusion among Emacs hackers about "what's this MULE
stuff" and "why does Emacs do conversions instead of keeping things as
they are", typically among users of latin-1 locales (but more generally
of any 8-bit locale) who don't understand the difference between bytes
and chars.

This is of course why we introduced unibyte buffers in the first place:
a lot of code was not properly updated to MULE and was not doing
conversions where they're necessary.

So where does the unibyte<->multibyte stuff come in?  I think it
simply promotes the illusion that it is possible to "switch between
the two equivalent representations" even though there's clearly no
unambiguous equivalence.  So people end up with "oh, I have a unibyte
thing here and Emacs wants a multibyte thing instead, so I'll just make
it multibyte" using some kind of default encoding which "should work
most of the time".
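
To make the ambiguity concrete, here is a minimal sketch, assuming a
reasonably recent Emacs where `unibyte-string' and the `utf-8' coding
system are available.  The helper name `my-decode-both-ways' and the byte
values are made up for illustration; the point is only that the same
bytes give different text depending on the coding system you name.

```elisp
;; Hypothetical helper (not part of Emacs): decode the same unibyte
;; string under two different coding systems and report how many
;; characters each decoding produces.
(defun my-decode-both-ways (bytes)
  "Return the character counts of BYTES decoded as UTF-8 and as Latin-1."
  (let ((as-utf-8   (decode-coding-string bytes 'utf-8))
        (as-latin-1 (decode-coding-string bytes 'iso-8859-1)))
    (list (length as-utf-8) (length as-latin-1))))

;; The two bytes #xC3 #xA9 are one character when read as UTF-8 and two
;; characters when read as Latin-1.
(my-decode-both-ways (unibyte-string #xC3 #xA9))
```

Neither answer is "the" multibyte equivalent of those two bytes; which
one is right depends on information the bytes themselves don't carry.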

If coders such as Eli and myself don't fully understand the semantics
of string-as-multibyte and string-make-multibyte (and the various ways
in which they are implicitly called), it's clear that those functions
should basically not be used by anyone.

Using decode-coding-string is just as easy and makes things much
clearer, so we should encourage it.
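
As a rough sketch of what that looks like in practice (again assuming a
recent Emacs; `my-decode-bytes' is a hypothetical wrapper, not an Emacs
function), the coding system is stated at the boundary and the reverse
direction goes through encode-coding-string:

```elisp
;; Hypothetical wrapper: the caller must say which coding system the
;; bytes are in, so nothing is left to a default.
(defun my-decode-bytes (bytes coding-system)
  "Decode unibyte string BYTES as text encoded in CODING-SYSTEM."
  (decode-coding-string bytes coding-system))

(defun my-latin-1-round-trip ()
  "Decode one Latin-1 byte to text, encode it back, and report the results."
  (let* ((text  (my-decode-bytes (unibyte-string #xE9) 'iso-8859-1))
         (bytes (encode-coding-string text 'iso-8859-1)))
    (list (multibyte-string-p text)              ; decoded text is multibyte
          (multibyte-string-p bytes)             ; encoded result is unibyte
          (equal bytes (unibyte-string #xE9))))) ; same byte we started with

(my-latin-1-round-trip)
```

A reader of this code can see which encoding was intended, which is not
the case when the conversion happens implicitly.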


        Stefan




