Re: [Emacs-diffs] master 58a3c54: Avoid using string-make-unibyte in sel

From: Stefan Monnier
Subject: Re: [Emacs-diffs] master 58a3c54: Avoid using string-make-unibyte in select.el
Date: Sat, 22 Jun 2019 12:44:05 -0400
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/27.0.50 (gnu/linux)

>> > +            (or (null (multibyte-string-p str))
>> > +                (setq str (encode-coding-string str 'raw-text-unix))))
>> Isn't this the same as (setq str (string-to-unibyte str))?
> No, because the former doesn't signal an error.

Oh, right, that's yet another subtle distinction between all those alternatives.
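
To make that distinction concrete, here is a minimal sketch (not code from
the commit). The `demo-` names are made up for illustration; only
`string-to-unibyte`, `string-to-multibyte`, `encode-coding-string` and
`multibyte-string-p` are real functions.

```elisp
;; "\300\301" reads as a unibyte string of two bytes; `string-to-multibyte'
;; turns them into raw-byte (eight-bit) characters in a multibyte string.
(defvar demo-raw (string-to-multibyte "\300\301"))

;; A multibyte string containing a real non-ASCII character (e-acute).
(defvar demo-text "\u00e9")

(defun demo-try-to-unibyte (str)
  "Call `string-to-unibyte' on STR.
Return (ok . RESULT) on success or (error . MESSAGE) if it signals."
  (condition-case err
      (cons 'ok (string-to-unibyte str))
    (error (cons 'error (error-message-string err)))))

(defun demo-encode (str)
  "Convert STR the way the patch does, via `raw-text-unix'."
  (if (multibyte-string-p str)
      (encode-coding-string str 'raw-text-unix)
    str))
```

On `demo-raw` both approaches should hand back the same two bytes.  On
`demo-text`, `demo-try-to-unibyte` reports an error while `demo-encode`
quietly returns a unibyte string.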

BTW, do we actually need to convert to unibyte here?
(most places where we expect a unibyte string, we silently convert from
multibyte when needed, in a way that's basically equivalent to the

> (And I didn't want to use any of those string-to/as-uni/multibyte
> functions anyway.)

I hated those functions and still do for the string-as and string-make
variety, but I'm beginning to like the string-to variety when we need to
convert the representation of a sequence of *bytes* without
encoding/decoding them as chars.

So maybe the present case argues for adding a `no-error` argument to
string-to-unibyte.  I say this because to me (encode-coding-string str
'raw-text-unix) is an oxymoron since `raw-text-unix` is a synonym of
`binary` and `no-conversion`, which basically says "don't do any
encoding/decoding, instead preserve bytes as bytes".
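
A rough sketch of what such an argument might look like, written as a
wrapper rather than a change to the real function.  The name
`demo-string-to-unibyte` and the choice of fallback are assumptions for
illustration only; string-to-unibyte itself takes no such argument.

```elisp
(defun demo-string-to-unibyte (string &optional no-error)
  "Return a unibyte version of STRING (hypothetical helper).
Without NO-ERROR, behave like `string-to-unibyte' and signal an error
if STRING contains anything other than ASCII and raw bytes.
With NO-ERROR non-nil, never signal; as an assumption, fall back to
encoding with `raw-text-unix', like the patch under discussion."
  (cond
   ;; Already unibyte: nothing to convert.
   ((not (multibyte-string-p string)) string)
   ;; Lenient mode: reuse the encoding-based conversion.
   (no-error (encode-coding-string string 'raw-text-unix))
   ;; Strict mode: defer to the existing primitive.
   (t (string-to-unibyte string))))
```

With this, the call in select.el could read
(demo-string-to-unibyte str t), which states the intent (change the
representation of bytes) without naming a coding-system.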

IOW coding-systems like `raw-text` make sense in places like the
`coding:` tag or in buffer-file-coding-system, where we are forced to
put some kind of coding-system and where it is hence handy to be able to
use `raw-text-unix` to basically skip the en/decoding.
But I find them confusing when passed as a constant to

> The only thing we are supposed to do in the multibyte case is to make
> sure the raw bytes are converted to their single-byte representation,
> which is exactly what raw-text-unix does.

Right (and indeed string-make-unibyte worked in practice for the same
reason that encoding with pretty much any coding-system preserves the
bytes as well, save for a few exceptions like utf-8-emacs).


