Re: eight-bit char handling in emacs-unicode
From: Stefan Monnier
Subject: Re: eight-bit char handling in emacs-unicode
Date: 25 Nov 2003 10:43:05 -0500
User-agent: Gnus/5.09 (Gnus v5.9.0) Emacs/21.3.50

> It seems that you keep saying that "A does B, thus it's
> nonsense". But, I'm arguing that "A does C".
Well, the thing is: I still don't understand what is C.
From what I understand, you say that C is "a conversion from multibyte
to a sequence of code-points", but since the output is a unibyte string,
that restricts it to cases where the code-points can be encoded in 8 bits,
thus it doesn't sound very generic and I don't see any application for it
(nor do I see any practical difference with using encode-coding-string
since the output AFAIK would be the same).
> It doesn't make sense because you treat the result as "a
> unibyte string encoded in Latin-1".
> It makes sense if you treat the result as "a unibyte string
> in which each byte represents a sequence of Unicode
> code-points", doesn't it?
But each byte can only represent the 0-255 subset of Unicode code-points, in
which case this is equivalent (practically speaking) to latin-1, isn't it?
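
To make that equivalence concrete, here is a minimal sketch in Python (not
Emacs code; the helper name code_points_to_bytes is hypothetical). It stores
each code point directly as one byte, and for any text limited to
U+0000..U+00FF the result is byte-for-byte what a latin-1 encoder produces.
Anything above U+00FF cannot be stored and signals an error.

```python
def code_points_to_bytes(text: str) -> bytes:
    """Store each code point as a single byte (hypothetical helper).

    bytes() raises ValueError for any value above 255, so only the
    U+0000..U+00FF subset can be represented.
    """
    return bytes(ord(ch) for ch in text)


sample = "caf\u00e9 \u00ff"  # only code points below U+0100

# For this subset, "one byte per code point" and latin-1 agree exactly.
assert code_points_to_bytes(sample) == sample.encode("latin-1")
```
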
>> It'd make sense if the environment said "latin-1 when you can,
>> utf-8 otherwise" or something like that, but then we would use
>> encode-coding-string anyway.
> It's itself nonsense to have such a coding system.
I was not thinking of a coding-system, but just some encoding job,
such as what is done when saving a buffer (where my .emacs does exactly
that: try latin-1 first and utf-8 if that fails).
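
As an illustration of that kind of encoding job, here is a minimal Python
sketch (not the actual .emacs code, which is not shown in this message; the
function name encode_latin1_or_utf8 is hypothetical). It tries latin-1 first
and falls back to utf-8 only when some character does not fit.

```python
from typing import Tuple


def encode_latin1_or_utf8(text: str) -> Tuple[bytes, str]:
    """Return (encoded bytes, name of the encoding that was used)."""
    try:
        # Succeeds only if every character is in U+0000..U+00FF.
        return text.encode("latin-1"), "latin-1"
    except UnicodeEncodeError:
        # At least one character is outside latin-1: fall back to utf-8.
        return text.encode("utf-8"), "utf-8"


print(encode_latin1_or_utf8("caf\u00e9")[1])   # latin-1
print(encode_latin1_or_utf8("\u20ac")[1])      # utf-8
```
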
> Do you agree with having string-make-unibyte if it signals an error on
> non-Latin-1 characters?
Of course: that's pretty much what I suggested: string-make-unibyte only
accepts multibyte chars that correspond to "bytes".
>> I just don't know of a concrete case where it makes sense to use
>> string-make-unibyte.
> I'll paraphrase my previous example as this:
> It is perfectly possible to live in such an environment
> where only the characters U+0000..U+00FF of Unicode are
> used but only the coding system utf-8 is used.
> But, I don't claim that the above is a realistic case.
> Another non-realistic but concrete case is:
> Use only the charset iso-8859-5 and the encoding CTEXT.
I don't see any use of string-make-unibyte in your two examples.
And "having string-make-unibyte if it signals an error on non-Latin-1
characters" means that the second example can't be used any more.
Stefan "still in the dark"