Re: eight-bit char handling in emacs-unicode
From: Stefan Monnier
Subject: Re: eight-bit char handling in emacs-unicode
Date: 17 Nov 2003 16:17:56 -0500
User-agent: Gnus/5.09 (Gnus v5.9.0) Emacs/21.3.50

> The basic problem is that we don't distinguish a character
> (code) and a number. So, we introduce a character object
That's one way to look at the problem.
Another is to say that the problem is instead that we do not distinguish
between arrays of chars and arrays of bytes. We just use strings and
buffers and expect to be able to mix bytes and chars in them.
Such mixes are admittedly very rare for strings, but they're pretty common
for buffers.
So when we write 192 at a location, we don't know whether we should put
there the byte 192 or the eight-bit-char character that will be encoded
into a 192 byte.
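
To make the ambiguity concrete, here is a small sketch. It is written in Python rather than Emacs Lisp, purely as an analogy, and it assumes a UTF-8 based internal representation; the variable names are made up for the illustration.

```python
# Analogy only: Python bytes/str stand in for "array of bytes" and
# "array of chars".  Nothing here is Emacs code.
code = 192

as_byte = bytes([code])  # reading 1: the raw byte 0xC0
as_char = chr(code)      # reading 2: the character U+00C0 (A with grave)

# Assuming a UTF-8 based representation, the character takes two bytes,
# so the two readings of "192" put different contents in the buffer.
stored_for_char = as_char.encode("utf-8")

print(as_byte)          # b'\xc0'
print(stored_for_char)  # b'\xc3\x80'
```

The integer alone does not say which of the two results the caller wanted.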
In Emacs-21 we worked around the problem by arranging for "the
eight-bit-char that encodes to 192" to be represented by the integer 192, so
as to avoid having to choose. But with unicode, the 128-255 zone cannot be
dedicated to eight-bit-char since it's already used up for latin-1, so we
have to face the problem more directly.
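
One way to picture what "facing the problem directly" could mean is an explicit conversion that moves bytes 128-255 into a zone of their own, away from latin-1. The following is only a sketch in Python under that assumption: characters are modelled as plain integers, `RAW_BYTE_BASE` is an arbitrary value chosen for the illustration, and `byte_to_char` / `char_to_byte` are hypothetical helpers, not existing Emacs functions.

```python
from typing import Optional

# Assumption: an arbitrary offset above the Unicode range (max 0x10FFFF),
# so the raw-byte zone cannot collide with any real character.
RAW_BYTE_BASE = 0x3FFF00


def byte_to_char(byte: int) -> int:
    """Return the character code that stands for the raw byte BYTE."""
    if not 0 <= byte <= 0xFF:
        raise ValueError("not a byte: %r" % (byte,))
    # ASCII bytes are their own characters; only 128-255 need a separate zone.
    return byte if byte < 0x80 else RAW_BYTE_BASE + byte


def char_to_byte(char: int) -> Optional[int]:
    """Return the raw byte CHAR stands for, or None if it is an ordinary char."""
    if 0 <= char < 0x80:
        return char
    if RAW_BYTE_BASE + 0x80 <= char <= RAW_BYTE_BASE + 0xFF:
        return char - RAW_BYTE_BASE
    return None


print(byte_to_char(192) == 192)         # False: no longer the latin-1 code
print(char_to_byte(byte_to_char(192)))  # 192
print(char_to_byte(192))                # None: 192 is now only latin-1
```

With such a mapping the integer 192 always means the latin-1 character, and code that really wants the byte has to say so explicitly.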
In the places where Emacs-21 still had to choose, we just used heuristics,
so `concat' will sometimes return a unibyte string and sometimes a
multibyte string.
So I think your options 1-3 are better than 4. BTW, your function
`eight-bit-char' should be named `byte-to-char' instead.
Which of 1 to 3 is the best is not clear, and maybe we can just live with
`make-string-unibyte' and `make-string-multibyte'. Note that 1-3 are
not mutually exclusive so we can use them all.
Stefan