Re: Unibyte characters, strings, and buffers
From: David Kastrup
Subject: Re: Unibyte characters, strings, and buffers
Date: Fri, 28 Mar 2014 12:34:56 +0100
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/24.4.50 (gnu/linux)

Andreas Schwab <address@hidden> writes:
> David Kastrup <address@hidden> writes:
>
>> "Stephen J. Turnbull" <address@hidden> writes:
>>
>>> I agree that having a way to represent "undecodable bytes" in a string
>>> or buffer is extremely convenient. XEmacs's lack of this capability
>>> is surely a deficiency (Hi, David K!)
>>
>> Doing this in an utf-8 based internal coding is somewhat doable by
>> employing non-utf-8 sequences. Either using code points above the
>> Unicode code range (2^20 + something, requiring 4 bytes), or by using
>> non-minimal encodings (since the minimal ones are two bytes, requiring 3
>> bytes). Either way, the size increases significantly.
>
> Emacs uses U3fff80-U3fffff for raw 8-bit bytes, internally represented
> by 2 bytes.
Well, I forgot the non-minimal encodings for 0x00-0x7f, namely two-byte
sequences starting with 0xc0 or 0xc1 and ending with 0x80-0xbf.
Those would still fit the representation invariants. Are those the
two-byte encodings used for "raw 0x80 to 0xff"?
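
To make the arithmetic concrete, here is a minimal sketch in Python. It only illustrates the bit layout described above: a two-byte sequence with lead byte 0xc0 or 0xc1 carries 7 payload bits, so the 128 values 0x00-0x7f could each stand for one raw byte 0x80-0xff. The function names and the "add 0x80" mapping are assumptions made for illustration; the sketch does not claim to reproduce what the Emacs sources actually do.

```python
def decode_two_byte(lead: int, trail: int) -> int:
    """Decode a 110xxxxx 10yyyyyy pair without rejecting non-minimal forms."""
    if lead & 0xE0 != 0xC0 or trail & 0xC0 != 0x80:
        raise ValueError("not a two-byte UTF-8-style sequence")
    return ((lead & 0x1F) << 6) | (trail & 0x3F)


def is_non_minimal_lead(lead: int) -> bool:
    """Lead bytes 0xc0 and 0xc1 can only encode 0x00-0x7f, i.e. overlong forms."""
    return lead in (0xC0, 0xC1)


def raw_byte_to_pair(byte: int) -> tuple[int, int]:
    """Hypothetical mapping: raw byte 0x80-0xff -> non-minimal two-byte pair."""
    if not 0x80 <= byte <= 0xFF:
        raise ValueError("expected a byte in 0x80-0xff")
    value = byte - 0x80                      # 7-bit payload, 0x00-0x7f
    return (0xC0 | (value >> 6), 0x80 | (value & 0x3F))


def pair_to_raw_byte(lead: int, trail: int) -> int:
    """Inverse of raw_byte_to_pair (same hypothetical mapping)."""
    if not is_non_minimal_lead(lead):
        raise ValueError("lead byte is not 0xc0 or 0xc1")
    return decode_two_byte(lead, trail) + 0x80


if __name__ == "__main__":
    for b in (0x80, 0xBF, 0xC0, 0xFF):
        lead, trail = raw_byte_to_pair(b)
        print(f"raw 0x{b:02x} <-> 0x{lead:02x} 0x{trail:02x}")
```

A strict UTF-8 decoder rejects these pairs, which is what would keep them from colliding with the ordinary one-byte encodings of 0x00-0x7f.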
--
David Kastrup