emacs-devel

Re: Unibyte characters, strings, and buffers


From: Stephen J. Turnbull
Subject: Re: Unibyte characters, strings, and buffers
Date: Sun, 30 Mar 2014 00:37:27 +0900

Andreas Schwab writes:
 > "Stephen J. Turnbull" <address@hidden> writes:
 > 
 > > *sigh*  No, it's about unibyte being a premature pessimization.
 > 
 > Unibyte is a pure space optimisation.

It may be a space optimization, but it's hardly pure.  Else this
discussion wouldn't be happening.  And `string-as-unibyte' exposes the
internal representation of strings to Lisp.

 > Everything else should work as if all bytes in the range 128-255
 > are decoded in the eight-bit charset.

There seem to be conflicting opinions about that, and I would
certainly disagree, as there are scads of European charsets that
happily fit into bytes.  I see no reason why character operations
(such as case conversion) shouldn't work transparently on bytes in GR
(the high half, 0xA0-0xFF in ISO 2022 terms), interpreted as the
corresponding Latin-1 (or any ISO Latin) charset, with a little extra
metadata in (internal unibyte) buffers and strings to indicate the
charset implied.  (This charset is independent of the
various coding systems associated with buffers; it only says how to
interpret a byte as a character in operations on characters in
buffers.)
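
To make the idea concrete, here is a rough sketch of byte-per-character
text that carries a charset tag, with case conversion consulting that
tag.  It is written in Python purely for illustration and says nothing
about how Emacs would implement it; the names `TaggedBytes` and
`_upcase_byte` are hypothetical, and leaving a byte unchanged when its
uppercase form does not fit in one byte of the same charset is an
assumption of this sketch.

```python
from dataclasses import dataclass


def _upcase_byte(b: int, charset: str) -> int:
    """Upcase one byte under `charset`; return it unchanged if that is impossible."""
    try:
        ch = bytes([b]).decode(charset)
    except UnicodeDecodeError:
        return b                      # byte is undefined in this charset
    up = ch.upper()
    if len(up) != 1:
        return b                      # e.g. sharp s, whose uppercase is "SS"
    try:
        enc = up.encode(charset)
    except UnicodeEncodeError:
        return b                      # uppercase form is not in this charset
    return enc[0] if len(enc) == 1 else b


@dataclass(frozen=True)
class TaggedBytes:
    """One byte per character, plus metadata naming the implied charset."""
    data: bytes
    charset: str = "latin-1"

    def upcase(self) -> "TaggedBytes":
        # The result keeps the same charset tag and the same length.
        return TaggedBytes(bytes(_upcase_byte(b, self.charset) for b in self.data),
                           self.charset)


print(TaggedBytes(b"caf\xe9").upcase().data)             # b'CAF\xc9'
print(TaggedBytes(b"\xff").upcase().data)                # b'\xff'
print(TaggedBytes(b"\xff", "iso8859_15").upcase().data)  # b'\xbe'
```

The last two lines show why the tag matters: byte 0xFF is y-diaeresis in
both Latin-1 and Latin-9, but only Latin-9 has the capital form (at
0xBE), so the same byte upcases differently depending on the charset.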
