Re: utf8 and emacs text/string multibyte representation
From: Eli Zaretskii
Subject: Re: utf8 and emacs text/string multibyte representation
Date: Wed, 29 Oct 2014 16:51:35 +0200
> From: Camm Maguire <address@hidden>
> Date: Wed, 29 Oct 2014 10:04:58 -0400
>
> Greetings! I've recently been considering supporting unicode in gcl by
> representing strings internally in utf8. It appears that emacs does the
> same or similar.
If you haven't already, you can find some basic description of what
Emacs does in the node "Text Representations" of the ELisp manual.
> Apart from the obvious memory footprint benefits, I'd
> like to ask what other advantages/disadvantages have been discovered.
You have basically said it yourself: memory footprint vs
addressability. If you want to discuss this in more detail, I suggest
asking more specific questions about the aspects that concern you.
> A cached internal pointer storing the last referenced codepoint
> offset makes access essentially O(1).
We indeed maintain a cache for byte-to-character and character-to-byte
conversions.
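To make the idea of a position cache concrete, here is a minimal sketch.
It is not Emacs's code: it assumes plain UTF-8 and a single cached
(character, byte) pair, and the names Utf8Text and char_to_byte are
hypothetical. A lookup scans from whichever known position (start of
text or cache) is closer, so nearby accesses cost only a few bytes of
scanning.

```python
class Utf8Text:
    """UTF-8 bytes plus a one-entry cache of the last (char, byte) position.

    Illustrative only: assumes valid UTF-8 and a single cached pair.
    """

    def __init__(self, text: str) -> None:
        self.data = text.encode("utf-8")
        self.nchars = len(text)
        self.cache = (0, 0)  # (character index, byte offset), kept in sync

    @staticmethod
    def _is_lead(byte: int) -> bool:
        # Continuation bytes look like 0b10xxxxxx; anything else starts a character.
        return (byte & 0xC0) != 0x80

    def char_to_byte(self, charpos: int) -> int:
        """Return the byte offset of character number charpos (0-based)."""
        if not 0 <= charpos <= self.nchars:
            raise IndexError(charpos)
        # Start from whichever known position is closer: text start or cache.
        c, b = min([(0, 0), self.cache], key=lambda p: abs(p[0] - charpos))
        while c < charpos:  # scan forward one character at a time
            b += 1
            while b < len(self.data) and not self._is_lead(self.data[b]):
                b += 1
            c += 1
        while c > charpos:  # scan backward one character at a time
            b -= 1
            while not self._is_lead(self.data[b]):
                b -= 1
            c -= 1
        self.cache = (c, b)
        return b
```

Sequential access then touches only the bytes between consecutive
positions, which is where the "essentially O(1)" behaviour comes from;
a random access far from both anchors still costs a linear scan.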
> Yet setting string elements can trigger reallocations/memmove
> operations.
Emacs, like every editor, needs to handle this efficiently anyway,
because editing operations rarely leave the buffer size unchanged. So
Emacs uses a gap to minimize reallocations.
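For readers unfamiliar with the technique, a gap buffer can be sketched
roughly as follows. This is an illustration under simplifying
assumptions, not Emacs's implementation: the class GapBuffer, its
method names and the growth policy are all hypothetical, and positions
are byte positions that are not bounds-checked.

```python
class GapBuffer:
    """Text stored as [before-gap][gap][after-gap] in one bytearray."""

    def __init__(self, initial: bytes = b"", gap_size: int = 16) -> None:
        self.buf = bytearray(initial) + bytearray(gap_size)
        self.gap_start = len(initial)   # first unused byte
        self.gap_end = len(self.buf)    # first used byte after the gap

    def _move_gap(self, pos: int) -> None:
        """Slide the gap so that it starts at text position pos."""
        if pos < self.gap_start:
            n = self.gap_start - pos
            # Shift the n bytes before the gap to just after it.
            self.buf[self.gap_end - n:self.gap_end] = self.buf[pos:self.gap_start]
            self.gap_start -= n
            self.gap_end -= n
        elif pos > self.gap_start:
            n = pos - self.gap_start
            # Shift the n bytes after the gap to just before it.
            self.buf[self.gap_start:self.gap_start + n] = \
                self.buf[self.gap_end:self.gap_end + n]
            self.gap_start += n
            self.gap_end += n

    def insert(self, pos: int, data: bytes) -> None:
        self._move_gap(pos)
        if len(data) > self.gap_end - self.gap_start:
            # Gap too small: enlarge it. This is the only step that reallocates.
            extra = len(data) + 16
            self.buf[self.gap_end:self.gap_end] = bytearray(extra)
            self.gap_end += extra
        self.buf[self.gap_start:self.gap_start + len(data)] = data
        self.gap_start += len(data)

    def delete(self, pos: int, count: int) -> None:
        self._move_gap(pos)
        # Deleted bytes simply become part of the gap.
        self.gap_end = min(self.gap_end + count, len(self.buf))

    def contents(self) -> bytes:
        return bytes(self.buf[:self.gap_start] + self.buf[self.gap_end:])
```

Repeated edits at or near the same position only move the gap
boundaries; the text is copied when the gap has to travel, and
reallocated only when the gap is exhausted.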
> While these can be aggregated over the setting of multiple elements,
> operations like nreverse look ridiculous if left in terms of calls
> to aref and aset.
nreverse applied to a string is a rarity, IME.
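For the cases where it does come up, a destructive reverse need not go
through per-element access at all. The sketch below is one possible
byte-level approach, assuming valid plain UTF-8 (not Emacs's extended
internal encoding); the function name nreverse_utf8 is hypothetical. It
reverses the bytes of each character, then reverses the whole buffer,
which leaves every character's bytes in their original order but the
characters themselves reversed.

```python
def nreverse_utf8(buf: bytearray) -> None:
    """Reverse the code points of a valid UTF-8 buffer in place."""
    n = len(buf)
    i = 0
    while i < n:
        # Find the end of the character starting at i: skip its
        # continuation bytes (0b10xxxxxx).
        j = i + 1
        while j < n and (buf[j] & 0xC0) == 0x80:
            j += 1
        # Reverse this character's bytes (a no-op for ASCII).
        buf[i:j] = buf[i:j][::-1]
        i = j
    # Reversing everything restores each character's byte order
    # while reversing the order of the characters.
    buf.reverse()
```

Both passes are linear and the buffer never changes size, so no
reallocation or memmove is triggered. Note that this reverses code
points, not user-perceived characters: combining sequences end up with
their marks in front of the base character.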