Re: [Gcl-devel] utf8 and emacs text/string multibyte representation
From: Camm Maguire
Subject: Re: [Gcl-devel] utf8 and emacs text/string multibyte representation
Date: Thu, 30 Oct 2014 10:16:15 -0400
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/23.4 (gnu/linux)

Greetings! Don't worry -- I'm not committed to this idea yet, just
exploring!

Do these other lisps allocate a fresh character on each aref? Do they
maintain some ~2^21 sized table in core? (And isn't emacs a "lisp"
:-)).

Take care,

Raymond Toy <address@hidden> writes:
>>>>>> "Camm" == Camm Maguire <address@hidden> writes:
>
> Camm> Greetings! I've recently been considering supporting unicode in gcl by
> Camm> representing strings internally in utf8. It appears that emacs does the
> Camm> same or similar. Apart from the obvious memory footprint benefits, I'd
> Camm> like to ask what other advantages/disadvantages have been discovered.
> Camm> Much of the utf8 literature emphasizes that most algorithms can proceed
> Camm> conventionally in byte-wise fashion, including lexicographical ordering
> Camm> comparisons, given that almost all jobs are sequential, at least
> Camm> initially. A cached internal pointer storing the last referenced
> Camm> codepoint offset makes access essentially O(1). Yet setting string
> Camm> elements can trigger reallocations/memmove operations. While these can
> Camm> be aggregated over the setting of multiple elements, operations like
> Camm> nreverse look ridiculous if left in terms of calls to aref and aset.
>
> Camm> Thoughts, advice and experiences most appreciated.
>
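
As an illustration of the cached-pointer idea in the quoted message, here
is a minimal Python sketch. It is not GCL or Emacs code: the class
Utf8String and its methods aref/aset are hypothetical names chosen for the
example, and the stored bytes are assumed to be valid UTF-8. It shows why a
forward sequential walk is cheap, and why setting an element can shift the
rest of the buffer.

```python
class Utf8String:
    """Hypothetical sketch: text kept as UTF-8 bytes plus a cached cursor."""

    def __init__(self, text: str) -> None:
        self.data = bytearray(text.encode("utf-8"))
        self.cached_index = 0    # code point index of the last access
        self.cached_offset = 0   # byte offset where that code point starts

    @staticmethod
    def _width(lead: int) -> int:
        # Length of the UTF-8 sequence that starts with this lead byte.
        if lead < 0x80:
            return 1
        if lead < 0xE0:
            return 2
        if lead < 0xF0:
            return 3
        return 4

    def _seek(self, index: int) -> int:
        if index < 0:
            raise IndexError(index)
        # Only a backwards move forces a rescan from the start.
        if index < self.cached_index:
            self.cached_index = 0
            self.cached_offset = 0
        while self.cached_index < index:
            # Indexing past the end raises IndexError before the cursor moves.
            self.cached_offset += self._width(self.data[self.cached_offset])
            self.cached_index += 1
        if self.cached_offset >= len(self.data):
            raise IndexError(index)
        return self.cached_offset

    def aref(self, index: int) -> str:
        start = self._seek(index)
        end = start + self._width(self.data[start])
        return self.data[start:end].decode("utf-8")

    def aset(self, index: int, char: str) -> None:
        assert len(char) == 1, "expects a single code point"
        start = self._seek(index)
        end = start + self._width(self.data[start])
        # When the encoded widths differ, this slice assignment moves the
        # whole tail of the buffer (the memmove-style cost noted above).
        self.data[start:end] = char.encode("utf-8")


s = Utf8String("na\u00efve")             # 5 code points stored in 6 bytes
print([s.aref(i) for i in range(5)])     # forward walk reuses the cursor
s.aset(2, "i")                           # 2-byte sequence replaced by 1 byte
print(len(s.data))
```

In this sketch a forward walk costs one step per access, while any
backwards access rescans from the start, so an nreverse written purely in
terms of aref and aset would rescan and shift bytes repeatedly.
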
> Have you looked at what other Lisp implementations do? AFAIK, none use
> utf-8. CCL and clisp use utf-32, cmucl and allegro use utf-16, sbcl
> and ecl(?) have two string types: 8-bit base-string and 32-bit
> strings.
>
> As a one-man operation (unfortunately), I'd go with the easiest one to
> get right and follow either ccl or cmucl. The rest of the support for
> unicode can be added with libraries like cl-unicode and/or babel, if
> need be.
>
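
To put rough numbers on the footprint side of the representations listed
above, here is a small Python sketch. The function name footprint is
hypothetical, and it only measures the encoded payload without a byte-order
mark; per-object headers and the separate 8-bit base-string type are not
modeled.

```python
def footprint(text: str) -> dict:
    """Bytes needed to store `text` in each encoding (payload only, no BOM)."""
    encodings = ("utf-8", "utf-16-le", "utf-32-le")
    return {enc: len(text.encode(enc)) for enc in encodings}


# ASCII text, Greek letters, and one code point outside the BMP.
for sample in ("hello", "\u03b1\u03b2\u03b3", "\U0001F600"):
    print(repr(sample), footprint(sample))
```

ASCII-heavy text is where utf-8 saves the most; for text outside ASCII the
gap narrows, and utf-32 is the only one of the three with a fixed width per
code point.
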
> --
> Ray
--
Camm Maguire address@hidden
==========================================================================
"The earth is but one country, and mankind its citizens." -- Baha'u'llah