[Implementation:] copy on write strings.
From: Dirk Herrmann
Subject: [Implementation:] copy on write strings.
Date: Fri, 22 Sep 2000 22:05:59 +0200 (MEST)
Hi!
I enclose a sample implementation of copy-on-write strings for guile. The
files char-field.[ch] implement a type `char-field' for character arrays,
which can be shared among strings and symbols, and maybe also among
other types. The files newstrings.[ch] implement a string type
`newstring', which makes use of `char-fields'. The idea is that
different strings can share the same `char-field', and that a copy is
made only when a string is about to be written to while it still shares
its `char-field' with other objects.
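To make the sharing scheme concrete, here is a minimal sketch in plain C. It is not the code from char-field.[ch] or newstrings.[ch]: the names char_field, cow_string and the cow_* functions are hypothetical, and a simple reference count is assumed as the way to detect sharing.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical shared character buffer, standing in for a `char-field'. */
typedef struct {
    size_t refs;   /* number of strings currently using this buffer */
    size_t len;    /* number of characters stored */
    char  *chars;  /* the characters (not NUL-terminated) */
} char_field;

/* Hypothetical string: a window (start, len) into a shared buffer. */
typedef struct {
    char_field *field;
    size_t start;
    size_t len;
} cow_string;

static char_field *field_new(const char *src, size_t len)
{
    char_field *f = malloc(sizeof *f);
    if (f == NULL) abort();
    f->chars = malloc(len ? len : 1);
    if (f->chars == NULL) abort();
    memcpy(f->chars, src, len);
    f->refs = 1;
    f->len = len;
    return f;
}

static void field_unref(char_field *f)
{
    if (--f->refs == 0) {
        free(f->chars);
        free(f);
    }
}

static cow_string cow_from_cstr(const char *s)
{
    cow_string r;
    r.len = strlen(s);
    r.start = 0;
    r.field = field_new(s, r.len);
    return r;
}

/* Taking a substring shares the buffer: no characters are copied. */
static cow_string cow_substring(cow_string s, size_t start, size_t end)
{
    cow_string r;
    assert(start <= end && end <= s.len);
    r.field = s.field;
    r.start = s.start + start;
    r.len = end - start;
    s.field->refs++;
    return r;
}

static char cow_ref(cow_string s, size_t i)
{
    assert(i < s.len);
    return s.field->chars[s.start + i];
}

/* Writing copies first, but only if the buffer is shared. */
static void cow_set(cow_string *s, size_t i, char c)
{
    assert(i < s->len);
    if (s->field->refs > 1) {
        char_field *own = field_new(s->field->chars + s->start, s->len);
        field_unref(s->field);
        s->field = own;
        s->start = 0;
    }
    s->field->chars[s->start + i] = c;
}

static void cow_free(cow_string *s)
{
    field_unref(s->field);
    s->field = NULL;
}
```

With these definitions, cow_substring() costs the same regardless of the substring's length, and the first cow_set() on a string that still shares its buffer pays for a private copy of that string's characters.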
The type `newstring' is not fully integrated into guile: You cannot use a
newstring in places where guile currently expects its old string type.
Thus, this implementation is for experimental purposes. It may form a
basis for a usable implementation, though.
Additional files hold a set of test cases and a set of benchmarks. In
general, the implementation performs reasonably well. It is obviously much
faster for operations where sharing can be exploited. However, it is
possible to get really bad performance in certain situations: If
you have a long string, create a short substring from that string and then
modify the long string, the long string will be copied
because it shares the `char-field' with its substring. With guile's current
string implementation, the substring operation itself would already have
made a copy of the short substring, and the later write would copy nothing.
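The bad case can be put into numbers with a back-of-the-envelope cost model. The sketch below only counts characters copied for the sequence "take a short substring, then write once to the long string"; the function names are hypothetical, and it assumes, as described above, that the string being written is the one that gets copied in full.

```c
#include <assert.h>
#include <stddef.h>

/* Eager copying: the substring operation copies the short part,
   and the later write to the long string copies nothing. */
static size_t eager_chars_copied(size_t long_len, size_t sub_len)
{
    assert(sub_len <= long_len);
    (void)long_len;
    return sub_len;
}

/* Copy on write: the substring operation copies nothing, but the
   later write finds the buffer shared and copies the long string. */
static size_t cow_chars_copied(size_t long_len, size_t sub_len)
{
    assert(sub_len <= long_len);
    (void)sub_len;
    return long_len;
}
```

For a long string of 1000000 characters and a substring of 10, the eager scheme copies 10 characters in total, while the copy-on-write scheme copies 1000000 at the moment of the first write.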
Some general points about changing guile's string and symbol types:
There is currently no clean separation between strings, symbols and all
kinds of vectors. All of them use the same macros to access their cell
elements, for example SCM_CHARS. This is unfortunate, because it makes
changing the implementation for only some of those types difficult. I
have already added SCM_STRING_CHARS and SCM_SYMBOL_CHARS and started to
change some of the calls to SCM_CHARS accordingly, but it is not a trivial
task, because in some places it is hard to determine which actual type is
to be expected - sometimes it can even be more than one type.
The same would have to be done for other macros as well: for example,
SCM_LENGTH should be split into different macros for the different
types. Further, the corresponding setters like SCM_SETCHARS also have to be
considered.
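The kind of separation meant here can be illustrated outside of guile. The following sketch uses a made-up tagged cell (demo_cell) and DEMO_* macros; none of these are guile's definitions, and the assertion-based type check is an assumption made for the illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical cell layout shared by two types. */
enum demo_tag { DEMO_STRING, DEMO_SYMBOL };

typedef struct {
    enum demo_tag tag;
    size_t length;
    char *chars;
} demo_cell;

/* Old style: one accessor for every type. Nothing in a call says
   which type is meant, so the layouts cannot diverge. */
#define DEMO_CHARS(x) ((x)->chars)

/* New style: one accessor per type. Each checks the tag, so a call
   that passes the wrong type is caught, and each type is free to
   change its representation independently later. */
static char *demo_string_chars(const demo_cell *x)
{
    assert(x->tag == DEMO_STRING);
    return x->chars;
}

static size_t demo_string_length(const demo_cell *x)
{
    assert(x->tag == DEMO_STRING);
    return x->length;
}

static char *demo_symbol_chars(const demo_cell *x)
{
    assert(x->tag == DEMO_SYMBOL);
    return x->chars;
}

/* The setter gets a type-specific variant as well. */
static void demo_string_set_chars(demo_cell *x, char *chars, size_t length)
{
    assert(x->tag == DEMO_STRING);
    x->chars = chars;
    x->length = length;
}

#define DEMO_STRING_CHARS(x)            demo_string_chars(x)
#define DEMO_STRING_LENGTH(x)           demo_string_length(x)
#define DEMO_SYMBOL_CHARS(x)            demo_symbol_chars(x)
#define DEMO_SET_STRING_CHARS(x, c, n)  demo_string_set_chars((x), (c), (n))
```

Call sites that can legitimately receive more than one type remain the hard part: they need an explicit dispatch on the type before choosing an accessor.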
Best regards
Dirk
newstrings.tgz
Description: GNU Unix tar archive