guile-devel

[Implementation:] copy on write strings.


From: Dirk Herrmann
Subject: [Implementation:] copy on write strings.
Date: Fri, 22 Sep 2000 22:05:59 +0200 (MEST)

Hi!

I enclose a sample implementation of copy-on-write strings for guile.  The
files char-field.[ch] implement a type `char-field' for character arrays,
which can be shared among strings and symbols, and maybe also among
other types.  The files newstrings.[ch] implement a string type
`newstring', which makes use of `char-fields'.  The idea is that
different strings can share the same `char-field', and that a copy is
made only when a string is about to be written to while it still shares
its `char-field' with other objects.
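
To make the idea concrete, here is a minimal sketch of the scheme in plain
C.  It is not the code from the attached archive:  the names char_field,
cow_string and all cow_* functions are made up for illustration, and the
reference counting stands in for whatever guile's garbage collector would
do in a real implementation.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical shared character buffer, standing in for a `char-field'. */
typedef struct {
    size_t refs;    /* number of strings currently sharing this buffer */
    size_t len;     /* number of characters stored */
    char *chars;
} char_field;

/* Hypothetical string object: a window (offset, len) into a char_field. */
typedef struct {
    char_field *field;
    size_t offset;
    size_t len;
} cow_string;

/* Allocate a fresh buffer holding a private copy of len characters. */
static char_field *field_new(const char *src, size_t len)
{
    char_field *f = malloc(sizeof *f);
    if (f == NULL) abort();
    f->chars = malloc(len ? len : 1);
    if (f->chars == NULL) abort();
    memcpy(f->chars, src, len);
    f->refs = 1;
    f->len = len;
    return f;
}

/* Drop one reference; free the buffer when nobody uses it any more. */
static void field_unref(char_field *f)
{
    if (--f->refs == 0) {
        free(f->chars);
        free(f);
    }
}

cow_string cow_make(const char *src)
{
    size_t len = strlen(src);
    cow_string s = { field_new(src, len), 0, len };
    return s;
}

/* Taking a substring shares the buffer; no characters are copied. */
cow_string cow_substring(cow_string s, size_t start, size_t end)
{
    assert(start <= end && end <= s.len);
    cow_string sub = { s.field, s.offset + start, end - start };
    s.field->refs++;
    return sub;
}

char cow_ref(cow_string s, size_t i)
{
    assert(i < s.len);
    return s.field->chars[s.offset + i];
}

/* Writing copies first, but only if the buffer is shared. */
void cow_set(cow_string *s, size_t i, char c)
{
    assert(i < s->len);
    if (s->field->refs > 1) {
        /* Copy just this string's window into a private buffer. */
        char_field *priv = field_new(s->field->chars + s->offset, s->len);
        field_unref(s->field);
        s->field = priv;
        s->offset = 0;
    }
    s->field->chars[s->offset + i] = c;
}

void cow_free(cow_string *s)
{
    field_unref(s->field);
    s->field = NULL;
}
```

In this sketch, cow_substring only bumps a counter, and cow_set pays for a
copy the first time a shared string is written.  After that copy the
string owns its buffer, so further writes are done in place.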

The type `newstring' is not fully integrated into guile:  You cannot use a
newstring in places where guile currently expects its old string type.
Thus, this implementation is for experimental purposes.  It may form a
basis for a usable implementation, though.

Additional files hold a set of test cases and a set of benchmarks.  In
general, the implementation performs quite well.  It is obviously much
faster for operations where sharing can be exploited.  However, it is
possible to get really bad performance in certain situations:  If
you have a long string, create a short substring from that string and then
modify the long string, the long string will be copied
because it shares the `char-field' with its substring.  With guile's current
string implementation, the substring operation itself would already have
made a copy of the short substring, so modifying the long string
afterwards would not require any further copying.
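
The bad case can be put into numbers with a small back-of-the-envelope
cost model.  This is only a sketch:  the function names and the first_write
enum are made up, and the model assumes that a writer copies exactly its
own window of the shared buffer, which may differ from what the attached
code does.

```c
#include <assert.h>
#include <stddef.h>

/* Which of the two strings is written to first after the substring
   has been taken (hypothetical helper type for this model). */
typedef enum { WRITE_NONE, WRITE_PARENT, WRITE_SUBSTRING } first_write;

/* Eager copying: the substring operation copies sub_len characters
   up front; a later write to either string copies nothing more. */
size_t eager_bytes_copied(size_t parent_len, size_t sub_len, first_write w)
{
    assert(sub_len <= parent_len);
    (void)w;
    return sub_len;
}

/* Copy on write: the substring operation copies nothing; the first
   write to a string that still shares the buffer copies that string. */
size_t cow_bytes_copied(size_t parent_len, size_t sub_len, first_write w)
{
    assert(sub_len <= parent_len);
    switch (w) {
    case WRITE_PARENT:    return parent_len;  /* the expensive case */
    case WRITE_SUBSTRING: return sub_len;
    default:              return 0;           /* never written: free */
    }
}
```

For a parent of 1000000 characters and a substring of 10, the eager
strategy always copies 10 characters, while copy on write copies 0, 10 or
1000000 depending on which string is written first.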


Some general points about changing guile's string and symbol types:
There is currently no clean separation between strings, symbols and all
kinds of vectors.  All of them use the same macros to access their cell
elements, for example SCM_CHARS.  This is unfortunate, because it makes
changing the implementation for only some of those types difficult.  I
have already added SCM_STRING_CHARS and SCM_SYMBOL_CHARS and started to
change some of the calls to SCM_CHARS accordingly, but it is not a trivial
task, because in some places it is hard to determine which actual type is
to be expected - sometimes it can even be more than one type.

The same would have to be done for other macros as well:  for example,
SCM_LENGTH should be split into different macros for different
types.  Further, the corresponding setters like SCM_SETCHARS also have to
be considered.
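
The kind of separation meant here can be sketched with a toy cell type.
Nothing below is guile's actual definition:  toy_cell and the TOY_* macros
are hypothetical stand-ins for SCM cells and the SCM_* macros, and the
assert-based type check is just one possible way to make a type-specific
accessor stricter than the generic one.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for a tagged heap cell (not guile's representation). */
typedef enum { TOY_STRING, TOY_SYMBOL, TOY_VECTOR } toy_tag;

typedef struct {
    toy_tag tag;
    size_t length;
    char *chars;
} toy_cell;

/* Generic accessor: works on any cell, so a call site does not say
   which type it expects. */
#define TOY_CHARS(x) ((x)->chars)

/* Type-specific accessors: each call site names the type it expects,
   so one type's representation can change without touching the others. */
#define TOY_STRING_CHARS(x) \
    (assert((x)->tag == TOY_STRING), (x)->chars)
#define TOY_SYMBOL_CHARS(x) \
    (assert((x)->tag == TOY_SYMBOL), (x)->chars)
#define TOY_STRING_LENGTH(x) \
    (assert((x)->tag == TOY_STRING), (x)->length)

/* Type-specific setter, the counterpart of a generic setter. */
#define TOY_STRING_SET_CHARS(x, p, n) \
    (assert((x)->tag == TOY_STRING), \
     (x)->chars = (p), (x)->length = (n))
```

With macros like these, a later change to the string representation only
requires touching the TOY_STRING_* definitions.  The hard part described
above remains:  every generic call site must first be assigned to exactly
one type, or handled separately where several types are possible.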


Best regards
Dirk

Attachment: newstrings.tgz
Description: GNU Unix tar archive

