Re: string-map arg order

From: Alex Shinn
Subject: Re: string-map arg order
Date: 03 Sep 2001 16:56:12 -0400
User-agent: Gnus/5.0808 (Gnus v5.8.8) Emacs/21.0.104

>>>>> "Dirk" == Dirk Herrmann <address@hidden> writes:

    Dirk> However, I fully agree with you about the problems of
    Dirk> multiple-width encodings: In case of multiple threads, every
    Dirk> access to one of a string's characters needs to recompute
    Dirk> the memory location of that character, because some other
    Dirk> thread might have changed the string and even replaced some
    Dirk> characters of different encoding widths.

That's funny, I've reversed my initial opinions of variable-width
encodings, and in fact just a few minutes ago got a utf-8 based
version of Guile to pass the test-suite :)

After taking Thi's advice and looking through the archives, I found a
discussion from last November.  This referenced a proposal Jim Blandy
made in August 1999 (no archives on this), still available in
doc/mbapi.texi.  The proposal was utf-8 based, and there ensued a lot
of arguing about the efficiency of variable-width encodings, but no
conclusions, and eventually the thread died out.  But the proposal
seemed more than reasonable, and after thinking about it, it also
seems to be the only realistic option if we want to stay compatible
with existing Guile projects/modules and continue to provide easy
integration with C libraries.

So with some cut and paste and minor embellishments, Jim's mbapi.texi
became mbapi.[ch], and a few primitives like string-{length,ref,set!}
were modified to be multi-byte aware.  Of course, there's lots more
that needs to be done (port character handling, shared-substrings,
case conversions, SRFI-13 and SRFI-14, regexps, etc.), and probably
the only reason it passed the tests is that they're already 7-bit
clean and my utf-8 tests are not very extensive, but it's a start.
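
To make the cost of multi-byte awareness concrete, here is a rough
sketch of what length and ref look like over a utf-8 buffer.  The
names u8_seq_len, u8_length, u8_offset and u8_ref are made up for
illustration and are not the ones in mbapi.[ch]; the code also assumes
well-formed input and does no validation.

```c
#include <stddef.h>

/* Hypothetical helpers for illustration; NOT the mbapi.[ch] API.
   All of them assume the buffer holds well-formed UTF-8. */

/* Number of bytes in the sequence that starts with lead byte b. */
size_t u8_seq_len(unsigned char b)
{
    if (b < 0x80) return 1;            /* 0xxxxxxx: 7-bit ASCII */
    if ((b & 0xE0) == 0xC0) return 2;  /* 110xxxxx */
    if ((b & 0xF0) == 0xE0) return 3;  /* 1110xxxx */
    if ((b & 0xF8) == 0xF0) return 4;  /* 11110xxx */
    return 1;                          /* unexpected byte: step over it */
}

/* Character count of a buffer of nbytes bytes.  O(n) in the bytes. */
size_t u8_length(const unsigned char *s, size_t nbytes)
{
    size_t i = 0, n = 0;
    while (i < nbytes) {
        i += u8_seq_len(s[i]);
        n++;
    }
    return n;
}

/* Byte offset of character index k, or nbytes if k is out of range.
   This scan is the "recompute the memory location" cost. */
size_t u8_offset(const unsigned char *s, size_t nbytes, size_t k)
{
    size_t i = 0;
    while (k > 0 && i < nbytes) {
        i += u8_seq_len(s[i]);
        k--;
    }
    return i < nbytes ? i : nbytes;
}

/* Code point at character index k, or -1 if k is out of range. */
long u8_ref(const unsigned char *s, size_t nbytes, size_t k)
{
    size_t i = u8_offset(s, nbytes, k);
    size_t len, j;
    long cp;

    if (i >= nbytes) return -1;
    len = u8_seq_len(s[i]);
    if (i + len > nbytes) return -1;       /* truncated sequence */
    if (len == 1) return s[i];

    cp = s[i] & (0xFF >> (len + 1));       /* payload bits of lead byte */
    for (j = 1; j < len; j++)
        cp = (cp << 6) | (s[i + j] & 0x3F); /* 6 bits per trailing byte */
    return cp;
}
```

For a 7-bit clean string every sequence is one byte long, so these
give the same answers as plain byte indexing, which fits with the
existing tests still passing.  The difference is that ref becomes a
linear scan instead of a constant-time lookup.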

Should I polish what I have now and document/rationalize it, or should
I keep going until I have a mostly complete implementation (everything
but regexps should be straightforward)?

Alex Shinn <address@hidden>
