Re: about strings, symbols and chars.

From: Dirk Herrmann
Subject: Re: about strings, symbols and chars.
Date: Thu, 21 Dec 2000 17:55:57 +0100 (MET)

On 21 Dec 2000, Jim Blandy wrote:

> I think the source of the disconnect is this: you're assuming that the
> fundamental operation on strings is indexing by character number, but
> it's not.  Strings are overwhelmingly more often scanned sequentially.
> For that kind of access, variable-width representations are not a
> serious problem.  However, in order to take advantage of the access
> pattern, you have to move some of the logic out of the accessor and
> into the surrounding loop.  My argument is that, given the other
> primitives described in mbapi.texi, this is little work, and is
> efficient.

Well, I don't want to be annoying here, but honestly, I really don't get
it:  For which kinds of operations could you move conditionals out of the
inner loop?  (The conditionals we are talking about are those that
determine the character width on every single character access.)  Take,
for example, a string< operation:  Two strings are compared by scanning
them sequentially.  Both characters to be compared have to be extracted
(in order to determine their relative ordering according to a certain
locale).  This is done in the inner loop.  How could the code that
checks the encoding length be extracted from the inner loop?
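
To make the question concrete, here is a minimal sketch of such a
comparison loop in C.  It assumes UTF-8 as the variable-width encoding,
assumes well-formed input, and compares by code point rather than by
locale.  The names char_width, decode_char and string_less are made up
for illustration; they are not the primitives from mbapi.texi.

```c
#include <stddef.h>

/* Hypothetical helper: width in bytes of the UTF-8 character that
   starts with lead byte b.  Assumes well-formed UTF-8. */
static size_t char_width(unsigned char b)
{
    if (b < 0x80) return 1;
    if ((b & 0xE0) == 0xC0) return 2;
    if ((b & 0xF0) == 0xE0) return 3;
    return 4;
}

/* Hypothetical helper: decode the character at p and report its width. */
static unsigned long decode_char(const unsigned char *p, size_t *width)
{
    size_t w = char_width(p[0]);
    unsigned long c;
    size_t i;

    if (w == 1)
        c = p[0];
    else
        c = p[0] & (0xFF >> (w + 1));   /* payload bits of the lead byte */
    for (i = 1; i < w; i++)
        c = (c << 6) | (p[i] & 0x3F);   /* 6 payload bits per trailing byte */
    *width = w;
    return c;
}

/* Return 1 if a sorts before b by code point, 0 otherwise.
   Simplification: a real string< would order by locale. */
static int string_less(const unsigned char *a, size_t alen,
                       const unsigned char *b, size_t blen)
{
    size_t i = 0, j = 0;

    while (i < alen && j < blen) {
        size_t wa, wb;
        /* The width conditionals run here, once per character and string. */
        unsigned long ca = decode_char(a + i, &wa);
        unsigned long cb = decode_char(b + j, &wb);

        if (ca != cb)
            return ca < cb;
        i += wa;
        j += wb;
    }
    /* Equal up to the shorter length: a is less only if it ended first. */
    return i >= alen && j < blen;
}
```

In this sketch the width test sits inside decode_char, which is called
on every iteration, and that is exactly the conditional I am asking
about.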

Another example, the substring operation.  Maybe I fail to see some
fundamental point here, but IMO you first need to scan the string to find
the byte position of the first character you want to include in the
substring.  Thus, you need to skip all previous characters, each one with
its corresponding length.  How can the conditionals that determine the
width of each character be moved out of the inner loop?  The same holds
for finding the byte position of the last character to include.
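
The scan described above could look like the following sketch, again
assuming well-formed UTF-8.  The names lead_byte_width and byte_offset
are hypothetical and only serve the illustration; the cached indexing
functions of mbapi.texi are not used here.

```c
#include <stddef.h>

/* Hypothetical helper: width in bytes of the UTF-8 character that
   starts with lead byte b.  Assumes well-formed UTF-8. */
static size_t lead_byte_width(unsigned char b)
{
    if (b < 0x80) return 1;
    if ((b & 0xE0) == 0xC0) return 2;
    if ((b & 0xF0) == 0xE0) return 3;
    return 4;
}

/* Return the byte offset of the character with index char_index,
   or len if the string has fewer characters than that. */
static size_t byte_offset(const unsigned char *s, size_t len,
                          size_t char_index)
{
    size_t pos = 0;

    while (char_index > 0 && pos < len) {
        /* One width decision per skipped character. */
        pos += lead_byte_width(s[pos]);
        char_index--;
    }
    return pos < len ? pos : len;
}
```

A substring from character index start to end would then copy the bytes
from byte_offset(s, len, start) up to byte_offset(s, len, end), so the
width test runs once for every character before the end of the
substring.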

I'd really like to understand the point.  Could you give an example of a
situation where you could move the conditionals out of the inner loops?

> In cases where random access is necessary, mbapi.texi provides
> primitives (like the cached indexing functions) that should do pretty
> well.

This is true as long as you make sure that your indexes remain valid,
which can be a problem with multiple threads.

Best regards,
Dirk Herrmann
