
Re: Using libunistring for string comparisons et al

From: Mark H Weaver
Subject: Re: Using libunistring for string comparisons et al
Date: Tue, 15 Mar 2011 13:20:54 -0400
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/23.3 (gnu/linux)

Mike Gran <address@hidden> writes:
> We do, in a manner of speaking, have a single string representation:
> UTF-32.  The 'narrow' encoding is UTF-32 with the initial 3 bytes of
> zero removed.

Despite the similarity of these two representations, they are
sufficiently different that they cannot be handled by the same machine
code.  That means you must either implement multiple inner loops, one
for each combination of string parameter representations, or else you
must dispatch on the string representation within the inner loop.  On
modern architectures, wrongly predicted conditional branches are very costly.
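
To make the two options concrete, here is a minimal sketch in C of an equality test written both ways. The `str` struct, its field names, and every function name below are hypothetical and are not Guile's actual string layout or API. The sketch only assumes that a narrow string stores one byte per character and a wide string stores one 32-bit code point per character.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical string type for illustration; NOT Guile's real layout. */
typedef struct {
    int narrow;         /* nonzero: 1 byte per char; zero: 4 bytes per char */
    size_t len;         /* length in characters */
    const void *chars;  /* uint8_t[len] if narrow, else uint32_t[len] */
} str;

/* Option A: dispatch on the representation inside the inner loop.
   Every character fetch tests the `narrow` flag. */
static uint32_t str_ref(const str *s, size_t i)
{
    return s->narrow ? ((const uint8_t *)s->chars)[i]
                     : ((const uint32_t *)s->chars)[i];
}

int str_equal_inner_dispatch(const str *a, const str *b)
{
    if (a->len != b->len)
        return 0;
    for (size_t i = 0; i < a->len; i++)
        if (str_ref(a, i) != str_ref(b, i))
            return 0;
    return 1;
}

/* Option B: one inner loop per combination of representations.
   The macro stamps out four loops that differ only in element types. */
#define DEFINE_EQ(name, TA, TB)                                   \
    static int name(const void *pa, const void *pb, size_t n)     \
    {                                                             \
        const TA *a = pa;                                         \
        const TB *b = pb;                                         \
        for (size_t i = 0; i < n; i++)                            \
            if ((uint32_t)a[i] != (uint32_t)b[i])                 \
                return 0;                                         \
        return 1;                                                 \
    }

DEFINE_EQ(eq_narrow_narrow, uint8_t, uint8_t)
DEFINE_EQ(eq_narrow_wide, uint8_t, uint32_t)
DEFINE_EQ(eq_wide_narrow, uint32_t, uint8_t)
DEFINE_EQ(eq_wide_wide, uint32_t, uint32_t)

/* The representation is tested once, outside the loops. */
int str_equal_outer_dispatch(const str *a, const str *b)
{
    if (a->len != b->len)
        return 0;
    if (a->narrow)
        return b->narrow ? eq_narrow_narrow(a->chars, b->chars, a->len)
                         : eq_narrow_wide(a->chars, b->chars, a->len);
    return b->narrow ? eq_wide_narrow(a->chars, b->chars, a->len)
                     : eq_wide_wide(a->chars, b->chars, a->len);
}
```

Option A keeps the source small but puts a conditional in the hot path. Option B removes it at the price of four loops for every two-string operation. An optimizing compiler may hoist the test out of the loop in a case this simple, so the sketch shows the shape of the trade-off rather than a measured difference.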

> I actually at one point had a nearly complete version of Guile 1.8
> that used UTF-8 and another that used UTF-32.  There are some
> other reasons why UTF-8 is bad, which I could bore you with
> ad nauseam.

Can you please tell me why UTF-8 is bad, or point me to something that
explains it?  Everything I have found suggests that UTF-8 is very good.

