Re: Hash computation and TFB
From: Richard Frith-Macdonald
Subject: Re: Hash computation and TFB
Date: Tue, 6 Aug 2013 14:39:41 +0100
On 6 Aug 2013, at 14:30, Stefan Bidi <address@hidden> wrote:
> I copied the hash algorithm straight out of -base, so they should match. I
> remember a few months ago Richard was playing around with hash functions and
> this might be causing some issues, now.
It wouldn't on a normal setup ... the experimental hash code is used only if
you explicitly build it.
> I just looked it up, the changes were made on rev 36344.
>
> There is another issue... -base allows UTF-8 strings, which will not be
> hashed to the same UTF-16 value.
They are hashed to the same value as other strings; in base, hashing is
computed on unicode codepoints.
> In my opinion, allowing UTF-8 string literals is not a good idea and base
> should revert back to Latin1 as the default C string encoding.
gnustep-base still uses latin1 as the default C string encoding. The change
with string literals is one from ascii to utf-8.
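To make the difference between the two encodings concrete, here is a small Python sketch. It is an illustration only, not gnustep-base code, and the byte values and variable names are chosen purely for the example. It reads the same two bytes first as utf-8 and then as latin1, and checks that pure ascii bytes decode identically under both.

```python
# Illustration only: the same bytes read under the two encodings
# discussed above. Nothing here is gnustep-base API.
raw = b"\xc3\xa9"

as_utf8 = raw.decode("utf-8")      # one character: U+00E9
as_latin1 = raw.decode("latin-1")  # two characters: U+00C3, U+00A9

# Pure ascii bytes are unaffected by the choice between the two.
ascii_raw = b"plain"
same = ascii_raw.decode("utf-8") == ascii_raw.decode("latin-1")

print(len(as_utf8), len(as_latin1), same)
```

Only bytes outside the ascii range are interpreted differently, which is why the choice of encoding matters for non-ascii literals and not for ascii ones.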
> I'm actually debating adding a UTF-16 string literals configure option for
> corebase. I believe using UTF-16 internally is the only sane solution to
> non-ASCII encodings.
>
> I've tried experimenting with other hash functions that are not
> one-at-a-time, but unfortunately have not found anything that will work on
> both ASCII and Unicode strings consistently. It would be really nice to be
> able to work with 32- or 64-bit integers directly instead of 8- or 16-bit
> characters. If we could use UTF-16 across the board, this wouldn't be a problem.
base uses the 16-bit codepoints to compute string hashes ... which is of course
fine for ascii and utf-16 since ascii is a true subset of unicode and each
ascii character therefore has exactly the same value as the corresponding
utf-16 character.
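
As a rough illustration of hashing on 16-bit units, the Python sketch below decodes bytes to text, splits the text into UTF-16 units, and feeds them to a Jenkins one-at-a-time hash. It is not the gnustep-base implementation: the function names (utf16_units, oaat_hash, hash_from_bytes) are hypothetical, and mixing each 16-bit unit in as a single step is an assumption made for the example.

```python
def utf16_units(text):
    """Return the UTF-16 code units of text as a list of ints."""
    data = text.encode("utf-16-le")  # little-endian, no byte-order mark
    return [data[i] | (data[i + 1] << 8) for i in range(0, len(data), 2)]


def oaat_hash(units):
    """Jenkins one-at-a-time hash over integer units, kept to 32 bits.

    Assumption: each 16-bit unit is mixed in as one step, rather than
    being split into two bytes first.
    """
    h = 0
    for u in units:
        h = (h + u) & 0xFFFFFFFF
        h = (h + (h << 10)) & 0xFFFFFFFF
        h ^= h >> 6
    h = (h + (h << 3)) & 0xFFFFFFFF
    h ^= h >> 11
    h = (h + (h << 15)) & 0xFFFFFFFF
    return h


def hash_from_bytes(raw, encoding):
    """Decode raw bytes with the given encoding, then hash the 16-bit units."""
    return oaat_hash(utf16_units(raw.decode(encoding)))


# The same text stored as utf-8 and as latin1 has different bytes ...
utf8_raw = "h\u00e9llo".encode("utf-8")
latin1_raw = "h\u00e9llo".encode("latin-1")

# ... but identical 16-bit units after decoding, hence identical hashes.
print(hash_from_bytes(utf8_raw, "utf-8") == hash_from_bytes(latin1_raw, "latin-1"))

# For pure ascii, the bytes and the 16-bit units have the same values.
print(utf16_units("hello") == list(b"hello"))
```

Because the hash only ever sees the decoded 16-bit units, the encoding in which a string happened to be stored does not affect the result, provided the bytes were decoded with the right encoding.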