Re: Case mapping of sharp s

From: Kenichi Handa
Subject: Re: Case mapping of sharp s
Date: Tue, 17 Nov 2009 16:36:12 +0900

In article <address@hidden>, Ulrich Mueller <address@hidden> writes:

> So do I understand this right: In order to perform a Boyer-Moore
> search, the characters have to be either both ASCII, or must be in the
> same group of 64 adjacent characters (because the last byte in UTF-8
> encodes 6 bits)?


> Is that the reason why also ÿ and Ÿ (U+00FF and U+0178, small/capital
> y with diaeresis) don't form a case pair?
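
For illustration, the byte-level condition described in the quoted questions can be checked directly. The following is only a minimal Python sketch, assuming the premise as quoted (that two non-ASCII characters must share every UTF-8 byte except the last); the helper name `utf8_differs_only_in_last_byte` is hypothetical and is not part of Emacs.

```python
def utf8_differs_only_in_last_byte(a: str, b: str) -> bool:
    """Return True if a and b have UTF-8 encodings of equal length
    whose bytes are identical except possibly for the final byte."""
    ea, eb = a.encode("utf-8"), b.encode("utf-8")
    return len(ea) == len(eb) and ea[:-1] == eb[:-1]


# Candidate case pairs (lower, upper), written as escapes for portability.
pairs = [
    ("a", "A"),            # ASCII pair
    ("\u00e4", "\u00c4"),  # a with diaeresis, small/capital
    ("\u00ff", "\u0178"),  # y with diaeresis, small/capital
    ("\u00df", "\u1e9e"),  # sharp s, small/capital
]

for lower, upper in pairs:
    print(
        f"U+{ord(lower):04X} [{lower.encode('utf-8').hex(' ')}]  "
        f"U+{ord(upper):04X} [{upper.encode('utf-8').hex(' ')}]  "
        f"same leading bytes: {utf8_differs_only_in_last_byte(lower, upper)}"
    )
```

Under that premise, the first two pairs satisfy the condition. U+00FF and U+0178 have different lead bytes, and U+00DF and U+1E9E differ even in encoded length (two bytes versus three), so neither of the last two pairs does.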


> > So, if you are sure that searching of ß is very rare (I have
> > no idea), please install it.

> Usage of (lower case) ß is very common in a German language context,
> so I'd guess that searching for it is not so rare.

> On the other hand, capital ẞ is not used in regular German orthography
> (that's probably the reason why the character was added to Unicode
> only in 2008). So if the change would cause large tradeoffs in search
> speed, then I think it's not worthwhile.

> By what factor is the non-BM search slower, as compared to the BM
> search?

I don't know exactly.  It depends on the length of the search
string; the longer the string is, the faster BM search is
compared with simple search.  At least, when this code was active,
  ;; (set-downcase-syntax  ?İ ?i tbl)
  ;; (set-upcase-syntax    ?I ?ı tbl)
there were complaints about the slowdown.
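
To illustrate why the gap grows with the length of the search string, here is a rough Python sketch that counts character comparisons for a simple left-to-right search and for Boyer-Moore-Horspool, a simplified BM variant that uses only the bad-character shift. It is not the search code in Emacs, and the function names and test text are assumptions made up for this example. The text is a deliberately favourable case for BM (a long run of a character that does not occur in the pattern).

```python
def naive_search(text, pat):
    """Brute-force search. Return (index, comparisons); index is -1 if absent."""
    n, m = len(text), len(pat)
    comparisons = 0
    for i in range(n - m + 1):
        j = 0
        while j < m:
            comparisons += 1
            if text[i + j] != pat[j]:
                break
            j += 1
        if j == m:
            return i, comparisons
    return -1, comparisons


def horspool_search(text, pat):
    """Boyer-Moore-Horspool search. Return (index, comparisons)."""
    n, m = len(text), len(pat)
    # For each pattern character except the last, the distance from its
    # rightmost occurrence to the end of the pattern.
    shift = {c: m - 1 - k for k, c in enumerate(pat[:-1])}
    comparisons = 0
    i = 0
    while i <= n - m:
        j = m - 1
        while j >= 0:  # compare right to left
            comparisons += 1
            if text[i + j] != pat[j]:
                break
            j -= 1
        if j < 0:
            return i, comparisons
        # Shift by the table entry for the text character aligned with the
        # pattern's last position; characters not in the table skip m.
        i += shift.get(text[i + m - 1], m)
    return -1, comparisons


for m in (2, 4, 8):
    pattern = "abcdefgh"[:m]
    text = "x" * 1000 + pattern
    print(m, naive_search(text, pattern), horspool_search(text, pattern))
```

In this setup the simple search makes about one comparison per text character whatever the pattern length, while the Horspool variant can skip a whole pattern length on each mismatch, so its comparison count falls roughly in proportion to the pattern length. Real text is less favourable, so the actual factor will be smaller.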

Kenichi Handa