Re: Case mapping of sharp s

emacs-devel

From:	David Kastrup
Subject:	Re: Case mapping of sharp s
Date:	Thu, 19 Nov 2009 23:43:19 +0100
User-agent:	Gnus/5.13 (Gnus v5.13) Emacs/23.1.50 (gnu/linux)

Stefan Monnier <address@hidden> writes:

>> Actually I think there is something simply wrong with the simple
>> search, as it's much slower even for single chars (where bm doesn't
>> have any advantage) and additionally in some weird random fashion
>> it's again slower for backwards search, such as 14, 37, 66 ... 94
>> secs, where the bm takes 0.5 secs and simple forward constantly
>> ~3.7 secs, all for isearch'ing one character in a 100Mb file.
>
> I can guess why it's much slower going backward: the simple search
> operates on chars rather than bytes.  The internal encoding we use
> (currently based on utf-8) is designed to be easy to parse going forward
> but not so easy going backward (IIRC our encoding is actually even a bit
> more painful in this case than pure utf-8).

I don't think so.  The utf-8 _scheme_ can be used to encode 21 bits in 4
bytes.  We stay within that range, in the utf-8 4-byte scheme, but go
outside of the Unicode range of 2^20+2^16 code points.
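
The arithmetic behind those numbers can be checked with a small sketch.
It assumes the standard 4-byte utf-8 layout (11110xxx followed by three
10xxxxxx bytes); the variable names are illustrative only and are not
taken from the Emacs sources.

```python
# Payload bits of a 4-byte utf-8-style sequence:
#   11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
LEAD_BITS = 3   # x bits in the lead byte
CONT_BITS = 6   # x bits in each continuation byte
payload_bits = LEAD_BITS + 3 * CONT_BITS

# Number of values the 4-byte scheme can represent.
scheme_limit = 1 << payload_bits

# Number of Unicode code points: U+0000 through U+10FFFF.
unicode_limit = 0x10FFFF + 1

# Values that fit the 4-byte scheme but lie beyond Unicode.
spare = scheme_limit - unicode_limit

print(payload_bits, hex(scheme_limit), hex(unicode_limit), hex(spare))
```

So the scheme leaves room for values above U+10FFFF without needing a
longer sequence, which is the range referred to above.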

> BM on the other hand works on bytes, so there's no such slowdown.

With utf-8, I think that apart from character ranges, searching forward
and backward should work just like on 8-bit characters.  The exception
is incomplete character matches, but since the utf-8 scheme can
immediately tell "is a 7-bit character", "is the first byte of a
multibyte sequence of length n", and "is the last or an intermediate
byte of a multibyte sequence", this is not a serious problem.
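
As a rough illustration of the three byte classes described above, here
is a minimal sketch operating on plain utf-8 bytes.  The helper names
(classify, lead_length, prev_char_start) are hypothetical and this is
not how the Emacs search code is written; it also assumes sequences of
at most 4 bytes.

```python
def classify(b: int) -> str:
    """Classify one byte of a utf-8 stream by its high bits."""
    if b < 0x80:
        return "single"        # 0xxxxxxx: 7-bit character
    if b < 0xC0:
        return "continuation"  # 10xxxxxx: intermediate or last byte
    return "lead"              # 11xxxxxx: first byte of a multibyte sequence


def lead_length(b: int) -> int:
    """Sequence length announced by a lead byte (2 to 4 in this sketch)."""
    if b >= 0xF0:
        return 4
    if b >= 0xE0:
        return 3
    return 2


def prev_char_start(buf: bytes, pos: int) -> int:
    """Return the start offset of the character that ends just before pos."""
    pos -= 1
    # Skip backward over continuation bytes until a lead or 7-bit byte.
    while pos > 0 and classify(buf[pos]) == "continuation":
        pos -= 1
    return pos


text = "aß€ß".encode("utf-8")   # 1 + 2 + 3 + 2 bytes
needle = "ß".encode("utf-8")

# A plain byte-wise backward search for a complete character...
hit = text.rfind(needle)
# ...lands on a lead byte, i.e. on a character boundary.
print(hit, classify(text[hit]), lead_length(text[hit]))

# Walking backward character by character from the end of the buffer.
pos = len(text)
starts = []
while pos > 0:
    pos = prev_char_start(text, pos)
    starts.append(pos)
print(starts)
```

Because a lead byte can never be mistaken for a continuation byte, a
byte-wise match of a complete character cannot begin in the middle of
another character, whichever direction the search runs.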

> But maybe we're doing something silly somewhere.

The Emacs 22 multibyte scheme likely had worse properties for reverse
searching.  So maybe something might be simplified nowadays.

-- 
David Kastrup
