[Top][All Lists]

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: Question about critical_factorization() in the Two-Way algorithm

From: Eric Blake
Subject: Re: Question about critical_factorization() in the Two-Way algorithm
Date: Wed, 15 Dec 2010 16:48:33 -0700
User-agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv: Gecko/20101209 Fedora/3.1.7-0.35.b3pre.fc14 Lightning/1.0b3pre Mnenhy/0.8.3 Thunderbird/3.1.7

On 12/15/2010 03:23 PM, Pádraig Brady wrote:
>>> I spoke too soon.  We would also need to patch m4/memmem.m4 to actually
>>> perform the empty needle verification in the memmem-simple case.
>> I also noticed the empty needle verification didn't
>> check that the correct pointer was returned,
>> and only checked for !NULL.  So I rolled that
>> fix into the attached reorg.
> I've rebased the attached memmem reorg patch
> which splits correctness checks from performance checks.

> Subject: [PATCH] memmem: rearrange memmem and expand memmem-simple modules
> Move all functional checks to memmem-simple so that one has
> a fully functional memmem by using just this module.
> Restrict the memmem module to performance checks only.
> Document exactly how the memmem and memmem-simple modules
> relate to each other.

Thanks for reviving this.

> +Performance problems fixed by Gnulib module @code{memmem}:
> address@hidden
>  @item
>  This function has quadratic instead of linear worst-case complexity on some
>  platforms:
>  glibc 2.8, FreeBSD 6.2, NetBSD 5.0, AIX 5.1, Solaris 11 2010-11, Cygwin 
> 1.5.x.
> +Note for small needles the replacement may be slower.
>  @end itemize

I really wish I had more time to look into this.  Really, it seems like
the naive version for up to 4- or even 8-byte needles ought to knock the
socks off the factored version by cutting the common-case expense of
up-front factorization for the less-common case of fewer comparisons for
certain needles.  We'd need a good benchmark that tests time on:

non-periodic needle, short haystack match
non-periodic needle, long haystack before match
non-periodic needle, long haystack with no match
periodic needle, short haystack match
periodic needle, long haystack before match
periodic needle, long haystack with no match

for varying length needles, and which compares naive, glibc's sse4.2,
and the two-way algorithm timings for all of those situations (or gives
up, in the case of quadratic timing in naive or glibc's sse4.2 on long
worst-case needles)

Nothing like a good benchmark to prove that changing the algorithm
according to needle length really is a good thing, since Uli was
skeptical that his SSE4.2 implementation really is a regression:

Eric Blake   address@hidden    +1-801-349-2682
Libvirt virtualization library http://libvirt.org

Attachment: signature.asc
Description: OpenPGP digital signature

reply via email to

[Prev in Thread] Current Thread [Next in Thread]