[Top][All Lists]
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [Gperf-bugs] Revised gperf 2.7.2 patch
From: |
Bruno Haible |
Subject: |
Re: [Gperf-bugs] Revised gperf 2.7.2 patch |
Date: |
Mon, 18 Nov 2002 13:58:01 +0100 (CET) |
On Tuesday 01 October 2002 19:53, Bruce Lilly wrote:
> In February 2002, I posted some patches to gperf 2.7.2 to address
> a number of issues:
>
> 1. efficient case-independent matching of keywords
> 2. aoutomatic determination of optimal key positions
>
> A revised patch is available at
> http://users.erols.com/blilly/mailparse/gperf.patch
>
> Item #1 above has not changed. Item #2 has been
> substantially improved; the method now used is much
> faster and handles some pathological cases that were
> not handled in the earlier patch.
Thanks for the patch. I'm adding functionality equivalent to #2 to
gperf-2.8. I couldn't use your patch because the previous gperf
sources were already very convoluted, nearly unmaintainable. Had to
clean up the gperf sources first.
Can you explain what the distinction between
2: key position might be useful
3: key position uniquely identifies keywords of that length or longer
brings in practice? I mean, in either case you can have several
positions of the same type, and you don't know a priori which one to
choose. So you need some form of backtracking anyway.
Item #1 has already been rejected earlier.
> Additional issues are also handled:
>
> 3. more efficient code when all keywords have the same length
> 4. more efficient code when the -l option is used
> 5. correct handling of automatic key position determination
> when -n option is used
>
> Details:
>
> Vanilla gperf 2.7.2 generates a test like:
>
> if (len <= MAX_WORD_LENGTH && len >= MIN_WORD_LENGTH)
>
> Clearly that is inefficient when all keywords have the
> same length (MAX_WORD_LENGTH == MIN_WORD_LENGTH). The
> patch generates the following in such cases:
>
> if (len == MIN_WORD_LENGTH)
>
> which requires a single comparison rather than two.
Modern compilers generate a single comparison for this expression
anyway.
> When the -l option is used, vanilla gperf checks the
> length via lengthtable and a string comparison, then
> makes an additional check for a terminating '\0';
> that additional check is unnecessary (because of the
> lengthtable check).
When -l is given, gperf 2.7.2 uses a lengthtable access and then a
string comparison with memcmp, excluding the terminating '\0'.
> The revised automatic optimal key position determining
> code ensures that at least one key position is used if
> the -n option is specified or if all keywords have the
> same length.
That's what one would expect from an automatic key position
determination anyway.
Bruno
- Re: [Gperf-bugs] Revised gperf 2.7.2 patch,
Bruno Haible <=