bug-grep

Re: [RFC PATCH] fall back to glibc matcher if a multibyte match is found


From: Jim Meyering
Subject: Re: [RFC PATCH] fall back to glibc matcher if a multibyte match is found
Date: Sat, 01 May 2010 10:20:14 +0200

Paolo Bonzini wrote:

>>> This patch works around the performance problems that are still in
>>> current grep.  Red Hat will probably be using it in its own 2.6.x.
>>>
>>> For UTF-8 it should trigger only in the presence of MBCSET, e.g. [a-z]
>>> or [à] (and the latter case could be avoided).
>>>
>>> For other character sets all brackets, and `.' as well, will trigger it.
>>
>> Sounds like a good change, but please add a comment.
>> Can you suggest a pathologically bad example
>> with which we can try to come up with a performance-measuring
>> addition to the test suite?
>
> If I read the matcher code correctly, it is still an NFA, so it's
> O(nodes * input-length).  So it's difficult to find a pathological
> case, even though the slowdown is over 200x.

200x is good enough for me.  That should be
easy to measure reliably enough for a regression test comparison.
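
For illustration only, here is a minimal sketch of how such a timing
comparison might look. It is not taken from the grep test suite: the
helper names, the 10x threshold, the input file name and the locale
names are all assumptions made for the example.

```python
import os
import subprocess
import time


def best_time(fn, repeats=3):
    """Return the fastest wall-clock time (seconds) of `repeats` calls to fn()."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        best = min(best, time.perf_counter() - start)
    return best


def grep_runner(pattern, path, locale):
    """Hypothetical helper: build a callable that runs grep once under `locale`."""
    env = dict(os.environ, LC_ALL=locale)

    def run():
        # Exit status 1 (no match) is acceptable; only elapsed time matters.
        subprocess.run(["grep", "-c", pattern, path], env=env,
                       stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)

    return run


def is_regression(baseline, candidate, max_ratio=10.0):
    """True if `candidate` took more than `max_ratio` times as long as `baseline`.

    The default threshold is an assumed value, chosen well below the
    reported 200x slowdown so that timing noise does not matter.
    """
    return candidate > max_ratio * baseline


# Possible usage (file name and locale names are assumptions):
#   c_time = best_time(grep_runner("[a-z]", "input.txt", "C"))
#   u_time = best_time(grep_runner("[a-z]", "input.txt", "en_US.UTF-8"))
#   print(is_regression(c_time, u_time))
```

Taking the best of several runs and comparing a ratio rather than an
absolute time should keep such a check stable across machines of
different speeds.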



