bug-grep

Re: [PATCH] fall back to glibc matcher if a MBCSET is found


From: Jim Meyering
Subject: Re: [PATCH] fall back to glibc matcher if a MBCSET is found
Date: Sun, 12 Sep 2010 14:55:32 +0200

Paolo Bonzini wrote:
> On Sun, Sep 12, 2010 at 10:46, Jim Meyering <address@hidden> wrote:
>> That patch induces a performance *decrease* on at least one system.
>>
>> Built using --without-included-regex
>> Run on an idle i920 @ 2.67GHz, kernel 2.6.18-194.11.3.el5PAE, i686:
>>
>>  yes 1234567890123456789012345678901234567890123456789012567890 \
>>    | sed 100000q > in
>>  for i in $(seq 10); do env time --f=%E env LC_ALL=fr_FR.UTF8 \
>>    ./grep '[a-z]' in;done
>
> With a cs_CZ.UTF-8 locale, however, the unpatched grep would likely
> fail to match, for example, ^[a-z]$ against "ch", which is also a
> correctness issue.

Sounds like you're suggesting to expand the test case ;-)
That'd be nice, but it is not necessary.
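
The cs_CZ.UTF-8 case could be tried with something like the sketch
below. The helper name check_ch_match is made up for illustration, and
the outcome depends on the grep build and the installed glibc, so no
expected result is shown.

```shell
# Hypothetical helper: report whether the grep binary named in "$1"
# matches the line "ch" against ^[a-z]$ in the cs_CZ.UTF-8 locale.
check_ch_match() {
  grep_bin=${1:-grep}
  # Skip when the locale is not installed; the result would be meaningless.
  if ! locale -a 2>/dev/null | grep -Eiq '^cs_CZ\.utf-?8$'; then
    echo skipped
    return 0
  fi
  if printf 'ch\n' | LC_ALL=cs_CZ.UTF-8 "$grep_bin" -q '^[a-z]$'; then
    echo match
  else
    echo 'no match'
  fi
}
```

Running "check_ch_match ./grep" before and after applying the patch
would show whether the behavior changes on a given system.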

> Regarding the failing test-case, the lack of equivalence class support
> is in glibc.  Should I commit the patch without the testcase?

??  Keep the test, of course.
It's alerting us to a legitimate problem.
We don't want to sweep that under the rug.

This is a strong argument against using --without-included-regex,
unless you know you're using a new-enough glibc.

Your NEWS addition recommends --without-included-regex unconditionally.

Can you adjust NEWS to mention that caveat (preferably
with a minimum-working glibc version number) and detect
the losing glibc in the test, then use die_ to announce
that an old and buggy version of glibc is the cause?
At worst, just add a comment in the test script.
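
A rough sketch of such a check follows. It assumes the running glibc
can be queried with "getconf GNU_LIBC_VERSION" and that versions have
the form major.minor. The die_ defined here is only a stand-in for the
test suite's helper of the same name, version_lt and check_glibc are
made-up names, and the minimum version is left as a parameter because
it is not known here.

```shell
# Stand-in for the test framework's die_, so the sketch is self-contained.
die_() { echo "$0: $*" >&2; exit 1; }

# version_lt A B: succeed if version A is older than version B.
# Only the major and minor components are compared.
version_lt() {
  a_major=${1%%.*}; a_rest=${1#*.}; a_minor=${a_rest%%.*}
  b_major=${2%%.*}; b_rest=${2#*.}; b_minor=${b_rest%%.*}
  [ "$a_major" -lt "$b_major" ] ||
    { [ "$a_major" -eq "$b_major" ] && [ "$a_minor" -lt "$b_minor" ]; }
}

# check_glibc MIN: die if the running glibc is older than MIN.
# MIN is supplied by the caller; no particular value is implied.
check_glibc() {
  min=$1
  # getconf prints e.g. "glibc 2.12"; if it fails, this is not glibc.
  v=$(getconf GNU_LIBC_VERSION 2>/dev/null) || return 0
  v=${v#glibc }
  if version_lt "$v" "$min"; then
    die_ "glibc $v is older than $min: old and buggy glibc is the cause"
  fi
}
```

A test script would call check_glibc with the minimum version once that
number has been determined.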

Also, please ensure that "make syntax-check" passes.
You added new lines in the C code that start with a TAB.
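
To spot such lines by hand, something like the following would do. It
is only an approximation and not the rule that "make syntax-check"
itself runs; find_leading_tabs is a made-up name.

```shell
# List lines that begin with a TAB character, as file:line:content.
# Exit status follows grep: 0 if any such line exists, 1 if none.
find_leading_tabs() {
  tab=$(printf '\t')
  # /dev/null is passed as an extra file so that grep always prints
  # the file name, even when only one file is given.
  grep -n "^$tab" /dev/null "$@"
}
```

For example, "find_leading_tabs src/*.c" lists every offending line
under src, assuming that is where the C sources live.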

Finally, since this change induces a performance penalty on at
least one system, please adjust or remove this comment in the commit log.
It's inaccurate, if not misleading:

  This patch works around some of the performance problems of multibyte grep


