--- Begin Message ---
Subject: [PATCH] grep: use fastmap in regex
Date: Sun, 17 Jul 2016 01:57:30 +0900
sed and gawk use fastmap in regex, but grep does not. I expect that using
fastmap will speed up grep for patterns that are handled by regex.
before:
$ time -p env LC_ALL=ja_JP.eucjp src/grep '\([a-b]\)\1' k
real 7.83
user 7.62
sys 0.07
after:
$ time -p env LC_ALL=ja_JP.eucjp src/grep '\([a-b]\)\1' k
real 0.46
user 0.38
sys 0.07
However, if grep uses fastmap, it fails the case-fold-titlecase test. This
means that grep's behavior differs from that of sed and gawk, since they use
fastmap, although the failure seems to be caused by a bug in regex.
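For readers unfamiliar with the fastmap, here is a minimal sketch of how a caller of the GNU regex API can opt in to it. This is not the attached patch: the helper names compile_pattern and first_match are hypothetical, the choice of RE_SYNTAX_POSIX_BASIC is an assumption made for illustration, and the code assumes glibc's <regex.h> with _GNU_SOURCE defined. The idea is that the caller supplies a 256-byte fastmap buffer before compiling, and re_search can then skip start positions whose first byte cannot begin a match.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <regex.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: compile PATTERN into BUF with the GNU regex API.
   If USE_FASTMAP is nonzero, attach a 256-byte fastmap buffer first.
   Return 0 on success, -1 on failure.  */
int
compile_pattern (struct re_pattern_buffer *buf, const char *pattern,
                 int use_fastmap)
{
  assert (buf != NULL && pattern != NULL);
  memset (buf, 0, sizeof *buf);
  if (use_fastmap)
    {
      /* One entry per possible byte value.  */
      buf->fastmap = malloc (256);
      if (!buf->fastmap)
        return -1;
    }
  /* Assumption for this sketch: POSIX basic syntax, which supports
     \( \) groups and \1 back-references.  */
  re_set_syntax (RE_SYNTAX_POSIX_BASIC);
  const char *err = re_compile_pattern (pattern, strlen (pattern), buf);
  if (err)
    {
      fprintf (stderr, "regex error: %s\n", err);
      free (buf->fastmap);
      buf->fastmap = NULL;
      return -1;
    }
  return 0;
}

/* Hypothetical helper: return the offset of the first match of PATTERN
   in TEXT, -1 if there is no match, or -2 on error.  */
long
first_match (const char *pattern, const char *text, int use_fastmap)
{
  struct re_pattern_buffer buf;
  if (compile_pattern (&buf, pattern, use_fastmap) != 0)
    return -2;
  regoff_t len = (regoff_t) strlen (text);
  /* Passing NULL for the registers means no submatch offsets are
     reported to the caller.  */
  long pos = (long) re_search (&buf, text, len, 0, len, NULL);
  regfree (&buf);   /* Also frees the fastmap buffer.  */
  return pos;
}
```

With or without the fastmap, the search result should be the same; only the time spent rejecting impossible start positions differs.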
0001-grep-use-fastmap-in-regex.patch
Description: Text document
--- End Message ---
--- Begin Message ---
Subject: Re: bug#24009: [PATCH] grep: use fastmap in regex
Date: Thu, 1 Sep 2016 22:32:12 -0700
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Thunderbird/45.2.0
Norihiro Tanaka wrote:
> I think this patch should be suspended because of this issue.
> I reported it to glibc developers.
> https://sourceware.org/bugzilla/show_bug.cgi?id=20381
After thinking about it a bit, I came up with a variant of the patch that gives
the performance improvement unless -i is used, so I installed the attached
patches. The first patch is mostly just refactoring this somewhat-crufty code
and fixing an O(N**2) reallocation problem. The second is the real improvement.
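The message does not spell out what the O(N**2) reallocation problem was. As general background only, a common cause is growing a buffer by a fixed amount on every append, so that each growth may copy everything stored so far. The sketch below shows the usual remedy, geometric growth. The struct and function names are hypothetical and are not taken from the attached patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical growable byte buffer, for illustration only.  */
struct bytebuf
{
  char *data;
  size_t len;       /* Bytes in use.  */
  size_t alloc;     /* Bytes allocated.  */
  size_t nrealloc;  /* Number of realloc calls, kept for demonstration.  */
};

/* Append C to B, doubling the capacity whenever the buffer is full.
   Doubling keeps the total copying cost linear in the final length,
   whereas growing by a constant amount makes it quadratic.
   Return 0 on success, -1 on allocation failure.  */
int
bytebuf_push (struct bytebuf *b, char c)
{
  assert (b != NULL);
  if (b->len == b->alloc)
    {
      if (b->alloc > SIZE_MAX / 2)
        return -1;              /* Doubling would overflow.  */
      size_t new_alloc = b->alloc ? 2 * b->alloc : 16;
      char *p = realloc (b->data, new_alloc);
      if (!p)
        return -1;
      b->data = p;
      b->alloc = new_alloc;
      b->nrealloc++;
    }
  b->data[b->len++] = c;
  return 0;
}
```

Starting from a zero-initialized struct bytebuf, appending N bytes this way triggers only about log2(N) calls to realloc.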
The second patch just captures the low-hanging fruit. For example, even with -i
we could use a fastmap if all the pattern's letters (including letters matched
by ranges) happen to avoid the glibc bug. Something like that might be worth
pursuing.
Since the attached patch fixes the test case that prompted the bug report I'm
closing the bug. We can reopen it, or open a new one, if someone wants to fix
the remaining performance glitches.
Thanks again for all these fixes!
0001-grep-improve-dfasearch-storage-management.patch
Description: Text Data
0002-grep-use-regex-fastmap-unless-i.patch
Description: Text Data
--- End Message ---