bug#16777: [PATCH] Revert "grep: DFA now uses rational ranges in unibyte
From: Jim Meyering
Subject: bug#16777: [PATCH] Revert "grep: DFA now uses rational ranges in unibyte locales"
Date: Mon, 17 Feb 2014 17:41:38 -0800
On Mon, Feb 17, 2014 at 6:18 AM, Paolo Bonzini <address@hidden> wrote:
> The correct course of action for grep is to defer range interpretation
> to regex, because otherwise you can get mismatches between regexes with
> backreferences and those without.
>
> For example, [A-Z]. will use RRI but ([A-Z])\1 won't, with the confusing
> result that the first regex won't match a superset of the language
> described by the second regex.
>
> The source of the confusion is that, even though grep's dfa.c was changed
> to use range checking instead of strcoll, that code is only invoked if
> dfaexec is called with backref = NULL, and that never happens for grep!
>
> In the end, all that's needed for RRI is compiling --with-included-regex,
> and in that case the patch is almost a no-op. Almost, because there
> are corner cases that aren't handled correctly (e.g. [a-[.e.]], or
> regular expressions that include a NUL character), but this can be
> handled separately.
>
> * NEWS: Revert paragraph introduced by commit 1078b64302.
> * src/dfa.c (parse_bracket_exp): Revert back to regcomp/regexec.
>
> Signed-off-by: Paolo Bonzini <address@hidden>
Thanks. I have applied that (and pushed) with two log message changes:
I removed the Signed-off-by line (redundant when same as "Author:"),
and replaced 1078b64302 with v2.16-7-g1078b64.
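
For readers following the thread, the mismatch Paolo describes between [A-Z]. and ([A-Z])\1 can be illustrated with a small toy model. This is only a sketch under stated assumptions, not grep's or glibc's code: the interleaved collation order "aAbB...zZ" is a hypothetical stand-in for a locale's collation sequence, and every function name below is invented for the illustration.

```python
import string

# ASSUMPTION: a toy collation order with cases interleaved (aAbB...zZ).
# It stands in for a locale collation sequence; real locales differ.
TOY_COLLATION = "".join(
    lo + up for lo, up in zip(string.ascii_lowercase, string.ascii_uppercase)
)


def in_range_rri(ch, lo, hi):
    """Rational range interpretation: compare code points."""
    return ord(lo) <= ord(ch) <= ord(hi)


def in_range_collation(ch, lo, hi, order=TOY_COLLATION):
    """Collation-based interpretation: compare positions in the toy order."""
    if ch not in order:
        return False
    return order.index(lo) <= order.index(ch) <= order.index(hi)


def matches_no_backref(s, in_range):
    """Hypothetical model of [A-Z]. applied to a two-character string."""
    return len(s) == 2 and in_range(s[0], "A", "Z")


def matches_backref(s, in_range):
    """Hypothetical model of ([A-Z])\\1 applied to a two-character string."""
    return len(s) == 2 and s[0] == s[1] and in_range(s[0], "A", "Z")


# The pattern without a backreference is checked with RRI, while the one
# with a backreference is checked by collation, as in the scenario above.
rri_result = matches_no_backref("bb", in_range_rri)
collation_result = matches_backref("bb", in_range_collation)
```

In this model, "bb" is rejected by the RRI check for [A-Z]. but accepted by the collation check for ([A-Z])\1, so the first pattern no longer matches a superset of the second. Using one range interpretation for both paths removes the discrepancy, which is the motivation for deferring to regex.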