From: | Paul Eggert |
Subject: | bug#16481: dfa.c and Rational Range Interpretation |
Date: | Mon, 10 Feb 2014 11:50:07 -0800 |
User-agent: | Mozilla/5.0 (X11; Linux x86_64; rv:24.0) Gecko/20100101 Thunderbird/24.3.0 |
On 02/10/2014 01:18 AM, Paolo Bonzini wrote:
> If you use --with-included-regex, the patch is a no-op.
Are we talking about the patch in git commit 1078b64302bbf5c0a46635772808ff7f75171dbc <http://git.savannah.gnu.org/cgit/grep.git/commit/?id=1078b64302bbf5c0a46635772808ff7f75171dbc>?
If so, then the above comment doesn't sound right. Without the patch, the DFA matcher mishandles expressions in some cases, as described in Bug#16481. For example, "grep -Xawk '[\[-\]]'" will cause dfa.c to try to compile the regular expression [[-]], which won't work regardless of whether --with-included-regex is being used.
More generally, we already had the problem of subtle differences between dfa.c and full-regexp matching on platforms that do not observe RRI, because dfa.c already uses RRI in multibyte locales, regardless of whether the full matcher uses RRI. The change causes non-"C" unibyte locales to behave consistently with multibyte locales, which in some sense is an improvement (though obviously not ideal; it'd be better if it was RRI everywhere).
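For readers unfamiliar with the term, RRI means a range in a bracket expression is compared by character code value rather than by the locale's collation order. The following is only a rough sketch of that idea in Python, not the dfa.c code; the helper name in_range_rri is hypothetical. It applies the code-value comparison to the range from '[' to ']' used in the awk-syntax example above.

```python
def in_range_rri(lo: str, hi: str, ch: str) -> bool:
    """Return True if ch lies in the range lo-hi, compared by code point.

    This models only the RRI comparison itself; it does not parse
    bracket expressions or consult the locale.
    """
    return ord(lo) <= ord(ch) <= ord(hi)


# '[' is 0x5B, '\' is 0x5C and ']' is 0x5D, so by code value the
# range from '[' to ']' holds exactly these three characters.
members = [chr(c) for c in range(ord("["), ord("]") + 1)]

print(in_range_rri("[", "]", "\\"))  # True
print(in_range_rri("[", "]", "a"))   # False
print(members)
```

Under a collation-based interpretation the set of characters between the two endpoints could differ by locale, which is the kind of subtle mismatch between matchers discussed above.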
Non-"C" unibyte locales are dying out, so to some extent this is a minor issue. In practice most users these days won't notice or care about this change.