bug-grep

bug#16481: dfa.c and Rational Range Interpretation


From: Paul Eggert
Subject: bug#16481: dfa.c and Rational Range Interpretation
Date: Mon, 10 Feb 2014 11:50:07 -0800
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:24.0) Gecko/20100101 Thunderbird/24.3.0

On 02/10/2014 01:18 AM, Paolo Bonzini wrote:

> If you use --with-included-regex, the patch is a no-op.

Are we talking about the patch in git commit 1078b64302bbf5c0a46635772808ff7f75171dbc <http://git.savannah.gnu.org/cgit/grep.git/commit/?id=1078b64302bbf5c0a46635772808ff7f75171dbc>?

If so, then the above comment doesn't sound right. Without the patch, the DFA matcher mishandles expressions in some cases, as described in Bug#16481. For example, "grep -Xawk '[\[-\]]'" will cause dfa.c to try to compile the regular expression [[-]], which won't work regardless of whether --with-included-regex is being used.

More generally, we already had the problem of subtle differences between dfa.c and full-regexp matching on platforms that do not observe RRI, because dfa.c already uses RRI in multibyte locales, regardless of whether the full matcher uses RRI. The change causes non-"C" unibyte locales to behave consistently with multibyte locales, which in some sense is an improvement (though obviously not ideal; it'd be better if it was RRI everywhere).
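As a rough illustration of what RRI means for a bracket range, here is a minimal Python sketch. It is not how dfa.c is implemented; the function names (in_range_rri, in_range_collation, range_members_rri) are made up for this example. Under RRI a range lo-hi matches exactly the characters whose code points lie between the endpoints, whereas a collation-based interpretation asks the current locale how the characters sort, so its answer can vary from one locale to another.

```python
import locale


def in_range_rri(ch, lo, hi):
    """Rational Range Interpretation: compare raw code points."""
    return ord(lo) <= ord(ch) <= ord(hi)


def in_range_collation(ch, lo, hi):
    """Collation-based interpretation (hypothetical helper).

    Uses the collation order of the current LC_COLLATE locale,
    so the result may differ between locales.
    """
    return locale.strcoll(lo, ch) <= 0 and locale.strcoll(ch, hi) <= 0


def range_members_rri(lo, hi):
    """List every character that the range lo-hi matches under RRI."""
    return [chr(cp) for cp in range(ord(lo), ord(hi) + 1)]


if __name__ == "__main__":
    # The range from '[' to ']' in the example above covers code
    # points 0x5B through 0x5D under RRI.
    print(range_members_rri("[", "]"))
    # Under RRI, 'B' (0x42) is outside a-z (0x61 through 0x7A).
    print(in_range_rri("B", "a", "z"))
    # The collation-based answer depends on the active locale.
    print(in_range_collation("B", "a", "z"))
```

With RRI the result for a given range is the same in every locale, which is why having dfa.c and the full matcher agree on it avoids the subtle differences mentioned above.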

Non-"C" unibyte locales are dying out, so to some extent this is a minor issue. In practice most users these days won't notice or care about this change.




