
bug#16481: dfa.c and Rational Range Interpretation


From: Paolo Bonzini
Subject: bug#16481: dfa.c and Rational Range Interpretation
Date: Mon, 10 Feb 2014 23:13:42 +0100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:24.0) Gecko/20100101 Thunderbird/24.2.0

Il 10/02/2014 20:50, Paul Eggert ha scritto:

If so, then the above comment doesn't sound right.  Without the patch,
the DFA matcher mishandles expressions in some cases, as described in
Bug#16481.  For example, "grep -Xawk '[\[-\]]'" will cause dfa.c to try
to compile the regular expression [[-]], which won't work regardless of
whether --with-included-regex is being used.

Ok, so there is a real bug. But it is not immediately obvious what the problem is, and the bug has (AFAICS) no test case and no mention in the commit message. Without this, I am not sure that the fix should not be the one in this commit.
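
To make the quoted example concrete, here is a rough Python sketch of why the flattened text [[-]] no longer denotes the range from '[' to ']'. It is not dfa.c's code: both helper functions are hypothetical, the scanner ignores [: :], [= =] and [. .] elements, and it only applies the POSIX rule that ']' is literal when it comes first in the list.

```python
def strip_backslashes(pattern: str) -> str:
    """Hypothetical helper: drop each backslash, keep the character it escapes."""
    out = []
    i = 0
    while i < len(pattern):
        if pattern[i] == "\\" and i + 1 < len(pattern):
            i += 1  # skip the backslash itself
        out.append(pattern[i])
        i += 1
    return "".join(out)


def posix_bracket_end(pattern: str, start: int = 0) -> int:
    """Index of the ']' closing the bracket expression that opens at `start`.

    Simplified: character classes and collating elements are not handled.
    """
    assert pattern[start] == "["
    i = start + 1
    if i < len(pattern) and pattern[i] == "^":
        i += 1  # optional negation
    if i < len(pattern) and pattern[i] == "]":
        i += 1  # a leading ']' is a literal list member
    while i < len(pattern):
        if pattern[i] == "]":
            return i
        i += 1
    raise ValueError("unterminated bracket expression")


# What the awk-syntax pattern is meant to match: code points '[' through ']'.
intended = {chr(c) for c in range(ord("["), ord("]") + 1)}

flattened = strip_backslashes(r"[\[-\]]")
end = posix_bracket_end(flattened)
members = set(flattened[1:end])   # list members of the bracket expression
trailing = flattened[end + 1:]    # text left over after the closing ']'

print(flattened, members, repr(trailing), intended)
```

Under these assumptions the bracket expression closes at the first ']', so its members are '[' and '-', and a stray literal ']' follows it. The backslash, which the intended range covers, is not matched at all.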

More generally, we already had the problem of subtle differences between
dfa.c and full-regexp matching on platforms that do not observe RRI,
because dfa.c already uses RRI in multibyte locales, regardless of
whether the full matcher uses RRI.

It only does so if the fallback to regex is not requested (dfaexec invoked with backref = NULL). This is never the case for grep. In fact, as far as I know it is never the case, and I've been tempted many times to completely remove the mostly dead code dealing with multibyte ranges if backref = NULL.

The change causes non-"C" unibyte
locales to behave consistently with multibyte locales, which in some
sense is an improvement (though obviously not ideal; it'd be better if
it was RRI everywhere).

It would be if glibc were fixed. For me, consistency with other GNU utilities---especially sed---trumps anything else, and this was the main point in fixing multibyte matching in GNU grep 2.6 and newer.
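
For readers who have not met the term: under Rational Range Interpretation a range such as [a-z] is decided by comparing character code values rather than by the locale's collation order. The sketch below contrasts the two. The interleaved collation order is made up for illustration and is not taken from any real locale definition, and the function names are hypothetical.

```python
import string


def in_range_rri(ch: str, lo: str, hi: str) -> bool:
    """Range test by code point, i.e. Rational Range Interpretation."""
    return ord(lo) <= ord(ch) <= ord(hi)


# Made-up collation order "aAbBcC...zZ"; real locales differ in detail.
HYPOTHETICAL_COLLATION = "".join(c + c.upper() for c in string.ascii_lowercase)


def in_range_collated(ch: str, lo: str, hi: str,
                      order: str = HYPOTHETICAL_COLLATION) -> bool:
    """Range test by position in a collation sequence (ASCII letters only)."""
    return order.index(lo) <= order.index(ch) <= order.index(hi)


for ch in "bBZ":
    print(ch, in_range_rri(ch, "a", "z"), in_range_collated(ch, "a", "z"))
```

With this made-up order, [a-z] picks up 'B' but not 'Z' under collation, while the code-point test accepts lowercase letters only. This is the kind of difference between matchers that the thread is concerned with.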

Non-"C" unibyte locales are dying out, so to some extent this is a minor
issue.  In practice most users these days won't notice or care about
this change.

That's true.

Paolo




