bug#16481: dfa.c and Rational Range Interpretation

bug-grep

From:	Paolo Bonzini
Subject:	bug#16481: dfa.c and Rational Range Interpretation
Date:	Mon, 10 Feb 2014 23:13:42 +0100
User-agent:	Mozilla/5.0 (X11; Linux x86_64; rv:24.0) Gecko/20100101 Thunderbird/24.2.0

On 10/02/2014 20:50, Paul Eggert wrote:


If so, then the above comment doesn't sound right.  Without the patch,
the DFA matcher mishandles expressions in some cases, as described in
Bug#16481.  For example, "grep -Xawk '[\[-\]]'" will cause dfa.c to try
to compile the regular expression [[-]], which won't work regardless of
whether --with-included-regex is being used.

Ok, so there is a real bug. But it is not immediately obvious what the problem is, and the bug has (AFAICS) no test case and no mention in the commit message. Without this, I am not sure that the fix should not be the one in this commit.
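
To make the [\[-\]] versus [[-]] difference concrete, here is a minimal sketch. It is not taken from dfa.c or from the patch; the function names are hypothetical, and the POSIX regcomp() call is only a stand-in for "some matcher that is handed the unescaped string". It assumes an ASCII-compatible character set, where the awk-style range covers '[', '\' and ']'. Under POSIX bracket rules the unescaped form reads as the bracket expression [[-] followed by a literal ], so it wants two characters rather than one.

```c
#include <assert.h>
#include <regex.h>
#include <stdbool.h>
#include <stddef.h>

/* Intended meaning of the awk-style pattern [\[-\]] (illustrative only):
   the text contains at least one character whose code lies between
   '[' and ']' inclusive, i.e. '[', '\\' or ']' in ASCII. */
bool intended_matches(const char *text)
{
    for (const unsigned char *p = (const unsigned char *) text; *p; p++)
        if (*p >= '[' && *p <= ']')
            return true;
    return false;
}

/* Does the POSIX extended regular expression `pattern` match anywhere
   in `text`?  Returns false if the pattern fails to compile. */
bool posix_matches(const char *pattern, const char *text)
{
    regex_t re;
    if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
        return false;
    bool matched = regexec(&re, text, 0, NULL, 0) == 0;
    regfree(&re);
    return matched;
}

/* Self-check of the discrepancy; call it from main() to run it. */
void check_examples(void)
{
    /* Each single character in the range satisfies the intended pattern... */
    assert(intended_matches("["));
    assert(intended_matches("\\"));
    assert(intended_matches("]"));
    assert(!intended_matches("-"));

    /* ...but none of them satisfies the unescaped string [[-]]. */
    assert(!posix_matches("[[-]]", "["));
    assert(!posix_matches("[[-]]", "\\"));
    assert(!posix_matches("[[-]]", "]"));

    /* [[-]] instead needs '[' or '-' immediately followed by ']'. */
    assert(posix_matches("[[-]]", "[]"));
    assert(posix_matches("[[-]]", "-]"));
}
```

A regression test along these lines would have to go through grep -Xawk itself; the sketch only shows why the two spellings are not interchangeable.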

More generally, we already had the problem of subtle differences between
dfa.c and full-regexp matching on platforms that do not observe RRI,
because dfa.c already uses RRI in multibyte locales, regardless of
whether the full matcher uses RRI.

It only does so if the fallback to regex is not requested (dfaexec invoked with backref = NULL). This is never the case for grep. In fact, as far as I know it is never the case, and I've been tempted many times to completely remove the mostly dead code dealing with multibyte ranges if backref = NULL.

The change causes non-"C" unibyte
locales to behave consistently with multibyte locales, which in some
sense is an improvement (though obviously not ideal; it'd be better if
it was RRI everywhere).

It would be if glibc were fixed. For me, consistency with other GNU utilities---especially sed---trumps anything else, and this was the main point in fixing multibyte matching in GNU grep 2.6 and newer.
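
For reference, the two range interpretations being contrasted here can be sketched as follows. This is an illustration under my own assumptions, not code from dfa.c, regex or glibc: both function names are hypothetical, and the collation variant uses wcscoll() on one-character strings as a simple stand-in for whatever a real matcher does.

```c
#include <assert.h>
#include <stdbool.h>
#include <wchar.h>

/* Rational Range Interpretation: lo-hi covers exactly the characters
   whose code values lie between lo and hi, whatever the locale. */
bool in_range_rri(wchar_t lo, wchar_t hi, wchar_t c)
{
    return lo <= c && c <= hi;
}

/* Collation-based interpretation: lo-hi covers the characters that
   sort between lo and hi under the current LC_COLLATE setting. */
bool in_range_collation(wchar_t lo, wchar_t hi, wchar_t c)
{
    wchar_t l[2] = { lo, L'\0' };
    wchar_t h[2] = { hi, L'\0' };
    wchar_t x[2] = { c, L'\0' };
    return wcscoll(l, x) <= 0 && wcscoll(x, h) <= 0;
}

/* In the default "C" locale (no setlocale() call) collation order is
   code-value order, so the two interpretations agree on a-z for every
   ASCII character.  Call this from main() to check it. */
void check_c_locale_agreement(void)
{
    for (wchar_t c = 1; c < 128; c++)
        assert(in_range_rri(L'a', L'z', c)
               == in_range_collation(L'a', L'z', c));
}
```

In a locale whose collation order interleaves upper and lower case, the collation-based variant may accept a character such as 'B' for a-z, while the code-value variant never does. That is the kind of difference a user can see between matchers that disagree on RRI.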

Non-"C" unibyte locales are dying out, so to some extent this is a minor
issue.  In practice most users these days won't notice or care about
this change.


That's true.

Paolo
