bug#17376: [PATCH] grep: fix the different behaviour for a invalid seque

From: Paul Eggert
Subject: bug#17376: [PATCH] grep: fix the different behaviour for a invalid sequence between KWset and DFA
Date: Mon, 05 May 2014 20:26:37 -0700

While thinking about Bug#17376 I noticed some related bugs, which appear to have been in 'grep' since at least grep 2.0. For example:

$ encode() { echo "$1" | tr ABC '\357\274\241'; }
$ encode ABCABC >exp3
$ encode _____________________ABCABC___ >exp4
$ bca=$(encode BCA)
$ grep "$bca" exp3
$ grep -F "$bca" exp3
$ grep "\\(\\)\\1$bca" exp3

Here the regexp code disagrees with KWset and with the DFA, which is a bug: KWset and DFA should affect only performance, not behavior.
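
To see why the pattern counts as an invalid sequence, it may help to dump the bytes that encode produces. The sketch below reuses encode from the transcript and adds a hypothetical helper, hex, which is not part of the original message. It assumes the test is run in a UTF-8 locale, where the bytes EF BC A1 form U+FF21 (FULLWIDTH LATIN CAPITAL LETTER A), so "BCA" maps to the tail of one character followed by the lead byte of the next.

```shell
# encode is copied from the message; hex is a hypothetical helper added
# here only to make the raw bytes visible.
encode() { echo "$1" | tr ABC '\357\274\241'; }
hex() { od -An -tx1 | tr -d ' \n'; echo; }

# Contents of exp3: two copies of EF BC A1 plus the trailing newline.
encode ABCABC | hex                  # prints efbca1efbca10a

# The pattern held in $bca (command substitution drops the newline):
# two continuation bytes followed by a lone lead byte.
printf '%s' "$(encode BCA)" | hex    # prints bca1ef
```

The pattern bytes do occur inside exp3, but only straddling a character boundary, which is presumably why byte-oriented and character-oriented matchers can give different answers.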

$ grep "$bca" exp4
$ grep -F "$bca" exp4
$ grep "\\(\\)\\1$bca" exp4

Here they agree, but only because there's a bug in is_mb_middle! Fixing that bug will cause them to disagree again.
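
The comparison above can be repeated mechanically with a small wrapper. This is only a sketch: compare_matchers is a hypothetical name, and it assumes that the empty group plus backreference serves to push grep onto the regex matcher, while the other two invocations are free to use the DFA or KWset fast paths. It reports each exit status (0 for a match, 1 for none) and succeeds only when all three agree.

```shell
# Hypothetical helper, not part of the patch: run one pattern through the
# three invocations used in the message and compare their exit statuses.
compare_matchers() {
  pattern=$1
  file=$2
  grep "$pattern" "$file" >/dev/null; s1=$?
  grep -F "$pattern" "$file" >/dev/null; s2=$?
  grep "\\(\\)\\1$pattern" "$file" >/dev/null; s3=$?
  echo "plain=$s1 fixed=$s2 backref=$s3"
  # Succeed only if all three invocations returned the same status.
  [ "$s1" -eq "$s2" ] && [ "$s2" -eq "$s3" ]
}
```

With the files from the transcript, compare_matchers "$bca" exp3 and compare_matchers "$bca" exp4 would show whether a given grep build is consistent; the result depends on the grep version and locale in use.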

I installed the attached patch to fix the bugs I found, and to adjust the test cases accordingly.

