bug-gnu-utils

Re: grep -e '\(a\)\1' -e '\(b\)\1'


From: Paul Eggert
Subject: Re: grep -e '\(a\)\1' -e '\(b\)\1'
Date: Sun, 18 Feb 2001 17:22:42 -0800 (PST)

> From: "Alain Magloire" <address@hidden>
> Date: Sun, 18 Feb 2001 15:26:36 -0500 (EST)
> 
> > > > echo ba | egrep '(a)\1|(b)\1'

> It will match "aaba".
> 
> Now I do not know what POSIX.2 says about this.

POSIX doesn't say anything about ERE back-references; the committee
discussed the idea but rejected standardization.

However, it may amuse you to know that a similar problem occurs with
BRE back-references.  For example, should the following shell command
output nothing, or output a line containing "b"?

        echo 'b' | grep '\(\(a\)\)*\2'

GNU grep outputs nothing, but Solaris 8 xpg4 grep outputs "b".
This is because GNU grep says \2 does not match if the corresponding
subexpression never matched, but Solaris 8 xpg4 grep says \2 matches
the empty string in that case.
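
The difference can be probed on a local system with a small sketch
like the one below.  The function name probe_unset_backref is made up
for illustration, and the result depends entirely on which grep is
installed, so no particular outcome is assumed here.

```shell
# Hypothetical helper: report how the local grep treats a back-reference
# to a subexpression that never took part in the match.
probe_unset_backref() {
    if printf 'b\n' | grep -q '\(\(a\)\)*\2'; then
        status=0
    else
        status=$?
    fi
    case $status in
        0) printf '%s\n' 'match' ;;     # \2 treated as the empty string
        1) printf '%s\n' 'no match' ;;  # \2 fails when its group is unset
        *) printf '%s\n' 'error' ;;     # grep rejected the pattern
    esac
}

probe_unset_backref
```

A "no match" result corresponds to the GNU grep behavior described
above, and "match" to the Solaris 8 xpg4 behavior.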

In the discussion of BRE back-references, the latest POSIX draft says:

   The back-reference expression '\n' shall match the same (possibly
   empty) string of characters as was matched by a subexpression
   enclosed between "\(" and "\)" preceding the '\n'....  When the
   referenced subexpression matched more than one string, the
   back-referenced expression shall refer to the last matched string.
   If the subexpression referenced by the back-reference matches more
   than one string because of an asterisk ('*') or an interval
   expression (see item (5)), the back-reference shall match the last
   (rightmost) of these strings.
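
As a rough illustration of the "last matched string" rule quoted
above, consider a starred subexpression that does match at least
once.  The function name last_iteration_demo and the sample input are
assumptions chosen for this sketch, not taken from the POSIX text.

```shell
# Hypothetical demo: \([ab]\)* may iterate several times, and \1 then
# refers to the character consumed by the last (rightmost) iteration.
last_iteration_demo() {
    # "abb": iterations match "a" then "b"; \1 is "b" and matches the
    #        final character, so the line is selected.
    # "aba": no split lets \1 equal the final character, so the line
    #        is not selected.
    printf 'abb\naba\n' | grep '^\([ab]\)*\1$'
}

last_iteration_demo
```

Both input lines force at least one iteration of the group, so this
sketch sidesteps the undefined case where the subexpression never
matches.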

In my opinion, this does not define the behavior of \(\(a\)\)*\2 when
the \(a\) never matched a string.  Hence POSIX does not define the
behavior of "grep" on the above example.  That is, even though GNU
grep and Solaris 8 xpg4 grep act differently here, they both conform
to POSIX.  (Personally, I prefer the GNU grep behavior.  :-)


