From: | Matthew Woehlke |
Subject: | Re: regular expression behaviour. |
Date: | Wed, 03 Sep 2008 16:02:17 -0500 |
User-agent: | Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.8.1.16) Gecko/20080723 Fedora/2.0.0.16-1.fc9 Thunderbird/2.0.0.16 Mnenhy/0.7.5.0 |
cristi wrote:
By executing the following statement at the command prompt (Linux):

    $ echo " 01" | grep '^ *...$'

the string " 01" (excluding the double quotes) is printed to standard output. Can somebody explain this behaviour? Of course, theoretically the regular expression matches the string, but according to the documentation it shouldn't. The documentation says that the * operator is greedy, so the regular expression should match all the spaces at the beginning of the string and then (because of the 3 dots) it should try to match 3 more characters.
I might be wrong, but I think "greedy" means that, for example, "a*b" will match all of "aaaaaaaaaaaaaaaaaaaab" rather than just the last two characters (visible e.g. with --color). It does not mean that the operator will consume every possible character when doing so prevents a match that consuming fewer would allow. In your example, " *" can match zero spaces, which leaves the three dots free to match " 01".

IOW, greediness isn't especially relevant to grep, where what mainly matters is matched vs. didn't match. It matters more for tools where it is significant which part of the string matched (e.g. sed, syntax highlighters, etc.).
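The distinction above can be checked from the shell. This is only a rough sketch: the variable names are made up for illustration, and the second command assumes a grep that supports the -o option (GNU grep does), which prints just the matched part of each line.

```shell
#!/bin/sh
# The pattern from the question: zero or more spaces, then exactly three
# characters. ' *' is allowed to match zero spaces, so '...' can match ' 01'.
matched=$(printf ' 01\n' | grep -c '^ *...$')
echo "lines matched by '^ *...\$' for ' 01': $matched"

# A two-character line cannot satisfy the three dots, so nothing matches.
# grep exits non-zero on no match, hence the '|| true'.
short=$(printf '01\n' | grep -c '^ *...$' || true)
echo "lines matched by '^ *...\$' for '01': $short"

# Greediness shows up in WHICH text is reported, not in whether the line
# matches. -o (assumed available, as in GNU grep) prints only the matched part.
longest=$(printf 'aaaab\n' | grep -o 'a*b')
echo "text matched by 'a*b' in 'aaaab': $longest"
```

The first two commands only ask whether a line matches, which is all plain grep reports. Only the last one exposes the extent of the match, which is where greediness becomes visible.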
-- 
Matthew
If a signature is not read by anyone, does it make a sound?