[bug #16421] regexp bug in grep -P or libpcre
From: Tony Abou-Assaleh
Subject: [bug #16421] regexp bug in grep -P or libpcre
Date: Mon, 08 Oct 2007 04:10:43 +0000
User-agent: Mozilla/5.0 (Macintosh; U; Intel Mac OS X; en-US; rv:1.8.1.7) Gecko/20070914 Firefox/2.0.0.7
Update of bug #16421 (project grep):
Status: None => Confirmed
_______________________________________________________
Follow-up Comment #1:
I see similar behaviour in grep-2.5.3 and libpcre 6.7.
Also:
$ printf "a\nb\n" | src/grep -P '[^a]'
a
b
$ printf "a\nb\n" | src/grep -P '^[^a]'
b
$ printf "a\nb\n" | src/grep -P '[^a]$'
b
$ printf "a\nb\n" | src/grep -P '[^a][^b]'
b
$ printf "a\nb\n" | src/grep -P '[\n]'
a
b
$ printf "a\nb\n" | src/grep -P '[^a][\n]'
b
It appears that the end-of-line character is passed to PCRE as part of the
subject string and is matched by [^a]. In PCRE, [^a] matches an end-of-line
character. From the pcrepattern man page:
"The newline character is never treated in any special way in character
classes, whatever the setting of the PCRE_DOTALL or PCRE_MULTILINE options is.
A class such as [^a] will always match a newline."
The question is: should grep pass the end-of-line character of each line to PCRE?
To be consistent, I think the answer is no.
Care must be taken to handle -z and binary files properly.
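
To illustrate the difference between the two choices, here is a minimal
sketch in Python. It is not grep's code: Python's re module is not PCRE,
although its negated character classes also match a newline, and the
function matching_lines and its parameters are hypothetical names used only
for this example. The terminator parameter stands in for the record
separator, which is a newline by default and a NUL byte under -z.

```python
import re

def matching_lines(data, pattern, terminator="\n", strip_terminator=True):
    """Return the records of `data` in which `pattern` finds a match.

    Hypothetical helper: with strip_terminator=False the terminator is
    left on the subject string, which mimics the behaviour reported above.
    """
    regex = re.compile(pattern)
    records = data.split(terminator)
    if records and records[-1] == "":
        records.pop()  # a trailing terminator does not start a new record
    hits = []
    for record in records:
        # Either hand the matcher the bare record, or the record plus
        # its terminator (the case in which [^a] matches the newline).
        subject = record if strip_terminator else record + terminator
        if regex.search(subject):
            hits.append(record)
    return hits

data = "a\nb\n"
print(matching_lines(data, r"[^a]", strip_terminator=False))  # ['a', 'b']
print(matching_lines(data, r"[^a]", strip_terminator=True))   # ['b']
```

With the terminator left on the subject, "a" is reported because [^a]
matches the trailing newline; with it stripped, only "b" is reported. With
a NUL terminator, a newline inside a record is ordinary data and can still
be matched by [^a], which is one reason -z needs separate care.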
_______________________________________________________
Reply to this item at:
<http://savannah.gnu.org/bugs/?16421>
_______________________________________________
Message sent via/by Savannah
http://savannah.gnu.org/