Re: Serious bug in regexps (eg., grep 2.5.1)
From: Bob Proulx
Subject: Re: Serious bug in regexps (eg., grep 2.5.1)
Date: Thu, 6 Mar 2003 21:42:51 -0700
User-agent: Mutt/1.3.28i
> To: <address@hidden>
Wow. That is a really old address. I suggest you update your address
book and in the future use address@hidden or a newer address as
documented in 'grep --help'.
Wayne Hayes wrote:
> Running Red Hat 8.0, /bin/grep --version says "grep (GNU grep) 2.5.1".
> This bug also affects gawk 3.1.1, so I'm guessing it might be in a
> common regexp library that they may use, but I don't know enough to
> delve further.
Thanks for your report. However, it matches many other reports that
turned out not to indicate a bug in grep or any other utility.
> The Bug:
>
> grep '^[a-z]' returns lines beginning with both upper and lower case letters.
> Plus, it runs about 100 times slower than version 2.4.2. Possibly these
> "features" are related.
>
> gawk '/^[a-z]/{print}' has the same problem.
Because of the vendor you mentioned I am very certain you have LANG
set to a locale with dictionary sorting order. In those locales the
range [a-z] matches both upper and lower case letters, and punctuation
is ignored when sorting.
To test that this is really the case, set LC_ALL=POSIX and verify that
your commands operate in a standard manner. If they do, then your
environment is definitely the cause of your trouble.
export LC_ALL=POSIX
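The same check can be made for a single command without changing the
rest of the shell session. The following is only a minimal sketch: the
helper name posix_grep and the three sample lines are made up for
illustration and do not come from the original report.

```shell
# Hypothetical helper: run grep with the POSIX locale forced for this
# one invocation only; the surrounding shell environment is untouched.
posix_grep() {
    LC_ALL=POSIX grep "$@"
}

# Sample input (invented for this sketch): one capitalized line between
# two lowercase ones. In the POSIX locale '^[a-z]' should select only
# the lines that begin with a lowercase ASCII letter.
printf 'apple\nBanana\ncherry\n' | posix_grep '^[a-z]'
```

If the capitalized line shows up with plain grep but not with the
helper, the locale setting rather than grep itself explains the
difference.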
The man page for grep documents this information.
LC_ALL, LC_CTYPE, LANG
These variables specify the LC_CTYPE locale, which
determines the type of characters, e.g., which
characters are whitespace. The locale is determined by
the first of these variables that is set. The POSIX
locale is used if none of these environment variables
are set, or if the locale catalog is not installed, or
if grep was not compiled with national language support
(NLS).
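The lookup order described in that excerpt can be sketched in a few
lines of shell. This is an illustration only: effective_ctype is a
hypothetical helper, and it assumes that an empty variable counts as
unset, which is how POSIX describes locale selection (the man page
wording says only "set").

```shell
# Hypothetical helper: print the locale name that would be used for
# LC_CTYPE, following the order LC_ALL, then LC_CTYPE, then LANG, and
# falling back to POSIX when none of them has a value.
# Assumption: an empty value is treated the same as an unset variable.
effective_ctype() {
    printf '%s\n' "${LC_ALL:-${LC_CTYPE:-${LANG:-POSIX}}}"
}

# Show what the current environment selects.
effective_ctype
```

The standard 'locale' command, run with no arguments, reports the
values the system actually resolved and can be used to cross-check
this.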
Check the FAQ listed below. Look for "not sorting in normal order".
http://www.gnu.org/software/fileutils/doc/faq/
> I couldn't find any references to this bug on RedHat's bugzilla web site.
Look for references to LANG being set to dictionary sort order.
Search the gnu.org textutils mailing list relating to sort and you
will see that this one topic commands a large fraction of the mailing
list questions.
Bob