bug-gnu-utils

Re: [gawk] RE bug??


From: Chuck Swiger
Subject: Re: [gawk] RE bug??
Date: Fri, 08 Jul 2005 08:57:23 -0400
User-agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.7.8) Gecko/20050511

Bob Proulx wrote:
Stephen Davies wrote:
Here is the output that I see here:

bash-3.00$ echo Silver | awk '/[[:upper:]][[:upper:][:digit:]]+/'
bash-3.00$ echo Silver | awk '/[A-Z][A-Z0-9]+/'
bash-3.00$ echo Silver | LC_COLLATE=en_US awk '/[A-Z][A-Z0-9]+/'
bash-3.00$ echo Silver | LC_COLLATE=POSIX awk '/[A-Z][A-Z0-9]+/'
bash-3.00$ awk --version
GNU Awk 3.1.4

Hmm...

I can confirm the results Stephen says, using both GNU awk and the "one true awk" described in ISBN 0-201-07981-X which appears to be Lucent source code.

36-pi% echo Silver | gawk '/[[:upper:]][[:upper:][:digit:]]+/'
37-pi% echo Silver | gawk '/[[:upper:]][[:upper:][:digit:]]+/'
37-pi% echo Silver | LC_COLLATE=en_US gawk '/[A-Z][A-Z0-9]+/'
38-pi% echo Silver | LC_COLLATE=POSIX gawk '/[A-Z][A-Z0-9]+/'
39-pi% gawk --version
GNU Awk 3.1.4
Copyright (C) 1989, 1991-2003 Free Software Foundation.
[ ... ]
40-pi% uname -a
FreeBSD pi.codefab.com 5.4-STABLE FreeBSD 5.4-STABLE #0: Wed Jun 1 20:15:28 EDT 2005

[ ... ]
It is somewhat humorous that one of the most common bug reports against
GNU coreutils is that some distro has turned on en_US for them without
their knowledge and enabled this behavior of non-standard collating
sequence, and here is the exact reverse case where it is desired for
the test case and it is not enabled.  :-)

Are you sure that you aren't running into a bug in GNU's libc or locale support on those Linux platforms? The FreeBSD system above has full locale support present:

41-pi% ls -d1 /usr/share/locale/en_US*
/usr/share/locale/en_US.ISO8859-1//
/usr/share/locale/en_US.ISO8859-15//
/usr/share/locale/en_US.US-ASCII//
/usr/share/locale/en_US.UTF-8//

...and I could repeat the tests on a Solaris 8 or MacOS X 10.3/10.4 system, if that would help. (But I wouldn't expect "ilver" to match [[:upper:]]+...?)
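
For anyone repeating the tests on another platform, here is a minimal sketch of a script that runs the same patterns under explicit locales. The check function is a hypothetical helper, not part of any awk distribution. It sets LC_ALL rather than LC_COLLATE, on the assumption that a stray LC_ALL in the environment (which overrides LC_COLLATE) should not mask the setting being tested. The en_US line assumes that locale is installed; its result depends on the platform's libc and awk, so no expected output is given for it.

```shell
#!/bin/sh
# Hypothetical helper: report whether awk matches the input "Silver"
# against regex $2 when run with LC_ALL set to $1.
check() {
    loc=$1
    pat=$2
    # awk exits 0 only if at least one input line matched the pattern.
    if echo Silver | LC_ALL=$loc awk "/$pat/ { found = 1 } END { exit !found }"; then
        echo "$loc $pat: match"
    else
        echo "$loc $pat: no match"
    fi
}

# In the C (POSIX) locale, ranges follow ASCII order, so "Si" cannot match.
check C '[A-Z][A-Z0-9]+'
check C '[[:upper:]][[:upper:][:digit:]]+'

# Platform-dependent: assumes an en_US locale exists on this system.
check en_US '[A-Z][A-Z0-9]+'
```

A "match" on the last line together with "no match" on the first would point at the locale's collating order rather than at awk's regex code.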

--
-Chuck




