bug-gnu-utils

Re: grep locale off by one?


From: P
Subject: Re: grep locale off by one?
Date: Fri, 22 Aug 2003 12:36:21 +0100
User-agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.4) Gecko/20030701

Andreas Schwab wrote:
address@hidden writes:

|> address@hidden wrote:
|> > What in the name of holy collating orders
|> > is going on below?
|> > $ echo "Z" | grep "[a-z]"
|> > $ echo "Y" | grep "[a-z]"
|> > Y
|> > $ echo "a" | grep "[A-Z]"

"[A-Z]" does not include "a" (which is collated before "A" in your locale).

Ah, OK. Thanks for the info. It was a different inconsistency
that confused me. It seems that grep (with or without
the included regex library) treats locales inconsistently:
the same range expression gives different results in en_IE
and en_IE.UTF-8. I tested sort and pcregrep, and they are
consistent across the two.

The following should illustrate the problem:

$ echo "A" | LC_ALL=en_IE grep "[a-z]"
$ echo "A" | LC_ALL=en_IE.UTF-8 grep "[a-z]"
A

$ echo "A" | LC_ALL=en_IE ./pcregrep "[a-z]"
$ echo "A" | LC_ALL=en_IE.UTF-8 ./pcregrep "[a-z]"
$ echo "A" | LC_ALL=en_IE.UTF-8 ./pcregrep -u "[a-z]"

$ echo -ne "b\nA\nB\na\n" | LC_ALL=en_IE sort
a
A
b
B
$ echo -ne "b\nA\nB\na\n" | LC_ALL=en_IE.UTF-8 sort
a
A
b
B

Note that the latest official grep (2.5.1) and
Red Hat's latest (2.5.1-16.1) behave alike.
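
For comparison, here is a minimal sketch of two ways to take the
locale's collating order out of the picture: forcing LC_ALL=C, where
ranges follow byte values, and using the POSIX class [[:lower:]]
instead of a range. The helper name "match" is made up for this
illustration, and the sketch assumes a POSIX grep on the PATH; it
does not try to reproduce the en_IE results above.

```shell
# Hypothetical helper: report whether pattern $2 matches string $1
# when grep runs in the C locale (byte-value ordering for ranges).
match() {
    if printf '%s\n' "$1" | LC_ALL=C grep -q "$2"; then
        echo "match"
    else
        echo "no match"
    fi
}

# In the C locale, [a-z] covers only the lowercase ASCII letters.
match A '[a-z]'
match a '[a-z]'

# A character class names the set directly, with no range involved.
match A '[[:lower:]]'
match a '[[:lower:]]'
```

Under these assumptions only the two lowercase "a" cases should
report a match.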

Pádraig.




