bug-gnu-utils

Re: Case insensitivity seems to ignore lower bound of interval


From: Aharon Robbins
Subject: Re: Case insensitivity seems to ignore lower bound of interval
Date: Wed, 27 Apr 2011 21:48:41 +0300
User-agent: Heirloom mailx 12.4 7/29/08

Greetings. Re the below.

First, thank you for the bug report.

Second, it's not a bug, but rather the consequence of how locales behave.
This is documented somewhat in the released gawk manual and documented better
in the upcoming one.

I do agree that the behavior is surprising, disconcerting, undesirable,
and so on.  For this reason, the upcoming version of gawk translates
ranges of the form [d-h] into [defgh] before compiling the regular
expression.
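
For releases that do not yet do this translation, two workarounds can be
sketched as follows. This is a minimal sketch, assuming a POSIX shell with
an awk on the PATH; the variable names (input, c_locale, explicit) are
purely illustrative and not part of gawk. The first command forces the C
locale, where the report below says the problem does not appear; the second
writes the range out by hand, mirroring the rewrite described above.

```shell
# Sample input taken from the bug report quoted below.
input="ijklmnopqrstuvwxyz"

# Workaround 1: force the C locale for this one command, so that [R-Z]
# covers only the uppercase ASCII letters R through Z.
c_locale=$(printf '%s\n' "$input" | LC_ALL=C awk '{ gsub(/[R-Z]/, "X"); print }')

# Workaround 2: list the characters explicitly instead of using a range,
# so the locale's collation order no longer matters.
explicit=$(printf '%s\n' "$input" | awk '{ gsub(/[RSTUVWXYZ]/, "X"); print }')

# Neither pattern should match a lowercase letter, so both lines should
# come out identical to the input.
printf '%s\n%s\n' "$c_locale" "$explicit"
```

Setting LC_ALL on the command line affects only that one awk invocation,
so the rest of the session keeps its usual locale.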

You can check out the development version from the git repository
on savannah.gnu.org, if you like, to try it.

Thanks,

Arnold

> From: Eric Bischoff <address@hidden>
> To: address@hidden
> Subject: Case insensitivity seems to ignore lower bound of interval
> Date: Tue, 26 Apr 2011 17:27:49 +0200
> Cc: address@hidden, Nicolas Parpandet <address@hidden>
>
> Hi all,
>
>
> $ echo "ijklmnopqrstuvwxyz" | awk '{ gsub(/[R-Z]/, "X"); print }'
> ijklmnopqrXXXXXXXX
>
> please notice that "r" is not matched, i.e. case insensitivity is applied
> only to [S-Z] interval.
>
> $ awk --version
> GNU Awk 3.1.7
> (...)
>
> $ echo $LANG
> fr_FR.UTF-8
>
> The problem does not appear when locale is C.
>
> The problem does not appear when interval is specified as [r-z] (lower case).
>
> This contradicts http://www.gnu.org/software/gawk/manual/gawk.html#Locales
> which documents 
>      $ echo something1234abc | gawk '{ sub("[A-Z]*$", ""); print }'
> as returning
>      something1234
> while it returns
>      something1234a
>
> Bug reproduced both on Ubuntu Natty beta 2 and on Fedora 15.
>
>
> I hope that helps,
>
> -- 
> Éric Bischoff - Bureau Cornavin
> Technical writing and translations
> http://www.bureau-cornavin.com
> (+33) 3 68 46 00 85
> sip:address@hidden


