bug-gnu-utils
Re: Case insensitivity seems to ignore lower bound of interval


From: Eric Blake
Subject: Re: Case insensitivity seems to ignore lower bound of interval
Date: Wed, 27 Apr 2011 14:55:49 -0600
User-agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.2.15) Gecko/20110307 Fedora/3.1.9-0.39.b3pre.fc14 Lightning/1.0b3pre Mnenhy/0.8.3 Thunderbird/3.1.9

On 04/27/2011 02:40 PM, John Cowan wrote:
> Aharon Robbins scripsit:
> 
>> I do agree that the behavior is surprising, disconcerting, undesirable,
>> and so on.  For this reason, the upcoming version of gawk translates
>> ranges of the form [d-h] into '[defgh]' before compiling the regular
>> expression.
> 
> Alas, that means that in a locale where e-acute sorts after e, the regex
> [d-h] will not match it.  You can't have everything at once, but it
> would be good to have a switch to turn this behavior on and off.

POSIX already states that the meaning of the regex [d-h] is unspecified in
all but the C locale, because there is no one-size-fits-all interpretation
of what it _should_ represent.  If you want e-acute in the set, it is
always better to ask for it explicitly.  Meanwhile, I welcome this change:
it is easier to document that the expansion always mirrors the C locale
than that it depends on the collation order of the current locale.
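
To make the expansion concrete, here is a rough Python sketch of the idea,
not gawk's actual code.  The helper names expand_range and expand_bracket
are hypothetical, and the sketch assumes that "C locale order" can be
modeled as plain code point order, which holds for ASCII letters.  It only
handles simple bracket bodies (no character classes, no negation).

```python
import re


def expand_range(lo: str, hi: str) -> str:
    """Expand one range such as d-h into 'defgh', using code point order.

    Assumption: code point order stands in for C locale order.
    """
    if len(lo) != 1 or len(hi) != 1:
        raise ValueError("range endpoints must be single characters")
    if ord(lo) > ord(hi):
        raise ValueError(f"invalid range {lo}-{hi}")
    return "".join(chr(c) for c in range(ord(lo), ord(hi) + 1))


def expand_bracket(body: str) -> str:
    """Expand every x-y range in a simple bracket body, e.g. 'd-h' or 'a-cx'.

    A hyphen that is first or last in the body is kept as a literal.
    """
    out = []
    i = 0
    while i < len(body):
        if i + 2 < len(body) and body[i + 1] == "-":
            out.append(expand_range(body[i], body[i + 2]))
            i += 3
        else:
            out.append(body[i])
            i += 1
    return "".join(out)


if __name__ == "__main__":
    expanded = "[" + expand_bracket("d-h") + "]"
    print(expanded)
    # The expanded set contains only the listed letters, so e-acute is out
    # unless it is asked for explicitly.
    print(re.fullmatch(expanded, "e") is not None)
    print(re.fullmatch(expanded, "\u00e9") is not None)
    print(re.fullmatch("[defgh\u00e9]", "\u00e9") is not None)
```

Once the range is written out as an explicit list, the set no longer
depends on any collation order, and adding e-acute is a matter of listing
it in the bracket expression.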

-- 
Eric Blake   address@hidden    +1-801-349-2682
Libvirt virtualization library http://libvirt.org
