Re: match finds wrong space.

bug-gnu-utils


From:	Hermann Peifer
Subject:	Re: match finds wrong space.
Date:	Thu, 08 Jul 2010 10:17:03 +0200
User-agent:	Mozilla/5.0 (Windows; U; Windows NT 5.1; en-GB; rv:1.9.1.10) Gecko/20100512 Thunderbird/3.0.5

On 07/07/2010 20:36, Davide Brini wrote:

On Wed, 07 Jul 2010 21:34:25 +0300 Aharon Robbins <address@hidden> wrote:

regards - Chris Willis in the UK

# insjk.awk

BEGIN {
        s = "Mary Ann jane"
        n = match( s, /\040[a-z]/ )
        print n, s
        }


Hi. Current gawk is correct, and 3.0.3 is wrong. You'll note that
following the \040 for a space you have [a-z]. This matches *lower case
letters*; the "A" following the first space is an upper case letter.


But it's matched in his example.

So, there's no bug.


He is saying that

match( s, /\040[a-z]/ )

on the line

"Mary Ann jane"

gives 5 (meaning [a-z] matches the "A"), whereas it should give 9.

I explained the reason for that in my post.


Davide,

Someone else already explained that this is expected behaviour. Unless you are in the C locale, the character range [a-z] can expand to just about anything. Simplified examples are:


aBbCc...XxYyZz  or  aAbBcC...xXyYz

Your locale is probably similar to the latter example, which is why it matches an uppercase A. In non-C locales, use character classes like [:lower:] and [:upper:] instead of character ranges like [a-z] and [A-Z].


Hermann
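
The two workarounds mentioned in this thread can be tried side by side with a small shell sketch. It is only an illustration, not something posted by the participants: the function names range_in_c_locale and class_match are made up here, and it assumes an awk that supports POSIX character classes (gawk does; some very old awk builds may not). Note that inside a bracket expression the class is written [[:lower:]].

```sh
# Sketch: the match() call from the original report, run two ways.
# Function names are hypothetical and exist only for this example.

# Force the C locale so that [a-z] covers only lowercase ASCII letters.
range_in_c_locale() {
    LC_ALL=C awk 'BEGIN {
        s = "Mary Ann jane"
        print match(s, /\040[a-z]/), s
    }'
}

# Keep the caller's locale, but use a character class instead of a range.
# [[:lower:]] does not depend on the locale's collation order.
class_match() {
    awk 'BEGIN {
        s = "Mary Ann jane"
        print match(s, /\040[[:lower:]]/), s
    }'
}

range_in_c_locale
class_match
```

Under those assumptions both calls should report position 9 (the space before "jane") rather than 5, because neither pattern can match the uppercase "A".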

Current Thread

match finds wrong space., Chris Willis, 2010/07/06
- Re: match finds wrong space., Davide Brini, 2010/07/06
- Re: match finds wrong space., Aharon Robbins, 2010/07/07
  - Re: match finds wrong space., Davide Brini, 2010/07/07
  - Message not available
    - Re: match finds wrong space., Hermann Peifer <=
    - Re: match finds wrong space., Davide Brini, 2010/07/08
