bug-gnu-utils

\w+ in gawk 3.1.3 - fix


From: Aharon Robbins
Subject: \w+ in gawk 3.1.3 - fix
Date: Wed, 14 Jan 2004 11:37:07 +0200

Greetings.  Re this:

> From: Ron Burk <address@hidden>
> To: <address@hidden>
> Date: Tue, 13 Jan 2004 20:13:01 -0800
>
> Gawk 3.1.3, compiled with gcc 2.96 on RedHat Linux 7.2
>
> Based on the documentation, I
> found it confusing that "\w+" does not match identifiers
> containing digits. Specifically, this pattern:
>
> /^\w+$/
>
> will match a line containing only "func", but not
> one containing "func2". Since the documentation
> implies a similarity between "\w" and "[[:alnum:]_]",
> I also find it confusing that this pattern:
>
> /^[[:alnum:]_]+$/
>
> *does* match a line containing either "func" or
> "func2".
>
> If the documentation is correct in claiming that
> "\w" is *supposed* to also match digits, not
> just alphabetics and "_", then this seems like
> a bug. I went to the source code and in the
> "\w" code that pokes in a "_" to the bitset,
> I added similar pokes for '0' through '9'.
> That seemed to produce the documented behavior,
> though I'm sure it is unlikely to be the best
> way to fix it.
>
> Thanks.

Here is the promised fix for 3.1.3.

Thanks!

Arnold
-----------------------------------------------
--- ../gawk-3.1.3/regcomp.c     2003-03-11 11:42:51.000000000 +0200
+++ regcomp.c   2004-01-14 11:35:11.000000000 +0200
@@ -3343,7 +3343,7 @@
 #ifdef RE_ENABLE_I18N
                         mbcset, &alloc,
 #endif /* RE_ENABLE_I18N */
-                        (const unsigned char *) "alpha", 0);
+                        (const unsigned char *) "alnum", 0);
 
   if (BE (ret != REG_NOERROR, 0))
     {



