Re: [bug-gawk] awk V4 patsplit: many unexpected 0 length elements in output array
From: Aharon Robbins
Subject: Re: [bug-gawk] awk V4 patsplit: many unexpected 0 length elements in output array
Date: Mon, 17 Feb 2014 21:45:47 +0200
User-agent: Heirloom mailx 12.5 6/20/10

Hi.
> On Mon, 17 Feb 2014 08:06:54 -0500, David Kra <address@hidden> wrote:
>
> > Alert: I am not a regexpert. It could be that my regex *does* match the
> > null string, though I can't see how. It is a long assembly of several
> > shorter regex's, each wrapped in () and then all ||'d together, as in:
> > "(p1)||(p2)||(p3)||(p4)||(p5)"
> >
> > "([ABDE][0-9][0-9A-Z]{4}[A-Z])||([ABDE][0-9A-Z]{1}[0-9][0-9A-Z]{3}[A-Z])||([ABDE][0-9A-Z]{2}[0-9][0-9A-Z]{2}[A-Z])||([ABDE][0-9A-Z]{3}[0-9][0-9A-Z]{1}[A-Z])||([ABDE][0-9A-Z]{4}[0-9][A-Z])"
> >
> >
> > The goal is to match 7 character strings that start with one of a few
> > letters, ends in a letter, has alphanumerics in the middle, but must have
> > at least one digit in the middle.
Davide Brini points out:
> I haven't looked much into it, but the regexp alternation operator is
> definitely "|" (a single pipe), not "||".
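This is indeed the issue. With "||" there is an empty alternative
between each pair of pipes, and that empty alternative matches the null
string, which is where the zero-length elements come from. After changing
each || to |, the program prints nothing, which is the expected result.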
By the way, --re-interval is the default as of gawk 4.0; you only
need it if you also use --traditional.
Arnold