Re: [bug-gawk] FPAT bug?


From: Andrew J. Schorr
Subject: Re: [bug-gawk] FPAT bug?
Date: Sat, 1 Apr 2017 09:57:15 -0400
User-agent: Mutt/1.5.21 (2010-09-15)

On Fri, Mar 31, 2017 at 11:52:29PM -0500, Ed Morton wrote:
> Is this a bug?

Yes. It is a regression in 4.1.4.

> $ cat tst.awk
> BEGIN { FPAT="[^,]*" }
> {
>      print NF, $0
>      for (i=1;i<=NF;i++)
>          print "\t" i, "[" $i "]"
>      print ""
> }
> 
> $ cat -v file.csv
> ,,3
> ,,3
> 
> $ awk -f tst.awk file.csv
> 3 ,,3
>         1 []
>         2 []
>         3 [3]
> 
> 2 ,,3
>         1 []
>         2 [3]
> 
> Note that awk recognizes 3 fields in the first line but only 2 in
> the second. If it's not a bug - what's causing that behavior?

This worked OK in 4.1.3, but is broken in 4.1.4.  It is related to this
ChangeLog entry:

2015-09-18         Arnold D. Robbins     <address@hidden>

        * field.c (fpat_parse_field): Always use rp->non_empty instead
        of only if in_middle. The latter can be true even if we've
        already parsed part of the record. Thanks to Ed Morton
        for the bug report.

diff --git a/field.c b/field.c
index 6a7c6b1..ed31098 100644
--- a/field.c
+++ b/field.c
@@ -1598,9 +1598,8 @@ fpat_parse_field(long up_to,      /* parse only up to this field number */
 
        if (in_middle) {
                regex_flags |= RE_NO_BOL;
-               non_empty = rp->non_empty;
-       } else
-               non_empty = false;
+       }
+       non_empty = rp->non_empty;
 
        eosflag = false;
        need_to_set_sep = true;

Reverting this patch fixes the regression, but reintroduces the bug that
the patch originally fixed. :-) Here's the test case for that bug:

==> test/fpat5.awk <==
BEGIN {
        FPAT = "([^,]*)|(\"[^\"]+\")"
        OFS = ";"
}

p != 0 { print NF }

{ $1 = $1 ; print }

==> test/fpat5.in <==
"A","B","C"

==> test/fpat5.ok <==
"A";"B";"C"


*** fpat5.ok    2017-01-26 13:52:53.285369000 -0500
--- _fpat5      2017-04-01 09:55:20.122459000 -0400
***************
*** 1 ****
! "A";"B";"C"
--- 1 ----
! "A";;"B";"C"
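
To make the tension between the two cases concrete, here is a toy model in Python. It is only a sketch and not gawk's code: the function name fpat_parse, its parameters, and the matching loop are all hypothetical, and it ignores details such as trailing empty fields. It assumes that the non_empty flag means "the previous match was non-empty, so an empty match at the current position is a separator, not a field". Under that assumption, resetting the flag when parsing resumes in the middle of a record reproduces the fpat5 failure, while carrying a stale true value into the start of a new record reproduces the 2-field result Ed reported.

```python
import re

def fpat_parse(record, fpat, pos=0, non_empty=False, up_to=None):
    """Toy FPAT-style splitter (hypothetical, not gawk's algorithm).

    non_empty: True if the previous match was non-empty, in which case an
               empty match at the current position is rejected.
    up_to:     stop after this many fields (models lazy field parsing).
    Returns (fields, next_pos, non_empty) so parsing can be resumed.
    """
    pat = re.compile(fpat)
    fields = []
    while pos < len(record) and (up_to is None or len(fields) < up_to):
        m = pat.match(record, pos)
        if m and m.end() > pos:
            # Non-empty match: take it as a field.
            fields.append(m.group())
            pos = m.end()
            non_empty = True
        elif m and not non_empty:
            # Empty match accepted as an empty field; skip the separator.
            fields.append("")
            pos += 1
            non_empty = False
        else:
            # Empty match rejected (or no match): skip one separator char.
            pos += 1
            non_empty = False
    return fields, pos, non_empty

# Ed's record, parsed from the start with a fresh flag and a stale one.
print(fpat_parse(",,3", "[^,]*")[0])                  # -> ['', '', '3']
print(fpat_parse(",,3", "[^,]*", non_empty=True)[0])  # -> ['', '3']

# fpat5: parse $1 first, then resume parsing the rest of the record.
CSV = r'([^,]*)|("[^"]+")'
rec = '"A","B","C"'
first, pos, flag = fpat_parse(rec, CSV, up_to=1)
print(first + fpat_parse(rec, CSV, pos, non_empty=False)[0])
# -> ['"A"', '', '"B"', '"C"']   (spurious empty field)
print(first + fpat_parse(rec, CSV, pos, non_empty=flag)[0])
# -> ['"A"', '"B"', '"C"']
```

In this model the flag has to survive a resumed parse within one record but start out false for each new record. Whether that maps directly onto rp->non_empty in field.c is an assumption that would need checking against the real code.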

Arnold?

Regards,
Andy
