bug-gawk

Re: [bug-gawk] FPAT bug?


From: Ed Morton
Subject: Re: [bug-gawk] FPAT bug?
Date: Sat, 1 Apr 2017 10:28:47 -0500
User-agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:45.0) Gecko/20100101 Thunderbird/45.8.0

OK, thanks for the update!

On 4/1/2017 8:57 AM, Andrew J. Schorr wrote:

On Fri, Mar 31, 2017 at 11:52:29PM -0500, Ed Morton wrote:
Is this a bug?

Yes. It is a regression in 4.1.4.

$ cat tst.awk
BEGIN { FPAT="[^,]*" }
{
      print NF, $0
      for (i=1;i<=NF;i++)
          print "\t" i, "[" $i "]"
      print ""
}

$ cat -v file.csv
,,3
,,3

$ awk -f tst.awk file.csv
3 ,,3
         1 []
         2 []
         3 [3]

2 ,,3
         1 []
         2 [3]

Note that awk recognizes 3 fields in the first line but only 2 in
the second. If it's not a bug, what's causing that behavior?

This worked OK in 4.1.3, but is broken in 4.1.4.  It is related to this
ChangeLog entry:

2015-09-18         Arnold D. Robbins     <address@hidden>

         * field.c (fpat_parse_field): Always use rp->non_empty instead
         of only if in_middle. The latter can be true even if we've
         already parsed part of the record. Thanks to Ed Morton
         for the bug report.

diff --git a/field.c b/field.c
index 6a7c6b1..ed31098 100644
--- a/field.c
+++ b/field.c
@@ -1598,9 +1598,8 @@ fpat_parse_field(long up_to,      /* parse only up to this field number */
        if (in_middle) {
                regex_flags |= RE_NO_BOL;
-               non_empty = rp->non_empty;
-       } else
-               non_empty = false;
+       }
+       non_empty = rp->non_empty;
        eosflag = false;
        need_to_set_sep = true;
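
As a rough illustration of how a stale flag could produce the two-field result shown earlier, here is a toy C model of a scan loop for FPAT = "[^,]*". It is not gawk's code: match_len() and count_fields() are hypothetical names, the pattern is hard-coded instead of going through the regex engine, and the idea that the flag survives from one record to the next is an inference from the output, not something confirmed in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: length of the match of [^,]* anchored at s. */
static size_t match_len(const char *s)
{
    return strcspn(s, ",");
}

/*
 * Toy model of an FPAT-style scan for FPAT = "[^,]*".
 * *non_empty stands in for rp->non_empty: it records whether the
 * previous match was non-empty, so that the null match sitting on the
 * separator right after a field is skipped instead of counted.
 * Returns the number of fields found in record.
 */
static int count_fields(const char *record, bool *non_empty)
{
    const char *scan = record;
    const char *end = record + strlen(record);
    int nf = 0;

    assert(record != NULL && non_empty != NULL);

    while (scan <= end) {
        size_t len = match_len(scan);

        if (len > 0) {
            /* Non-empty match: this text is a field. */
            nf++;
            *non_empty = true;
            scan += len;
            if (scan >= end)
                break;
        } else if (*non_empty) {
            /* Null match directly after a field: treat it as the
             * separator position and step over it. */
            *non_empty = false;
            if (scan >= end)
                break;
            scan++;
        } else {
            /* Null match that is not shadowed by a preceding field:
             * count it as an empty field. */
            nf++;
            if (scan >= end)
                break;
            scan++;
        }
    }
    return nf;
}
```

In this model, calling count_fields(",,3", &flag) twice with one flag that starts out false returns 3 and then 2, because the flag is still true from the trailing "3" when the second record begins. Resetting the flag to false before each record returns 3 both times.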

Reversing this patch fixes the bug, but reintroduces the bug that
was fixed by this patch. :-) Here's the test case for that bug:

==> test/fpat5.awk <==
BEGIN {
         FPAT = "([^,]*)|(\"[^\"]+\")"
         OFS = ";"
}

p != 0 { print NF }

{ $1 = $1 ; print }

==> test/fpat5.in <==
"A","B","C"

==> test/fpat5.ok <==
"A";"B";"C"


*** fpat5.ok    2017-01-26 13:52:53.285369000 -0500
--- _fpat5      2017-04-01 09:55:20.122459000 -0400
***************
*** 1 ****
! "A";"B";"C"
--- 1 ----
! "A";;"B";"C"

Arnold?

Regards,
Andy




