Re: [bug-gawk] Computed regex and getline bug / issue


From: Aharon Robbins
Subject: Re: [bug-gawk] Computed regex and getline bug / issue
Date: Sat, 10 May 2014 22:50:52 +0300
User-agent: Heirloom mailx 12.5 6/20/10

Hi Andy.

> Date: Fri, 9 May 2014 09:36:40 -0400
> From: "Andrew J. Schorr" <address@hidden>
> To: Aharon Robbins <address@hidden>
> Cc: address@hidden, address@hidden, address@hidden
> Subject: Re: [bug-gawk] Computed regex and getline bug / issue
>
> Hi Arnold,
>
> On Fri, May 09, 2014 at 11:54:36AM +0300, Aharon Robbins wrote:
> > Thanks everyone for their help cutting the test case down, and to Andy for
> > setting up some test cases to drop into the test suite.  The fix is below.
> > I will push it into the repo shortly.
>
> I'm afraid my test cases were not comprehensive.
>
> > [ .... ]
>
> The logic here depends on having access to the st_size.  When reading data
> from stdin, this doesn't seem to work.  We need another test case that
> reads from stdin instead of from a file.  After rebuilding with your patch,
> I see this:
>
> bash-4.2$ ./gawk -f test/rsgetline.awk < test/rsgetline.in
> [1] [,]
> -1-
> [2] [,]
> bash-4.2$ cat test/rsgetline.in
> 1,2,bash-4.2$ 
> bash-4.2$ printf '1,2,' | ./gawk -f test/rsgetline.awk
> [1] [,]
> -0-
> [1] [,]
> bash-4.2$ 
>
> So the fix works when reading from a regular file, but apparently not
> from stdin.
>
> Regards,
> Andy

Indeed. Here is the fix for that too.

I got the other test case and will push it.

Thanks,

Arnold
--------------------------------------------------
diff --git a/io.c b/io.c
index b1c9fa1..3d7b00a 100644
--- a/io.c
+++ b/io.c
@@ -3473,8 +3473,15 @@ get_a_record(char **out,        /* pointer to pointer to data */
 
                ret = (*matchrec)(iop, & recm, & state);
                iop->flag &= ~IOP_AT_START;
+               /* found the record, we're done, break the loop */
                if (ret == REC_OK)
                        break;
+
+               /*
+                * Likely found the record; if there's no more data
+                * to be had (like from a tiny regular file), break the
+                * loop. Otherwise, see if we can read more.
+                */
                if (ret == TERMNEAREND && buffer_has_all_data(iop))
                        break;
 
@@ -3527,10 +3534,14 @@ get_a_record(char **out,        /* pointer to pointer to data */
                        break;
                } else if (iop->count == 0) {
                        /*
-                        * hit EOF before matching RS, so end
-                        * the record and set RT to ""
+                        * Hit EOF before being certain that we've matched
+                        * the end of the record. If ret is TERMNEAREND,
+                        * we need to pull out what we've got in the buffer.
+                        * Eventually we'll come back here and see the EOF,
+                        * end the record and set RT to "".
                         */
-                       iop->flag |= IOP_AT_EOF;
+                       if (ret != TERMNEAREND)
+                               iop->flag |= IOP_AT_EOF;
                        break;
                } else
                        iop->dataend += iop->count;


