bug-gnu-utils

Re: Using getline into a variable from a coprocess sometimes overwrites RT


From: Nick Hobson
Subject: Re: Using getline into a variable from a coprocess sometimes overwrites RT
Date: Sun, 4 Oct 2009 14:59:49 -0700 (PDT)

Ah, I was assuming that getline into a variable would split lines on 
newline, but it makes sense that gawk should use the actual value 
of RS!  (Of course, if necessary you can set RS = "\n" before the 
getline and reset it afterwards.)

Maybe section 3.8 (e.g., 3.8.10) of the user guide could be changed 
to mention that when RS is a regexp, getline might change RT?  At 
the moment 3.8.8 says: "In this version of getline, none of the 
built-in variables are changed and the record is not split into 
fields. The only variable changed is var."

Thanks,
Nick


Greetings. Re this:

> Date: Fri, 2 Oct 2009 17:21:22 -0700 (PDT)
> From: Nick Hobson <address@hidden>
> Subject: Using getline into a variable from a coprocess sometimes overwrites 
> RT
> To: address@hidden
>
> Hi,
>
> I think I've found a gawk bug.  In some cases, using getline into a
> variable from a coprocess clobbers RT.  It doesn't touch $0 or NF.
> I have demonstrated the problem in gawk 3.1.7 under Arch Linux.
>
> The test program is:
>
> BEGIN {RS = "u"}
>
> RT == "u" {
>     printf "%s", callpgm("xargs file -b ", "imgs/auk.png")
> }
>
> function callpgm(ext, src,  res) {
>     print src |& ext
>     close(ext, "to")
>     print "1", $0, NF, RT
>     ext |& getline res
>     print "2", $0, NF, RT
>     close(ext)
>     return res
> }
>
> The input data is:
>
> bug
>
> When run as echo 'bug' | bug.awk, the expected output is:
>
> 1 b 1 u
> 2 b 1 u
> PNG image, 500 x 600, 8-bit colormap, non-interlaced
>
> The actual output is:
>
> 1 b 1 u
> 2 b 1 
> PNG image, 500 x 600, 8-bit colormap, non-interlaced
>
> The getline sets RT to the empty string.  (The value that RT is
> overwritten with varies.  If you change the above BEGIN block to RS =
> "[aeiou]", RT gets overwritten with "i".)

This is not a bug.  RT is set to the text of whatever input characters
matched RS.  In the case of the getline, there were no "u"
characters in the output from the coprocess, so RT is set to the
null string.  When you set RS = "[aeiou]" then "i" is the first of
those characters found in the output from `file', so that's what RT
is set to.

Since your getline used a variable, the fields and NF are not changed.

Consider this simpler example:

        $ echo foo | gawk 'BEGIN { RS = "x" }                
        > { printf "RT = <%s>\n", RT ; print NF, $0 }'
        RT = <>
        1 foo

        $

Here, no text matched "x", so RT is empty, and the print $0 includes
the original newline that was in the input, as well as the value of
ORS that print automatically adds.

This stuff can be subtle. :-)  In this example, at least, gawk is
working correctly.

Thanks for taking the time to report a possible bug.  I appreciate it,
even when it's not really a bug. :-)

Arnold

