bug-gawk

difference in RS handling for equivalent regexps with unending input stream


From: Ed Morton
Subject: difference in RS handling for equivalent regexps with unending input stream
Date: Wed, 3 Jul 2024 04:10:35 -0500
User-agent: Mozilla Thunderbird

Configuration Information [Automatically generated, do not change]:
Machine: x86_64
OS: cygwin
Compiler: gcc
Compilation CFLAGS: -ggdb -O2 -pipe -Wall -Werror=format-security -Wp,-D_FORTIFY_SOURCE=2 -fstack-protector-strong --param=ssp-buffer-size=4 -fdebug-prefix-map=/cygdrive/d/a/scallywag/gawk/gawk-5.3.0-1.x86_64/build=/usr/src/debug/gawk-5.3.0-1 -fdebug-prefix-map=/cygdrive/d/a/scallywag/gawk/gawk-5.3.0-1.x86_64/src/gawk-5.3.0=/usr/src/debug/gawk-5.3.0-1 -DNDEBUG
uname output: CYGWIN_NT-10.0-22631 TournaMart_2023 3.5.3-1.x86_64 2024-04-03 17:25 UTC x86_64 Cygwin
Machine Type: x86_64-pc-cygwin

Gawk Version: 5.3.0

Attestation 1:
        I have read https://www.gnu.org/software/gawk/manual/html_node/Bugs.html.
        Yes

Attestation 2:
        I have not modified the sources before building gawk.
        True

Description:

   Someone asked a question on SO about handling unending input from
   netcat with a regexp delimiter that's just 2 possible chars, see
   https://stackoverflow.com/q/78700014/1745001, where gawk seems to be
   a record behind in its processing. I'm using bash on cygwin, they
   used zsh on macOS.

Repeat-By:

   I can reproduce the problem with this (hitting control-C to stop
   each command when it stops to wait for more input):

   $ printf 'A;B;C;\n' > file

   $ cat file - | awk -v RS='(;|=)' '{print NR, $0}'
   1 A

   $ cat file - | awk -v RS=';|=' '{print NR, $0}'
   1 A
   2 B

   $ cat file - | awk -v RS='[;=]' '{print NR, $0}'
   1 A
   2 B
   3 C

   Obviously that's 3 supposedly equivalent regexps producing 3
   different results.
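
   As a control, the same three RS values can be compared on finite
   input, where awk sees end-of-file instead of waiting for more data.
   The following is only a sketch: the run_rs helper and the out_*
   variable names are made up for illustration, and it assumes an awk
   that accepts a regexp as RS (gawk does). If the regexps really are
   equivalent, one would expect the three outputs to match here, which
   would point at the unending stream rather than the regexps
   themselves.

```shell
#!/bin/sh
# Control check (hypothetical helper, not part of the original report):
# feed the same finite input to awk with each RS form and compare.

run_rs() {
    # $1 is the RS regexp; the awk program is the one used in the report.
    # No trailing "-" for cat here, so awk reaches EOF and exits.
    printf 'A;B;C;\n' | awk -v RS="$1" '{print NR, $0}'
}

out_group=$(run_rs '(;|=)')   # alternation inside a group
out_alt=$(run_rs ';|=')       # bare alternation
out_class=$(run_rs '[;=]')    # bracket expression

if [ "$out_group" = "$out_alt" ] && [ "$out_alt" = "$out_class" ]; then
    echo "same"
else
    echo "different"
fi
```

   Dropping the "-" argument to cat is the only change from the
   commands above, so any difference between the two sets of runs is
   down to whether the input ever ends.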


