Re: [bug-gawk] Length and contents of RT may be wrong/garbled when RS==""


From: Aharon Robbins
Subject: Re: [bug-gawk] Length and contents of RT may be wrong/garbled when RS==""
Date: Sun, 02 Oct 2011 20:26:17 +0200
User-agent: Heirloom mailx 12.4 7/29/08

Greetings.

Thanks for this report. I have just applied a slightly modified version
of the patch, and added the tests.  These have been pushed to the
gawk-4.0-stable branch in the git repository.

Thanks!

Arnold

> Date: Fri, 26 Aug 2011 11:41:24 +0200
> From: Jeroen Schot <address@hidden>
> To: address@hidden
> Subject: [bug-gawk] Length and contents of RT may be wrong/garbled when
>       RS==""
>
> Hello,
>
> Below is a bug report from a Debian user. I have checked his findings
> and the same behaviour exists in gawk 3.1.8 and 4.0.0. I have not
> verified his patch (also attached). The original bug report can be
> found at http://bugs.debian.org/619738
>
> Regards,
> -- 
> Jeroen Schot
>
> ----- Forwarded message from Rogier -----
> From: Rogier
> To: Debian Bug Tracking System <address@hidden>
> Subject: gawk: Length and contents of RT may be wrong/garbled when RS==""
> Date: Sat, 26 Mar 2011 17:50:44 +0100
>
> Package: gawk
> Version: 1:3.1.7.dfsg-5
> Severity: normal
> Tags: patch
>
> The contents of RT may be garbled and the length may be wrong when RS=="".
>
> There are two cases:
> - Case 1: The last record is 'terminated' with '\n' instead of '\n\n'
>   In this case, the length of RT is reported as 0 instead of 1
>   Example (1st and 3rd are OK):
>     $ awk 'BEGIN {printf "0"; exit}' | awk 'BEGIN {RS=""}; {print length(RT)}'
>     0
>     $ awk 'BEGIN {printf "0\n"; exit}' | awk 'BEGIN {RS=""}; {print length(RT)}'
>     0
>     $ awk 'BEGIN {printf "0\n\n"; exit}' | awk 'BEGIN {RS=""}; {print length(RT)}'
>     2
> - Case 2: RT is longer than the shortest RT seen so far
>   In this case, the additional characters in RT are garbage.
>   In a non-C locale, the length is also reported incorrectly.
>     $ awk 'BEGIN {printf "0\n\n\n1\n\n\n\n\n"; exit}' | LC_ALL=C awk 'BEGIN {RS=""}; {print length(RT),gensub("\n","\\\\n","g",RT)}' | cat -v
>     3 \n\n\n
>     5 address@hidden@
>     $ awk 'BEGIN {printf "0\n\n\n1\n\n\n\n\n"; exit}' | LC_ALL=en_US.UTF-8 awk 'BEGIN {RS=""}; {print length(RT),gensub("\n","\\\\n","g",RT)}' | cat -v
>     3 \n\n\n
>     3 address@hidden@
>   In both cases, the output should be:
>     3 \n\n\n
>     5 \n\n\n\n\n
>
> I have attached a patch that fixes these problems, and I have added some
> test cases as well. The patched source passes all tests and compiles into
> a .deb without errors.
> After applying the patch, execute permission must be set on the test scripts:
>     $ chmod +x test/rtlen*.sh
>
> I hereby put the patch, to which I have all rights, in the public domain,
> so that there can (hopefully) be no legal objection to incorporating it.
>
> Regards.
>
> Rogier.
> ----- End forwarded message -----
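
The two cases in the report can be turned into a small repeatable check.
What follows is only a minimal sketch, not the tests that were added to
the gawk test suite. The helper name rt_report is hypothetical, and the
script assumes that gawk is on PATH (RT and gensub() are gawk extensions).

```shell
#!/bin/sh
# Sketch: for each record read in paragraph mode (RS=""), print length(RT)
# followed by RT with every newline shown as a literal "\n".
# rt_report is a hypothetical helper name, not part of gawk.
rt_report() {
    # gawk is required because RT and gensub() are gawk extensions.
    command -v gawk >/dev/null 2>&1 || { echo "gawk not found" >&2; return 127; }
    # $1 is deliberately used as the printf format so that "\n" in the
    # argument expands to a real newline in the generated input.
    printf "$1" | LC_ALL=C gawk 'BEGIN { RS = "" }
        { print length(RT), gensub("\n", "\\\\n", "g", RT) }'
}

# Case 1 from the report: last record ends in a single newline.
rt_report '0\n'

# Case 2 from the report: second terminator is longer than the first.
rt_report '0\n\n\n1\n\n\n\n\n'
```

According to the report, a gawk that includes the fix should print a
length of 1 for the first call, and "3 \n\n\n" followed by "5 \n\n\n\n\n"
for the second.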
