From: Aharon Robbins
Subject: Re: [bug-gawk] Length and contents of RT may be wrong/garbled when RS==""
Date: Sun, 02 Oct 2011 20:26:17 +0200
User-agent: Heirloom mailx 12.4 7/29/08

Greetings.
Thanks for this report. I have just applied a slightly modified version
of the patch, and added the tests. These have been pushed to the
gawk-4.0-stable branch in the git repository.
Thanks!
Arnold
> Date: Fri, 26 Aug 2011 11:41:24 +0200
> From: Jeroen Schot <address@hidden>
> To: address@hidden
> Subject: [bug-gawk] Length and contents of RT may be wrong/garbled when RS==""
>
> Hello,
>
> Below is a bug report from a Debian user. I have checked his findings
> and the same behaviour exists in gawk 3.1.8 and 4.0.0. I have not
> verified his patch (also attached). The original bug report can be
> found at http://bugs.debian.org/619738
>
> Regards,
> --
> Jeroen Schot
>
> ----- Forwarded message from Rogier -----
> From: Rogier
> To: Debian Bug Tracking System <address@hidden>
> Subject: gawk: Length and contents of RT may be wrong/garbled when RS==""
> Date: Sat, 26 Mar 2011 17:50:44 +0100
>
> Package: gawk
> Version: 1:3.1.7.dfsg-5
> Severity: normal
> Tags: patch
>
> The contents of RT may be garbled and the length may be wrong when RS=="".
>
> There are two cases:
> - Case 1: The last record is 'terminated' with '\n' instead of '\n\n'
> In this case, the length of RT is reported as 0 instead of 1
> Example (1st and 3rd are OK):
> $ awk 'BEGIN {printf "0"; exit}' | awk 'BEGIN {RS=""}; {print length(RT)}'
> 0
> $ awk 'BEGIN {printf "0\n"; exit}' | awk 'BEGIN {RS=""}; {print length(RT)}'
> 0
> $ awk 'BEGIN {printf "0\n\n"; exit}' | awk 'BEGIN {RS=""}; {print length(RT)}'
> 2
> - Case 2: RT is longer than the shortest RT seen so far
> In this case, the additional characters in RT are garbage.
> In a non-C locale, the length is also reported incorrectly.
> $ awk 'BEGIN {printf "0\n\n\n1\n\n\n\n\n"; exit}' | LC_ALL=C awk 'BEGIN {RS=""}; {print length(RT),gensub("\n","\\\\n","g",RT)}' | cat -v
> 3 \n\n\n
> 5 address@hidden@
> $ awk 'BEGIN {printf "0\n\n\n1\n\n\n\n\n"; exit}' | LC_ALL=en_US.UTF-8 awk 'BEGIN {RS=""}; {print length(RT),gensub("\n","\\\\n","g",RT)}' | cat -v
> 3 \n\n\n
> 3 address@hidden@
> In both cases, the output should be:
> 3 \n\n\n
> 5 \n\n\n\n\n
>
> I have attached a patch that fixes these problems, and I have added some
> test cases as well. The patched source passes all tests and compiles into
> a .deb without errors.
> After applying the patch, execute permission must be set on the test scripts:
> $ chmod +x test/rtlen*.sh
>
> I hereby put the patch, to which I have all rights, in the public domain,
> so that there can (hopefully) be no legal objection to incorporating it.
>
> Regards.
>
> Rogier.
> ----- End forwarded message -----
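
For reference, the expected RT lengths quoted in the report can be modelled with portable tools, independently of gawk. The following is only a sketch: expected_rt_lengths is a hypothetical helper name, not part of gawk or of the attached patch, and it assumes the input contains no '|' characters, because '|' is used as a stand-in for newline.

```shell
# Hypothetical helper (not part of gawk or the patch): print the length of the
# terminator that follows each record when records are separated by blank lines.
# Assumption: the input contains no '|' characters, since '|' stands in for
# newline so that awk sees the whole input as a single line.
expected_rt_lengths() {
    tr '\n' '|' | awk '{
        s = $0
        sub(/^[|]+/, "", s)                  # leading newlines belong to no record
        while (s != "") {
            if (match(s, /[|][|]+/)) {       # a run of two or more newlines ends a record
                print RLENGTH
                s = substr(s, RSTART + RLENGTH)
            } else {                         # last record: newlines remaining at end of input
                match(s, /[|]*$/)
                print RLENGTH
                s = ""
            }
        }
    }'
}

# The Case 2 input from the report; prints 3, then 5.
printf '0\n\n\n1\n\n\n\n\n' | expected_rt_lengths
```

Feeding the same input to gawk 'BEGIN {RS=""}; {print length(RT)}' and comparing the two outputs gives a quick check of whether a given gawk build shows the behaviour described above.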