[bug-gawk] Function argument corruption in 4.2.0
From: Eric Pruitt
Subject: [bug-gawk] Function argument corruption in 4.2.0
Date: Sun, 12 Nov 2017 11:55:11 -0800
User-agent: NeoMutt/20170113 (1.7.2)

I've run into a problem with GAWK 4.2.0 where a function argument gets
corrupted. Here's an example showing the output in 4.2.0 compared to
4.1.4:
mdlint$ gawk --version | head -n1 && gawk -We mdlint test.in -v -r label_exists_for_destination
GNU Awk 4.2.0 (GNU MPFR 3.1.5, GNU MP 6.1.2)
52: the URI
"آ|address@hidden@address@hidden@address@hidden|address@hidden@address@hidden@address@hidden"
points to the same place as the link reference labeled
"label_exists_for_destination"
(1)
mdlint$ /usr/bin/gawk --version | head -n1 && /usr/bin/gawk -We mdlint test.in -v -r label_exists_for_destination
GNU Awk 4.1.4, API: 1.1 (GNU MPFR 3.1.5, GNU MP 6.1.2)
52: the URI "//#label_exists_for_destination" points to the same place as
the link reference labeled "label_exists_for_destination"
(1)
mdlint$
In addition to 4.1.4, the script also works correctly on mawk 1.3.3,
original-awk (https://packages.debian.org/stretch/original-awk) and
BusyBox AWK v1.22.1. I've observed the corruption building against glibc
and musl libc. Unfortunately I haven't been able to create a simplified
test case or figure out which commit introduced the issue using "git
bisect run", and the mdlint script is ~800 SLOC without the function
documentation comments and ~1500 SLOC with them. I am happy to provide a
copy of the mdlint script and the test case data if someone is willing
to dig into the code. It depends on the cmark binary
(https://github.com/commonmark/cmark), but that could be mocked out
easily enough with something like "cat $OUTPUT_OF_CMARK". For now, here
are two snippets of code surrounding this issue:
    md_link_definitions[label] = n
    $0 = substr(line, RSTART + RLENGTH + 1)

    if ($1 in uris) {
        link_destination_duplicate(n, label, $1, uris[$1])
    } else if (length($1)) {
        uris[$1] = label
    }

    if ($1 in md_link_uris) {
-->     label_exists_for_destination(md_link_uris[$1], $1, label)
    }

    # A link reference definition for a URI exists.
    #
    # Arguments:
    # - linenos: Numbers of the lines with the problem.
    # - destination: Destination of the link.
    # - label: Name of the label that refers to the destination.
    #
    function label_exists_for_destination(linenos, destination, label, n, seen)
    {
        # This kludge resolves a data corruption issue in GNU Awk 4.2.0;
        # TODO: root cause the problem and report it upstream.
        destination = destination ""

        $0 = linenos

        for (n = 1; n <= NF; n++) {
            if ($n in seen) {
                continue
            }

            seen[$n] = 1
            report("label_exists_for_destination", 0 + $n,
-->             sprintf("the URI \"%s\" points to the same place as the link" \
                    " reference labeled \"%s\"",
                    destination, label \
                ) \
            )
        }
    }

I have omitted the report function. It ultimately just shoves the output
from sprintf into a queue. If the "report" function is replaced with a
"print" statement, the displayed data is still corrupted. This is the
code from my most recent, failed attempt to reproduce the issue in an
isolated setting:
    function A(x, y, n)
    {
        $0 = x
        for (n = 1; n <= NF; n++) {
            print sprintf("y = \"%s\"; n = \"%s\"", y, n)
        }
    }

    function B()
    {
        $0 = "??**??%% //#label_exists_for_destination BBBBBBBBBBBBBBBBBBBBB CCCCCCCCCCCCCCCCCCCCCCC"
        if ($2 in array) {
            A(array[$2], $2)
        }
    }

    BEGIN {
        split("", array)
        array["//#label_exists_for_destination"] = "XXXXXXXXXXXXXXXXXXX YYYYYYYYYYYYYYYY ZZZZZZZZZZZZZZZZZZZZZZ"
        B()
        exit
    }

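
For comparison, the kludge used in label_exists_for_destination can be
exercised in the same isolated style and driven from the shell. This is
only a sketch: the names run_copy_demo and show, and the sample line
numbers "10 20", are made up for illustration. Because the isolated
program above does not trigger the corruption, this one is not expected
to trigger it either; it only shows the pattern of copying a
field-derived argument before the callee overwrites $0.

```shell
#!/bin/sh
# Hypothetical helper: run a tiny awk program in which the callee reassigns
# $0 while it still holds one of the caller's fields as a parameter.
# Usage: run_copy_demo [path-to-awk]   (defaults to "awk")
run_copy_demo() {
    awk_bin=${1:-awk}
    "$awk_bin" '
    function show(linenos, destination,    n) {
        # Kludge from the message: concatenating "" forces a new string
        # value, so "destination" no longer depends on the old field.
        destination = destination ""
        $0 = linenos
        for (n = 1; n <= NF; n++)
            printf "%s -> %s\n", $n, destination
    }
    BEGIN {
        $0 = "label //#label_exists_for_destination"
        show("10 20", $2)
    }'
}
```

On an awk that behaves correctly, each printed line should end with the
unchanged destination string. Passing the path of another build, for
example `run_copy_demo /usr/bin/gawk`, runs the same program under that
interpreter.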
Since appending a null string to the "destination" variable mitigates the
corruption, my vague guess is that values or pointers to the split values
generated by assigning "$0" are being modified when they shouldn't be. Any
ideas? Is there any other information I could / should provide?
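
The cmark mock mentioned earlier could be sketched roughly as follows.
It assumes that mdlint finds cmark through PATH and only reads its
standard output; the helper name make_cmark_mock and the directory
layout are hypothetical, and OUTPUT_OF_CMARK is expected to name a file
holding output captured from a real cmark run.

```shell
#!/bin/sh
# Hypothetical helper: write a stand-in "cmark" executable into directory $1.
make_cmark_mock() {
    mock_dir=$1
    mkdir -p "$mock_dir" || return 1
    cat > "$mock_dir/cmark" <<'EOF'
#!/bin/sh
# Mock cmark: ignore all arguments and stdin, replay the captured output
# stored in the file named by the OUTPUT_OF_CMARK environment variable.
exec cat "$OUTPUT_OF_CMARK"
EOF
    chmod +x "$mock_dir/cmark"
}
```

With the mock directory placed at the front of PATH and OUTPUT_OF_CMARK
exported, the gawk command line from the start of this message should
run without the real binary. If mdlint writes its input to cmark through
a pipe, the mock would also need to drain stdin before replaying the
file.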
Eric