[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Issues caused by array elements passed to a function: crash dumps, reads
From: |
mekanofox |
Subject: |
Issues caused by array elements passed to a function: crash dumps, reads-after-free, internal errors, others |
Date: |
Sun, 05 Jan 2025 14:00:49 +0000 |
User-agent: |
Purely Mail via Roundcube/1.6.8 |
Configuration Information [Automatically generated, do not change]:
Machine: x86_64
OS: linux-gnu
Compiler: gcc
Compilation CFLAGS: -g3 -gdwarf-5 -DNDEBUG
uname output: Linux linbox 6.12.6-100.fc40.x86_64 #1 SMP PREEMPT_DYNAMIC
Thu Dec 19 23:18:14 UTC 2024 x86_64 GNU/Linux
Machine Type: x86_64-pc-linux-gnu
Gawk Version: 5.3.60
Attestation 1:
I have read
https://www.gnu.org/software/gawk/manual/html_node/Bugs.html.
Yes
Attestation 2:
I have not modified the sources before building gawk.
True
Description:
These are cases that all start by passing an array element to a
function; the array element is brand new (an untyped/Node_elem_new). The
test scripts used to reproduce the issues can be split in roughly two
categories:
1. the function makes the passed argument an array; later on, the
scripts attempt to access the newly created subarray in a scalar
context.
Yes, attempting to use an array as a scalar is a fatal script error
and gawk will "shortly exit anyway"; however, on its way out it should
print an informative message about the location, the array involved
(plus the parameter name if inside a function). Sometimes though is
doing a segfault/core dump, which is a tad excessive. (example below)
2. the function deletes the original/global array, then accesses in
various ways the argument (which is the former element of the now
deleted array).
In this group of tests the now "orphan" argument is left in an
improper state and sometimes using it later can cause an internal error
due to an unexpected node type; in some cases a NODE structure is
prematurely freed (DEREFed); later on, the code will read from the freed
memory; the effects range from internal errors to incorrect results.
For example, from the first category, a core dump can be triggered with
this very short script (it is actually the first test case,
ar2fn_elnew_sc, a greatly simplified version of a real life script):
function f(y) {
y[3] = 14
y # BOOM
}
BEGIN { f(a[10]) }
The crash is caused by a strlen stepping on a NULL vname pointer; the
"vname" should be the name/string form of the index in the
array/expression wrongly used as a scalar; it should be used in the
error message saying something like "attempt to use array `y (from
a["10"])' in a scalar context". That in turn requires correct
"parent_array" and/or "orig_array" pointers. (see also the 2nd test
case).
Repeat-By:
See the test/ar2fn_*.awk test cases and their associated .ok files in
the attached patch.
To add ONLY the test cases and see the bugs manifest before the actual
fix is in... Do something like this (maybe in a separate branch to keep
things tidy):
git apply --whitespace=nowarn --include='test/*'
path/to/the/patch/file/ar2fn.diff
... then do a 'make check'. (well, maybe build gawk beforehand if it's
not already built).
Or just slice and dice the diff to keep only the tests and their
makefiles.
The incorrect outcomes for each test are as follows:
ar2fn_elnew_sc: causes a core dump
ar2fn_elnew_sc2: malformed error message: "fatal: attempt to use array
`(null)' in a scalar context"
ar2fn_fmod: use-after-free, accessing a Node_var and its "var_value"
*after* it has been freed, so all bets are off: depending on how gawk
was built (debug/release) and/or Valgrind options, it can crash, other
times it gives a wrong type in result.
ar2fn_unxptyp_aref: "fatal: internal error: file interpret.h, line 265:
unexpected parameter type Node_array_ref"
ar2fn_unxptyp_val: NODE left in improper state; "fatal: internal error:
file interpret.h, line 265: unexpected parameter type Node_val"
Note: no fuzzers were stressed during the making of these scripts.
Also, no synthesizers. ;)
Fix:
See attached ar2fn.diff for a fix. Some details, with a bit of
background first:
A newly born Node_elem_new could become a scalar or a (sub)array. But
until it knows exactly what it will be when it "grows up", it should be
or pretend to be (potentially) both: it should carry around -- or have a
way to access -- all the information needed for both outcomes.
However, the way it is treated in its childhood does not allow it to
maintain the memory of its parent (this starts to sound like a soap
opera or a bildungsroman...); in some cases, by the time it ends up in a
situation to turn into a subarray it will miss key data that a subarray
would need: a pointer to its parent ("parent_array") and the "vname", a
string representation of its index in that parent array.
The fix saves this information in certain fields of the NODE structure
and preserves them while the new element is passed around. If the NODE
will turn into a scalar, that information will be freed/discarded. But
if it will become a subarray, the saved information will be moved into
the proper fields so that it can be used in case is needed (mainly by
the "array_vname" function).
I tried to leave existing code as much as possible the way it was and
only add/augment around it.
Also, in two places there are conditions that do not seem to be
triggered, or at least I was not able to come up with a test to do it;
thus arguably they could be turned into asserts, just in case.
But out of an abundance of caution I've left them there; at worst they
will be very small pieces of dead code.
One such place IIRC is the prev_array->valref test in
"adjust_param_node" (array.c).
The 2nd is in gawkapi.c, the additional test towards the end of
"api_sym_update" to reset a Node_elem_new; I feel a new element node
cannot reach that point, but I could be wrong; I can dig deeper in that
area when I'll have more tuits -- the round kind.
The fix passes the test suite, without leaks or other memory issues
reported from Valgrind ("make valgrind" shows 3 "broken pipe"-style
errors, but that's unrelated, the same thing is happening with an
official master branch build).
ar2fn.diff
Description: Text Data
[Prev in Thread] |
Current Thread |
[Next in Thread] |
- Issues caused by array elements passed to a function: crash dumps, reads-after-free, internal errors, others,
mekanofox <=