[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: Issues caused by array elements passed to a function: crash dumps, r
From: |
Aharon Robbins |
Subject: |
Re: Issues caused by array elements passed to a function: crash dumps, reads-after-free, internal errors, others |
Date: |
Wed, 05 Feb 2025 14:02:21 +0200 |
User-agent: |
Heirloom mailx 12.5 6/20/10 |
Hi Christian,
Thank you VERY much for this contribution. Now that your paperwork
has come through, I have integrated this into the code.
I have made some additional stylistic changes, but I'm doing that
in a separate Git commit. I will push everything to the git repo
shortly.
Your changes were nicely done. I hope you'll feel like continuing
to contribute!
Thanks,
Arnold
> Date: Sun, 05 Jan 2025 14:00:49 +0000
> To: bug-gawk@gnu.org
> Subject: Issues caused by array elements passed to a function: crash dumps,
> reads-after-free, internal errors, others
>
> Configuration Information [Automatically generated, do not change]:
> Machine: x86_64
> OS: linux-gnu
> Compiler: gcc
> Compilation CFLAGS: -g3 -gdwarf-5 -DNDEBUG
> uname output: Linux linbox 6.12.6-100.fc40.x86_64 #1 SMP PREEMPT_DYNAMIC
> Thu Dec 19 23:18:14 UTC 2024 x86_64 GNU/Linux
> Machine Type: x86_64-pc-linux-gnu
>
> Gawk Version: 5.3.60
>
> Attestation 1:
> I have read
> https://www.gnu.org/software/gawk/manual/html_node/Bugs.html.
> Yes
>
> Attestation 2:
> I have not modified the sources before building gawk.
> True
>
> Description:
> These are cases that all start by passing an array element to a
> function; the array element is brand new (an untyped/Node_elem_new). The
> test scripts used to reproduce the issues can be split in roughly two
> categories:
>
> 1. the function makes the passed argument an array; later on, the
> scripts attempt to access the newly created subarray in a scalar
> context.
>
> Yes, attempting to use an array as a scalar is a fatal script
> error
> and gawk will "shortly exit anyway"; however, on its way out it should
> print an informative message about the location, the array involved
> (plus the parameter name if inside a function). Sometimes though is
> doing a segfault/core dump, which is a tad excessive. (example below)
>
> 2. the function deletes the original/global array, then accesses in
> various ways the argument (which is the former element of the now
> deleted array).
>
> In this group of tests the now "orphan" argument is left in an
> improper state and sometimes using it later can cause an internal error
> due to an unexpected node type; in some cases a NODE structure is
> prematurely freed (DEREFed); later on, the code will read from the freed
> memory; the effects range from internal errors to incorrect results.
>
> For example, from the first category, a core dump can be triggered with
> this very short script (it is actually the first test case,
> ar2fn_elnew_sc, a greatly simplified version of a real life script):
>
> function f(y) {
> y[3] = 14
> y # BOOM
> }
> BEGIN { f(a[10]) }
>
> The crash is caused by a strlen stepping on a NULL vname pointer; the
> "vname" should be the name/string form of the index in the
> array/expression wrongly used as a scalar; it should be used in the
> error message saying something like "attempt to use array `y (from
> a["10"])' in a scalar context". That in turn requires correct
> "parent_array" and/or "orig_array" pointers. (see also the 2nd test
> case).
>
> Repeat-By:
> See the test/ar2fn_*.awk test cases and their associated .ok files in
> the attached patch.
>
> To add ONLY the test cases and see the bugs manifest before the actual
> fix is in... Do something like this (maybe in a separate branch to keep
> things tidy):
> git apply --whitespace=nowarn --include='test/*'
> path/to/the/patch/file/ar2fn.diff
> ... then do a 'make check'. (well, maybe build gawk beforehand
> if it's
> not already built).
> Or just slice and dice the diff to keep only the tests and their
> makefiles.
>
> The incorrect outcomes for each test are as follows:
>
> ar2fn_elnew_sc: causes a core dump
>
> ar2fn_elnew_sc2: malformed error message: "fatal: attempt to use
> array
> `(null)' in a scalar context"
>
> ar2fn_fmod: use-after-free, accessing a Node_var and its
> "var_value"
> *after* it has been freed, so all bets are off: depending on how gawk
> was built (debug/release) and/or Valgrind options, it can crash, other
> times it gives a wrong type in result.
>
> ar2fn_unxptyp_aref: "fatal: internal error: file interpret.h, line
> 265:
> unexpected parameter type Node_array_ref"
>
> ar2fn_unxptyp_val: NODE left in improper state; "fatal: internal
> error:
> file interpret.h, line 265: unexpected parameter type Node_val"
>
> Note: no fuzzers were stressed during the making of these scripts.
> Also, no synthesizers. ;)
>
> Fix:
> See attached ar2fn.diff for a fix. Some details, with a bit of
> background first:
>
> A newly born Node_elem_new could become a scalar or a (sub)array. But
> until it knows exactly what it will be when it "grows up", it should be
> or pretend to be (potentially) both: it should carry around -- or have a
> way to access -- all the information needed for both outcomes.
>
> However, the way it is treated in its childhood does not allow it to
> maintain the memory of its parent (this starts to sound like a soap
> opera or a bildungsroman...); in some cases, by the time it ends up in a
> situation to turn into a subarray it will miss key data that a subarray
> would need: a pointer to its parent ("parent_array") and the "vname", a
> string representation of its index in that parent array.
>
> The fix saves this information in certain fields of the NODE structure
> and preserves them while the new element is passed around. If the NODE
> will turn into a scalar, that information will be freed/discarded. But
> if it will become a subarray, the saved information will be moved into
> the proper fields so that it can be used in case is needed (mainly by
> the "array_vname" function).
>
> I tried to leave existing code as much as possible the way it was and
> only add/augment around it.
> Also, in two places there are conditions that do not seem to be
> triggered, or at least I was not able to come up with a test to do it;
> thus arguably they could be turned into asserts, just in case.
> But out of an abundance of caution I've left them there; at worst they
> will be very small pieces of dead code.
> One such place IIRC is the prev_array->valref test in
> "adjust_param_node" (array.c).
> The 2nd is in gawkapi.c, the additional test towards the end of
> "api_sym_update" to reset a Node_elem_new; I feel a new element node
> cannot reach that point, but I could be wrong; I can dig deeper in that
> area when I'll have more tuits -- the round kind.
>
> The fix passes the test suite, without leaks or other memory issues
> reported from Valgrind ("make valgrind" shows 3 "broken pipe"-style
> errors, but that's unrelated, the same thing is happening with an
> official master branch build).
[Prev in Thread] |
Current Thread |
[Next in Thread] |
- Re: Issues caused by array elements passed to a function: crash dumps, reads-after-free, internal errors, others,
Aharon Robbins <=