[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [PATCH 1/4] fuzz: refine crash detection mechanism
From: |
Alexander Bulekov |
Subject: |
Re: [PATCH 1/4] fuzz: refine crash detection mechanism |
Date: |
Tue, 22 Dec 2020 11:47:59 -0500 |
Oops let me try to resend this..
Qiuhao Li <Qiuhao.Li@outlook.com> writes:
> The original crash detection method is to fork a process to test our new
> trace input. If the child process exits in time and the second-to-last line
> is the same as the first crash, we think it is a crash triggered by the same
> bug. However, in some situations, it doesn't work since it is a
> hardcoded-offset string comparison.
>
> For example, suppose an assertion failure makes the crash. In that case, the
> second-to-last line will be 'timeout: the monitored command dumped core',
>
Ah - I have not encountered this message. Are you running an
--enable-sanitizers build? I believe ASAN disables coredumps, by
default. I have to turn them on with:
ASAN_OPTIONS=abort_on_error=1:disable_coredump=0:unmap_shadow_on_exit=1
Maybe this is a matter of setting the correct env variables/disabling
coredumps?
I like the idea of switching out CRASH_TOKEN for a regex, however I am
not sure about using the hardcoded crash_patterns to perform matching:
1.) You risk missing some crash pattern. E.g. I don't think
abort()/hw_error() are covered right now.
2.) At some point ASAN/compiler-rt might change the way it outputs
crashes.
I think the current lines[-2] approach is ugly, but it is small, works
in most cases (when coredumps are disabled), and has a simple
CRASH_TOKEN fallback mechanism. We should fix the coredump problem.
Is there any way to do this without hardcoding patterns (or at least
fall-back to something if you don't find a pattern)?
-Alex
> which doesn't contain any information about the assertion failure like where
> it happened or the assertion statement. This may lead to a minimized input
> triggers assertion failure but may indicate another bug. As for some
> sanitizers' crashes, the direct string comparison may stop us from getting a
> smaller input, since they may have a different leaf stack frame.
>
> Perhaps we can detect crashes using both precise output string comparison
> and rough pattern string match and info the user when the trace input
> triggers different but a seminar output.
>
> Tested:
> Assertion failure, https://bugs.launchpad.net/qemu/+bug/1908062
> AddressSanitizer, https://bugs.launchpad.net/qemu/+bug/1907497
> Trace input that doesn't crash
> Trace input that crashes Qtest
>
> Signed-off-by: Qiuhao Li <Qiuhao.Li@outlook.com>
> ---
> scripts/oss-fuzz/minimize_qtest_trace.py | 59 ++++++++++++++++++------
> 1 file changed, 46 insertions(+), 13 deletions(-)
>
> diff --git a/scripts/oss-fuzz/minimize_qtest_trace.py
> b/scripts/oss-fuzz/minimize_qtest_trace.py
> index 5e405a0d5f..d3b09e6567 100755
> --- a/scripts/oss-fuzz/minimize_qtest_trace.py
> +++ b/scripts/oss-fuzz/minimize_qtest_trace.py
> @@ -10,11 +10,16 @@ import os
> import subprocess
> import time
> import struct
> +import re
>
> QEMU_ARGS = None
> QEMU_PATH = None
> TIMEOUT = 5
> -CRASH_TOKEN = None
> +
> +crash_patterns = ("Assertion.+failed",
> + "SUMMARY.+Sanitizer")
> +crash_pattern = None
> +crash_string = None
>
> write_suffix_lookup = {"b": (1, "B"),
> "w": (2, "H"),
> @@ -24,13 +29,12 @@ write_suffix_lookup = {"b": (1, "B"),
> def usage():
> sys.exit("""\
> Usage: QEMU_PATH="/path/to/qemu" QEMU_ARGS="args" {} input_trace output_trace
> -By default, will try to use the second-to-last line in the output to identify
> -whether the crash occred. Optionally, manually set a string that idenitifes
> the
> -crash by setting CRASH_TOKEN=
> +By default, we will try to search predefined crash patterns through the
> +tracing output to see whether the crash occred. Optionally, manually set a
> +string that idenitifes the crash by setting CRASH_PATTERN=
> """.format((sys.argv[0])))
>
> def check_if_trace_crashes(trace, path):
> - global CRASH_TOKEN
> with open(path, "w") as tracefile:
> tracefile.write("".join(trace))
>
> @@ -42,17 +46,47 @@ def check_if_trace_crashes(trace, path):
> shell=True,
> stdin=subprocess.PIPE,
> stdout=subprocess.PIPE)
> + if rc.returncode == 137: # Timed Out
> + return False
> +
> stdo = rc.communicate()[0]
> output = stdo.decode('unicode_escape')
> > - if rc.returncode == 137: # Timed Out
> - return False
> - if len(output.splitlines()) < 2:
> + output_lines = output.splitlines()
> + # Usually we care about the summary info in the last few lines, reverse.
> + output_lines.reverse()
> +
> + global crash_pattern, crash_patterns, crash_string
> + if crash_pattern is None: # Initialization
> + for line in output_lines:
> + for c in crash_patterns:
> + if re.search(c, line) is not None:
> + crash_pattern = c
> + crash_string = line
> + print("Identifying crash pattern by this string: ",\
> + crash_string)
> + print("Using regex pattern: ", crash_pattern)
> + return True
> + print("Failed to initialize crash pattern: no match.")
> return False
>
> - if CRASH_TOKEN is None:
> - CRASH_TOKEN = output.splitlines()[-2]
> + # First, we search exactly the previous crash string.
> + for line in output_lines:
> + if crash_string == line:
> + return True
> +
> + # Then we decide whether a similar (same pattern) crash happened.
> + # Slower now :(
> + for line in output_lines:
> + if re.search(crash_pattern, line) is not None:
> + print("\nINFO: The crash string changed during our minimization
> process.")
> + print("Before: ", crash_string)
> + print("After: ", line)
> + print("The original regex pattern can still match, updated the
> crash string.")
> + crash_string = line
> + return True
>
> - return CRASH_TOKEN in output
> + # The input did not trigger (the same type) bug.
> + return False
>
>
> def minimize_trace(inpath, outpath):
> @@ -66,7 +100,6 @@ def minimize_trace(inpath, outpath):
> print("Crashed in {} seconds".format(end-start))
> TIMEOUT = (end-start)*5
> print("Setting the timeout for {} seconds".format(TIMEOUT))
> - print("Identifying Crashes by this string: {}".format(CRASH_TOKEN))
>
> i = 0
> newtrace = trace[:]
> @@ -152,6 +185,6 @@ if __name__ == '__main__':
> usage()
> # if "accel" not in QEMU_ARGS:
> # QEMU_ARGS += " -accel qtest"
> - CRASH_TOKEN = os.getenv("CRASH_TOKEN")
> + crash_pattern = os.getenv("CRASH_PATTERN")
> QEMU_ARGS += " -qtest stdio -monitor none -serial none "
> minimize_trace(sys.argv[1], sys.argv[2])
> --
> 2.25.1