[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
bug#41824: Dejagnu's unknown proc aborts testsuite run when triggered in
From: |
Jacob Bachmeyer |
Subject: |
bug#41824: Dejagnu's unknown proc aborts testsuite run when triggered in test-case |
Date: |
Fri, 26 Jun 2020 17:37:23 -0500 |
User-agent: |
Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.8.1.22) Gecko/20090807 MultiZilla/1.8.3.4e SeaMonkey/1.1.17 Mnenhy/0.7.6.0 |
Pedro Alves wrote:
On 6/25/20 11:39 PM, Jacob Bachmeyer wrote:
Pedro Alves wrote:
On 6/25/20 3:43 AM, Jacob Bachmeyer wrote:
Pedro Alves wrote:
[...]
I presume buildbot can be easily configured to pass a '--keep_going' option,
perhaps as 'RUNTESTFLAGS=--keep_going' if needed?
What's more likely is we'll tweak GDB's Makefile so that make check
enables that option automatically. If GCC does the same, which I would
encourage them to, since they also run the testsuite against systems
that can take hours/days to complete, then I'll get to wonder who the
default is for. :-)
The default is for developers working on testsuites -- and for testing release
candidates and releases because if your testcases crash in a release, you have
problems. :-)
Are you suggesting withdrawing the --keep-going/--no-keep-going options
Yes, but if you would like to add it, it's OK with me, though then
you can say that I'm more discussing the default. But really I'm more
worried about making sure that the people maintaining the GDB
testsuite (me included), and the DejaGnu people understand each
other's needs better.
My position here stems from the need for reliable testing, and part of
that is that crashes in the testsuite scripts need to be made very
obvious, where DejaGnu had previously swept them under the proverbial
rug. Sure, there would be a message in the log, but that is easily
missed in a testsuite as large as GDB's and nowhere was a need to grep
the log for those messages documented.
and always generating an UNRESOLVED result,
resuming with the next test script, and possibly storing a list of aborted test scripts somewhere
Yes (with --keep-going if you insist on having the option), but note that
I suggested storing the list of ERRORs and WARNINGs.
Arguably, a crashed test script is a distinct category from the ERRORs
and WARNINGs that test scripts can report when something does not go as
expected. The former is a problem with the testsuite *itself*, while
the latter indicate problems detected *by* the testsuite.
Still, needing a new option to get old behavior is changing the environment
interface, so I will make --keep_going the default, at least for the rest of
1.6. That default is likely to change for 1.7. For automated systems like
buildbot, passing '--keep_going' should be perfectly reasonable, which is why
the option exists. (While the documented options will be changed to use
hyphens to align with the rest of the GNU system in 1.7, the current option
names will continue to be accepted, since the aliases are obvious.)
Currently with DejaGnu only the case of calling an unknown function stops
the test run AFAIK. Any other tcl error like calling a proc with the wrong
number of arguments, or treating a non-array variable as an array, etc. is
caught by the catch in the runtest proc in runtest.exp, and the testrun
continues. That to me clearly indicates that the original intention was
to catch errors and continue.
That also gives the tool's $(tool)_finish proc a chance to tear down
correctly.
It is just that the "unknown" case wasn't thought of. So I argue that not
making unknown proc calls abort the run is a bug fix, and making DejaGnu
abort the run for other errors by default is a behavior change that no real
testsuite built around DejaGnu is asking for.
As it was explained to me, it is the other way around -- the catch in runtest
was overlooked when ::unknown was added to catch bugs by aborting if an
undefined procedure is called.
Is there a ${tool}_finish or do you mean ${tool}_exit? Either way, you make a
good point about running cleanup procedures when aborting a test run.
Look at proc runtest in runtest.exp, it calls ${tool}_init,
sources the testcase under catch (using ERROR:, so here you could
run a tally of such caught errors), and then calls ${tool}_finish.
Well, sure enough, there are ${tool}_init and ${tool}_finish procedures
called "around" each script if they exist... and no documentation...
another item added to my local TODO list for improving the manual.
The entire reason for stopping the testsuite immediately when a Tcl error
occurs is that it is undeniable that something is *very* *wrong* when the
testsuite stops early. Software testing is hard enough when you know your
testsuite is good, but it is far worse when you have bugs in the testsuite
masking bugs in the actual program because a test script is crashing and part
of the program that is supposed to be tested is not being exercised.
I think it's safe to say that we all understand the general principle,
but IMO that applies more to continuing the same testcase (if somehow that
were possible, like "unknown" suppressing the error), rather than continuing
with a different testcase.
Continuing the testrun when one testcase errors out does not mask any
bug in the program that is supposed to be tested. Why would it? The
testcase aborts, it doesn't continue. The remaining testcases start
afresh.
Consider a hypothetical case where GDB has exec.exp (which tests starting a debugged process) and attach.exp
(which tests attaching to a process). Unknown to the GDB developers, the "attach" command does not
work and causes GDB to dump core (oops!) but attach.exp fails with a Tcl error before reaching the point
where "attach" is issued to the inferior GDB. If DejaGnu stops immediately, it is obvious that the
testsuite is broken, the bug in the testsuite is fixed, and the bug in GDB is found. If DejaGnu sweeps the
Tcl error under the proverbial rug (as previous versions did), the bug in the testsuite effectively hides the
bug in GDB , because the developers think the "attach" command is being exercised, but those tests
are being (almost) silently skipped.
First, that scenario doesn't normally cause a Tcl error. If GDB trips
an internal error, the GDB testsuite catches it with a FAIL. If GDB just dies
(whether it dumps core or not) the GDB testsuite catches that
with UNRESOLVED.
In the hypothetical scenario, attach.exp crashes with a Tcl error
*before* issuing the "attach" command to the GDB-under-test. Previous
releases of DejaGnu mention a Tcl error in the log and carry on. With
this patch, DejaGnu catches that attach.exp crashed and itself inserts
an UNRESOLVED result.
I also take this as more support for using the UNRESOLVED status here:
the test script crashing is little different from GDB exiting unexpectedly.
Also, even if those weren't already considered, the developer won't think the
testcase is just passing correctly, because being a good GDB developer,
he knows he needs to diff the new gdb.sum file against a previous run's, and
sees that some PASSes are missing, and some ERROR: tcl error showed up
in their stead. Nobody should rely on just looking at the summary.
Everyone _must_ diff between a previous run and the current run. Now some
people may use test result diffing scripts that forget to look
at ERROR: lines in .sum. Those would be buggy, but then again, if
those ERRORs were accounted for in the DejaGnu summary, they would be
harder to miss.
If this is the case, then what is the problem with aborting the test run
upon error? "Good GDB developers" will always diff the .sum files,
notice that the expected new PASS results are not there, and fix their
code before checking it in, so they will never have a problem with
DejaGnu aborting upon encountering a Tcl error. :-)
And again, people run the GDB testsuite in parallel mode, so your
assumption that the testsuite would stop immediately anyway doesn't
hold. One of the parallel runtest invocations bombs
out, while the others keep running, so by the time you get
back the prompt, other testcases already executed, and you don't
see the ERROR in the current terminal anymore.
This is a fundamental problem with slicing up the testsuite external to
DejaGnu and running multiple instances of runtest -- output that is
supposed to be "last at the end of the run" can be hidden away by other
ongoing subset runs.
You'll need to
look at the testsuite .sum .log files anyway. Only, you will
find out that one testcase bombed out, 50000 tests ran, but 1000 others
didn't run because of that one testcase. And now you're missing
those 1000 results, which maybe included those which you were
actually interested in, because you were doing a builbot try run
across a number of systems.
This also applies on a smaller scale for which we have no workaround:
gdb.foo/bar.exp bombs out and the remaining /N/ tests in that file did
not run. The only real solution is for the test scripts to avoid
causing Tcl errors in the first place; anything else is papering over
the problem with duct tape.
Now, when running the testsuite against slow embedded boards, then
you won't normally be using the parallel mode, the board just can't
support multiple parallel connections. In that case, you set the
testsuite running, and go on your merry way for a couple hours or 20.
When you come back, maybe the next day, you notice that the testsuite
actually stopped running 30 mins in, because someone changed
the testsuite upstream just before you rebased in way that caused a
tcl error for you in some testcase. Oops. You just lost a day.
This is more of an annoyance, and yes, I have come back the next day to
find that a long-running command failed shortly after being started, so
I know this irritation.
Now
you wish you had remembered to type --keep-going, but it's not
the default, and nobody ever writes Tcl bugs, so you didn't think
of it before.
Note that the snapshot of GDB I have been using for testing this patch
does have such a Tcl bug in its testsuite. I did not make any
particular effort to choose this snapshot, it was just the current state
of GDB master when I started. I have not even looked at *which* commit
it is. Thinking "nobody ever writes Tcl bugs" is what causes these
problems in the first place.
I would expect that, had DejaGnu always aborted immediately upon
catching Tcl errors from testsuites, that issue would have been found
and fixed quickly.
That's why I think --keep-going should be the default. When
you're developing a new testcase, you'll run that testcase in
isolation a number of times. It's very hard to miss any
silly error then -- there's no need to optimize for this use
case by default, IMO.
Now a bug that obvious will be caught very quickly by users (likely the GDB
developers themselves) but obscure regressions are for more insidious,
especially in large testsuites such as those for GCC or GDB. If a regression
test is being skipped with an easily-missed error, that regression can slip in
unnoticed.
Would simply turning aborted test scripts into UNRESOLVED results be enough to
get them fixed quickly?
Yes. (Though I still think a separate count for ERROR (and WARNING)
would be better.)
I think we need the UNRESOLVED results for at least a credible
best-effort at claiming POSIX compliance, but additionally tracking how
many UNRESOLVED results are due to script crashes would not be difficult.
-- Jacob
- bug#41824: Dejagnu's unknown proc aborts testsuite run when triggered in test-case, (continued)
- bug#41824: Dejagnu's unknown proc aborts testsuite run when triggered in test-case, Jacob Bachmeyer, 2020/06/24
- bug#41824: Dejagnu's unknown proc aborts testsuite run when triggered in test-case, Pedro Alves, 2020/06/25
- bug#41824: Dejagnu's unknown proc aborts testsuite run when triggered in test-case, Jacob Bachmeyer, 2020/06/25
- bug#41824: Dejagnu's unknown proc aborts testsuite run when triggered in test-case, Pedro Alves, 2020/06/26
- bug#41824: Dejagnu's unknown proc aborts testsuite run when triggered in test-case, Pedro Alves, 2020/06/26
- bug#41824: Dejagnu's unknown proc aborts testsuite run when triggered in test-case, Jacob Bachmeyer, 2020/06/26
- bug#41824: Dejagnu's unknown proc aborts testsuite run when triggered in test-case, Pedro Alves, 2020/06/27
- bug#41824: Dejagnu's unknown proc aborts testsuite run when triggered in test-case, Pedro Alves, 2020/06/27
- bug#41824: Dejagnu's unknown proc aborts testsuite run when triggered in test-case, Jacob Bachmeyer, 2020/06/27
- bug#41824: Dejagnu's unknown proc aborts testsuite run when triggered in test-case, Rob Savoye, 2020/06/26
- bug#41824: Dejagnu's unknown proc aborts testsuite run when triggered in test-case,
Jacob Bachmeyer <=
- bug#41824: Dejagnu's unknown proc aborts testsuite run when triggered in test-case, Pedro Alves, 2020/06/27
- bug#41824: Dejagnu's unknown proc aborts testsuite run when triggered in test-case, Pedro Alves, 2020/06/27
- bug#41824: Dejagnu's unknown proc aborts testsuite run when triggered in test-case, Jacob Bachmeyer, 2020/06/27
- bug#41824: Dejagnu's unknown proc aborts testsuite run when triggered in test-case, Pedro Alves, 2020/06/27
- bug#41824: Dejagnu's unknown proc aborts testsuite run when triggered in test-case, Jacob Bachmeyer, 2020/06/30
- bug#41824: Dejagnu's unknown proc aborts testsuite run when triggered in test-case, Pedro Alves, 2020/06/30
- bug#41824: Dejagnu's unknown proc aborts testsuite run when triggered in test-case, Jacob Bachmeyer, 2020/06/30
- bug#41824: Dejagnu's unknown proc aborts testsuite run when triggered in test-case, Rob Savoye, 2020/06/25
- bug#41824: Dejagnu's unknown proc aborts testsuite run when triggered in test-case, Jacob Bachmeyer, 2020/06/25
- bug#41824: Dejagnu's unknown proc aborts testsuite run when triggered in test-case, Rob Savoye, 2020/06/27