Re: [PATCH] {GSoC} parallel-tests: simplify testsuite summary

automake-patches

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH] {GSoC} parallel-tests: simplify testsuite summary

From:	Stefano Lattarini
Subject:	Re: [PATCH] {GSoC} parallel-tests: simplify testsuite summary
Date:	Thu, 7 Jul 2011 19:02:58 +0200
User-agent:	KMail/1.13.3 (Linux/2.6.30-2-686; KDE/4.4.4; i686; ; )

On Wednesday 06 July 2011, Ralf Wildenhues wrote:
> * Stefano Lattarini wrote on Wed, Jul 06, 2011 at 02:49:15PM CEST:
> > On Wednesday 06 July 2011, Ralf Wildenhues wrote:
> > > * Stefano Lattarini wrote on Sun, Jul 03, 2011 at 03:31:22PM CEST:
> > > > Prefer a more deterministic, "tabular" format for the testsuite
> > > > summary, always listing the numbers of passed, failed, xfailed,
> > > > xpassed, skipped and errored tests, even when these numbers are
> > > > zero.  This simplify the logic of testsuite summary creation,
> > > > makes it more easily machine-parseable, and will probably allow
> > > > for easier addition of new kinds of test results in the future.
> > > > 
> > > > Applied to the temporary branch 'GSoC/experimental/test-results-work'.
> > > 
> > > This looks a bit verbose to me, also IIUC it introduces yet another
> > > format that is different from our previous summary and different from
> > > the summaries of any other testsuite environments.  Any chance we can
> > > lessen the NIH?
> > >
> > Well, I'm surely open to revisit the issue later if we find an existing
> > format that offers all the information my new format does, is as easily
> > parseable, and does not suffer of the NIH syndrome :-).
> 
> Wait.  Do you know what NIH is?
>
Yes.  And anyway, I can use Google ;-)

> Not Invented Here means:
> you create something anew, although there exists already other code that
> does the same or a similar thing, or a similar protocol or format.
> 
> In University, not copying is considered good style (especially with
> scientific results).  In programming, copying is good, and reusing is
> even better.  But whether the code you copy (or which you reuse) itself
> suffered from NIH syndrome is often not so relevant.  ;->
> 
> > But the older
> > format has to go IMHO, for two orthogonal and good reasons:
> >  - the logic necessary to create a summary that obeys that format is
> >    complex, unclear, and not at all extensible; and
> 
> Good point.
> 
> >  - it's pretty hard to verify the correctness of the resulting summary
> >    by means of grepping checks.
> 
> OK.
> 
> But tools like autobuild have another requirement: they would like to be
> able to detect, as reliably as possible, some statistics based on
> whatever output is thrown at them.  IOW, they usually also would like to
> detect the style of testsuite driver that this comes from.
>
But this is impossible to do reliably, as the new automake harness allows
the use of multiple test drivers in the same Makefile.am, and even allow
the user to define his own test drivers.  Or am I misunderstanding what
you're saying?

> This is
> easiest done if it can be done on a line by line basis.  Your proposed
> format doesn't make that easy (but we could add a clue in one of the
> first/early summary lines to make that easier, if we also make it easy
> to detect the end of the summary).
>
Ah, now I see what you mean.  Maybe even add the "version" and "type"
of the testsuite summary, as in:

 ==================================================================
 Testsuite results for GNU AutoFoo 7.3
 harness: parallel-tests; version: 1; automake: 1.12;
 ==================================================================
 # TOTAL: 113
 # PASS: 97
 ...

or to make line-by-line parsing even easier:

 @TestSuite - harness: parallel-tests; version: 1; automake: 1.12;
 ==================================================================
 Testsuite results for GNU AutoFoo 7.3
 ==================================================================
 # TOTAL: 113
 # PASS: 97
 ...

(or something similar, I'm not trying to se details yet).

> Also, if you want something extensible, then don't treat extensibility
> like an afterthought: you could just omit any line with 0 tests, no?
>
Or better again add an option to do so (I personally prefer a count to be
reported even if it is 0, but I understand that some other people might
have a different preference).

> Before implementing more,
>
Don't worry about this, there's nothing more to implement for the moment
w.r.t. testsuite summary.  Also, note that the planned TAP driver could
have happily used the older testsuite summary format *without any change*
-- the only (but IMHO very serious) downside being that the new test
cases checking the new driver would have had a much, much harder time at
verifying whether the testsuite summary was correct or not.

> please show a (non-code) definition of the
> output that would be sufficient for someone to write an (say, autobuild)
> parser that grabs testsuite results
>
You mean a testsuite summary, right?  If you do, here is a rough but good
enough definition:

  A line composed by 76 `=' characters, followed by a line beginning with
  the string "Testsuite summary", followed again by a line composed by 76
  `=' characters -- this marks the beginning of the testsuite results'
  summary.  The first line composed by 76 `=' characters that comes after
  those -- this marks the end of the testsuite results summary.  Each of
  these results should be reported inside the testsuite summary, on a line
  if its own, matching the following regular expression:
   ^#\s*KIND-OF-RESULT:\s*COUNT(\s.*)?$
  Any line in the testsuite summary that doesn't obey this format doesn't
  denote a test result count, and should thus be ignored.

> from a stream of otherwise garbage
> that generally works, without looking at your code.  Then, let's see
> that, and proceed from there.  Thanks.
>
Is the above good enough?  Or should we also integrate the idea of having
the "version" and "type" of the testsuite summary declared in the stdout?

> > > I've rejected a reformatting of the results before (see the list
> > > archives for details).  I'm not totally opposed, if there is a real
> > > advantage /for the user/,
> > >
> > Personally I can say that, as a user, I find the new format definitely
> > clearer and more pleasant (at least when 'color-tests' is in use); and
> > this independently from the considerations above (that were admittedtly
> > done with the developer/tester hat on).
> 
> But clarity is only one criterion.  Let's define all criteria is has to
> meet.
>
Suggestions?  I really only had "clarity" in mind :-)

> > > This is end-user API, a lot of stuff that's
> > > not even using Automake (but only running a testsuite of a package that
> > > uses Automake) can be relying on this.  IOW, things like GSRC, the
> > > autobuild parser, or whatnot else (that we don't have any control over)
> > > that greps our output, and now will have to maintain yet another syntax
> > > forever (since of course the old syntax won't die in the next several
> > > years).
> > >
> > Maybe this time we should clearly document the testsuite summary format?
> 
> Exactly.
>
Could you please open a bug report about that on bug-automake?  This way,
even if we don't manage to write the documentation about this by the end
of GSoC, we will still remember to do so for 1.12.

> > Or even offer a small script to extract it from the "make check" output,
> > so that those project won't bash us too hard? (Note that I have written
> > a rough version of such a script in my patch, for use in our own test
> > cases; but it can't handle multiple summaries in the same output though).
> 
> I should read the whole post before starting to reply, obviously.
> ;-)
>
> > > > +       result_count () \
> > > > +       { \
> > > > +           if test x"$$1" = x"--color"; then \
> > > > +             colorize=yes; \
> > > > +           elif test x"$$1" = x"--no-color"; then \
> > > 
> > > What if output doesn't go to a terminal?  Does --no-color get set then,
> > > or colorize?
> > >
> > In that case, `$colorize' is set to "yes", but the variables that should
> > hold the color escape sequences ($red, $grn, etc.) are set to the empty
> > string (by the code in $(am__tty_clors)), unless the user explicitly
> > required otherwise by setting `AM_COLOR_TESTS' to "always".  This is the
> > same "colorizing logic" that is supported for the testsuite progress
> > output.
> > 
> > And now that I've been forced to spell that out, I see that the `--color'
> > option name is poorly chosen; I'd say we change it to `--maybe-color',
> > which is more faithful to the intended seantics.  OK?  I've done that
> > in the attached squash-in.
> 
> What if the user wants color really hard?
>
You mean want them *even in the test-suite.log*?

> Oh, is this user-influenceable at all btw?
>
For the testsuite summary that goes on stdout, yes, the user has complete
control over its colorizing (as he has for the colorizing of the testsuite
progress output); for the summary that goes in `test-suite.log', no, the
user can't colorize it, period.  This seems to me the most sensible policy.

> > > > +         result_count $$1 "tests:" $$all   "$$brg"; \
> > > > +         result_count $$1 "pass: " $$pass  "$$grn"; \
> > > > +         result_count $$1 "skip: " $$skip  "$$blu"; \
> > > > +         result_count $$1 "xfail:" $$xfail "$$lgn"; \
> > > > +         result_count $$1 "fail: " $$fail  "$$red"; \
> > > > +         result_count $$1 "xpass:" $$xpass "$$red"; \
> > > > +         result_count $$1 "error:" $$error "$$mgn"; \
> > > 
> > > You don't want to color the actual numbers nor the 'fail:'?
> > >
> > Huh?  The whole lines get colored ... am I misunderstanding your question?
> 
> OK then.  I misunderstood.
> 
> > > Why the shift in capitalization?
> > >
> > No good reason at all.
> > 
> > > I know this is totally bike shedding, but see above for rationale.
> > >
> > Good point.  I'll capitalize the result names ("pass" -> "PASS", etc.),
> > and adjust the test cases accordingly.
> > 
> > Post Scriptum: I find the summary even nicer with the capitalization
> > you suggested :-)
> 
> Yeah, except now the summary looks _a lot_ like output from a single
> test.  Quite confusing for a simple parser (again, look at the autobuild
> scripts for an example).
>
But IMVHO, the leading `#' characters avoids confusion for the parser, and
the enclosing `=========...' lines avoid confusion for the humans.  Or not?

> > > Need to test this on Tru64 and OpenBSD shells.  We've been burned with
> > > false negatives before.
> > >
> > I can do the test on OpenBSD 4.6 (not right now, but soonish).  I've no
> > access to True64 systems though.
> 
> Yeah, was more of a note to self.
> 
> > > > --- /dev/null
> > > > +++ b/tests/extract-testsuite-summary
> > > > @@ -0,0 +1,15 @@
> > > > +#! /usr/bin/env perl
> > > 
> > > Is it necessary to add another perl script?
> > >
> > Not strictly, but it's easier and reasonably fast, and since we can
> > assume the availability of perl when running the automake testsuite,
> > what is the problem?
> > 
> > > If yes, why use a different portability mechanism than elsewhere in
> > > the code base?
> > >
> > I don't understand the question, sorry.
> 
> The automake script itself uses a different method to find 'perl', not
> 'env'.  The user can pass PERL=... to configure.  Should that be honored
> here?
>
Ah, ok.  Yes, the script is always run with `$PERL -w /path/to/script' from
within our tests.

> > > > +Please report to address@hidden
> > > 
> > > Please don't list bad but valid email addresses.  Either use
> > > bug-autoconf or something in the example.com domain.  These addresses
> > > otherwise get picked up from the mail archives by spammers.
> > >
> > OK, sorry.  I'll use "address@hidden" then (and BTW, you
> > might want to fix a similar blunder in nodep2.test).
> 
> But why not bug-automake at the usual address?  Isn't that where you'd
> want any feedback to see, if anybody has the weird idea of writing to
> addresses found in tests?
>
Well, good point.  I'll make the change.

Regards,
  Stefano

[Prev in Thread]

Current Thread

[Next in Thread]

[PATCH] {GSoC} parallel-tests: simplify testsuite summary, Stefano Lattarini, 2011/07/03
- Re: [PATCH] {GSoC} parallel-tests: simplify testsuite summary, Ralf Wildenhues, 2011/07/06
  - Re: [PATCH] {GSoC} parallel-tests: simplify testsuite summary, Stefano Lattarini, 2011/07/06
    - Re: [PATCH] {GSoC} parallel-tests: simplify testsuite summary, Ralf Wildenhues, 2011/07/06
    - Re: [PATCH] {GSoC} parallel-tests: simplify testsuite summary, Stefano Lattarini <=
    - Re: [PATCH] {GSoC} parallel-tests: simplify testsuite summary, Ralf Wildenhues, 2011/07/07
    - Re: [PATCH] {GSoC} parallel-tests: simplify testsuite summary, Stefano Lattarini, 2011/07/07

Prev by Date: Re: [PATCH] {GSoC} tests defs: new auxiliary function 'count_test_results'
Next by Date: [GSoC] preliminary tests on TAP support
Previous by thread: Re: [PATCH] {GSoC} parallel-tests: simplify testsuite summary
Next by thread: Re: [PATCH] {GSoC} parallel-tests: simplify testsuite summary
Index(es):
- Date
- Thread