
Re: [Gnu-arch-users] more on the merge-fest

From: Tom Lord
Subject: Re: [Gnu-arch-users] more on the merge-fest
Date: Tue, 25 Nov 2003 12:51:41 -0800 (PST)

    > From: address@hidden

    > >         1) fix the code
    > >         2) verify that it broke the test suite
    > >         3) study and fix the test suite

    > I may misunderstand the point here since I don't know the
    > context, but this seems a bit strange to me.  All the docs for
    > Java test suites tell it the other way around: if there is a bug
    > or a change in behavior, you define the way it should work by
    > programming the test suite to do what you expect the API to do.
    > If the test suite breaks (it will), then you fix the sources to
    > make the test suite pass.

This is just a variation on that theme.
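
As an aside, that standard loop -- write the test to state the desired
behavior, watch it fail, then fix the code -- could be sketched roughly
like this (a hypothetical example for illustration; the function and its
spec are made up, not arch code):

```python
# Hypothetical example of the test-first loop described in the quote.
# Step 1: encode the expected behavior as a test *before* fixing the code.

def parse_version(s):
    # Buggy implementation: silently drops the patch level.
    major, minor = s.split(".")[:2]
    return (int(major), int(minor))

def test_parse_version():
    # The test states the desired behavior: keep all three components.
    assert parse_version("1.2.3") == (1, 2, 3)

# Step 2: run the test and watch it fail against the current code.
try:
    test_parse_version()
    print("unexpected pass -- the test is too weak")
except AssertionError:
    print("fails as expected")

# Step 3: fix the source so the test passes.
def parse_version(s):
    return tuple(int(part) for part in s.split("."))

test_parse_version()
print("passes after the fix")
```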

Feature "X" was designed with spec "S1".

Tests and code were provided that tested for and implemented S1.

A variety of coding-standard changes were made, none involving a change
to the spec -- in those cases, just make sure the tests keep working.
That's an example of the theory of testing you're talking about.

However, the spec had to be changed to "S2".  But here's the handy
observation: S2 is such that a basic unit test of S1 should fail if
the code suddenly switches to S2.

So: change the code to S2.  Run the test.  It fails as expected: good.
That helps validate the test.  (This is a crude example of something
other theories of testing talk about:  deliberately inserting bugs to
evaluate the quality of testing.   In this case, though, it was just a
convenient option:  the new spec S2 looked like a bug from the
perspective of the old spec S1.)

Next step: Look at the test source in detail.  Confirm it is failing
as expected.  Change it into an S2 test.  Now it works.  That helps
validate the modified code.

    > I can point to some interesting documents that discuss this way
    > of working, but if I misunderstood then it's not needed.  Let me
    > know if you are interested in documents describing best methods
    > for unit testing.

In general, you have to watch out for testing religions.  They're good
to study and make good rules of thumb, but after the studying is done,
there's still art and pragmatic trade-offs left.

In general, the end result is the same: you want to say how things are
supposed to work as many different ways as you have time for, giving
priority to the expressions of how things are supposed to work that
are most likely to highlight problems, weighted by seriousness of
problems -- then you make sure everything you've said agrees with
everything else.  How you get there is an art, at best approximated by
a process formula.[*]

Testing gets a lot easier -- and formulas a lot more valuable -- when
the consequences of screwing up are very high.  After all, testing is
pretty much by definition redundant typing compared to the task "type
in the correct program":  when the value of high confidence in the
product is high, you can afford all that redundancy.

Hmm... that reminds me:  I should probably do another round of
mkpatch/dopatch stress-testing, just in case.


[*] Back in the days of `larch' there was an earlier effort to
    make an arch test suite.   It was exactly by this "artfulness"
    metric that that effort failed:   people started off writing
    the tests that were _easiest_ to write rather than the tests
    most needed.   For example (I don't remember if this was literally
    true or not but it may as well be):  vigorous testing of 
    `my-id' but none of `inventory' or `mkpatch'.

    The thinking, as I understood it, was: "well, there's just a
    finite number of tests needed.   We'll start at one end and
    continue to the other, nailing every point along the way.  Slow
    and steady wins the race".

    A fine theory, of course, except unrealistic:  the effort died
    in part when folks working on it got called away to other
    tasks, leaving behind a not particularly useful set of tests.

    In the ramp-up to tla, you have Robert Anderson to thank for 
    writing stress-tests for mkpatch/dopatch (which caught _many_
    bugs) and entering into a tight feedback-loop with me as I hacked
    ("1) [R] here's the next bug.  2) [T] Here's the fix. 3) goto 1").
    Absent that, tla would have been a disaster.

    And since the release of tla, Colin got to the heart of the matter
    by emphasizing "end-to-end checks" (tests of high-level
    functionality) in the test suite distributed with tla.
