[Axiom-developer] Re: [sage-devel] Randomised testing against Mathematica
Wed, 03 Mar 2010 09:40:36 -0500
Thunderbird (Windows/20090302)
There are two test suites with validated results.
The CATS (Computer Algebra Test Suite) effort targets
the development of known-good answers that get run
against several systems. These "end result" suites test
large portions of the system. As they are tested against
published results they can be used by all systems.
The integration suite found several bugs in the published
results which are noted in the suite. It also found a bug
introduced by an improper patch to Axiom.
It would be generally useful if Sage developed known-good
test suites in other areas, say infinite sequences and series.
Perhaps such a suite would make a good GSOC effort with
several moderators from different systems.
I have done some more work toward a trigonometric test
suite. So far I have found that Mathematica and Maxima
tend to agree on branch cuts and Axiom and Maple tend
to agree on branch cuts. The choice is arbitrary but
it affects answers. I am having an internal debate about
whether to choose MMA/Maxima compatible answers just to
"regularize" the expected results users will see.
Standardized test suites give our users confidence that
we are generating known-good results for some (small)
range of expected inputs.
An academic-based effort (which Axiom is not) could
approach NIST for funding an effort to develop such
suites. NIST runs the Digital Library of Mathematical
Functions website (http://dlmf.nist.gov/). I proposed
developing Computer Algebra test suites for their
website but NIST does not fund independent open source
projects. Sage, however, could probably get continuous
funding to develop such suites which would benefit all
of the existing CAS efforts.
NSF might also be convinced since such test suites raise
the level of expected quality of answers without directly
competing against commercial efforts. I'd like to see a
CAS testing research lab that published standardized
answers to a lot of things we all end up debating, such
as branch cuts, sqrt-of-squares, foo^0, etc.
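The sqrt-of-squares and foo^0 debates are easy to demonstrate with
plain stdlib Python (used here only as an illustration of the
conventions a CAS must choose, not as any system's official answer):

```python
import math

# sqrt(x^2) == x only holds for x >= 0; in general sqrt(x^2) == |x|.
# A CAS that "simplifies" sqrt(x^2) to x silently assumes x >= 0.
x = -3.0
print(math.sqrt(x**2))     # 3.0, i.e. abs(x), not x

# 0^0 is a convention each system must pick; Python chooses 1,
# matching the IEEE 754 pow() recommendation.
print(0**0)                # 1
print(math.pow(0.0, 0.0))  # 1.0
```

A standardized suite would pin down one answer for each of these so
users could at least know what to expect from a conforming system.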
Dr. David Kirkby wrote:
Joshua Herman wrote:
Is there a Mathematica test suite we could adapt or a standardized set
of tests we could use? Maybe we could take the 100 most often used
functions and make a test suite?
I'm not aware of one. A Google search found very little of any real use.
I'm sure Wolfram Research have such test suites internally, but they
are not public. There is discussion of how they have an internal
version of Mathematica which runs very slowly, but tests things in
much more depth.
Of course, comparing 100 things is useful, but comparing millions of
them in the way I propose would more likely show up problems.
I think we are all aware that it is best to test on the hardware you
are using to be as confident as possible that the results are right.
Of course, Wolfram Research could supply a test suite to check
Mathematica on an end user's computer, but they do not do that. They
could even encrypt it, so users did not know what was wrong, but could
at least alert Wolfram Research.
I'm aware of one bug in Mathematica that only affected old/slower
SPARC machines if Solaris was updated to Solaris 10. I suspect it
would have affected newer machines too, had they been heavily loaded.
(If I was sufficiently motivated, I would probably prove that, but I'm
not, so my hypothesis is unproven).
It did not produce incorrect results, but pegged the CPU at 100%
forever if you computed something as simple as 1+1. It was amazing
how that was solved between myself, Casper Dik (a kernel engineer at
Sun), and various other people on the Internet. It was Casper who
finally nailed the problem: after I posted the output of lsof, he
could see what Mathematica was doing.
I've got a collection of a few Mathematica bugs, mainly affecting only
Solaris, although one affected at least one Linux distribution too.
One thing I know Mathematica does do, which Sage could do, is to
automatically generate a bug report if it finds a problem. At the most
primitive level, that code might be

if (x < 0) { ... }
else if (x == 0) { ... }
else if (x > 0) { ... }
else report_bug(__FILE__, __LINE__);  /* "impossible" case, e.g. x is NaN */
If the error is generated, a URL is given, which you click and can
send a bug report to them. It lists the name of the file and line
number which generated the error. That's something that could be done
in Sage and might catch some bugs.
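A minimal Python sketch of that idea, with a hypothetical reporting URL
(everything here is illustrative, not actual Sage or Mathematica
machinery). The point is the final "unreachable" branch, which fires
for values like NaN that slip past an apparently exhaustive case split:

```python
import sys

BUG_URL = "https://example.org/report"  # hypothetical reporting endpoint

def classify(value):
    """Apparently exhaustive case analysis over a number's sign."""
    if value < 0:
        return "negative"
    elif value == 0:
        return "zero"
    elif value > 0:
        return "positive"
    else:
        # Reachable only for values like float('nan'): all three
        # comparisons above are False. Report file and line, as the
        # Mathematica error dialog does.
        frame = sys._getframe()
        print("Internal error at %s:%d -- please report at %s"
              % (frame.f_code.co_filename, frame.f_lineno, BUG_URL))
        return "bug"

print(classify(float('nan')))  # triggers the "impossible" branch
```

Every comparison with NaN is False, so without the final else the
function would silently fall off the end instead of flagging the bug.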
On Wed, Mar 3, 2010 at 12:04 AM, David Kirkby wrote:
Has anyone ever considered randomised testing of Sage against
Mathematica?
As long as the result is either
a) True or False
b) An integer
then comparison should be very easy. As a dead simple example,
1) Generate a large random number n.
2) Use is_prime(n) in Sage to determine if n is prime or composite.
3) Use PrimeQ[n] in Mathematica to see if n is prime or composite.
4) If Sage and Mathematica disagree, write it to a log file.
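The loop above can be sketched in stdlib Python. Since Mathematica
obviously cannot be called from a self-contained example, a second,
independent primality implementation stands in for PrimeQ; a real
harness would replace it with a pipe to Mathematica:

```python
import random

def trial_division_is_prime(n):
    """Slow but obviously-correct reference implementation."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True

def miller_rabin_is_prime(n, rounds=20):
    """Independent probabilistic implementation to test against."""
    if n < 2:
        return False
    for p in (2, 3, 5, 7, 11, 13):
        if n % p == 0:
            return n == p
    d, s = n - 1, 0
    while d % 2 == 0:
        d //= 2
        s += 1
    for _ in range(rounds):
        a = random.randrange(2, n - 1)
        x = pow(a, d, n)
        if x in (1, n - 1):
            continue
        for _ in range(s - 1):
            x = pow(x, 2, n)
            if x == n - 1:
                break
        else:
            return False
    return True

# The differential loop: log every n where the implementations disagree.
random.seed(1)
disagreements = [n for n in (random.randrange(2, 10**6) for _ in range(1000))
                 if trial_division_is_prime(n) != miller_rabin_is_prime(n)]
print(disagreements)  # any entry here is a bug in one of the two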
Something a bit more complex:
1) Generate a random function f(x) - something that one could
integrate numerically.
2) Generate random upper and lower limits 'a' and 'b'.
3) Perform a numerical integration of f(x) between 'a' and 'b'
in Sage.
4) Perform a numerical integration of f(x) between 'a' and 'b'
in Mathematica.
5) Compare the outputs of the Sage and Mathematica runs.
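A stdlib-only sketch of those steps, again substituting two independent
integrators for the Sage and Mathematica calls (the function, limits,
rule sizes, and tolerances are all illustrative choices):

```python
import math
import random

def simpson(f, a, b, n=1000):
    """Composite Simpson's rule; n must be even."""
    h = (b - a) / n
    s = f(a) + f(b)
    for i in range(1, n):
        s += (4 if i % 2 else 2) * f(a + i * h)
    return s * h / 3

def trapezoid(f, a, b, n=100000):
    """Composite trapezoid rule -- an independent second method."""
    h = (b - a) / n
    s = (f(a) + f(b)) / 2
    for i in range(1, n):
        s += f(a + i * h)
    return s * h

# Steps 2-5: random limits, integrate both ways, log disagreements.
random.seed(0)
for _ in range(20):
    a = random.uniform(-2.0, 0.0)
    b = random.uniform(0.5, 3.0)
    r1 = simpson(math.cos, a, b)
    r2 = trapezoid(math.cos, a, b)
    if not math.isclose(r1, r2, rel_tol=1e-6, abs_tol=1e-7):
        print("disagreement:", a, b, r1, r2)
```

Note the comparison uses a tolerance rather than equality, which leads
directly into the floating-point problem discussed next.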
A floating-point number would be more difficult to compare, as one
would need to consider what is a reasonable level of difference.
Comparing symbolic results directly would be a much more difficult
task, and probably impossible without a huge effort, since you can
often write an equation in several different ways which are equal, but
a computer program could not easily determine that they are equivalent.
One could potentially let a computer crunch away all the time, looking
for differences. Then when they are found, a human would have to
investigate why the difference occurs.
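One pragmatic workaround for the symbolic-equivalence problem is to
compare expressions numerically at random sample points: disagreement
at any point proves the expressions differ, while agreement everywhere
is only (strong) evidence of equality. A stdlib sketch, with the
helper name, sample range, and tolerances all illustrative:

```python
import math
import random

def numerically_equivalent(f, g, trials=200, lo=-10.0, hi=10.0):
    """Heuristic equality test: compare f and g at random points.
    A mismatch proves inequality; a clean run is only evidence."""
    rng = random.Random(42)  # fixed seed for reproducibility
    for _ in range(trials):
        x = rng.uniform(lo, hi)
        if not math.isclose(f(x), g(x), rel_tol=1e-9, abs_tol=1e-9):
            return False
    return True

# sin^2 + cos^2 and 1 are the "same equation written two ways":
print(numerically_equivalent(lambda x: math.sin(x)**2 + math.cos(x)**2,
                             lambda x: 1.0))            # True
# sin(2x) and 2*sin(x) are not:
print(numerically_equivalent(lambda x: math.sin(2 * x),
                             lambda x: 2 * math.sin(x)))  # False
```

This sidesteps canonical forms entirely, at the cost of occasional
false positives, which is acceptable when a human reviews the log
anyway.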
One could then add a trac item for "Mathematica bugs". There was once a
push for a public list of Mathematica bugs. I got involved a bit with
that, but it died a death and I became more interested in Sage.
Some of you may know of Vladimir Bondarenko, who is a strange
character who regularly used to publish Mathematica and Maple bugs he
had found. In some discussions I've had with him, he was of the
opinion that Wolfram Research took bug reports more seriously than
Maplesoft. I've never worked out what technique he uses, but I believe
he is doing some randomised testing, though it is more sophisticated
than what I'm suggesting above.
There must be a big range of problem types where this is practical -
and a much larger range where it is not.
You could at the same time also compare the time taken to execute the
operation, to find areas where Sage is much faster or slower than
Mathematica.
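The same harness shape works for timing: run the operation under each
implementation and record both durations. A minimal stdlib sketch,
where two ways of squaring a big integer stand in for the Sage and
Mathematica calls:

```python
import timeit

# Time the same computation done two different ways; a real harness
# would wrap a Sage call and a Mathematica (MathLink) call instead.
n = 10**500
t_pow = timeit.timeit(lambda: pow(n, 2), number=1000)
t_mul = timeit.timeit(lambda: n * n, number=1000)
print("pow: %.6fs  mul: %.6fs" % (t_pow, t_mul))
```

Logging the ratio alongside each correctness comparison would cost
almost nothing and map out where each system is fast or slow.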
- [Axiom-developer] Re: [sage-devel] Randomised testing against Mathematica,
Tim Daly