Re: [patch #5583] NPAR TESTS
John Darrington
Sat, 9 Dec 2006 11:14:40 +0900
On Thu, Dec 07, 2006 at 05:28:37PM +0000, Jason H Stover wrote:
> Follow-up Comment #5, patch #5583 (project pspp):
> I checked a couple of the tests out. It looks like SPSS is doing
> something odd. I attached a file with comments. I'm not sure what
> to do about this. The SPSS output is misleading, though not necessarily
> "wrong", if we accept its inconsistent changes in what hypothesis it tests
> in different situations. But I don't want to mislead any users.
Your comments are consistent with the SPSS documentation, which says:
"The direction of the one-tailed test depends on the observed
proportion in the first category. If the observed proportion is
more than the test proportion, the significance of observing that
many or more in the first category is reported. If the observed
proportion is less than or equal to the test proportion, the
significance of observing that many or fewer in the first category is
reported. In other words, the test is always done in the observed
direction."
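The rule in that passage can be sketched in Python, purely as an illustration. The helper names are made up for this sketch and nothing here is taken from the patch or from SPSS itself; it only picks the tail the way the quoted text describes. The sample call uses the counts from the weighted example further down (11 of 20 in the first category, test proportion 0.6).

```python
from math import comb

def binom_pmf(i, n, p):
    """P(X = i) for X ~ Binomial(n, p)."""
    return comb(n, i) * p**i * (1 - p)**(n - i)

def one_tailed_sig(k, n, p):
    """One-tailed significance "in the observed direction" (hypothetical helper).

    k: count in the first category, n: total count, p: test proportion.
    """
    if k / n > p:
        # Observed proportion above the test proportion: that many or more.
        return sum(binom_pmf(i, n, p) for i in range(k, n + 1))
    # Observed proportion at or below the test proportion: that many or fewer.
    return sum(binom_pmf(i, n, p) for i in range(0, k + 1))

print(f"{one_tailed_sig(11, 20, 0.6):.3f}")  # prints 0.404
```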
The calculations in my patch don't have any problem with the two
particular examples you mentioned. Later ones, however, have problems:
DATA LIST LIST NOTABLE /x * w *.
BEGIN DATA.
1 11
2 9
END DATA.
WEIGHT BY w.
NPAR TESTS
  /BINOMIAL(0.6) = x.
Gives:
5.1 NPAR TESTS. Binomial Test
+-+------#--------+--+--------------+----------+---------------------+
| |      #Category| N|Observed Prop.|Test Prop.|Exact Sig. (1-tailed)|
+-+------#--------+--+--------------+----------+---------------------+
|x|Group1#    1.00|11|          .550|      .600|                 .404|
| |Group2#    2.00| 9|          .450|          |                     |
| |Total #        |20|          1.00|          |                     |
+-+------#--------+--+--------------+----------+---------------------+
And the cumulative binomial distribution for p = 0.6, n = 20, x = 11
is indeed 0.404.
But the formula given in Algorithms says (as I understand it) to use
the binomial cdf for p = 0.4, n = 20, x = 9. That answer is 0.755.
It seems that the book is saying to reverse the test if the observed
proportion >= 0.5, whereas SPSS and its documentation reverse the
test if the observed proportion >= p.
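The two figures can be reproduced with a few lines of Python. This is only a sketch for checking the arithmetic: binom_cdf is a hypothetical helper written for this note, not code from the patch, and the "reversed" call follows the reading of the book given above, which may not be what the book intends.

```python
from math import comb

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

n, k, p = 20, 11, 0.6

# Lower tail at the test proportion, matching the table above.
lower = binom_cdf(k, n, p)

# "Reversed" computation as the book is understood above:
# swap the categories and use 1 - p as the test proportion.
reversed_tail = binom_cdf(n - k, n, 1 - p)

print(f"{lower:.3f} {reversed_tail:.3f}")  # prints 0.404 0.755
```

Since n - X is Binomial(n, 1 - p), the second figure is the same as P(X >= 11) under p = 0.6, so the two readings amount to reporting opposite tails of the same distribution.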
Also, there is what I think is a separate issue: the book says the
answer is in fact not the binomial cdf, but (2 * cdf - B(k;n,p))/2.
This seems to be a "correction for continuity", which Siegel &
Castellan (Chapter 4) say is necessary for the asymptotic
approximation, but they don't mention it for the exact case.
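To see how much that adjustment would change the number, here is a small Python sketch. It assumes B(k;n,p) means the binomial probability of exactly k successes, which is a guess at the book's notation, and corrected_sig is a hypothetical name used only here.

```python
from math import comb

def binom_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p); assumed meaning of B(k;n,p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(binom_pmf(i, n, p) for i in range(k + 1))

def corrected_sig(k, n, p):
    """(2 * cdf - B(k;n,p)) / 2, as quoted from the book (hypothetical helper)."""
    return (2 * binom_cdf(k, n, p) - binom_pmf(k, n, p)) / 2

n, k, p = 20, 11, 0.6
print(f"cdf       = {binom_cdf(k, n, p):.3f}")
print(f"corrected = {corrected_sig(k, n, p):.3f}")
```

Algebraically the expression is cdf - B(k;n,p)/2, so under this reading it counts the observed outcome with half weight and always comes out smaller than the plain cdf.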
So I'm confused. But I'm fast coming to the conclusion that the book
is completely wrong.
J'
