Re: [patch #5583] NPAR TESTS

From: Jason Stover
Subject: Re: [patch #5583] NPAR TESTS
Date: Mon, 18 Dec 2006 11:52:27 -0500
User-agent: Mutt/1.5.10i

On Sun, Dec 17, 2006 at 01:47:29PM +0900, John Darrington wrote:
> On Sat, Dec 16, 2006 at 10:27:01PM +0000, Jason H Stover wrote:
>      I would use the exact result until the sample size grows enough to cause
>      overflows. 
> OK.  Any ideas how large such a sample would be ?

I would have to look at this closely to decide, but after some initial
thought, I would say: large enough that we should just use
gsl_cdf_binomial_[PQ] always. gsl_cdf_binomial_[PQ] use a
gamma-function reparameterization of the binomial distribution, which
means they don't sum the binomial mass function to compute the
binomial CDF. The problem with the exact computations is usually
overflow due to small values of the mass function. Since
gsl_cdf_binomial_[PQ] circumvent this by reparameterizing with the
gamma function, my guess is that there is no need for an asymptotic
approximation.
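
As a rough illustration of why going through the log-gamma function
sidesteps the overflow, here is a minimal Python sketch. It is not how
GSL implements gsl_cdf_binomial_[PQ]: the function names are
hypothetical, it still sums the mass function term by term, and it
assumes 0 < p < 1. It only shows that the factorial form breaks down in
double precision while the log-gamma form does not.

```python
import math


def binom_pmf_naive(k, n, p):
    """Mass function from the textbook formula, using double-precision factorials.

    Hypothetical helper for illustration. 171! already exceeds the largest
    IEEE 754 double, so float() raises OverflowError once n reaches 171.
    """
    coeff = float(math.factorial(n)) / (
        float(math.factorial(k)) * float(math.factorial(n - k))
    )
    return coeff * p**k * (1.0 - p) ** (n - k)


def binom_pmf_lgamma(k, n, p):
    """Mass function computed in log space via the log-gamma function.

    Hypothetical helper; assumes 0 < p < 1. Uses n! = Gamma(n + 1), so the
    huge factorials are only ever handled as logarithms.
    """
    log_coeff = math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
    log_pmf = log_coeff + k * math.log(p) + (n - k) * math.log1p(-p)
    return math.exp(log_pmf)


def binom_cdf_lgamma(k, n, p):
    """P(X <= k) as an accurately rounded sum of the log-space mass function."""
    return math.fsum(binom_pmf_lgamma(i, n, p) for i in range(k + 1))


if __name__ == "__main__":
    n, p = 2000, 0.5
    try:
        binom_pmf_naive(n // 2, n, p)
    except OverflowError as exc:
        print("naive formula failed:", exc)
    print("log-gamma CDF at n/2:", binom_cdf_lgamma(n // 2, n, p))
```

The sketch still costs O(k) work per CDF value, which is one more
reason to prefer a closed-form special-function routine when one is
available.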

This reparameterization wasn't so useful prior to Walter Gautschi's
development of a precise method to compute the gamma function in the
1970s. Before that, computation of the gamma function wasn't as
precise, and researchers in numerical analysis didn't talk much to
researchers in statistics. Statisticians therefore used the
large-sample approximation. I believe that legacy is the only reason a
lot of statistical programs still use large-sample approximations to
the binomial CDF.

I can run some tests to look for a threshold at which pspp should use
an approximation, but I'm certain it's going to be large enough that
the approximation is almost never, and possibly never, necessary with
IEEE 754 arithmetic. Because of that, I'd recommend checking in the
NPAR procedure with the gsl functions.


> J'
> -- 
> PGP Public key ID: 1024D/2DE827B3 
> fingerprint = 8797 A26D 0854 2EAB 0285  A290 8A67 719C 2DE8 27B3
> See or any PGP keyserver for public key.
