[Top][All Lists]
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [Bug-gsl] Incomplete gamma functions
From: |
Jason Stover |
Subject: |
Re: [Bug-gsl] Incomplete gamma functions |
Date: |
Mon, 15 Nov 2004 15:03:21 +0000 |
User-agent: |
Mutt/1.4.2.1i |
I reproduced the error with gcc 3.3.4, gsl 1.5, linux
kernel 2.4.26.
If you are computing probabilities, you should use gsl_cdf_chisq_Q (x,
df) instead. It's written for doing such statistical tests. It uses
the incomplete gamma function, so it may still show the bug. But when
you have a lot of degrees of freedom, as in this case, you can (and probably
should) use a normal approximation, since this value:
(chi-square statistic) - (degrees of freedom)
Z = ----------------------------------------------
sqrt (2*degrees of freedom)
becomes normally distributed with mean 0 and standard dev. 1 as
df-->infinity. In this case, you can gsl_cdf_gaussian_Q(x-df,sqrt(2*df)).
Here is your program rewritten to use the chi-square
cdf (I think I have the right number of degrees of freedom;
definitely check that before cutting and pasting):
#include <stdio.h>
#include <gsl/gsl_cdf.h>
int main (void)
{
double df=2*1e6 - 1.0;
double x=2*1e6-2.0;
printf ("%g\n", gsl_cdf_chisq_Q (x,df));
return 0;
}
This gave 0.499668 and didn't show any error (I don't know
why not).
-Jason
On Sat, Nov 13, 2004 at 11:46:58PM -0800, Michel Lespinasse wrote:
> Hi all,
>
> I've recently been looking for good implementations of the incomplete
> gamma functions. Note that I dont actually know much about statistics
> or numerical mathematics, but I just need an incomplete gamma function
> to implement the chi-square function in a spam filter :)
>
> Anyway. I found out that the following program fails at runtime in
> gamma_inc.c:97
>
> #include <stdio.h>
> #include <gsl/gsl_sf_gamma.h>
>
> int main (void)
> {
> printf ("%g\n", gsl_sf_gamma_inc_Q (1e6 - 1, 1e6 - 2));
> return 0;
> }
>
> I also wanted to point out a few things about this gamma_inc.c file,
> and ask a few questions too:
>
> * In gsl_sf_gamma_inc_Q_e, you first test for a <= x (line 487) and then
> for a < 0.8*x (line 505). The second test will always fail - otherwise,
> the first one would have succeeded. I think this is the cause of the bug
> I noted above - you try to use the P_series in a region where it converges
> too slowly. Maybe line 505 really meant to test for 0.8*a < x ???
> (I do not know how fast the Q_CF algorithm would converge in that region).
>
> * gsl_sf_gamma_inc_P_e and gsl_sf_gamma_inc_Q_e are curiously asymetrical
> in the way they select the proper numerical algorithm depending on
> which region we are in. gsl_sf_gamma_inc_Q_e selects the Q_large_x
> algorithm for any x > 1e6 (even as x-a might be small), while
> gsl_sf_gamma_inc_P_e only selects Q_large_x for a < 0.2 * x.
> Anyone understands why we have this difference ?
>
> * In the region a>1e6, (x-a)<sqrt(a), you're using the Q_asymp_unif
> algorithm from Temme. Do you know how precise your implementation is in
> that region ? (the function comment says you're missing the c1 coefficient).
> Also, gnu R uses a different implementation for that region, using a
> normal distribution approximation. The reference I have is
> "Algorithm AS 239, Incomplete Gamma Function, Applied Statistics 37, 1988".
> I do not have access to that publication though, so I dont know if
> it's any good. Anyone knows how it compares to the Temme algorithm ?
>
> Since I'm there, I might also copy in my own pgamma function, which
> makes use of the normal approximation I found in gnu R. As I've said,
> I don't know how precise this really is, though it seems to have about
> 9 digits of precision in practice. If anyone knows more or has comments,
> I'd like to hear.
>
> double pgamma (double a, double x)
> {
> if (a > 1e5) {
> // Use normal approximation.
> return 0.5 * erfcl (-3 * sqrtl (0.5 * a) *
> (cbrtl (x / a) - 1 + (1 / 9.0L) / a));
> } else if (x < a + 1) {
> // Use series approximation.
> long double an = a;
> long double factor = 1 / an;
> long double sum = factor;
>
> do
> sum += factor *= x / ++an;
> while (factor > LDBL_EPSILON * sum);
> return expl (a * logl (x) - x - lgammal (a)) * sum;
> } else {
> // Use continued fraction approximation.
> long double frac = LDBL_MAX;
> long double xa1 = x - a - 1;
> long double n = 450; // Keep in FP register for efficiency.
>
> while (--n > 0)
> frac = n * (a - n) / frac + xa1 + n + n;
> return 1 - expl (a * logl (x) - x - lgammal (a)) / frac;
> }
> }
>
> Hope this helps. I'm not on the lists, so I would appreciate to be
> copied in any replies.
>
> Thanks,
>
>
> _______________________________________________
> Bug-gsl mailing list
> address@hidden
> http://lists.gnu.org/mailman/listinfo/bug-gsl
--
address@hidden
SDF Public Access UNIX System - http://sdf.lonestar.org