bug-gsl
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Bug-gsl] Incomplete gamma functions


From: Jason Stover
Subject: Re: [Bug-gsl] Incomplete gamma functions
Date: Mon, 15 Nov 2004 15:03:21 +0000
User-agent: Mutt/1.4.2.1i

I reproduced the error with gcc 3.3.4, gsl 1.5, linux
kernel 2.4.26.

If you are computing probabilities, you should use gsl_cdf_chisq_Q (x,
df) instead. It's written for doing such statistical tests. It uses
the incomplete gamma function, so it may still show the bug. But when
you have a lot of degrees of freedom, as in this case, you can (and probably
should) use a normal approximation, since this value:

             (chi-square statistic) - (degrees of freedom)
        Z = ----------------------------------------------
                    sqrt (2*degrees of freedom)

becomes normally distributed with mean 0 and standard dev. 1 as
df-->infinity. In this case, you can gsl_cdf_gaussian_Q(x-df,sqrt(2*df)).

Here is your program rewritten to use the chi-square
cdf (I think I have the right number of degrees of freedom;
definitely check that before cutting and pasting):

#include <stdio.h>
#include <gsl/gsl_cdf.h>
int main (void)
{
  double df=2*1e6 - 1.0;
  double x=2*1e6-2.0;
  printf ("%g\n", gsl_cdf_chisq_Q (x,df));

  return 0;
}

This gave 0.499668 and didn't show any error (I don't know
why not).

-Jason

On Sat, Nov 13, 2004 at 11:46:58PM -0800, Michel Lespinasse wrote:
> Hi all,
> 
> I've recently been looking for good implementations of the incomplete
> gamma functions. Note that I dont actually know much about statistics
> or numerical mathematics, but I just need an incomplete gamma function
> to implement the chi-square function in a spam filter :)
> 
> Anyway. I found out that the following program fails at runtime in
> gamma_inc.c:97
> 
> #include <stdio.h>
> #include <gsl/gsl_sf_gamma.h>
> 
> int main (void)
> {
>     printf ("%g\n", gsl_sf_gamma_inc_Q (1e6 - 1, 1e6 - 2));
>     return 0;
> }
> 
> I also wanted to point out a few things about this gamma_inc.c file,
> and ask a few questions too:
> 
> * In gsl_sf_gamma_inc_Q_e, you first test for a <= x (line 487) and then
>   for a < 0.8*x (line 505). The second test will always fail - otherwise,
>   the first one would have succeeded. I think this is the cause of the bug
>   I noted above - you try to use the P_series in a region where it converges
>   too slowly. Maybe line 505 really meant to test for 0.8*a < x ???
>   (I do not know how fast the Q_CF algorithm would converge in that region).
> 
> * gsl_sf_gamma_inc_P_e and gsl_sf_gamma_inc_Q_e are curiously asymetrical
>   in the way they select the proper numerical algorithm depending on
>   which region we are in. gsl_sf_gamma_inc_Q_e selects the Q_large_x
>   algorithm for any x > 1e6 (even as x-a might be small), while
>   gsl_sf_gamma_inc_P_e only selects Q_large_x for a < 0.2 * x.
>   Anyone understands why we have this difference ?
> 
> * In the region a>1e6, (x-a)<sqrt(a), you're using the Q_asymp_unif
>   algorithm from Temme. Do you know how precise your implementation is in
>   that region ? (the function comment says you're missing the c1 coefficient).
>   Also, gnu R uses a different implementation for that region, using a
>   normal distribution approximation. The reference I have is
>   "Algorithm AS 239, Incomplete Gamma Function, Applied Statistics 37, 1988".
>   I do not have access to that publication though, so I dont know if
>   it's any good. Anyone knows how it compares to the Temme algorithm ?
> 
> Since I'm there, I might also copy in my own pgamma function, which
> makes use of the normal approximation I found in gnu R. As I've said,
> I don't know how precise this really is, though it seems to have about
> 9 digits of precision in practice. If anyone knows more or has comments,
> I'd like to hear.
> 
> double pgamma (double a, double x)
> {
>     if (a > 1e5) {
>         // Use normal approximation.
>         return 0.5 * erfcl (-3 * sqrtl (0.5 * a) *
>                             (cbrtl (x / a) - 1 + (1 / 9.0L) / a));
>     } else if (x < a + 1) {
>         // Use series approximation.
>         long double an     = a;
>         long double factor = 1 / an;
>         long double sum    = factor;
> 
>         do
>             sum += factor *= x / ++an;
>         while (factor > LDBL_EPSILON * sum);
>         return expl (a * logl (x) - x - lgammal (a)) * sum;
>     } else {
>         // Use continued fraction approximation.
>         long double frac = LDBL_MAX;
>         long double xa1  = x - a - 1;
>         long double n    = 450;  // Keep in FP register for efficiency.
> 
>         while (--n > 0)
>             frac = n * (a - n) / frac + xa1 + n + n;
>         return 1 - expl (a * logl (x) - x - lgammal (a)) / frac;
>     }
> }
> 
> Hope this helps. I'm not on the lists, so I would appreciate to be
> copied in any replies.
> 
> Thanks,
> 
> 
> _______________________________________________
> Bug-gsl mailing list
> address@hidden
> http://lists.gnu.org/mailman/listinfo/bug-gsl

-- 
address@hidden
SDF Public Access UNIX System - http://sdf.lonestar.org




reply via email to

[Prev in Thread] Current Thread [Next in Thread]