bug-gsl

[Bug-gsl] nan's in statistics and histograms


From: e0afbb4ee8bdd2be cd14180a5e4ba2cf
Subject: [Bug-gsl] nan's in statistics and histograms
Date: Sat, 20 Dec 2014 09:13:36 -0500
User-agent: Mozilla/5.0 (Windows NT 6.1; rv:31.0) Gecko/20100101 Thunderbird/31.3.0

To whom it may concern,

I would like the GSL statistical routines that operate on data sets
(e.g., rank, sort, mean, etc.) to have a configurable way of handling
NaNs.
As you may have seen, NaN is often used to mark missing entries in
acquired data.

Proposals:

1. Make the GSL statistical routines insensitive to the presence of NaNs,
that is, skip them:
...
  if (isnan(d[i]))
    continue;
  // otherwise process d[i], which is not NaN
...

2. In re sorting of an array, adopt three user-selectable strategies for NaNs:
  I. put them at the beginning of the array
  II. put them at the end of the array
  III. leave them in place (sort data around them)

3. In re ranking of the vector entries, adopt sorting strategy II, then
do the ranking as usual on the offset array.

4. Ignore NaNs when computing 1-D histograms, but add an entry for the
count of NaNs.

5. Ignore NaNs when computing 2-D histograms, but add an extra row and
column to the bin matrix for data points in which one or both of the
coordinates are NaN.
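
To make the sorting strategies in proposal 2 concrete, here is a minimal
sketch in plain C. The names nan_policy and sort_with_nan are hypothetical
and are not part of GSL; the sketch only assumes the standard isnan() and
qsort().

```c
#include <math.h>
#include <stdlib.h>

/* Hypothetical policy names; nothing like this exists in GSL. */
typedef enum { NAN_FIRST, NAN_LAST, NAN_IN_PLACE } nan_policy;

static int cmp_double(const void *a, const void *b)
{
  double x = *(const double *)a, y = *(const double *)b;
  return (x > y) - (x < y);
}

/* Sort d[0..n-1] ascending, placing NaNs according to policy.
   Returns the number of NaNs found, or (size_t)-1 if allocation fails. */
size_t sort_with_nan(double *d, size_t n, nan_policy policy)
{
  size_t i, k = 0, n_nan;
  double *tmp = malloc((n ? n : 1) * sizeof *tmp);
  if (tmp == NULL)
    return (size_t)-1;

  /* Copy the non-NaN values out and sort only those. */
  for (i = 0; i < n; i++)
    if (!isnan(d[i]))
      tmp[k++] = d[i];
  n_nan = n - k;
  qsort(tmp, k, sizeof *tmp, cmp_double);

  if (policy == NAN_IN_PLACE) {
    /* Strategy III: NaNs keep their positions, data is sorted around them. */
    for (i = 0, k = 0; i < n; i++)
      if (!isnan(d[i]))
        d[i] = tmp[k++];
  } else {
    /* Strategy I or II: NaNs form one block at the start or at the end. */
    size_t data_start = (policy == NAN_FIRST) ? n_nan : 0;
    size_t nan_start = (policy == NAN_FIRST) ? 0 : k;
    for (i = 0; i < k; i++)
      d[data_start + i] = tmp[i];
    for (i = 0; i < n_nan; i++)
      d[nan_start + i] = NAN;
  }
  free(tmp);
  return n_nan;
}
```

Under one reading of proposal 3, calling this with NAN_LAST leaves the
n - n_nan ordinary values in the leading part of the array, so a ranking
routine could then run on that part only.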

In re 4 and 5: use the global NaN strategy to put the NaN entries at the
beginning or at the end of the data matrix.
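
Proposal 4 could look roughly like the sketch below, written in plain C
rather than against the gsl_histogram API. The nan_histogram structure and
nan_histogram_add function are hypothetical. The NaN count is kept in a
separate field here; storing it as an extra first or last bin instead would
be a layout choice governed by the global NaN strategy.

```c
#include <math.h>
#include <stddef.h>

/* Hypothetical structure; not part of GSL. */
typedef struct {
  double lo, hi;   /* range covered by the regular bins: lo <= x < hi */
  size_t nbins;    /* number of regular, equally wide bins */
  size_t *bin;     /* caller-provided array of nbins counts */
  size_t n_nan;    /* number of NaN inputs seen */
} nan_histogram;

/* Count x in its bin, or in the NaN counter if x is NaN.
   Returns 0 if x was counted, -1 if x lies outside [lo, hi). */
int nan_histogram_add(nan_histogram *h, double x)
{
  size_t i;
  if (isnan(x)) {
    h->n_nan++;
    return 0;
  }
  if (x < h->lo || x >= h->hi)
    return -1;
  i = (size_t)((x - h->lo) / (h->hi - h->lo) * (double)h->nbins);
  if (i >= h->nbins)
    i = h->nbins - 1;   /* guard against floating-point rounding */
  h->bin[i]++;
  return 0;
}
```

The 2-D case in proposal 5 would follow the same pattern, with the extra
row and column collecting points whose x, y, or both coordinates are NaN.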


Regards,
w/boobs



