Histogram
From: maxgacode
Subject: Histogram
Date: Thu, 12 Mar 2020 15:59:40 +0100
Hi,
The current implementation of the histogram in GSL uses the following
rules for the first and last bins (n uniform bins of width
d = (xmax - xmin)/n):
bin[0] corresponds to xmin <= x < xmin + d
bin[n-1] corresponds to xmin + (n - 1)d <= x < xmax
The documentation has this comment about the last bin:
***
Thus any samples which fall on the upper end of the histogram are
excluded. If you want to include this value for the last bin you will
need to add an extra bin to your histogram.
***
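A minimal sketch of the half-open rule quoted above, for uniform bins only. The function name bin_index_half_open and the -1 return value are assumptions made for illustration; this is not how GSL itself is implemented (GSL stores the bin edges and searches them), so samples very close to an interior edge may be classified differently.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of GSL: index of the bin that holds x under
   the half-open rule, or -1 when x lies outside [xmin, xmax). */
long bin_index_half_open(double xmin, double xmax, size_t n, double x)
{
    assert(n > 0 && xmax > xmin);

    if (x < xmin || x >= xmax)
        return -1;                      /* x == xmax is rejected here */

    double d = (xmax - xmin) / (double)n;   /* uniform bin width */
    size_t i = (size_t)((x - xmin) / d);

    if (i >= n)                         /* guard against rounding just below xmax */
        i = n - 1;

    return (long)i;
}
```

With xmin = 0, xmax = 10 and n = 5, a sample of 9.999 goes to the last bin, while a sample of exactly 10 is not binned at all.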
I'm facing exactly this problem: a data value that is exactly equal to
xmax is not counted in the last bin. I can add an extra bin, but this is
very inconvenient, especially when I call
gsl_histogram_set_ranges_uniform() with the minimum and maximum data
values after a call to gsl_histogram_alloc().
I'm wondering why GSL does not use a different binning strategy, like
the NCAR one, and whether it would be possible to implement it:
https://www.ncl.ucar.edu/Document/Graphics/Interfaces/gsn_histogram.shtml
The linked page has the following statement:
"Note that the last interval is treated specially. This is intentional,
to make sure that all data values that fall inclusively between the
lowest and highest intervals are binned. "
So the last bin is filled with the rule
bin[n-1] corresponds to xmin + (n - 1)d <= x <= xmax
Using this rule in the functions
gsl_histogram_increment
and
gsl_histogram_accumulate
seems (to me) much more convenient, obvious and simple. Maybe it could
be an option, set with a function call after the histogram is allocated.
Any comments are welcome.
Max