Re: Histogram
From: maxgacode
Subject: Re: Histogram
Date: Sat, 28 Mar 2020 01:31:48 +0100
User-agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:68.0) Gecko/20100101 Thunderbird/68.6.0
On 24/03/2020 01:55, Peter Johansson wrote:
> Hi Max,
> I think the current behavior is intuitive. I think it would be
> unexpected if the count of a bin depended on whether it is a bin in the
> middle or the last bin.
I have a set of data points and I'm embedding the histogram in a C++
class, so I create the histogram from a std::vector<double> in
the class constructor or in a setter function.
In the class, the minimum and maximum values of the data are used to set
the range with
gsl_histogram_set_ranges_uniform(gsl_histogram * h, double xmin, double xmax)
and then the histogram is filled with
gsl_histogram_increment(gsl_histogram * h, double x)
Once the class is initialized (with data), a member function creates a
graphical plot from the histogram. It also plots some statistical
properties of the data.
So if I have 100 data points, the number of data points shown may be 100,
or 99, or 98, or who knows! It looks as if the plot has lost data points.
The cause is simple. The last bin may or may not count some of the
elements: if x is equal to xmax, the condition for the last bin,
xmin + (n - 1) d <= x < xmax
is false, so the call fails to increment the bin.
This can happen when x and xmax are exactly representable and compare equal.
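To make the effect concrete, here is a minimal sketch that does not use GSL at all. The function count_accepted is a hypothetical stand-in that only applies the half-open rule xmin <= x < xmax described above; it is not GSL's implementation, just an illustration of why points equal to xmax go missing.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a uniform histogram (NOT GSL code).
// A value is accepted only when xmin <= x < xmax, i.e. the upper
// edge of the range is exclusive.
std::size_t count_accepted(const std::vector<double>& data,
                           double xmin, double xmax) {
    assert(xmin < xmax);  // a range must have non-zero width
    std::size_t accepted = 0;
    for (double x : data) {
        if (x >= xmin && x < xmax) {
            ++accepted;  // x falls into some bin
        }
        // otherwise x is silently skipped, as with x == xmax
    }
    return accepted;
}
```

With xmin and xmax taken from the data itself, every point equal to the maximum is skipped, so the total can be smaller than the number of data points.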
It's a known problem. At the moment I'm using the following workaround.
If xmin and xmax are the minimum and maximum values of the data points in
the vector, I set the range with
gsl_histogram_set_ranges_uniform(h, xmin, nextMax)
where
nextMax = std::nextafter(xmax, std::numeric_limits<double>::max());
So I'm using the "next" floating-point value greater than xmax as the maximum
value for the range. It works (hopefully!) and no data points are missed.
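A minimal sketch of the range computation, kept free of GSL so it can be compiled on its own. The helper name padded_range is hypothetical (not part of GSL or of my class), and it assumes a non-empty vector of finite values.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <limits>
#include <utility>
#include <vector>

// Hypothetical helper: returns {xmin, nextMax}, where xmin is the
// smallest data value and nextMax is the next representable double
// above the largest data value.
// Assumption: data is non-empty and contains only finite values.
std::pair<double, double> padded_range(const std::vector<double>& data) {
    assert(!data.empty());
    const auto mm = std::minmax_element(data.begin(), data.end());
    const double xmin = *mm.first;
    const double xmax = *mm.second;
    // Step one representable value towards the largest finite double,
    // so that xmax itself satisfies x < nextMax.
    const double next_max =
        std::nextafter(xmax, std::numeric_limits<double>::max());
    return {xmin, next_max};
}
```

The two values would then be passed as the xmin and xmax arguments of gsl_histogram_set_ranges_uniform. The upper edge moves by a single representable step, so the bin width is practically unchanged.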
Thank you for your reply
Massimo
--
PGP key: wwwkeys.pgp.net: 0EBF4A07