
## Re: [Bug-gnubg] Snowie 4 vs. GNU 0.13

 From: Rod Roark Subject: Re: [Bug-gnubg] Snowie 4 vs. GNU 0.13 Date: Mon, 9 Jun 2003 07:53:10 -0700 User-agent: KMail/1.4.3

```
As I mentioned, I don't think standard deviation from a
single run of any number of matches is an interesting
number.

Let's try to answer the question "what's the standard
deviation of 1000 instances of 100 coin tosses?".  This
should help us understand the significance of the result
from 100 matches.  I.e. if the result of the matches is
well outside an expected *random* result, then we can
reasonably think that it is meaningful.

Here's a little Perl program to test this:

------------------------------------------------------------
$ cat cointoss.plx
#!/usr/bin/perl
use strict;
use warnings;

my $count       = 0;
my $sum         = 0;
my $sum_squares = 0;

# Run 1000 trials, each counting the heads in 100 fair coin tosses.
for (my $i = 0; $i < 1000; ++$i) {
    my $value = 0;

    for (my $j = 0; $j < 100; ++$j) {
        $value += 1 if (rand(1) < 0.5);
    }

    ++$count;
    $sum         += $value;
    $sum_squares += $value * $value;
}

# Population standard deviation: sqrt(E[x^2] - mean^2).
my $mean = $sum / $count;
printf "Number of samples  : %d\n", $count;
printf "Mean               : %f\n", $mean;
printf "Standard Deviation : %f\n",
    sqrt(($sum_squares / $count) - ($mean * $mean));
$ ./cointoss.plx
Number of samples  : 1000
Mean               : 49.810000
Standard Deviation : 4.989579
$
------------------------------------------------------------

For a normal distribution, about 68% of samples fall within
the mean +/- one standard deviation.  Here the standard
deviation from 1000 instances of 100 *random* match results
is about 5.

So winning anywhere from 45-55 of the matches is quite
ordinary, and in my opinion the actual result of 56 wins is
nice but not at all conclusive.

-- Rod

On Monday 09 June 2003 07:28 am, Joern Thyssen wrote:
> On Mon, Jun 09, 2003 at 07:14:08AM -0700, Rod Roark wrote:
>
> > As others mentioned, there's a square root involved in the
> > calculation of standard deviation.  I believe there are a
> > couple of different ideas in use as to the exact formula.
> >
> > However I don't think you want this calculation anyway.  It
> > is an attempt to answer the question "how much do the
> > results differ from the average?".  Well if each result
> > must be either 0 or 1 and the average is somewhere around
> > 0.5, then they will all differ by 0.5.  This is not an
> > interesting number.
> >
> > The question to answer is, "how significant is the result
> > from 100 matches?".  This depends in part on "how much does
> > a match depend on skill, and how much on luck?".  The answer
> > to that last question is elusive.
>
> I know how to calculate the luck by analysing the matches and extracting
> the total luck.
>
> The 95% confidence interval is calculated as 1.96 * std.dev. So with 100
> matches and a final result of 56 wins to gnubg and std.dev of 0.5, I
> get:
>
> 56% +/- 98%
>
> What I don't understand is that if I had played 100000000000 matches and
> gnubg won 54000000000 of them, I would still have a std.dev of 0.5, thus
> I still get:
>
> 56% +/- 98%
>
> So what did I get wrong?
>
> Jørn

```
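
The simulated figure can be checked in closed form. The following is a sketch, assuming (as the coin-toss program does) that each match is an independent 50/50 event; the symbols $n$, $p$ and $W$ are introduced here for illustration and do not appear in the thread. It also shows where the per-match deviation of 0.5 from the quoted message fits in: it has to be divided by $\sqrt{n}$ before it is turned into a confidence interval.

```latex
% W = number of wins in n = 100 matches, each won with probability p = 0.5
\begin{align*}
\mathrm{E}[W] &= np = 100 \times 0.5 = 50 \\
\mathrm{SD}[W] &= \sqrt{np(1-p)} = \sqrt{100 \times 0.5 \times 0.5} = 5 \\
% Standard error of the observed win rate W/n
\mathrm{SE}\!\left[\tfrac{W}{n}\right] &= \sqrt{\tfrac{p(1-p)}{n}} = \frac{0.5}{\sqrt{100}} = 0.05 \\
% Approximate 95% interval around the observed 56 wins out of 100
0.56 \pm 1.96 \times 0.05 &= 0.56 \pm 0.098
\end{align*}
```

Under these assumptions the closed-form value of 5 agrees with the simulated 4.99, and the interval of roughly 56% +/- 9.8% still includes 50%. Because the standard error shrinks as $1/\sqrt{n}$, the same calculation for 100000000000 matches would give an interval of about +/- 0.0003%.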