Re: [Bug-gnubg] Confidence intervals from rollouts

From:	Douglas Zare
Subject:	Re: [Bug-gnubg] Confidence intervals from rollouts
Date:	Tue, 10 Sep 2002 03:10:53 -0400
User-agent:	Internet Messaging Program (IMP) 3.1

Quoting David Montgomery <address@hidden>:

> Douglas Zare wrote
> > There was an interesting question on the Gammonline bulletin board about the
> > standard deviation in cubeless rollouts and the standard deviations of live
> > cube results. I've included an excerpt below, in which I attempt to estimate a
> > confidence interval for the difference between doubling and not doubling. I'm
> > not sure what the best way to do this is, but I suggest that an attempt would
> > be worth implementing in gnu.
> 
> Isn't the right way to do this a paired t-test?
> This is what I thought, anyway, after talking with
> Jeremy Bagai about it.  Each game forms a pair with
> double and no-double result.  Since the two are so
> highly correlated, you should get a much tighter
> confidence interval than the joint standard deviation.

True (unless you use a nonlinear function to convert cubeless to cubeful
numbers), although one problem is that the differences are not very close to a
normal distribution. So you either have to wait for the Central Limit Theorem
to kick in, or just hope for small tails, or use another test. I would be
inclined to set a minimum number of trials and then take the middle choice
(hope for small tails), but that's because I don't know the alternative tests.
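
To make the paired idea concrete, here is a minimal Python sketch. It is an
illustration only, not gnubg code: the function names, the min_trials default
of 30, and the use of a normal approximation are all assumptions. It assumes
each game index holds the double and no-double results from the same dice.

```python
import math
import statistics


def paired_interval(double_results, no_double_results,
                    confidence=0.95, min_trials=30):
    """Normal-approximation confidence interval for mean(double - no double).

    Hypothetical helper: both lists must hold per-game results where index i
    in each list comes from the same game (same dice sequence).
    """
    if len(double_results) != len(no_double_results):
        raise ValueError("results must be paired game by game")
    n = len(double_results)
    if n < min_trials:
        # Assumed safeguard: the differences are not normal, so insist on
        # enough games before leaning on the Central Limit Theorem.
        raise ValueError("too few trials for a normal approximation")

    diffs = [d - nd for d, nd in zip(double_results, no_double_results)]
    mean = statistics.fmean(diffs)
    std_err = statistics.stdev(diffs) / math.sqrt(n)
    z = statistics.NormalDist().inv_cdf(0.5 + confidence / 2)
    return mean, mean - z * std_err, mean + z * std_err


def unpaired_standard_error(a, b):
    """Standard error of mean(a) - mean(b) if the pairing is ignored."""
    return math.sqrt(statistics.variance(a) / len(a)
                     + statistics.variance(b) / len(b))
```

When the two results in each game are highly correlated, the interval from
paired_interval should come out much narrower than one built from
unpaired_standard_error on the same data.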

> The same thing could be applied to checker plays, but
> it's not as clean, especially for 3+ plays.  Then you
> are probably supposed to do some multi-way analysis of
> variance, of which I am unfortunately ignorant.

Yeah, the Central Limit Theorem still works, but some coincidences that make it 
easy to do things in one dimension fail. I think most of the complexity goes 
away if you just allow yourself to use perhaps twice as much data as a truly 
efficient test would take. Since we are spending future processor cycles rather 
than analyzing last year's medical trials, I'm not so concerned about it.
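
One simple way to trade extra data for simplicity, sketched below in Python,
is to compare the apparently best play against each other play with a paired
interval and split the allowed error across the comparisons (a Bonferroni
correction). This is a technique substituted here for illustration, not
something proposed in this thread, and the function name and data layout are
assumptions.

```python
import math
import statistics


def compare_to_best(results, confidence=0.95):
    """Paired intervals for (best play - other play), Bonferroni adjusted.

    Hypothetical helper: `results` maps a play name to a list of per-game
    equities, where index i in every list uses the same dice sequence.
    """
    names = list(results)
    if len(names) < 2:
        raise ValueError("need at least two plays to compare")

    # Pick the play with the highest sample mean as the reference.
    best = max(names, key=lambda name: statistics.fmean(results[name]))
    others = [name for name in names if name != best]

    # Bonferroni: divide the total error allowance among the comparisons,
    # which widens each interval and so costs extra trials.
    alpha = (1 - confidence) / len(others)
    z = statistics.NormalDist().inv_cdf(1 - alpha / 2)

    intervals = {}
    for name in others:
        diffs = [b - o for b, o in zip(results[best], results[name])]
        mean = statistics.fmean(diffs)
        std_err = statistics.stdev(diffs) / math.sqrt(len(diffs))
        intervals[name] = (mean - z * std_err, mean + z * std_err)
    return best, intervals
```

If every interval lies above zero, the reference play looks better than each
alternative at roughly the chosen confidence level. Choosing the reference
from the same data adds some selection bias that this sketch ignores.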

Douglas Zare
