Re: [Bug-gnubg] Confidence intervals from rollouts
From: Douglas Zare
Subject: Re: [Bug-gnubg] Confidence intervals from rollouts
Date: Tue, 10 Sep 2002 03:10:53 -0400
User-agent: Internet Messaging Program (IMP) 3.1
Quoting David Montgomery <address@hidden>:
> Douglas Zare wrote
> > There was an interesting question on the Gammonline bulletin board about the
> > standard deviation in cubeless rollouts and the standard deviations of live
> > cube results. I've included an excerpt below, in which I attempt to estimate a
> > confidence interval for the difference between doubling and not doubling. I'm
> > not sure what the best way to do this is, but I suggest that an attempt would
> > be worth implementing in gnu.
>
> Isn't the right way to do this a paired t-test?
> This is what I thought, anyway, after talking with
> Jeremy Bagai about it. Each game forms a pair with
> double and no-double result. Since the two are so
> highly correlated, you should get a much tighter
> confidence interval than the joint standard deviation.
True (unless you use a nonlinear function to convert cubeless to cubeful
numbers). One problem is that the differences are not very close to normally
distributed, so you have to wait for the Central Limit Theorem to kick in,
hope for small tails, or use another test. I would be inclined to set a
minimum number of trials and then hope for small tails, but that's because I
don't know the alternative tests.
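As a rough illustration of the paired approach discussed above, here is a minimal Python sketch. It assumes each rollout game yields one result for doubling and one for not doubling, played with the same dice. The function names, the minimum of 30 trials, and the 95% level (z = 1.96) are illustrative assumptions, not anything taken from gnubg. The interval relies on the normal approximation, so it is only as good as the hope that the tails of the differences are small.

```python
import math
import statistics


def paired_interval(double_results, no_double_results, z=1.96, min_trials=30):
    """Normal-approximation interval for mean(double - no double), paired by game.

    min_trials is an arbitrary illustrative threshold, not a gnubg setting.
    """
    if len(double_results) != len(no_double_results):
        raise ValueError("results must be paired game by game")
    n = len(double_results)
    if n < min_trials:
        raise ValueError("too few trials to lean on the normal approximation")
    # Work with per-game differences so the shared luck cancels out.
    diffs = [d - nd for d, nd in zip(double_results, no_double_results)]
    mean = statistics.fmean(diffs)
    std_err = statistics.stdev(diffs) / math.sqrt(n)
    return mean - z * std_err, mean + z * std_err


def unpaired_interval(double_results, no_double_results, z=1.96):
    """Same interval, but treating the two samples as independent (for contrast)."""
    mean = statistics.fmean(double_results) - statistics.fmean(no_double_results)
    std_err = math.sqrt(
        statistics.variance(double_results) / len(double_results)
        + statistics.variance(no_double_results) / len(no_double_results)
    )
    return mean - z * std_err, mean + z * std_err
```

When the two results of each game are highly correlated, the paired interval should come out much narrower than the unpaired one for the same data.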
> The same thing could be applied to checker plays, but
> it's not as clean, especially for 3+ plays. Then you
> are probably supposed to do some multi-way analysis of
> variance, of which I am unfortunately ignorant.
Yeah, the Central Limit Theorem still works, but some coincidences that make it
easy to do things in one dimension fail. I think most of the complexity goes
away if you just allow yourself to use perhaps twice as much data as a truly
efficient test would take. Since we are spending future processor cycles rather
than analyzing last year's medical trials, I'm not so concerned about it.
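One simple but wasteful way to handle three or more plays, in the spirit of spending extra data rather than finding the most efficient test, is to compare every play against the apparent best one with paired intervals and widen each interval with a Bonferroni correction. This is a substitute for the multi-way analysis of variance mentioned above, not an implementation of it. The sketch assumes all plays were rolled out with the same dice, so results can be paired by game; `compare_to_best` and its parameters are hypothetical names.

```python
import math
import statistics
from statistics import NormalDist


def compare_to_best(results_by_play, confidence=0.95):
    """Paired intervals for (best play - other play), Bonferroni-adjusted.

    results_by_play maps a play name to its per-game results; all lists must
    be the same length and paired by game (same dice). Hypothetical helper.
    """
    names = list(results_by_play)
    n = len(results_by_play[names[0]])
    if any(len(results_by_play[p]) != n for p in names):
        raise ValueError("every play needs one result per game")
    best = max(names, key=lambda p: statistics.fmean(results_by_play[p]))
    others = [p for p in names if p != best]
    if not others:
        return best, {}
    # Split the allowed error rate across the comparisons (conservative).
    alpha = (1.0 - confidence) / len(others)
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    intervals = {}
    for play in others:
        diffs = [b - x for b, x in zip(results_by_play[best], results_by_play[play])]
        mean = statistics.fmean(diffs)
        std_err = statistics.stdev(diffs) / math.sqrt(n)
        intervals[play] = (mean - z * std_err, mean + z * std_err)
    return best, intervals
```

A play whose interval lies entirely above zero is worse than the apparent best at the stated overall confidence, subject to the same normal-approximation caveat as in the one-dimensional case.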
Douglas Zare