Re: [Bug-gnubg] Confidence intervals from rollouts


From: David Montgomery
Subject: Re: [Bug-gnubg] Confidence intervals from rollouts
Date: Wed, 4 Sep 2002 08:22:44 -0700

> > Long ago I did some calculations based on JF rollouts,
> > which use stratified sampling.  I got a negative result
> > -- the data said the differences were larger than you
> > would expect from truly independent random samples,
> > rather than each rollout using stratified sampling.
>
> Could you elaborate on which experiments you did, and which calculations
> you did on the results?

It's been a long time... I think what I did was this....

I have a collection of thousands of JF rollouts in binary
form.  Many of these are positions that were rolled out
1296 games three times (for a total of 3888 games) with
identical parameters (level 5 untruncated) but independent
seeds.

Consider a single position, for which we have 3 rollout
samples A, B, and C.  The idea of rotating the first ply
or two is that the variance of the *difference* between
two plays should be reduced, since one aspect of the
luck has been eliminated.  So, for example, we would
expect that the standard error of abs(A - B) would be
less than sqrt(2)*[true standard error of rollout of
that size of that position].

So I think what I did was to consider that from C
I could get an unbiased estimate of this true standard
error, and then I compared this (* sqrt(2)) with abs(A-B).
I repeated this with abs(C-A) against the estimate from B
and abs(C-B) against the estimate from A, and repeated the
whole thing for hundreds or thousands of positions.
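
A minimal sketch of the comparison described above, assuming
each rollout sample is available as a list of per-game
results. That data layout and the function names are
hypothetical and are not taken from the JF files or from
any rollout program.

```python
import math

def standard_error(results):
    """Standard error of the mean, treating games as independent."""
    n = len(results)
    mean = sum(results) / n
    variance = sum((x - mean) ** 2 for x in results) / (n - 1)
    return math.sqrt(variance / n)

def difference_ratio(a, b, c):
    """abs(mean(a) - mean(b)) divided by sqrt(2) * SE estimated from c."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    return abs(mean_a - mean_b) / (math.sqrt(2) * standard_error(c))

def all_ratios(a, b, c):
    """The three pairings: A-B vs C, C-A vs B, C-B vs A."""
    return [
        difference_ratio(a, b, c),
        difference_ratio(c, a, b),
        difference_ratio(c, b, a),
    ]
```

If the three samples behaved like independent random
samples, the root mean square of these ratios over many
positions should come out close to 1; values clearly above
1 would correspond to the result reported here.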

And the result I got was that the difference had a
greater standard error than sqrt(2)*independent sample.

I'm not certain this is exactly what I did, but it
was something like this.

> Note that stratified sampling (at least in the form
> of perfect distribution on first plies) _should_ increase
> the standard error as calculated by gnu.

But not the standard error of the difference in equity
between two plays, which is what we really care about
(for checker plays).

> > Ah, you just stipulate to cycling through the opening
> > ply or two.  JF's and my rollout code actually does more
> > than this, ensuring both perfectly distributed sampling
> > of the first two ply, and duplicate dice for subsequent ply
> > for every game.  That's what I mean by stratified sampling.
>
> Just to make sure: "My rollout code" above means the one
> in gnubg, right?

No, I have my own backgammon code.  My first bg program
was written in the early 80's on an Apple IIe, and I
bemoaned the fact that my program kept making the 3-point
with an opening 53 no matter how I tweaked it to get it to
play 13/10 13/8! :-)

> Could you elaborate on what "ensuring duplicate dice
> for subsequent ply" means?

Each game is associated with an unlimited stream
or sequence of rolls, like this:

Game 1: 21 32 66 43 54 51 51 ...
Game 2: 43 22 55 31 42 63 65 ...
.
.
.

Now, when you do a rollout of two positions (say two checker
plays from the same position), for both rollouts you use
the Game 1 dice for the first game, Game 2 dice for the second,
etc.  Thus corresponding games in the two rollouts have the
exact same rolls for as deep as the games go, for all games
in the rollout.
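
One way to get this behaviour, as a sketch only: derive an
independent generator for each game from the rollout seed
and the game number. The function names and the way the
seed and game number are combined are assumptions made for
illustration; they do not describe JF, gnubg, or any other
program.

```python
import random
from itertools import islice

def dice_stream(seed, game_index):
    """Endless sequence of rolls fixed by (seed, game_index).

    The seed mixing below is arbitrary; it only needs to give
    each game its own reproducible generator.
    """
    rng = random.Random(seed * 1_000_003 + game_index)
    while True:
        yield (rng.randint(1, 6), rng.randint(1, 6))

def first_rolls(seed, game_index, n):
    """The first n rolls of one game's stream."""
    return list(islice(dice_stream(seed, game_index), n))
```

Two rollouts that both call dice_stream(seed, k) for their
k-th game see the same rolls in that game, no matter how
many rolls the earlier games consumed.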

(I believe Snowie gets this wrong, although I'm not certain.
Above we are having a seed determine the dice in an unlimited
number of streams, each of unlimited length.  I think Snowie
has the seed determine only one stream of dice.  This means
that if you use the same seed, the dice will be duplicate
for the first game, but from then on they will not, because
one game will last longer than the other. I don't know what
gnubg does.)
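
To illustrate the single-stream scenario, here is a small
sketch that computes where each game would start in one
shared stream of dice. It is purely hypothetical and makes
no claim about how Snowie or gnubg actually work; the
function name and the game lengths are made up.

```python
def game_start_offsets(game_lengths):
    """Index in a single shared stream at which each game begins."""
    offsets = []
    position = 0
    for length in game_lengths:
        offsets.append(position)
        position += length  # the next game continues where this one stopped
    return offsets

# Same seed, two plays; only the length of the first game differs.
play_a = game_start_offsets([5, 3, 4])
play_b = game_start_offsets([7, 3, 4])
```

Here play_a is [0, 5, 8] and play_b is [0, 7, 10]: only
the first game starts at the same point in the stream, so
only the first game has duplicate dice.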

Actually, now that I think about it, stratified sampling is
something quite different from duplicate dice, and orthogonal
to it.  It means that for each ply (1, 2, 3, 4, ...) every
roll appears exactly as often as it should (assuming that
the roll from that ply is used some multiple of 36 or 1296
times).  JF and my code do both: duplicate dice based on a
seed, and stratified sampling for all plies.
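
A minimal sketch of stratification over the first two plies
only, assuming a block of 1296 games. The names are
hypothetical, and extending the idea to every ply, as
described above, is not shown.

```python
import random
from itertools import product

# The 36 ordered rolls, so 2-1 and 1-2 are counted separately
# and each non-double is twice as likely as each double.
ROLLS = [(a, b) for a in range(1, 7) for b in range(1, 7)]

def stratified_openings(seed):
    """Opening (first roll, second roll) pairs for 1296 games.

    Every ordered pair of rolls appears exactly once; the seed
    only decides which game gets which pair.
    """
    openings = list(product(ROLLS, repeat=2))
    random.Random(seed).shuffle(openings)
    return openings
```

Within such a block each first roll is used in exactly 36
games and so is each second roll, so the luck in the first
two plies is the same for every play being rolled out.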

David






