Re: [Bug-gnubg] A new variance reduction technique

bug-gnubg

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Bug-gnubg] A new variance reduction technique

From:	Christian Anthon
Subject:	Re: [Bug-gnubg] A new variance reduction technique
Date:	Mon, 30 Mar 2009 11:05:40 +0200

Doable I guess, but nothing we are going to do right now, and the idea
as outlined is complicated to implement with the current multi
threaded code. What one could do is

a) change the current quasi random dice code so that it produces
numbers between 1 and 36 (21?) (currently up till nine rolls I
believe, but squaring the possibilities from 6 to 36 might make only
up to 4-5 rolls possible)
b) use the choose-n'th-best-roll dice generator that we have already
with the numbers obtained from a)
c) calculate the diffs and their uncertainties

2009/3/29 bob koca <address@hidden>:
>
>
>   This was also posted to the forums at bgonline.org,
>
>
>
> Bob Koca
>
>
>
> ---
>
>
>
> I have a method that I think will make backgammon rollouts more efficient in
> that the std. errors will be smaller for the same number of games rolled
> out.  I don’t have the skills though to actually program it. Let me first
> give the basic idea of the theory with an example. To read more about this
> the  key words would be 2 sample t-tests and matched pairs.  Nearly all
> college level introductory stats textbooks would have a chapter on this.
>
>
>
>   Suppose you want to measure the difference  Mh – Fh where Mh is the
> average height of  married males and Fh is the average height of married
> females.  One way to do this is to take a random sample of married males and
> find the sample average and deviation and also find a random sample of
> females and find the sample average and deviation.  If the sample sizes are
> large enough and/or the population itself is close to normal then the sample
> averages each have an approximately normal distribution (with the true
> population average as its mean and deviation determined by its sample
> deviation). The difference is also then normal. This means our best guess
> for the true population difference is just the difference in the sample
> means.  The two samples were taken independently and the variance of the sum
> or difference of two random variables is the sum of the two individual
> variances. If the s dev are equal this gives a s dev of the difference equal
> to SQRT(2) times the STD dev of a single population.  This is essentially
> how rollouts are now done to compare play A to play B.
>
>
>
>   A second way to estimate the difference in heights it is to use matched
> pairs. Instead of choosing males randomly and also females randomly choose
> married couples randomly. For each couple find the difference (male height –
> female height).  This gives us a single list of numbers. Its average will be
> our best guess for the true population difference and the deviation of this
> list can be used to find the std error for our guess.  The std. error will
> be lower for this case as it takes advantage of the positive correlation
> between husband and wife heights.  As a near extreme case imagine that a
> husband is nearly always very close to being 3 inches taller than his wife.
> Use of the matched pairs technique shows this fact quickly. The greater the
> positive correlation the greater benefit is obtained. I believe that the
> matched pairs technique can be used in rollouts and will give a smaller
> standard error for the same number of trials.
>
>
>
>     Suppose one wants to do a rollout to compare play A with play B. What
> would be the meaning of having a positive correlation here? It would mean
> that if game n of say 1296 trials has a certain amount of luck for play A it
> would have about that same amount of luck for game n of the rollout of  play
> B as well. Using the same dice stream is a better than nothing but somewhat
> feeble way to do this as the games could quickly diverge.
>
>
>
>       A better way is to ensure that when play A gets its best roll that
> play B also does and vice versa. The evaluation feature can make this
> possible. Choose a roll for one of the play’s rollouts at random. But then
> instead of using that same roll or another randomly chosen roll for the
> other play’s rollout make it so that the ranks are the same. So suppose that
> the rollout of play A just got its 5th best (out of 36) roll. Determine the
> 5th best roll for the rollout of play B and use that roll. If there is a tie
> break it randomly. At the end of each trial look at result of A – result of
> B.  This gives a single list of numbers. The average value gives your
> estimate for equity of A – equity of B and the s.dev of the list gives the
> std. error. Conventional variance reduction techniques such as Dahl’s
> algorithm, and using each each die roll once could still be used.
>
>
>
>      Pseudocode (or perhaps I should call it pseudo-pseudo code) on how to
> do this is at the end. I am not a programmer so I hope my thoughts are
> interpretable.
>
>
>
>  How to test the value of this once it is has been coded up? Well one could
> just look at the std error of the estimate of the difference between play A
> and B. But I think something like the following would be more
> convincing:  Take several instances of play vs play decisions. RO
> (conventionally) with a huge number of trials so that the true difference in
> the win chances are known very accurately. Now using the same bot RO with
> just 36 trials the play A vs play B choice. Look at the discrepancy between
> the RO result and the true difference we have already found. Do this say 100
> times. The std. deviation of those errors gives a measure of how accurate
> the 36 game ROs  are. Now repeat using the same bot but with the above
> technique.  Compare the std deviation of the error from these rollouts
> against those obtained from the conventional rollouts.
>
>
>
>   PSEUDO CODE to estimate the difference in win% between play A and play B.
>
> DEFNS:
>
> A Strand is a rollout starting with a certain play, here either play A or
> play B.
>
>  A Trial is a certain game being rolled out. For example a rollout might
> have 1296                  trials for each of two strands.
>
>
>
> PROCEDURE: ForEachRoll:
>
>
>
> 1)      Generate a random number  (COMMENT could be done in conjunction with
> the rotation idea for rollouts that are powers of 36)
>
> If the current trial is unresolved for both strands:  Then
>
> 2)      Using 2 ply analysis find the rank out of the 36 possible dice rolls
> for that roll for the rollout in which play A was made initially.  (COMMENT:
> Note that this is not computationally expensive since is needed anyways for
> Dahl V.R.)
>
> 3)      Using two ply analysis find the roll which has the same rank for the
> strand in which  play B was made initially.  If there is a tie then break it
> randomly.
>
> 4)      Using 2 ply analysis find and make the best plays for each.
>
> 5)      Use Dahl VR and adjust the running count for each strand.
>
>
>
> If only one strand is unresolved then,
>
> 2)      Choose a die roll randomly.
>
> 3)      Play the move for that strand using 2-ply
>
> 4)      Use Dahl VR and adjust the running count for each strand.
>
>
>
> PROCEDURE ForEachTrial
>
>
>
> Run ForEachRoll
>
> If Both strands are not resolved Then Run ForEachRoll
>
> If both strands are resolved then record the following in 1XN arrays where N
> = number of trials to run.
>
>
>
> 1)      A = game result for  strand A + Dahl VR adjustment
>
> 2)      B = game result for strand B + Dahl VR adjustment
>
> 3)      Diff = A - B
>
>
>
>
>
>
>
> PROGRAM:  Do full RO
>
> Inputs:  N = number of trials.
>
>
>
> Set Counter = 1
>
> Until Counter = N+1 do:
>
>   Run Procedure ForEachTrial
>
>   Increase N by 1
>
> Use the array A to give estimate of win%  and std error for how often play A
> wins.
>
> Use the array B to give estimate of win% and std. error for how often play B
> wins.
>
> Use the array Diff to give estimate of difference in win% and its std error
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
> ________________________________
> Express your personality in color! Preview and select themes for Hotmail®.
> See how.
> _______________________________________________
> Bug-gnubg mailing list
> address@hidden
> http://lists.gnu.org/mailman/listinfo/bug-gnubg
>
>

[Prev in Thread]

Current Thread

[Next in Thread]

[Bug-gnubg] A new variance reduction technique, bob koca, 2009/03/29
- Re: [Bug-gnubg] A new variance reduction technique, Christian Anthon <=
- Re: [Bug-gnubg] A new variance reduction technique, Massimiliano Maini, 2009/03/30
  - Re: [Bug-gnubg] A new variance reduction technique, Michael Petch, 2009/03/30
    - Re: [Bug-gnubg] A new variance reduction technique, Massimiliano Maini, 2009/03/30
    - Re: [Bug-gnubg] A new variance reduction technique, Christian Anthon, 2009/03/31
- Re: [Bug-gnubg] A new variance reduction technique, Massimiliano Maini, 2009/03/31

Prev by Date: [Bug-gnubg] A new variance reduction technique
Next by Date: [Bug-gnubg] Re: Inconsistent highlighting during game analysis?
Previous by thread: [Bug-gnubg] A new variance reduction technique
Next by thread: Re: [Bug-gnubg] A new variance reduction technique
Index(es):
- Date
- Thread