Re: [Bug-gnubg] A new variance reduction technique

bug-gnubg

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Bug-gnubg] A new variance reduction technique

From:	Massimiliano Maini
Subject:	Re: [Bug-gnubg] A new variance reduction technique
Date:	Mon, 30 Mar 2009 16:14:08 +0200

<This reply too was also posted to bgonline.org>

Hi Bob,

just to be sure I do understand exactly what you are proposing: your trick tries to
give comparable luck to the two plays' games. But if Dahl's Var.Red. is perfect (i.e.
it factors out luck perfectly), this would be unnecessary, right ?

A few comments (assuming my understanding is right):

1. Instead of going for the same rank, you could (at negliectible cost) select
the roll for play B that gives the closest luck to the one obtained for play A.
Should be more efficent than just taking the roll with the same rank.

2. I'm not sure gnubg uses 2ply for luck evaluation. If it uses 0 ply (as I remember,
but I could be wrong, mabe it is 0ply just for match analysis luck evals), introducing
your trick would cost something in terms of speed (actually, quite a lot I would say,
knowing that each additional ply cost more or less a factor 21). If somebody on gnubg
mailing list can confirm that ....

MaX.

address@hidden wrote on 29/03/2009 21:50:15: >
> This was also posted to the forums at bgonline.org,
>
> Bob Koca
>
> ---
>
> I have a method that I think will make backgammon rollouts more > efficient in that the std. errors will be smaller for the same > number of games rolled out. I don’t have the skills though to > actually program it. Let me first give the basic idea of the theory > with an example. To read more about this the key words would be 2 > sample t-tests and matched pairs. Nearly all college level > introductory stats textbooks would have a chapter on this.
>
> Suppose you want to measure the difference Mh – Fh where Mh is > the average height of married males and Fh is the average height of > married females. One way to do this is to take a random sample of > married males and find the sample average and deviation and also > find a random sample of females and find the sample average and > deviation. If the sample sizes are large enough and/or the > population itself is close to normal then the sample averages each > have an approximately normal distribution (with the true population > average as its mean and deviation determined by its sample > deviation). The difference is also then normal. This means our best > guess for the true population difference is just the difference in > the sample means. The two samples were taken independently and the > variance of the sum or difference of two random variables is the sum > of the two individual variances. If the s dev are equal this gives a > s dev of the difference equal to SQRT(2) times the STD dev of a > single population. This is essentially how rollouts are now done to > compare play A to play B.
>
> A second way to estimate the difference in heights it is to use > matched pairs. Instead of choosing males randomly and also females > randomly choose married couples randomly. For each couple find the > difference (male height – female height). This gives us a single > list of numbers. Its average will be our best guess for the true > population difference and the deviation of this list can be used to > find the std error for our guess. The std. error will be lower for > this case as it takes advantage of the positive correlation between > husband and wife heights. As a near extreme case imagine that a > husband is nearly always very close to being 3 inches taller than > his wife. Use of the matched pairs technique shows this fact > quickly. The greater the positive correlation the greater benefit is > obtained. I believe that the matched pairs technique can be used in > rollouts and will give a smaller standard error for the same number > of trials.
>
> Suppose one wants to do a rollout to compare play A with play B. > What would be the meaning of having a positive correlation here? It > would mean that if game n of say 1296 trials has a certain amount of > luck for play A it would have about that same amount of luck for > game n of the rollout of play B as well. Using the same dice stream > is a better than nothing but somewhat feeble way to do this as the > games could quickly diverge.
>
> A better way is to ensure that when play A gets its best roll > that play B also does and vice versa. The evaluation feature can > make this possible. Choose a roll for one of the play’s rollouts at > random. But then instead of using that same roll or another randomly > chosen roll for the other play’s rollout make it so that the ranks > are the same. So suppose that the rollout of play A just got its 5th > best (out of 36) roll. Determine the 5th best roll for the rollout > of play B and use that roll. If there is a tie break it randomly. At > the end of each trial look at result of A – result of B. This gives > a single list of numbers. The average value gives your estimate for > equity of A – equity of B and the s.dev of the list gives the std. > error. Conventional variance reduction techniques such as Dahl’s > algorithm, and using each each die roll once could still be used.
>
> Pseudocode (or perhaps I should call it pseudo-pseudo code) on > how to do this is at the end. I am not a programmer so I hope my > thoughts are interpretable.
>
> How to test the value of this once it is has been coded up? Well > one could just look at the std error of the estimate of the > difference between play A and B. But I think something like the > following would be more convincing: Take several instances of play > vs play decisions. RO (conventionally) with a huge number of trials > so that the true difference in the win chances are known very > accurately. Now using the same bot RO with just 36 trials the play A > vs play B choice. Look at the discrepancy between the RO result and > the true difference we have already found. Do this say 100 times. > The std. deviation of those errors gives a measure of how accurate > the 36 game ROs are. Now repeat using the same bot but with the > above technique. Compare the std deviation of the error from these > rollouts against those obtained from the conventional rollouts.
>
> PSEUDO CODE to estimate the difference in win% between play A and play B.
> DEFNS:
> A Strand is a rollout starting with a certain play, here either play > A or play B.
> A Trial is a certain game being rolled out. For example a rollout > might have 1296 trials for each of two strands.
>
> PROCEDURE: ForEachRoll:
>
> 1) Generate a random number (COMMENT could be done in > conjunction with the rotation idea for rollouts that are powers of 36)
> If the current trial is unresolved for both strands: Then
> 2) Using 2 ply analysis find the rank out of the 36 possible > dice rolls for that roll for the rollout in which play A was made > initially. (COMMENT: Note that this is not computationally > expensive since is needed anyways for Dahl V.R.)
> 3) Using two ply analysis find the roll which has the same rank > for the strand in which play B was made initially. If there is a > tie then break it randomly.
> 4) Using 2 ply analysis find and make the best plays for each.
> 5) Use Dahl VR and adjust the running count for each strand.
>
> If only one strand is unresolved then,
> 2) Choose a die roll randomly.
> 3) Play the move for that strand using 2-ply
> 4) Use Dahl VR and adjust the running count for each strand.
>
> PROCEDURE ForEachTrial
>
> Run ForEachRoll
> If Both strands are not resolved Then Run ForEachRoll
> If both strands are resolved then record the following in 1XN arrays > where N = number of trials to run.
>
> 1) A = game result for strand A + Dahl VR adjustment
> 2) B = game result for strand B + Dahl VR adjustment
> 3) Diff = A - B
>
>
>
> PROGRAM: Do full RO
> Inputs: N = number of trials.
>
> Set Counter = 1
> Until Counter = N+1 do:
> Run Procedure ForEachTrial
> Increase N by 1
> Use the array A to give estimate of win% and std error for how > often play A wins.
> Use the array B to give estimate of win% and std. error for how > often play B wins.
> Use the array Diff to give estimate of difference in win% and its std error

[Prev in Thread]

Current Thread

[Next in Thread]

[Bug-gnubg] A new variance reduction technique, bob koca, 2009/03/29
- Re: [Bug-gnubg] A new variance reduction technique, Christian Anthon, 2009/03/30
- Re: [Bug-gnubg] A new variance reduction technique, Massimiliano Maini <=
  - Re: [Bug-gnubg] A new variance reduction technique, Michael Petch, 2009/03/30
    - Re: [Bug-gnubg] A new variance reduction technique, Massimiliano Maini, 2009/03/30
    - Re: [Bug-gnubg] A new variance reduction technique, Christian Anthon, 2009/03/31
- Re: [Bug-gnubg] A new variance reduction technique, Massimiliano Maini, 2009/03/31

Prev by Date: Re: [Bug-gnubg] How strong is the lookahead?
Next by Date: Re: [Bug-gnubg] A new variance reduction technique
Previous by thread: Re: [Bug-gnubg] A new variance reduction technique
Next by thread: Re: [Bug-gnubg] A new variance reduction technique
Index(es):
- Date
- Thread