Re: [Bug-gnubg] TEST RESULT: Variance Reduction

From: Jim Segrave
Subject: Re: [Bug-gnubg] TEST RESULT: Variance Reduction
Date: Mon, 7 Jul 2003 12:49:33 +0200
User-agent: Mutt/

On Mon 07 Jul 2003 (10:08 +0000), Joern Thyssen wrote:
> On Mon, Jul 07, 2003 at 11:29:15AM +0200, Jim Segrave wrote
> > While I was coding, I was wondering where and how you would get as
> > many as 16 cubeinfo's in a call, but I just decided to live with
> > it. Reducing this would save some memory, as extending rollouts
> > requires keeping the following internal state of the rollout code:
> > 
> > float aarMu[max-cube][rollout-outputs]
> > float aarSigma[max-cube][rollout-outputs]
> > float aarVariance[max-cube][rollout-outputs]
> > float aarResult[max-cube][rollout-outputs]
> > cubeinfo aciLocal[max-cube]
> >       where a cubeinfo is about 8 ints and 4 floats
> > 2 other ints ( no of games rolled out, nSkip for the quasi-random dice)
> > 
> > max-cube is 16, rollout-outputs is 7
> I would say that 
> aarVariance is trivially related to aarSigma
> aarMu is trivially related to aarResult
> furthermore, aarOutput/aarStdDev is also stored, hence
> aarMu is trivially related to aarOutput 
> aarSigma is trivially related to aarStdDev

To save me doing something horrible to the calculations, can you
sketch out how I can rebuild these four arrays from aarOutput and
aarStdDev (presumably also using the count of games rolled out)? 
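
One way such a rebuild could work, sketched under assumptions: suppose
the saved values are a plain mean and a sample standard deviation
(n - 1 denominator) over n games, and the internal state is a running
sum and a running sum of squares. None of this is confirmed for
aarMu/aarSigma/aarVariance/aarResult; the struct and function names
below are hypothetical. The standard deviation is the square root of
accum_variance().

```c
#include <assert.h>

/* Hypothetical accumulator; NOT gnubg's actual internal layout. */
typedef struct {
    int n;          /* games rolled out so far */
    double sum;     /* sum of per-game results */
    double sumsq;   /* sum of squared per-game results */
} accum;

/* Fold one more game result into the running sums. */
static void accum_add(accum *a, double x)
{
    a->n++;
    a->sum += x;
    a->sumsq += x * x;
}

static double accum_mean(const accum *a)
{
    assert(a->n > 0);
    return a->sum / a->n;
}

/* Sample variance (n - 1 denominator), clamped at zero against
 * rounding error. */
static double accum_variance(const accum *a)
{
    double m, v;
    assert(a->n > 1);
    m = accum_mean(a);
    v = (a->sumsq - a->n * m * m) / (a->n - 1);
    return v > 0.0 ? v : 0.0;
}

/* Rebuild the running sums from a saved mean, saved sample standard
 * deviation and game count, so accumulation can resume. */
static accum accum_restore(int n, double mean, double stddev)
{
    accum a;
    assert(n > 0);
    a.n = n;
    a.sum = n * mean;
    a.sumsq = (n - 1) * stddev * stddev + n * mean * mean;
    return a;
}
```

If the stored figure is instead a standard error of the mean, it would
need multiplying by sqrt(n) before being passed to accum_restore().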

> > I'm not convinced that aciLocal[] has to be preserved, I think it can
> > be recreated whenever you resume, but I wanted to be safe.
> aciLocal is "constant" inside RolloutGeneral so you're absolutely right:
> no need to save it!

OK, I can change it to re-fill aciLocal on entry from the passed cubeinfo.
> In principle you should be able to eliminate aarMu, aarSigma, aarVariance,
> aarResult, and aciLocal, leaving only cTrials and nSkip.
That would be slick.
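
To put a rough number on the saving, here is a sketch of the two
layouts using the sizes quoted above (max-cube 16, rollout-outputs 7,
a cubeinfo of about 8 ints and 4 floats). The type names are made up
for illustration and the cubeinfo stand-in only approximates the real
struct.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_CUBE        16
#define ROLLOUT_OUTPUTS  7

/* Stand-in for cubeinfo: "about 8 ints and 4 floats" (approximate). */
typedef struct {
    int   an[8];
    float ar[4];
} cubeinfo_sketch;

/* Everything currently kept in order to extend a rollout. */
typedef struct {
    float aarMu[MAX_CUBE][ROLLOUT_OUTPUTS];
    float aarSigma[MAX_CUBE][ROLLOUT_OUTPUTS];
    float aarVariance[MAX_CUBE][ROLLOUT_OUTPUTS];
    float aarResult[MAX_CUBE][ROLLOUT_OUTPUTS];
    cubeinfo_sketch aciLocal[MAX_CUBE];
    int cTrials;    /* number of games rolled out */
    int nSkip;      /* position in the quasi-random dice sequence */
} full_state;

/* What is left if the arrays can be rebuilt on resume. */
typedef struct {
    int cTrials;
    int nSkip;
} minimal_state;

/* Bytes no longer stored per saved rollout. */
static size_t saved_bytes(void)
{
    assert(sizeof(full_state) > sizeof(minimal_state));
    return sizeof(full_state) - sizeof(minimal_state);
}
```

With 4-byte ints and floats that is roughly 2.5 KB against 8 bytes per
saved rollout.
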
> > I then want to look at a couple of possible extra uses:
> > 
> > When rolling out several moves from the analysis window, instead of
> > doing the first move to completion before starting the next one,
> > roll out one or more games of the first move, then do the second, third,
> > etc. This allows people to know if a rollout to compare two moves is
> > going to be pointless (one move is so clearly better/worse) without
> > having to wait until one move is completely rolled out and the other
> > has been going for a while.
> This is a straightforward generalisation of RolloutGeneral. Currently
> you input one anBoard but multiple cubeinfo. Just change this to input
> multiple anBoards as well. Note that BasicCubefulRollout already handles
> multiple anBoards! So it's next to trivial to change this!
> Cube decision:
> Input: two copies of the same board, two cubeinfo's for nd and dt
> Chequer play decision:
> Input: n different boards, n copies of the current cubeinfo

That's nice - I hadn't registered that. You're quite right that the
bottommost layer is already done; I just looked again at
BasicCubefulRollout. It looks like RolloutGeneral is missing the same
capability only in that anBoard is currently declared as a single
board and there's only one rolloutcontext supplied. Changing this to
expand the number of boards (and then back propagating the changes to
all of the calls to RolloutGeneral) should not be too hard at all. The
only other thing is that (technically at least) we should also pass in
a rolloutcontext per move to be decided.  Whether it is sensible to
compare rollouts of different moves using different settings may be
a legitimate question, but it seems to me we shouldn't prevent it.
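
As a sketch of the interleaving idea only, not of the real
RolloutGeneral interface: each candidate move carries its own settings
and running totals, and a round-robin loop plays a small batch for each
move in turn until every move has reached its own trial count. All
names here (rolloutcontext_sketch, candidate, rollout_interleaved,
demo_trial) are hypothetical, and demo_trial is a placeholder standing
in for playing out one game.

```c
#include <assert.h>

/* Hypothetical per-move settings; the real rolloutcontext has more. */
typedef struct {
    int nTrials;            /* trials wanted for this move */
} rolloutcontext_sketch;

typedef struct {
    rolloutcontext_sketch rc;   /* one context per move */
    int cGames;                 /* trials completed so far */
    double sum;                 /* running sum of results */
} candidate;

/* Plays game iGame of move iMove and returns its result. */
typedef double (*trial_fn)(int iMove, int iGame);

/* Placeholder trial: a fixed result per move, for illustration only. */
static double demo_trial(int iMove, int iGame)
{
    (void) iGame;
    return (double) (iMove + 1);
}

/* Round-robin over the candidates, at most `batch` trials per move per
 * pass, so partial results for every move are available early.
 * Returns the number of passes in which at least one trial was run. */
static int rollout_interleaved(candidate *ac, int n, int batch, trial_fn play)
{
    int passes = 0, active, i, k;

    assert(batch > 0);
    do {
        active = 0;
        for (i = 0; i < n; i++) {
            int todo = ac[i].rc.nTrials - ac[i].cGames;
            if (todo <= 0)
                continue;           /* this move is finished */
            if (todo > batch)
                todo = batch;
            for (k = 0; k < todo; k++) {
                ac[i].sum += play(i, ac[i].cGames);
                ac[i].cGames++;
            }
            active = 1;
        }
        if (active)
            passes++;
    } while (active);
    return passes;
}
```

A caller could check the partial means after each pass and stop early
once one move is clearly ahead; that stopping test is left out here.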

Jim Segrave           address@hidden
