
Re: [Bug-gnubg] TEST RESULT: Variance Reduction


From: Joern Thyssen
Subject: Re: [Bug-gnubg] TEST RESULT: Variance Reduction
Date: Mon, 7 Jul 2003 10:08:37 +0000
User-agent: Mutt/1.4.1i

On Mon, Jul 07, 2003 at 11:29:15AM +0200, Jim Segrave wrote
[snip]
> I have got extending cube rollouts working with the current
> structure. It looks pretty good so far, but it's not heavily tested
> (verifying that a stopped and resumed rollout gives the same result as
> simply doing the same rollout on the old code is very time
> consuming). I've been running simple rollouts and comparing the
> text mode report of results for exact matches in all the stats. 
> 
> I think people will like it. I save the internal state of
> RolloutGeneral as part of the evalstat. If you select rollout on code
> which has already been rolled out, it temporarily sets the rollout
> context from the saved state, but sets the number of games to be
> rolled out and any special stop condition (currently only the stop on
> std deviation) from the current settings. The rollout then simply
> continues. I've tried it with one move where different alternatives
> were rolled out with drastically different rollout settings, each one
> resumed with the appropriate settings. It should also preserve the
> full state of the quasi-random dice setup as well.
> 
> A pleasant side-effect is that the export of rollout results now shows
> the actual number of games rolled out.
> 
> While I was coding, I was wondering where and how you would get as
> many as 16 cubeinfo's in a call, but I just decided to live with
> it. Reducing this would save some memory, as extending rollouts
> requires keeping the following internal state of the rollout code:
> 
> float aarMu[max-cube][rollout-outputs]
> float aarSigma[max-cube][rollout-outputs]
> float aarVariance[max-cube][rollout-outputs]
> float aarResult[max-cube][rollout-outputs]
> cubeinfo aciLocal[max-cube]
>       where a cubeinfo is about 8 ints and 4 floats
> 2 other ints ( no of games rolled out, nSkip for the quasi-random dice)
> 
> max-cube is 16, rollout-outputs is 7

I would say that 

aarVariance is trivially related to aarSigma
aarMu is trivially related to aarResult

furthermore, aarOutput/aarStdDev are also stored, hence

aarMu is trivially related to aarOutput 
aarSigma is trivially related to aarStdDev

> I'm not convinced that aciLocal[] has to be preserved, I think it can
> be recreated whenever you resume, but I wanted to be safe.

aciLocal is "constant" inside RolloutGeneral so you're absolutely right:
no need to save it!

In principle you should be able to eliminate aarMu, aarSigma, aarVariance,
aarResult, and aciLocal, leaving only cTrials and nSkip.
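A minimal sketch of what "trivially related" could mean when resuming.
It assumes (this is an assumption, not a statement about the actual
rollout code) that aarResult is a running sum, aarMu is that sum divided
by the number of trials, and aarSigma is sqrt(aarVariance / cTrials).
The struct rollout_work and the function restore_work are hypothetical
names used only for illustration.

```c
#include <assert.h>
#include <math.h>

#define MAX_CUBE 16        /* max-cube from the quoted message */
#define ROLLOUT_OUTPUTS 7  /* rollout-outputs from the quoted message */

/* Hypothetical working state; array names mirror the message. */
typedef struct {
    float aarResult[MAX_CUBE][ROLLOUT_OUTPUTS];   /* assumed: running sum */
    float aarVariance[MAX_CUBE][ROLLOUT_OUTPUTS]; /* assumed: variance estimate */
    float aarMu[MAX_CUBE][ROLLOUT_OUTPUTS];       /* assumed: mean so far */
    float aarSigma[MAX_CUBE][ROLLOUT_OUTPUTS];    /* assumed: sqrt(variance / trials) */
} rollout_work;

/* Rebuild the working arrays from the stored outputs, the stored
 * standard deviations and the number of trials already played. */
void restore_work(rollout_work *w,
                  float aarOutput[][ROLLOUT_OUTPUTS],
                  float aarStdDev[][ROLLOUT_OUTPUTS],
                  int cCubes, int cTrials)
{
    int ici, j;

    assert(cTrials > 0 && cCubes > 0 && cCubes <= MAX_CUBE);

    for (ici = 0; ici < cCubes; ici++) {
        for (j = 0; j < ROLLOUT_OUTPUTS; j++) {
            float mu = aarOutput[ici][j];
            float sigma = aarStdDev[ici][j];

            w->aarMu[ici][j] = mu;
            w->aarSigma[ici][j] = sigma;
            /* Invert mu = sum / n and sigma = sqrt(variance / n). */
            w->aarResult[ici][j] = mu * (float) cTrials;
            w->aarVariance[ici][j] = sigma * sigma * (float) cTrials;
        }
    }
}
```

Under these assumptions only cTrials and nSkip need to be saved next to
the results. Note that the rebuilt sums are subject to float rounding,
so a resumed rollout may not match an uninterrupted one bit for bit.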


> So it's adding just under 2Kbytes/eval setup, whether or not it's a
> rollout. It would be possible to avoid the overhead for non-rollout
> evalsetups, but I didn't want to try this straight away as it requires
> a lot of work to ensure that the memory is malloced whenever it's
> needed and freed whenever an analysis is re-done or a moverecord is
> deleted. 

With proper elimination this should be reduced to 8 bytes/evalsetup as
you get rid of all the max-cube size elements.
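The arithmetic can be checked with sizeof. The sketch below assumes
4-byte float and int and leaves aciLocal aside, since the message gives
only an approximate size for a cubeinfo. The struct names full_state and
reduced_state and the function bytes_saved are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_CUBE 16        /* max-cube from the quoted message */
#define ROLLOUT_OUTPUTS 7  /* rollout-outputs from the quoted message */

/* Hypothetical: the four float arrays plus the two counters. */
typedef struct {
    float aarMu[MAX_CUBE][ROLLOUT_OUTPUTS];
    float aarSigma[MAX_CUBE][ROLLOUT_OUTPUTS];
    float aarVariance[MAX_CUBE][ROLLOUT_OUTPUTS];
    float aarResult[MAX_CUBE][ROLLOUT_OUTPUTS];
    int cTrials;  /* number of games rolled out so far */
    int nSkip;    /* position in the quasi-random dice sequence */
} full_state;

/* Hypothetical: what remains once the arrays are eliminated. */
typedef struct {
    int cTrials;
    int nSkip;
} reduced_state;

/* Bytes saved per evalsetup by keeping only the two counters. */
size_t bytes_saved(void)
{
    assert(sizeof(full_state) > sizeof(reduced_state));
    return sizeof(full_state) - sizeof(reduced_state);
}
```

With 4-byte floats the four arrays take 4 * 16 * 7 * 4 = 1792 bytes,
and the two ints that remain take 8 bytes.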


> I then want to look at a couple of possible extra uses:
> 
> When rolling out several moves from the analysis window, instead of
> doing the first move to completion before starting the next one,
> roll out one or more of the first move, then do the second, third,
> etc. This allows people to know if a rollout to compare two moves is
> going to be pointless (one move is so clearly better/worse) without
> having to wait until one move is completely rolled out and the other
> has been going for a while.

This is a straightforward generalisation of RolloutGeneral. Currently
you input one anBoard but multiple cubeinfo. Just change this to input
multiple anBoards as well. Note that BasicCubefulRollout already handles
multiple anBoards! So it's next to trivial to change this!

Cube decision:

Input: two copies of the same board, two cubeinfo's for nd and dt

Chequer play decision:

Input: n different boards, n copies of the current cubeinfo
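A rough sketch of how the two kinds of input could be set up for such
a generalised routine. Everything here is hypothetical: the types
board and cubeinfo_sketch are simplified stand-ins, and rollout_input,
setup_cube_decision and setup_chequer_play are not existing gnubg
functions. The board layout (two sides of 25 points) is an assumption.

```c
#include <assert.h>
#include <string.h>

#define MAX_ITEMS 16  /* assumed upper bound on boards per call */

typedef int board[2][25];  /* assumed layout: two sides, 25 points each */

/* Hypothetical stand-in for the real cubeinfo. */
typedef struct {
    int nCube;       /* cube value */
    int fCubeOwner;  /* cube owner */
} cubeinfo_sketch;

/* One board and one cubeinfo per item to be rolled out. */
typedef struct {
    int cItems;
    board aanBoard[MAX_ITEMS];
    cubeinfo_sketch aci[MAX_ITEMS];
} rollout_input;

/* Cube decision: two copies of the same board, cubeinfos for nd and dt. */
void setup_cube_decision(rollout_input *in, board anBoard,
                         cubeinfo_sketch *pciNoDouble,
                         cubeinfo_sketch *pciDoubleTake)
{
    in->cItems = 2;
    memcpy(in->aanBoard[0], anBoard, sizeof(board));
    memcpy(in->aanBoard[1], anBoard, sizeof(board));
    in->aci[0] = *pciNoDouble;
    in->aci[1] = *pciDoubleTake;
}

/* Chequer play: n different boards, n copies of the current cubeinfo. */
void setup_chequer_play(rollout_input *in, board aanMoves[], int cMoves,
                        cubeinfo_sketch *pci)
{
    int i;

    assert(cMoves > 0 && cMoves <= MAX_ITEMS);

    in->cItems = cMoves;
    for (i = 0; i < cMoves; i++) {
        memcpy(in->aanBoard[i], aanMoves[i], sizeof(board));
        in->aci[i] = *pci;
    }
}
```

A trial loop that plays one game for every item before moving on to
the next trial would then give the interleaved progress across moves
that you describe.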

Jørn



