I've just committed a mega-patch to include this feature. Here are some
notes to go with it. I look forward to people testing it.
This code does the following:
For any rollout, the saved results include a rollout context which is
extended with two additional fields - the number of trials actually
rolled out and the value of 'nSkip' which affects the quasi-random
dice generation.
When rolling out a move/cube decision, gnubg now looks to see if the
last analysis was a rollout. If so, it initiates the rollout using the
saved context except that:
The number of trials is taken from your current rollout settings.
The settings for options to stop rollouts early (currently only the
stop when the STD is small enough) are taken from your current rollout
settings, not from the saved rollout context.
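The rule above can be sketched in C roughly as follows. The struct and its
field names (rollout_ctx_sketch, nTrials, nGamesDone, fStopOnStd, rStdLimit,
nSeed) are hypothetical and chosen for illustration; only nSkip is named in
these notes, and gnubg's real rollout context has more fields than this.

```c
#include <assert.h>

/* Hypothetical, simplified rollout context; not gnubg's real struct. */
typedef struct {
    int nTrials;          /* target number of trials */
    int nGamesDone;       /* trials actually rolled out so far (new field) */
    int nSkip;            /* affects quasi-random dice generation (new field) */
    int fStopOnStd;       /* stop early when the std is small enough? */
    double rStdLimit;     /* threshold for the early stop */
    unsigned long nSeed;  /* RNG seed used by the rollout */
} rollout_ctx_sketch;

/* Build the context used to resume a rollout: start from the saved
 * context, then take the trial count and the early-stop options from the
 * current settings. Everything else stays as saved. */
static rollout_ctx_sketch
resume_context(const rollout_ctx_sketch *saved,
               const rollout_ctx_sketch *current)
{
    rollout_ctx_sketch rc = *saved;

    assert(saved->nGamesDone >= 0);
    rc.nTrials = current->nTrials;
    rc.fStopOnStd = current->fStopOnStd;
    rc.rStdLimit = current->rStdLimit;
    return rc;
}
```

With a saved context stopped at 360 trials and current settings asking for
1296, the resumed context keeps the saved seed, nSkip and completed-trial
count, and only the target and the stop options change.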
.sgf files also save the complete rollout context. The values of
win/win gammon/.../cubeless equity/cubeful equity, their standard
deviations and the values of rScore and rScore2 are now saved using
%.10g format, which was my guess at what would be sufficient accuracy
to resume rollouts.
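To get a feel for what %.10g preserves, here is a small C sketch that writes
a value with that format and reads it back. The helper name roundtrip_error
is made up for this example and is not part of gnubg. Ten significant digits
is not an exact round trip for a C double, but the error is tiny.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Write a value with the "%.10g" format, read it back, and return the
 * absolute round-trip error. */
static double
roundtrip_error(double x)
{
    char buf[64];
    double d;
    int n = snprintf(buf, sizeof buf, "%.10g", x);

    assert(n > 0 && (size_t) n < sizeof buf);
    d = strtod(buf, NULL) - x;
    return d < 0.0 ? -d : d;
}
```

A value such as 0.5 survives unchanged, while 1.0 / 3.0 comes back as
0.3333333333, off by a few parts in 10^11.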
When a rollout is running, if you press stop, the number of completed
trials is recorded.
So, to extend a rollout, you don't need to remember what the previous
settings were. You select the number of trials you want the
rollout to go to and it just works.
When rolling out multiple alternative moves, all the moves are done in
step rather than the old way of completing the rollout of the first
alternative before starting the second. (It's a bit more complicated
when extending one rollout and starting or extending a different
move's rollout, but the effect is the same: moves/decisions with a
lower number of completed trials are done until they catch up with
those which are being extended, then the process proceeds in
parallel.)
.sgf files from earlier versions will be readable by this code, but
their rollouts won't be extendable. .sgf files which contain rollouts
written by this code will not be readable by older versions (and in
some cases may crash them). .sgf files written by this code which
don't contain rollouts will be fully interchangeable with older
versions of gnubg.
Caveats:
Testing rollouts is very slow, so I can't claim this code is
bug-free. I have done a large number (in the hundreds) of rollouts,
primarily of a small number of positions and compared the results
to those of the version of gnubg just before these changes. This
does not mean that there can't be bugs - RolloutGeneral has been
heavily modified, as have CommandRollout, GeneralCubeDecisionR and
a few others. Anyone finding bugs, please report them.
Accuracy:
Resuming a rollout costs a small amount in accuracy. Since only the
value and standard deviation are saved, some of the internal variables
used in the rollout loop are reconstructed from them. In particular,
the accumulated variance and the accumulated sum are reconstructed and
slight differences will appear. My experiments, necessarily limited in
length and number, suggest that the effects are at most visible in the
4th decimal place.
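As an illustration of the kind of reconstruction involved, here is a C
sketch. It is not the code in rollout.c: the names accum_sketch and
reconstruct are hypothetical, and it assumes the saved std is the standard
error of the mean, se = sqrt(var / n), with the sample variance
var = (sumsq - n * mean^2) / (n - 1).

```c
#include <assert.h>

/* Hypothetical accumulator for one output of a rollout. */
typedef struct {
    double sum;    /* sum of the per-trial results */
    double sumsq;  /* sum of the squared per-trial results */
} accum_sketch;

/* Rebuild the accumulators from a saved mean and standard error after n
 * trials, under the assumptions stated above:
 *   sum   = mean * n
 *   sumsq = se^2 * n * (n - 1) + n * mean^2 */
static accum_sketch
reconstruct(double mean, double se, int n)
{
    accum_sketch a;

    assert(n > 1);
    a.sum = mean * n;
    a.sumsq = se * se * n * (n - 1.0) + n * mean * mean;
    return a;
}
```

Because mean and se were themselves rounded when saved, the rebuilt sums
differ slightly from the ones the uninterrupted loop would have held, which
is where the small loss of accuracy comes from.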
.sgf files containing rollouts will be slightly larger, because they
contain a complete rollout context and the values are stored with
greater precision.
Repeatability:
Every extended rollout will use the same seed as the initial part of
that rollout. If you are using quasi-random dice, then the games
rolled out (as long as there are fewer than 128 rolls in any one game)
will get the exact same sequence of dice as they would have got had
the rollout not been interrupted. This does assume you are using a
repeatable RNG; if not, the results are unpredictable, but it's hard
to argue that they are intrinsically different from what you would have
got if you hadn't interrupted the rollout.
Misc things:
If you have a match where different moves have been rolled out with
different rollout settings (whether it's truncation, move filters, rng
seed, or whatever), then each one will be extended using the corresponding
rollout settings.
If you have two different moves which have been rolled out to a
different number of trials (say one move was stopped at 360 trials,
another at 720), and you set your rollout for 1296 games, the
rollout will begin by rolling out the 360-trial move until it reaches
720, then both moves will progress together.
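The catch-up behaviour in this example can be sketched in C as below. This
is only a model of the scheduling, not the loop in RolloutGeneral; the
function name step_in_parallel and its array-of-counts interface are
invented for the illustration.

```c
#include <assert.h>

/* Run one scheduling step over n candidates: advance by one trial every
 * candidate that is at the current lowest count, provided that count is
 * below the target. Returns the number of candidates advanced, which is
 * 0 once every candidate has reached the target. */
static int
step_in_parallel(int done[], int n, int target)
{
    int i, lowest, advanced = 0;

    assert(n > 0);
    lowest = done[0];
    for (i = 1; i < n; i++)
        if (done[i] < lowest)
            lowest = done[i];
    if (lowest >= target)
        return 0;
    for (i = 0; i < n; i++)
        if (done[i] == lowest) {
            done[i]++;    /* a real rollout would play one trial here */
            advanced++;
        }
    return advanced;
}
```

Starting from counts of 360 and 720 with a target of 1296, the first 360
steps advance only the first move; after that each step advances both until
they reach 1296.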
Output of rollout results will now show the number of trials
actually done, not the number originally requested.
The only way to roll a move out with different settings is to get rid
of the previous rollout results. The simplest way is to press the
0-ply button, then select the move again and press rollout.
The code in RolloutGeneral() can now take an arbitrary list of
positions, cubeinfos and rollout contexts and roll them out in one
pass. While there's currently no way in gnubg to select positions from
more than one point in a game or match, if there were a way, then all
the selected positions could be rolled out with a single call to
RolloutGeneral.
Limitations:
Cube rollouts from the Annotation Window, results from Command
Rollout, and the Rollout commands in the Analysis dropdown are not
put into the move list and hence are lost. It would be good if this
data were put into the moverecords.
Implementation notes:
The calls to RolloutGeneral (and in turn the calls to callers thereof)
now allow passing in multiple boards, eval setups, cubeinfos, result
arrays, etc. There is no longer an assumption that these multiple
items actually form a single array, so all the callers now pass
arrays of pointers to the various items. This makes call setup more
tedious, but gives the flexibility to pass pointers to the relevant
pieces of a large number of separate moves to a single call. See the
lengthy comments just before RolloutGeneral in rollout.c.
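The calling convention can be pictured with the following C sketch. It does
not reproduce RolloutGeneral's real signature (see rollout.c for that); the
type move_sketch and the function rollout_many_sketch are hypothetical
stand-ins, and the "rollout" here just stores the item index as a
placeholder result.

```c
#include <assert.h>

/* Hypothetical stand-in for a move record that owns its own board and
 * result slot. */
typedef struct {
    int anBoard[2][25];
    float arEquity;       /* result slot written by the rollout */
} move_sketch;

/* Takes arrays of pointers, so the boards and result slots may live
 * anywhere; they need not be elements of one contiguous array. */
static void
rollout_many_sketch(int (*apBoard[])[25], float *apResult[], int n)
{
    int i;

    assert(n >= 0);
    for (i = 0; i < n; i++) {
        (void) apBoard[i];          /* a real rollout would read the board */
        *apResult[i] = (float) i;   /* placeholder result */
    }
}
```

A caller with two unrelated move records fills in
`int (*boards[2])[25] = { a.anBoard, b.anBoard }` and
`float *results[2] = { &a.arEquity, &b.arEquity }` and makes one call. The
setup is more tedious than passing one array, but nothing has to be copied
into a contiguous block first.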
In a couple of places, I avoided the use of gnubg's dynamic arrays in
favour of calls to alloca(). This is simply because debugging with
dynamic arrays is sometimes very difficult - the debugger symbol table
address for an item is not necessarily the location where the item
actually resides, whereas this problem doesn't occur with alloca.
Whoever did the quasi-random dice code deserves special honours - it
fits in so well with this system and already optimises some of the
most common things - like checking if the rng seed used to set it up
is the same. This is really nice code.
Because we may now be using the rollout context saved in .sgf
files, I have done two things:
Instead of the old indicator 'R ' to mark a rollout, the newer ones
use 'X ' for eXtendable rollouts. I have also added an integer
version number (set to 1), defined in eval.h. sgf.c will ensure
that if the version numbers don't match, the rollout context will
be marked as not being extendable.
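The check amounts to something like the C sketch below. This is not the
parsing code in sgf.c: the function is_extendable and the macro
SKETCH_ROLLOUT_VERSION are names made up for this example, and the version
number is passed in separately rather than parsed from the file.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical name; these notes only say the value is 1 and that the
 * real definition lives in eval.h. */
#define SKETCH_ROLLOUT_VERSION 1

/* Decide whether a saved rollout can be extended. szTag is the start of
 * the saved record: "R " marks an old-style rollout, "X " an extendable
 * one. A version mismatch also makes the rollout non-extendable. */
static int
is_extendable(const char *szTag, int nVersion)
{
    assert(szTag != NULL);
    if (strncmp(szTag, "X ", 2) != 0)
        return 0;         /* old 'R ' record, or something else */
    return nVersion == SKETCH_ROLLOUT_VERSION;
}
```

An old 'R ' record is still readable under this scheme; it is only treated
as a finished rollout that cannot be resumed.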
Future plans:
I'd like to discard the current 'stop when std is less than...' and
replace it with a condition to be used only when rolling out more
than one move or a cube decision. The idea would be to set a
minimum number of trials, then, once those are completed, at the
end of each iteration, calculate the joint standard deviation of
the cubeful or cubeless equity for the best move paired with each
other one being rolled out. When the equity difference exceeds
some user chosen multiple of the joint standard deviation, stop
rolling out the inferior move. Continue until only one move is
left. The idea would be to automate rolling out close decisions
until either your patience is exhausted (the number of trials
selected has been reached) or there is reasonable confidence
that one move is actually better than any of the others being
considered.
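A minimal C sketch of that stopping test follows. It is only a proposal
sketch, not existing gnubg code: the function name can_drop is made up, and
it assumes the joint standard deviation of two equity estimates is
sqrt(sdBest^2 + sdOther^2), which treats the two estimates as independent.

```c
#include <assert.h>

/* Returns 1 when 'other' can be dropped: the best move leads by more
 * than k joint standard deviations, where
 *   jsd = sqrt(sdBest^2 + sdOther^2)   (independence assumed).
 * Squares are compared so that no sqrt() call is needed. */
static int
can_drop(double eqBest, double sdBest,
         double eqOther, double sdOther, double k)
{
    double diff = eqBest - eqOther;
    double jsd2 = sdBest * sdBest + sdOther * sdOther;

    assert(k >= 0.0 && sdBest >= 0.0 && sdOther >= 0.0);
    if (diff <= 0.0)
        return 0;         /* 'other' is not behind the best move */
    return diff * diff > k * k * jsd2;
}
```

For example, with equities 0.10 and 0.05, standard deviations 0.003 and
0.004 (joint 0.005) and k = 2, the difference of 0.05 exceeds 0.01 and the
inferior move would be dropped. With standard deviations ten times larger
it would stay in the rollout.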
It would be nice to fix CommandRollout and the cube decision code
to save their results.