RE: [Bug-gnubg] Some Questions about rollouts.
From: David Montgomery
Subject: RE: [Bug-gnubg] Some Questions about rollouts.
Date: Mon, 4 Nov 2002 08:36:41 -0800
> Do you have code for other reductions, e.g., 66%, 50% or 25%? Also, it
> would be very nice to make this work for 3-ply evaluations as well.
>
> Jørn
I've attached my files for lookahead roll lists. They include code
for 50% and 25% sampling as well. I think it would be relatively
straightforward to add these for 2-ply eval.
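As a rough illustration of what reduced roll sampling can look like, here is a minimal sketch. It is not taken from larolls.c: the names roll_t, fill_rolls and sample_rolls are hypothetical, and the simple "every stride-th roll" selection is an assumption chosen for brevity. A real implementation would probably pick groups that balance doubles and non-doubles.

```c
#include <assert.h>

#define NUM_ROLLS 21  /* distinct backgammon rolls: 6 doubles + 15 non-doubles */

/* Hypothetical roll record: the two dice and how many of the 36 outcomes it covers. */
typedef struct { int die1, die2, weight; } roll_t;

/* Fill rolls[] with the 21 distinct rolls; doubles weigh 1, non-doubles 2. */
static void fill_rolls(roll_t rolls[NUM_ROLLS]) {
    int n = 0;
    for (int a = 1; a <= 6; a++) {
        for (int b = a; b <= 6; b++) {
            rolls[n].die1 = a;
            rolls[n].die2 = b;
            rolls[n].weight = (a == b) ? 1 : 2;
            n++;
        }
    }
}

/*
 * Copy every stride-th roll, starting at offset, into out[] and return how
 * many were selected. A stride of 2 gives roughly 50% of the rolls and a
 * stride of 4 roughly 25%. Changing offset between calls rotates through
 * the groups so that each roll is eventually visited.
 */
static int sample_rolls(const roll_t all[NUM_ROLLS], int stride, int offset,
                        roll_t out[NUM_ROLLS]) {
    int n = 0;
    assert(stride >= 1 && offset >= 0 && offset < stride);
    for (int i = offset; i < NUM_ROLLS; i += stride)
        out[n++] = all[i];
    return n;
}
```

With a stride of 2, offsets 0 and 1 select 11 and 10 rolls respectively, so alternating the offset covers all 21 rolls over two evaluations.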
In my program it is intended to work easily with deeper evaluations.
You can say that you want 50% at the 2-ply level, and then 25% at
the 3-ply level, for example. But I think this interacts a lot with
how you do lookahead evaluations, so I'm not prepared to say this
is easy. One thing that I found to be important, as mentioned in the
comments, was to use a stack of rotation counters: you don't want
your iteration over the 3-ply groups to affect your iteration over
the 2-ply groups.
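To make the idea of a stack of rotation counters concrete, here is a minimal sketch. It is an assumption about the general shape, not the code in larolls.c: the rot_ names, MAX_LEVELS, and the one-counter-per-level layout are all hypothetical. The point it shows is that rotating at a deeper level leaves the shallower level's counter untouched.

```c
#include <assert.h>

#define MAX_LEVELS 8  /* hypothetical limit on nested lookahead levels */

/* One rotation counter per lookahead level. */
static int rotation[MAX_LEVELS];
/* Stack of level indices; the top entry selects the active counter. */
static int levelStack[MAX_LEVELS];
static int stackSize = 0;

/* Reset all counters and start at level 0. */
static void rot_restart(void) {
    for (int i = 0; i < MAX_LEVELS; i++)
        rotation[i] = 0;
    stackSize = 0;
    levelStack[stackSize++] = 0;
}

/* Level whose counter is currently active. */
static int rot_top(void) {
    assert(stackSize > 0);
    return levelStack[stackSize - 1];
}

/* Make the given level's counter active until the matching rot_pop(). */
static void rot_push(int level) {
    assert(stackSize < MAX_LEVELS);
    assert(level >= 0 && level < MAX_LEVELS);
    levelStack[stackSize++] = level;
}

/* Return to the previously active level. */
static void rot_pop(void) {
    assert(stackSize > 1);
    --stackSize;
}

/* Advance only the active level's counter, wrapping at numGroups. */
static void rot_rotate(int numGroups) {
    int level = rot_top();
    rotation[level] = (rotation[level] + 1) % numGroups;
}

/* Group index to sample at the active level. */
static int rot_current(void) {
    return rotation[rot_top()];
}
```

A caller would push a level before evaluating at that depth, rotate to pick the next group of rolls, and pop afterwards, so each level keeps cycling through its own groups independently.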
You can see how I use larolls in the function below; just scan
for the larl_ prefix.
static void
laeval_eval( eBoard_t *eb, int ply ) {
/*
* Ply is at least 1... this is lookahead evaluation.
*
* bep = basic (no lookahead) evaluation procedure
* claep = candidate lookahead evaluation procedure
* flaep = final lookahead evaluation procedure
*
* If ply is 1, no claep, no flaep, no pruning... just evaluate all the
* moves with bep and store the average.
*
* If ply is 2, no claep. Flaep is 1-ply lookahead. Evaluate the moves
* with bep and apply post-basic pruning, then evaluate the survivors with
* flaep and store the average.
*
* If ply > 2, then...
* 1) evaluate with bep, apply post-basic pruning
* 2) claep: evaluate with ply-2 lookahead. This recurses, so plays are
*    evaluated with 1, ..., ply-3 lookahead first, pruning using the
*    candidate pruning policy for the appropriate depth. Finally, after
*    ply-2 lookahead is done, the finalPruningPolicy[depth] is applied.
* 3) flaep: evaluate with ply-1 lookahead. The average over this is the
*    ply lookahead.
*
*/
static int depth = -1;
const eProcedure_t *bep;
eProcedure_t claep;
eProcedure_t flaep;
eMoveList_t *eml[ MAX_LA_ROLL_LISTS ];
int wt[ MAX_LA_ROLL_LISTS ];
int rollIndexer[21];
int numLists;
boolean needEvals;
int count;
int i;
/* --- get basic eval procedures --- */
if ( depth == 0 || fixedEvalProcedure )
    bep = startBoardEvalProcedure;
else
    bep = getNoLookaheadEvalProcedure( eb->b );
assume( bep->evalLength == startBoardEvalProcedure->evalLength );
/*
* Handle game over as a special case, so that we can
* be sure not to advance the game beyond the move that
* ends it. Otherwise you could end up with it looking
* like a win for me, for you, for me, etc., depending
* on depth of lookahead.
*/
if ( gameOver(eb->b) ) {
assume( eb->b[ POS_BORNE_OFF ] == 15 );
bep->gameOverEval(eb);
return;
}
/* --- prepare for lookahead --- */
++depth;
eml_setLAlistMemory( eCellMemory[depth], eBoardMemory[depth] );
if ( twoSided ) flopBoard( eb->b );
if ( ply == rootPly ) larl_restart();
needEvals = ( ply+depth == rootPly );
if ( depth == 2 ) printf( "Depth 2\n" );
/* --- set eProcedures outside loop --- */
if ( ply > 2 && postBasicPruningPolicy[depth].maxCandidates != 1 ) {
claep.numSteps = ply-2;
claep.ev = candidateEvaluators[ply-1];
claep.pp = (pruningPolicy_t*) candidatePruningPolicy[depth]; // const cast
claep.required = nothingRequired;
}
if ( ply > 1 ) {
flaep.numSteps = 1;
flaep.ev = &finalEvaluator[ply-1];
}
/* --- initialize accumulators --- */
count = 0;
zeroEboardEval( eb, needEvals );
/* --- get the move lists --- */
larl_getMoveLists( eb->b, eml, wt, &numLists, rollIndexer );
/* --- store overall info for la breakdown --- */
if ( ply == rootPly && storeBreakdown ) {
assume( eb->laBreakdown == NULL );
eb->laBreakdown = newMemory( sizeof(laBreakdown_t) );
eb->laBreakdown->evalLength = bep->evalLength;
eb->laBreakdown->numVariations = numLists;
memcpy( eb->laBreakdown->rollIndexer, rollIndexer, sizeof(int)*21 );
}
/* --- do each list --- */
for ( i=0; i < numLists; i++ ) {
/* --- basic eval --- */
if ( eml[i]->length > 1 || ply == 1 )
evaluatePlays( eml[i], bep, ply == 1 ); //lint !e730 boolean arg to fn
/* --- lookahead candidate pruning --- */
if ( ply > 2 ) {
eml_prune( eml[i], &postBasicPruningPolicy[depth] );
if ( eml[i]->length > 1 ) {
larl_pushDepth( 0 );
assume( claep.numSteps == ply-2 ); //lint !e644 lint thinks claep might not be initialized
evaluatePlays( eml[i], &claep, FALSE );
larl_popDepth();
}
}
/* --- final lookahead scoring --- */
if ( ply > 1 ) {
larl_pushDepth( larl_depthStackTop() + 1 );
larl_rotateRollList();
eml_prune( eml[i], &finalPruningPolicy[depth] );
assume( flaep.numSteps == 1 ); //lint !e644 lint thinks flaep might not be initialized
evaluatePlays( eml[i], &flaep, TRUE );
larl_popDepth();
}
/* --- accumulate evaluations --- */
accumulateEboardEval( eb, eml[i]->best->eb, wt[i], needEvals );
count += wt[i];
}
if ( ply == rootPly && storeBreakdown ) {
for ( i=0; i < numLists; i++ )
memcpy( eb->laBreakdown->eval[i], eml[i]->best->eb->eval,
sizeof(float)*(unsigned)bep->evalLength );
}
/* --- normalize evaluation --- */
divideEboardEval( eb, count, needEvals );
/* --- clean up --- */
eb->info[0] = -ply;
eml_purgeLAlistMemory( eml, numLists );
if ( twoSided ) {
flopBoard( eb->b );
startBoardEvalProcedure->flopEval( eb, needEvals );
}
--depth;
}
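The accumulate-then-normalize steps in the function above amount to a weighted mean over the roll lists: each list's best evaluation is added in with its weight, and the sum is divided by the total weight. The sketch below shows that arithmetic on a single float per list. The name weighted_mean is hypothetical, and the real accumulateEboardEval and divideEboardEval presumably work on whole evaluation vectors, so treat this as an illustration only.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Weighted mean of the best evaluation found for each roll list.
 * best[i] is the best play's evaluation for list i and wt[i] its weight
 * (for a full enumeration, typically 1 for doubles and 2 for non-doubles).
 * Returns 0 when there is nothing to average.
 */
static float weighted_mean(const float *best, const int *wt, size_t n) {
    float sum = 0.0f;
    int count = 0;
    assert(n == 0 || (best != NULL && wt != NULL));
    for (size_t i = 0; i < n; i++) {
        sum += best[i] * (float)wt[i];   /* accumulate, as in the loop above */
        count += wt[i];
    }
    return count > 0 ? sum / (float)count : 0.0f;  /* normalize */
}
```

Because the divisor is the sum of the weights actually used, the same code gives a correctly scaled average whether all 21 rolls or only a sampled subset are evaluated.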
David
Attachments: larolls.h, larolls.c (text documents)