bug-gnubg
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Bug-gnubg] Some more code-related questions


From: Jim Segrave
Subject: Re: [Bug-gnubg] Some more code-related questions
Date: Sun, 26 Oct 2003 02:35:18 +0100
User-agent: Mutt/1.4.1i

On Sat 25 Oct 2003 (17:06 -0600), Thomas Hauk wrote:
> On Sat, 25 Oct 2003, Jim Segrave wrote:
> > > How does gnubg's search work? Is it basically a beam search, and if so, 
> > > how are candidates chosen? How many are kept?
> 
> [snip explanation]
> 
> Thanks, that was helpful. I may check back again for clarification after I 
> write that part up in my thesis.
> 
> > > How do the cubeinfo and evalcontext structs work?
> > 
> > I'm not sure I understand the question - cubeinfo's contain the
> > current cube owner, cube value, match score, rule sets etc. This is
> > the data needed to do cubeful equity calculations from the raw
> > win/wing/winbg/loseg/losebg data. It is also needed even in non-cube
> > situations such as the Crawford game, as it affects calculating match
> > winning chance etc.
> > 
> > evalcontexts contain the current choices in terms of plies, if cubeful
> > evaluations are wanted, if reduced sets of rolls are to be used, etc.
> 
> I am wondering what the fields do and how each struct is used.
> 
> We have:
> 
> typedef struct _cubeinfo {
>     int nCube, fCubeOwner, fMove, nMatchTo, anScore[ 2 ], fCrawford, 
> fJacoby, fBeavers;
>     float arGammonPrice[ 4 ];
>     bgvariation bgv;
> } cubeinfo;
> 
> And the comments:
>      * fCubeOwner: the owner of the cube,
> 
> What values does this take? 0 and 1, 0 representing the player whose 
> pieces are represented by the first slice of the board array? If so, 
> should this be changed between plies if SWAP_SIDES is called, unless the 
> cube is centred? (I mentioned before I want to play 1-point matches, but I 
> still should know how this works if I decide to use the cube later).

0 = player 0
1 = player 1
-1 = centered

It does not change with the player on roll

>      * fMove: the player for which we are
>      *        calculating equity for,
> 
> Like fCubeOwner, what values does this take, and should the value be 
> changed between pies?

0 = player 0, 1 = player 1
yes, it flips between plies. When calculating the zero ply equity,
fMove is the player on roll, when doing the 1 ply responses, fMove is
the opponent (and the equities will be for that player, so they need
to be inverted when going back to the 0 ply player), etc.

> 
> And:
> 
> typedef struct _evalcontext {
>     /* FIXME expand this... e.g. different settings for different position
>        classes */
>     unsigned int fCubeful : 1; /* cubeful evaluation */
>     unsigned int nPlies : 3;
>     unsigned int nReduced : 3; /* this will need to be expanded if we add
>                                 support for nReduced != 3 */
>     unsigned int fDeterministic : 1;
>     float        rNoise; /* standard deviation */
> } evalcontext;
> 
> 
> (Interesting use of bit fields... are thousands of evalcontexts being 
> generated or something?)

They were/are being stored in the evaluation cache, so yes, hundreds
of thousands are being calculated. 

A caution:

If you were thinking of commenting on the efficiency or lack thereof
of bitfields, please visit the mailing list archives and search for
bit fields. It's been discussed at length.

> I assume fCubeful is a boolean bit, and I should set it to 0 if I'm not 
> using the cube?

Yes

> And I suppose nPlies should be set to 2, if I want a (gnubg)2-ply search.

Yes

> What's nReduced and fDeterministic for?

nReduced tells gnubg to use a reduced set of possible rolls when doing
evaluations, trading accuracy for speed. You'd need to examine the
code to see how this works, I don't personally know the theoretical
background for this one. I do know that extensive tests have shown
that this is probably not a good idea for chequer play and that there
is only a small difference in playing quality when using 50%
reductions for cube decisions. For what you are currently doing, I'd
say this one gets set to 0 (no reduction, when evaluating any
position's future, use all 21 rolls).

> I assume rNoise is for adding noise to the NN evaluation, so I should 
> probably set it to 0.0?

Yes. It's purpose is to introduce errors in gnubg's play so that
it plays more like a beginning/intermediate/whatever player. There are
unanswered questions as to how good a model of human play this gives,
although Kees van Doel did some research to model rating systems using
this as part of his modelling. I, along with others, had serious
doubts about this method, but using his model to predict ratings on
Fibs and Gamesgrid seems to show that his model works very well. This
does not imply that it's an accurate model of human play, there could
be other reasons for its working as well as it appears to have done.

 
> As a side question...
> 
> What the heck do all the prefixes to variable names signify? I've seen a, 
> an, aan, f, n, ... is this some kind of secret code?

Hungarian notation. A source of hours of heated argument for the long
winter nights. I'm against it, others are great believers in it, but
I've tried to adapt to the style being used.

n - integer, often a counter
r - real (float). I guess this dates from FORTRAN days where you had
    reals, fixed point and integers
sz - zero terminated string

an, ar - array of integer/real
aan, aar - array of array of integer/real
auch - array of unsigned char

> > > I want to set up a match between my search procedure at 3-ply (aka 
> > > gnubg2ply) and gnubg at "best" settings for 3-ply (aka gnubg2ply). To 
> > > make 
> > > things simple, I just want to play a series of 1-point matches.
> > > 1. How do I "turn off" the doubling cube?
> > 
> > Settings->Options->Use Doubling cube
> 
> I meant in the code. Right now the plan is not to use the gnubg interface 
> at all, just to write my own tournament program (a glorified driver) to go 
> back and forth between calling either the gnubg search function or my 
> search function, and checking for a terminal board between each one.

If you follow out what CommandSetCubeUse in set.c does, you will find
it sets/resets the global flag fCubeUse to 0

> That way I can also do as much verbose logging to file that I want, and go 
> back and muck around with it later.
> 
> > > 2. What function should I call to initiate a "regular" gnubg search, 
> > > given 
> > > a board and a roll?
> > 
> > FindnSaveBestMoves()
> 
> ok:
> 
> extern int 
> FindnSaveBestMoves( movelist *pml,
>                     int nDice0, int nDice1, int anBoard[ 2 ][ 25 ],
>                     unsigned char *auchMove,
>                     cubeinfo *pci, evalcontext *pec,
>                     movefilter aamf[ MAX_FILTER_PLIES ][ MAX_FILTER_PLIES ] );
> 
> So should I just access the first element the returned pointer to a 
> movelist struct points to, as the the move gnubg has chosen for the given 
> board?

More or less. First you need to check that pml->cMoves is non-zero
(the count of valid moves), otherwise it's a can't play. If it is
non-zero, then pml->amMoves[0].anMove is the best move.

Look at ComputerTurn in play.c to see how gnubg selects its move

> What is this auchMove thing?

When analysing a match, you need to force FindnSaveBestMoves to
evaluate the actual play made and not discard it if it's not one of
the better moves, so you pass this encoding of the move to the
function.

> How does the movefilter work?

You really need to look at how FindnSaveBestMoves uses this to reduce
the list of moves it processes at deeper plies. I can't give a simple
explanaton, the code is definitive. But the basics are to take a list
of moves sorted in descending equity order, and use the move filter
for whichever ply generated the list to truncate it by discarding
entries which don't fit the filter requirements. 

In eval.c line 3521:
pml->cMoves = min(mFilter->Accept, pml->cMoves );

This cuts the number of moves on the list to be no greater than the
number of moves specified to be kept by the filter. Then there;s a
short loop to add some more moves back if their evaluation is within
the move filter's "extra moves within x equity) rating. Note that just
before all this code is a check for a move filter having an accept of
-1, which means do not filter at this ply. In that case, all the moves
being considered survive to the next round.

> I guess I can sum up things this way: I am trying to figure out what lines 
> of code I'll need so I can generate the best (gnubg)2-ply search possible 
> for any given board & dice rolls.

I'd say that if you want to play gnubg in a series of matches, then
either use two copies and the ability to use an external
opponent. That allows you the most freedom in making changes, as you
needn't preserve any portion of the old behaviour. 

If, for some reason, that's not an option, then I'd look at
play.c. The routine ComputerTurn is called whenever the player on roll
is gnubg. Why not simply put a branch at the top of that routine so
that if the player on roll is player1, it calls your routine, if it's
player0, it runs the old code. You can then get gnubg to play a series
of games or matches at whatever setting you've chosen with minimal
work on your part.
 
> > > How does the NN cache scheme work? How can I turn it on/off? How can it 
> > > be 
> > > adjusted?
> > 
> > A key is generated based on the evaluation done (0/1/2/... ply), the board
> > position, and match state. A hash function on this key is used to
> > locate a storage point for the evaluation output. I believe (but
> > haven't checked), that no collision handling is done - if there is a
> > cache entry there which doesn't match, it is over-written.
> > 
> > Why would you ever wish to turn off the cache - the result is a
> > dramatic performance hit with no gains that I can imagine.
> 
> This is true. But I have my own hash table (aka transposition table) 
> structure already, in the code. Most state-of-the-art search algorithms 
> will have one (maybe they all do?). Normally a state is looked up in the 
> TT before it is evaluated. In this way, I can sidestep the existing cache 
> entirely, and store a little bit more useful information at the same time.
> 
> I also may wish to turn it off to eliminate a possible hidden variable 
> with my results. The greater control I have over what my code is doing, 
> the better. Since about the only remaining gnubg code I use are the two 
> calls to EvaluatePosition() and Utility(), I need to know how those are 
> doing their work (do a decent degree).
> 
> > If you
> > expect that your search algorithm will generate evaluations which are
> > only applicable at certain points, change the key definitions to
> > ensure that cache lookups will only succeed in the correct context.
> > 
> > The only adjustment is the cache size, Settings->Options->Other->Cache Size
> 
> How can I adjust that in the code? What would be the line of code to do 
> that?

I can't answer that in detail, because I've never had any reason to
dip into the cache code in any depth. I think you'll find that the
setup for insertion and cache lookup is in EvaluatePositionCache, the
actual cache code is in lib/hash.c. This is where you can deterimine
what's used to create a cache entry (generation of the hash key) and
how an entry in the cache is checked to see that it contains the data
you are seeking.

>From the description you are giving, it sounds more and more like the
best way to achieve what you want is to use a benchmark, unmodified
gnubg playing against a copy with your code, so that you don't need to
preserve any of the current behaviour - trying to make a version which
does its move selection and evaluation cacheing for one player and
your new routine's move selection and cacheing of results is likely to
add extra, unproductive work and to lead to a poorer metric of the
degree of improvement. Gnubg has provision for playing against an
external copy, so play a stock gnubg against one using your move
selector.


-- 
Jim Segrave           address@hidden





reply via email to

[Prev in Thread] Current Thread [Next in Thread]