
[gnugo-devel] top level search: evaluation function


From: Douglas Ridgway
Subject: [gnugo-devel] top level search: evaluation function
Date: Thu, 2 Dec 2004 20:56:31 -0700 (MST)

I did these runs a while ago, using the top level search program running
against native GNU Go. I was trying to start collecting data to help optimize
width, depth, level, etc. Level, Color and Wins/Losses are from the search
program's point of view. The one column that needs comment is IgLstSente.
When I first wrote the program, the evaluation function was just the first
number returned by GTP estimate_score. This ignores which side is to play.
I traced one stupid move made by the search program but not by GNU Go to
chasing a particular line which ended in atariing a big group, getting
credit for half the group value, and failing to realize that the opponent
was just going to draw out of atari. This seems like an obvious bug, with
the obvious fix being to use the upper bound / lower bound to account for
side to move. Runs with the "bug" have IgLstSente = T; runs without it
have IgLstSente = F. Here is the data:

Level   OppLvl  Color   IgLstSente      Width   Depth   Wins    Losses
10      15      W       T               5       5       9       1
10      15      B       T               5       5       8       2
10      15      W       T               2       11      9       1
10      15      B       T               2       11      8       1
10      15      W       F               5       5       5       4
10      15      B       F               5       5       5       5
10      15      W       F               5       6       7       3
10      15      W       F               5       5       4       4
10      15      B       F               5       5       2       7
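
For readers who want to see the two evaluation variants side by side, here
is a minimal Python sketch. It is not the search program's actual code. It
assumes the estimate_score reply looks like
"W+3.5 (upper bound: 7.5, lower: -0.5)" with both bounds positive-for-White,
and it assumes that "accounting for side to move" means taking the upper
bound when White is to move and the lower bound when Black is to move. The
names parse_estimate_score and evaluate are hypothetical.

```python
import re

# Assumed reply format: "W+3.5 (upper bound: 7.5, lower: -0.5)".
# The exact layout is an assumption; adjust the pattern to the real output.
_SCORE_RE = re.compile(
    r"^(?P<side>[WB])\+\s*(?P<score>\d+(?:\.\d+)?)"
    r"\s*\(upper bound:\s*(?P<upper>-?\d+(?:\.\d+)?),"
    r"\s*lower:\s*(?P<lower>-?\d+(?:\.\d+)?)\)\s*$"
)


def parse_estimate_score(reply):
    """Return (score, upper, lower), all taken as positive-for-White."""
    m = _SCORE_RE.match(reply.strip())
    if m is None:
        raise ValueError("unrecognized estimate_score reply: %r" % reply)
    score = float(m.group("score"))
    if m.group("side") == "B":
        score = -score  # "B+x" means Black leads by x
    return score, float(m.group("upper")), float(m.group("lower"))


def evaluate(reply, to_move, ignore_last_sente):
    """Evaluate a leaf position from White's point of view.

    ignore_last_sente=True  -> first number only (IgLstSente = T).
    ignore_last_sente=False -> pick a bound by side to move (IgLstSente = F);
                               which bound goes with which side is an
                               assumption of this sketch.
    """
    score, upper, lower = parse_estimate_score(reply)
    if ignore_last_sente:
        return score
    return upper if to_move == "W" else lower
```

A caller searching for Black would negate the returned value. With the
example reply above, the T variant returns 3.5 regardless of who is to play,
while the F variant returns 7.5 with White to move and -0.5 with Black to
move.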

As can be seen, in practice, this "bug" in the evaluation function
substantially improves results compared to counting correctly. I find this
surprising, and don't have a convincing explanation. Perhaps chasing
higher temperature situations, and optimistically assuming that they'll
work out somehow, is a good heuristic, especially in what's almost
self-play. But I don't really know. What is clear is that tuning the
evaluation function will require some care.
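
To put a number on "substantially", the rows of the table can be pooled by
IgLstSente. The short Python sketch below does only that arithmetic; it
lumps together different colors, widths and depths, so it is a rough summary
of small samples and not a significance test.

```python
from collections import defaultdict

# (IgLstSente, wins, losses), copied row by row from the table above.
ROWS = [
    ("T", 9, 1), ("T", 8, 2), ("T", 9, 1), ("T", 8, 1),
    ("F", 5, 4), ("F", 5, 5), ("F", 7, 3), ("F", 4, 4), ("F", 2, 7),
]


def tally(rows):
    """Pool wins and losses per IgLstSente value and compute the win rate."""
    totals = defaultdict(lambda: [0, 0])
    for flag, wins, losses in rows:
        totals[flag][0] += wins
        totals[flag][1] += losses
    return {
        flag: (wins, losses, wins / (wins + losses))
        for flag, (wins, losses) in totals.items()
    }


for flag, (wins, losses, rate) in tally(ROWS).items():
    print(f"IgLstSente={flag}: {wins} wins, {losses} losses, win rate {rate:.2f}")
```

Pooled this way, the IgLstSente = T runs come to 34 wins and 5 losses (about
0.87), and the IgLstSente = F runs come to 23 wins and 23 losses (0.50).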

While I'm optimistic that a sufficiently tuned top level search could add
strength within the same time constraints, I'm no longer taking data.
First I want to understand better how it would fit in with how GNU Go
evaluates moves and spends the rest of its time, work out how to handle
repeated games, find suitable opponents to tune against, etc.

doug.




