RE: [gnugo-devel] Move valuation question


From: Portela Fernand
Subject: RE: [gnugo-devel] Move valuation question
Date: Sat, 18 Jan 2003 02:24:53 +0100

Dan wrote:

> > I'm not sure what to do, but what about following ?
> > 
> >       this_value = 2*dragon[aa].effective_size - gg_abs(score);
> >       if (this_value < 0.0)
> >         this_value = 0.0;
> 
> This doesn't look right to me, however. 

Indeed, after a bit of testing, I now understand why.

> (...). Yet the right thing may be to gamble (...)

If we are to gamble, we should apply some poker-style rules, that is,
measure our bet (here, the uncertain dragon, which is fairly easy) and ...

> What we really need is an additional certainty
> estimate---we need to know not only whether the owl
> result is certain.

... yes, the probability of losing it (this one looks tough to measure,
and a flat 1/2 is surely too simplistic). This plus ...

> We also need to know whether the
> score estimate is certain, for example, how far we
> are from the end of the game.

This got me thinking, and I came up with the following patch, which I'd
like to submit for discussion/review (although it can safely be added to
CVS as is). I tried to implement what I hope is (or will soon become) a
good approximation of:
- the score (balance in terms of solid territory)
- the "power" (balance in terms of influence)
    the implementation currently uses the territory valuation,
    but this is possibly not the best choice for that purpose.
- and the game advancement (fuseki, chuban, yose)
    returned as a value between 0.0 (start) and 1.0 (game is over)
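
For readers who want the game advancement idea in isolation, here is a
minimal standalone sketch. It assumes each intersection has already been
put into one of four classes; the enum and both function names are
hypothetical and not part of GNU Go. The weights and the 0.35 / 0.7 phase
thresholds are the ones used in the patch.

```c
#include <stddef.h>

/* Weights as defined by the patch in influence.h. */
#define WEIGHT_TERRITORY 10
#define WEIGHT_MOYO       3
#define WEIGHT_AREA       1

/* Hypothetical classification of one intersection. The patch derives
 * this from the safe[] array and from whose_territory(), whose_moyo()
 * and whose_area() instead.
 */
enum point_class { PC_NEUTRAL, PC_AREA, PC_MOYO, PC_SETTLED };

/* Returns a value in [0.0, 1.0]: 0.0 when nothing is claimed yet,
 * 1.0 when every intersection is a safe stone or territory.
 */
static float
game_advancement(const enum point_class *points, size_t n_points)
{
  long count = 0;
  size_t k;

  if (n_points == 0)
    return 0.0f;

  for (k = 0; k < n_points; k++) {
    switch (points[k]) {
    case PC_SETTLED: count += WEIGHT_TERRITORY; break;
    case PC_MOYO:    count += WEIGHT_MOYO;      break;
    case PC_AREA:    count += WEIGHT_AREA;      break;
    default:         break;  /* neutral points add nothing */
    }
  }

  return (float) count / (float) (WEIGHT_TERRITORY * (long) n_points);
}

/* Phase names as printed by the debug output added to do_genmove(). */
static const char *
game_phase_name(float game_status)
{
  if (game_status < 0.35f)
    return "FUSEKI";
  if (game_status < 0.7f)
    return "CHUBAN";
  return "YOSE";
}
```

Since a moyo point weighs 3 and an area point only 1 against 10 for
settled points, the value stays low while the board is mostly loose
frameworks and rises quickly once territory solidifies.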

A couple of lines are added to do_genmove(); they only produce debug
output, so the patch has (and should have) no effect on the engine for the
moment. The point of the patch is to test whether these algorithms are
reliable and/or useful.

One simple feature we could implement quickly would be to teach GG the art
of... resigning. I'm thinking of a (very) conservative policy like:
- 50 points behind (for a 19x19)
- no more critical or weak opponent dragons (all either dead or alive)
- the game is ending (game_status > 0.7 with current code)
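
The three conditions above could be combined roughly as follows. This is
only a sketch under stated assumptions: the struct, its fields and
should_resign() are hypothetical names, not existing GNU Go code, and
scaling the 50-point margin by board area for sizes other than 19x19 is
an assumption of this sketch. The sign convention (positive score means
White is ahead) follows the debug output in the patch.

```c
/* Hypothetical inputs; none of these names exist in GNU Go. */
struct resign_inputs {
  float score;          /* > 0: White ahead, < 0: Black ahead */
  float game_status;    /* 0.0 (start) .. 1.0 (game over) */
  int color_is_white;   /* side considering resignation: 1 = White */
  int board_size;       /* e.g. 19 */
  int unsettled_opponent_dragons;  /* critical or weak opponent dragons */
};

/* Returns 1 if the conservative resignation policy is met, else 0. */
static int
should_resign(const struct resign_inputs *in)
{
  /* Margin from our point of view: negative means we are behind. */
  float margin = in->color_is_white ? in->score : -in->score;

  /* 50 points on 19x19; area scaling for other sizes is an assumption. */
  float threshold = 50.0f * (float) (in->board_size * in->board_size)
                    / (19.0f * 19.0f);

  if (margin > -threshold)
    return 0;   /* not far enough behind */
  if (in->unsettled_opponent_dragons > 0)
    return 0;   /* something might still be killed */
  if (in->game_status <= 0.7f)
    return 0;   /* game not yet ending */

  return 1;
}
```

All three tests must pass, so a single weak opponent dragon or an early
game phase is enough to keep playing.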

All comments welcome.

/nando

- new influence_evaluate_position() function


Index: engine/genmove.c
===================================================================
RCS file: /cvsroot/gnugo/gnugo/engine/genmove.c,v
retrieving revision 1.62
diff -u -r1.62 genmove.c
--- engine/genmove.c    2 Jan 2003 00:23:28 -0000       1.62
+++ engine/genmove.c    18 Jan 2003 01:12:31 -0000
@@ -361,6 +361,20 @@
                lower_bound > 0 ? "W " : "B ", gg_abs(lower_bound),
                upper_bound > 0 ? "W " : "B ", gg_abs(upper_bound));
       fflush(stderr);
+
+      {
+       float i_score, power, game_status;
+
+       i_score = influence_evaluate_position(color, &power, &game_status);
+       gprintf("\nScore estimate (by influence): %s %f\n",
+               i_score > 0 ? "W " : "B ", gg_abs(i_score));
+       gprintf("Power balance: %s %f\n",
+               power > 0 ? "W " : "B ", gg_abs(power));
+       gprintf("Game phase: %s (%f / 100)\n",
+               game_status < 0.35 ? "FUSEKI" :
+                 (game_status < 0.7 ? "CHUBAN" : "YOSE"),
+               game_status * 100);
+      }
     }
     time_report(1, "estimate score", NO_MOVE, 1.0);
 
Index: engine/influence.c
===================================================================
RCS file: /cvsroot/gnugo/gnugo/engine/influence.c,v
retrieving revision 1.74
diff -u -r1.74 influence.c
--- engine/influence.c  2 Jan 2003 00:23:28 -0000       1.74
+++ engine/influence.c  18 Jan 2003 01:12:33 -0000
@@ -1688,6 +1688,72 @@
   return score;
 }
 
+
+/* Uses initial_influence to estimate :
+ *
+ * - the score (balance in terms of solid territory)
+ * - the "power" (balance in terms of influence)
+ *     implementation currently uses the territory valuation
+ *     FIXME : this is maybe not the best choice for that purpose
+ * - the game advancement (fuseki, chuban, yose)
+ *     returned as a value between 0.0 (start) and 1.0 (game over)
+ *
+ * The algorithm uses a 'chinese rules'-like method to estimate the
+ * score, so prisoners and dead stones are just ignored.
+ */
+float
+influence_evaluate_position(int color, float *power, float *game_status)
+{
+  struct influence_data *iq = INITIAL_INFLUENCE(color);
+  struct influence_data *oq = OPPOSITE_INFLUENCE(color);
+  float score = 0.0;
+  float power_balance = 0.0;
+  int count = 0;
+  int ii;
+
+  for (ii = BOARDMIN; ii < BOARDMAX; ii++)
+    if (ON_BOARD(ii)) {
+      if (iq->safe[ii]) {
+       /* chinese-style scoring, a safe stone is 1 point */
+        score += (board[ii] == WHITE ? 1 : -1);
+       count += WEIGHT_TERRITORY;
+      }
+      else if (whose_territory(iq, ii) != EMPTY) {
+       if (whose_territory(oq, ii) != EMPTY)
+         /* chinese-style scoring, add 1 point max, not 2 */
+         score += (iq->territory_value[ii] > 0 ? 1 : -1);
+       else
+         /* maybe not so solid territory, award half-point only */
+         score += (iq->territory_value[ii] > 0 ? 0.5 : -0.5);
+       count += WEIGHT_TERRITORY;
+      }
+      else if (whose_moyo(iq, ii) != EMPTY) {
+       power_balance += iq->territory_value[ii]
+                        + oq->territory_value[ii];
+       count += WEIGHT_MOYO;
+      }
+      else if (whose_area(iq, ii) != EMPTY) {
+       power_balance += iq->territory_value[ii]
+                        + oq->territory_value[ii];
+       count += WEIGHT_AREA;
+      }
+      else 
+       power_balance += iq->territory_value[ii]
+                        + oq->territory_value[ii];
+    }
+
+  score += komi;
+
+  if (power)
+    *power = power_balance;
+  if (game_status)
+    *game_status = (float) count
+                  / (WEIGHT_TERRITORY * board_size * board_size);
+
+  return score;
+}
+
+
 /* Print the influence map when we have computed influence for the
  * move at (i, j).
  */
Index: engine/influence.h
===================================================================
RCS file: /cvsroot/gnugo/gnugo/engine/influence.h,v
retrieving revision 1.15
diff -u -r1.15 influence.h
--- engine/influence.h  2 Jan 2003 00:23:28 -0000       1.15
+++ engine/influence.h  18 Jan 2003 01:12:34 -0000
@@ -120,6 +120,12 @@
  */
 typedef int (*owner_function_ptr)(const struct influence_data *q, int pos);
 
+/* Used for tuning game advancement algorithm
+ */
+#define WEIGHT_TERRITORY 10
+#define WEIGHT_MOYO       3
+#define WEIGHT_AREA       1
+
 /*
  * Local Variables:
  * tab-width: 8
Index: engine/liberty.h
===================================================================
RCS file: /cvsroot/gnugo/gnugo/engine/liberty.h,v
retrieving revision 1.152
diff -u -r1.152 liberty.h
--- engine/liberty.h    12 Jan 2003 20:51:45 -0000      1.152
+++ engine/liberty.h    18 Jan 2003 01:12:36 -0000
@@ -683,6 +683,8 @@
                   float black_influence[BOARDMAX],
                   int regions[BOARDMAX]);
 float influence_score(const struct influence_data *q);
+float influence_evaluate_position(int color, float *power,
+                                 float *game_status);
 void resegment_initial_influence(void);
 void influence_mark_non_territory(int pos, int color);
 



