gnugo-devel

[gnugo-devel] move valuation


From: Arend Bayer
Subject: [gnugo-devel] move valuation
Date: Tue, 13 Nov 2001 03:01:35 +0100 (CET)

Thx, now I understand a little better the philosophy behind
influence_delta_territory.

> To begin with, the single purpose of the move valuation is to try to
> rank the moves so that the best move gets the highest score (well,
> there's the problem of determinacy too).
Sure, that's true for any move valuation.

> This valuation basically follows the principles in e.g. "The Endgame"
> by Ogawa and Davies. A move like * below is worth one point in gote
> 
> OOOX
> O.*X
> OOOX
> 
> and given the value 1. A move in sente or reverse sente is counted
> double, e.g. the move at * here
Yes, this is deiri counting.

> The territorial valuation done by the influence function uses the idea
> that if O makes a move at (pos), we first assign territory under the
> assumption that X moves first in all local positions in the original
> position and then under the assumption that X moves first in all local
> positions after O having made the move at (pos). These two territory
> assignments are compared and the difference gives the territorial
> value of the move.
I think that this principle is technically incorrect, although it is
certainly OK as a heuristic.

So the following is just a theoretical discussion, but since I raised
that matter I feel I should explain why I think so: Let's look at the
following position:
|..XXXXXXXXX...
|XXXOXOOOOOXXXX
|XOXOXO.O.OXOOX
|.OaObO.O.OcOOX
+--------------
First note that all moves here are gote.
Your method values a white move at b with 7 pts (after white moves there,
black will be assumed to play at a). A white move at c is worth 8 pts.

I would value the move at b at 9.5 pts: if black moves at b, he has
12 pts; if white moves at b, black has an expected territory of 2.5 pts
(either side has a 50% chance to get the move at a).
So the correct move is b (and these values are really precise if you
imagine this position being duplicated 100 times on the board).
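
The arithmetic behind this comparison can be written out as a short
Python sketch. The function names are made up for illustration and are
not GnuGo code. The outcomes 5 and 0 for black after a white move at b
are not stated above; they are inferred from the 7 pts and 2.5 pts
figures. It is also assumed that c has no follow-up, so both rules
value it at 8 pts.

```python
# Black's (X's) local territory in each outcome of the diagram.
BLACK_PLAYS_B = 12     # stated in the post
WHITE_B_BLACK_A = 5    # inferred: 12 - 7
WHITE_B_WHITE_A = 0    # inferred: 2.5 is the mean of 5 and 0
BLACK_PLAYS_C = 8      # assumption: c has no follow-up
WHITE_PLAYS_C = 0


def opponent_first_value(if_opponent_moves, follow_up_outcomes):
    """Value when the opponent is assumed to win every follow-up.

    The opponent here is black, so the follow-up outcome that is best
    for black (the largest number) is used.
    """
    return if_opponent_moves - max(follow_up_outcomes)


def fifty_fifty_value(if_opponent_moves, follow_up_outcomes):
    """Value when each side gets a gote follow-up half of the time."""
    expected = sum(follow_up_outcomes) / len(follow_up_outcomes)
    return if_opponent_moves - expected


b_outcomes = [WHITE_B_BLACK_A, WHITE_B_WHITE_A]
c_outcomes = [WHITE_PLAYS_C]

b_opponent_first = opponent_first_value(BLACK_PLAYS_B, b_outcomes)
b_fifty_fifty = fifty_fifty_value(BLACK_PLAYS_B, b_outcomes)
c_opponent_first = opponent_first_value(BLACK_PLAYS_C, c_outcomes)
c_fifty_fifty = fifty_fifty_value(BLACK_PLAYS_C, c_outcomes)

print("b:", b_opponent_first, "vs", b_fifty_fifty)
print("c:", c_opponent_first, "vs", c_fifty_fifty)
```

Under the opponent-first rule c outranks b (8 against 7); under the
50% rule b outranks c (9.5 against 8).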

As another example, the move
OXXXX
O*..X
is valued as 1pt by GnuGo (instead of 1.5).

The more precise principle is:
1) Assign territory after the move by O in the local position XYZ
under the assumption that each side has a 50% chance of playing in each
other local situation, unless there is a one-sided sente play.
2) Assign territory if instead X moves in local position XYZ first, and
again both sides get their 50% share of gote-vs-gote plays, and take
the difference.
Double this if the X move is sente.
(This is deiri counting; miai counting instead compares 1) with the
situation before the move, with each side getting its share of gote-vs-gote
plays INCLUDING local position XYZ -- then there is no need to double.)
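
A minimal sketch of this principle for pure gote positions, in Python.
The `Gote` class and the function names are hypothetical, and the
one-sided sente case is left out. Scores are X's territory. The leaf
values for the corner position are inferred from the figures given
earlier, and the values 2, 1, 0 for the small example are an assumption
about that diagram.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Gote:
    """A local gote position: what results if X or O plays there first."""
    x_first: "Position"
    o_first: "Position"


# A position is either settled (a number: X's territory) or still open.
Position = Union[int, float, Gote]


def count(pos: Position) -> float:
    """Expected X territory if each side gets every gote play half the time."""
    if isinstance(pos, Gote):
        return (count(pos.x_first) + count(pos.o_first)) / 2
    return float(pos)


def deiri_value(pos: Gote) -> float:
    """Difference between X moving first and O moving first."""
    return count(pos.x_first) - count(pos.o_first)


def miai_value(pos: Gote) -> float:
    """Gain of O's move relative to the count before anyone moves."""
    return count(pos) - count(pos.o_first)


# The move at b: X first gives 12; O first leaves the gote play at a.
corner_b = Gote(x_first=12, o_first=Gote(x_first=5, o_first=0))

# The small example (assumed outcomes): X first keeps 2 points;
# O first leaves a further gote play worth 1 or 0 points to X.
small = Gote(x_first=2, o_first=Gote(x_first=1, o_first=0))

print(deiri_value(corner_b), miai_value(corner_b))
print(deiri_value(small), miai_value(small))
```

For a gote position the miai value comes out as exactly half the deiri
value, which is why the doubling for sente is only needed in deiri
counting.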

Also, one can work around this with follow_up values, which as I
understand GnuGo does rather rarely.

Of course there are other things to worry about first; still I feel
that e.g. GnuGo's reluctance to jump into an opponent's moyo (unless forced
to do so by patterns like LE14/LE15 with fixed values -- which quite
often produce rather inefficient moves as well) partly comes from the
same problem: it does not give enough value to the chance to jump
further into the opponent's moyo.

> + high reverse_followup_value) gets still more. The double sente value
> computation is probably not very good.
There is an (in theory) simple rule for calculating its value: it
can be played as soon as it is large enough as a gote (and reverse sente)
play (i.e. assuming a 50% chance that X answers, and 50% that O will get
a second move). (Btw, the same is true for every sente play.)

-Arend



