From: Nis
Subject: RE: [Bug-gnubg] Why is odd ply equity always lower?
Date: Wed, 18 Jun 2003 22:24:38 +0200
Quoting myself:
> If 0-ply is unbiased but imprecise (as in having average error 0)
> then the value of the best move will be overrated. Example:
>
> Move   True equity   0-ply equity
> A      0.4           0.35
> B      0.4           0.45
> C      0.5           0.45
> D      0.5           0.55   (*BEST MOVE*)
>
> Note that the average error is 0, but the best move is off by 0.05.
>
> The result of this should be that 1-ply, which is the average of
> 21 BestMoves for the opponent, is underrated by some amount. This
> will be added to the (negated) 0-ply, so if 0-ply is overrated,
> 1-ply is even more underrated. QED
>
> In our next issue: Implications + How to repair this ...
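To make the quoted example concrete, here is a small Python sketch. The first part only reproduces the table above. The second part is a toy simulation under assumptions of my own (true equities drawn uniformly, independent Gaussian 0-ply noise); the function name `average_overrating` is made up for this illustration and has nothing to do with the gnubg sources.

```python
import random

# Table from the quoted example: move -> (true equity, 0-ply equity).
moves = {
    "A": (0.40, 0.35),
    "B": (0.40, 0.45),
    "C": (0.50, 0.45),
    "D": (0.50, 0.55),
}

# Average 0-ply error over all moves (zero up to rounding).
errors = [est - true for true, est in moves.values()]
mean_error = sum(errors) / len(errors)

# The move 0-ply would pick, and how much 0-ply overrates it.
best = max(moves, key=lambda m: moves[m][1])
true_best, est_best = moves[best]
overrating = est_best - true_best

print(f"mean 0-ply error: {mean_error:+.3f}")
print(f"chosen move: {best}, overrated by {overrating:+.3f}")


def average_overrating(n_moves, noise_sd, trials, seed=0):
    """Toy model (an assumption, not gnubg's evaluator): unbiased noisy
    estimates, pick the highest estimate, record estimate minus truth."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        true_eq = [rng.uniform(0.0, 0.5) for _ in range(n_moves)]
        est_eq = [t + rng.gauss(0.0, noise_sd) for t in true_eq]
        pick = max(range(n_moves), key=lambda i: est_eq[i])
        total += est_eq[pick] - true_eq[pick]
    return total / trials


print(f"simulated overrating: {average_overrating(10, 0.05, 20000):+.4f}")
```

In the toy model each individual estimate is unbiased, yet the estimate of the selected move comes out too high on average, which is the effect the example describes.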
Some items for consideration:

--- How does this affect higher plies?

One would think that the same mechanism would lead to 2-ply overestimating the best 1-ply play, and thus cancel out some of the error of the 1-ply calculation. This is not necessarily so, however, since the best play is selected on 0-ply, and only evaluated at 1-ply. Only if the errors of 1-ply and 0-ply correlate will some of the effect be cancelled out at 2-ply.
Even if the correlation is there, the effect will be dampened by the imperfect correlation as well as the assumed smaller relative error on 1-ply compared to 0-ply.
If we ASSUME that the effect only appears at 1-ply, we find that the error propagates through the plies, being negative for odd plies and positive for even plies.
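A minimal sketch of that propagation, under exactly the assumption stated above: a bias is introduced only at 1-ply, and each further ply just negates the bias of the ply below it (because an n-ply equity is built from the negated opponent equities at (n-1)-ply). The function name and the value -0.02 are placeholders for illustration, not measured numbers.

```python
def ply_biases(one_ply_bias, max_ply):
    """Bias of the n-ply equity for n = 0..max_ply, assuming 0-ply is
    unbiased on average and no new bias enters above 1-ply."""
    biases = [0.0, one_ply_bias]
    for _ in range(2, max_ply + 1):
        # Negating the opponent's equity flips the sign of its bias.
        biases.append(-biases[-1])
    return biases[: max_ply + 1]


for ply, bias in enumerate(ply_biases(-0.02, 4)):
    print(f"{ply}-ply bias: {bias:+.2f}")
```

With a negative 1-ply bias, the odd plies stay negative and the even plies from 2-ply upward come out positive by the same amount.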
Thus an "obvious" fix would be to implement my previously suggested half-plies: averages between n- and (n+1)-ply.
I have written the code to do so, and it compiles and works (as in, it doesn't play obviously badly).
I can send a patch to someone with an account. For those interested, what I do is

Please note that my code removes the ability of gnubg to do reduced 1-ply evaluations. I can reimplement it if people see a need.
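For readers who want the half-ply idea in concrete form, here is a minimal Python sketch. It is not the patch mentioned above and does not use any gnubg function; `half_ply` and the numbers are invented for illustration, and the example assumes the odd and even ply errors are equal in size and opposite in sign.

```python
def half_ply(equity_n, equity_n_plus_1):
    """(n + 0.5)-ply equity as the plain average of n-ply and (n+1)-ply."""
    return 0.5 * (equity_n + equity_n_plus_1)


# Hypothetical position: true equity 0.30, alternating bias of 0.02.
true_equity = 0.30
bias = 0.02
one_ply = true_equity - bias   # odd ply reads too low
two_ply = true_equity + bias   # even ply reads too high

print(f"1-ply:   {one_ply:.3f}")
print(f"2-ply:   {two_ply:.3f}")
print(f"1.5-ply: {half_ply(one_ply, two_ply):.3f}")
```

If the two errors are not equal in size, the average only cancels part of the bias.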
--- Only relevant for cube decisions?

The misestimation of equity is mainly a problem for absolute equities. As Joseph has pointed out, chequerplay relies almost solely on the relative equities of positions.
Thus a fix should only "need" to be applied for cube decisions (and cubeful chequerplay? Someone help me here)
So we might want to start using reduced evaluation settings for cube decisions.
--- Training

Another question is: does this have any impact on the training of the nets? Is there anything that could be done better, or are there implications for how we want to train?
--- Measuring

I am really interested in how this affects playing strength. Joseph has already measured the strength difference between 0, 0.5 and 1-ply (0.5-ply calculated as the average of the other two). However, I think the really interesting question will be how 1.5-ply relates to 1- and 2-ply, especially for cube decisions.
-- 
Nis Jorgensen
Greenpeace Amsterdam