On 12/15/06, Joseph Heled <address@hidden> wrote:
I understand your frustration with gnubg not handling the cube as well
as you think it should at those "simple" or "straightforward"
Well, 0-ply cube is still pretty good and I normally use 2-ply cube anyway, which I'd recommend to anyone. It just seems that it wouldn't be that hard to improve 0-ply's cube handling by fine-tuning the volatility estimate?
Yet I do not think the situation is simple. Race is
simpler than contact, still someone may make great improvements if she
is willing to do the proper research.
Yes, and I think Christian made a good start, strongly suggesting that 0-ply doubles too early.
Personally I do not believe any of the numbers below except the
cubeless winning percentage.
Backgammon is not solved except for a very small subset of positions. So, numbers aren't going to be exact.
I don't see the problem with that. The idea is to improve, not to be perfect, isn't it? Rollouts with decent settings and enough trials figure to be (much) better than evaluations nearly always, so it seems like a good idea to use rollout results to improve gnubg's evaluations.
Isn't that a straightforward method being used all the time to improve bots?
For the simple race position I posted, I have no doubt whatsoever that the rollouts are more accurate than the evaluations. In particular, the numbers after double/take should be very close to the truth, since only a few trials will involve a non-obvious cube decision after a double/take in the actual position: that takes first a big turnaround, then a cube action that is close enough that gnubg could get it wrong.
More evidence for this is the fact that whereas 0-ply evaluation says double/take and 2-ply says no double/take, a rollout with 0-ply evaluation says no double/take and a rollout with 2-ply cube evaluation is nearly identical, also saying no double/take.
It's no proof, but it's pretty strong evidence already. Rollouts with higher settings could be done, but I doubt anything different would come out.
2ply, 4ply and rollout cubeful numbers
are all based on a large number of 0ply cubeful decisions, and if
this is suspect, why would they be any good? 2ply play will not be any
better than 0ply if your 0ply is awful.
I am very surprised you write this. Maybe I'm misunderstanding you here.
I think the above is not true at all and simulations have proven that. 0-ply is often somewhat inaccurate, but 2-ply will average over a lot of 0-ply evaluations. The end result will nearly always be better than a single 0-ply evaluation. A similar argument goes for rollouts.
Just because the 2-ply numbers aren't perfect doesn't mean they are not an improvement over 0-ply.
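
To make the averaging argument concrete, here is a toy sketch in Python. It is not gnubg code: the function name, the number of child positions, and the noise model are all assumptions made up for illustration. It compares a single noisy static estimate of a position with the average of noisy estimates of its child positions. The `bias` parameter is there to show the other side of the argument too: random error averages out, a systematic error shared by every evaluation does not.

```python
import random
import statistics


def simulate(num_positions=2000, num_children=21, noise_sd=0.05, bias=0.0, seed=1):
    """Return (mean abs error of one static estimate,
               mean abs error of an average over child estimates).

    Toy assumptions: the true value of a position is the plain average of
    its children's true values, and every static estimate equals the truth
    plus `bias` plus independent Gaussian noise.
    """
    rng = random.Random(seed)
    direct_errors = []
    averaged_errors = []
    for _ in range(num_positions):
        # Hypothetical true equities of the child positions.
        children = [rng.uniform(-1.0, 1.0) for _ in range(num_children)]
        true_value = sum(children) / num_children

        # One static estimate of the parent position.
        direct = true_value + bias + rng.gauss(0.0, noise_sd)

        # Lookahead-style estimate: average the static estimates of the children.
        averaged = sum(
            c + bias + rng.gauss(0.0, noise_sd) for c in children
        ) / num_children

        direct_errors.append(abs(direct - true_value))
        averaged_errors.append(abs(averaged - true_value))
    return statistics.mean(direct_errors), statistics.mean(averaged_errors)


if __name__ == "__main__":
    print("unbiased noise:", simulate())
    print("with shared bias:", simulate(bias=0.1))
```

With unbiased noise the averaged estimate comes out clearly more accurate than the single one. With a shared bias the averaged error stays close to the bias itself, so this kind of averaging only helps with the random part of 0-ply's error.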
If I was to start somewhere, I
would start with doubling on the very last stage of bearoff - where
you first get the true actions by brute force. This requires a large
database since you need the result for each score.
Short bearoffs have very different volatility estimates than races or contact positions, so I don't see how this would help 0-ply's cubes in general. Also, using this approach seems practically impossible with current processing power, no?
I think a bit of experimentation could help make GNUBG's cube action stronger. Christian's sample suggests quite a strong bias in 0-ply's cube handling towards early doubling.
That brings up two basic questions. Does 0-ply overestimate equities on average? If so, then this might not be so easy to solve. Or is 0-ply cube action just using a volatility estimate that is too high? I think the latter is more likely, and that is not so hard to improve.
Since backgammon is not solved and will not be any time soon, I don't see any better way than using rollouts to help improve GNUBG.
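
One way to separate the two explanations is to compare 0-ply evaluations with rollouts over a sample of positions. The sketch below is only an illustration in Python: the function name and the record layout are my own assumptions, and nothing here reads real gnubg output. It reports the mean signed equity difference (evaluation minus rollout) and counts how often the evaluation doubles where the rollout does not, and the reverse.

```python
from statistics import mean


def summarize_bias(samples):
    """Summarize how evaluations differ from rollouts.

    `samples` is a list of tuples (hypothetical layout):
        (eval_equity, rollout_equity, eval_says_double, rollout_says_double)
    """
    # Positive mean difference: the evaluation overestimates equity on average.
    diffs = [eval_eq - roll_eq for eval_eq, roll_eq, _, _ in samples]

    # Evaluation doubles, rollout does not: a "too early" double.
    early = sum(1 for _, _, ev, ro in samples if ev and not ro)
    # Rollout doubles, evaluation does not: a "too late" double.
    late = sum(1 for _, _, ev, ro in samples if ro and not ev)

    return {
        "n": len(samples),
        "mean_equity_diff": mean(diffs),
        "early_doubles": early,
        "late_doubles": late,
    }


if __name__ == "__main__":
    # Made-up numbers purely to show the call; not results from gnubg.
    demo = [
        (0.5, 0.4, True, False),
        (0.3, 0.3, False, False),
        (0.6, 0.7, True, True),
    ]
    print(summarize_bias(demo))
```

If the mean equity difference is near zero while early doubles clearly outnumber late ones, that would point at the doubling threshold (the volatility estimate) rather than at the equities themselves.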
In addition I am
not sure I agree with the doubling code in gnubg. I always used my own
code which is part of the fibs2html or gnubg-nn, which I think is
better (but I may be wrong). If someone wants to take this code and
integrate it into gnubg, so that one can choose which method to use,
that would be a great start as well.
That sounds interesting. What is the difference between your algorithm/formula and GNUBG's present algorithm/formula?