[Bug-gnubg] Doubts about the new pruning net
From: Robert-Jan Veldhuizen
Subject: [Bug-gnubg] Doubts about the new pruning net
Date: Wed, 03 Nov 2004 18:05:22 +0100
User-agent: Mozilla Thunderbird 0.8 (Windows/20040913)
Hi gnubg team,
I've been reading some of the threads here about the new pruning net. My
own experience with it hasn't been so great though, and some others
report problems as well. In fact, I've given up for now and switched
back to the "old" gnubg: too many errors and problems with the new
pruning build. I see some others (on GOL) have come to the same
conclusion.
Let me first forward my own message on GOL about this:
-*- 29 Oct 2004, Gammonline BBS
-*- http://www.gammonline.com/members/board/config.cgi?read=86333
-*- (members only)
I see on the mailing list that it is suggested the pruning net is so good
that GNUBG doesn't need reduced evaluations anymore.
I VERY STRONGLY disagree with that. It would be a real shame for reduced
evaluations to disappear, at least at this stage.
First, there is some bug with the pruning net. Sometimes, 2-ply
evaluation, with a 0-ply filter and skip 1-ply pruning, actually stops
evaluating, producing a 1-ply result. This can have pretty bad
consequences for GNUBG's playing strength. It happens only rarely, but
it makes the pruning net at least a bit unreliable for now.
Second, for rollouts, 2-ply with pruning seems to be performing badly in
some situations. Worse than 0-ply, in fact. That is pretty terrible.
I don't know if it's a bug or if this is simply the effect of the
pruning net, but either way, it is bad enough that I don't want to use
2-ply prune for rollouts.
Also, while pruning gives a nice speed increase for playing, evaluating
and analyzing, the results for rollouts don't show all that much of a
speed increase.
Unfortunately, a few of the rollout settings that many people agree are
among the best in general use a 2-ply 25% or 33% cube. I haven't timed
it, but 2-ply 100% with pruning doesn't seem to be any faster, and it
seems unlikely to be significantly more accurate.
Another thing is that 2-ply 50% speed no pruning checker play seems to
be almost as good as 2-ply 100% no pruning, while I'm not sure 2-ply
100% PRUNING is even as good as 2-ply 50% no pruning. This makes the
speed increase for rollouts much smaller.
So, especially for rollouts, I think it would be quite bad if:
* the use of pruning is forced
* reduced evaluations are lost
* no research is done on positions where 2-ply pruning seems to do
really badly
* the "1-ply" bug isn't fixed
If any of you are interested in one or more of these points, I'm happy to
try and find the relevant positions/results and/or discuss these points
further.
Thanks very much, GNUBG-team and keep up the nice work!
-*- END FORWARD
-*-
As of today, several examples of bad behaviour by the pruning net have
been posted on GOL. It seems like even 0-ply rollouts with just a 2-ply
cube can be seriously affected (in a bad way).
Also, it seems that quite often, using 2-ply prune for evaluations in
rollouts produces higher standard errors, meaning that one needs to do
more trials, thereby losing some of the speed advantage. Sometimes,
because of this, it even takes longer to get a significant result with
the pruning net than without it.
One of the worst performances I've seen is in a simple bearoff position,
where a 2-ply prune rollout gave standard errors 10 times as high as
with the old gnubg, which means roughly 100 times as many trials would
be needed to reach the same precision.
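
To make the arithmetic behind that "100 times" explicit, here is a small
sketch in Python. It assumes the trials are independent, so that the
standard error of the rollout result shrinks as 1/sqrt(number of trials).
The function names and the speedup figure of 4 are hypothetical
illustrations, not gnubg code and not a measured result.

```python
def trials_ratio(se_ratio: float) -> float:
    """Factor by which the number of trials must grow to offset a standard
    error that is se_ratio times larger, assuming SE ~ 1/sqrt(trials)."""
    return se_ratio ** 2


def net_time_ratio(se_ratio: float, speedup_per_trial: float) -> float:
    """Total time to reach the same precision, relative to the old build.
    A value above 1 means the supposedly faster build is slower overall."""
    return trials_ratio(se_ratio) / speedup_per_trial


if __name__ == "__main__":
    # Standard errors 10 times as high, as in the bearoff example.
    print(trials_ratio(10.0))         # 100.0
    # Hypothetical: even if each trial ran 4 times faster, the rollout
    # would still take 25 times as long to reach the same precision.
    print(net_time_ratio(10.0, 4.0))  # 25.0
```

Under this assumption, pruning only pays off in rollouts when the speedup
per trial is larger than the square of the increase in standard error.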
I don't know if this is really the pruning net, or perhaps an
implementation that doesn't work, but for now, it's clear to me that the
pruning net version of gnubg is not to be recommended.
I'll stop here for now, but I'm willing to provide more examples, and
more detailed ones, if necessary. For GOL subscribers, you might want to
take a look at the following examples:
http://www.gammonline.com/members/board/config.cgi?read=86267
http://www.gammonline.com/members/board/config.cgi?read=86055
http://www.gammonline.com/members/board/config.cgi?read=86433
http://www.gammonline.com/members/board/config.cgi?read=86554
After finding out that the new pruning net version seems to have some
serious problems, I also reanalysed a few of my recent matches. I reran
some rollouts using exactly the same settings but now with the older
GNUBG. Very often the absolute equities differed by significant amounts;
sometimes the equity differences between plays also changed
significantly. Occasionally, the order of plays changed.
This was usually done with 0-ply and 2-ply 100% cube. 2-ply 100% play
rollouts seem even worse with the new pruning net, from what I've seen
so far.
Hope this will raise some discussion.
Regards,
--
Robert-Jan Veldhuizen