Re: [Bug-gnubg] Doubts about the new pruning net

From: Joseph Heled
Subject: Re: [Bug-gnubg] Doubts about the new pruning net
Date: Thu, 04 Nov 2004 06:22:43 +1300
Please post the bearoff position to start with. Since there is no pruning at bearoff (no net there), I would like to see that first.

And please post explicit positions.

I ran the benchmark overnight and, for 13294 moves, I get:
Pruning 124 differences to full 2ply, error 0.00726047 vs. 0.00725554
Reduced33% 914 differences error 0.00755261
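
To put these numbers side by side, here is a minimal Python sketch. It assumes that 0.00725554 is the full 2-ply error and that both lines are measured against that same baseline; the names (TOTAL_MOVES, summarize) are made up for illustration and are not part of gnubg.

```python
# Figures copied from the benchmark output above.
TOTAL_MOVES = 13294
FULL_2PLY_ERROR = 0.00725554  # assumed to be the full 2-ply baseline error

# setting -> (moves that differ from full 2-ply, benchmark error)
RESULTS = {
    "Pruning": (124, 0.00726047),
    "Reduced33%": (914, 0.00755261),
}


def summarize(differences, error, total=TOTAL_MOVES, baseline=FULL_2PLY_ERROR):
    """Return (percent of moves that differ, percent extra error vs. baseline)."""
    diff_pct = 100.0 * differences / total
    extra_error_pct = 100.0 * (error - baseline) / baseline
    return diff_pct, extra_error_pct


if __name__ == "__main__":
    for name, (differences, error) in RESULTS.items():
        diff_pct, extra_pct = summarize(differences, error)
        print(f"{name}: {diff_pct:.2f}% of moves differ, "
              f"{extra_pct:.2f}% extra error")
```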


Jan Veldhuizen wrote:
Hi gnubg team,

I've been reading some of the threads here about the new pruning net. My experiences aren't so great though, and some others report problems as well. In fact, I've given up for now and switched back to the "old" gnubg: too many errors and problems with the new pruning build. I see some others (on GOL) have come to the same conclusion.

Let me first forward my own message on GOL about this:

-*- 29 Oct 2004, Gammonline BBS
-*- http://www.gammonline.com/members/board/config.cgi?read=86333
-*- (members only)

I see on the mailinglist that it is suggested the pruning net is so good that GNUBG doesn't need reduced evaluations anymore.

I VERY STRONGLY disagree with that. It would be a real shame for reduced evaluations to disappear, at least at this stage.

First, there is some bug with the pruning net. Sometimes, 2-ply evaluation, with a 0-ply filter and skip 1-ply pruning, actually stops evaluating, producing a 1-ply result. This can have pretty bad consequences for GNUBG's playing strength. It happens only rarely, but it makes the pruning net at least a bit unreliable for now.

Second, for rollouts, 2-ply with pruning seems to be performing badly in some situations. Worse than 0-ply, in fact. That is pretty terrible.

I don't know if it's a bug or if this is simply the effect of the pruning net, but either way, it is bad enough that I don't want to use 2-ply prune for rollouts.

Also, while pruning gives a nice speed increase for playing, evaluating and analyzing, the results for rollouts don't show all that much of a speed increase.

Unfortunately, the rollout settings that many people agree are among the best in general use a 2-ply 25% or 33% cube. I haven't timed it, but 2-ply 100% with pruning doesn't seem any faster, and it seems unlikely to be significantly more accurate.

Another thing is that 2-ply 50% speed checker play without pruning seems to be almost as good as 2-ply 100% without pruning, while I'm not sure 2-ply 100% WITH pruning is even as good as 2-ply 50% without pruning. This makes the effective speed increase for rollouts much smaller.

So, especially for rollouts, I think it would be quite bad if:

    * the use of pruning is forced
    * reduced evaluations are lost
    * no research is done on positions where 2-ply pruning seems to do really bad
    * the "1-ply" bug isn't fixed

If any of you is interested in one or more of these points, I'm happy to try and find the relevant positions/results and/or discuss these points further.

Thanks very much, GNUBG-team and keep up the nice work!


As of today, several examples of bad behaviour by the pruning net have been posted on GOL. It seems like even 0-ply rollouts with just a 2-ply cube can be seriously affected (in a bad way).

Also, it seems that using 2-ply prune for evaluations in rollouts quite often produces higher standard errors, meaning that one needs to do more trials, thereby losing some of the speed advantage. Sometimes, because of this, it even takes longer to get a significant result with the pruning net.

One of the worst performances I've seen is in a simple bearoff position, where a 2-ply prune rollout gave standard errors 10 times as high as with the old gnubg, which would come down to 100 times more trials needed.
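
The "10 times the standard error, 100 times the trials" step follows from the usual assumption that the standard error of a rollout mean falls as 1/sqrt(N) with independent trials. A minimal Python sketch of that arithmetic (the function names are made up for illustration and are not gnubg code):

```python
import math


def standard_error(sigma, n_trials):
    """Standard error of a mean over n_trials independent trials,
    assuming each trial has standard deviation sigma."""
    return sigma / math.sqrt(n_trials)


def trials_ratio(se_ratio):
    """Factor by which the trial count must grow to cancel a standard
    error that is se_ratio times larger, given SE ~ sigma / sqrt(N)."""
    return se_ratio ** 2


if __name__ == "__main__":
    # A 10x larger standard error needs 10**2 times as many trials.
    print(trials_ratio(10))
    # Check: 10x the per-trial spread with 100x the trials gives the same SE.
    print(standard_error(1.0, 100), standard_error(10.0, 100 * 100))
```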

I don't know if this is really the pruning net, or perhaps an implementation that doesn't work, but for now, it's clear to me that the pruning net version of gnubg is not to be recommended.

I'll stop here for now but I'm willing to provide more and more detailed examples if necessary. For GOL subscribers, you might want to take a look at the following examples:



After finding out that the new pruning net version seems to have some serious problems, I also reanalysed a few of my recent matches. I reran some rollouts using exactly the same settings but now with the older GNUBG. Very often the absolute equities differed by significant amounts; sometimes the equity differences between plays also changed significantly. Occasionally, the order of plays changed.

This was usually done with 0-ply and 2-ply 100% cube. 2-ply 100% play rollouts seem even worse with the new pruning net, from what I've seen so far.

Hope this will raise some discussion.


