bug-gnubg

Re: [Bug-gnubg] Doubts about the new pruning net


From: Robert-Jan Veldhuizen
Subject: Re: [Bug-gnubg] Doubts about the new pruning net
Date: Thu, 04 Nov 2004 15:53:16 +0100
User-agent: Mozilla Thunderbird 0.8 (Windows/20040913)

Hi Joseph,

Here's the simple bearoff example, with evaluations and rollouts, first done with the pruning build, then without it.

I think all experts agree that 6/off 2/off is the best play here, even if it's just by a very tiny amount. Rollouts with the "old" gnubg confirm this. Even the evaluations get it right, both with and without pruning.

However, the rollouts with the pruning net build are terrible here.

***********************************************************************

The score (after 0 games) is: White 0, Blue 0 (match to 11 points)
Match Information:

Date: October 22, 2004
Move number 38: X to play 62

    GNU Backgammon  Position ID: d3cFAgCXfQMAAA
                    Match ID   : QQlrAQAAAAAA
    +24-23-22-21-20-19------18-17-16-15-14-13-+  O: White (Cube: 2)
    | O  O  O  O  O  O |   |                O |  0 points
    | O  O  O  O       |   |                  |
    | O  O  O  O       |   |                  |
    |                  |   |                  |
    |                  |   |                  |
    |                  |BAR|                  |v 11 point match
    |             X    |   |                  |
    |             X    |   |                  |
    | X           X    |   |                  |
  X | X        X  X  X |   |                  |  Rolled 62
  X | X  X     X  X  X |   |                  |  0 points
    +-1--2--3--4--5--6-------7--8--9-10-11-12-+  X: Blue
Pip counts: O 53, X 50

* Blue moves 6/off 2/off
Annotation:
    1. Cubeful 2-ply    6/off 2/off                  Eq.:  +0.5842
        81.98%   0.00%   0.00% -  18.02%   0.00%   0.00%
        2-ply cubeful prune [world class]
    2. Cubeful 2-ply    6/off 5/3                    Eq.:  +0.5801 ( -0.0041)
        81.80%   0.00%   0.00% -  18.20%   0.00%   0.00%
        2-ply cubeful prune [world class]

    1. Cubeful 3-ply    6/off 2/off                  Eq.:  +0.5846
        81.98%   0.00%   0.00% -  18.02%   0.00%   0.00%
        3-ply cubeful [grandmaster]
    2. Cubeful 3-ply    6/off 5/3                    Eq.:  +0.5789 ( -0.0057)
        81.73%   0.00%   0.00% -  18.27%   0.00%   0.00%
        3-ply cubeful [grandmaster]

    1. Cubeful 4-ply    6/off 2/off                  Eq.:  +0.5834
        81.99%   0.00%   0.00% -  18.01%   0.00%   0.00%
        4-ply cubeful
    2. Cubeful 4-ply    6/off 5/3                    Eq.:  +0.5775 ( -0.0058)
        81.74%   0.00%   0.00% -  18.26%   0.00%   0.00%
        4-ply cubeful
===================================================
    1. Rollout          6/off 2/off                  Eq.:  +0.5893
81.99% 0.00% 0.00% - 18.01% 0.00% 0.00% CL +0.6399 CF +0.5893 [ 0.02% 0.00% 0.00% - 0.02% 0.00% 0.00% CL 0.0003 CF 0.0009]
        Full cubeful rollout with var.redn.
2592 games, Mersenne Twister dice gen. with seed 950646656 and quasi-random dice
        Play: 0-ply cubeful prune [beginner]
        Cube: 0-ply cubeful prune [beginner]
    2. Rollout          6/off 5/3                    Eq.:  +0.5837 ( -0.0056)
81.74% 0.00% 0.00% - 18.26% 0.00% 0.00% CL +0.6348 CF +0.5837 [ 0.01% 0.00% 0.00% - 0.01% 0.00% 0.00% CL 0.0001 CF 0.0008]
        Full cubeful rollout with var.redn.
2592 games, Mersenne Twister dice gen. with seed 950646656 and quasi-random dice
        Play: 0-ply cubeful prune [beginner]
        Cube: 0-ply cubeful prune [beginner]

    1. Rollout          6/off 5/3                    Eq.:  +0.5977
79.39% 0.00% 0.00% - 20.61% 0.00% 0.00% CL +0.5878 CF +0.5977 [ 0.68% 0.00% 0.00% - 0.68% 0.00% 0.00% CL 0.0136 CF 0.0196]
        Full cubeful rollout with var.redn.
2592 games, Mersenne Twister dice gen. with seed 950646656 and quasi-random dice
        Play: 0-ply cubeful prune [beginner]
        Cube: 2-ply cubeful prune [world class]
    2. Rollout          6/off 2/off                  Eq.:  +0.5956 ( -0.0022)
79.36% 0.00% 0.00% - 20.64% 0.00% 0.00% CL +0.5872 CF +0.5956 [ 0.68% 0.00% 0.00% - 0.68% 0.00% 0.00% CL 0.0137 CF 0.0195]
        Full cubeful rollout with var.redn.
2592 games, Mersenne Twister dice gen. with seed 950646656 and quasi-random dice
        Play: 0-ply cubeful prune [beginner]
        Cube: 2-ply cubeful prune [world class]
----------------------------------------------------
    1. Rollout          6/off 5/3                    Eq.:  +0.6213
81.33% 0.00% 0.00% - 18.67% 0.00% 0.00% CL +0.6267 CF +0.6213 [ 0.59% 0.00% 0.00% - 0.59% 0.00% 0.00% CL 0.0117 CF 0.0175]
        Full cubeful rollout with var.redn.
2592 games, Mersenne Twister dice gen. with seed 950646656 and quasi-random dice
        Play:  2-ply cubeful prune [world class]
keep the first 0 0-ply moves and up to 5 more moves within equity 0.04
        Skip pruning for 1-ply moves.
        Cube: 2-ply cubeful prune [world class]
        Different evaluations after 4 plies:
        Play: 0-ply cubeful prune [beginner]
        Cube: 2-ply cubeful prune [world class]
    2. Rollout          6/off 2/off                  Eq.:  +0.6170 ( -0.0043)
81.19% 0.00% 0.00% - 18.81% 0.00% 0.00% CL +0.6237 CF +0.6170 [ 0.58% 0.00% 0.00% - 0.58% 0.00% 0.00% CL 0.0117 CF 0.0174]
        Full cubeful rollout with var.redn.
2592 games, Mersenne Twister dice gen. with seed 950646656 and quasi-random dice
        Play:  2-ply cubeful prune [world class]
keep the first 0 0-ply moves and up to 5 more moves within equity 0.04
        Skip pruning for 1-ply moves.
        Cube: 2-ply cubeful prune [world class]
        Different evaluations after 4 plies:
        Play: 0-ply cubeful prune [beginner]
        Cube: 2-ply cubeful prune [world class]
----------------------------------------------------
    1. Rollout          6/off 5/3                    Eq.:  +0.5426
78.20% 0.00% 0.00% - 21.80% 0.00% 0.00% CL +0.5639 CF +0.5426 [ 0.69% 0.00% 0.00% - 0.69% 0.00% 0.00% CL 0.0138 CF 0.0197]
        Full cubeful rollout with var.redn.
2592 games, Mersenne Twister dice gen. with seed 950646656 and quasi-random dice
        Play:  2-ply cubeful prune [world class]
keep the first 0 0-ply moves and up to 5 more moves within equity 0.04
        Skip pruning for 1-ply moves.
        Cube: 2-ply cubeful prune [world class]
    2. Rollout          6/off 2/off                  Eq.:  +0.5413 ( -0.0013)
78.09% 0.00% 0.00% - 21.91% 0.00% 0.00% CL +0.5617 CF +0.5413 [ 0.69% 0.00% 0.00% - 0.69% 0.00% 0.00% CL 0.0138 CF 0.0196]
        Full cubeful rollout with var.redn.
2592 games, Mersenne Twister dice gen. with seed 950646656 and quasi-random dice
        Play:  2-ply cubeful prune [world class]
keep the first 0 0-ply moves and up to 5 more moves within equity 0.04
        Skip pruning for 1-ply moves.
        Cube: 2-ply cubeful prune [world class]

========== SWITCH BACK: NO PRUNE !!!!! ===============================
same random seed
0-ply: NO CHANGE

    1. Rollout          6/off 2/off                  Eq.:  +0.5892
82.00% 0.00% 0.00% - 18.00% 0.00% 0.00% CL +0.6399 CF +0.5892 [ 0.02% 0.00% 0.00% - 0.02% 0.00% 0.00% CL 0.0003 CF 0.0009]
        Full cubeful rollout with var.redn.
2592 games, Mersenne Twister dice gen. with seed 950646656 and quasi-random dice
        Play: 0-ply cubeful [expert]
        Cube: 2-ply cubeful 100% speed [world class]
    2. Rollout          6/off 5/3                    Eq.:  +0.5837 ( -0.0055)
81.74% 0.00% 0.00% - 18.26% 0.00% 0.00% CL +0.6348 CF +0.5837 [ 0.01% 0.00% 0.00% - 0.01% 0.00% 0.00% CL 0.0001 CF 0.0008]
        Full cubeful rollout with var.redn.
2592 games, Mersenne Twister dice gen. with seed 950646656 and quasi-random dice
        Play: 0-ply cubeful [expert]
        Cube: 2-ply cubeful 100% speed [world class]
----------------------------------------------------------
    1. Rollout          6/off 2/off                  Eq.:  +0.5865
81.83% 0.00% 0.00% - 18.17% 0.00% 0.00% CL +0.6366 CF +0.5865 [ 0.01% 0.00% 0.00% - 0.01% 0.00% 0.00% CL 0.0002 CF 0.0005]
        Full cubeful rollout with var.redn.
2592 games, Mersenne Twister dice gen. with seed 950646656 and quasi-random dice
        Play:  2-ply cubeful 100% speed [world class]
keep the first 0 0-ply moves and up to 5 more moves within equity 0.04
        Skip pruning for 1-ply moves.
        Cube: 2-ply cubeful 100% speed [world class]
    2. Rollout          6/off 5/3                    Eq.:  +0.5842 ( -0.0022)
81.73% 0.00% 0.00% - 18.27% 0.00% 0.00% CL +0.6346 CF +0.5842 [ 0.00% 0.00% 0.00% - 0.00% 0.00% 0.00% CL 0.0001 CF 0.0005]
        Full cubeful rollout with var.redn.
2592 games, Mersenne Twister dice gen. with seed 950646656 and quasi-random dice
        Play:  2-ply cubeful 100% speed [world class]
keep the first 0 0-ply moves and up to 5 more moves within equity 0.04
        Skip pruning for 1-ply moves.
        Cube: 2-ply cubeful 100% speed [world class]



Output generated Thu Nov 04 15:27:45 2004
by GNU Backgammon 0.14-devel (Text Export version 1.68)

************************************************************

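Whatever the cause, something must be (very) wrong here. The standard errors and absolute equities of the prune rollouts are clearly off: with the same seed and the same 2592 games, the cubeful standard errors are around 0.02 with pruning versus 0.001 or less without it, and the pruning rollouts prefer 6/off 5/3.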
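
To put a rough number on that, here is a small Python sketch that expresses each rollout's margin between the two plays in units of the combined standard error. The figures are copied from the cubeful (CF) columns of the export above; the helper name `margin_in_std_errors` is just mine for illustration, and treating the two plays' rollouts as independent is an assumption (they share a seed, so the true error of the difference may well be smaller).

```python
import math

# (label, equity margin between the two plays, CF std. error of play 1, CF std. error of play 2)
# All numbers are copied from the rollout export above.
ROLLOUTS = [
    ("prune,    0-ply play, 2-ply cube",               0.0022, 0.0196, 0.0195),
    ("prune,    2-ply play, 0-ply play after 4 plies", 0.0043, 0.0175, 0.0174),
    ("prune,    2-ply play, 2-ply cube",               0.0013, 0.0197, 0.0196),
    ("no prune, 0-ply play, 2-ply cube",               0.0055, 0.0009, 0.0008),
    ("no prune, 2-ply play, 2-ply cube",               0.0022, 0.0005, 0.0005),
]


def margin_in_std_errors(margin, se_a, se_b):
    """Margin divided by the combined standard error.

    Assumes the two rollouts are independent, which is only an
    approximation here because both plays used the same seed.
    """
    return margin / math.hypot(se_a, se_b)


for label, margin, se_a, se_b in ROLLOUTS:
    ratio = margin_in_std_errors(margin, se_a, se_b)
    print(f"{label:48s} margin = {ratio:5.2f} combined std. errors")
```

Under that assumption, the margins in the three pruning rollouts listed here are a small fraction of one combined standard error, so they cannot really separate the two plays, while the non-pruning rollouts separate them by several standard errors.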




>I run the benchmark overnight and I get for 13294 moves
>Pruning 124 differences to full 2ply, error 0.00726047 vs. 0.00725554
>Reduced33% 914 differences error 0.00755261

I don't know what these figures mean. What benchmark are you running? One position rolled out many trials, or many different positions evaluated? The real problems I'm getting are with rollouts.

Are the differences you report for checker play, cube or both? Is it money play, or also matchplay?

FWIW, my own conclusions from letting gnubg play many games against itself are:

2-ply 25% and 33% aren't to be recommended for checker play
2-ply 50% is very good though, almost the same as 2-ply 100%

For CUBE, reduced works much better. 2-ply 25% is pretty good already, and 2-ply 33% is almost as good as 2-ply 100%.

Based on that, one of the fastest and best settings with GNUBG which I'd recommend to anyone (especially for rollouts and play), is: 2-ply 50% checker play, 2-ply 33% cube. I think Michael Depreli also did some tests on this and came to a pretty similar conclusion.

So, I'm comparing the new pruning net 2-ply (100%) evaluations mostly to those settings. Especially for the cube, I like reduced evaluations very much and it doesn't seem like 2-ply prune 100% cube is a real improvement in accuracy or speed.

Greetings,
--
Robert-Jan Veldhuizen




