
[Bug-gnubg] Deprelibenchmark 2010


From: Frank Berger
Subject: [Bug-gnubg] Deprelibenchmark 2010
Date: Sat, 23 Jul 2011 13:59:11 +0200

Hi,

as mentioned some time ago, I got the files of the 2010 benchmark from Michael 
Depreli and analyzed the data to construct an XML file that contains all 
positions that were played differently, together with their rollout data. I 
carefully checked the data for consistency (I now have a pretty good idea how 
much work it was to collect this data manually. Kudos to Michael), added 
rollouts, and together with Michael looked at the problems and fixed them. Feel 
free to add some additional quality control. 

The idea of having such a file is to have a benchmark that allows quick (BGB 
needs just half an hour) and verifiable results. Remember how difficult it was 
to estimate bot strength in the past? Remember the Big-Bot-Shootout? 6000 
25-point matches and months of computing time, but for statistically relevant 
results (other than that JF is worse) that is still too few.
The Depreli 2010 benchmark should improve this vastly.

Naturally through the selection of positions, the rollout method etc. there 
might be deviations from the "truth", but it is better by far than anything we 
had before.

My hope is to establish a file format for benchmarks, so that we get more of 
this data in the future. I invite anyone to add suggestions, rollouts (or 
rollout data from a different bot... Xavier?) etc., but in order to keep one 
"master" copy of this file, I ask you to send modifications to me. 

I would be glad if further benchmarks were created, or if software were 
developed that simplifies the creation of such a benchmark, etc. This should 
be just a start, I hope.


You can find the file here: 
http://www.bgblitz.de/Depreli2010/dep_2010_id.xml.zip

The sgf files of the matches are here: http://www.bgblitz.de/Depreli2010/sgf.zip
The Excel file is here: 
http://www.bgblitz.de/Depreli2010/BOT%20SHOOTOUT500R8.xls 
(I asked Michael for permission)

Just to give an idea of how the file looks, I have appended a short piece at the end.

ciao
Frank


<benchmark>
  <displayName>Depreli Benchmark 2010</displayName>
  <comment>Benchmark an AI against approx. 5000 difficult positions</comment>
  <plies>3</plies>
  <benchPositions>
    <benchPosition>
      <cubeDecision>false</cubeDecision>
      <id>010202G</id>
      <positionID>sM/gARTB28EBIg:cAkOAAAACAAE</positionID>
      <xgID>XGID=-Aaa--DBC---dC---b-ebA--A-:0:0:1:34:1:0:0:0:10</xgID>
      <responses>
        <response>
          <move>24-21,6-2</move>
          <equityDiff>0.00</equityDiff>
        </response> 
        <response>
          <move>24-21,13-9</move>
          <equityDiff>-0.008</equityDiff>
        </response> 
        <response>
          <move>24-21,8-4</move>
          <equityDiff>-0.017</equityDiff>
        </response> 
        <response>
          <move>8-1</move>
          <equityDiff>-0.042</equityDiff>
        </response> 
      </responses>
    </benchPosition>
    <benchPosition>
      <cubeDecision>true</cubeDecision>
      <id>010208S</id>
      <positionID>z88BAAzFdg8YAA:cAkAAAAACAAE</positionID>
      <xgID>XGID=-AAb-BBCD------B----c-f-d-:0:0:1:00:1:0:0:0:10</xgID>
      <responses>
        <response>
          <cubeAction>No Double</cubeAction>
          <equityDiff>0.0000</equityDiff>
        </response>
        <response>
          <cubeAction>Double</cubeAction>
          <equityDiff>-0.0100</equityDiff>
        </response>
        <response>
          <cubeAction>Take</cubeAction>
          <equityDiff>-0.2510</equityDiff>
        </response>
        <response>
          <cubeAction>Pass</cubeAction>
          <equityDiff>0.0000</equityDiff>
        </response>
      </responses>
    </benchPosition>
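
To illustrate how a file in this format might be consumed, here is a minimal 
Python sketch. It is not the tool used for the benchmark. It assumes that the 
complete file closes the <benchPositions> and <benchmark> elements that the 
excerpt above leaves open, that a bot's error on a position is the negated 
<equityDiff> of the response it chose, and that a benchmark score is the sum of 
those errors. The names SAMPLE, load_positions and total_loss are hypothetical, 
and SAMPLE is a shortened, hand-closed copy of the excerpt.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the excerpt, closed by hand so that it is well-formed.
SAMPLE = """<benchmark>
  <displayName>Depreli Benchmark 2010</displayName>
  <benchPositions>
    <benchPosition>
      <cubeDecision>false</cubeDecision>
      <id>010202G</id>
      <responses>
        <response><move>24-21,6-2</move><equityDiff>0.00</equityDiff></response>
        <response><move>24-21,13-9</move><equityDiff>-0.008</equityDiff></response>
        <response><move>8-1</move><equityDiff>-0.042</equityDiff></response>
      </responses>
    </benchPosition>
    <benchPosition>
      <cubeDecision>true</cubeDecision>
      <id>010208S</id>
      <responses>
        <response><cubeAction>No Double</cubeAction><equityDiff>0.0000</equityDiff></response>
        <response><cubeAction>Double</cubeAction><equityDiff>-0.0100</equityDiff></response>
      </responses>
    </benchPosition>
  </benchPositions>
</benchmark>"""


def load_positions(xml_text):
    """Map each position id to {response text: equityDiff}."""
    root = ET.fromstring(xml_text)
    positions = {}
    for pos in root.iter("benchPosition"):
        is_cube = pos.findtext("cubeDecision", "").strip() == "true"
        # Cube positions list <cubeAction>, checker plays list <move>.
        key_tag = "cubeAction" if is_cube else "move"
        table = {}
        for resp in pos.iter("response"):
            key = resp.findtext(key_tag).strip()
            table[key] = float(resp.findtext("equityDiff"))
        positions[pos.findtext("id").strip()] = table
    return positions


def total_loss(positions, answers):
    """Sum the equity given up over a bot's answers (assumed scoring rule).

    A choice that is not listed for a position raises KeyError, because the
    file carries no rollout data for it.
    """
    loss = 0.0
    for pos_id, choice in answers.items():
        loss += -positions[pos_id][choice]
    return loss


if __name__ == "__main__":
    positions = load_positions(SAMPLE)
    answers = {"010202G": "24-21,13-9", "010208S": "Double"}
    print(f"total equity loss: {total_loss(positions, answers):.3f}")
```

For the real file, the text of dep_2010_id.xml (after unzipping) would be 
passed to load_positions in place of SAMPLE.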



