
RE: [Bug-gnubg] gnubg.sql - stats from my database


From: djskope
Subject: RE: [Bug-gnubg] gnubg.sql - stats from my database
Date: Wed, 24 Nov 2010 16:42:00 -0000

Hi Jim,

Many thanks for your comprehensive and helpful reply.

I will try to run your scripts tomorrow when I'm a bit fresher.
I don't personally know any Python programmers so I'm not sure if it will go
much further than that.

Also I tried to check the link at 

www.xs4all.nl/~jes-2

but it is giving a 404 error at the moment?

Finally, you did ask a couple of clarification questions which I will answer
here:

>>> + number of occasions successfully hit a single blot when within 
>>> 1 dice roll range (when you actually want to hit it i.e. not leaving 
>>> something silly open for your opponent to get off the bar and return a 
>>> hit. I know this bit sounds difficult to calculate) / vs gnubg's stats

>>You'd need to define what you actually mean a lot better. I don't know
>>what you mean by gnubg's stats
By 'vs gnubg's stats' I meant posing the same query again, but not for the
human - for the computer. (For comparison)

>> Two dice rolls? or including both dice?
I meant including both dice.

>> It's more profitable to study your mistakes - see the graphs on my
>> website to see the effect on winning percentages of playing with a
>> slightly lower EMG error rate than your opponent.


Best Regards,

djskope

-----Original Message-----
From: Jim Segrave [mailto:address@hidden]
Sent: 24 November 2010 15:04
To: address@hidden
Cc: address@hidden
Subject: Re: [Bug-gnubg] gnubg.sql - stats from my database

On Tue 16 Nov 2010 (17:56 +0000), address@hidden wrote:
> Hello Jim,
> 
> Thanks for your advice and kind offer to help. How do your scripts 
> run? Do you just feed it a folder of *.sgf files?

Yes - you invoke the script with the name of a directory containing sgf
files. They are all in Python. I have only run them under Linux; they should
work under Windows, but I make no guarantees.

script 'doubles':
The output is a bit opaque; I actually fed this into other scripts to
generate graphs. Basically, for each .sgf file in the directory, you get an
output which begins with a line

## name of sgf file

For each game you get one line with 6 columns:
 col 1 - doubles for the winner minus doubles for the loser
 col 2 - as above, but only usable doubles
 col 3 - total doubles for the winner
 col 4 - total usable doubles for the winner
 col 5 - total doubles for the loser
 col 6 - total usable doubles for the loser

The final line gives the same columns totalled over the whole match. This
line begins with a '#'.

Here's sample output from a match file:

##  /usr/home/jes/bg/brunello-2.sgf
   0    0    0    0    0    0
   4    5    7    6    3    1
   1    1    2    2    1    1
   1    1    3    3    2    2
   0    1    1    1    1    0
  -3   -2    1    1    4    3
   3    2    3    2    0    0
  -1    0    3    3    4    3
   2    2    3    3    1    1
  -3   -3    1    1    4    4
  -3   -3    0    0    3    3
   0    0    2    2    2    2
   0    0    1    1    1    1
  -2    0    2    2    4    2
   2    2    5    5    3    3
#    8    9   67   63   59   54

It doesn't say who won, simply that the winner had 8 more doubles and
9 more usable doubles than the loser.
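
A minimal sketch of reading this output back into Python, assuming the
layout is exactly as described above ('##' header, one six-column line per
game, '#' totals line). The function names are hypothetical and are not
taken from the attached scripts.

```python
def _ints(fields):
    """Return the whitespace-separated integers in `fields`, or None."""
    try:
        return [int(v) for v in fields.split()]
    except ValueError:
        return None


def parse_doubles_output(text):
    """Parse 'doubles'-style output into {sgf name: {"games", "total"}}."""
    matches = {}
    current = None
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("##"):
            # Header line: start collecting rows for a new .sgf file.
            current = {"games": [], "total": None}
            matches[line[2:].strip()] = current
        elif current is None or not line:
            continue
        elif line.startswith("#"):
            current["total"] = _ints(line[1:])
        else:
            row = _ints(line)
            if row is not None:  # skip separators such as '====='
                current["games"].append(row)
    return matches


def differences_consistent(row):
    """Check col 1 == col 3 - col 5 and col 2 == col 4 - col 6."""
    diff, usable_diff, win, win_usable, lose, lose_usable = row
    return diff == win - lose and usable_diff == win_usable - lose_usable
```

For the totals line above, differences_consistent([8, 9, 67, 63, 59, 54])
holds, since 67 - 59 = 8 and 63 - 54 = 9.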
================

script 'hits':

As above, you run it with the name of a directory containing .sgf files. It
goes through each .sgf file, totalling hits and pips lost to hits. It prints
a similar set of output (this is for the same match as previously):

##  /usr/home/jes/bg/brunello-2.sgf
   2    0    0    0      4      2
   7    5    5    1     38     19
   3   -2    0    0     60    -20
   2    1    1    0      4      9
   2    1    1    1     12      9
   5   -1    0    7     59    -13
   3   -1    1   -1     30      7
  13    4    0    5    110     54
   3    2    1    0      5     18
   5    1    1   -1     55     -6
   3    2    0    0     15     18
   3   -3    6   -6     39      6
   2    1    0    0      5     21
   4    2    0    8     11     38
   3    0    0    0     18     25
#   49  -10   15   16    586    -55

col 1 - number of times the winner hit a piece from the loser
col 2 - number of times winner hit loser - number of times loser hit
        winner
col 3 - number of times winner danced on the bar (couldn't get in from
        bar)
col 4 - number of times winner danced - number of times loser danced
col 5 - number of pips winner gained from hitting loser (amount loser
        was set back by being hit)
col 6 - number of pips winner gained from hitting - number of pips
        loser gained from hitting
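
Since the difference columns are winner minus loser, the loser's own hit
and pip figures can be recovered by subtraction. A small sketch, taking the
column description above at face value; the class and function names are
hypothetical, and only the hit and pip columns are derived here.

```python
from typing import NamedTuple


class HitsRow(NamedTuple):
    """One line of 'hits' output, columns 1 to 6 in order."""
    winner_hits: int
    hits_diff: int           # winner hits - loser hits
    winner_dances: int
    dances_diff: int
    winner_pips_gained: int
    pips_diff: int           # winner pips gained - loser pips gained


def parse_hits_row(line):
    """Parse a per-game line or a '#' totals line into a HitsRow."""
    return HitsRow(*(int(v) for v in line.strip().lstrip("#").split()))


def loser_hits(row):
    """Loser's hits = winner's hits minus the difference column."""
    return row.winner_hits - row.hits_diff


def loser_pips_gained(row):
    """Pips the loser gained from hitting, by the same subtraction."""
    return row.winner_pips_gained - row.pips_diff
```

For the third game above (3 -2 0 0 60 -20) this gives 5 hits and 80 pips
gained for the loser.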

These don't match the statistics on the web page you pointed to, but the
internal code for parsing .sgf files would allow gathering all of that
information

For example:
http://www.capp-sysware.com/analysis/octnov2010-dc-dicestudy.txt
  the code already recognises the first roll of a game and extracts
  the value rolled.

Statistics for all Regular Rolls (Excludes First Roll)
  See above - the rolls are available and the first of a game is
  recognised

Statistics for All Regular Rolls from the Bar
  The code in the script 'hits' has a map of the position so it knows
  if the player on roll is on the bar or not

Statistics for all Die values (First rolls included)
  Follows from the first two datasets

Statistics for bringing one checker in off the bar
  the code for recognising dancing and the fact that the program knows
  if there are pieces on the bar make this easy to extract

Statistics for bringing one checker in off the bar
  use the script 'hits' looking for move strings in the form yXyX
  where X is an arbitrary letter a..x
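
A sketch of spotting the yXyX form, assuming the move string is the run of
from/to letter pairs with the two dice digits already stripped off, and
that 'y' stands for the bar as described above. The function names are
hypothetical and not taken from 'hits'.

```python
def bar_entries(move):
    """Return the landing points (letters a..x) of checkers entering from the bar."""
    # Split the move into two-letter from/to pairs, ignoring a stray tail.
    pairs = [move[i:i + 2] for i in range(0, len(move) - 1, 2)]
    return [p[1] for p in pairs if p[0] == "y" and "a" <= p[1] <= "x"]


def is_double_entry(move):
    """True if the first two bar entries land on the same point (yXyX)."""
    entries = bar_entries(move)
    return len(entries) >= 2 and entries[0] == entries[1]
```

Usage: is_double_entry("ysys") is True, while is_double_entry("ysyu") and
is_double_entry("ysmr") are both False.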

Statistics for bringing one checker in off the bar
  can be extracted from the code in 'hits'

Statistics for bringing one checker in off the bar
  can be extracted from the code in 'hits'

> The
> statistics I would be interested in are pretty much the same as the 
> headers from the webpage -
> 
> http://www.capp-sysware.com/analysis/octnov2010-dc-dicestudy.txt
> (I've
> pasted an abbreviated version of this page at the bottom of this 
> post.)
> 
> 
> Plus a couple of others - it would be good if there was also a column 
> for the 'running total all time' for each category.
> 
> + pip-count loss
> from hits / vs gnubg's stats

Not sure what this means

> + Total pip count per game / per match - 
> for me / GNU 

easily calculated from the code in 'hits'

> + Total number of doubles rolled whilst on the bar for me 
> / GNU 

combine the code in 'doubles' with the code in 'hits'

> + number of occasions successfully hit a single blot when within 
> 1 dice roll range (when you actually want to hit it i.e. not leaving 
> something silly open for your opponent to get off the bar and return a 
> hit. I know this bit sounds difficult to calculate) / vs gnubg's stats

You'd need to define what you actually mean a lot better. I don't know
what you mean by gnubg's stats

> + number of occasions successfully hit a single blot when within 2 dice 
> rolls range (when you actually want to hit it) / vs gnubg's stats

Two dice rolls? or including both dice? 
 
> Finally, I know that you are not the person to implement it, but do you 
> think this was an interesting idea, 

I've attached the scripts; find a Python programmer to take it
further. Note that these scripts were written to analyse the
relationship between various forms of 'lucky' rolls vs. winning rates.
(I then plotted the results vs. gnubg's error rate in EMG in a vain
attempt to put some reality into arguments that backgammon was 60%
luck 40% skill across the board.) There are some plots of the results
from 1,034 matches consisting of 17,867 games I played on
www.dailygammon.com against a mixture of players both far better than
me and a few far weaker.

see www.xs4all.nl/~jes-2

Most of these were 21 pt matches. I did a
least squares fit of the various plots of error rate difference vs
winning percentage. Thus all the statistics are gathered in terms of
the winner, without regard as to who that was in a given game or
match, whereas you seem to want to match one or more opponents
vs. gnubg. That would involve some simple changes in the scripts - the
first record of the first game in each .sgf file will identify the
player names and can be used to sort the statistics that way.
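
A sketch of pulling the player names out of an .sgf file, assuming the
first record carries the standard SGF PW[...] and PB[...] properties. The
regular expression and function name are illustrative only and are not
taken from the attached scripts.

```python
import re

# PW[...] / PB[...] with an optional backslash-escaped ']' inside the value.
# The lookbehind avoids matching inside a longer property name.
PLAYER = re.compile(r"(?<![A-Z])P([WB])\[((?:[^\]\\]|\\.)*)\]")


def player_names(sgf_text):
    """Return {'W': name, 'B': name} from the first occurrence of each."""
    names = {}
    for colour, name in PLAYER.findall(sgf_text):
        names.setdefault(colour, name)  # keep the first game's value
    return names
```

With the names in hand, each game's statistics can be filed under the
player rather than under winner or loser.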

As to the question 'is this interesting?': if the point is to
analyse the dice gnubg rolls to see if it's somehow luckier than
anyone playing against it, then no, it isn't.

To be fair, unless you have good reason to believe an on-line money
site is cheating with the dice (and you will need a _lot_ of data to
make a case that the dice are biased), then no, it's not
interesting. Generating random dice rolls for a bg server is trivial
to do well enough for almost all practical purposes.
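
As a rough illustration of the kind of check a bias claim would need, here
is a Pearson chi-square statistic for counts of the six die faces against
a fair die. This is a generic textbook test, not something the attached
scripts do, and the names are hypothetical.

```python
# 5% critical value of the chi-square distribution with 5 degrees of freedom.
CRITICAL_5PCT_DF5 = 11.07


def chi_square_uniform(counts):
    """Pearson chi-square statistic of `counts` against equal expected counts."""
    total = sum(counts)
    expected = total / len(counts)
    return sum((c - expected) ** 2 / expected for c in counts)


def looks_biased(face_counts):
    """True if six face counts exceed the 5% critical value."""
    return chi_square_uniform(face_counts) > CRITICAL_5PCT_DF5
```

For the same proportions the statistic grows in step with the number of
rolls, so a small real bias only stands out from chance once the sample
is large.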
 
> > Another
> > clarifying feature 
> would be, after I lose to gnubg (again), to be able 
> > to play the game 
> again.

You could certainly extract the dice rolls to a file from a saved match and
then run gnubg again with the saved dice, assigning the saved rolls to
either player (so you could swap them). You'd need to decide how you
handle games which go longer (and thus have no rolls) or shorter (so
the rolls would end up in the next game).
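
A sketch of the extraction step, assuming each move record in the .sgf
file looks like ;B[52...] or ;W[31...], with the two dice as the first two
characters inside the brackets. Feeding the rolls back into gnubg is not
shown, and the function names are hypothetical.

```python
import re

# A move record: ';', the player letter, then '[' and two dice digits.
# Cube records such as ;W[double] do not start with digits and are skipped.
ROLL = re.compile(r";\s*([BW])\[([1-6])([1-6])")


def extract_rolls(sgf_text):
    """Return the rolls in file order as (player, die1, die2) tuples."""
    return [(p, int(a), int(b)) for p, a, b in ROLL.findall(sgf_text)]


def swap_players(rolls):
    """Give each player the rolls the other one had."""
    other = {"B": "W", "W": "B"}
    return [(other[p], a, b) for p, a, b in rolls]
```

This treats the whole file as one sequence of rolls; what to do when a
replayed game runs longer or shorter than the original is still an open
choice.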

> > But we would swap the CPU's dice rolls with my own.
> > This 
> would clearly show if gnubg would still win when you have his now 
> > 
> predetermined 'lucky rolls'.

Unless you are a _very_ strong player, gnubg will win in the long run

> or is it a waste of time for me to record 
> the dice rolls and try and replay the games?

It's more profitable to study your mistakes - see the graphs on my
website to see the effect on winning percentages of playing with a
slightly lower EMG error rate than your opponent.

 
> P.S. Your other 
> suggestion sounds good too but I'm sure your round tuit list is just as 
> big as mine!
> P.P.S. I played a lot yesterday - lost 10 games to 5 
> against gnubg. Today I'm winning 3-2 :)
> 
> Cheers,
> 
> djskope 

-- 
Jim Segrave           address@hidden





