[Top][All Lists]
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [Bug-gnubg] Re: Race database and bearoff DB
From: |
Jim Segrave |
Subject: |
Re: [Bug-gnubg] Re: Race database and bearoff DB |
Date: |
Wed, 2 Oct 2002 16:37:50 +0200 |
User-agent: |
Mutt/1.4i |
On Tue 01 Oct 2002 (16:19 +0000), Joern Thyssen wrote:
> On Tue, Oct 01, 2002 at 05:20:15PM +0200, Jim Segrave wrote
>
> Basically it's 32 unsigned short ints for each position. The current
> one-sided bear-off database is for 15 chequers on the 1-6 points, a
> total of N(21,6)=54264.
>
> Each of the 32 integers represent the probability that we bear off in i
> moves (multiplied by 255).
>
> #Pos #Short ints #bytes
> 15 chrs on p1-6 : N(21,6) 54264 1736448 3.4 MB
> 15 chrs on p1-7 : N(22,7) 170544 5457408 5.4 MB
> 15 chrs on p1-8 : N(23,8) 490314 15690048 31.3 MB
> 15 chrs on p1-9 : N(24,9) 1307504 41840128 83.7 MB
> 15 chrs on p1-10: N(25,10) 3268760 104600320 209.2 MB
> 15 chrs on p1-11: N(26,11) 7726160 247237120 494.5 MB
> 15 chrs on p1-12: N(27,12) 17383860 556283520 1112.6 MB
>
> We currently use 15/1-6 which easily fits into memory and can be
> referenced as:
>
> aaBearoff1 ( position, i )
I just tried a little crude perl script to see how many values are
unneeded in the database.
My plan was:
the probability of clearing in n rolls is 0, if the number of chequers
is more than 4*n. In these situations, you could omit the beginning of
the values for a position, since they are known to be 0.
the probability of having cleared after n rolls in a position is 1 if
(n * 3 >= pip count) and (n * 2) >= number of chequers since you'd
still clear everything even with repeated 21 rolls.
Both the above are cheap to evaluate, so if the database can easily
handle variable length entries, there's a lot of savings to be made:
For a 6 point rolloff database, this means that:
Positions done: 54264
Entries with zero probability: 141014
Entries with probability one: 904326
Entries with fractional probability: 691108
That gives a database with only 39.8% as many entries as currently.
For the gross case of a 12 point DB, things are less promising:
Positions done: 17383860
Entries with zero probability: 49319633
Entries with probability one: 28973060
Entries with fractional probability: 477990827
For a reduction to 85% of the original.
There's a question in my mind if having 32 entries/position is
actually sensible. For a 6 point DB, the longest bearoff is 30
rolls. But a realistic appraisal, even for 15 checkers on the 6 point,
is going to have 30 rolls as a huge outlier - it requires 29
consecutive 2/1 rolls, which I put as being (1/18)**29, or around
4*10**-30.
There is probably some much more realistic threshold after which the
probability of clearing all checkers from the 1 through 6 points is
effectivly 1, particularly given that the database can't give a
probability less than 10**-5 in a short.
Now, if I'd paid more attention in statistics classes years ago, I'd
be able to answer this myself, but I didn't and I can't.
Shouldn't we be able to say:
Assume the worst case bear-off is 15 chequers on point m (currently we
have m = 6). This requires 15*m pips, so to clear all the chequers in
r rolls, we need to average 15*m/r pips/roll. Dice rolling gives a
normal distribution and the average of a set of r dice rolls is also
normally distributed around 8.17. We should be able to come up with a
number of rolls r such that the probability that (total pips for r
rolls is .ge. 15*m) > 1-1/65536 (the limit of our resolution for a
unsigned short). Then for an m point rollout DB, there's no point
having more than that number of rolls per position. I'm willing to bet
that for a 6 point rollout DB, it's somewhere around 15 to 20 rolls,
not the 32 we have now.
A side note for the mathematically inclined (totally irrelevant to
gnubg):
I was interested in the N(i, j) in the above list. In fact I foolishly
misread it as being about the number of combinations of i things taken
j at a time and was going to send an email pointing out how wrong this
was, but then I realised that it had nothing to do with C(i,j) and
that the figures are right. I'd already written a little perl script
to work out the number of positions for various numbers of points,
which not surprisingly, matches the above. But the series has an
interesting property. If you list the elements of the series and the
multiplier to get the next element you get the following pleasing
pattern (p[i.j] is the number of distinct positions of i checkers over
j points (including some checkers possibly being borne off):
multiplier to get next term
p[15, 1] = 1 16/1
p[15, 2] = 16 17/2
p[15, 3] = 136 18/3
p[15, 4] = 816 19/4
p[15, 5] = 3876 20/5
p[15, 6] = 15504 21/6
p[15, 7] = 54264 22/7
p[15, 8] = 170544 23/8
p[15, 9] = 490314 24/9
p[15, 10] = 1307504 25/10
p[15, 11] = 3268760 26/11
p[15, 12] = 7726160 27/12
p[15, 13] = 17383860 28/13
p[15, 14] = 37442160
I'm sure there's a reason for this, but I can't see it.
--
Jim Segrave address@hidden