[Bug-gnubg] large one-sided bearoff db


From: Joern Thyssen
Subject: [Bug-gnubg] large one-sided bearoff db
Date: Sat, 5 Oct 2002 07:53:40 +0000
User-agent: Mutt/1.4i

Hi all,

I've checked in an initial implementation of large one-sided bearoff
databases. The file sizes are twice as large as I reported earlier,
since I store 32 short ints for bearing off all chequers as well as 32
short ints for bearing off just one chequer. The latter distribution is
used to calculate gammon probabilities.
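
Roughly, the gammon calculation combines the two distributions as
sketched below. This is a sketch only -- the function and array names,
the use of floats, and the exact on-roll convention are illustrative,
not the checked-in code:

/* arProbAll[ i ]: probability that the player on roll needs exactly i
 *                 rolls to bear off all chequers.
 * arProbOne[ j ]: probability that the opponent needs exactly j rolls
 *                 to bear off his first chequer.
 * When the player finishes on his i'th roll the opponent has had i - 1
 * rolls, so it is a gammon if the opponent needs at least i rolls to
 * get his first chequer off. */
static float
GammonProb( const float arProbAll[ 32 ], const float arProbOne[ 32 ] ) {

  int i, j;
  float r = 0.0f;

  for ( i = 0; i < 32; i++ ) {
    float rOppNoneOff = 0.0f;      /* P( opponent needs >= i rolls ) */
    for ( j = i; j < 32; j++ )
      rOppNoneOff += arProbOne[ j ];

    r += arProbAll[ i ] * rOppNoneOff;
  }

  return r;
}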

Both win percentages and gammon percentages look extremely good.

The implementation is very simple and straightforward. Possible
improvements include:

* use some caching scheme (see the sketch below)
* compress the file [*]
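
For the caching scheme, something as simple as a direct-mapped cache of
the raw 128-byte records would probably be enough. A minimal sketch --
the names and sizes are only illustrative, and ReadBearoffRecord is a
hypothetical helper doing the lseek/read shown further down:

/* hypothetical helper: read the 128-byte record for a position */
extern int ReadBearoffRecord( unsigned int nPosID, unsigned char *ac );

#define CACHE_SIZE 0x10000            /* 64k entries, about 8 MB */

typedef struct _osdbcacheentry {
  int fValid;                         /* zero-initialised: entry empty */
  unsigned int nPosID;                /* position number */
  unsigned char ac[ 128 ];            /* raw record: 64 short ints */
} osdbcacheentry;

static osdbcacheentry aCache[ CACHE_SIZE ];

static const unsigned char *
CacheLookup( unsigned int nPosID ) {

  osdbcacheentry *pe = &aCache[ nPosID % CACHE_SIZE ];

  if ( ! pe->fValid || pe->nPosID != nPosID ) {
    /* miss: fetch the record from the database file and remember it */
    if ( ReadBearoffRecord( nPosID, pe->ac ) < 0 )
      return NULL;
    pe->nPosID = nPosID;
    pe->fValid = 1;
  }

  return pe->ac;
}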

The Makefile has targets:

bearoff-database    (generates the usual gnubg.bd (unchanged))
bearoff-database-7
bearoff-database-8
bearoff-database-9
bearoff-database-10
bearoff-database-11
...
bearoff-database-18

Tonight I generated the 10 point database, which is 400 MB and takes
approximately 6 hours to generate on my 1 GHz laptop. I don't think we
should distribute the database yet -- I prefer to wait until the file
size has been reduced.


I'm a bit worried about how to read the file if it's compressed. The
current algorithm for reading the file is:

EvalOS ( position number ) {

   /* k is the size of some initial data */
   offset = k + position number * 64 * 2
   lseek ( offset )
   read ( 128 bytes )   /* 32 short ints for win + 32 for gammon */

}

The proposed solution for reducing the file size must keep reading the
file simple. For example, I don't think gzip'ing the file falls into
that category.

I did some statistics on the 10 point database:

             #elements       #non-zero       #zero
win        104 600 320      43 081 501      61 518 819
gammon     104 600 320      33 757 474      70 842 846

Although we keep 32 elements, a maximum of 22 are non-zero.

With the most complicated compression scheme (storing only the non-zero
elements) the file would shrink from 418 401 280 bytes to
153 677 950 bytes. A simple approach would be to store only 22 elements
plus an index, which would give a file size of 300 725 920 bytes.
Another possibility is to group the positions:

position numbers 0 - 2M:
   19 elements for win, 12 elements for gammon
position numbers 2M - 3.2M:
   22 elements for win, 22 elements for gammon

The file size is now 241 MB. With the optimal grouping the file size
would be 218 726 024 bytes. With this scheme it's trivial to find the
position in the file for a given position number.
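
For example, with the two groups above the offset of a record could be
computed along these lines (a sketch; the function name is illustrative
and k is the size of the initial data as before):

#define GROUP_BOUNDARY  2000000L               /* first 2M positions */
#define SIZE_GROUP_0    ( ( 19 + 12 ) * 2 )    /* 19 win + 12 gammon shorts */
#define SIZE_GROUP_1    ( ( 22 + 22 ) * 2 )    /* 22 win + 22 gammon shorts */

static long
RecordOffset( long nPosID, long k ) {

  if ( nPosID < GROUP_BOUNDARY )
    return k + nPosID * SIZE_GROUP_0;
  else
    return k + GROUP_BOUNDARY * SIZE_GROUP_0
             + ( nPosID - GROUP_BOUNDARY ) * SIZE_GROUP_1;
}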

Any other ideas out there?

I'll also try to pursue my idea of approximating the roll distributions
with normal distributions.
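
The idea, roughly: reduce each roll distribution to a mean and a
standard deviation and estimate the winning chance from the difference
of two normals. A sketch of what I have in mind, illustrative only --
the half-roll correction for being on roll is just a first guess:

#include <math.h>

static double
NormalCDF( double x ) {
  return 0.5 * ( 1.0 + erf( x / sqrt( 2.0 ) ) );
}

/* muPlayer/sigmaPlayer: mean and std. dev. of the number of rolls the
 * player on roll needs to bear off; muOpp/sigmaOpp: same for the
 * opponent.  The player wins if he needs no more rolls than the
 * opponent, which the 0.5 tries to account for. */
static double
ApproxWinProb( double muPlayer, double sigmaPlayer,
               double muOpp, double sigmaOpp ) {

  double mu = muOpp - muPlayer + 0.5;
  double sigma = sqrt( sigmaPlayer * sigmaPlayer + sigmaOpp * sigmaOpp );

  return NormalCDF( mu / sigma );
}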

Jørn

-- 
Joern Thyssen, PhD
Vendsysselgade 3, 3., DK-9000 Aalborg, Denmark
+45 9813 2791 (private) / +45 2818 0183 (mobile) / +45 9633 7036 (work)
Note: new mobile number!



