bug-gnubg

RE: [Bug-gnubg] Training Bots At Match Scores


From: Ian Shaw
Subject: RE: [Bug-gnubg] Training Bots At Match Scores
Date: Fri, 24 Jan 2003 10:40:51 -0000

> From: Øystein O Johansen
> Sent: 23 January 2003 14:32

> 
> I would love to see how a specially trained DMP network performs against a
> traditional net. Will it be much hacking into the GNU code to train a DMP
> net? NUM_OUTPUTS set to 1. Will it be much more?
> 
>

From a discussion I had with Joseph in Oct 2002:
<quote>
Ian Shaw wrote:
 
> Are there more complex issues? I imagined it would be OK to start with the 
> current net inputs and architecture, train it from a database or with TD, 
> then run it against 0.12 to see if it played better in a long series of games.

Joseph Heled wrote:

I can think of no other starting point than using the current net playing at
DMP. One can use rollouts or (as I have been doing lately) 2-ply evaluations.
TD is not an option for me. It was a nice tool to "bootstrap" GNU, but even
after several million games it flattened out at a rating of 1650.
The "issues" are getting the right data set (this can have a huge impact on the
length and quality of training) and perhaps adding or removing inputs.

<end quote>

As you suggest, creating a net with a single output would probably be better. 
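
To make the single-output argument concrete, here is a minimal sketch of why one output should be enough at DMP. It assumes the net reports five cumulative probabilities (win, win gammon, win backgammon, lose gammon, lose backgammon); the function names are mine for illustration, not taken from the GNU code.

```python
def money_equity(win, win_g, win_bg, lose_g, lose_bg):
    """Cubeless $ equity from five cumulative outcome probabilities.

    Assumption: probabilities are cumulative, i.e. `win` includes gammons
    and backgammons, and `win_g` includes backgammons.
    """
    return (2.0 * win - 1.0) + (win_g - lose_g) + (win_bg - lose_bg)


def dmp_equity(win):
    """Equity at DMP: only winning or losing the game matters."""
    return 2.0 * win - 1.0


if __name__ == "__main__":
    # Same win chance, but an edge in gammons: worth something for $,
    # worth nothing at DMP.
    print(money_equity(0.5, 0.2, 0.0, 0.1, 0.0))
    print(dmp_equity(0.5))
```

The gammon and backgammon terms drop out entirely at DMP, so a net trained only on the win probability has nothing else to learn.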

Joseph can comment on the training method better than I can. He states that TD 
training flattened out. I wonder, though: if you started with the current 0.13 
weights and trained the new network using TD from there, could you get any 
improvement? I would think the net might be able to learn some of the 
differences between DMP and $ play.
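
As a rough illustration of what "TD from existing weights" could look like, here is a toy TD(0) sketch for a single win-probability output. It is not how GNU trains its nets: the model is a one-layer logistic unit, the feature vectors are assumed to be from one player's point of view throughout, and all names and the learning rate are hypothetical.

```python
import math


def predict(weights, features):
    """Win probability from a single logistic unit (toy stand-in for a net)."""
    z = sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))


def td0_update(weights, features, target, alpha=0.1):
    """Nudge the prediction for `features` toward `target`; returns new weights."""
    p = predict(weights, features)
    scale = (target - p) * p * (1.0 - p)  # TD error times sigmoid slope
    return [w + alpha * scale * x for w, x in zip(weights, features)]


def train_on_game(start_weights, positions, won, alpha=0.1):
    """Run TD(0) over one game.

    start_weights: existing weights to continue from (the warm start).
    positions: feature vectors in order of play, same player's viewpoint.
    won: 1.0 if that player won the game, else 0.0.
    """
    weights = list(start_weights)
    for i, feats in enumerate(positions):
        if i + 1 < len(positions):
            target = predict(weights, positions[i + 1])  # bootstrap from next position
        else:
            target = won  # the final position is scored by the actual result
        weights = td0_update(weights, feats, target, alpha)
    return weights
```

The only difference from training from scratch is that `start_weights` would come from an already trained net rather than from random values.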

If we were to train from databases, I assume Joseph has some suitable ones. I 
suppose some of the new Contact rollouts would form part of the set, as would 
the Crashed ones. Race play must be the same for DMP as for $ because you are 
very unlikely to get gammons in a race, so we would need to direct the training 
to positions that are covered by neither the race net nor the bearoff database.
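
Selecting the training set could be as simple as the sketch below. It assumes each database entry carries a position-class label; the label strings and the record layout are hypothetical, chosen only to mirror the Contact, Crashed, race and bearoff categories mentioned above.

```python
# Position classes where DMP play may differ from $ play (assumed labels).
TRAINABLE_CLASSES = {"contact", "crashed"}


def select_dmp_training_positions(positions):
    """Keep only positions outside the race net and bearoff database."""
    return [p for p in positions if p["class"] in TRAINABLE_CLASSES]


if __name__ == "__main__":
    sample = [
        {"id": "a", "class": "contact"},
        {"id": "b", "class": "race"},
        {"id": "c", "class": "crashed"},
        {"id": "d", "class": "bearoff"},
    ]
    print([p["id"] for p in select_dmp_training_positions(sample)])
```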

As I said to Joseph, I'd be willing to give it some CPU time if you give me an 
.exe. 

--Ian



