Re: [gnubg] Help with a new MET

bug-gnubg

From:	Timothy Y. Chow
Subject:	Re: [gnubg] Help with a new MET
Date:	Mon, 11 Nov 2019 21:17:33 -0500 (EST)
User-agent:	Alpine 2.21 (LRH 202 2017-01-01)

Ian,

Thanks for putting all this effort into a new MET!

I don't know too much about the innards of GNU Backgammon, but I do know something about math and statistics.

In terms of how many matches you would have to play between GNU-old-MET and GNU-new-MET, that depends on how much stronger GNU-new-MET is. Suppose that GNU-new-MET has a 51%/49% edge over GNU-old-MET. That means that if you played 1000 matches, then you would expect a score of 510 to 490. The problem is that if GNU-old-MET were playing against itself, the standard deviation would be about 15.8. So a 510 to 490 result would be far from statistically significant. You'd need about 10000 trials to barely reach statistical significance: The expected score would be 5100 to 4900 and the standard deviation would be 50, so 5100 would be two standard deviations away. In general the formula for the standard deviation is sqrt(n)/2 where n is the number of matches.
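For anyone who wants to check the arithmetic above, here is a minimal Python sketch. It assumes each match is an independent trial won with probability 1/2 under the "no difference" hypothesis; the function names are made up for illustration and have nothing to do with the GNU Backgammon code.

```python
import math

def match_score_sd(n, p=0.5):
    """Standard deviation of the number of wins in n independent matches.

    With p = 0.5 this reduces to sqrt(n)/2.
    """
    return math.sqrt(n * p * (1 - p))

def z_score(n, edge=0.51):
    """Number of standard deviations the expected score lies above n/2.

    'edge' is the assumed true win rate of the stronger side; the standard
    deviation is taken under the null hypothesis of an even match.
    """
    expected_wins = n * edge
    return (expected_wins - n / 2) / match_score_sd(n)

# Print the standard deviation and the z-score for the two match counts
# discussed above.
for n in (1000, 10000):
    print(n, round(match_score_sd(n), 1), round(z_score(n), 2))
```

With a 51% edge, 1000 matches puts the expected score well under one standard deviation from an even split, while 10000 matches puts it at two.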

There's another point to be cognizant of, which is that there is a distinction between statistically significant evidence of the bare-bones claim that "the new MET is better," and a good estimate of *how* much stronger GNU-new-MET is than GNU-old-MET. Let's say you played 10000 matches and the score was 5100 to 4900. You could then claim that the new MET is better, and say that this claim is significant at the two standard deviation level. But you *couldn't* claim that you are 95% confident that the new MET gives you a 51%/49% edge over the old MET. To get a good estimate of the edge requires more trials. How many trials you need would depend on how sharp an estimate you want.
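To make the distinction concrete, here is a rough Python sketch of how wide the uncertainty on the edge still is after 10000 matches, and how the number of trials grows with the sharpness you want. It assumes the usual normal approximation to the binomial and uses a two standard deviation band; the function names and the 0.5% target margin are illustrative choices only.

```python
import math

def edge_interval(wins, n, z=2.0):
    """Approximate z-standard-deviation interval for the true win rate."""
    p_hat = wins / n
    se = math.sqrt(p_hat * (1 - p_hat) / n)  # standard error of p_hat
    return p_hat - z * se, p_hat + z * se

def trials_for_margin(margin, z=2.0, p=0.5):
    """Matches needed so that z standard errors is at most 'margin'."""
    return math.ceil(z * z * p * (1 - p) / (margin * margin))

low, high = edge_interval(5100, 10000)
print(f"win rate plausibly between {low:.3f} and {high:.3f}")
print("matches for a +/-0.5% margin:", trials_for_margin(0.005))
```

After a 5100 to 4900 result the interval runs from roughly 50% to 52%, so the data are about as consistent with a barely perceptible edge as with a 52%/48% one. Pinning the edge down to within half a percentage point takes on the order of 40000 matches under these assumptions.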

I don't have as much insight into what might be going wrong with the cubeful calculations. It does sound to me that there might be a problem with floating-point precision, but someone with knowledge of the code will have to comment on that.

Tim
