Re: [Bug-gnubg] Snowie error rates versus gnubg error rates

From: Douglas Zare
Subject: Re: [Bug-gnubg] Snowie error rates versus gnubg error rates
Date: Mon, 7 Apr 2003 20:50:22 -0400
User-agent: Internet Messaging Program (IMP) 3.2.1

Quoting Joern Thyssen <address@hidden>:

> On Sun, Apr 06, 2003 at 11:33:18PM -0400, Douglas Zare wrote
> > Quoting Joern Thyssen <address@hidden>:
> > 
> > > On Sun, Apr 06, 2003 at 05:36:34PM -0300, Albert Silver wrote
> > > > I mention this because it just
> > > > seems to me, from empirical experience alone, that 0.0083 as the
> > > > bottom limit of Expert and 0.012 as the bottom limit of Advanced
> > > > seems a little strict. I know Snowie has in practice been stricter
> > > > in its grading, but I didn't get the impression it was THAT strict.
> > > 
> > > Apparently it is :-) With the new threshold gnubg should be, on average,
> > > equally strict as Snowie.
> > 
> > I think Snowie's thresholds are not strict enough. In my opinion,
> > there are too many players with average error rates under 4.4 mppm for
> > them all to be considered world-class. I favor decreasing the limit to
> > perhaps 3.5 mppm by Snowie's measure. 
> > 
> > Otherwise, some people rated as "world-class" by bots will be rated
> > "pigeon" by substantially stronger human players. I just analyzed a
> > money session in which a player with an error rate under 3.0 (by
> > Snowie rollouts) was estimated to be the favorite by about 0.12 ppg
> > against a player with an error rate of 4.4. 
>
> 0.12 ppg seems like a lot when there is only 1.4 millipoints in
> difference in the error rates. Were the games very long? On a 1-cube you
> would need an average of 80 decisions per player to reach 0.12 ppg, so I
> assume one of the players made a lot of errors on high cubes?

First, most of the initial doubles were taken (as is normal). There were 9 
passes and 29 takes. It looks like the stronger player was intentionally 
doubling the weaker player in slightly early: the stronger player cashed twice 
and was doubled out 7 times. (Then again, some of the weaker player's takes 
should have been passes.) Nevertheless, the average cube level, in the sense of 
the absolute error-rate difference divided by the EMG error-rate difference, 
was less than 2.

Second, there were about 40 moves per game by Snowie's count, so 0.12 ppg 
works out to 0.12 / 40 = 3.0 millipoints per move in absolute terms.

Third, I said the stronger player's error rate was under 3.0, not equal to 3.0. 
It was actually lower. 
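
The arithmetic in the three points above can be checked with a short Python 
sketch. It uses only the figures quoted in this thread (0.12 ppg, about 40 
moves per game, an average cube level below 2, and a weaker-player error rate 
of 4.4). The function and variable names are hypothetical, and treating the 
cube level as a plain divisor is a simplifying assumption, not a description 
of how Snowie or gnubg computes anything.

```python
# Figures quoted in the thread; all names here are illustrative only.
PPG_EDGE = 0.12        # estimated edge of the stronger player, points per game
MOVES_PER_GAME = 40    # approximate moves per game by Snowie's count
MAX_CUBE_LEVEL = 2.0   # the average cube level was reported as less than this
WEAKER_RATE = 4.4      # weaker player's error rate, millipoints per move


def absolute_gap_mppm(ppg_edge, moves_per_game):
    """Edge per game spread over the moves, in absolute millipoints per move."""
    return 1000.0 * ppg_edge / moves_per_game


def normalized_gap_mppm(absolute_gap, cube_level):
    """Assumption: dividing by the average cube level gives the EMG-scale gap."""
    return absolute_gap / cube_level


abs_gap = absolute_gap_mppm(PPG_EDGE, MOVES_PER_GAME)
# With a cube level below 2, the normalized gap is larger than this bound.
min_norm_gap = normalized_gap_mppm(abs_gap, MAX_CUBE_LEVEL)
# Under these assumptions the stronger player's rate sits below this value.
implied_stronger_rate = WEAKER_RATE - min_norm_gap

print(round(abs_gap, 3), round(min_norm_gap, 3), round(implied_stronger_rate, 3))
```

Under these assumptions the required normalized gap is at least 1.5 
millipoints per move, slightly more than the 1.4 obtained from 4.4 and 3.0, 
which fits with the stronger player's rate being below 3.0 rather than at it.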

You are free to set gnu's levels so that weak players are called "world-class," 
but the elite backgammon players (at least on a good day) are much stronger 
than even what Snowie calls borderline WC. See, for example, the incomplete 
match between Malcolm Davis and Bob Zavoral currently on the front page of 
GammonVillage, with error rates of 3.007 and 3.337, despite the sloppy play at 
the end. Here are the Snowie error rates of other recent matches there:

Malcolm Davis: 4.657 vs Steve Sax: 4.258 
[hideous match between weak/nervous players skipped]
Falafel: 3.170 vs Jean-Philippe Rohr: 3.553
Falafel: 1.558 vs Jean-Philippe Rohr: 4.527
Falafel: 4.021 vs a poor performance
[skip a few]
Steve Sax: 3.315 vs Neil Kazaross: 3.648
David Wells: 2.583 vs Tobias Hellwag: 4.965
David Wells: 2.890 vs Tobias Hellwag: 2.850
Gavin Crawley: 5.614 vs John Clark: 3.302
[skip another]
Gyl Savoie: 2.620 vs David Rubin: 5.912
Serge Rived: 3.892 vs Mads Andersen: 4.368
Johannes Levermann: 2.951 vs Kent Goulding: 2.963
Johannes Levermann: 3.173 vs Kent Goulding: 3.478
Neil Kazaross: 3.172 vs David Wells: 4.561

Ok, I'll stop there. Keep in mind that these were transcribed real matches, 
mostly from important rounds of major tournaments, so the players are typically 
playing worse than they do against the computer or online. Nevertheless, 
there are still a lot of performances under 3.5 as Snowie measures it. I 
think that would be a better cutoff for world-class. When players improve, the 
bar should be raised.
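
As a rough tally of the list above, the following Python sketch counts how 
many of the quoted performances fall under the two cutoffs discussed (4.4 and 
3.5). The rates are transcribed from the list; the unquantified "poor 
performance" and the Davis/Zavoral rates mentioned in the prose are left out, 
and the helper name is hypothetical.

```python
# Snowie error rates (millipoints per move) transcribed from the list above.
RATES = [
    4.657, 4.258,   # Malcolm Davis vs Steve Sax
    3.170, 3.553,   # Falafel vs Jean-Philippe Rohr
    1.558, 4.527,   # Falafel vs Jean-Philippe Rohr
    4.021,          # Falafel (opponent's rate not given)
    3.315, 3.648,   # Steve Sax vs Neil Kazaross
    2.583, 4.965,   # David Wells vs Tobias Hellwag
    2.890, 2.850,   # David Wells vs Tobias Hellwag
    5.614, 3.302,   # Gavin Crawley vs John Clark
    2.620, 5.912,   # Gyl Savoie vs David Rubin
    3.892, 4.368,   # Serge Rived vs Mads Andersen
    2.951, 2.963,   # Johannes Levermann vs Kent Goulding
    3.173, 3.478,   # Johannes Levermann vs Kent Goulding
    3.172, 4.561,   # Neil Kazaross vs David Wells
]


def count_below(rates, cutoff):
    """Number of performances strictly under the given cutoff."""
    return sum(1 for rate in rates if rate < cutoff)


total = len(RATES)
below_current = count_below(RATES, 4.4)   # Snowie's current world-class limit
below_proposed = count_below(RATES, 3.5)  # the stricter limit proposed here

print(f"{below_current} of {total} under 4.4, {below_proposed} of {total} under 3.5")
```

Of the 25 rates listed, 19 are under 4.4 and 13 are under 3.5, so even the 
stricter cutoff is met in about half of these tournament performances.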

One might argue that there are 600 grandmasters in chess, and I've heard that 
there are that many 9th-dan go players (although that sounds high), so why not 
let there be 600 or more world-class backgammon players? In theory, though, 
9th-dan go players can't spot each other two stones, whereas there are large 
differences in skill between players who average WC according to Snowie.

Douglas Zare
