Re: [Bug-gnubg] Re: Strange FIBS ratings

From:	Christopher D. Yep
Subject:	Re: [Bug-gnubg] Re: Strange FIBS ratings
Date:	Wed, 10 Sep 2003 07:03:45 -0400

At 03:34 PM 9/9/2003 -0400, Douglas Zare wrote:

Quoting "Christopher D. Yep" <address@hidden>:

> I think this phenomenon has been known for many years now.  Kees'
> experiments and Douglas Zare's research on Gammonvillage are just the
> latest examples supporting this conclusion.

Some have known it, others have not. I have been arguing that

As a very rough guess, I'd say that 3% to 30% of all backgammon players know this fact today (that checker errors give up more equity than cube errors). Those who own gnubg (or Snowie) and regularly use the Player Records (or Account Manager) should know this fact, assuming they care enough about their stats to review them regularly. Of the 70%-97% who don't know the fact, some players grossly overestimate the importance of the cube. One casual player told me that "it's easy to move the checkers around, but that cube errors account for 98%-99% of the total equity lost"!

Humans have been wondering which is more costly (checker errors, cube errors) for a long time, even before the concept of EMG was invented.


There are two different questions:

(1) Which gives up more equity in ppg/mwc (points per game for a money game, match winning chances for a match), checker errors or cube errors?

(2a) Do players have higher checker error rates or higher cube error rates, with error rates measured using Snowie methodology?


(2b) Same as 2a, but using gnubg methodology?

#2b is significantly different from #2a. #2a uses the same denominator for both checker and cube error rates, so the ratio of (checker error rate) to (cube error rate) is the same as the ratio of (total EMG given up by checker errors) to (total EMG given up by cube errors). If I remember correctly, gnubg checker error rate is the total EMG given up (checker) divided by the total number of unforced checker plays, while gnubg cube error rate is the total EMG given up (cube) divided by the number of cube decisions (actual or "close", based on some threshold).

I don't know the entire history of this thread (partly because it is spread across multiple threads; I haven't read all the e-mails). #1 interests me much more, so I haven't commented yet on #2b, but I'm guessing the thread was initially inspired by #2b.
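
To illustrate the difference between the two denominators, here is a minimal Python sketch. All numbers are invented for illustration, the function names are hypothetical, and the gnubg-style definition follows my recollection above rather than the actual gnubg source.

```python
def shared_denominator_rates(checker_emg, cube_emg, total_decisions):
    """Snowie-style, as described above: one denominator for both rates."""
    return checker_emg / total_decisions, cube_emg / total_decisions


def separate_denominator_rates(checker_emg, cube_emg,
                               unforced_checker_plays, cube_decisions):
    """gnubg-style, as recalled above: each rate has its own denominator."""
    return checker_emg / unforced_checker_plays, cube_emg / cube_decisions


# Hypothetical totals for one player (invented, not measured data).
checker_emg = 1.2              # total EMG given up by checker errors
cube_emg = 0.4                 # total EMG given up by cube errors
unforced_checker_plays = 200
cube_decisions = 20            # actual or "close" cube decisions
total_decisions = 220          # assumed shared denominator

s_checker, s_cube = shared_denominator_rates(
    checker_emg, cube_emg, total_decisions)
g_checker, g_cube = separate_denominator_rates(
    checker_emg, cube_emg, unforced_checker_plays, cube_decisions)

print("EMG ratio (checker/cube):      ", checker_emg / cube_emg)
print("shared-denominator rate ratio: ", s_checker / s_cube)
print("separate-denominator rate ratio:", g_checker / g_cube)
```

With these invented numbers the shared-denominator ratio stays at about 3, matching the ratio of total EMG given up, while the separate-denominator ratio drops to about 0.3. The cube error rate then looks higher than the checker error rate even though checker errors cost three times as much in total.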


The casual player doesn't have Snowie or GNU and is more concerned with #1.

1) Humans give up more equity through checker play.
2) Using EMG overstates the amount given up through cube play.
Many people have not been convinced (mainly weaker players), and I hope that my
column will convince them.

Question #1 has interested me since I started playing in the early 1990s. When I bought Snowie in 1999 I checked my own errors. I was surprised that my checker errors gave up much more equity than my cube errors, but I used the intuitive arguments I gave earlier (mainly that there are many more difficult checker decisions than difficult cube decisions each game) to convince myself of the fact. I also downloaded 9 analyzed matches (all in 1999, analysed by Snowie 3) from Oasya.com (now bgsnowie.com). I see that these matches have been taken down (except Ballard vs. Meyburg at the Nordic Open 1999), but they've put up 13 new ones in their place (http://bgsnowie.com/backgammon/matches.dhtml). If you have time, you may wish to review these. I'd guess that these matches are more reliable than those on Johanni's list, since presumably the decision to record/display each match was made before the actual match was played (I could be wrong though).

Johanni's list includes only self-selected matches, which may present a bias. If there is a bias, I don't know in which direction it is; however, I'll guess that Johanni's list is more likely to exclude matches with large cube errors. After a match a player may check a particular cube decision (but not many or any difficult checker problems), then, if he was grossly wrong on the cube decision, be too embarrassed to send in the match to Johanni. Additionally, I think that Johanni's methodology is to roll out cube blunders but not checker blunders (someone correct me if I'm wrong). The last point is definitely a bias. A countering bias is that Snowie (at least Snowie 3) does not include checker errors in non-contact races but does include cube errors in non-contact races.

The second point is closer to what KvdD's experiments show. My data says that
human cube errors happen when less mwc is at stake, on average, than checker
play errors. His says that when gnu is told to play stupidly, its cube errors
happen when less mwc is at stake.

I thought that Kees' study centered around trying to estimate FIBS rating based on two variables: (1) gnubg checker error rate, (2) gnubg cube error rate. This is more than just simply concluding that cube errors happen when less mwc (or ppg) is at stake. His overall conclusion is quite valuable in my opinion, but only if the results are trustworthy. The most important issue that needs further study is whether using (gnubg with noise) is sufficient to model humans. The advantage of using (gnubg with noise) is that we can quickly develop a huge sample size. I appreciate both bot and human work (with your investigation being the latter). Kees is now studying human data, which is the next logical step. Hopefully work can continue in this area.

BTW, here are two intuitive arguments that cube errors happen when less ppg is at stake in a money game (similar results apply in matches with respect to mwc):

Suppose that a player's average (total cube errors in ppg) is X% of his (total checker errors in ppg).

1. If every game ended in double/pass then we can partition each game into periods based on the cube value:


1. Centered cube
2. 2-cube
3. 4-cube
4. 8-cube
Etc.

Each period ends when the cube is accepted (or passed in the case of the final cube). We have assumed that the player's average (total cube errors in ppg) is X% of his (total checker errors in ppg). It's reasonable to assume that this ratio applies across each period above (note that the final period ends with a double/pass). Thus a player's normalized cube error rate (Snowie methodology, not gnubg's methodology) will also be X% of his normalized checker error rate.

In actuality though, not every game ends in double/pass. For games that don't end in double/pass, the final period will involve difficult checker decisions, but not very many difficult cube decisions (if the game ended in double/pass though, then a representative number of difficult cube decisions could be expected in the final period). This is because the player holding the cube at the end of the game is usually an underdog throughout the final period, thus his cube decisions are easy (not always though; sometimes he has to decide whether he is too good or not, and if he decides he is too good the cube will not be turned on that roll).

The overall effect of the above paragraph is that cube errors are more likely to be made on smaller cubes than checker errors.

2. The above does not consider that on large cubes (cubes >= 2), the player on roll has to (roughly) consider doubling only when he is both the favorite and when he owns the cube, while on very small cubes (centered cube) he has to consider doubling on *every* move when he is the favorite. This further amplifies the effect that cube errors are more likely to be made on smaller cubes.

Overall conclusion: the reported Snowie normalized error rate (equivalently: EMG error rate in the case of matches) exaggerates the effect of cube errors on total error in ppg. This agrees with your conclusion. There are some minor modelling flaws in #1, but these intuitive arguments were enough to convince myself when I first thought about it a few years ago.
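
A small Python sketch can make this conclusion concrete. It assumes a money game in which a normalized error is the ppg loss divided by the cube value at the time of the error, so that ppg loss = normalized error x cube value. The error list is invented for illustration (cube errors placed mostly on the centered cube, checker errors spread over larger cubes), and `cube_share` is a hypothetical helper, not anything from Snowie or gnubg.

```python
# Each entry: (kind of error, cube value when it was made, normalized error).
# Invented data: cube errors sit mostly on small cubes.
errors = [
    ("cube", 1, 0.10),
    ("cube", 1, 0.08),
    ("cube", 2, 0.05),
    ("checker", 1, 0.10),
    ("checker", 2, 0.15),
    ("checker", 4, 0.12),
]


def cube_share(errors, weight_by_cube):
    """Fraction of the total error that comes from cube decisions.

    weight_by_cube=False sums normalized errors directly;
    weight_by_cube=True converts each error to ppg first
    (normalized error times cube value, an assumption of this sketch).
    """
    totals = {"cube": 0.0, "checker": 0.0}
    for kind, cube_value, normalized in errors:
        weight = cube_value if weight_by_cube else 1
        totals[kind] += normalized * weight
    return totals["cube"] / (totals["cube"] + totals["checker"])


normalized_share = cube_share(errors, weight_by_cube=False)
ppg_share = cube_share(errors, weight_by_cube=True)

print(f"cube share of normalized error: {normalized_share:.1%}")
print(f"cube share of error in ppg:     {ppg_share:.1%}")
```

For this invented list the cube errors make up roughly 38% of the normalized total but only roughly 24% of the total in ppg, which is the direction of the exaggeration argued above. Real data could of course differ in size.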

Thanks for the work. While it sounds like you don't want to be mentioned in the same note as Kees, I think both your contributions are valuable and I hope this is taken as a compliment.


Chris
