[Bug-gnubg] Re: ratings formula, checker play vs. cube
From: Robert-Jan Veldhuizen
Subject: [Bug-gnubg] Re: ratings formula, checker play vs. cube
Date: Sat, 13 Sep 2003 01:33:01 +0200
At 12:01 9/12/2003 -0400, you wrote:
You are assuming cubeErr/totalCubeDecisions is a better measure of cube
skill than cubeErr/totalActualCubeDecisions. I believe the opposite to
hold, so what you are proposing would introduce more noise.
I'm quite confident the "old" way of measuring cube error rates, dividing
by totalCubeDecisions, gave a more realistic result, especially in
comparison to the checker play error rate, which is also based on all
unforced moves.
GNUBG used to work that way in the past and I strongly believe it gave me
better analyses that way.
The problem is that the current way of determining a close cube decision is
pretty much flawed for match play.
One obvious and not that rare example is when both players are 2-away.
With winning chances for the player on roll anywhere between 30% and
70%, it is almost always correct to double, but sometimes it's optional
(no market losers, for instance), and often it's a very close decision. If a
player doubles immediately at his first opportunity, he's making no error, but
it counts as one actual (and close) decision under the current scheme. If both
players somehow delay doubling (which can have its advantages), every turn
will count as a close cube decision.
So, one could have two similar matches where 2-away both is reached, one
where a player (correctly) doubles at the first opportunity, and one where,
for instance, the cube (again correctly) gets turned after 16 moves. The
latter will give both players a much lower cube error rate under the current
scheme, because 8 error-free decisions are added for each player.
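
To put rough numbers on this, here is a minimal sketch of the two matches. The
helper error_rate and every figure in it are made up for illustration; this is
not GNUBG's code, only the "summed cube error divided by counted decisions"
idea discussed above.

```python
def error_rate(total_error, decisions):
    """Average error per counted decision (hypothetical helper, not GNUBG code)."""
    return total_error / decisions

# All numbers below are invented for illustration.
total_cube_error = 0.120  # same summed cube error in both matches
other_decisions = 4       # close/actual cube decisions from the other games

# Match 1: the cube is turned at the first opportunity (1 counted decision).
early_double = error_rate(total_cube_error, other_decisions + 1)

# Match 2: the cube is turned after 16 moves (8 error-free decisions per player).
late_double = error_rate(total_cube_error, other_decisions + 8)

print(f"early: {early_double:.4f}  late: {late_double:.4f}")
```

With these assumed figures the same cube play gets a rate more than twice as
low in the second match, purely because of the larger divisor.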
If it somehow happens that someone plays on for the gammon at 2-away both,
that is in theory a bad plan (double/take and you just need a single win).
However, GNUBG will now "reward" you for this strategy by counting (almost)
all moves as actual or close cube decisions, and this can instantly add
something like 25 decisions for BOTH players. In a 5- or 7-point match, that
is often more than all other cube decisions in the match added up.
This certainly does not reflect the skill involved, and is mostly an
anomaly IMO.
Another example is when one player gets closed out and is very likely to
get gammoned when he loses; (re)doubling then is often just a tiny error,
and this might repeat for some turns as the other side brings the checkers
around and bears off. Again GNUBG will add relatively many decisions to the
number of actual or close cube decisions, although no additional skill is
really involved. As a result this kind of situation will lower one's cube
error rate under the current scheme, which does not seem to reflect cube skill
very well.
Since the number of actual or close cube decisions often turns out to be
quite low compared to, for instance, the number of all moves, the effect of
adding a few to it can be quite large.
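
A small sketch of that sensitivity, assuming the summed error stays the same
and only error-free decisions are added. The function relative_drop and the
counts 6, 120 and 3 are hypothetical and chosen only to show the effect.

```python
def relative_drop(decisions, added):
    """Fraction by which an error rate falls when `added` error-free
    decisions are counted on top of `decisions` (total error unchanged).

    old rate = E / n, new rate = E / (n + k), so the drop is k / (n + k).
    """
    return added / (decisions + added)

# Invented counts: a handful of close cube decisions vs. all moves in a match.
small = relative_drop(6, 3)    # few counted decisions
large = relative_drop(120, 3)  # many counted decisions

print(f"small divisor: {small:.1%}  large divisor: {large:.1%}")
```

The same three extra decisions cut the rate by about a third when only six
were counted before, but barely move it when the divisor is already large.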
>It is also inconsistent, because checker play errors DO get divided by
>all unforced moves.
So what? If anything I feel that the checker play total error should
maybe also be divided not just by non-forced moves, but by non-obvious
moves. This is just harder to quantify.
Indeed, but my point is that as long as this isn't implemented, it would be
better not to use it for cube errors either. The relation between cube
error rate and checker play error rate is IMO clearly more realistic when
they are both calculated by using similar terms.
I think that this might also partly explain Albert Silver's results,
although I haven't checked that.
I analysed a lot of matches when GNUBG still used the total number of cube
decisions, and I think it much better reflected my cube skills this way
than with the current method.
Note that for money play it's a different matter, because I think the current
method works quite well there.
Cheers,
--
Robert-Jan Veldhuizen