
Re: [Bug-gnubg] Some idle musings re. ratings


From: kvandoel
Subject: Re: [Bug-gnubg] Some idle musings re. ratings
Date: Fri, 12 Sep 2003 00:26:39 +0200 (CEST)

> > Regarding modeling human error by noise:
> >
> > One thing I would expect is the noise errors to be uniformly distributed
> > over the moves of a match, whereas the human errors would tend to clump
> > together when a type of position arises that is incorrectly handled by
> > the human. For myself, when I get to a difficult position (in the human
> > sense) my errors clump in that region, because it is more difficult.
> >
> > I don't see however how that (clumping versus uniformity) would affect
> > the rating.

> I usually find when playing a 7 point match against gnubg that of the
> say 6 games in the match, 4 of them will give me a consistent
> low (for me) error rate and one or two games will have almost all of
> the errors, not uncommonly 2 or 3 major errors within a move or two of
> each other.

I think that is common (errors clump in the hard games).
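
One rough way to put a number on "clumping" is the variance-to-mean
ratio of error counts per game. This is only a sketch: the function name
and the per-game counts below are made up for illustration, not taken
from any real match analysis. A value near 0 means the errors are spread
evenly, and a value well above 1 means they pile up in a few games.

```python
def dispersion_index(errors_per_game):
    """Variance-to-mean ratio of per-game error counts (sample variance)."""
    n = len(errors_per_game)
    mean = sum(errors_per_game) / n
    if mean == 0:
        return 0.0
    variance = sum((e - mean) ** 2 for e in errors_per_game) / (n - 1)
    return variance / mean

# Hypothetical 6-game matches with the same total number of errors (12).
evenly_spread = [2, 2, 2, 2, 2, 2]   # what uniform noise might look like
clumped = [0, 0, 1, 0, 5, 6]         # most errors in two hard games

print(dispersion_index(evenly_spread))
print(dispersion_index(clumped))
```

Both matches have the same total error count, so a rating model that only
looks at the total would not tell them apart; the index only describes
how the errors are distributed.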

> > I understand the noise is injected in the outputs of the NN. I always
> > have had the feeling that it would be a better model of human error to
> > inject the noise into the WEIGHTS of the NN. Now that I think about it I
> > think this might also introduce clumping effects, like when the position
> > moves into a region whose processing has been damaged a lot by the
> > partial lobotomy.

> Hmm, I sometimes think that describes my off days at bg.

> > I guess I could just externally disturb the weights file to create a
> > number of braindamaged bots and experiment with those. Any pointers to
> > where I can find the file format for the weights files? Or is this a
> > stupid idea anyways?

> I have my doubts about this one. It would certainly be even harder to
> give any justification for altering the weights as being a model of
> human (mis)play, even more so than injecting random noise. And I
> suspect it would be much harder to produce a controlled result.

Why, just disturb the weights, let it play, and measure performance.
The reason I prefer that over injecting noise in the outputs is that the
weights encode the evaluation function, and disturbing the evaluation
function seems to be a better model of bad play than disturbing the motor
control of checkers (what the outputs encode).
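
To make the difference concrete, here is a minimal sketch with a toy
one-hidden-layer net. Everything in it is an assumption for
illustration: the layer sizes, the names (make_net, noisy_outputs,
perturb_weights) and the random weights have nothing to do with gnubg's
actual network or weights file format.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def make_net(n_in, n_hidden, n_out, rng):
    """Toy net with random weights; a stand-in, not gnubg's real weights."""
    w1 = [[rng.gauss(0.0, 1.0) for _ in range(n_in)] for _ in range(n_hidden)]
    w2 = [[rng.gauss(0.0, 1.0) for _ in range(n_hidden)] for _ in range(n_out)]
    return (w1, w2)

def forward(net, x):
    """Evaluate the net: sigmoid hidden layer, sigmoid output layer."""
    w1, w2 = net
    hidden = [sigmoid(sum(w * xi for w, xi in zip(row, x))) for row in w1]
    return [sigmoid(sum(w * h for w, h in zip(row, hidden))) for row in w2]

def noisy_outputs(net, x, sigma, rng):
    """Output noise: fresh, independent noise on every evaluation."""
    return [y + rng.gauss(0.0, sigma) for y in forward(net, x)]

def perturb_weights(net, sigma, rng):
    """Weight noise: perturb every weight once, yielding a fixed 'damaged' net."""
    return tuple([[w + rng.gauss(0.0, sigma) for w in row] for row in layer]
                 for layer in net)

rng = random.Random(2003)
net = make_net(n_in=4, n_hidden=3, n_out=2, rng=rng)
damaged = perturb_weights(net, sigma=0.2, rng=rng)
position = [1.0, 0.0, 0.5, 0.25]   # hypothetical encoded position

print("clean:         ", forward(net, position))
print("output noise 1:", noisy_outputs(net, position, 0.05, rng))
print("output noise 2:", noisy_outputs(net, position, 0.05, rng))
print("damaged net 1: ", forward(damaged, position))
print("damaged net 2: ", forward(damaged, position))
```

With output noise the same position gets a different evaluation every
time it is looked at. The damaged net is deterministic: it gets the same
position wrong in the same way every time, and how wrong depends on the
position. Whether that actually produces clumped errors in play would
still have to be measured.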

> I'd also speculate you'll have difficulty finding any general model
> for real human errors - there are the ones caused by simply failing to
> see a move, whether by not looking, miscounting points or
> whatever. There are ones caused by not understanding a type of
> game. There are ones caused by too careless play, steaming, being too
> cautious and waiting too long, failing to see potential responses, you
> name it, people will find a way to screw it up. And every person will
> have their own mix of errors they make, which even for one person may
> vary with time, alcohol, their assessment of their opponent, etc.

Yes, but I still hold on to the working hypothesis that noise in the
outputs is a good enough model to accurately predict rating. Maybe all
these reasons you mention don't matter. Maybe. But maybe you're right.

Kees





