Re: [Bug-gnubg] Bug in sigmoid?
Fri, 18 Apr 2003 08:04:29 +1200
Can you please post the code for the new sigmoid? I did not understand
the new formulation.
Any change in sigmoid would require a re-training, even if the
difference is small.
The |x| >= 10 range is supposed to be a safety net. This should be a
rare case and should not matter much. Did you find otherwise?
Olivier Baur wrote:
On Thursday, 17 Apr 2003, at 18:28 Europe/Paris, Nis wrote:
--On Thursday, April 17, 2003 15:33 +0200 Olivier Baur wrote:
I think I've found a bug in sigmoid (neuralnet.c), but I'm not sure of
its impact on the evaluation function...
I don't fully understand the role of this function, but I would guess
that the exact function doesn't matter much, as long as it is
relatively smooth and monotonic.
True, but maybe a little sigmoid error at the output of each one of the
128 hidden neurons, combined into the 5 output neurons and sigmoid'ed
again, could add up to a noticeable difference.
If I remember correctly, (sigmoid(x)-S(x))/S(x) was less than +/-.0047%
over the [-10.0 .. +10.0] range (not counting the discrepancy in the
[+9.9 .. +10.0] interval and the two plateaus for x < -10.0 and x > +10.0).
Let's call S the real sigmoid function: S(x) = 1 / (1 + e^x)
It seems that sigmoid(x) will return a good approximation of S(x) for
-10.0 < x < 10.0 (less than +/-.01% error), but then it returns S(9.9)
for x >= 10.0 (instead of S(10.0)) and S(-9.9) for x <= -10.0 (instead of
S(-10.0)). sigmoid is not even monotonic!
The big question for me is: Does it matter if we change the function -
for instance to get it closer to S(x).
More specifically: Is the current sigmoid function "imprinted" on the
synapses of the current nets?
Yes, it seems so.
When I started vectorising the sigmoid function with an algorithm that
gives results closer to S(x), I couldn't pass the gnubgtest.
Then, instead of building my lookup table with S(x), I built it using
the results of the current sigmoid(x), and provided the lookup table had
enough entries (about 1000 in the [-10..+10] range, with .01 steps), I
could successfully pass the test!
By the way, I found a simple way of optimising the current sigmoid
function: instead of keeping a lookup table of pre-computed values of
exp(X) and returning sigmoid(x) = sigmoid(X+dx) = 1/(1+exp(X)*(1+dx)),
why not keep a lookup table of pre-computed values of S(X) and return
sigmoid(x) = sigmoid(X+dx) = S(X) + dx*(S(X+1)-S(X))? The time-consuming
operations here are the lookups and the reciprocal (1/x) operations. With
the second method, you trade one reciprocal and one lookup for two
lookups; and since the second lookup will usually already be in the
processor cache (S(X+1) follows S(X) in memory), you end up doing mostly
one lookup and no reciprocal at all. On my machine, it gave me a 60%
speed increase in sigmoid.
Beautiful. Did you compare the precision of the results?
Yes, I get better results with a lookup table of the same size (100
elements, from 0.0 to 10.0 with .1 steps), but then gnubg won't pass
gnubgtest.
Of course, you can reach whatever precision you like by using a more
detailed table (i.e., one with more entries), but then you might lose the
"cached lookup advantage" if the lookup table can't be kept in the
processor cache during repeated calls to the sigmoid function.
Bug-gnubg mailing list