Re: [Bug-gnubg] Simple multi-threading...

From: Jonathan Kinsey
Subject: Re: [Bug-gnubg] Simple multi-threading...
Date: Tue, 09 Jan 2007 14:15:06 +0000

Øystein Johansen wrote:
> Jonathan Kinsey wrote:
>> Can someone explain how the NN works in terms of the NNEVAL_SAVE and
>> NNEVAL_FROMBASE operations?
> It's a smart trick invented by Joseph. Notice that the original
> implementation (before the SSE vectorization) had an "if (input_node)"
> test, and when the input was zero it just moved a pointer. This test
> saved a lot of time when the input node was 0 (and it often is). An
> evaluation with lots of zeros in the input takes considerably less
> processing time than one with mostly non-zero inputs.
> So Joseph, ingenious as he is, invented the trick of storing the input
> array and the resulting hidden layer array, when the first candidate
> move is considered in a movelist. When evaluating the next move
> candidate, we simply subtract the new input array from the stored input
> array, and in that way we're hoping to "generate" a lot of zeros to the
> evaluator. (As described above, zeros are faster!) Now, to compensate at
> the hidden layer, the result from the modified input array evaluation is
> subtracted from the stored hidden layer array, to get the right result.
> The evaluation step from the hidden layer to the output layer is unchanged.
> When I first saw and understood this clever trick, I was mighty
> impressed. The speedup is not superb, but significant. About 10% I
> guess. Joseph may have more accurate results. It's not as big a speed
> improvement as the pruning nets, but it is (or was) worth having.
> The SSE vectorization doesn't do this trick. 

Are you sure? The code looks the same in both places to me.

> Of course it then has to
> check if all four inputs are zeros to just move the result pointer to
> the next position, without doing any arithmetic. Maybe it's worth doing
> something similar for the vectorized code? If not, we may remove the
> clever trick from the SSE code, so it's not wasting cycles subtracting
> the input array for no use.
> NNEVAL_SAVE is then a flag to the evaluator to do a normal evaluation
> and save the input array and the hidden layer array.
> NNEVAL_FROMBASE is a flag to evaluate from a saved set of inputs
> and a saved set of hidden layer values.

I think I need to have separate copies of savedIBase and savedBase for
each thread (as well as for each NN).  It's quite tied up with the
nContext array so I might try to join them together in some way...
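
As a rough illustration of the layout this implies, the sketch below keeps
one saved base per (thread, net) pair. Everything in it is hypothetical: the
limits, the SavedBase and ThreadEvalState types and the helper functions are
invented for the example, and how this would be merged with the nContext
array is left open.

```c
#include <assert.h>
#include <string.h>

#define MAX_THREADS 4   /* hypothetical limits */
#define NUM_NETS    3
#define N_IN        6
#define N_HID       3

/* Saved inputs and hidden sums for one net as seen by one thread. */
typedef struct {
    int   valid;               /* nonzero once a base has been saved */
    float savedIBase[N_IN];
    float savedBase[N_HID];
} SavedBase;

/* Everything one thread needs in order to evaluate from a saved base. */
typedef struct {
    SavedBase base[NUM_NETS];
} ThreadEvalState;

/* Static storage, so every `valid` flag starts at zero. */
static ThreadEvalState threadState[MAX_THREADS];

/* Look up the slot for a given thread and net. */
static SavedBase *saved_base(int thread, int net)
{
    assert(thread >= 0 && thread < MAX_THREADS);
    assert(net >= 0 && net < NUM_NETS);
    return &threadState[thread].base[net];
}

/* Record a base; only the calling thread's slot is touched. */
static void save_base(int thread, int net,
                      const float *in, const float *hid)
{
    SavedBase *sb = saved_base(thread, net);
    memcpy(sb->savedIBase, in, sizeof sb->savedIBase);
    memcpy(sb->savedBase, hid, sizeof sb->savedBase);
    sb->valid = 1;
}
```

Because each thread only ever reads and writes its own row of threadState,
no locking is needed around the saved arrays in this arrangement.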

