
Re: [Bug-gnubg] 50% decrease in evaluation speed

From: Øystein Johansen
Subject: Re: [Bug-gnubg] 50% decrease in evaluation speed
Date: Mon, 06 Nov 2006 14:27:50 +0100

Welcome to the list!


There are two things you have to do to make use of the SSE registers:

- Use the SSE vectorized code (define USE_SSE_VECTORIZE).
- Turn on -msse in the compiler switches.
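To illustrate how the two conditions interact, here is a minimal sketch, not the actual gnubg code. The dot_product helper is hypothetical. It assumes GCC, which defines __SSE__ when -msse (or a stronger switch) is active, and takes the vector path only when USE_SSE_VECTORIZE is defined as well.

```c
#include <stddef.h>

#ifdef __SSE__                 /* set by GCC when -msse or higher is active */
#include <xmmintrin.h>
#endif

/* Hypothetical helper: dot product of two float arrays of length n. */
float dot_product(const float *a, const float *b, size_t n)
{
    float sum = 0.0f;
    size_t i = 0;

#if defined(USE_SSE_VECTORIZE) && defined(__SSE__)
    /* Vector path: four multiply-adds per iteration in one SSE register. */
    __m128 acc = _mm_setzero_ps();
    float lanes[4];

    for (; i + 4 <= n; i += 4) {
        /* Unaligned loads, so the sketch works for any pointer. */
        __m128 va = _mm_loadu_ps(a + i);
        __m128 vb = _mm_loadu_ps(b + i);
        acc = _mm_add_ps(acc, _mm_mul_ps(va, vb));
    }
    _mm_storeu_ps(lanes, acc);
    sum = lanes[0] + lanes[1] + lanes[2] + lanes[3];
#endif

    /* Scalar tail, or the whole loop when the SSE path is compiled out. */
    for (; i < n; i++)
        sum += a[i] * b[i];

    return sum;
}
```

If either the define or the compiler switch is missing, the function silently falls back to the scalar loop, which is one way a self-built binary can end up slower than expected.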

I'm not sure how MaxMaini's builds are built (since I build my own),
but I guess he builds with SSE. Jon even made a system in which
the program detects at runtime whether SSE is available and then sets
a pointer to the right evaluation function. That way we can build
SSE binaries and still be able to run them on computers without
SSE registers.
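The runtime detection idea can be sketched roughly as follows. This is not Jon's implementation: the function names are hypothetical, the "SSE" variant is a plain C stand-in so the sketch compiles anywhere, and the check assumes GCC's __builtin_cpu_supports on x86.

```c
#include <stddef.h>

typedef float (*dot_fn)(const float *a, const float *b, size_t n);

/* Portable fallback. */
static float dot_scalar(const float *a, const float *b, size_t n)
{
    float sum = 0.0f;
    for (size_t i = 0; i < n; i++)
        sum += a[i] * b[i];
    return sum;
}

/* Stand-in for the SSE variant. In a real build this would live in a file
   compiled with -msse and use intrinsics; here it is plain C with four
   partial sums. */
static float dot_sse(const float *a, const float *b, size_t n)
{
    float s0 = 0.0f, s1 = 0.0f, s2 = 0.0f, s3 = 0.0f, sum;
    size_t i = 0;

    for (; i + 4 <= n; i += 4) {
        s0 += a[i] * b[i];
        s1 += a[i + 1] * b[i + 1];
        s2 += a[i + 2] * b[i + 2];
        s3 += a[i + 3] * b[i + 3];
    }
    sum = s0 + s1 + s2 + s3;
    for (; i < n; i++)
        sum += a[i] * b[i];
    return sum;
}

/* Assumption: GCC-style builtin; other compilers report "no SSE". */
static int cpu_has_sse(void)
{
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    __builtin_cpu_init();
    return __builtin_cpu_supports("sse") != 0;
#else
    return 0;
#endif
}

/* Pick the implementation once, at startup. */
dot_fn select_dot(void)
{
    return cpu_has_sse() ? dot_sse : dot_scalar;
}
```

The caller stores the returned pointer once and calls through it afterwards, so the check is not repeated for every evaluation.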

MaxMaini: Can you tell us which switches you're using?

However, there are other ways to increase the speed even more.
There are some other GCC switches that can be used, but the
resulting executable will no longer be general. If you
use switches like -mtune=your_architecture_here you can get
about 5% further speedup. (This is why I make my own builds.)

I've also achieved some speedup with -mfpmath=sse,387 on my
system. (I have an old Pentium 4 2.26, I believe.) I recommend
that you try out some of these settings and compare them to
each other. Your system may behave differently from mine.

Also try the baseInputs function in eval.c, which
I've commented out with #if 0. It uses the SSE registers to
calculate inputs by looking up a table. A test program shows
that this is much, much faster, but when I measured it on my
system I found that it wasn't faster when built into the
program. I believe this may be because the CPU cache is
flushed on each call to CalculateInputs. Can we in some way make
sure that the tables inpvec and inpvecb stay in the CPU cache?
This may be different on your system, so try it out.

There was also a suggestion to move the whole neural net
evaluation over to the GPU, to let the strongest processor in
a modern computer do the hard work. The neural net evaluation
is basically just a matrix multiplication, and GPUs are made
for exactly that. No work has been done on this, though.
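To show what "basically just a matrix multiplication" means, here is a minimal sketch of one layer as a matrix-vector product. The function name, the row-major weight layout and the bias term are assumptions for illustration. The real gnubg nets also apply an activation function, which is left out here.

```c
#include <stddef.h>

/* Hypothetical layer: out = W * in + bias, with W stored row-major as
   n_out rows of n_in weights each. */
void layer_forward(const float *w, const float *bias, const float *in,
                   float *out, size_t n_in, size_t n_out)
{
    for (size_t j = 0; j < n_out; j++) {
        float sum = bias[j];
        /* Each output is one dot product: this inner loop is what SSE
           or a GPU would run in parallel. */
        for (size_t i = 0; i < n_in; i++)
            sum += w[j * n_in + i] * in[i];
        out[j] = sum;
    }
}
```

Every output is independent of the others, which is why this kind of work maps well onto wide vector units.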

A more general discussion:
The method of measuring the performance of GNU Backgammon is a
bit strange. It selects many random positions, evaluates them
all, and times the result. This is what runs when you select
Analyse->Evaluation Speed.

This measurement has some drawbacks. First of all, it's hard
to read because the number fluctuates a lot, which makes it
really hard to get a reliable figure.
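One common way to tame a fluctuating timing is to repeat the run several times and report the median instead of a single reading. The sketch below is a suggestion, not what gnubg does. The function names are hypothetical, and clock() measures CPU time at limited resolution, so the work being timed should run long enough to register.

```c
#include <stdlib.h>
#include <time.h>

static int cmp_double(const void *x, const void *y)
{
    double a = *(const double *)x, b = *(const double *)y;
    return (a > b) - (a < b);
}

/* Median of n > 0 samples; sorts the array in place. */
double median(double *v, size_t n)
{
    qsort(v, n, sizeof v[0], cmp_double);
    if (n % 2 == 1)
        return v[n / 2];
    return (v[n / 2 - 1] + v[n / 2]) / 2.0;
}

/* Runs `work` `runs` times (runs > 0), stores seconds per run in
   `samples`, and returns the median. */
double median_runtime(void (*work)(void), double *samples, size_t runs)
{
    for (size_t r = 0; r < runs; r++) {
        clock_t start = clock();
        work();
        samples[r] = (double)(clock() - start) / CLOCKS_PER_SEC;
    }
    return median(samples, runs);
}
```

The median ignores the occasional slow run caused by other activity on the machine, so two builds can be compared on a steadier number.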

Also, this actually tests only two things: the contact
neural net and the CalculateInputs function. (Also notice how
the speed increases after some milliseconds. I believe this
comes from slowness in the GTK interface, but I may be wrong.)

So there is no measurement of the race net, the bearoff
databases, the pruning nets, and so on.

So, should we maybe make another benchmark test?
A test that covers all the neural nets, all the databases,
all the input calculations and so on. I guess we'll have to make
a set of positions for such a benchmark, and these positions
should then not be selected at random, as today's speed test does.

If you try some different compiler switches or other things,
please give us your results. We'll be really thankful.

Hope to hear from you again,

PS: Did I answer too much?

------ Original message ------
From: "J.Schmidt (MS-Division GmbH)" <address@hidden>
To: address@hidden
Date: Mon, 06 Nov 2006 11:40:05 +0100
Subject: Re: [Bug-gnubg] 50% decrease in evaluation speed

Hello all,

my name is Joachim, and I'm new to this group.
My first step in getting involved is to recompile the source.
Compiling for Windows (XP) results in 50% less evaluation speed
compared to the binary download (27448 to 4029295).
I followed the instructions from Superfly Jon (08/02/06).
In both versions SSE is available and used, so the only change I applied was
to turn on SSE ( COMPOPT=-g -O2 -mms-bitfields -msse).
Is it possible that the GNU compiler creates different machine code
under Windows, or is something wrong with my compiler settings?
Or maybe the code checked out from CVS is very different in terms of
speed from the code used in the binary download?

Thanks for any hint,

Bug-gnubg mailing list
