From: Ingo Macherius
Subject: RE: [Bug-gnubg] Benchmarks on server class machines and resulting change requests
Date: Fri, 7 Aug 2009 04:23:56 +0200
Jon,
please find attached the cleaned-up benchmark data for both the 2xXeon 5130 and
the 2xNocona machines.
I have also done new research, which now covers the impact of cache size,
single-threaded vs. multithreaded binary, and number of threads. The main
result graph is attached; the data is in the same spreadsheet as the two other
benchmarks (OpenOffice 3.1 format), in 3 worksheet tabs.
The basis of the experiment was the same 5 seven-point FIBS matches used for
the previous benchmarks. Two binaries were compiled, one with multithreading
(GNUBGMT) and one without (GNUBGST). Both were compiled with gcc 4.3.2.1 on
Debian 5.0.2, heavily optimized for core2 CPUs. SSE and SSE2 are used; the code
base is gnubg.org CVS as of 2 August 2009. The hardware is a Supermicro 2xXeon
5130 machine with 6GB DDR2-5300 memory. The machine was completely idle during
testing.
The 5 matches were analyzed 4 times each, resulting in a total of 20 match
evaluations at 2ply/no pruning/cubeful. All caches were cleared before each
analysis. Cache size was varied from 2^1 to 2^27 bytes, resulting in 27 runs
for each graph.
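
For anyone who wants to reproduce the sweep, the run plan described above can be written down in a few lines of Python. This is only a sketch: the match names are placeholders I made up, and only the counts (5 matches, 4 repeats, cache sizes 2^1 to 2^27) come from the description above.

```python
# Sketch of the run plan. MATCHES holds hypothetical placeholder names;
# only the numbers are taken from the benchmark description.
MATCHES = [f"match{i}" for i in range(1, 6)]  # the 5 seven-point FIBS matches
REPEATS = 4                                   # each match is analyzed 4 times
CACHE_EXPONENTS = range(1, 28)                # exponents for 2^1 .. 2^27


def cache_sizes():
    """Return the cache sizes swept for one graph, one benchmark run per size."""
    return [2 ** e for e in CACHE_EXPONENTS]


def evaluations_per_run():
    """Return the (match, repeat) pairs analyzed in a single benchmark run."""
    return [(m, r) for r in range(REPEATS) for m in MATCHES]


if __name__ == "__main__":
    sizes = cache_sizes()
    print(len(sizes), sizes[0], sizes[-1])  # 27 2 134217728
    print(len(evaluations_per_run()))       # 20
```
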
* Graphs "Threads=1,2,3,4,5" were done with the MT binary (GNUBGMT) and the
respective settings for cache and threads, 20 matches
* Graph "No Threading" was done with GNUBGST, 20 matches
* Graph "4xNo threading á 1/4 work" was done by running 4 instances of GNUBGST
in parallel, with 5 matches to analyze each
* Graph "4xThreads=1 á 1/4 work" was done by running 4 instances of GNUBGMT in
parallel, each set to use one thread, with 5 matches to analyze each
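
The "1/4 work" split used for the two 4x graphs amounts to cutting the 20 evaluations into 4 equal batches, one per instance. A minimal sketch of that partitioning, with placeholder match names and a helper name (split_work) that is mine rather than anything from gnubg:

```python
def split_work(items, n_instances):
    """Split items into n_instances contiguous batches of equal size."""
    size, remainder = divmod(len(items), n_instances)
    if remainder:
        raise ValueError("work does not divide evenly among the instances")
    return [items[i * size:(i + 1) * size] for i in range(n_instances)]


if __name__ == "__main__":
    # 5 placeholder matches, each listed 4 times, gives 20 evaluations.
    matches = [f"match{i}" for i in range(1, 6)]
    work = matches * 4
    for instance, batch in enumerate(split_work(work, 4)):
        print(instance, batch)
```

With the work list ordered this way, each of the 4 instances receives each of the 5 matches exactly once.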
Additional remarks:
- The "spontaneous speedup" spikes seen especially for Threads=2 are odd. I did
several runs and they did not disappear, but showed up with different frequency
and at different cache sizes. I consider them bugs in the Unix time command.
- Data for Threads=6,7,8 was also collected but is not plotted because, as
expected, performance decreased with a growing number of threads. The graph for
Threads=5 shows that sufficiently; no need to clutter the diagram with more.
- The "4xThreads..." and "4x No Threading" runs aborted with out of memory for
cachesize=2^26 and 2^27 (no surprise), thus there is no data for them.
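
One way to cross-check the spikes independently of the time command would be to take wall-clock timings from a small wrapper script. The sketch below uses Python's monotonic time.perf_counter; the command it times is a stand-in, since the actual gnubg invocation is not shown here and would have to be filled in.

```python
import statistics
import subprocess
import sys
import time


def time_command(argv, repeats=3):
    """Run argv `repeats` times and return the wall-clock duration of each run in seconds."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()  # monotonic, high-resolution clock
        subprocess.run(argv, check=True,
                       stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
        durations.append(time.perf_counter() - start)
    return durations


if __name__ == "__main__":
    # Stand-in command: replace with the real benchmark invocation.
    runs = time_command([sys.executable, "-c", "pass"])
    print("min:", min(runs), "median:", statistics.median(runs))
```

If the spikes also show up in timings taken this way, the time command is probably not the cause.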
I would very much like to hear some comments from you, Jonathan (the author of
the threading code). Happy with what you see? Well, I think you did a good job :)
Ingo
gnubg_threads_and_cache_vs_batch_eval_speed.png
Description: PNG image