Re: [Bug-gnubg] Re: Getting gnubg to use all available cores


From: Jonathan Kinsey
Subject: Re: [Bug-gnubg] Re: Getting gnubg to use all available cores
Date: Sun, 16 Aug 2009 12:09:51 +0000

Hi All,

I'm just back from a holiday. The limit is an arbitrary number, so it can
simply be increased if someone has more than 16 cores (there are some new
64-core boxes coming out, but they're not really consumer level). I think the
limit is only there to avoid some small fixed memory overheads in the code.
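
As a rough illustration of what such a fixed overhead can look like, here is a
minimal sketch. It is not the actual gnubg source: apart from the
MAX_NUMTHREADS macro name, the struct and function names are made up for the
example. The point is only that a statically sized table costs memory in
proportion to the cap, whether or not that many threads are ever started.

```c
#include <stddef.h>

/* Compile-time cap on worker threads. Only the macro name comes from
 * multithread.h; everything else below is hypothetical. */
#ifndef MAX_NUMTHREADS
#define MAX_NUMTHREADS 16
#endif

/* Hypothetical per-thread bookkeeping record. */
typedef struct {
    int id;        /* slot index */
    int busy;      /* non-zero while the slot is running a task */
    void *scratch; /* per-thread scratch buffer, if any */
} thread_slot;

/* Statically sized table: its size depends on the cap, not on how many
 * threads the user actually asks for. */
static thread_slot slots[MAX_NUMTHREADS];

/* Memory reserved by the table, in bytes. */
size_t fixed_overhead_bytes(void)
{
    return sizeof(slots);
}

/* Clamp a requested thread count to the range 1..MAX_NUMTHREADS. */
int clamp_thread_count(int requested)
{
    if (requested < 1)
        return 1;
    if (requested > MAX_NUMTHREADS)
        return MAX_NUMTHREADS;
    return requested;
}
```

With a layout like this, raising the cap from 16 to 64 only grows the table by
a few dozen small records, which is why the overhead is small but fixed.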

Jon

Christian Anthon wrote:
> Hi Michael,
>
> thx for investigating this. My answer would have been along the lines
> of "beyond our control". Jon coded most of the threading stuff and I
> believe that the MAX_NUMTHREADS is indeed somewhat arbitrary. However
> I believe there is a bit of memory consumption and possibly also extra
> cpu time involved in setting it higher. Hopefully Jon will pitch in.
>
> Christian.
>
> On Fri, Aug 7, 2009 at 6:55 AM, Michael Petch wrote:
>> Howdy Louis,
>>
>> I think that MAX_NUMTHREADS was an artificial limit set by the hardware of
>> the day. Christian can likely tell you why it is 16 specifically but I am
>> assuming that it was a somewhat arbitrary (and reasonable) value based on
>> the cores available on most systems.
>>
>> On to your OS/X issue. I did a bit of research, and my original suggestion
>> of waiting for Snow Leopard may actually be all that is required.
>>
>> Nehalem processors diverge from the previous generation of Intel processors
>> because they are no longer based on SMP (Symmetric MultiProcessor) designs.
>> In an SMP system, generally all processors have access to main memory (RAM)
>> via a single data bus. The problem of course is that the more cores you
>> have, the more contention there is for memory reads/writes on that one bus.
>>
>> Intel decided that SMP designs likely will not scale properly in the future
>> when dealing with large core counts (32, 64, 128 cores etc.), so they moved
>> their Nehalem design to NUMA-type systems instead of SMP. NUMA is
>> non-uniform memory access. In this type of design cores may not necessarily
>> be able to share memory with other processors without some help. I'm not
>> going to get into the gory details, but the bus system Intel is pushing is
>> the QPI (QuickPath Interconnect) bus. This literally replaces the good old
>> FSB (Front Side Bus).
>>
>> NUMA architectures do allow for the concept of "Remote" and "Local" data.
>> Shared data may not be directly available to a processor, but it can be
>> retrieved remotely, which is slower. Operating system kernels need NUMA
>> support in order for shared data access on different buses to work
>> properly.
>>
>> So you're asking, why tell me all this? Well, the answer is simple. Apple in
>> their infinite wisdom started using new QPI/NUMA hardware without actually
>> fully implementing NUMA in its current kernel! This hasn't been well
>> documented by Apple but it was discovered when companies started running
>> Xserve on the new QPI/Nehalem systems.
>>
>> Without proper NUMA support, processors can't arbitrarily share memory with
>> all other processors, which seems to be the case here with Gnubg. Gnubg
>> launches in a single process and then asks OS/X to create threads (with
>> shared memory requirements). It appears that by default each processor is
>> considered a separate entity without sharing (on OS/X Leopard). The
>> exception is that each core appears as 2 virtual cores. Virtual cores are on
>> the same processor, and thus the same bus, so one can share memory across
>> them.
>>
>> It seems when Gnubg launches, all the threads are created on one processor
>> (the processor is originally chosen by OS/X) and accessible by 2 virtual
>> cores (Using Hyperthreading). It seems Apple did this so they could put out
>> new equipment before the next OS (Snow Leopard) was released.
>>
>> So what does Snow Leopard have that Leopard doesn't? NUMA support.
>>
>> My guess is that if you got your hands on Snow Leopard you may find that
>> what you are seeing changes. Apparently this very problem exists for people
>> using CS4 (Adobe's Creative Suite 4).
>>
>> Linux supports NUMA, so if you are feeling adventurous you might try
>> installing Linux on your Apple hardware and see what happens.
>>
>> Your chess program may work because of the way it splits up tasks (It may
>> even use a combination of Posix Threads and separate process spaces). I
>> haven't seen the source code so it's very hard to say.
>>
>> Michael Petch
>>
>> On 06/08/09 10:29 AM, "Louis Zulli" wrote:
>>
>>> Hi,
>>>
>>> I put
>>>
>>> #define MAX_NUMTHREADS 64
>>>
>>> in multithread.h and rebuilt.
>>>
>>> In Settings-->Options-->Other, I put Eval Threads to 64.
>>>
>>> I then let gnubg analyze a game using 4-ply analysis.
>>>
>>> According to my unix top command, gnubg had 69 threads and was using
>>> 188% CPU. So apparently all the threads were running (into each other!)
>>> in one physical core.
>>>
>>> In any case, increasing the max number of threads above 16 seems
>>> trivial to do, unless I'm missing something.
>>>
>>> Louis
>>>
>>>
>>> On Aug 6, 2009, at 11:34 AM, Ingo Macherius wrote:
>>>
>>>> Do you use the calibrate command or a batch analysis of matchfiles? The
>>>> former was shown to be of no value for benchmarks, see here:
>>>> http://lists.gnu.org/archive/html/bug-gnubg/2009-08/msg00006.html
>>>>
>>>> With calibrate I had the very same effect of high idle times during
>>>> benchmarks, unless I used at least 8 threads per physical core.
>>>>
>>>> I am running a benchmark on a 4-core machine which iterates over the
>>>> number of threads (1..6) and cache size (2^1 .. 2^27). It should be posted
>>>> in, say, 3 hours; it literally is still running :)
>>>>
>>>> Ingo
>>>>
>>>>> -----Original Message-----
>>>>> From: address@hidden
>>>>> [mailto:address@hidden On
>>>>> Behalf Of Louis Zulli
>>>>> Sent: Thursday, August 06, 2009 3:21 PM
>>>>> To: Michael Petch
>>>>> Cc: address@hidden
>>>>> Subject: [Bug-gnubg] Re: Getting gnubg to use all available cores
>>>>>
>>>>>
>>>>>
>>>>> On Aug 5, 2009, at 4:02 PM, Michael Petch wrote:
>>>>>
>>>>>> I'm unsure how the architecture is deployed and how OS/X handles the
>>>>>> physical cores, but it almost sounds like one physical core is being
>>>>>> used (using Hyperthreads to run 2 threads simultaneously). I wonder if
>>>>>> the memory is shared across all the cores? A friend of mine was
>>>>>> suggesting that people may have to wait for Snow Leopard to come out
>>>>>> before OS/X properly utilizes the Nehalem architecture (whether that is
>>>>>> true or not, I don't know). Anyway, as an experiment: if you run 2
>>>>>> copies of Gnubg at the same time (using multiple threads), do you get
>>>>>> 400% CPU usage?
>>>>>>
>>>>>
>>>>> Hi Mike,
>>>>>
>>>>> Sorry for the delay. I just had two copies of gnubg analyze the same
>>>>> game, using 3 ply analysis. Each instance of gnubg used 200% CPU. Each
>>>>> copy was set to use 4 evaluation threads.
>>>>>
>>>>> So what's the verdict here? Is Leopard simply not directing threads
>>>>> correctly?
>>>>>
>>>>> Louis
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> _______________________________________________
>>>>> Bug-gnubg mailing list
>>>>> address@hidden http://lists.gnu.org/mailman/listinfo/bug-gnubg




