Re: [ESPResSo-users] CUDA warning for slower atomicAdd emulation

From: Stefan Kesselheim
Subject: Re: [ESPResSo-users] CUDA warning for slower atomicAdd emulation
Date: Thu, 30 Jan 2014 08:57:45 +0100

Hi Vincent,
I just want to make clear how it works. The atomics are used in the particle-LB 
coupling. One thread per particle calculates which force to apply to which 
node. The forces of every node are accumulated, and this has to be done in 
atomics, because each update is a "read-add-writeback", and if more than one 
thread does this to the same memory cell at the same time, updates get lost.
Normally you are in one of two regimes. Either you have a very dilute system 
(far fewer particles than lattice nodes); then collisions are quite rare 
because only a small fraction of the LB nodes is written to. Or you have many 
particles (more than the ~1000 cores on the GPU); then the writes are 
distributed randomly over the memory and collisions are also rare. In this 
case it is very useful to go over the particles in random order, to make sure 
that the memory positions are not correlated. Atomics on modern hardware are 
only expensive if collisions happen.
Therefore: atomic operations are not too bad.
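
To illustrate the accumulation pattern described above, here is a minimal 
sketch in plain C++ on CPU threads. It is not the ESPResSo CUDA kernel. The 
names atomic_add_float and scatter_forces, the flat node index, and the 
one-float-per-node force are all assumptions made for the example. The 
compare-and-swap retry loop shows the general idea behind emulating a float 
atomic add where no native one exists: a retry only happens when two threads 
hit the same cell at the same time.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical helper: float add built from compare-and-swap. This is the
// "read-add-writeback" step made safe by retrying when another thread
// changed the cell in between.
inline void atomic_add_float(std::atomic<float>& cell, float value) {
    float seen = cell.load(std::memory_order_relaxed);
    // On failure compare_exchange_weak reloads `seen`, so the sum is
    // recomputed from the fresh value on the next attempt.
    while (!cell.compare_exchange_weak(seen, seen + value,
                                       std::memory_order_relaxed)) {
    }
}

// Hypothetical scatter step: each worker walks its share of the particles
// and adds the particle's force to the node that particle couples to.
inline std::vector<float> scatter_forces(
        const std::vector<std::size_t>& node_of_particle,
        const std::vector<float>& force_of_particle,
        std::size_t n_nodes, unsigned n_threads) {
    assert(node_of_particle.size() == force_of_particle.size());

    std::vector<std::atomic<float>> node_force(n_nodes);
    for (auto& f : node_force) f.store(0.0f);

    std::vector<std::thread> workers;
    for (unsigned t = 0; t < n_threads; ++t) {
        workers.emplace_back([&, t] {
            // Strided loop: worker t handles particles t, t + n_threads, ...
            for (std::size_t p = t; p < node_of_particle.size(); p += n_threads) {
                atomic_add_float(node_force[node_of_particle[p]],
                                 force_of_particle[p]);
            }
        });
    }
    for (auto& w : workers) w.join();

    // Copy the accumulated sums out of the atomics.
    std::vector<float> result(n_nodes);
    for (std::size_t i = 0; i < n_nodes; ++i) result[i] = node_force[i].load();
    return result;
}
```

If the particles map to mostly distinct nodes, the loop in atomic_add_float 
almost always succeeds on the first try, which matches the point above that 
atomics are only expensive when collisions happen.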

On Jan 30, 2014, at 8:49 AM, Vincent Ustach <address@hidden> wrote:

> Thanks Georg, good news. I thought I would ask because atomic add is a slow 
> operation, therefore I want to avoid making the slow step slower!
> Cheers,
> Vincent
> On Jan 29, 2014 11:44 PM, "Georg Rempfer" <address@hidden> wrote:
> Your GPU has compute_capability 2.0, as you can see here:
> 2014-01-30 Georg Rempfer <address@hidden>
> Hello Vincent,
> the Espresso build system creates Cuda binaries for compute capability 1.1 as 
> well as 2.0. Compute capability 1.1 does not allow for float atomic adds, so 
> in this case we use a little workaround, which is slower than the native 
> float atomic add present in later compute capabilities. You always get this 
> warning message, unless you remove the "-gencode 
> arch=compute_11,code=compute_11" from". Most likely it has no 
> impact on your performance, since Cuda determines at runtime which executable 
> to use (depending on the compute capability of your Cuda device). So unless 
> you have some pretty old GPU, you will be using the 2.0 binary, which in turn 
> makes use of the native atomic add for floats.
> Greetings,
> Georg
> 2014-01-30 Vincent Ustach <address@hidden>
> Hi All,
> Upon running make on a new build of the developer's version of Espresso, I 
> saw this and several similar warnings:
> ../../src/ warning: #warning Using slower atomicAdd 
> emulation
> Is this a major performance concern, or, since it refers to an emulation, 
> does it only matter for debugging?
> See attached for the results of make. I have the configure results as well, 
> if that will help. By the way, I am using cudatoolkit-5.5 and the GPU is a 
> Tesla M2050.
> Best Regards,
> --Vincent Ustach
>   University of California, Davis
