Re: [ESPResSo-users] different simulation scenario on desktop or cluster

From: Arash Azari
Subject: Re: [ESPResSo-users] different simulation scenario on desktop or cluster
Date: Fri, 9 Aug 2013 08:30:28 -0700 (PDT)

Dear Axel,

Thank you for your reply.
I am going to use the catch command and sample the energies every step to see 
what causes the crash in this simulation.
I will post the result soon.
Thank you,

Best regards,

Arash Azari

----- Original Message -----
From: Axel Arnold <address@hidden>
To: Arash Azari <address@hidden>
Cc: "address@hidden" <address@hidden>
Sent: Thursday, August 8, 2013 5:14 PM
Subject: Re: [ESPResSo-users] different simulation scenario on desktop or cluster

Dear Arash,

Do I understand correctly that the warmup works even on 12 cores? Since 
the warmup uses the same integrator and only changes the force capping, 
this is quite likely a problem with the setup that shows up after the 
warmup caps are switched off. You might need to output the kinetic 
energy very often after switching off capping, maybe even every time 
step, since it usually blows up very fast.
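
As a rough illustration of such an every-step check (a sketch in Python rather than ESPResSo's Tcl, so that it runs on its own): the kinetic energy should stay near 1/2 N k_BT, with N the number of degrees of freedom, and the first sample far above that value marks the step where things go wrong. The function names, the factor of 10, and the sample values are assumptions made for illustration, not ESPResSo output.

```python
def expected_kinetic_energy(n_dof, kT):
    """Equipartition estimate: 1/2 * N * k_B T for N degrees of freedom."""
    return 0.5 * n_dof * kT


def first_runaway_step(samples, n_dof, kT, factor=10.0):
    """Return the index of the first sample above factor * expected, or None."""
    limit = factor * expected_kinetic_energy(n_dof, kT)
    for step, e_kin in enumerate(samples):
        if e_kin > limit:
            return step
    return None


# Hypothetical per-step kinetic energies for 1000 particles in 3D at kT = 1.
samples = [1490.0, 1510.0, 1600.0, 90000.0]
print(first_runaway_step(samples, n_dof=3000, kT=1.0))
```

The threshold factor is arbitrary; anything that separates normal fluctuations from a blow-up should do.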

Also, what you can do is to use the Tcl catch command to catch the 
error, and write out energies and particle positions at the time when 
they go through the roof.
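
The catch idea could be sketched as follows, in Python with try/except standing in for Tcl's catch so that the example is self-contained. `ToySystem`, `step`, `snapshot`, and `run_with_dump` are hypothetical names, and the toy energy growth is invented; in a real script the snapshot would write out the energies and particle positions.

```python
class ToySystem:
    """Stand-in for a simulation whose energy blows up after a few steps."""

    def __init__(self):
        self.energy = 1.0

    def step(self):
        # Toy dynamics: the energy grows tenfold per step and then "crashes".
        self.energy *= 10.0
        if self.energy > 1e3:
            raise RuntimeError("bond broken")

    def snapshot(self):
        # A real script would also record the particle positions here.
        return {"energy": self.energy}


def run_with_dump(step, snapshot, n_steps):
    """Integrate step by step; on an error, return (step index, message, state)."""
    for i in range(n_steps):
        try:
            step()
        except RuntimeError as err:
            # Capture the state at the moment of failure instead of aborting.
            return i, str(err), snapshot()
    return None


toy = ToySystem()
print(run_with_dump(toy.step, toy.snapshot, n_steps=10))
```

Integrating one step at a time is slower, but it pins down the exact step at which the error appears.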


On 08/08/2013 04:43 PM, Arash Azari wrote:
> Dear Axel,
>
> Thank you very much indeed for the detailed and very useful explanation and
> recommendations.
>
> The simulation on the cluster with a single core runs without any error
> messages. Strangely enough, it works with 3 cores as well. I am going
> to increase the number of cores on a single node up to 12 again.
>
> I checked the kinetic energy and it is in a proper range (the outputs of the
> 6-core desktop run and the single-core cluster run are in agreement).
>
> Actually, I have used the individual force cap for warmup in this simulation
> and I set the cap radius for each interaction. I have tried a very
> long warmup as well, something around 1 million steps, but I still get
> the same error.
>
> I will post the latest updates soon after I finish the test simulations.
>
> Thank you very much,
> Best regards,
> Arash
> Arash Azari
> ________________________________
> From: Axel Arnold <address@hidden>
> To: Arash Azari <address@hidden>
> Cc: "address@hidden" <address@hidden>
> Sent: Wednesday, August 7, 2013 11:01 PM
> Subject: Re: [ESPResSo-users] different simulation scenario on desktop or 
> cluster
> Hi!
>
> Bonded interactions are only computed for particles that are located on the
> same CPU. If you increase the number of cores, the range over which a bond
> can be computed gets shorter. However, any reasonable bond is much shorter
> than your box dimensions, so bond broken errors definitely point to
> excessive forces. That it "works" on your desktop probably simply means
> that on that machine the long bonds can still be accommodated, but it is
> very likely that your simulation is still unphysical. In particular, I doubt
> that the problem is due to MPI or the machine; it is rather a problem of
> your setup. An easy check would be to run with just 6 cores or even one core
> on the cluster, just as on your desktop.
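
A rough picture of why more cores shorten that range, sketched in Python under a simplifying assumption: the box is split evenly among the nodes along each axis, and the smallest subdomain edge is taken as a stand-in for the longest bond that can still be handled. The actual criterion in ESPResSo depends on its cell and ghost-layer settings, and the box size and node grids below are hypothetical.

```python
def smallest_subdomain_edge(box_l, node_grid):
    """Smallest edge of one node's subdomain: box length / node count per axis."""
    return min(length / nodes for length, nodes in zip(box_l, node_grid))


box_l = (30.0, 30.0, 30.0)  # hypothetical box dimensions
# Hypothetical node grids for 1, 6 and 12 cores.
for node_grid in [(1, 1, 1), (3, 2, 1), (3, 2, 2)]:
    print(node_grid, smallest_subdomain_edge(box_l, node_grid))
```

In this picture, a bond stretched by excessive forces can still fit on a few cores but no longer on many, which matches the behaviour described above.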
>
> To check the physics of your simulation, just write out the energies, and
> check in particular that the kinetic energy fluctuates around 1/2 N k_BT,
> where N is the number of degrees of freedom. Also, there should be no
> other strong energy drift.
>
> Under most circumstances, it is very likely that you actually need a warmup
> phase. However, when combined with walls, capping the wall forces is usually
> not a good idea, since particles are then not hindered from penetrating the
> hard core of the wall constraint. Therefore, you should use the individual
> force cap feature, and only set a cap radius for the particle-particle
> interactions; see the User's guide for details on "inter forcecap
> individual".
>
> Cheers,
> Axel
> On 07.08.13 12:36, Arash Azari wrote:
>> Hello everyone,
>> I have a very strange situation and I cannot find any proper solution for 
>> it; I highly appreciate any recommendations.
>> Here is the problem:
>> I have a simulation system (polymer and ions with repulsive wall) and when I 
>> run this simulation on my desktop everything is fine (run on single CPU with 
>> 6 cores) regardless of skin parameter (0.4) and the warm up steps; it works 
>> even with a very short warm up.
>> When I try to run this simulation on a cluster, it crashes a few steps after 
>> the warm up, sometimes with a bond broken error and sometimes with a wall 
>> constraint violation error. I tried different combinations of nodes and 
>> cores, and it crashes even on a single node with 2 CPUs (run on 12 cores). I 
>> have increased the skin parameter up to 2.0 and used a very long warm up, and 
>> I still get the same error messages a few steps after the warm up.
>> I should mention that I did not cap the interactions between the particles 
>> (polymer or ion) and the walls during the warm up.
>> I am not sure whether it is because of the MPI settings on the cluster or 
>> not, but the cluster administrators are not at all helpful when asked about 
>> the system settings and configuration.
>> I attached the config.log file if it helps.
>> Thank you very much,
>> Best regards,
>> Arash
>> Arash Azari

JP Dr. Axel Arnold
ICP, Universität Stuttgart
Pfaffenwaldring 27
70569 Stuttgart, Germany
Email: address@hidden
Tel: +49 711 685 67609
