
[ESPResSo-users] Re: problems on running parallel Espresso


From: Yanping Fan, Liza (Dr)
Subject: [ESPResSo-users] Re: problems on running parallel Espresso
Date: Thu, 16 Dec 2010 11:45:57 +0800

Hi Alex, thanks for the reply.

I switched the skin to 3 or 2, adjusted my LJ settings, and changed the P3M
accuracy from 1e-06 to 1e-05, but this led to other problems.
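
The changes were roughly of the following form (a minimal Tcl sketch in the
ESPResSo 2.x syntax; apart from the box size and skin, the values below are
placeholders rather than the actual parameters of my script):

    # Minimal sketch of the kind of changes made (placeholder values,
    # not the actual parameters of the production script).
    setmd box_l 400.0 400.0 400.0
    setmd time_step 0.01
    setmd skin 2.0                          ;# was 0.5, now tried 2 and 3

    # two placeholder charged particles so the P3M tuner has something to work on
    part 0 pos 10.0 10.0 10.0 q  1.0 type 0
    part 1 pos 30.0 10.0 10.0 q -1.0 type 0

    # purely repulsive LJ (placeholder epsilon/sigma): eps sigma cutoff shift offset
    inter 0 0 lennard-jones 1.0 10.0 11.225 0.25 0.0

    # P3M tuned to a looser accuracy (1e-5 instead of 1e-6); the Bjerrum
    # length here is a placeholder as well
    inter coulomb 7.1 p3m tune accuracy 1e-5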

The run either stopped outputting new data or crashed with this error message:

0: Script directory: /home/korolev/espresso-2.1.2j_p/scripts
[compute82:27839] *** Process received signal ***
[compute82:27839] Signal: Segmentation fault (11)
[compute82:27839] Signal code: Address not mapped (1)
[compute82:27839] Failing at address: 0x2aaaef555a78
[compute82:27839] [ 0] /lib64/libpthread.so.0 [0x39f640e4c0]
[compute82:27839] [ 1] 
/home/korolev/espresso-2.1.2j_p/obj-Xeon_64-pc-linux/Espresso_bin(P3M_charge_assign+0x247c)
 [0x47d6ec]
[compute82:27839] [ 2] 
/home/korolev/espresso-2.1.2j_p/obj-Xeon_64-pc-linux/Espresso_bin(force_calc+0x207)
 [0x42e027]
[compute82:27839] [ 3] 
/home/korolev/espresso-2.1.2j_p/obj-Xeon_64-pc-linux/Espresso_bin(integrate_vv+0x56)
 [0x429ae6]
[compute82:27839] [ 4] 
/home/korolev/espresso-2.1.2j_p/obj-Xeon_64-pc-linux/Espresso_bin(mpi_integrate_slave+0x8)
 [0x413368]
[compute82:27839] [ 5] 
/home/korolev/espresso-2.1.2j_p/obj-Xeon_64-pc-linux/Espresso_bin(mpi_loop+0x5e)
 [0x41628e]
[compute82:27839] [ 6] 
/home/korolev/espresso-2.1.2j_p/obj-Xeon_64-pc-linux/Espresso_bin(main+0x67) 
[0x410a47]
[compute82:27839] [ 7] /lib64/libc.so.6(__libc_start_main+0xf4) [0x39f5c1d974]
[compute82:27839] [ 8] 
/home/korolev/espresso-2.1.2j_p/obj-Xeon_64-pc-linux/Espresso_bin [0x410479]
[compute82:27839] *** End of error message ***
--------------------------------------------------------------------------
mpirun noticed that process rank 1 with PID 27839 on node compute82 exited on 
signal 11 (Segmentation fault).

Also, I don't understand why different versions respond differently to the
same script. Thanks.


Best regards,

Liza

-----Original Message-----
From: address@hidden [mailto:address@hidden] On Behalf Of address@hidden
Sent: Thursday, December 16, 2010 1:03 AM
To: address@hidden
Subject: Espressomd-users Digest, Vol 2, Issue 5


Today's Topics:

   1. Re: problems on running parallel Espresso. (Axel Arnold)


----------------------------------------------------------------------

Message: 1
Date: Wed, 15 Dec 2010 11:34:54 +0100
From: Axel Arnold <address@hidden>
Subject: Re: [ESPResSo-users] problems on running parallel Espresso.
To: address@hidden
Message-ID: <address@hidden>
Content-Type: text/plain;  charset="iso-8859-15"

On Monday 13 December 2010 11:39:43 Yanping Fan, Liza (Dr) wrote:

> background_errors 0 {079 bond broken between particles 294, 295 and 296
> (particles not stored on the same node)} 6 {079 bond broken between
> particles 225, 226 and 227 (particles not stored on the same node)} 7 {079
> bond broken between particles 331, 332 and 333 (particles not stored on the
> same node)}
>
> The bonds break because the particles are stored on different nodes. My
> simulation box is 400A*400A*400A, and all my equilibrium bond lengths are
> between 20 and 40A. I have been advised to increase the parameter "skin" for
> my Verlet lists (originally I set it to 0.5). With a skin of 0.5, there is
> supposedly quite a possibility that two bonded particles end up on two
> different processors.

The skin won't help, since it is only a tuning parameter. As a side effect it
also makes broken bonds less likely, but a bond only really breaks when it is
far outside its equilibrium distance. The bonded particles are not on the same
node, which means that they are further apart than 200A, assuming that you are
using 8 processors. On a single node that won't cause the simulation to fail,
since all particles live on the same processor, but physically it is still
wrong.
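
To make that number concrete: with 8 processes ESPResSo uses a 2x2x2 node
grid, so each processor owns a 200A x 200A x 200A piece of your 400A box.
Here is a minimal Tcl sketch (assuming your box size; the node grid is
whatever the run was started with) that prints the per-processor domain
extent:

    # sketch: per-processor domain extent for a 400 A cubic box
    setmd box_l 400.0 400.0 400.0
    set grid [setmd node_grid]   ;# e.g. "2 2 2" when started on 8 MPI processes
    set box  [setmd box_l]
    for {set d 0} {$d < 3} {incr d} {
        puts "domain extent along axis $d: [expr {[lindex $box $d] / double([lindex $grid $d])}]"
    }
    # with a 2 x 2 x 2 node grid this prints 200.0 for each axis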

> I increased the "skin" to 30 and the above error message disappeared, but
> another error occurred:
> --------------------------------------------------------------------------
> One of the processes started by mpirun has exited with a nonzero exit
> code.  This typically indicates that the process finished in error. If your
> process did not finish in error, be sure to include a "return 0" or
> "exit(0)" in your C code before exiting the application.
>
> PID 13898 failed on node n0 (192.168.2.160) due to signal 11.
>
> If I turn the skin to 30 and run on 2, 4, or 8 CPUs, they all show a
> "Segmentation fault" error. Please look at the error message and log file
> attached. I think the problem may be due to ESPResSo distributing particles
> over different nodes/processes.

No, the problem is that with such a large skin you can only have rather small
real-space cutoffs, which means that the electrostatics has to use very large
grids to compensate for the real-space error, and then doesn't get enough
memory. A skin of at most 5 is a much better idea. Also note that you might be
requesting too high a precision from P3M, in which case it also uses too much
memory.
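
For example (just a sketch; the Bjerrum length and accuracy below are
placeholder values, not taken from your script), with your particles already
set up and a moderate skin you can let P3M tune itself and then check which
mesh and real-space cutoff it actually chose:

    # sketch: moderate skin, moderate accuracy, then inspect the tuned parameters
    setmd skin 3.0                              ;# at most 5, as suggested above
    inter coulomb 7.1 p3m tune accuracy 1e-4    ;# placeholder Bjerrum length and accuracy
    puts [inter coulomb]                        ;# prints the tuned r_cut, mesh and cao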

Many regards,
Axel

--
JP Dr. Axel Arnold Tel: +49 711 685 67609
ICP, Universität Stuttgart      Email: address@hidden
Pfaffenwaldring 27
70569 Stuttgart, Germany







