Re: bohup / molpro don't work together ?

From: Bob Proulx
Subject: Re: bohup / molpro don't work together ?
Date: Sat, 16 Mar 2002 12:51:08 -0700

> I do have some problems with the molpro quantum chemistry program
> exiting due to Signal 1 (Hangup) even though it is run through
> nohup. The hangup-exit is reproducible and happens after the program
> is running for hours or days.

Why is the program being sent a SIGHUP?  That normally happens when
you close a connection, such as when you are running remotely across
an rlogin or telnet connection and the network connection drops.  This
is normally a good thing as it cleans up orphaned processes.

In particular, if that is not what is happening, then you want to know
where the SIGHUP signal is coming from and why it is being sent.
Signals like that don't just happen spontaneously.  A SIGHUP indicates
an event of some type, and that event might be the root cause of your
problem.  You probably want to avoid having the signal sent to the
process in the first place, as opposed to requiring the process to
ignore it when it happens.

> | nohup molpro --nouse-logfile file.inp &

The nohup program sets SIGHUP to be ignored and then runs your
command, which inherits that setting.  I imagine that your molpro
program is specifically catching the signal again.  If so, it is
getting the signal because it asked to get the signal.  That is why it
is important to determine why it is being sent a SIGHUP in the first
place and avoid it.
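
One way to check that guess without sending any signals, sketched
here for Linux only and assuming /proc is mounted: the SigIgn field of
/proc/PID/status is a hex bitmask of the signals a process ignores,
and its lowest bit stands for signal 1 (SIGHUP).  The helper name
hup_ignored is made up for this example.

```sh
#!/bin/sh
# Sketch (Linux /proc only): report whether a process ignores SIGHUP.
# hup_ignored is a hypothetical helper name, not a standard tool.
# Returns 0 if SIGHUP is ignored, 1 if it is not, 2 if the PID cannot be read.
hup_ignored() {
    pid=$1
    # SigIgn is a hex bitmask of ignored signals; bit 0 is signal 1 (SIGHUP).
    mask=$(awk '/^SigIgn:/ {print $2}' "/proc/$pid/status" 2>/dev/null) || return 2
    [ -n "$mask" ] || return 2
    # Bit 0 is set exactly when the last hex digit is odd.
    case $mask in
        *[13579bdfBDF]) return 0 ;;
        *) return 1 ;;
    esac
}
```

Run "hup_ignored PID && echo ignored" against the running molpro
process.  If it reports that SIGHUP is not ignored even though the job
was started under nohup, then the program has changed the signal
setting itself after startup.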

Here is how to test whether nohup is ignoring the signal on your
computer.  Start something simple, like sleep 300, and verify that it
is killed by SIGHUP.  Then run it again under nohup and verify that it
now is not.  Here is a walkthrough on my computer.

  address@hidden /tmp]$ sleep 300 >/dev/null 2>&1 < /dev/null &
  [2] 25477
  address@hidden /tmp]$ kill -HUP 25477
  [2]+  Hangup                  sleep 300 >/dev/null 2>&1 </dev/null

  address@hidden /tmp]$ nohup sleep 300 >/dev/null 2>&1 < /dev/null &
  [2] 25478
  address@hidden /tmp]$ kill -HUP 25478
  address@hidden /tmp]$ kill -HUP 25478
  address@hidden /tmp]$ kill -HUP 25478
  address@hidden /tmp]$ kill -HUP 25478
  address@hidden /tmp]$ ps
    PID TTY          TIME CMD
  23567 pts/12   00:00:00 bash
  25478 pts/12   00:00:00 sleep
  25480 pts/12   00:00:00 ps

As you can see, nohup set SIGHUP to be ignored and that was inherited
by the sleep process.  It was not killed by SIGHUP even though we sent
it several.  However, it is still attached to the terminal, and if it
is to work properly after a connection drop then we need to launch it
in such a way that it is not attached to the terminal in the first
place.

> | nohup molpro --nouse-logfile file.inp &

Try redirecting both input and output to either files or to /dev/null
so that they are not attached to your terminal.
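
As a rough sketch of that redirection, here is a small shell function
that starts any command under nohup with stdin, stdout and stderr all
pointed away from the terminal.  The function name run_nohup_detached
and the log file name in the usage line are made up for this example.

```sh
#!/bin/sh
# Sketch: launch a command under nohup with no standard stream on the tty.
# run_nohup_detached is a hypothetical helper name.
# Usage: run_nohup_detached LOGFILE COMMAND [ARGS...]
run_nohup_detached() {
    log=$1
    shift
    # stdout and stderr go to the log file, stdin comes from /dev/null,
    # so none of the three descriptors refers to the terminal.
    nohup "$@" >"$log" 2>&1 </dev/null &
    # Print the PID of the background job so the caller can record it.
    echo $!
}
```

For your case that would be something like
"run_nohup_detached molpro.out molpro --nouse-logfile file.inp".
Note that this only moves the file descriptors off the terminal.  The
process still belongs to the session of the terminal it was started
from.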

Ideally you would like your program that is being run 'nohup &' to
disassociate from the terminal and become a true daemon.  Then actions
on the originating terminal such as connection drops will not affect
the daemon program.  This either requires a code change to the program
to support daemon mode operation or it requires a wrapper program to
do this first and to spawn off your program second.

I have used a program I wrote called 'nohupd' for years for
specifically this purpose of spawning daemons.  But today a simple
perl script can do this pretty easily.  Here is one brute force
example.  I am sure there are cleaner ways of doing just this.  This
is just off of the top of my head.  YMMV

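  perl -MPOSIX -e '
    defined(my $pid = fork()) or die;  # fork failed
    exit 0 if $pid;                    # parent exits; child carries on
    setsid() or die;                   # child becomes leader of a new session
    exec("/bin/sleep","123") or die;   # replace the child with the real program
  ' >/dev/null 2>&1 < /dev/null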

  address@hidden /tmp]$ ps -ef |grep sleep
  bob      25598     1  0 12:29 ?        00:00:00 /bin/sleep 123

Note the "?" in the tty column showing that this program has
disassociated from the terminal.
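
On Linux systems that ship the setsid(1) utility from util-linux, much
the same effect can probably be had from plain shell.  This is only a
sketch under that assumption.  The function name spawn_session and the
pid file argument are made up for this example.  The pid file is there
because setsid(1) may fork, which makes $! in the calling shell
unreliable.

```sh
#!/bin/sh
# Sketch, assuming setsid(1) from util-linux is installed.
# spawn_session is a hypothetical helper name.
# Usage: spawn_session PIDFILE COMMAND [ARGS...]
spawn_session() {
    pidfile=$1
    shift
    # setsid starts sh in a new session with no controlling terminal.
    # The inner sh writes its own PID to the pid file (passed as $0),
    # then execs the real command, which keeps that same PID.
    setsid sh -c 'echo "$$" >"$0"; exec "$@"' "$pidfile" "$@" \
        >/dev/null 2>&1 </dev/null &
}
```

After "spawn_session /tmp/sleep.pid /bin/sleep 123", running
ps -o pid,sid,tty -p "$(cat /tmp/sleep.pid)" should show a session id
equal to the pid and a "?" in the tty column, the same as with the
perl version.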

Searching the web for how to write a unix daemon will turn up many
good references that may help you understand what is happening.


Good luck!
