Re: Emacs Hangs on Filesystem Operations on Stale NFS

From: Eli Zaretskii
Subject: Re: Emacs Hangs on Filesystem Operations on Stale NFS
Date: Mon, 11 Jun 2018 18:51:43 +0300

> Date: Mon, 11 Jun 2018 14:46:35 +0200
> From: Alexander Shukaev <address@hidden>
> Cc: Emacs-devel <address@hidden>,
>       Noam Postavsky <address@hidden>, Emacs developers <address@hidden>
> On 2018-06-11 14:40, Andreas Schwab wrote:
> > On Jun 11 2018, Alexander Shukaev <address@hidden> wrote:
> > 
> >> signal.signal(signal.SIGALRM, alarm_handler)
> >> signal.alarm(3)
> >> try:
> >>   proc = subprocess.Popen('stat ' + path,
> >>                shell=True,
> >>                stderr=subprocess.PIPE,
> >>                stdout=subprocess.PIPE)
> >>   stdoutdata, stderrdata = proc.communicate()
> >>   signal.alarm(0)
> >> except Alarm:
> >>   print "Timed out after 3 seconds..."
> > 
> > How do you know that 3 seconds is enough?
> > 
> > Andreas.
> You don't know.  You just decide that it's maximum tolerable for 
> you/your setup/hardware/connection/preferences/whatever, otherwise you 
> are 99.(9)% sure that something is wrong somewhere with your system, but 
> you don't give up your Emacs instance for that and rather get indicated 
> that there might be a potential problem.

I think there's more here than meets the eye.  Sure, it's quite easy
to come up with a toy program that uses SIGALRM to time out a system
call that went awry.  But Emacs is not a toy program, so doing that
has complications, even if we come up with a suitable number of
seconds to wait (which ain't easy, since some I/O calls could really
need a long time, for example reading a large file or directory).
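
For reference, here is a minimal Python 3 sketch of that kind of toy
program.  It is only an illustration: a read from an empty pipe stands
in for a call that hangs on a stale mount, read_with_timeout is a
hypothetical name, and the Alarm and alarm_handler names are borrowed
from the quoted snippet.  It works only on POSIX systems and only in
the main thread.

```python
import os
import signal


class Alarm(Exception):
    """Raised by the SIGALRM handler to abort a blocking call."""


def alarm_handler(signum, frame):
    # Raising here is what makes the blocked call give up.
    raise Alarm


def read_with_timeout(fd, seconds):
    """Read one byte from fd, or return None if nothing arrives in time."""
    old_handler = signal.signal(signal.SIGALRM, alarm_handler)
    # setitimer accepts fractional seconds, unlike signal.alarm.
    signal.setitimer(signal.ITIMER_REAL, seconds)
    try:
        return os.read(fd, 1)
    except Alarm:
        return None
    finally:
        # Cancel the timer and restore whatever handler was installed before.
        signal.setitimer(signal.ITIMER_REAL, 0)
        signal.signal(signal.SIGALRM, old_handler)


if __name__ == "__main__":
    rfd, wfd = os.pipe()
    print(read_with_timeout(rfd, 0.5))   # nothing was written to the pipe
    os.close(rfd)
    os.close(wfd)
```

Note that the sketch owns SIGALRM for the duration of the call and
relies on the handler raising an exception, which is exactly where a
large program runs into trouble.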

Here are some complications we should keep in mind:

  . Emacs already uses SIGALRM for different purposes, see atimer.c.
    Reusing it for this issue will need some complex logic, to avoid
    breaking the features that use SIGALRM now.
  . You tried this with a single 'stat' call, but that's just the tip
    of the iceberg.  Typically, Emacs will need to read a file after
    it finds it readable, and we normally do that in a way that keeps
    looping as long as the system call is interrupted by signals; see,
    e.g., emacs_intr_read.  So setting up an alarm clock will not
    help if 'read' hangs: we will just loop forever.
  . We usually deliver signals to the main thread, so if the code that
    hangs happens to run in a non-main thread (recall that Emacs 26
    has threads), it will be somewhat tricky, to say the least, to
    deliver a signal there.
  . Even if we somehow succeed in interrupting the hang with a signal,
    it's not clear whether it's safe to continue running the session --
    there's a reason why we stopped doing non-trivial stuff in signal
    handlers.  It may be that the only sensible thing is to shut down,
    and in that case, what did we gain, exactly?
  . This technique is non-portable to MS-Windows.
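
The point about retry loops can be illustrated with a small Python 3
sketch.  It does not use any Emacs code; it relies on the fact that
Python (3.5 and later) itself retries a system call that fails with
EINTR when the signal handler returns without raising, which is the
same general pattern as a read loop that restarts after interruption.
The function name and the pipe standing in for a slow file are
assumptions made for the example.  The second timer tick plays the
role of the server finally answering, so that the demonstration
terminates.

```python
import os
import signal
import time


def demo_retry_defeats_alarm(first=0.2, interval=0.2):
    """Return (seconds blocked, alarms delivered, data read)."""
    rfd, wfd = os.pipe()
    fired = []

    def handler(signum, frame):
        # Returns normally, so the interrupted read is simply restarted.
        fired.append(time.monotonic())
        if len(fired) == 2:
            # Stand-in for the data finally arriving.
            os.write(wfd, b"x")

    old_handler = signal.signal(signal.SIGALRM, handler)
    start = time.monotonic()
    signal.setitimer(signal.ITIMER_REAL, first, interval)
    try:
        # The first alarm interrupts this call, but it is retried and
        # keeps blocking until a byte shows up in the pipe.
        data = os.read(rfd, 1)
    finally:
        signal.setitimer(signal.ITIMER_REAL, 0)
        signal.signal(signal.SIGALRM, old_handler)
        os.close(rfd)
        os.close(wfd)
    return time.monotonic() - start, len(fired), data


if __name__ == "__main__":
    elapsed, alarms, data = demo_retry_defeats_alarm()
    print("blocked %.2fs across %d alarms, read %r" % (elapsed, alarms, data))
```

The read survives the first alarm and returns only once data is
available.  Without that second tick it would block indefinitely, no
matter how many alarms were delivered.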

There are probably other complications.

All in all, I'd be much happier if we could interrupt such hangs,
e.g. by C-g, as Stefan points out (on a TTY frame, this should already
be possible in many cases, since C-g there generates SIGINT).  But I'm
not sure this would be possible in general.  Maybe Paul will have some
