
From: Bob Proulx
Subject: [Savannah-hackers-public] cvs: timeout: fork system call failed: Resource temporarily unavailable
Date: Fri, 10 Feb 2017 16:37:26 -0700
User-agent: NeoMutt/20170113 (1.7.2)

Today Andrew grabbed me on IRC with complaints from cvs checkouts:

  timeout: fork system call failed: Resource temporarily unavailable

This was intermittent throughout the day.  I was able to catch some of
the failures myself when doing test checkouts repeatedly, or when
poking at the daemon directly.  Nagios also noted the service flapping
today.

  $ connect cvs.savannah.gnu.org 2401
      timeout: fork system call failed: Resource temporarily unavailable

However I was logged into vcs0 and monitoring it and it looked
perfectly happy.  No large number of processes.  No large number of
cvs processes.  Everything looked quite peaceful.
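
For reference, the nproc limit is counted per real user ID rather than
per daemon, so a per-user tally is the relevant view when checking for
"too many processes".  A minimal sketch, assuming a procps-style ps;
the helper name tally_by_user is made up for illustration and is not
something installed on vcs0:

```shell
#!/bin/sh
# tally_by_user: hypothetical helper.  Reads one user name per line on
# stdin and prints "count user" lines, busiest user first.
tally_by_user() {
    sort | uniq -c | sort -rn
}

# One line per process, owner only, no header; show the top ten owners.
ps -eo user= | tally_by_user | head -n 10
```

On Linux the limit counts threads as well, so "ps -eLo user=" in place
of "ps -eo user=" gives the closer number.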

The error is from 'timeout'.  It didn't make sense to me that it was
sometimes working and sometimes not working.  The call is from
timeout-cvs-daemon.

#exec \
#  unshare --net --ipc --pid --mount-proc --user --fork \
    nice -n 9 \
      prlimit --nofile=50 --nproc=20 \
        timeout --signal=SIGKILL 480m \
            cvs2 -R "$@"

Now here is the crazy part.  I went looking for prlimit.  It doesn't
exist on the system.

  address@hidden:/etc# type prlimit
    -bash: type: prlimit: not found

It should be part of the util-linux package.  But apparently it is not
shipped with Trisquel 7.

  address@hidden:/etc# dpkg -L util-linux | grep bin/prlimit
  address@hidden:/etc# 

How odd.  How did nice find prlimit previously?  I have no idea.
Makes me think I am not editing the right file because of that.  But I
think I am.  In any case I rolled that back further and things seem
better.

#exec \
#  unshare --net --ipc --pid --mount-proc --user --fork \
#    nice -n 9 \
#      prlimit --nofile=50 --nproc=20 \
        timeout --signal=SIGKILL 480m \
            cvs2 -R "$@"

I don't think using nice makes sense there for various reasons.
Probably not hurting anything but also not needed and not desired so I
removed it too.

After that nagios reports cvs up again and I haven't seen a failure
since.  But I can't prove a negative.  I can't explain this.

I am also still befuddled as to the nature of the prlimit not found
problem.
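
One thing that may be worth ruling out is a PATH difference between
the login shell and the environment the wrapper runs in.  That is only
a guess, not a diagnosis.  A minimal sketch of a PATH walk in plain sh;
find_in_path is a hypothetical helper name, not part of
timeout-cvs-daemon:

```shell
#!/bin/sh
# find_in_path: hypothetical helper.  Prints every executable file
# called $1 found in the directories listed in $PATH.
# Returns 0 if at least one was found, 1 otherwise.
find_in_path() {
    name=$1
    found=1
    old_ifs=$IFS
    IFS=:
    for dir in $PATH; do
        # An empty PATH entry means the current directory.
        [ -z "$dir" ] && dir=.
        if [ -x "$dir/$name" ] && [ ! -d "$dir/$name" ]; then
            echo "$dir/$name"
            found=0
        fi
    done
    IFS=$old_ifs
    return "$found"
}

find_in_path prlimit || echo "prlimit: not found on PATH"
```

Running the same function from inside the wrapper script and from the
interactive shell would show whether the two see the same directories.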

Bob


