bug#34033: Offloading sometimes hangs

From: Ludovic Courtès
Subject: bug#34033: Offloading sometimes hangs
Date: Sat, 22 Feb 2020 21:35:50 +0100
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/26.3 (gnu/linux)

Hi Maxim,

Maxim Cournoyer <address@hidden> skribis:

> Ludovic Courtès <address@hidden> writes:
>
>> Hello,
>>
>> Ludovic Courtès <address@hidden> skribis:
>>
>>> A simple thing would be to somehow get libssh to pass POLLIN | POLLRDHUP
>>> instead of just POLLIN.
>>
>> Reported here:
>>
>>   https://www.libssh.org/archive/libssh/2019-01/0000000.html
>>
>> A fix has been proposed by upstream and should be committed shortly.
>>
>>> Additionally, we could change Guile-SSH so that we can specify a timeout
>>> when reading from a channel.
>>
>> Turns out we can set a per-session timeout, which we already do (see
>> #:timeout in ‘open-ssh-session’ in (guix scripts offload)) but
>> ‘ssh_channel_read’ would ignore it and instead pass an infinite timeout
>> to poll(2):
>>
>>   https://www.libssh.org/archive/libssh/2019-01/0000001.html
>>
>> This issue happens to be fixed in libssh 0.8.x, so I upgraded our libssh
>> package in commit a8b0556ea1e439c89dc1ba33c8864e8b9b811f08.
>>
>> (That still doesn’t tell us why our ‘guix offload’ processes would
>> occasionally be stuck but at least it ensures the build farm keeps
>> making progress even when that happens.)
>>
>> Ludo’.
>
> Seems the patch in the response at the URL you linked is awaiting some
> feedback/review.  Is this the reason 'guix substitute' hangs for so long
> when the substitute server is down? (like 1 minute or so).

The issues above are in libssh and were fixed a while ago.  ‘guix
substitute’ doesn’t use Guile-SSH/libssh, so the problem you’re seeing
must be something different.
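
To make the two libssh-side points concrete (a finite timeout passed to
poll(2), and POLLRDHUP requested alongside POLLIN), here is a minimal
sketch in Python.  It is only an illustration of the technique on a plain
socket, not libssh or Guile-SSH code; the name ‘wait_for_input’ and its
return values are made up for this example, and POLLRDHUP is assumed to be
available only on Linux.

```python
import select
import socket

# POLLRDHUP is Linux-specific; fall back to 0 (no extra flag) elsewhere.
POLLRDHUP = getattr(select, "POLLRDHUP", 0)


def wait_for_input(sock, timeout_ms):
    """Wait for SOCK to become readable, for at most TIMEOUT_MS milliseconds.

    Return "data", "closed", or "timeout".  (Hypothetical helper, for
    illustration only.)
    """
    poller = select.poll()
    # Ask for hang-up notification as well as readability.
    poller.register(sock, select.POLLIN | POLLRDHUP)

    # A finite timeout: poll() returns an empty list once it expires
    # instead of blocking forever.
    events = poller.poll(timeout_ms)
    if not events:
        return "timeout"

    _, flags = events[0]
    # In this sketch a hang-up takes precedence; a real reader would
    # first drain any data still pending.
    if flags & (POLLRDHUP | select.POLLHUP):
        return "closed"
    return "data"
```

With an infinite timeout and POLLIN alone, a caller waiting on an idle
connection has no bound on how long it blocks; with the finite timeout it
gets "timeout" back and can decide what to do next.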

What do you mean by “the substitute server is down”?  Do you mean that
‘guix publish’ is not running, or that the machine is unavailable
altogether?
