bug#26201: hydra.gnu.org uses ‘guix publish’ for nars and narinfos

From:	Mark H Weaver
Subject:	bug#26201: hydra.gnu.org uses ‘guix publish’ for nars and narinfos
Date:	Fri, 24 Mar 2017 04:12:50 -0400
User-agent:	Gnus/5.13 (Gnus v5.13) Emacs/25.1 (gnu/linux)

Hi,

Tobias Geerinckx-Rice <address@hidden> writes:

> On 23/03/17 19:36, Mark H Weaver wrote:
>> One question: what will happen in the case of multiple concurrent
>> requests for the same nar?  Will multiple nar-pack-and-bzip2 processes
>> be run on-demand?
>
> I think this used to be the case with the previous nginx configuration,
> but the recent changes pushed by Ludo' were aimed in part at preventing
> that.
>
>> Recall that the nginx proxy will pass all of those requests through,
>
> Are you sure? I was under the impression¹ that this is exactly what
> ‘proxy_cache_lock on;’ prevents. I'm no nginx guru, obviously, so please
> — anyone! — correct me if I'm misguided.

I agree that "proxy_cache_lock on" should prevent multiple concurrent
requests for the same URL from being passed through to the backend, but
unfortunately its behavior is quite undesirable, and arguably worse than
leaving it off in our case.  See:

  https://nginx.org/en/docs/http/ngx_http_proxy_module.html#proxy_cache_lock

Specifically:

  Other requests of the same cache element will either wait for a
  response to appear in the cache or the cache lock for this element to
  be released, up to the time set by the proxy_cache_lock_timeout
  directive.

In our problem case, it takes more than an hour for Hydra to finish
sending a response for the 'texlive-texmf' nar.  During that time, the
nar will be slowly sent to the first client while it's being packed and
bzipped on-demand.

IIUC, with "proxy_cache_lock on", we have two choices of how other
client requests will be treated:

(1) If we increase "proxy_cache_lock_timeout" to a huge value, then
    there will be *no* data sent to the other clients until the first
    client has received the entire nar, which means they wait over an
    hour before receiving the first byte.  I guess this will result in
    timeouts on the client side.

(2) If "proxy_cache_lock_timeout" is *not* huge, then all other clients
    will get failure responses until the first client has received the
    entire nar.

Either way, this would cause users to see the same download failures
(requiring user work-arounds like --fallback) that this fix is intended
to prevent for 'texlive-texmf', but instead of happening only for that
one nar, it will now happen for *all* large nars.

Or at least that's what I'd expect based on my reading of the nginx docs
linked above.  I haven't tried it.

IMO, the best solution is to *never* generate nars on Hydra in response
to client requests, but rather to have the build slaves pack and
compress the nars, copy them to Hydra, and then serve them as static
files using nginx.

A far inferior solution, but possibly acceptable and closer to the
current approach, would be to arrange for all concurrent responses for
the same nar to be sent incrementally from a single nar-packing process.
More concretely, while packing and sending a nar response to the first
client, the data would also be written to a file.  Subsequent requests
for the same nar would be serviced using the equivalent of:

  tail --bytes=+0 --follow FILENAME

This way, no one would have to wait an hour to receive the first byte.
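
To make the idea concrete, here is a minimal sketch in Python of a single
producer writing to a file while any number of readers follow it from
byte 0.  All names here (pack_to_file, follow, demo) are hypothetical, and
the threading.Event is only a stand-in for whatever "packing finished"
signal a real server would use; plain 'tail --follow' has no such signal
and never exits on its own.  I have not tried this against Hydra.

```python
import os
import tempfile
import threading
import time


def pack_to_file(chunks, path, done):
    """Producer: append each chunk to `path` as it is generated."""
    with open(path, "ab") as out:
        for chunk in chunks:
            out.write(chunk)
            out.flush()  # make the bytes visible to followers right away
    done.set()  # assumption: some signal tells followers the nar is complete


def follow(path, done, poll_interval=0.01, block_size=65536):
    """Follower: yield the file's bytes from offset 0, waiting for more."""
    with open(path, "rb", buffering=0) as src:
        while True:
            # Sample the flag *before* reading: if it was already set and
            # the read still returns nothing, we really are at the end.
            finished = done.is_set()
            data = src.read(block_size)
            if data:
                yield data
            elif finished:
                return
            else:
                time.sleep(poll_interval)


def demo():
    """Run one slow producer and one follower; report whether they agree."""
    chunks = [b"nar-part-%d;" % i for i in range(5)]

    def slow_chunks():
        for chunk in chunks:
            time.sleep(0.02)  # simulate slow on-demand packing
            yield chunk

    fd, path = tempfile.mkstemp()
    os.close(fd)
    done = threading.Event()
    producer = threading.Thread(
        target=pack_to_file, args=(slow_chunks(), path, done))
    producer.start()
    try:
        received = b"".join(follow(path, done))
    finally:
        producer.join()
        os.remove(path)
    return received == b"".join(chunks)
```

The follower starts receiving data as soon as the first chunk is flushed,
rather than after the whole file is written.  Across separate processes
the Event would have to be replaced by something else (for example a
rename or a marker file), which this sketch does not attempt.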

What do you think?

      Mark
