Re: Disarchive update
From: Timothy Sample
Subject: Re: Disarchive update
Date: Wed, 13 Oct 2021 10:54:45 -0400
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/27.2 (gnu/linux)
Hi Ludovic,
Ludovic Courtès <ludovic.courtes@inria.fr> writes:
> This job is disassembling all the .tar.gz files packages refer to, using
> the recently-added ‘etc/disarchive-manifest.scm’ file:
>
> https://ci.guix.gnu.org/jobset/disarchive
>
> It has just succeeded for the first time. :-)
Fantastic! I feel bad that I left you holding the bag on this one,
though. Sorry. I’ve been a little adrift this summer. Thanks for
picking it up!
> Where to go from here? Timothy Sample had already set up a Disarchive
> database at <https://disarchive.ngyro.com>, which (guix download) uses
> as a fallback; I’m not sure exactly how it’s populated.
Basically the same as what you are doing now. I have many Cuirass jobs,
and I use the build outputs mechanism (mentioned by Mathieu elsewhere
in this thread). I don’t have a “disarchive-collection” job, so I have
to use the Cuirass API to dig through the recent build outputs to find
new results. This happens from a cron job, which uploads each new
result to my server.
One simple but satisfying thing that I do is serve the files compressed.
That is, they are compressed on disk and nginx just passes them along
(using the “gzip_static” module). Because of Disarchive’s verbose and
repetitive output format, this makes for a huge reduction in storage
requirements.
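
To illustrate the compressed-on-disk idea, here is a minimal sketch in
Python. It is not the script used on disarchive.ngyro.com; the function
name "precompress" and the choice to delete the uncompressed original
are assumptions made for the example. It gzips every file under a
directory, leaving only FILE.gz behind, which is the layout nginx's
"gzip_static" module looks for.

```python
import gzip
import shutil
from pathlib import Path


def precompress(root: Path) -> tuple[int, int]:
    """Gzip every regular file under ROOT, replacing FILE with FILE.gz.

    Returns (bytes_before, bytes_after) so the saving can be measured.
    Hypothetical helper, written only for illustration.
    """
    before = after = 0
    # Materialize the listing first so files created below are not revisited.
    for path in sorted(root.rglob("*")):
        if not path.is_file() or path.suffix == ".gz":
            continue  # skip directories and already-compressed files
        target = path.with_name(path.name + ".gz")
        before += path.stat().st_size
        with path.open("rb") as src, gzip.open(target, "wb") as dst:
            shutil.copyfileobj(src, dst)
        after += target.stat().st_size
        path.unlink()  # assumption: keep only the compressed copy on disk
    return before, after
```

If only the .gz files exist on disk, my understanding is that nginx
needs "gzip_static always;" (rather than "on") to serve them, so the
server configuration is worth checking before deleting the originals.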
> The goal here would be for the Guix project to set up infrastructure
> populating a database automatically and creating backups, possibly via
> SWH (we’ll have to discuss it with them).
>
> A plan we can already deploy would be:
>
> 1. Add the disarchive.guix.gnu.org DNS entry, pointing to berlin.
>
> 2. On berlin, add an mcron job that periodically copies the output of
> the latest “disarchive-collection” build to a directory, say
> /srv/disarchive. Thus, the database would accumulate tarball
> metadata over time.
>
> 3. Add an nginx route so that /srv/disarchive is served at
> https://disarchive.guix.gnu.org.
>
> 4. Add disarchive.guix.gnu.org to (guix download).
>
> How does that sound? Thoughts?
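
Step 2 of the quoted plan (copy each new build output into a directory
that only ever grows) could be sketched roughly as follows. This is an
illustration in Python, not the mcron job itself; the function name
"accumulate" and both directory arguments are assumptions, and locating
the latest "disarchive-collection" output is left out.

```python
import shutil
from pathlib import Path


def accumulate(build_output: Path, database: Path) -> list[Path]:
    """Copy files from BUILD_OUTPUT into DATABASE without ever
    deleting or overwriting what is already there.

    Returns the relative paths that were newly added.
    Hypothetical helper, written only for illustration.
    """
    added = []
    for src in sorted(build_output.rglob("*")):
        if not src.is_file():
            continue  # directories are created on demand below
        rel = src.relative_to(build_output)
        dst = database / rel
        if dst.exists():
            continue  # metadata from an earlier build is kept as-is
        dst.parent.mkdir(parents=True, exist_ok=True)
        # copyfile copies contents only, so read-only permissions on the
        # source are not carried over to the database copy.
        shutil.copyfile(src, dst)
        added.append(rel)
    return added
```

Running it once per build, for example accumulate(latest_output,
Path("/srv/disarchive")), would let the database gather tarball
metadata over time even when later builds drop older entries.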
This is great! I can offer some past metadata, too. Specifically, I
have ~14000 files that I generated while digging into SWH coverage.
(That’s a project I’d like to return to, but I’m still trying to get my
head back in the game and pick up where I left off.)
-- Tim