Re: Disarchive update

guix-devel

From:	Timothy Sample
Subject:	Re: Disarchive update
Date:	Wed, 13 Oct 2021 10:54:45 -0400
User-agent:	Gnus/5.13 (Gnus v5.13) Emacs/27.2 (gnu/linux)

Hi Ludovic,

Ludovic Courtès <ludovic.courtes@inria.fr> writes:

> This job is disassembling all the .tar.gz files packages refer to, using
> the recently-added ‘etc/disarchive-manifest.scm’ file:
>
>   https://ci.guix.gnu.org/jobset/disarchive
>
> It has just succeeded for the first time.  :-)

Fantastic!  I feel bad that I left you holding the bag on this one,
though.  Sorry.  I’ve been a little adrift this summer.  Thanks for
picking it up!

> Where to go from here?  Timothy Sample had already set up a Disarchive
> database at <https://disarchive.ngyro.com>, which (guix download) uses
> as a fallback; I’m not sure exactly how it’s populated.

Basically the same as what you are doing now.  I have many Cuirass jobs,
and I use the build outputs mechanism (mentioned by Mathieu elsewhere
in this thread).  I don’t have a “disarchive-collection” job, so I have
to use the Cuirass API to dig through the recent build outputs to find
new results.  This happens from a cron job, which uploads each new
result to my server.
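
For illustration only, here is a minimal Python sketch of the
bookkeeping such a cron job needs: given a directory of freshly fetched
build outputs, keep only the files the database does not have yet.  It
is not the actual script.  The function name, the directory arguments,
and the "sha256/<hash>" layout in the usage note are assumptions, and
the Cuirass query and the upload itself are left out.

```python
import shutil
import tempfile
from pathlib import Path


def collect_new_results(output_dir, database_dir):
    """Copy files found under output_dir that database_dir lacks.

    Returns the relative paths (POSIX style) of the newly added files,
    i.e. the ones a cron job would still have to upload.
    Both directory names are hypothetical placeholders.
    """
    output_dir = Path(output_dir)
    database_dir = Path(database_dir)
    added = []
    for source in sorted(output_dir.rglob("*")):
        if not source.is_file():
            continue
        relative = source.relative_to(output_dir)
        target = database_dir / relative
        if target.exists():
            continue  # Already in the database; never overwrite.
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copyfile(source, target)
        added.append(relative.as_posix())
    return added


if __name__ == "__main__":
    # Small self-contained demonstration in a temporary directory.
    with tempfile.TemporaryDirectory() as tmp:
        out = Path(tmp) / "output" / "sha256"
        out.mkdir(parents=True)
        (out / "example-hash").write_text("(disarchive metadata)")
        print(collect_new_results(Path(tmp) / "output", Path(tmp) / "database"))
```

Running it a second time on the same output returns an empty list, so
the job can run as often as needed without uploading anything twice.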

One simple but satisfying thing that I do is serve the files compressed.
That is, they are compressed on disk and nginx just passes them along
(using the “gzip_static” module).  Because of Disarchive’s verbose and
repetitive output format, this makes for a huge reduction in storage
requirements.
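
As a rough sketch of the on-disk side of this, the Python below
pre-compresses a file to "NAME.gz", which is the name nginx's
"gzip_static" module looks for.  The function name and the
"keep_original" parameter are made up for the example.  Deleting the
uncompressed copy is an assumption too: it only works if nginx is told
to use the compressed file unconditionally ("gzip_static always").

```python
import gzip
import shutil
from pathlib import Path


def precompress(path, keep_original=False):
    """Write a gzip-compressed copy of path next to it as NAME.gz.

    By default the uncompressed file is removed afterwards, so only
    the compressed form stays on disk (an assumption of this sketch).
    Returns the path of the compressed file.
    """
    path = Path(path)
    gz_path = path.with_name(path.name + ".gz")
    with open(path, "rb") as source:
        with gzip.open(gz_path, "wb", compresslevel=9) as target:
            shutil.copyfileobj(source, target)
    if not keep_original:
        path.unlink()
    return gz_path
```

On verbose, repetitive text like Disarchive's output, the ".gz" file
should come out much smaller than the original; the exact ratio depends
on the input.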

> The goal here would be for the Guix project to set up infrastructure
> populating a database automatically and creating backups, possibly via
> SWH (we’ll have to discuss it with them).
>
> A plan we can already deploy would be:
>
>   1. Add the disarchive.guix.gnu.org DNS entry, pointing to berlin.
>
>   2. On berlin, add an mcron job that periodically copies the output of
>      the latest “disarchive-collection” build to a directory, say
>      /srv/disarchive.  Thus, the database would accumulate tarball
>      metadata over time.
>
>   3. Add an nginx route so that /srv/disarchive is served at
>      https://disarchive.guix.gnu.org.
>
>   4. Add disarchive.guix.gnu.org to (guix download).
>
> How does that sound?  Thoughts?

This is great!  I can offer some past metadata, too.  Specifically, I
have ~14000 files that I generated while digging into SWH coverage.
(That’s a project I’d like to return to, but I’m still trying to get my
head back in the game and pick up where I left off.)

-- Tim
