
Reproductibility, Data Services, guix weather

From: zimoun
Subject: Reproductibility, Data Services, guix weather
Date: Mon, 12 Oct 2020 23:40:55 +0200


Recently, we discovered a regression in the Haskell build system: it
introduced unreproducible builds.  Well, it was a kind of luck: I was
testing ’git-annex’, hoping to get ’git-annex-assistant’, and so I was
building it several times (--check) [1].

Aside from this particular issue, ~10% of packages are not reproducible
and I am not convinced that “--check” is run by the submitter/committer
at each update or new package.  Otherwise the case of unreproducible
Mesa [2] would have been raised before June. :-)  (That’s fine, we need
packages after all and we cannot fix the world all at the same time. :-))

The issue is being able to find them.  I proposed (below) running a cron
task doing ’--check’ on the build farms and then reporting the failures
by email.  Chris pointed me to the work they are doing [3] and, instead
of a cron task, they propose parsing the JSON.  That’s what the tiny
script attached does.

   guix repl -L . -- weather-repro.scm

For example, I run:

   guix repl -L . -- weather-repro.scm | sort | grep ghc

to list (almost) all the unreproducible Haskell packages.  What I would
like is to be able to filter, for example, by build system.

First, Chris, could you add fields for the package name and version?
Because it is hard to reconstruct them automatically by parsing the
output path.
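
To illustrate why this reconstruction is fragile, here is a minimal
Python sketch (not the attached Scheme script) that guesses the name and
version from an output path.  The regular expression and the helper name
guess_name_version are hypothetical, and the rule “the version starts at
the first component beginning with a digit” is only an assumption:

```python
import re

# Assumption: store items look like /gnu/store/<32-char hash>-<name>-<version>.
STORE_ITEM = re.compile(r"^/gnu/store/[0-9a-z]{32}-(?P<rest>.+)$")


def guess_name_version(output_path):
    """Heuristically split a store path into (name, version).

    Returns None when the path does not look like a store item, and
    (name, None) when no component looks like a version.
    """
    match = STORE_ITEM.match(output_path)
    if match is None:
        return None
    parts = match.group("rest").split("-")
    for index, part in enumerate(parts):
        # Guess: the version starts at the first component (other than
        # the first one) that begins with a digit.
        if index > 0 and part[0:1].isdigit():
            return "-".join(parts[:index]), "-".join(parts[index:])
    return match.group("rest"), None
```

For instance, an output path ending in “-git-2.28.0-doc” yields the
version “2.28.0-doc”, since nothing in the path says that “doc” is an
output name and not part of the version.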

Second, the revision of <>
does not match the Guix commit.  Is it possible to have a bridge?  In
other words, how is this revision hash computed?

(A working revision is 6cf35799dec60723f37d83a559429aa8b90482d5 which
does not seem to be found in the Guix repo.)

Third, this tiny script is better than nothing but *far far away* from
perfect.  The question about tooling is: does it make sense to include
something like that directly in “guix weather”?  For example,

  guix weather --reproducible

or maybe under “guix challenge”?

WDYT?  Feedback and ideas are very welcome. :-)

All the best,

PS: Below are my question and Chris’s answer.  Both deserve to be
public, as Chris told me. :-)

1: <>
2: <>

-------------------- Start of forwarded message --------------------
From: zimoun <>
Subject: [guix-sysadmin] whishlist: Hook on the build-farm?
Date: Sun, 11 Oct 2020 17:19:26 +0200


Currently, it is hard to catch:

  1. which commit breaks which package
  2. whether the package builds reproducibly

Even if the Data Service helps *a lot!*, there are still a lot of
manual actions needed to spot one or the other.  And I fully agree that
the work initiated by Chris is The Right Thing©.

However, it is not ready and manpower is limited.  For #1, Danny has
started a discussion.

For #2, I am proposing to add a cron task on one build farm.  To be
concrete, let’s *randomly* pick 100 packages once a week, rebuild them
with “--check” and send the list of unreproducible packages by email.

I am even proposing: 1st week, 100 packages of build-system “foo”; 2nd
week, 100 packages of build-system “bar”; 3rd week…
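
As a rough sketch of this weekly rotation (in Python, only to fix
ideas), the selection could look like the following.  The function
names, the input mapping from build system to package names, and the
alphabetical rotation order are all assumptions; only the “guix build
--check” command comes from Guix itself:

```python
import random


def weekly_sample(packages_by_build_system, week, size=100, seed=None):
    """Pick up to `size` random packages for the given week number.

    `packages_by_build_system` (hypothetical input) maps a build-system
    name to a list of package names.  The build system rotates from one
    week to the next, in alphabetical order.
    """
    build_systems = sorted(packages_by_build_system)
    if not build_systems:
        raise ValueError("no build system to sample from")
    build_system = build_systems[week % len(build_systems)]
    candidates = packages_by_build_system[build_system]
    rng = random.Random(seed)
    chosen = rng.sample(candidates, min(size, len(candidates)))
    return build_system, sorted(chosen)


def check_commands(packages):
    """Return one `guix build --check` command line per package."""
    return [["guix", "build", "--check", name] for name in packages]
```

The cron task would then run each command, collect the packages whose
check fails, and put that list in the weekly email.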

It is far from perfect but it seems a good heuristic to catch
regressions, spot packages with reproducibility troubles, etc.  Note
that this should not be needed, since the committer should catch the
reproducibility issue; but as a matter of fact that is not the case.
In a way, I am proposing a workaround.

I volunteer to be the recipient of these automatic emails, then I can do
some triage (remove false positives, check what’s going on, etc.) and
open a bug report if there is an issue.

Currently, I do not have the CPU power to do so.  So I am asking if it
is possible to put something like that on one of the build machines.  I
totally understand an answer such as: « Simon, you are enthusiastic and
that’s nice, but no, and go to hell! » :-)

-------------------- End of forwarded message --------------------

-------------------- Start of forwarded message --------------------
From: Christopher Baines <>
Subject: Re: [guix-sysadmin] whishlist: Hook on the build-farm?
Date: Sun, 11 Oct 2020 17:38:03 +0100

> Currently, I do not have the CPU power to do so.  So I am asking if it
> possible to put something like that on one of the building machines.  I
> totally understand an answer as: « Simon, you are enthusiast and that’s
> nice but no and go to hell! » :-)

I too would really like to be able to identify/prevent regressions,
including with respect to build reproducibility, and although the work
I'm doing on this is going slowly I'm hoping that with the Guix Build
Coordinator now I'll be able to get something sort of working.

I've just made a few tweaks to the Guix Data Service to make the data it
has on this a little easier to use.

This URL [1] should show you package reproducibility stats for each
architecture, computed from substitute data from three servers.
Currently there seem to be 1515 outputs (so not exactly packages, but
close) that don't seem to have built reproducibly.


Clicking through to the "Not matching" ones for x86_64-linux should give
you this URL [2]. In case it's useful to have this data in a more
machine readable form, I've added a JSON output option.


You mention triage, and that's probably the biggest blocker to being
able to methodically try and reduce the "Not matching" numbers on
[1]. As far as I know, Debian has things like [3] and [4] to help with


Going back to the issue of a cron job to run guix build --check on some
random packages and send emails, if you're looking for a list of
packages (well actually outputs) which don't build reproducibly, then
[2] might do?

The JSON output doesn't contain the package names, but it probably could
with only a little effort. If it did, you could download the JSON file
for all the non-matching package outputs, and record the package names
in a sorted list in a Git repository. If you do that every day, then you
could read the git log to spot potential patterns/regressions.
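
A minimal Python sketch of that daily comparison, under loud
assumptions: the JSON is taken to be a flat list of objects with a
"package_name" field, which the JSON output does not have yet, and both
function names are hypothetical:

```python
import json


def package_names(json_text, key="package_name"):
    """Return the sorted, de-duplicated package names in the JSON text.

    Assumption: the document is a list of objects, each carrying the
    (hypothetical) `key` field.
    """
    entries = json.loads(json_text)
    return sorted({entry[key] for entry in entries})


def snapshot_diff(previous, current):
    """Compare two daily name lists.

    Returns (newly_listed, no_longer_listed), i.e. what a diff between
    two commits of the sorted list would show.
    """
    before, after = set(previous), set(current)
    return sorted(after - before), sorted(before - after)
```

Writing the result of package_names to a file, one name per line, and
committing it each day gives the history; snapshot_diff only mirrors
what the git log would show between two days.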

I don't think your email hit a mailing list, feel free to send my reply
to one though, maybe guix-devel as this discussion probably deserves a
wide audience.


-------------------- End of forwarded message --------------------

Attachment: weather-repro.scm
Description: weather-repro.scm
