Re: Treating tests as special case

From: Ricardo Wurmus
Subject: Re: Treating tests as special case
Date: Thu, 05 Apr 2018 16:10:04 +0200
User-agent: mu4e 1.0; emacs 25.3.1

Hi Björn,

> On Thu, 05 Apr 2018 12:14:53 +0200
> Ricardo Wurmus <address@hidden> wrote:
>> Björn Höfling <address@hidden> writes:
>> > And you mentioned different environment conditions like machine and
>> > kernel. We still have "only" 70-90% reproducibility.
>> Where does that number come from?  In my tests for a non-trivial set
>> of bioinfo pipelines I got to 97.7% reproducibility (or 95.2% if you
>> include very minor problems) for 355 direct inputs.
>> I rebuilt on three different machines.
> I have no numbers of my own, but I checked Ludovic's blog post from October 2017:
> "We’re somewhere between 78% and 91%—not as good as Debian yet, [..]".

Ah, I see.

Back then we didn’t have a fix for Python bytecode, which affects a
large number of packages in Guix but not in Debian (which, AFAIU, simply
doesn’t distribute bytecode).

> So if your numbers are valid for the whole repository, that is good
> news and would mean we are now better than Debian [1], and that would
> be worth a new blog post.

The analysis only covered the “pigx” package and its direct/propagated
inputs, so the numbers do not speak for the whole repository.

I’d like to investigate the sources of non-determinism for remaining
packages and fix them one by one.  For some we already know what’s wrong
(e.g. for Haskell packages the random order of packages in the database
seems to be responsible), but for others we haven’t made an effort to
look closely enough.
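
To illustrate how a reproducibility figure like the ones above can be
computed, here is a minimal sketch in Python.  It assumes you have
already collected one output hash per package from each build machine.
The function name, the machine names, the package names and the hashes
are all hypothetical, and this is not the script used for the pigx
analysis.

```python
from typing import Dict, List, Tuple


def reproducible_share(
    hashes_by_machine: Dict[str, Dict[str, str]]
) -> Tuple[float, List[str]]:
    """Return (percentage of reproducible packages, differing packages).

    hashes_by_machine maps a machine name to {package name: output hash}.
    Only packages that were built on every machine are compared.
    """
    builds = list(hashes_by_machine.values())
    if not builds:
        raise ValueError("need at least one machine")

    # Restrict the comparison to packages present on all machines.
    common = set(builds[0])
    for build in builds[1:]:
        common &= set(build)
    if not common:
        raise ValueError("no package was built on every machine")

    # A package counts as non-reproducible if any two machines disagree.
    differing = sorted(
        pkg for pkg in common if len({build[pkg] for build in builds}) > 1
    )
    share = 100.0 * (len(common) - len(differing)) / len(common)
    return share, differing


if __name__ == "__main__":
    # Hypothetical data: three machines, four packages, one mismatch.
    example = {
        "machine-a": {"pkg-1": "aaa", "pkg-2": "bbb", "pkg-3": "ccc", "pkg-4": "ddd"},
        "machine-b": {"pkg-1": "aaa", "pkg-2": "bbb", "pkg-3": "ccc", "pkg-4": "ddd"},
        "machine-c": {"pkg-1": "aaa", "pkg-2": "bbb", "pkg-3": "xxx", "pkg-4": "ddd"},
    }
    share, differing = reproducible_share(example)
    print(f"{share:.1f}% reproducible; differing: {differing}")
```

The list of differing packages is the useful part for the "fix them one
by one" approach: each entry is a candidate for a closer look.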

I’d also take the Debian numbers with a spoonful of salt (and then take
probiotics in an effort to undo some of the damage, see [1]), because
they aren’t actually rebuilding all Debian packages.



GPG: BCA6 89B6 3655 3801 C3C6  2150 197A 5888 235F ACAC
