guix-devel

Re: Investigating a reproducibility failure


From: Ricardo Wurmus
Subject: Re: Investigating a reproducibility failure
Date: Thu, 03 Feb 2022 12:41:58 +0100
User-agent: mu4e 1.6.10; emacs 27.2

Hi Konrad,

>> CPU detection is a bottomless can of worms.
>
> That sounds very credible. But what can we do about this?
>
> There is obviously a trade-off between reproducibility and performance
> here. Can we support both, in a way that users can understand and manage?

So far our default approach has been to use the lowest common set of CPU
instructions, which generally leads to poorly performing code.  Some
packages are smarter and provide different code paths for different
CPUs.  The resulting binary is built the same, but at runtime different
parts of the code run depending on the features the CPU reports.
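
To make the runtime-dispatch idea concrete, here is a minimal sketch in
Python.  It is only an illustration, not code from any real package:
the kernel names, the KERNELS table and the two helper functions are
all hypothetical, and reading /proc/cpuinfo assumes Linux on x86.  The
point is that every code path ships in the same artifact and the choice
happens only when the program runs.

```python
# Hypothetical illustration of runtime dispatch.  None of these names
# come from OpenBLAS or any other real library.

def dot_generic(xs, ys):
    """Baseline code path: runs on every CPU."""
    return sum(x * y for x, y in zip(xs, ys))


def dot_avx2(xs, ys):
    """Stand-in for an AVX2-optimized path (same result, different code)."""
    return sum(x * y for x, y in zip(xs, ys))


# Ordered from most specific to least specific.  A required flag of
# None marks the fallback that is always usable.
KERNELS = [("avx2", dot_avx2), (None, dot_generic)]


def read_cpu_flags(path="/proc/cpuinfo"):
    """Return the CPU flags reported by the kernel (assumes Linux/x86).

    Returns an empty set if the file is missing or has no "flags" line.
    """
    try:
        with open(path) as f:
            for line in f:
                if line.startswith("flags"):
                    return set(line.split(":", 1)[1].split())
    except OSError:
        pass
    return set()


def select_kernel(cpu_flags):
    """Pick the first kernel whose required flag the CPU reports."""
    for required, kernel in KERNELS:
        if required is None or required in cpu_flags:
            return kernel


if __name__ == "__main__":
    kernel = select_kernel(read_cpu_flags())
    print(kernel.__name__, kernel([1, 2], [3, 4]))
```

Nothing in this sketch depends on the machine where it was "built":
the table of kernels is fixed, and only select_kernel looks at the CPU.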

The case of OpenBLAS is an anomaly in that this mechanism seems to
produce different binaries depending on where it is built.  When I first
encountered this problem I guessed that perhaps it can only build these
different code paths up to the feature set of the CPU on the build
machine, so if you’re building with an older CPU your binary will lack
components that would be used on newer CPUs.  This is just a guess,
though.
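
A toy model of that guess, again in Python.  It does not reflect the
actual OpenBLAS build logic, which I have not verified; the LEVELS list
and both functions are made up for illustration.  It only shows why
"build kernels up to what the build host supports" would yield
different outputs on different machines.

```python
import hashlib

# Hypothetical instruction-set levels, ordered from oldest to newest.
LEVELS = ["sse2", "avx", "avx2", "avx512"]


def kernels_built(build_host_level):
    """Model of the guess: include kernels only up to the host's level."""
    return LEVELS[: LEVELS.index(build_host_level) + 1]


def binary_digest(kernels):
    """Stand-in for the hash of the resulting library."""
    return hashlib.sha256(",".join(kernels).encode()).hexdigest()


old_host = kernels_built("avx")
new_host = kernels_built("avx512")

# The two build machines produce different sets of kernels, hence
# different "binaries".
print(old_host, new_host)
print(binary_digest(old_host) == binary_digest(new_host))
```

If the build always included every entry in LEVELS regardless of the
host, the digest would be the same everywhere, which is the property we
want.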

Your problem is that the OpenBLAS build system doesn’t recognize your
modern CPU.  Ideally, it wouldn’t need to know anything about the
build-time CPU to build all the different code paths for different CPU
features.  The only way around this — retroactively — is to pretend to
have an older CPU, e.g. by using qemu.

In the long term it would be great if we could patch OpenBLAS to not
attempt to detect CPU features at build time.  I’m not sure this will
work if it does indeed use the currently available CPU features to
determine “how far up” to build modules in support of certain CPU
features / instruction sets.

> There is of course the issue that we can never be sure if a build will
> be reproducible in the future. But we can at least take care of the
> cases where the packager is aware of non-reproducibility issues, and
> make them transparent and manageable.

The new “--tune” feature is supposed to take care of cases like this.
We would still patch the code so that by default you’d get a package
that is reproducible (= you get the same exact binary no matter when or
where you build it) but that may not have optimal performance.  With
“--tune” you could opt to replace that generic build with one that uses
features of your current CPU, using grafts to swap the generic library
for the more performant library.

-- 
Ricardo


