bug#51433: cp 9.0 sometimes fails with SEEK_DATA/SEEK_HOLE

From: Pádraig Brady
Subject: bug#51433: cp 9.0 sometimes fails with SEEK_DATA/SEEK_HOLE
Date: Wed, 3 Nov 2021 15:37:58 +0000
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:84.0) Gecko/20100101 Thunderbird/84.0

On 27/10/2021 11:00, Janne Heß wrote:
Hi everyone,

I packaged coreutils 9.0 for NixOS and we found breakages that seemed to be
very random during builds of packages that use the updated coreutils in their
build process. It's really hard to tell the main cause but it seems like the
issues are caused by binaries that are corrupted after cp copied them from
/tmp to /nix. The issue arises both when the directories are on the same
filesystem and when /tmp is on tmpfs.
Upon further inspection/bisection we figured out these issues are caused by
This seems to happen on ZFS and indeed on the main coreutils mailing list
there is a ZFS issue linked [1].
The testsuite was patched in 61c81ffaacb0194dec31297bc1aa51be72315858 so it
doesn't detect this issue anymore, but the issue still very much happens in
the real world.

We have found this to happen while building the completions for a Go tool
(jx), which seems to be the same issue as [2]. The tool is built, copied
using cp, and then called, which causes a segfault.

Building another package (peertube) on x86_64-linux on ext4 also fails with
strange errors in the test suite, something about "Error: The service is no
longer running". This does not happen when the mentioned coreutils commit is
undone by replacing #ifdef with #if 0 [3].

We have also seen this issue on Darwin when building Alacritty, but only on
some machines. We were not able to pin it down any further there, so this
might be related or it might not.

Since the issue is so random, we started wondering if it might be related to
-frandom-seed, which changes in NixOS when rebuilding a package [4]. A thing
to note here is that Nix does a lot of sandboxing during builds, which
includes mount namespaces, so a kernel bug is not out of the question. All of
these issues happened during Nix builds; coreutils 9.0 never made it out of
the NixOS staging environment due to the builds breaking. We will probably
disable the new code paths as outlined above so the issue is contained for
NixOS users and does not hit any production environments.

[1]: https://github.com/openzfs/zfs/issues/11900
[2]: https://github.com/golang/go/issues/48636
[4]: https://github.com/NixOS/nixpkgs/pull/141684#issuecomment-952339263

Looks like there is a WIP fix for OpenZFS mentioned at [1],
where mmap'd regions were not being flushed:

So this should unblock enabling coreutils 9 at some stage at least.
Now that they know what's going on, I've asked at [1]
how programs might best distinguish buggy instances of OpenZFS.
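
For illustration only, here is a minimal Python sketch of the kind of check
under discussion. It is not coreutils code and not a vetted way to detect the
OpenZFS bug. It writes nonzero bytes to a file through mmap without flushing,
then asks the filesystem via SEEK_DATA/SEEK_HOLE which regions hold data. The
function names data_extents and mmap_write_is_reported_as_data are
hypothetical, and the sketch assumes a Linux-like system where Python exposes
os.SEEK_DATA and os.SEEK_HOLE. Whether it actually triggers the reported
behaviour on an affected system is an assumption, not something established
in this thread.

```python
import errno
import mmap
import os
import tempfile


def data_extents(fd):
    """Return [(start, end), ...] for regions the filesystem reports as data."""
    size = os.fstat(fd).st_size
    extents = []
    offset = 0
    while offset < size:
        try:
            # Next offset >= `offset` that the filesystem says contains data.
            start = os.lseek(fd, offset, os.SEEK_DATA)
        except OSError as e:
            if e.errno == errno.ENXIO:  # no data at or after `offset`
                break
            raise
        # Next hole after that data; end of file counts as an implicit hole.
        end = os.lseek(fd, start, os.SEEK_HOLE)
        if end <= start:  # defensive: never loop without making progress
            break
        extents.append((start, end))
        offset = end
    return extents


def mmap_write_is_reported_as_data(directory, length=1 << 20):
    """Write nonzero bytes via mmap (no flush), then check SEEK_DATA sees them.

    Hypothetical probe: returns True if every written byte lies inside a
    region reported as data, False if any of it is reported as a hole.
    """
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        os.ftruncate(fd, length)  # file starts out as one big hole
        with mmap.mmap(fd, length) as m:
            m[:] = b"x" * length  # dirty the pages through the mapping
        # Deliberately no m.flush() or os.fsync(): a sparse-aware copy
        # started right now relies on the filesystem reporting this data.
        covered = sum(end - start for start, end in data_extents(fd))
        return covered == length
    finally:
        os.close(fd)
        os.unlink(path)


if __name__ == "__main__":
    print(mmap_write_is_reported_as_data(tempfile.gettempdir()))
```

A False result would mean the filesystem reported freshly written data as a
hole, which is the situation in which a sparse-aware copy could skip it and
produce a corrupted destination file.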

