
bug#51433: cp 9.0 sometimes fails with SEEK_DATA/SEEK_HOLE


From: Pádraig Brady
Subject: bug#51433: cp 9.0 sometimes fails with SEEK_DATA/SEEK_HOLE
Date: Wed, 27 Oct 2021 16:36:29 +0100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:84.0) Gecko/20100101 Thunderbird/84.0

On 27/10/2021 11:00, Janne Heß wrote:
> Hi everyone,
>
> I packaged coreutils 9.0 for NixOS and we found seemingly random
> breakages during builds of packages that use the updated coreutils in
> their build process. It's really hard to tell the main cause, but it
> seems the issues are caused by binaries that are corrupted after cp
> copied them from /tmp to /nix. The issue arises both when the
> directories are on the same filesystem and when /tmp is on tmpfs.
> Upon further inspection/bisection we figured out these issues are
> caused by a6eaee501f6ec0c152abe88640203a64c390993e.
> This seems to happen on ZFS, and indeed on the main coreutils mailing
> list there is a ZFS issue linked [1].
> The testsuite was patched in 61c81ffaacb0194dec31297bc1aa51be72315858
> so it doesn't detect this issue anymore, but the issue still very much
> happens in the real world.
>
> We have found this to happen while building the completions for a Go
> tool (jx), which seems to be the same issue as [2]. The tool is built,
> copied using cp, and called, which causes a segfault.
>
> Building another package (peertube) on x86_64-linux on ext4 also fails
> with strange errors in the test suite, something about "Error: The
> service is no longer running". This does not happen when the mentioned
> coreutils commit is undone by replacing #ifdef with #if 0 [3].
>
> We have also seen this issue on Darwin when building Alacritty, but
> only on some machines, and we were not able to pin it down any further
> there, so this might or might not be related.
>
> Since the issue is so random, we started wondering if it might be
> related to -frandom-seed, which changes in NixOS when rebuilding a
> package [4]. A thing to note here is that Nix does a lot of sandboxing
> during builds, which includes mount namespaces, so a kernel bug is not
> out of the question. All of these issues happened during Nix builds;
> coreutils 9.0 never made it out of the NixOS staging environment due
> to the builds breaking. We will probably disable the new code paths as
> outlined above so the issue is contained for NixOS users and does not
> hit any production environments.
>
> [1]: https://github.com/openzfs/zfs/issues/11900
> [2]: https://github.com/golang/go/issues/48636
> [3]: https://raw.githubusercontent.com/NixOS/nixpkgs/bf0531b4f8a2de4ff2700797fb211a90c951786e/pkgs/tools/misc/coreutils/disable-seek-hole.patch
> [4]: https://github.com/NixOS/nixpkgs/pull/141684#issuecomment-952339263

We know about the ZFS issue with SEEK_HOLE:
https://lists.gnu.org/archive/html/coreutils/2021-10/msg00021.html
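
For readers unfamiliar with the interface, here is a minimal sketch of
how a copy routine can walk a file's data extents with lseek and
SEEK_DATA/SEEK_HOLE. It is not the coreutils code; the helper names
count_data_bytes and data_bytes_in_file are hypothetical. If the file
system wrongly reports a region that holds data as a hole, a loop like
this skips it, which is how a copied binary can end up corrupted.

```c
#define _GNU_SOURCE            /* glibc exposes SEEK_DATA/SEEK_HOLE under this */
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

#if defined(__linux__) && !defined(SEEK_DATA)
# define SEEK_DATA 3           /* Linux values, used only if the headers hid them */
# define SEEK_HOLE 4
#endif

/* Hypothetical helper: sum the sizes of all regions the file system
   reports as data.  Returns -1 on error.  */
static off_t count_data_bytes(int fd)
{
    off_t total = 0, pos = 0;
    assert(fd >= 0);
    for (;;) {
        off_t data = lseek(fd, pos, SEEK_DATA);
        if (data < 0)
            return errno == ENXIO ? total : -1;  /* ENXIO: no data past pos */
        off_t hole = lseek(fd, data, SEEK_HOLE); /* end of file counts as a hole */
        if (hole <= data)
            return -1;                           /* error or inconsistent answer */
        total += hole - data;
        pos = hole;
    }
}

/* Hypothetical convenience wrapper taking a path instead of a descriptor.  */
static off_t data_bytes_in_file(const char *path)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;
    off_t n = count_data_bytes(fd);
    close(fd);
    return n;
}
```

Comparing the result with st_size for a file known to contain no holes
is one way to check whether a given file system misreports data as
holes.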

I've asked the user having NixOS issues on Darwin whether they're using
the ZFS on Darwin port, or at least what file system is being copied
from there.

This is awkward to handle unfortunately.
All I can think of now is to identify the file system type for each source file,
and disable SEEK_HOLE on zfs at least.
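
As a rough sketch of that per-file check (not an actual coreutils
change), one could query the source file's file system type with
fstatfs and avoid SEEK_HOLE when it reports ZFS. The helper names are
hypothetical, the check as written is Linux-specific, and 0x2fc12fc1 is
the value Linux reports for ZFS in statfs.f_type.

```c
#include <assert.h>
#include <stdbool.h>
#include <sys/vfs.h>           /* fstatfs, struct statfs (Linux) */

/* ZFS magic number as seen in statfs.f_type on Linux.  */
#define ZFS_MAGIC_SKETCH 0x2fc12fc1L

/* Hypothetical predicate: is SEEK_HOLE considered safe on this
   file system type?  Only ZFS is excluded in this sketch.  */
static bool fs_type_allows_seek_hole(long f_type)
{
    return f_type != ZFS_MAGIC_SKETCH;
}

/* Hypothetical helper: decide per source file whether to use
   SEEK_HOLE.  If the type cannot be determined, be conservative
   and report false so the caller falls back to a plain copy.  */
static bool may_use_seek_hole(int fd)
{
    struct statfs sfs;
    assert(fd >= 0);
    if (fstatfs(fd, &sfs) != 0)
        return false;
    return fs_type_allows_seek_hole((long) sfs.f_type);
}
```

A caller would run may_use_seek_hole on the source descriptor before
choosing between the extent-aware copy and an ordinary read/write loop.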

thanks for the info,
Pádraig
