bug-coreutils
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

bug#61386: [PATCH] cp,mv,install: Disable sparse copy on macOS


From: Pádraig Brady
Subject: bug#61386: [PATCH] cp,mv,install: Disable sparse copy on macOS
Date: Fri, 10 Feb 2023 18:58:33 +0000
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:109.0) Gecko/20100101 Thunderbird/109.0

On 10/02/2023 17:24, George Valkov wrote:

On 2023-02-10, at 4:02 PM, Pádraig Brady <P@draigBrady.com> wrote:

On 10/02/2023 12:13, George Valkov wrote:
On 2023-02-10, at 11:18 AM, Pádraig Brady <P@draigBrady.com> wrote:

I'll apply the simple patch later I think.
I have an interesting idea: If I copy a large file, say the 16 GB disk image
where I compiled OpenWRT. So I copy this on the same filesystem, and check
disk usage before an after: no change. Also the copy is instant. That’s because
APFS supports copy-on-write. Initially both files share the same sectors on 
disk,
any changes made after that are copied to new sectors.
This is the most efficient way to copy files on APFS, and should produce no 
corruption.
Let’s implement it. I would assume there is a system cal that does the entire 
copy,
And we don’t need to read and write any data.
Does the trace contain any interesting calls that might be related to that?

When you say "I copy a large file", is that with gcp or something else?

Finder: option + drag in the same directory

Since coreutils 9.1 we try the CoW by default with fclonefileat()
which is available since macOS 10.12 (2016).

I can confirms fclonefileat works on macOS 12.6.3 and solves the issue.
File attributes are also preserved. To clarify: these observations are from
the sample below. I don’t have internal experience with cloreutils, but I can
test it if you provide some description.


Note that works only when src and dest are within the same file system,
but that is the case for you if I'm reading your original report correctly.

Correct. fclonefileat only works on the same file system.
If we attempt to clone to another volume, fclonefileat fails 18 Cross-device 
link.
When building OpenWRT everything works on the same file system, so
fclonefileat is applicable.


When I mentioned that earlier I thought your macOS version was too old to 
support that,
but in fact your 12.6.3 should support fclonefileat() fine.

Thankfully it is supported. The latest security update came in January 2023.


So that's another variable. Is that call failing completely for you?
You might be quicker to add a couple of printfs around the
fclonefileat() calls in the coreutils from latest git you compliled.
Note there is new error handling related to that call in the latest git,
compared to what was released in coreutils 9.1.

Coreutils tests on master
git clone git://git.savannah.gnu.org/coreutils.git
mv coreutils coreutils-clone
cd coreutils-clone
git submodule foreach git pull origin master
./bootstrap
./configure
make -j 16
git log
commit d374d32ccf12f8cf33c8f823d573498b7c8b27a4 (HEAD -> clone, origin/master, 
origin/HEAD, master)
cd ..

./coreutils-clone/src/cp cc1 cc1-test

printf("HAVE_FCLONEFILEAT %u  USE_XATTR %u\n", HAVE_FCLONEFILEAT, USE_XATTR);
HAVE_FCLONEFILEAT 1  USE_XATTR 0

       int fc_flags = x->preserve_ownership ? 0 : CLONE_NOOWNERCOPY;
                printf(
                        "data_copy_required      %u  x->reflink_mode        
%u\n"
                        "x->preserve_mode        %u  x->preserve_timestamps 
%u\n"
                        "x->preserve_ownership   %u  CLONE_NOOWNERCOPY      
%u\n",
                        data_copy_required, x->reflink_mode,
                        x->preserve_mode, x->preserve_timestamps,
                        x->preserve_ownership, CLONE_NOOWNERCOPY
                );
       if (data_copy_required && x->reflink_mode
           && x->preserve_mode && x->preserve_timestamps
           && (x->preserve_ownership || CLONE_NOOWNERCOPY))
         {
                        int a = fclonefileat (source_desc, dst_dirfd, 
dst_relname, fc_flags);
                        int e = errno;
                        printf("fclonefileat %i %i\n", a, e);
           if (a == 0)
             goto close_src_desc;

data_copy_required      1  x->reflink_mode        1
x->preserve_mode        0  x->preserve_timestamps 0
x->preserve_ownership   0  CLONE_NOOWNERCOPY      2

fclonefileat is not used because x->preserve_mode = 0, x->preserve_timestamps = 0
That design seems wrong. The fact than no one requested to preserve that data,
doesn’t mean preserving it via fclonefileat is a bad thing. I think we should
always call fclonefileat and proceed to other means if it fails. It is more 
efficient, since
it requires no additional space or data to be copied. And always produces a 
valid copy.


Note also I don't see any fclonefileat() syscalls in your Finder logs at least.

Great news: fclonefileat works on the same file system, and if we
need to copy outside, then just disable sparse copy according to my patch.
One option is to start blindly with fclonefileat, and if that fails, fall back 
to
normal copy. If you can use the trace to find what Finder is doing, we can try 
it.
It might bring some improvements. Else, disable sparse and do a regular copy.

It was easier for me to write a small sample:

// clone.c
#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/clonefile.h>

int main(int argc, char ** argv)
{
        int fd = open("cc1", O_RDONLY);
        int dir = open("./", O_RDONLY);
        int a = fclonefileat(fd, dir, "cc1-clone", 0);
        close(fd);
        close(dir);

        printf("fd %i  dir %i  fclonefileat %i\n", fd, dir, a);

        return 0;
}

./coreutils/src/cp cc1 cc1-ori-before
gcc clone.c && ./a.out
./coreutils/src/cp cc1 cc1-ori-after

ll cc1*
-rwxr-xr-x  1 g  staff  27551296 Feb 10 17:35 cc1*
-rwxr-xr-x  1 g  staff  27551296 Feb 10 17:35 cc1-clone*
-rwxr-xr-x  1 g  staff  27551296 Feb 10 17:39 cc1-ori-after*
-rwxr-xr-x  1 g  staff  27551296 Feb 10 17:39 cc1-ori-before*

sha1sum
812cb0591b000f56bdf75c0f3f94c622e371b970  cc1
812cb0591b000f56bdf75c0f3f94c622e371b970  cc1-clone
812cb0591b000f56bdf75c0f3f94c622e371b970  cc1-ori-after
7b447132f56f3b549ef62a4780945e82d26f7d2c  cc1-ori-before

df -h
Filesystem       Size   Used  Avail Capacity iused     ifree %iused  Mounted on
/dev/disk1s1    1.1Ti  1.0Ti   58Gi    95% 2400403 605165920    0%   
/System/Volumes/Data

# clone a large file 15 GB using fclonefileat
./a.out

# copy a large file 15 GB using Finder: option + drag in the same directory
# both complete instantly

df -h
Filesystem       Size   Used  Avail Capacity iused     ifree %iused  Mounted on
/dev/disk1s1    1.1Ti  1.0Ti   58Gi    95% 2400401 605207240    0%   
/System/Volumes/Data

ll -h /Users/g/vhd/OpenWRT.wrt3200*
-rw-r--r--@ 1 g  staff    15G Feb 10 04:06 
/Users/g/vhd/OpenWRT.wrt3200-clone.sparseimage
-rw-r--r--@ 1 g  staff    15G Feb 10 04:06 
/Users/g/vhd/OpenWRT.wrt3200-finder.sparseimage
-rw-r--r--@ 1 g  staff    15G Feb 10 04:06 
/Users/g/vhd/OpenWRT.wrt3200.sparseimage

Cool that clears up the fclonefileat() questions.
Attached is a patch to attempt the clone unless
we've explicitly specified --no-preserve=mode.
I was considering "touch"ing the timestamps after also,
but it's better to just maintain them as we're
pointing to the same data after all.

So with this patch we should operate quickly within a file system.
With your original patch we should operate robustly otherwise.

cheers,
Pádraig

Attachment: macos-more-cow.patch
Description: Text Data


reply via email to

[Prev in Thread] Current Thread [Next in Thread]