Re: Potential regression in 'qemu-img convert' to LVM

From: Nir Soffer
Subject: Re: Potential regression in 'qemu-img convert' to LVM
Date: Tue, 15 Sep 2020 12:08:35 +0300

On Mon, Sep 14, 2020 at 3:25 PM Stefan Reiter <s.reiter@proxmox.com> wrote:
> Hi list,
> following command fails since 5.1 (tested on kernel 5.4.60):
> # qemu-img convert -p -f raw -O raw /dev/zvol/pool/disk-1 /dev/vg/disk-1
> qemu-img: error while writing at byte 2157968896: Device or resource busy
> (source is ZFS here, but doesn't matter in practice, it always fails the
> same; offset changes slightly but consistently hovers around 2^31)
> strace shows the following:
> fallocate(13, FALLOC_FL_KEEP_SIZE|FALLOC_FL_PUNCH_HOLE, 2157968896,
> 4608) = -1 EBUSY (Device or resource busy)

What is the size of the LV?

Does it still happen if you change the minimum sparse size (-S)?

For example: -S 64k

    qemu-img convert -p -f raw -O raw -S 64k /dev/zvol/pool/disk-1 /dev/vg/disk-1
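
It may also help to look at how the failing request is aligned. A quick
sketch using only the offset and length from the strace output quoted above
(the 512 and 4096 byte granularities are my assumption about what is
relevant here, not something the error reports):

```shell
# Values copied from the failing fallocate() call in the strace output.
offset=2157968896
length=4608

# Distance from 2^31, since the failure "hovers around 2^31".
echo "offset - 2^31      = $(( offset - (1 << 31) ))"      # 10485248

# Alignment of the start of the hole to 512-byte and 4096-byte units.
echo "offset % 512       = $(( offset % 512 ))"            # 0
echo "offset % 4096      = $(( offset % 4096 ))"           # 3584

# Length in 512-byte sectors, and alignment of the end of the hole.
echo "length / 512       = $(( length / 512 ))"            # 9
echo "(off + len) % 4096 = $(( (offset + length) % 4096 ))" # 0
```

So the hole starts on a 512-byte boundary but not on a 4096-byte boundary,
and ends on a 4096-byte boundary. Whether that matters for the EBUSY is
something the -S test should help answer.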

> Other fallocate calls leading up to this work fine.
> This happens since commit edafc70c0c "qemu-img convert: Don't pre-zero
> images", before that all fallocates happened at the start. Reverting the
> commit and calling qemu-img exactly the same way on the same data works
> fine.

But slowly: with pre-zeroing, a fully allocated image is zeroed first and
then overwritten with data, which is up to 100% more work.

> Simply retrying the syscall on EBUSY (like EINTR) does *not* work,
> once it fails it keeps failing with the same error.
> I couldn't find anything related to EBUSY on fallocate, and it only
> happens on LVM targets... Any idea or pointers where to look?

Is this a thin LV?

This works for us using regular LVs.

Which kernel? Which distro?
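
Something along these lines should collect the details asked about above.
This is only a sketch: the LV name vg/disk-1 is taken from the failing
command and may not match your setup, and lvs usually needs root.

```shell
# Assumed LV name, taken from the target in the failing command.
lv=vg/disk-1

# Kernel release and distribution name.
kernel=$(uname -r)
echo "kernel: $kernel"
if [ -r /etc/os-release ]; then
    . /etc/os-release
    echo "distro: ${PRETTY_NAME:-unknown}"
fi

# LV size in bytes, segment type and pool. A regular LV normally reports
# segtype "linear", a thin LV reports "thin" and names its pool.
if command -v lvs >/dev/null 2>&1; then
    lvs --units b -o lv_name,lv_size,segtype,pool_lv "$lv" || true
fi
```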


