Re: Potential regression in 'qemu-img convert' to LVM

From: Stefan Reiter
Subject: Re: Potential regression in 'qemu-img convert' to LVM
Date: Tue, 15 Sep 2020 13:51:40 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Thunderbird/60.9.0

On 9/15/20 11:08 AM, Nir Soffer wrote:
On Mon, Sep 14, 2020 at 3:25 PM Stefan Reiter <s.reiter@proxmox.com> wrote:

Hi list,

The following command has failed since QEMU 5.1 (tested on kernel 5.4.60):

# qemu-img convert -p -f raw -O raw /dev/zvol/pool/disk-1 /dev/vg/disk-1
qemu-img: error while writing at byte 2157968896: Device or resource busy

(The source is a ZFS volume here, but that doesn't matter in practice: it
always fails the same way. The offset changes slightly but consistently
hovers around 2^31.)

strace shows the following:
fallocate(13, FALLOC_FL_KEEP_SIZE|FALLOC_FL_PUNCH_HOLE, 2157968896, 4608) = -1 EBUSY (Device or resource busy)
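
To make "hovers around 2^31" concrete, the short Python sketch below only restates the arithmetic of the failing range from the strace line. The 512-byte sector size and 4096-byte page size are assumptions (they were not stated in the report), and `describe` is a hypothetical helper name. This is not a diagnosis of the EBUSY.

```python
# Arithmetic on the failing fallocate() range from the strace output.
# SECTOR and PAGE are assumed values, not taken from the report.
SECTOR = 512
PAGE = 4096


def describe(offset, length):
    """Return alignment facts about the byte range [offset, offset + length)."""
    end = offset + length
    return {
        "bytes_past_2_31": offset - 2**31,
        "offset_mod_sector": offset % SECTOR,
        "offset_mod_page": offset % PAGE,
        "end_mod_page": end % PAGE,
        "length_in_sectors": length // SECTOR,
    }


info = describe(2157968896, 4608)
for key, value in info.items():
    print(f"{key}: {value}")
```

Under those assumptions the range starts 10485248 bytes past 2^31 (512 bytes short of 10 MiB past it), is sector-aligned but not page-aligned at its start, ends on a 4096-byte boundary, and is 9 sectors long.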

What is the size of the LV?

Same as the source, 5GB in my test case. Created with:

# lvcreate -ay --size 5242880k --name disk-1 vg

Does it happen if you change sparse minimum size (-S)?

For example: -S 64k

     qemu-img convert -p -f raw -O raw -S 64k /dev/zvol/pool/disk-1

Tried a few different values, always the same result: EBUSY at byte 2157968896.

Other fallocate calls leading up to this work fine.

This has happened since commit edafc70c0c "qemu-img convert: Don't pre-zero
images"; before that, all fallocate calls happened at the start. Reverting the
commit and calling qemu-img exactly the same way on the same data works.

But slowly, doing up to 100% more work for fully allocated images.

Of course, I'm not saying the patch is wrong; reverting it just avoids triggering the bug.

Simply retrying the syscall on EBUSY (like EINTR) does *not* work;
once it fails, it keeps failing with the same error.

I couldn't find anything related to EBUSY on fallocate, and it only
happens on LVM targets... Any idea or pointers where to look?
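
For anyone who wants to issue the same call outside qemu-img, here is a minimal Python sketch of a punch-hole `fallocate` with the usual retry on EINTR only. It is not qemu-img's code. It assumes 64-bit Linux with a libc that exports `fallocate` (so `off_t` is 64 bits); `punch_hole` is a hypothetical helper name, and the flag values are the ones defined in `linux/falloc.h`.

```python
import ctypes
import ctypes.util
import errno
import os

# Flag values from <linux/falloc.h>.
FALLOC_FL_KEEP_SIZE = 0x01
FALLOC_FL_PUNCH_HOLE = 0x02

# Assumption: 64-bit Linux, so off_t maps to a 64-bit integer.
_libc = ctypes.CDLL(ctypes.util.find_library("c") or "libc.so.6", use_errno=True)
_libc.fallocate.argtypes = [ctypes.c_int, ctypes.c_int, ctypes.c_int64, ctypes.c_int64]
_libc.fallocate.restype = ctypes.c_int


def punch_hole(fd, offset, length):
    """Deallocate [offset, offset + length) on fd without changing its size.

    Retries only on EINTR. Any other error, including EBUSY, is raised,
    since retrying on EBUSY was reported above not to help.
    """
    mode = FALLOC_FL_KEEP_SIZE | FALLOC_FL_PUNCH_HOLE
    while True:
        if _libc.fallocate(fd, mode, offset, length) == 0:
            return
        err = ctypes.get_errno()
        if err == errno.EINTR:
            continue
        raise OSError(err, os.strerror(err))


# Usage (destroys data in the given range, so only use on a scratch target):
#   fd = os.open("/path/to/scratch-file-or-device", os.O_RDWR)
#   try:
#       punch_hole(fd, 2157968896, 4608)
#   finally:
#       os.close(fd)
```

Issuing this single call against a scratch LV would not necessarily reproduce the failure, because the writes and earlier fallocate calls that qemu-img performs beforehand may matter.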

Is this a thin LV?

No, regular LV. See command above.

This works for us using regular LVs.

Which kernel? Which distro?

Reproducible on:
* PVE w/ kernel 5.4.60 (Ubuntu based)
* Manjaro w/ kernel 5.8.6

I found that it does not happen with all images; I suppose there must be a certain number of smaller holes for it to happen. I am using a VM image with a bare-bones Alpine Linux installation, but it's not an isolated case. Two people have reported the issue on our bug tracker: https://bugzilla.proxmox.com/show_bug.cgi?id=3002


