Re: [Qemu-block] qemu-img: Check failed: No space left on device


From: Nicolas Ecarnot
Subject: Re: [Qemu-block] qemu-img: Check failed: No space left on device
Date: Tue, 26 Sep 2017 16:31:22 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.3.0

On 21/09/2017 at 16:31, Stefan Hajnoczi wrote:
On Tue, Sep 19, 2017 at 12:09:06PM +0200, Nicolas Ecarnot wrote:
Hello,

First post here, so maybe I should introduce myself:
- I have been a sysadmin for decades and currently manage 4 oVirt clusters,
made up of tens of hypervisors, all CentOS 7.2+ based.
- I'm very happy with the solution we chose, especially because it is based
on qemu-kvm (open source, reliable, documented).

On one VM, we experienced the following:
- oVirt/vdsm detected an issue on the image
- following the hints at https://access.redhat.com/solutions/1173623, I
managed to detect one error and fix it
- the VM is now running perfectly

On two other VMs, we experienced a similar situation, except that the check
stage shows something like 14000+ errors, and the relevant logs are:

Repairing refcount block 14 is outside image
ERROR could not resize image: Invalid argument
ERROR cluster 425984 refcount=0 reference=1
ERROR cluster 425985 refcount=0 reference=1
[... repeating the previous line 7000+ times...]
ERROR cluster 457166 refcount=0 reference=1
Rebuilding refcount structure
ERROR writing refblock: No space left on device
qemu-img: Check failed: No space left on device
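
To relate the cluster numbers in the log above to byte positions, the arithmetic can be sketched as follows. This is only an illustration: it assumes the qcow2 default cluster size of 64 KiB, which the thread does not confirm (qemu-img info reports the actual cluster_size), and the helper name summarize is made up for this example.

```python
import re

# ASSUMPTION: 64 KiB clusters (the qcow2 default). The real value for the
# image must be read from `qemu-img info`; the thread does not state it.
CLUSTER_SIZE = 64 * 1024

# Matches lines such as "ERROR cluster 425984 refcount=0 reference=1".
ERROR_RE = re.compile(r"^ERROR cluster (\d+) refcount=(\d+) reference=(\d+)$")


def summarize(log_lines, cluster_size=CLUSTER_SIZE):
    """Return (error_count, first_byte_offset, last_byte_offset).

    Offsets are the start of the lowest and highest cluster reported.
    Returns (0, None, None) when no refcount error line is found.
    """
    clusters = []
    for line in log_lines:
        match = ERROR_RE.match(line.strip())
        if match:
            clusters.append(int(match.group(1)))
    if not clusters:
        return 0, None, None
    return (len(clusters),
            min(clusters) * cluster_size,
            max(clusters) * cluster_size)


# The three cluster lines quoted in the log above.
sample = [
    "ERROR cluster 425984 refcount=0 reference=1",
    "ERROR cluster 425985 refcount=0 reference=1",
    "ERROR cluster 457166 refcount=0 reference=1",
]
count, first, last = summarize(sample)
print(count, first, last)
```

Under that assumption, cluster 425984 would start at byte 27917287424, which is exactly 26 GiB. Whether that figure matches the size of the logical volume is something only lvs on the affected host can tell.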

Please run strace qemu-img info /the/relevant/logical/volume/path.  It
will print all the syscalls that qemu-img makes.  That way we'll be able
to verify that the ENOSPC error is coming from a pwritev syscall.
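
Finding the failing call in a long strace log can be automated. The sketch below scans saved strace output for syscalls that returned ENOSPC and, for pwrite64 and pwritev, extracts the file offset (their last argument). The sample lines are hypothetical and only mimic strace's usual formatting; the function name find_enospc is made up for this example.

```python
import re

# A failed call looks like: name(args) = -1 ENOSPC (No space left on device)
FAILED_RE = re.compile(r"(?P<name>\w+)\((?P<args>.*)\)\s*=\s*-1\s+ENOSPC\b")

# Syscalls whose last argument is the file offset.
POSITIONAL_WRITES = {"pwrite64", "pwritev"}


def find_enospc(lines):
    """Return a list of (syscall_name, offset_or_None) for ENOSPC failures."""
    hits = []
    for line in lines:
        match = FAILED_RE.search(line)
        if not match:
            continue
        name = match.group("name")
        offset = None
        if name in POSITIONAL_WRITES:
            # The offset is the last comma-separated argument.
            last_arg = match.group("args").rsplit(",", 1)[-1].strip()
            if last_arg.isdigit():
                offset = int(last_arg)
        hits.append((name, offset))
    return hits


# HYPOTHETICAL strace lines, not taken from the reported system.
sample = [
    r'pwritev(9, [{iov_base="\0\0\0"..., iov_len=65536}], 1, 327680) = 65536',
    r'pwritev(9, [{iov_base="\0\0\0"..., iov_len=65536}], 1, 27917287424)'
    r' = -1 ENOSPC (No space left on device)',
]
for name, offset in find_enospc(sample):
    print(name, offset)
```

To use it on real data, the trace would first be saved to a file (for instance with strace's -o option) and its lines passed to find_enospc.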

I did, but I'm not skilled enough to determine where the ENOSPC error is coming from.

Does your question mean that the reads and/or the writes may come from or go to places outside the expected boundaries?

You surely know that oVirt/RHEV is storing its qcow2 images in dedicated
logical volumes.

pvs/vgs/lvs are all showing there is plenty of space available, so I
understand that I don't understand what "No space left on device" means.

After you have the strace data you can look at the file offset from the
failing pwritev syscall and check that it's really within the LV.
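
The bounds check described above can be sketched like this. It is a minimal illustration, not a diagnostic tool: device_size relies on seeking to the end of the device, which works for Linux block devices and regular files, and the 26 GiB size used in the demo is a hypothetical value, not the size of the LV in this report.

```python
import os


def device_size(path):
    """Return the size in bytes of a block device or regular file."""
    fd = os.open(path, os.O_RDONLY)
    try:
        # Seeking to the end returns the total size in bytes.
        return os.lseek(fd, 0, os.SEEK_END)
    finally:
        os.close(fd)


def write_fits(offset, length, size):
    """True if a write of `length` bytes at `offset` stays within `size`."""
    return offset >= 0 and length >= 0 and offset + length <= size


# HYPOTHETICAL numbers: a 26 GiB volume and a 64 KiB write starting
# exactly at its end, which therefore falls outside the device.
lv_size = 26 * 2**30
print(write_fits(26 * 2**30, 65536, lv_size))          # False
print(write_fits(26 * 2**30 - 65536, 65536, lv_size))  # True
```

On the affected host, device_size would be called with the logical volume path, and the offset and length would come from the failing pwritev line in the strace output.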

I think there is no fancy thin provisioning going on at the LVM level
with oVirt, but if there is then perhaps a write within the LV could
still result in an ENOSPC error.  It would be worth confirming that
these are classic "thick" LVs.
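
One way to confirm this is to look at the first character of each volume's lv_attr field as printed by lvs: V marks a thin volume, t a thin pool and T thin pool data. The sketch below parses the output of `lvs --noheadings -o lv_name,lv_attr`; the sample output and volume names are hypothetical, and parse_lvs is a name made up for this example.

```python
# First lv_attr character for thin-provisioning related volumes:
# V = thin volume, t = thin pool, T = thin pool data.
THIN_TYPE_CHARS = set("VtT")


def is_thin(lv_attr):
    """True if the lv_attr string denotes a thin volume or thin pool."""
    return lv_attr[:1] in THIN_TYPE_CHARS


def parse_lvs(output):
    """Map LV name to True (thin) or False (not thin).

    `output` is the text printed by
    `lvs --noheadings -o lv_name,lv_attr`, one LV per line.
    """
    result = {}
    for line in output.splitlines():
        fields = line.split()
        if len(fields) >= 2:
            name, attr = fields[0], fields[1]
            result[name] = is_thin(attr)
    return result


# HYPOTHETICAL lvs output: one plain linear LV and one thin volume.
sample = """
  vm_disk    -wi-ao----
  thin_disk  Vwi-aotz--
"""
print(parse_lvs(sample))
```

A result of False for the volume holding the qcow2 image would support the assumption that the LV is fully allocated, so that ENOSPC cannot come from an exhausted thin pool.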

I think there is no such thin provisioning at the LVM level, but I wouldn't swear to it.
Would you mind if I forwarded your question to the oVirt mailing list?

--
Nicolas ECARNOT


