qemu-block

Re: [ovirt-devel] Disk sizes not updated on unmap/discard


From: Nir Soffer
Subject: Re: [ovirt-devel] Disk sizes not updated on unmap/discard
Date: Fri, 2 Oct 2020 01:57:04 +0300

On Wed, Sep 30, 2020 at 1:49 PM Tomáš Golembiovský <tgolembi@redhat.com> wrote:
>
> Hi,
>
> currently, when we run virt-sparsify on VM or user runs VM with discard
> enabled and when the disk is on block storage in qcow, the results are
> not reflected in oVirt. The blocks get discarded, storage can reuse them
> and reports correct allocation statistics, but oVirt does not. In oVirt
> one can still see the original allocation for disk and storage domain as
> it was before blocks were discarded. This is super-confusing to the
> users because when they check after running virt-sparsify and see the
> same values they think sparsification is not working. Which is not true.

This may be a documentation issue. This is a known limitation of oVirt
thin-provisioned storage: we allocate space as needed, but we release the
space only when a volume is deleted.

> It all seems to be because of our LVM layout that we have on storage
> domain. The feature page for discard [1] suggests it could be solved by
> running lvreduce. But this does not seem to be true. When blocks are
> discarded the QCOW does not necessarily change its apparent size, the
> blocks don't have to be removed from the end of the disk. So running
> lvreduce is likely to remove valuable data.

We have an API to (safely) reduce a volume to its optimal size:
http://ovirt.github.io/ovirt-engine-api-model/master/#services/disk/methods/reduce

Reducing images depends on the qcow2 image-end-offset. We can find the
highest offset used by an inactive disk:
https://github.com/oVirt/vdsm/blob/24f646383acb615b090078fc7aeddaf7097afe57/lib/vdsm/storage/blockVolume.py#L403

and reduce the logical volume to this size.

But this will not work, since the qcow2 image-end-offset is not decreased by

    virt-sparsify --in-place

So it is true that sparsify releases unused space at the storage level, but it
does not decrease the qcow2 image end offset, so we cannot reduce the logical
volumes.
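
To illustrate why the end offset is what limits a reduce, here is a minimal
sketch that rounds an image end offset up to whole LVM extents. The function
name and the 128 MiB extent size are assumptions made for illustration; this
is not the vdsm implementation linked above. The offsets are the ones
reported by qemu-img check in the test later in this mail.

```python
MiB = 1024**2


def reduced_lv_size(image_end_offset, extent_size=128 * MiB):
    """Return the smallest multiple of extent_size holding image_end_offset bytes.

    extent_size is an assumed value; the real extent size depends on the VG.
    """
    if image_end_offset < 0:
        raise ValueError("image_end_offset must not be negative")
    extents = -(-image_end_offset // extent_size)  # ceiling division
    return max(extents, 1) * extent_size


# End offset after writing the 5 GiB file.
print(reduced_lv_size(5381423104) // MiB, "MiB")
# End offset on the host after virt-sparsify --in-place.
print(reduced_lv_size(4822138880) // MiB, "MiB")
# End offset of the downloaded copy of the sparsified disk.
print(reduced_lv_size(11927552) // MiB, "MiB")
```

Under these assumptions the logical volume could shrink only by a few extents
after sparsifying in place, while the downloaded copy would fit in a single
extent.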

> At the moment I don't see how we could achieve the correct values. If
> anyone has any idea feel free to entertain me. The only option seems to
> be to switch to LVM thin pools. Do we have any plans on doing that?

No, thin pools do not support clustering; they can be used only on a single
host. oVirt LVM-based volumes are accessed on multiple hosts at the same
time.

Here is an example sparsify test showing the issue:

Before writing data to the new disk:

guest:

# df -h /data
Filesystem      Size  Used Avail Use% Mounted on
/dev/sda1        10G  104M  9.9G   2% /data

storage:

$ ls -lhs /home/target/2/00
2.1G -rw-r--r--. 1 root root 100G Oct  2 00:57 /home/target/2/00

host:

# qemu-img info /dev/27f2b637-ffb1-48f9-8f68-63ed227392b9/42cf66df-43ad-4cfa-ab57-a943516155d1
image: /dev/27f2b637-ffb1-48f9-8f68-63ed227392b9/42cf66df-43ad-4cfa-ab57-a943516155d1
file format: qcow2
virtual size: 10 GiB (10737418240 bytes)
disk size: 0 B
cluster_size: 65536
Format specific information:
    compat: 1.1
    compression type: zlib
    lazy refcounts: false
    refcount bits: 16
    corrupt: false

# qemu-img check /dev/27f2b637-ffb1-48f9-8f68-63ed227392b9/42cf66df-43ad-4cfa-ab57-a943516155d1
No errors were found on the image.
168/163840 = 0.10% allocated, 0.60% fragmented, 0.00% compressed clusters
Image end offset: 12582912


After writing a 5 GiB file to the file system on this disk in the guest:

guest:

$ dd if=/dev/zero bs=8M count=640 of=/data/test oflag=direct conv=fsync status=progress

# df -h /data
Filesystem      Size  Used Avail Use% Mounted on
/dev/sda1        10G  5.2G  4.9G  52% /data

storage:

$ ls -lhs /home/target/2/00
7.1G -rw-r--r--. 1 root root 100G Oct  2 01:06 /home/target/2/00

host:

# qemu-img check /dev/27f2b637-ffb1-48f9-8f68-63ed227392b9/42cf66df-43ad-4cfa-ab57-a943516155d1
No errors were found on the image.
82088/163840 = 50.10% allocated, 5.77% fragmented, 0.00% compressed clusters
Image end offset: 5381423104


After deleting the 5 GiB file:

guest:

# df -h /data
Filesystem      Size  Used Avail Use% Mounted on
/dev/sda1        10G  104M  9.9G   2% /data

storage:

$ ls -lhs /home/target/2/00
7.1G -rw-r--r--. 1 root root 100G Oct  2 01:12 /home/target/2/00

host:

# qemu-img check /dev/27f2b637-ffb1-48f9-8f68-63ed227392b9/42cf66df-43ad-4cfa-ab57-a943516155d1
No errors were found on the image.
82088/163840 = 50.10% allocated, 5.77% fragmented, 0.00% compressed clusters
Image end offset: 5381423104


After sparsifying the disk:

storage:
$ qemu-img check /var/tmp/download.qcow2
No errors were found on the image.
170/163840 = 0.10% allocated, 0.59% fragmented, 0.00% compressed clusters
Image end offset: 11927552

$ ls -lhs /home/target/2/00
2.1G -rw-r--r--. 1 root root 100G Oct  2 01:14 /home/target/2/00

host:

# qemu-img check /dev/27f2b637-ffb1-48f9-8f68-63ed227392b9/42cf66df-43ad-4cfa-ab57-a943516155d1
No errors were found on the image.
170/163840 = 0.10% allocated, 0.59% fragmented, 0.00% compressed clusters
Image end offset: 4822138880

Allocation decreased from 50.1% to 0.1%, but the image end offset
decreased only from 5381423104 to 4822138880 (-10.4%).
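
The comparison above can be reproduced from the qemu-img check output with a
short script. This is only a sketch: the function names are made up for this
example, and it parses the human-readable output exactly as pasted in this
mail, so it may break if qemu changes that format.

```python
import re


def parse_check(output):
    """Extract cluster counts and the image end offset from qemu-img check text."""
    clusters = re.search(r"^(\d+)/(\d+) = ", output, re.MULTILINE)
    end = re.search(r"^Image end offset: (\d+)", output, re.MULTILINE)
    if clusters is None or end is None:
        raise ValueError("unexpected qemu-img check output")
    return {
        "allocated": int(clusters.group(1)),
        "total": int(clusters.group(2)),
        "end_offset": int(end.group(1)),
    }


def percent_change(before, after):
    """Relative change from before to after, in percent."""
    return (after - before) / before * 100


# Host output after deleting the file, copied from this mail.
BEFORE = """No errors were found on the image.
82088/163840 = 50.10% allocated, 5.77% fragmented, 0.00% compressed clusters
Image end offset: 5381423104
"""

# Host output after sparsifying the disk, copied from this mail.
AFTER = """No errors were found on the image.
170/163840 = 0.10% allocated, 0.59% fragmented, 0.00% compressed clusters
Image end offset: 4822138880
"""

before = parse_check(BEFORE)
after = parse_check(AFTER)
for key in ("allocated", "end_offset"):
    change = percent_change(before[key], after[key])
    print(f"{key}: {before[key]} -> {after[key]} ({change:.1f}%)")
```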

I don't know if this is a behavior change in virt-sparsify or qemu, or whether
it was always like that.

We had an old and unused sparsifyVolume API in vdsm before 4.4. It did not use
--in-place and was very complicated because of this. But I think it would work
in this case, since qemu-img convert will drop the unallocated areas.
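
A rough sketch of the copy-based approach, assuming plain qemu-img convert
followed by qemu-img check with JSON output. This is not the old
sparsifyVolume code; the function names are hypothetical and the paths are
supplied by the caller.

```python
import json
import subprocess


def convert_cmd(src, dst):
    """Build a qemu-img command copying a qcow2 image into a new qcow2 image."""
    return ["qemu-img", "convert", "-f", "qcow2", "-O", "qcow2", src, dst]


def image_end_offset(check_json):
    """Read image-end-offset from the output of qemu-img check --output=json."""
    return json.loads(check_json)["image-end-offset"]


def compact_copy(src, dst):
    """Copy src to dst and return the end offset of the new image.

    Requires qemu-img on the host. check=True raises if qemu-img exits with
    a non-zero status.
    """
    subprocess.run(convert_cmd(src, dst), check=True)
    result = subprocess.run(
        ["qemu-img", "check", "--output=json", dst],
        check=True,
        capture_output=True,
        text=True,
    )
    return image_end_offset(result.stdout)
```

Calling compact_copy() with the sparsified volume as src and a new volume as
dst would give the end offset to use for sizing the new volume. The source
image should not be in use by a running VM while it is copied.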

For example, after downloading the sparsified disk, we get:

$ qemu-img check download.qcow2
No errors were found on the image.
170/163840 = 0.10% allocated, 0.59% fragmented, 0.00% compressed clusters
Image end offset: 11927552


Kevin, is this the expected behavior or a bug in qemu?

The disk I tested is a single qcow2 image without a backing file, so
theoretically qemu can deallocate all the discarded clusters.

Nir



