Re: [ovirt-devel] Disk sizes not updated on unmap/discard


From: Richard W.M. Jones
Subject: Re: [ovirt-devel] Disk sizes not updated on unmap/discard
Date: Fri, 2 Oct 2020 08:51:43 +0100
User-agent: Mutt/1.5.21 (2010-09-15)

On Fri, Oct 02, 2020 at 01:57:04AM +0300, Nir Soffer wrote:
> On Wed, Sep 30, 2020 at 1:49 PM Tomáš Golembiovský <tgolembi@redhat.com> wrote:
> > Hi,
> >
> > currently, when we run virt-sparsify on a VM, or a user runs a VM with
> > discard enabled, and the disk is a qcow2 image on block storage, the results
> > are not reflected in oVirt. The blocks get discarded, the storage can reuse
> > them and reports correct allocation statistics, but oVirt does not. In oVirt
> > one can still see the original allocation for the disk and the storage domain
> > as it was before the blocks were discarded. This is super-confusing to users,
> > because when they check after running virt-sparsify and see the same values,
> > they think sparsification is not working, which is not true.
> 
> This may be a documentation issue. This is a known limitation of oVirt
> thin-provisioned storage: we allocate space as needed, but we release the
> space only when a volume is deleted.
> 
> > It all seems to be because of the LVM layout that we have on the storage
> > domain. The feature page for discard [1] suggests it could be solved by
> > running lvreduce, but this does not seem to be true. When blocks are
> > discarded, the qcow2 image does not necessarily shrink its apparent size;
> > the freed blocks do not have to be at the end of the image, so running
> > lvreduce is likely to destroy valuable data.
> 
> We have an API to (safely) reduce a volume to optimal size:
> http://ovirt.github.io/ovirt-engine-api-model/master/#services/disk/methods/reduce
> 
> Reducing images depends on the qcow2 image-end-offset. We can tell the
> highest offset used by an inactive disk:
> https://github.com/oVirt/vdsm/blob/24f646383acb615b090078fc7aeddaf7097afe57/lib/vdsm/storage/blockVolume.py#L403
> 
> and reduce the logical volume to this size.
> 
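(As a reference point, a minimal sketch of how that offset could be read on
the host; the device path is copied from the example below, and jq is only
assumed to be available:)

    # print the qcow2 check results as JSON and extract the highest used
    # offset; the LV could then be reduced to that value rounded up to the
    # VG extent size, which is what the reduce API automates
    qemu-img check --output=json \
        /dev/27f2b637-ffb1-48f9-8f68-63ed227392b9/42cf66df-43ad-4cfa-ab57-a943516155d1 \
        | jq '."image-end-offset"'
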
> But this will not work, since the qcow2 image-end-offset is not decreased by
> 
>     virt-sparsify --in-place

Right - this doesn't "defragment" the qcow2 file, i.e. move clusters
to the beginning - so (except by accident) it won't make the qcow2
file smaller.

Virt-sparsify in copying mode will actually do what you want, but it is
obviously much more heavyweight and complex to use.
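
For reference, a rough sketch of the two modes (the paths here are invented):

    # in-place: discards unused clusters inside the image but does not move
    # clusters around, so the image-end-offset stays roughly where it was
    virt-sparsify --in-place /path/to/disk.qcow2

    # copying mode: rewrites the data into a fresh image, dropping the
    # unallocated areas, at the cost of temporary scratch space
    virt-sparsify /path/to/disk.qcow2 /path/to/disk.sparse.qcow2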

> So it is true that sparsify releases unused space at the storage level, but
> it does not decrease the qcow2 image allocation, so we cannot reduce the
> logical volumes.
> 
> > At the moment I don't see how we could achieve the correct values. If
> > anyone has any idea feel free to entertain me. The only option seems to
> > be to switch to LVM thin pools. Do we have any plans on doing that?
> 
> No; thin pools do not support clustering, so they can be used only on a
> single host. oVirt LVM-based volumes are accessed from multiple hosts at the
> same time.
> 
> Here is an example sparsify test showing the issue:
> 
> Before writing data to the new disk
> 
> guest:
> 
> # df -h /data
> Filesystem      Size  Used Avail Use% Mounted on
> /dev/sda1        10G  104M  9.9G   2% /data
> 
> storage:
> 
> $ ls -lhs /home/target/2/00
> 2.1G -rw-r--r--. 1 root root 100G Oct  2 00:57 /home/target/2/00
> 
> host:
> 
> # qemu-img info
> /dev/27f2b637-ffb1-48f9-8f68-63ed227392b9/42cf66df-43ad-4cfa-ab57-a943516155d1
> image: 
> /dev/27f2b637-ffb1-48f9-8f68-63ed227392b9/42cf66df-43ad-4cfa-ab57-a943516155d1
> file format: qcow2
> virtual size: 10 GiB (10737418240 bytes)
> disk size: 0 B
> cluster_size: 65536
> Format specific information:
>     compat: 1.1
>     compression type: zlib
>     lazy refcounts: false
>     refcount bits: 16
>     corrupt: false
> 
> # qemu-img check
> /dev/27f2b637-ffb1-48f9-8f68-63ed227392b9/42cf66df-43ad-4cfa-ab57-a943516155d1
> No errors were found on the image.
> 168/163840 = 0.10% allocated, 0.60% fragmented, 0.00% compressed clusters
> Image end offset: 12582912
> 
> 
> After writing a 5G file to the file system on this disk in the guest:
> 
> guest:
> 
> $ dd if=/dev/zero bs=8M count=640 of=/data/test oflag=direct
> conv=fsync status=progress
> 
> # df -h /data
> Filesystem      Size  Used Avail Use% Mounted on
> /dev/sda1        10G  5.2G  4.9G  52% /data
> 
> storage:
> 
> $ ls -lhs /home/target/2/00
> 7.1G -rw-r--r--. 1 root root 100G Oct  2 01:06 /home/target/2/00
> 
> host:
> 
> # qemu-img check
> /dev/27f2b637-ffb1-48f9-8f68-63ed227392b9/42cf66df-43ad-4cfa-ab57-a943516155d1
> No errors were found on the image.
> 82088/163840 = 50.10% allocated, 5.77% fragmented, 0.00% compressed clusters
> Image end offset: 5381423104
> 
> 
> After deleting the 5G file:
> 
> guest:
> 
> # df -h /data
> Filesystem      Size  Used Avail Use% Mounted on
> /dev/sda1        10G  104M  9.9G   2% /data
> 
> storage:
> 
> $ ls -lhs /home/target/2/00
> 7.1G -rw-r--r--. 1 root root 100G Oct  2 01:12 /home/target/2/00
> 
> host:
> 
> # qemu-img check
> /dev/27f2b637-ffb1-48f9-8f68-63ed227392b9/42cf66df-43ad-4cfa-ab57-a943516155d1
> No errors were found on the image.
> 82088/163840 = 50.10% allocated, 5.77% fragmented, 0.00% compressed clusters
> Image end offset: 5381423104
> 
> 
> After sparsifying the disk:
> 
> storage:
> $ qemu-img check /var/tmp/download.qcow2
> No errors were found on the image.
> 170/163840 = 0.10% allocated, 0.59% fragmented, 0.00% compressed clusters
> Image end offset: 11927552
> 
> $ ls -lhs /home/target/2/00
> 2.1G -rw-r--r--. 1 root root 100G Oct  2 01:14 /home/target/2/00
> 
> host:
> 
> # qemu-img check
> /dev/27f2b637-ffb1-48f9-8f68-63ed227392b9/42cf66df-43ad-4cfa-ab57-a943516155d1
> No errors were found on the image.
> 170/163840 = 0.10% allocated, 0.59% fragmented, 0.00% compressed clusters
> Image end offset: 4822138880
> 
> Allocation decreased from 50% to 0.1%, but the image end offset
> decreased only from 5381423104 to 4822138880 (-10.5%).
>
> I don't know if this is a behavior change in virt-sparsify or qemu,
> or if it was always like that.

AFAIK nothing in virt-sparsify --in-place or qemu has changed here.

> We had an old and unused sparsifyVolume API in vdsm before 4.4. This did not
> use --in-place and was very complicated because of this. But I think it would
> work in this case, since qemu-img convert will drop the unallocated areas.
> 
> For example after downloading the sparsified disk, we get:
> 
> $ qemu-img check download.qcow2
> No errors were found on the image.
> 170/163840 = 0.10% allocated, 0.59% fragmented, 0.00% compressed clusters
> Image end offset: 11927552
> 
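(A minimal sketch of that approach, with invented file names; qemu-img convert
skips unallocated and zero areas when writing the destination, which is why
the result comes out compacted:)

    # copy into a fresh qcow2 image; discarded/zero clusters are not copied
    qemu-img convert -O qcow2 current.qcow2 compacted.qcow2

    # the new image's end offset now reflects only the remaining data
    qemu-img check --output=json compacted.qcow2 | jq '."image-end-offset"'
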
> 
> Kevin, is this the expected behavior or a bug in qemu?
> 
> The disk I tested is a single qcow2 image without a backing file, so
> theoretically qemu can deallocate all the discarded clusters.

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-top is 'top' for virtual machines.  Tiny program with many
powerful monitoring features, net stats, disk stats, logging, etc.
http://people.redhat.com/~rjones/virt-top



