qemu-block

Re: [Qemu-block] import thin provisioned disks with image upload


From: Eric Blake
Subject: Re: [Qemu-block] import thin provisioned disks with image upload
Date: Thu, 7 Dec 2017 15:18:46 -0600
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.4.0

On 12/07/2017 02:33 PM, Nir Soffer wrote:

> $ truncate -s 1g empty
> 
> $ stat empty
>   File: 'empty'
>   Size: 1073741824 Blocks: 0          IO Block: 4096   regular file
>   ...
> 
> $ qemu-img info empty
> image: empty
> file format: raw
> virtual size: 1.0G (1073741824 bytes)
> disk size: 0
> 
> The value "disk size" used by qemu-img is confusing and not useful
> when you want to transfer the file to another host.
> 
> I don't know why qemu-img displays this value instead of the actual
> file size, adding qemu-block mailing list in case someone can explain
> this.

For regular files, qemu is reporting the stat Blocks value, converted to
bytes (Blocks is counted in 512-byte units).  A completely sparse file
occupies 0 blocks, and therefore uses 0 bytes of the host filesystem.
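
As a rough illustration of the two numbers being compared here, the
sketch below reads both from stat.  The helper name file_sizes is made
up for this example, and the 512-byte unit for st_blocks is the Linux
convention, so treat the result as approximate on other systems.

```python
import os


def file_sizes(path):
    """Return (apparent_size, allocated_size) in bytes for a regular file.

    apparent_size is what 'ls -l' and stat "Size" show.
    allocated_size is derived from stat "Blocks", which is the same
    quantity qemu-img info calls "disk size" for a raw file.
    """
    st = os.stat(path)
    apparent = st.st_size
    # Assumption: st_blocks is counted in 512-byte units (true on Linux).
    allocated = st.st_blocks * 512
    return apparent, allocated


if __name__ == "__main__":
    import sys

    for name in sys.argv[1:]:
        apparent, allocated = file_sizes(name)
        print(f"{name}: apparent={apparent} allocated={allocated}")
```

For a file created with truncate, the apparent size is the full length,
while the allocated size stays at or near zero on filesystems that
support sparse files.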

> 
> When you upload or download this file, you will transfer 1g of zeros.

Well, depending on whether your transfer mechanism has optimizations for
transferring holes efficiently, you may not have to send 1G of zeroes
over the wire; but yes, the receiving end must reconstruct a file
containing 1G of zeroes (even if that reconstruction is also sparse, and
occupies 0 blocks of the file system).
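
One way a transfer tool can find out which parts of a file are worth
sending is to ask the filesystem for its data extents.  The sketch below
does that with lseek and SEEK_DATA/SEEK_HOLE; the function name
data_extents is hypothetical, and it assumes Linux and a Python build
that exposes os.SEEK_DATA and os.SEEK_HOLE.  On filesystems without hole
reporting, the kernel falls back to describing the whole file as one
data extent, so this is only an optimization hint.

```python
import errno
import os


def data_extents(path):
    """Yield (start, end) byte ranges that the filesystem reports as data.

    Everything outside the yielded ranges reads back as zeroes and does
    not need to be sent, as long as the receiver knows the total size.
    """
    fd = os.open(path, os.O_RDONLY)
    try:
        size = os.fstat(fd).st_size
        offset = 0
        while offset < size:
            try:
                start = os.lseek(fd, offset, os.SEEK_DATA)
            except OSError as e:
                if e.errno == errno.ENXIO:
                    # No data after offset: the rest of the file is a hole.
                    break
                raise
            # There is always an implicit hole at end of file, so this
            # returns at most the file size.
            end = os.lseek(fd, start, os.SEEK_HOLE)
            yield start, end
            offset = end
    finally:
        os.close(fd)
```

A sender could transmit the file size plus only these ranges, and the
receiver could write each range at its offset into a file that was
first truncated to the full size.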


>> If I go and check on my exported file system I get this
>>
>> address@hidden NFS_DOMAIN]# find . -name "3c68d43f-0f28-4564-b557-d390a125daa6"
>>
>> ./572eabe7-15d0-42c2-8fa9-0bd773e22e2e/images/3c68d43f-0f28-4564-b557-d390a125daa6
>> address@hidden NFS_DOMAIN]# ls -lsh ./572eabe7-15d0-42c2-8fa9-0bd773e22e2e/images/3c68d43f-0f28-4564-b557-d390a125daa6
>> total 8.6G
>> 8.6G -rw-rw----. 1 vdsm kvm  10G Dec  7 08:42 09ad8e53-0b22-4fe3-b718-d14352b8290a

> 
> Yes this is raw sparse file.
> 
> 
>> 1.3G is the used size on the file system, we cannot upload only used
>>> blocks.

You have to understand something about the way file systems work.  Just
because the guest filesystem reports 1.3G used does NOT mean that the
1.3G is contiguous; depending on how scatter-shot its allocation
patterns have been, and the minimum granularity for holes in the host
filesystem, there could very well be gaps that the guest is reporting as
unused but which still contribute towards the in-use block count
reported by the host.

Furthermore, when you delete a file, most filesystems do not immediately
rewrite the underlying storage back to 0, but just update metadata in
the filesystem to state that the storage can be reused later for a new
file.  The TRIM operation on filesystems can help reclaim some of that
(now-unused) space, and more and more filesystems are starting to
support the Linux interfaces (such as fallocate() with
FALLOC_FL_PUNCH_HOLE) for punching holes in the middle of existing files
so that those regions stop occupying blocks.  But the fact that the host
filesystem claims 8.6G of used blocks is a typical symptom of not having
TRIM'ed the guest filesystem recently, or a case where guest TRIMs do
not reach through all the layers into a host hole-punching operation.
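
To make the host-side hole-punching operation concrete, here is a
minimal sketch that calls fallocate(2) through ctypes, since Python's
standard library has no direct wrapper for the punch-hole mode.  The
helper name punch_hole is invented for this example; it assumes 64-bit
Linux (so that off_t is 64 bits wide) and a filesystem that supports
FALLOC_FL_PUNCH_HOLE.

```python
import ctypes
import ctypes.util
import os

# Flag values from <linux/falloc.h>.
FALLOC_FL_KEEP_SIZE = 0x01
FALLOC_FL_PUNCH_HOLE = 0x02

# Assumption: a Linux libc that exports fallocate(); if find_library()
# returns None, CDLL(None) falls back to the symbols already loaded.
_libc = ctypes.CDLL(ctypes.util.find_library("c"), use_errno=True)
_libc.fallocate.argtypes = [ctypes.c_int, ctypes.c_int,
                            ctypes.c_longlong, ctypes.c_longlong]
_libc.fallocate.restype = ctypes.c_int


def punch_hole(fd, offset, length):
    """Deallocate [offset, offset + length) without changing the file size.

    The punched range reads back as zeroes afterwards.
    """
    mode = FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE
    if _libc.fallocate(fd, mode, offset, length) != 0:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err))
```

On a filesystem that does not implement hole punching the call fails
with an OSError, which is one of the ways a guest TRIM can get lost
before it frees any host blocks.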

>>> qemu-img info "Disk size" is not the file size but the used size, not
>>> useful for upload.

Disk size of a raw file is how many host blocks are currently in use.
You are correct that it does not necessarily tell you how much data the
guest filesystem is currently reporting as used.  On the other hand, it
is a good conservative upper bound for how much storage you need to
reproduce the same bit pattern on the destination (even if many of the
bits don't need to be reproduced, because the guest filesystem didn't
care about them), but only insofar as you can preserve sparseness of
the portions of the file that are holes in the source.
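
Preserving sparseness on the destination can be sketched as a copy loop
that seeks over all-zero chunks instead of writing them.  This is only
an illustration: sparse_copy and the 64 KiB chunk size are arbitrary
choices for the example, and it detects zeroes by reading the data
rather than by asking the source filesystem where its holes are.

```python
import os

CHUNK = 64 * 1024  # arbitrary chunk size chosen for this sketch


def sparse_copy(src_path, dst_path, chunk=CHUNK):
    """Copy src to dst, leaving holes wherever a whole chunk is zero."""
    zeros = bytes(chunk)
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        while True:
            buf = src.read(chunk)
            if not buf:
                break
            if buf == zeros[:len(buf)]:
                # Skip ahead instead of writing, so no blocks are
                # allocated for this range on a sparse-capable filesystem.
                dst.seek(len(buf), os.SEEK_CUR)
            else:
                dst.write(buf)
        # Set the final length explicitly in case the file ends in a hole.
        dst.truncate(src.tell())
```

The destination has the same length and the same bit pattern as the
source; how many blocks it occupies depends on the destination
filesystem's hole granularity.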

You may also be interested in the virt-sparsify tool from libguestfs,
which is very good at squeezing out unneeded space from an image in
order to get a smaller version that still produces the same content that
the guest cares about.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org
