qemu-devel
[Top][All Lists]

## Re: [Qemu-devel] Estimation of qcow2 image size converted from raw image

 From: Maor Lipchuk Subject: Re: [Qemu-devel] Estimation of qcow2 image size converted from raw image Date: Wed, 22 Feb 2017 18:15:47 +0200

```Hi all,

We adopted John Snow advice and implemented our own calculation so we
can resolve the issue now,
We plan to drop this code once we can get this estimation from qemu-img.

This is the link to the patch introducing the calculation:
https://gerrit.ovirt.org/#/c/65039/14/lib/vdsm/storage/qcow2.py

https://gerrit.ovirt.org/#/c/65039/14/tests/storage_qcow2_test.py

Here is how the calculation goes:

We first use qemuimg map to get the used clusters and count all the
clusters for each run returned from qemuimg.map(filename):

def count_clusters(runs):
count = 0
last = -1
for r in runs:
# Find the cluster when start and end are located.
start = r["start"] // CLUSTER_SIZE
end = (r["start"] + r["length"]) // CLUSTER_SIZE
if r["data"]:
if start == end:
# This run is smaller then a cluster. If we have several runs
# in the same cluster, we want to count the cluster only once.
if start != last:
count += 1
else:
# This run span over multiple clusters - we want to count all
# the clusters this run touch.
count += end - start
last = end
return count

The following calculation is based on Kevin's comments on the original
bug, and qcow2 spec:
https://github.com/qemu/qemu/blob/master/docs/specs/qcow2.txt:

virtual_size = os.stat(filename).st_size

# Each 512MiB has one l2 table (64K)
l2_tables = (virtual_size + (512 * 1024**2) - 1) // (512 * 1024**2)
l2_tables_size = l2_tables * CLUSTER_SIZE

# Each cluster have a refcount entry (16 bits) in the refcount tables. It
# is not clear what is the size of the refcount table, lets assume it is
# the same size as the l2 tables.
refcounts_tables_size = l2_tables_size

After we calculate the estimated size we do the following logic and
multiply it with 1.1:

chunk_size = config.getint("irs",
"volume_utilization_chunk_mb")
chunk_size = chunk_size * sc.MEGAB
newsize = (estimate_size + chunk_size) / sc.BLOCK_SIZE
self.log.debug("Estimated allocation for qcow2 volume:"
"%d bytes", newsize)
newsize = newsize * 1.1

Please let me know if that calculation is acceptable and makes since
for this use case

Thanks,
Maor

>> On Mon, Feb 20, 2017 at 1:07 PM, Stefan Hajnoczi <address@hidden> wrote:
>>> On Wed, Feb 15, 2017 at 05:49:58PM +0200, Nir Soffer wrote:
>>>> On Wed, Feb 15, 2017 at 5:14 PM, Stefan Hajnoczi <address@hidden> wrote:
>>>> > On Mon, Feb 13, 2017 at 05:46:19PM +0200, Maor Lipchuk wrote:
>>>> >> I was wondering if that is possible to provide a new API that
>>>> >> estimates the size of
>>>> >> qcow2 image converted from a raw image. We could use this new API to
>>>> >> allocate the
>>>> >> size more precisely before the convert operation.
>>>> >>
>>>> > [...]
>>>> >> We think that the best way to solve this issue is to return this info
>>>> >> from qemu-img, maybe as a flag to qemu-img convert that will
>>>> >> calculate the size of the converted image without doing any writes.
>>>> >
>>>> > Sounds reasonable.  qcow2 actually already does some of this calculation
>>>> > internally for image preallocation in qcow2_create2().
>>>> >
>>>> > Let's try this syntax:
>>>> >
>>>> >   \$ qemu-img query-max-size -f raw -O qcow2 input.raw
>>>> >   1234678000
>>>>
>>>> This is little bit verbose compared to other commands
>>>> (e.g. info, check, convert)
>>>>
>>>> Since this is needed only during convert, maybe this can be
>>>> a convert flag?
>>>>
>>>>     qemu-img convert -f xxx -O yyy src dst --estimate-size --output json
>>>>     {
>>>>         "estimated size": 1234678000
>>>>     }
>>>
>>> What is dst?  It's a dummy argument.
>>>
>>> Let's not try to shoehorn this new sub-command into qemu-img convert.
>>>
>>>> > As John explained, it is only an estimate.  But it will be a
>>>> > conservative maximum.
>>>> >
>>>> > Internally BlockDriver needs a new interface:
>>>> >
>>>> > struct BlockDriver {
>>>> >     /*
>>>> >      * Return a conservative estimate of the maximum host file size
>>>> >      * required by a new image given an existing BlockDriverState (not
>>>> >      * necessarily opened with this BlockDriver).
>>>> >      */
>>>> >     uint64_t (*bdrv_query_max_size)(BlockDriverState *other_bs,
>>>> >                                     Error **errp);
>>>> > };
>>>> >
>>>> > This interface allows individual block drivers to probe other_bs in
>>>> > whatever way necessary (e.g. querying block allocation status).
>>>> >
>>>> > Since this is a conservative max estimate there's no need to read all
>>>> > data to check for zero regions.  We should give the best estimate that
>>>> > can be generated quickly.
>>>>
>>>> I think we need to check allocation (e.g. with SEEK_DATA), I hope this
>>>> is what you mean by not read all data.
>>>
>>> Yes, allocation data must be checked.  But it will not read data
>>> clusters from disk to check if they contains only zeroes.
>>>
>>> Stefan

```