
Re: [Qemu-devel] Estimation of qcow2 image size converted from raw image


From: Maor Lipchuk
Subject: Re: [Qemu-devel] Estimation of qcow2 image size converted from raw image
Date: Thu, 23 Feb 2017 00:06:48 +0200

Hi,

I added a few clarifications inline.
The relevant calculation is:
     header_size + l2_tables_size + refcounts_tables_size + data_size

and it is also described inline.
I would be happy if you could confirm that this calculation is acceptable.

Thanks,
Maor

On Wed, Feb 22, 2017 at 6:15 PM, Maor Lipchuk <address@hidden> wrote:
> Hi all,
>
> Thank you very much for your help, it was very helpful.
> We adopted John Snow's advice and implemented our own calculation, so we
> can resolve the issue now.
> We plan to drop this code once we can get this estimation from qemu-img.
>
> This is the link to the patch introducing the calculation:
> https://gerrit.ovirt.org/#/c/65039/14/lib/vdsm/storage/qcow2.py
>
> And here are link to the tests that we added:
> https://gerrit.ovirt.org/#/c/65039/14/tests/storage_qcow2_test.py
>
> Here is how the calculation goes:
>
> We first use qemu-img map to find the used clusters, counting the
> clusters touched by each run returned from qemuimg.map(filename):
>
>  def count_clusters(runs):

Just a clarification:
runs are the output of qemuimg.map(filename).
Here is the code from github that implements it:
https://github.com/oVirt/vdsm/blob/master/lib/vdsm/qemuimg.py

>      count = 0
>      last = -1
>      for r in runs:
>          # Find the clusters where start and end are located.
>          start = r["start"] // CLUSTER_SIZE
>          end = (r["start"] + r["length"]) // CLUSTER_SIZE
>          if r["data"]:
>              if start == end:
>                  # This run is smaller than a cluster. If we have several
>                  # runs in the same cluster, we want to count the cluster
>                  # only once.
>                  if start != last:
>                      count += 1
>              else:
>                  # This run spans multiple clusters - we want to count all
>                  # the clusters this run touches.
>                  count += end - start
>              last = end
>      return count
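As a sanity check, here is a self-contained version of the counting logic
with a made-up set of runs (CLUSTER_SIZE assumes the qcow2 default of
64 KiB; the run dicts mirror the {"start", "length", "data"} keys that
qemuimg.map returns, but the values below are purely illustrative):

```python
CLUSTER_SIZE = 64 * 1024  # qcow2 default cluster size

def count_clusters(runs):
    count = 0
    last = -1
    for r in runs:
        # Find the clusters where this run starts and ends.
        start = r["start"] // CLUSTER_SIZE
        end = (r["start"] + r["length"]) // CLUSTER_SIZE
        if r["data"]:
            if start == end:
                # Run smaller than a cluster: count the cluster only once,
                # even if several runs fall into it.
                if start != last:
                    count += 1
            else:
                # Run spanning multiple clusters: count all of them.
                count += end - start
            last = end
    return count

runs = [
    {"start": 0, "length": 4096, "data": True},        # cluster 0
    {"start": 4096, "length": 4096, "data": True},     # cluster 0 again
    {"start": 65536, "length": 131072, "data": True},  # clusters 1-2
    {"start": 262144, "length": 65536, "data": False}, # hole, not counted
]
print(count_clusters(runs))  # 3
```

Note that two sub-cluster runs in cluster 0 are counted once, and the hole
(data=False) contributes nothing.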
>
>
> The following calculation is based on Kevin's comments on the original
> bug, and qcow2 spec:
> https://github.com/qemu/qemu/blob/master/docs/specs/qcow2.txt:
>
>      header_size = 3 * CLUSTER_SIZE
>
>      virtual_size = os.stat(filename).st_size
>
>      # Each 512MiB has one l2 table (64K)
>      l2_tables = (virtual_size + (512 * 1024**2) - 1) // (512 * 1024**2)
>      l2_tables_size = l2_tables * CLUSTER_SIZE
>
>      # Each cluster has a refcount entry (16 bits) in the refcount table. It
>      # is not clear what the size of the refcount table is; let's assume it
>      # is the same size as the l2 tables.
>      refcounts_tables_size = l2_tables_size

The calculation is missing two more lines:

     data_size = used_clusters * CLUSTER_SIZE
     return header_size + l2_tables_size + refcounts_tables_size + data_size
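Putting the pieces together, the whole estimate looks roughly like this (a
sketch, not the vdsm code itself; estimate_qcow2_size is a made-up name, and
the constants assume the default 64 KiB cluster size, where one 64 KiB L2
table holds 8192 eight-byte entries and therefore maps 512 MiB):

```python
CLUSTER_SIZE = 64 * 1024        # qcow2 default cluster size
L2_COVERAGE = 512 * 1024**2     # (65536 // 8) entries * 65536 bytes each

def estimate_qcow2_size(virtual_size, used_clusters):
    # Header, refcount table and first refcount block.
    header_size = 3 * CLUSTER_SIZE
    # One L2 table per 512 MiB of virtual size, rounded up.
    l2_tables = (virtual_size + L2_COVERAGE - 1) // L2_COVERAGE
    l2_tables_size = l2_tables * CLUSTER_SIZE
    # Assume the refcount tables take about as much space as the L2 tables.
    refcounts_tables_size = l2_tables_size
    # One cluster per used cluster found by count_clusters().
    data_size = used_clusters * CLUSTER_SIZE
    return header_size + l2_tables_size + refcounts_tables_size + data_size
```

For example, a 10 GiB raw image with 1000 used clusters needs 20 L2 tables,
so the estimate is 3 + 20 + 20 metadata clusters plus 1000 data clusters.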


>
>
> After we calculate the estimated size, we apply the following logic and
> multiply the result by 1.1:
>
>      chunk_size = config.getint("irs",
>                                 "volume_utilization_chunk_mb")
>      chunk_size = chunk_size * sc.MEGAB
>      newsize = (estimate_size + chunk_size) / sc.BLOCK_SIZE
>      self.log.debug("Estimated allocation for qcow2 volume:"
>                     "%d bytes", newsize)
>      newsize = newsize * 1.1
>


This last calculation can be ignored; it is mainly a safety margin that we add.
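For completeness, the margin logic can be sketched as follows (allocation_for
is a made-up name; the MEGAB, BLOCK_SIZE and default chunk values mirror the
vdsm constants but are hard-coded here for illustration):

```python
MEGAB = 1024**2
BLOCK_SIZE = 512  # storage block size in bytes

def allocation_for(estimate_size, chunk_mb=1024):
    # Add one free-extension chunk on top of the estimate, convert the
    # result to 512-byte blocks, then pad by 10% as a safety margin.
    chunk_size = chunk_mb * MEGAB
    blocks = (estimate_size + chunk_size) // BLOCK_SIZE
    return int(blocks * 1.1)
```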


>
> Please let me know if that calculation is acceptable and makes sense
> for this use case.
>
> Thanks,
> Maor
>
>
>
>>> On Mon, Feb 20, 2017 at 1:07 PM, Stefan Hajnoczi <address@hidden> wrote:
>>>> On Wed, Feb 15, 2017 at 05:49:58PM +0200, Nir Soffer wrote:
>>>>> On Wed, Feb 15, 2017 at 5:14 PM, Stefan Hajnoczi <address@hidden> wrote:
>>>>> > On Mon, Feb 13, 2017 at 05:46:19PM +0200, Maor Lipchuk wrote:
>>>>> >> I was wondering if that is possible to provide a new API that
>>>>> >> estimates the size of
>>>>> >> qcow2 image converted from a raw image. We could use this new API to
>>>>> >> allocate the
>>>>> >> size more precisely before the convert operation.
>>>>> >>
>>>>> > [...]
>>>>> >> We think that the best way to solve this issue is to return this info
>>>>> >> from qemu-img, maybe as a flag to qemu-img convert that will
>>>>> >> calculate the size of the converted image without doing any writes.
>>>>> >
>>>>> > Sounds reasonable.  qcow2 actually already does some of this calculation
>>>>> > internally for image preallocation in qcow2_create2().
>>>>> >
>>>>> > Let's try this syntax:
>>>>> >
>>>>> >   $ qemu-img query-max-size -f raw -O qcow2 input.raw
>>>>> >   1234678000
>>>>>
>>>>> This is a little bit verbose compared to other commands
>>>>> (e.g. info, check, convert).
>>>>>
>>>>> Since this is needed only during convert, maybe this can be
>>>>> a convert flag?
>>>>>
>>>>>     qemu-img convert -f xxx -O yyy src dst --estimate-size --output json
>>>>>     {
>>>>>         "estimated size": 1234678000
>>>>>     }
>>>>
>>>> What is dst?  It's a dummy argument.
>>>>
>>>> Let's not try to shoehorn this new sub-command into qemu-img convert.
>>>>
>>>>> > As John explained, it is only an estimate.  But it will be a
>>>>> > conservative maximum.
>>>>> >
>>>>> > Internally BlockDriver needs a new interface:
>>>>> >
>>>>> > struct BlockDriver {
>>>>> >     /*
>>>>> >      * Return a conservative estimate of the maximum host file size
>>>>> >      * required by a new image given an existing BlockDriverState (not
>>>>> >      * necessarily opened with this BlockDriver).
>>>>> >      */
>>>>> >     uint64_t (*bdrv_query_max_size)(BlockDriverState *other_bs,
>>>>> >                                     Error **errp);
>>>>> > };
>>>>> >
>>>>> > This interface allows individual block drivers to probe other_bs in
>>>>> > whatever way necessary (e.g. querying block allocation status).
>>>>> >
>>>>> > Since this is a conservative max estimate there's no need to read all
>>>>> > data to check for zero regions.  We should give the best estimate that
>>>>> > can be generated quickly.
>>>>>
>>>>> I think we need to check allocation (e.g. with SEEK_DATA), I hope this
>>>>> is what you mean by not read all data.
>>>>
>>>> Yes, allocation data must be checked.  But it will not read data
>>>> clusters from disk to check if they contain only zeroes.
>>>>
>>>> Stefan


