[Top][All Lists]

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-block] [Qemu-devel] [PATCH for-2.12] qcow2: Reset free_cluster

From: Eric Blake
Subject: Re: [Qemu-block] [Qemu-devel] [PATCH for-2.12] qcow2: Reset free_cluster_index when allocating a new refcount block
Date: Wed, 21 Mar 2018 08:08:23 -0500
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.6.0

On 03/21/2018 04:28 AM, Alberto Garcia wrote:
On Tue 20 Mar 2018 06:54:15 PM CET, Eric Blake wrote:

When we try to allocate new clusters we first look for available ones
starting from s->free_cluster_index and once we find them we increase
their reference counts. Before we get to call update_refcount() to do
this last step s->free_cluster_index is already pointing to the next
cluster after the ones we are trying to allocate.

  > During update_refcount() it may happen however that we also need to
  > allocate a new refcount block in order to store the refcounts of
  > these new clusters

Your changes to test 121 covers this...

  > (and to complicate things further that may also require us to grow
  > the refcount table).

...but not this.  Is it worth also trying to cover this case in the
testsuite as well?

I checked and the patch doesn't really fix that scenario. There's a
different problem that I haven't debugged completely yet, because of two

  - One difference is that when we grow the refcount table we actually
    allocate a new one, so s->free_cluster_index points to the beginning
    of the image (where the previous table was) and any holes left during
    the process are allocated after that (depending on how much data we
    write though).

Yeah, that can make the test harder to reason about, and is slightly more sensitive to our internal algorithm - but it also explains why you checked for index > start rather than unconditionally assigning index = start.

  - This scenario is harder to reach: in order to fill a 1-cluster
    refcount table the size of the image needs to be larger than
    (cluster_sizeĀ³ / refcount_bits) bytes, that's 16TB with the default
    parameters. So although it can be reproduced easily if you reduce the
    cluster size I think it's very infrequent under normal conditions.

Yes, 16TB for default size, but only 2M for the best-case (512-byte cluster, 64-bit refcount), so still easy to write a test for.

But yes, it's a task left for the future.

Fair enough.

+            /* If the caller needs to restart the search for free clusters,
+             * try the same ones first to see if they're still free. */
+            if (ret == -EAGAIN) {
+                if (s->free_cluster_index > (start >> s->cluster_bits)) {
+                    s->free_cluster_index = (start >> s->cluster_bits);
+                }

Is there any harm in making this assignment unconditional, instead of
only doing it when free_cluster_index has grown larger than start?

It can happen that it is smaller than 'start' if we were moving the
refcount table to a new location, so we want to keep the lowest value.

Okay, then my R-b is sufficient on the patch as-is.

[And unrelated, but it might be nice to do a followup cleanup to track
free_cluster_offset by bytes instead of having to shift
free_cluster_index everywhere]

I've actually just seen that we already have free_byte_offset, we use
that for compressed clusters, so it might be possible to use that one...

free_byte_offset really IS a byte offset, as it can point mid-cluster. But having EVERYTHING be byte-based seems like it will make reasoning about the code easier to do.

I'll put that in my TODO list.


Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org

reply via email to

[Prev in Thread] Current Thread [Next in Thread]