Re: [Qemu-devel] [PATCH v5 1/3] qcow2: Add qcow2_shrink_l1_and_l2_table


From: Max Reitz
Subject: Re: [Qemu-devel] [PATCH v5 1/3] qcow2: Add qcow2_shrink_l1_and_l2_table for qcow2 shrinking
Date: Thu, 15 Jan 2015 13:47:40 -0500
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Thunderbird/31.3.0

On 2015-01-03 at 07:23, Jun Li wrote:
On Fri, 11/21 11:56, Max Reitz wrote:
So, as for what I think we do need to do when shrinking (and keep in mind:
The offset given to qcow2_truncate() is the guest size! NOT the host image
size!):

(1) Determine the first L2 table and the first entry in the table which will
lie beyond the new guest disk size.
This is not always correct. Due to COW, using the offset to calculate the
first entry of the first L2 table can give the wrong result.

Again: This is *not* about the host disk size or the host offset of some cluster, but about the *guest* disk size.

Let's make up an example. You have a 2 GB disk but you want to resize it to 1.25 GB. The cluster size is 64 kB, therefore we have 2 GB / 64 kB = 32,768 data clusters (as long as there aren't any internal snapshots, which is a prerequisite for resizing qcow2 images).

Every L2 table contains 65,536 / 8 = 8,192 entries; there are thus 32,768 / 8,192 = 4 L2 tables.

As you can see, one can directly derive the number of data clusters and L2 tables from the guest disk size (as long as there aren't any internal snapshots).

So of course we can do the same for the target disk size: 1.25 GB / 64 kB = 20,480 data clusters; 20,480 / 8,192 = 2.5 L2 tables, therefore we need three L2 tables but only half of the last one (4,096 entries).

We know that every cluster referenced beyond that limit (that is, by any entry in the fourth L2 table and by any entry starting with index 4,096 in the third L2 table) is a data cluster with a guest offset somewhere beyond 1.25 GB, so we don't need it anymore.

Thus, we simply discard all those data clusters and after that we can discard the fourth L2 table. That's it.

If we really want to we can calculate the highest cluster host offset in use and truncate the image accordingly. But that's optional, see the last point in my "problems with this approach" list (having discarded the clusters should save us all the space already). Furthermore, as I'm saying in that list, to really solve this issue, we'd need qcow2 defragmentation.

What I have done for this scenario:
(1) If the first entry is the first entry of its L2 table, also scan the
previous L2 table (the L2 table whose L1 entry precedes this one's in the
L1 table). If an entry of the previous L2 table points beyond the offset,
discard that entry, too.
(2) If the first entry is not the first entry of its L2 table, still scan
the whole L2 table to make sure no entry points beyond the offset.

(2) Discard all clusters beginning from there.
(3) Discard all L2 tables which are then completely empty.
(4) Update the header size.
I think the current version of this patch already includes the above four steps.
The current patch also has another step:
(5) Truncate the file.

As I wrote above, you can do that but it shouldn't matter much because the discarded clusters should not use any disk space.

Here I think we should also discard refcount blocks and refcount table
clusters when they are completely empty.

And that's it. We can either speed up step (2) by implementing it manually,
or we just use bdrv_discard() on the qcow2 BDS (in the simplest case:
bdrv_discard(bs, DIV_ROUND_UP(offset, BDRV_SECTOR_SIZE), bs->total_sectors -
DIV_ROUND_UP(offset, BDRV_SECTOR_SIZE));).

We can incorporate step (3) by extending qcow2_discard_clusters() to free L2
tables when they are empty after discard_single_l2(). But we don't even have
to do that now. It's an optimization we can go about later.

So, we can do (1), (2) and (3) in a single step: Just one bdrv_discard()
call. But it's probably better to use qcow2_discard_clusters() instead and
set the full_discard parameter to true.

So: qcow2_discard_clusters(bs, offset, bs->total_sectors - offset /
BDRV_SECTOR_SIZE, true);. Then update the guest disk size field in the
header. And we're done.

There are four problems with this approach:
- qcow2_discard_clusters() might be slower than optimal. I personally don't
care at all.
- If "bs->total_sectors * BDRV_SECTOR_SIZE - offset" is greater than
INT_MAX, this won't work. Trivially solvable by encapsulating the
qcow2_discard_clusters() call in a loop which limits nb_sectors to INT_MAX
/ BDRV_SECTOR_SIZE.
- The L1 table is not resized. Should not matter in practice at all.
Yes, I agree with you.

- The file is not truncated. Does not matter either (because holes in the
file are good enough), and we can't actually solve this problem without
defragmentation anyway.

There is one advantage:
- It's extremely simple. It's literally below ten lines of code.

I think the advantage far outweighs the disadvantages. But I may be wrong.
What do you think?
Hi Max,

   Sorry for the late reply; I have been very busy recently. I think it is
better to agree first on how to implement qcow2 shrinking, and then write the code.

Yes, this will probably be for the best. :-)

One other thing: as Gmail currently cannot be used in China, I have to reply
from this address. :)

No problem.

Max


