qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [PATCH v5 1/3] qcow2: Add qcow2_shrink_l1_and_l2_table


From: Eric Blake
Subject: Re: [Qemu-devel] [PATCH v5 1/3] qcow2: Add qcow2_shrink_l1_and_l2_table for qcow2 shrinking
Date: Mon, 24 Nov 2014 10:49:10 -0700
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Thunderbird/31.2.0

On 11/21/2014 03:56 AM, Max Reitz wrote:
> 
> Second, why are you truncating the file to offset if offset is smaller
> than actual_size? That is always wrong. You are blindly assuming:
> (1) that the image consists of only data and no metadata, because you're
> using the guest disk size (offset) to shrink the host file
> (2) that all that data is tightly packed at the beginning of the file
> 
> (1) is obviously wrong. (2) is wrong as well, because clusters may be
> spread all over the host file. We would need defragmentation to solve
> that issue.
> 
> Therefore, to shorten the host file, you'd need to walk through the
> image and find the highest *host* cluster actually in use. Then you can
> truncate to after that cluster. However, I'd strongly advise against
> that because it doesn't bring any actual benefit.

Well, there is a minor benefit - the 'query-blockstats' QMP command
reports wr_highest_offset, which IS the offset of the highest host
cluster already in use, but right now, we only populate that field on
the first allocating write, whereas I would like to be able to get at
that information even for a read-only file.  See also Max's thread for
optimizing overlap checks, where I mention this same thing as a
side-effect we would get for free when improving overlap checks.

> So, as for what I think we do need to do when shrinking (and keep in
> mind: The offset given to qcow2_truncate() is the guest size! NOT the
> host image size!):
> 
> (1) Determine the first L2 table and the first entry in the table which
> will lie beyond the new guest disk size.
> (2) Discard all clusters beginning from there.
> (3) Discard all L2 tables which are then completely empty.

You may want to also consider discarding refblocks when completely
empty, and truncating the size of the reftable if enough trailing
refblocks are dropped.  On the other hand, just as Max argued that
shrinking the L1 table is going to be in the noise, you are also going
to be in the noise for trying to shrink the reftable.

> (4) Update the header size.
> 
> And that's it. We can either speed up step (2) by implementing it
> manually, or we just use bdrv_discard() on the qcow2 BDS (in the
> simplest case: bdrv_discard(bs, DIV_ROUND_UP(offset, BDRV_SECTOR_SIZE),
> bs->total_sectors - DIV_ROUND_UP(offset, BDRV_SECTOR_SIZE));.
> 
> We can incorporate step (3) by extending qcow2_discard_clusters() to
> free L2 tables when they are empty after discard_single_l2(). But we
> don't even have to that now. It's an optimization we can go about later.

Same for trimming unused refblocks and/or shrinking the reftable.  Also,
I think that a future addition of a defragmentation pass would be a more
ideal place for tightly packing an image, including trimming reftables
to a bare minimum.

> 
> There is one advantage:
> - It's extremely simple. It's literally below ten lines of code.
> 
> I think the advantage far outweighs the disadvantage. But I may be
> wrong. What do you think?

I also like the much simpler approach of leaving holes.

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org

Attachment: signature.asc
Description: OpenPGP digital signature


reply via email to

[Prev in Thread] Current Thread [Next in Thread]