[Top][All Lists]

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH 00/17] Improve qcow2 all-zero detection

From: Eric Blake
Subject: Re: [PATCH 00/17] Improve qcow2 all-zero detection
Date: Wed, 5 Feb 2020 13:21:40 -0600
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Thunderbird/68.4.1

On 2/5/20 11:04 AM, Max Reitz wrote:
OK, I expected users to come in a separate patch.

I can refactor that better in v2.

That's the use case: when copying into a destination file, it's useful
to know if the destination already reads as all zeroes, before
attempting a fallback to bdrv_make_zero(BDRV_REQ_NO_FALLBACK) or calls
to block status checking for holes.

But that was my point on IRC.  Is it really more useful if
bdrv_make_zero() is just as quick?  (And the fact that NBD doesn’t have
an implementation looks more like a problem with NBD to me.)

That is indeed a thought - why should qemu-img even TRY to call bdrv_has_init_zero; it should instead call bdrv_make_zero(), which should be as fast as possible (see my latest reply on 9/17 exploring that idea more). Under the hood, we can then make bdrv_make_zero() use whatever tricks it needs, whether keeping the driver's .bdrv_has_zero_init/_truncate callbacks, adding another callback, making write_zeroes faster, or whatever, but instead of making qemu-img sort through multiple ideas, the burden would now be hidden in the block layer.

(Considering that at least the code we discussed on IRC didn’t work for
preallocated images, which was the one point where we actually have a
problem in practice.)

And this series DOES improve the case for preallocated qcow2 images, by virtue of a new qcow2 autoclear bit. But again, that may be something we want to hide in the driver callback interfaces, while qemu-img just blindly calls bdrv_make_zero() and gets success (the image now reads as zeroes, either because it was already that way or we did something quick) or failure (it is a waste of time to prezero, whether by write_zeroes or by trim or by truncate, so just manually write zeroes as part of your image copying).

(We have a use case with convert -n to freshly created image files, but
my position on this on IRC was that we want the --target-is-zero flag
for that anyway: Auto-detection may always break, our preferred default
behavior may always change, so if you want convert -n not to touch the
target image except to write non-zero data from the source, we need a
--target-is-zero flag and users need to use it.  Well, management
layers, because I don’t think users would use convert -n anyway.

And with --target-is-zero and users effectively having to use it, I
don’t think that’s a good example of a use case.)

Yes, there will still be cases where you have to use --target-is-zero
because the image itself couldn't report that it already reads as
zeroes, but there are also enough cases where the destination is already
known to read zeroes and it's a shame to tell the user that 'you have to
add --target-is-zero to get faster copying even though we could have
inferred it on your behalf'.

How is it a shame?  I think only management tools would use convert -n.
  Management tools want reliable behavior.  If you want reliable
behavior, you have to use --target-is-zero anyway.  So I don’t see the
actual benefit for qemu-img convert.

qemu-img convert to an NBD destination cannot create the destination, so it ALWAYS has to use -n. I don't know how often users are likely to wire up a command line for qemu-img convert with NBD as the destination, or whether you are correct that it will be a management app able to supply -n (and thus able to supply --target-is-zero). But at the same time, can a management app learn whether it is safe to supply --target-is-zero? With my upcoming NBD patches, 'qemu-img --list' will show whether the NBD target is known to start life all zero, and a management app could use mechanism to decide when to pass --target-is-zero (whether a management app would actually fork qemu-img --list, or just be an NBD client itself to do the same thing qemu-img would do, is beside the point).

Similarly, this series includes enhancements to 'qemu-img info' on qcow2 images known to currently read as zero; again, that sort of information is necessary somewhere in the chain, whether it be because qemu-img consumes the information itself, or because the management app consumes the information in order to pass the --target-is-zero option to qemu-img, either way, the information needs to be available for consumption.

I suppose there is the point of blockdev-create + blockdev-mirror: This
has exactly the same problem as convert -n.  But again, if you really
want blockdev-mirror not just to force-zero the image, you probably need
to tell it so explicitly (i.e., with a --target-is-zero flag for

(Well, I suppose we could save us a target-is-zero for mirror if we took
this series and had a filter driver that force-reports BDRV_ZERO_OPEN.
But, well, please no.)

But maybe I’m just an idiot and there is no reason not to take this
series and make blockdev-create + blockdev-mirror do the sensible thing
by default in most cases. *shrug*

My argument for taking the series _is_ that the common case can be made
more efficient without user effort.

The thing is, I don’t see the user effort.  I don’t think users use
convert -n or backup manually.  And for management tools, it isn’t
really effort to add another switch.

Maybe, but it is just shifting the burden between who consumes the information that an image is all zero, and how the consumption of that information is passed to qemu-img.

Yes, we still need the knob for
when the common case isn't already smart enough,

But the user can’t know when qemu isn’t smart enough.  So users who care
have to always give the flag.

but the difference in
avoiding a pre-zeroing pass is noticeable when copying images around

I’m sure it is, but the question I ask is whether in practice we
wouldn’t get --target-is-zero in all of these cases anyway.

So I’m not sold on “it works most of the time”, because if it’s just
most of the time, then we’ll likely see --target-is-zero all of the time.

OTOH, I suppose that with the new qcow2 extension, it would always work
for the following case:
(1) Create a qcow2 file,
(2) Immediately (with the next qemu-img/QMP invocation) use it as a
target of convert -n or mirror or anything similar.

Yes, that is one of the immediately obvious fallouts from this series - you can now create a preallocated qcow2 image in one process, and the next process using that image can readily tell that it is still just-created.

If so, that means it works reliably all of the time for a common case.
I guess that’d be enough for me.


(and more than just for qcow2 - my followup series to improve NBD is
similarly useful given how much work has already been invested in
mapping NBD into storage access over https in the upper layers like ovirt).

At any rate, I think you've convinced me to rethink how I present v2 (maybe not by refactoring bdrv_known_zeroes usage, but instead refactoring bdrv_make_zero), but that the qcow2 autoclear bit is still a useful feature to have.

Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3226
Virtualization:  qemu.org | libvirt.org

reply via email to

[Prev in Thread] Current Thread [Next in Thread]