qemu-devel

Re: [Qemu-devel] [Qemu-block] [PATCH] doc: Propose NBD_FLAG_INIT_ZEROES


From: Eric Blake
Subject: Re: [Qemu-devel] [Qemu-block] [PATCH] doc: Propose NBD_FLAG_INIT_ZEROES extension
Date: Tue, 6 Dec 2016 09:21:42 -0600
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Thunderbird/45.4.0

On 12/06/2016 03:25 AM, Kevin Wolf wrote:
> On 06.12.2016 at 00:42, Eric Blake wrote:
>> While not directly related to NBD_CMD_WRITE_ZEROES, the qemu
>> team discovered that it is useful if a server can advertise
>> whether an export is in a known-all-zeroes state at the time
>> the client connects.
> 
> Does a server usually have the information to set this flag, other than
> querying the block status of all blocks at startup? If so, the client
> could just query this by itself.

Well, only if the client can query information at all (we don't have the
documentation finished for extent queries, let alone a reference
implementation).

> 
> The patch that was originally sent to qemu-devel just forwarded qemu's
> .bdrv_has_zero_init() call to the server. However, what this function
> returns is not a known-all-zeroes state on open, but just a
> known-all-zeroes state immediately after bdrv_create(), i.e. creating a
> new image. Then it becomes information that is easy to get and doesn't
> involve querying all blocks (e.g. true for COW image formats, true for
> raw on regular files, false for raw on block devices).

Just because the NBD spec describes the bit does NOT mean that servers
HAVE to set it on every image that is all zeroes.  It is perfectly
compliant for a server to never advertise the bit.  That said, I think
there are cases where qemu can easily advertise the bit.

I _do_ agree that it is NOT as trivial as the qemu server just
forwarding the value of .bdrv_has_zero_init() - the server HAS to prove
that no data has been written to the image.  But for a qcow2 image just
created with qemu-img, it is a fairly easy proof: If the L1 table has
all-zero entries, then the image has not been written to yet.  Reading
the L1 table for all-zeroes is only a single cluster read, which is MUCH
faster than crawling the entire image for extent status.  And for
regular files, a single lseek(SEEK_DATA) is sufficient to see if the
entire image is currently sparse.
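
To make the two proofs above concrete, here is a rough sketch in C. It
is not qemu code: the function names are made up for illustration, the
L1 check assumes the table has already been read into memory as an
array of 64-bit entries, and the lseek() check assumes a filesystem
that reports holes through SEEK_DATA (one that does not will simply
report data at offset 0, which errs on the safe side).

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE             /* exposes SEEK_DATA on glibc */
#endif
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: returns 1 if the file contains no data extents
 * (it is one big hole, or empty), 0 if some data exists, and -1 on any
 * other error with errno set.  Note that a successful lseek() moves
 * the file offset of fd.
 */
int file_is_all_holes(int fd)
{
    assert(fd >= 0);
    off_t pos = lseek(fd, 0, SEEK_DATA);
    if (pos >= 0) {
        return 0;               /* data found at or after offset 0 */
    }
    /* ENXIO means there is no data between offset 0 and end of file */
    return errno == ENXIO ? 1 : -1;
}

/* Hypothetical convenience wrapper that works on a path. */
int path_is_all_holes(const char *path)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0) {
        return -1;
    }
    int ret = file_is_all_holes(fd);
    int saved_errno = errno;    /* close() may clobber errno */
    close(fd);
    errno = saved_errno;
    return ret;
}

/*
 * Hypothetical helper: returns 1 if every entry of an in-memory L1
 * table is zero, meaning no L2 table has been allocated yet.
 */
int l1_table_is_unallocated(const uint64_t *l1, size_t entries)
{
    for (size_t i = 0; i < entries; i++) {
        if (l1[i] != 0) {
            return 0;
        }
    }
    return 1;
}
```

Either check only shows that nothing has been allocated; it still has
to be combined with .bdrv_has_zero_init() of the underlying format
before the server could advertise anything.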

Note that I only proposed the NBD implementation - it still remains to
be coded into the qemu code for the client to make use of the bit
(fairly easy: if the bit is set, the client can make its own
.bdrv_has_zero_init() return true), as well as for the server to set the
bit (harder: the server has to check .bdrv_has_zero_init() of the
wrapped image, but also has to prove the image is still unwritten).
Maybe this means that qemu's block layer wants to add a new
.bdrv_has_been_written() [or whatever name] to better abstract the proof
across drivers.  But those patches would be qemu 2.9 material, and do
not need to further cc the NBD list.
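
As a sketch of how the two sides might fit together (again, not actual
qemu code): the bit position below is a placeholder I picked for the
example, not the value from the proposal, and has_been_written stands
in for whatever the hypothetical .bdrv_has_been_written() would
report.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* ASSUMPTION: placeholder bit, not the value proposed for the spec. */
#define EXAMPLE_NBD_FLAG_INIT_ZEROES (1u << 15)

/* NBD transmission flags are 16 bits wide; the placeholder must fit. */
static_assert((EXAMPLE_NBD_FLAG_INIT_ZEROES & 0xFFFFu) ==
              EXAMPLE_NBD_FLAG_INIT_ZEROES, "flag must fit in 16 bits");

/* Hypothetical summary of what the server knows about its image. */
struct export_state {
    bool has_zero_init;     /* result of .bdrv_has_zero_init() */
    bool has_been_written;  /* result of the not-yet-existing proof */
};

/*
 * Server side: advertise the bit only when the format reads as zeroes
 * after creation AND nothing has been written since.
 */
uint16_t server_export_flags(uint16_t base_flags,
                             const struct export_state *s)
{
    if (s->has_zero_init && !s->has_been_written) {
        base_flags |= EXAMPLE_NBD_FLAG_INIT_ZEROES;
    }
    return base_flags;
}

/*
 * Client side: the NBD driver's .bdrv_has_zero_init() can simply
 * mirror the advertised bit.
 */
bool client_has_zero_init(uint16_t export_flags)
{
    return (export_flags & EXAMPLE_NBD_FLAG_INIT_ZEROES) != 0;
}
```

The asymmetry is deliberate: a server that cannot cheaply prove the
image is unwritten just leaves the bit clear, which is always
compliant.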

> 
> This is useful for 'qemu-img convert', which creates an image and then
> writes the whole contents, but I'm not sure if this property is
> applicable for NBD, which I think doesn't even have a create operation.

Another option on the NBD server side is to create a server option -
when firing up a server to serve a particular file as an export, the
user can explicitly tell the server to advertise the bit because the
user has side knowledge that the file was just created (and then the
burden of misbehavior is on the user if they mistakenly request the
advertisement when it is not true).
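
A minimal sketch of that idea, with a made-up option name (no existing
server has "--assume-init-zeroes" as far as I know; it is purely
illustrative):

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* ASSUMPTION: hypothetical option name, not a real server option. */
#define HYPOTHETICAL_INIT_ZEROES_OPT "--assume-init-zeroes"

/*
 * Returns true if the user passed the hypothetical option.  Scanning
 * stops at "--" so that a file name cannot be mistaken for the option.
 */
bool user_asserts_init_zeroes(int argc, char *const argv[])
{
    assert(argc == 0 || argv != NULL);
    for (int i = 1; i < argc; i++) {
        if (strcmp(argv[i], "--") == 0) {
            break;
        }
        if (strcmp(argv[i], HYPOTHETICAL_INIT_ZEROES_OPT) == 0) {
            return true;
        }
    }
    return false;
}

/*
 * The server advertises the bit if it proved the image is unwritten,
 * or if the user took responsibility for that claim.
 */
bool should_advertise_init_zeroes(bool user_asserts, bool server_proved)
{
    return user_asserts || server_proved;
}
```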

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org
