Re: [Qemu-devel] [PATCH v2] doc: Add NBD_CMD_BLOCK_STATUS extension

From: Eric Blake
Subject: Re: [Qemu-devel] [PATCH v2] doc: Add NBD_CMD_BLOCK_STATUS extension
Date: Thu, 7 Apr 2016 10:10:58 -0600
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.7.1

On 04/07/2016 04:38 AM, Vladimir Sementsov-Ogievskiy wrote:
> On 05.04.2016 16:43, Paolo Bonzini wrote:
>> On 05/04/2016 06:05, Kevin Wolf wrote:
>>> The options I can think of are adding a request field "max number of
>>> descriptors" or a flag "only single descriptor" (with the assumption
>>> that clients always want one or unlimited), but maybe you have a better
>>> idea.
>> I think a limit is better.  Even if the client is ultimately going to
>> process the whole file, it may take a very long time and space to
>> retrieve all the descriptors in one go.  Rather than query e.g. 16GB at
>> a time, I think it's simpler to put a limit of 1024 descriptors or so.
>> Paolo
> I vote for the limit too. Moreover, I think there should be a limit on
> both sides:
> 1. The client can specify the limit, so the server should not return more
> extents than requested. Of course, the server should choose sequential
> extents from the beginning of the requested range.

For the client to request a limit would entail that we enhance the
protocol to allow structured requests (where a wire-sniffer would know
how many bytes to read for the client's additional data, even if it does
not understand the extension's semantics).  Might not be a bad idea to
have this in the long run, but so far I've been reluctant to bite the
bullet.

> 2. Server-side limit: if the client asks for too many extents or does not
> specify a limit at all, the server should not return all extents, but only
> 1024 (for example) from the beginning of the range.

Okay, I'm fairly convinced now that letting the server limit the reply
is a good thing, and that one doesn't require a structured request from
the client.  Since we just recently documented that strings should be no
more than 4096 bytes, and my v2 proposal used 8 bytes per descriptor,
maybe a good way to enforce a similar limit would be:

The server MAY choose to send fewer descriptors than what would describe
the full extent of the client's request, but MUST send at least one
descriptor unless an error is reported.  The server MUST NOT send more
than 512 descriptors, even if that does not completely describe the
client's requested length.

That way, a client in general should never expect more than ~4096 bytes
+ overhead on any server reply except a reply to NBD_CMD_READ, and can
therefore utilize stack allocation for all other replies.  (If we do
this, maybe we should make a hard rule that all future protocol
extensions, other than NBD_CMD_READ, will guarantee that a reply has a
bounded size.)
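
As a minimal sketch of the cap described above: the 512-descriptor limit
and the 8 bytes per descriptor come from this thread, but the layout used
here (a 32-bit length followed by a 32-bit status, big-endian) and the
name build_status_payload are assumptions for illustration, not the
proposal's wording.

```python
import struct

MAX_DESCRIPTORS = 512               # cap proposed in this thread
DESCRIPTOR = struct.Struct(">II")   # ASSUMED layout: 32-bit length, 32-bit status

def build_status_payload(extents):
    """Pack at most MAX_DESCRIPTORS (length, status) pairs, in order.

    `extents` is a list of (length, status) tuples that starts at the
    offset the client asked about. Hypothetical helper, not part of any
    existing NBD implementation.
    """
    if not extents:
        # A reply that is not an error must carry at least one descriptor.
        raise ValueError("need at least one descriptor")
    limited = extents[:MAX_DESCRIPTORS]   # keep the leading extents only
    return b"".join(DESCRIPTOR.pack(length, status) for length, status in limited)
```

With 8-byte descriptors the payload tops out at 512 * 8 = 4096 bytes, which
is the bound that would let a client use a fixed-size buffer.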

I also think it may be okay to let the server reply with MORE data than
the client requested, but only as long as it does not result in any
extra descriptors (that is, only the last descriptor can result in a
length beyond the client's request).  For example, if the client asks
for block status of 1M of the file, but the server can conveniently
learn via lseek(SEEK_HOLE) or other means that there are 2M of data
before status changes, then there's no reason to force the server to
throw away the information about the 1M beyond the client's read, and
the client might even be able to be more efficient in later requests.
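
A rough sketch of that rule, under assumptions: the server already has a
sorted, contiguous extent map of (start, end, status) tuples (for example
gathered via lseek(SEEK_HOLE)), and the status values and the name
describe_range are placeholders invented here. Only the last descriptor
is allowed to run past the end of the request.

```python
STATUS_DATA = 0   # placeholder status values, not taken from the proposal
STATUS_HOLE = 1

def describe_range(extent_map, offset, length, max_descriptors=512):
    """Return (length, status) descriptors for a request.

    Descriptors start at `offset`. An extent is reported in full even if
    it ends after offset + length, but no extent that begins at or after
    the end of the request is added.
    """
    end = offset + length
    out = []
    for start, stop, status in extent_map:
        if stop <= offset:
            continue                      # entirely before the request
        if start >= end or len(out) == max_descriptors:
            break                         # would add an extra descriptor
        out.append((stop - max(start, offset), status))
    return out

MiB = 1024 * 1024
example_map = [(0, 2 * MiB, STATUS_DATA), (2 * MiB, 3 * MiB, STATUS_HOLE)]
# The client asks about the first 1M; the reply covers the whole 2M extent.
reply = describe_range(example_map, 0, 1 * MiB)
```

Here `reply` holds a single descriptor of length 2M, so the client learns
about the extra 1M without any additional descriptor being sent.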

> 2.1 And/or, why not allow the server to use the power of structured reply
> and send several reply chunks? Why did you forbid this? (If I correctly
> understand "This chunk type MUST appear at most once in a structured
> reply.")

If we allow more than one chunk, then either every chunk has to include
an offset (more traffic over the wire), or the chunks have to be sent in
a particular order (we aren't gaining any benefits that NBD_CMD_READ
gains by allowing out-of-order transmission).  It's also more work for
the client if it has to reassemble the reply.  With NBD_CMD_READ, the
payload is dominated by the data being read, and the client can pwrite()
each chunk into its final location.  With NBD_CMD_BLOCK_STATUS, the
payload is dominated by the metadata, which we want to keep minimal, and
there is no convenient command for the client to reassemble the
information if it is received out of order.
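
To illustrate why a single in-order chunk needs no per-descriptor offset,
here is a small client-side sketch. It assumes the same 8-byte descriptor
layout (32-bit length, 32-bit status, big-endian), which is an assumption
about the wire format, and decode_extents is a hypothetical name.

```python
import struct

def decode_extents(request_offset, payload):
    """Turn an in-order descriptor payload into (offset, length, status).

    Each extent begins where the previous one ended, so the offsets are
    recovered with a running sum starting at the requested offset.
    """
    if len(payload) == 0 or len(payload) % 8 != 0:
        raise ValueError("payload must hold one or more 8-byte descriptors")
    extents = []
    offset = request_offset
    for length, status in struct.iter_unpack(">II", payload):
        extents.append((offset, length, status))
        offset += length
    return extents
```

If chunks could arrive out of order, this running sum would no longer
work, and every chunk would have to carry its own starting offset.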

Allowing for a short reply seems to be worth doing, but allowing for
multiple reply chunks seems not worth the risk.

I'm also starting to think that it is worth FIRST documenting an
extension for advertising block sizes, so that we can then couch
BLOCK_STATUS in those terms (a server MUST NOT subdivide status into
finer granularity than the advertised block sizes).
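
A client could sanity-check that constraint along these lines. This is
only a sketch: the block-size extension does not exist yet, so the
block_size parameter and the name respects_granularity are hypothetical,
and a real rule might need an exception for an extent at the end of the
export.

```python
def respects_granularity(descriptors, block_size):
    """True if every (length, status) descriptor has a positive length
    that is a whole multiple of the advertised block size."""
    return all(length > 0 and length % block_size == 0
               for length, _status in descriptors)
```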

Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org
