qemu-block
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [RFC v2] nbd/proto: introduce extended request and 64bit commands


From: Vladimir Sementsov-Ogievskiy
Subject: Re: [RFC v2] nbd/proto: introduce extended request and 64bit commands
Date: Wed, 22 Apr 2020 16:06:16 +0300
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Thunderbird/60.2.1

18.03.2020 16:10, Vladimir Sementsov-Ogievskiy wrote:
18.03.2020 15:22, Eric Blake wrote:
On 3/18/20 3:04 AM, Wouter Verhelst wrote:
On Wed, Mar 18, 2020 at 09:19:45AM +0300, Vladimir Sementsov-Ogievskiy wrote:
OK, understand. Reasonable thing, agreed. I'll resend.

Hmm. But we can't read in one go anyway, because we need to distinguish 
NBD_REQUEST_MAGIC
from NBD_EXTENDED_REQUEST_MAGIC..

Yes, that needs to happen at any rate, indeed. So the difference will be
two reads rather than three, instead of one read rather than two.

That's still an advantage.

Not much of one.  You're micro-optimizing the read()s, but in reality, the 
sender will likely send() the entire packet at once, at which point the data is 
in the kernel and the reads will succeed back-to-back, or you can even write 
the client to read into a buffer to minimize syscalls and then parse as much as 
needed out of the buffer.

You've got a LOT more overhead in the TCP packet and network transmission time 
than you do in deciding whether the server has to do 2 or 3 read()s per client 
message.

While it might be nice to design a message that doesn't need the server to do 
additional decision points mid-packet in determining how much packet remains, 
that should not be your driving factor. Even with current servers, that is 
already the case (the server has to decide mid-packet whether it is handling 
NBD_CMD_WRITE and thus has more data to read).


I think, that modern client will use mostly NBD_EXTENDED_REQUEST_MAGIC 
interface, so it will
be great to optimize it. But to read extended request in one go, we should make 
it
shorter than simple request, and it doesn't seem possible.

May be we should not support simple and extended requests together? May be 
better to force
using only extended requests if they are negotiated? Then we will read extended 
request in
on go, and we may just define it like

As extended requests already have to be negotiated, and no client nor server exists yet that supports them, we can indeed declare that on successful handshake of the new feature, a client may ONLY send extended requests.  However, a server already has to handle both packet types (if the client doesn't request the feature) and a client already has to be able to send both packet types (for fallback when talking to a server that lacks the feature), so what does it buy us to require that when the feature is negotiated, only extended packets may be sent?  I guess it boils down to whether the server implementation is simplified or made more complex, depending on whether we state that once negotiated both packet types are allowed (server must decide on a per-packet basis which type it is receiving) or whether only extended packets are allowed (server must now be more stateful in order to reject an ill-behaved client that sends wrong type).  In fact, I argue that a server that replies to an extended packet even when an ill-behaved client that forgot to negotiate them is a reasonable server implementation (the client can't depend on that behavior, though).

Hmm, so the restriction doesn't help us, as we'll any way try to handle "wrong" 
type of the message.. Then, we, any way, will have two reads for extended request (if it 
is larger than simple request), as on first we should understand is it extended or not. 
So, I again don't see any benefit in forcing offset and length to be in the header, it's 
just less portable for the future.

So, if we go with my proposal as is, how will it work?

Smallest extended request is of 20 bytes length. Simple request is 28 bytes..

So:

1. read 20 bytes

2. if extended, read the tail defined by length
    else, read the tail, corresponding to simple request

- we always have two reads.

===

If we extened extended request headers by some fields, we can only optimize 
simple request, so it will be

1. read 28 bytes

2. if simple WRITE read corresponding tail
    if another simple request - we are done
    if extended, read corresponding tail. - For this, payload_length field 
should be placed in first 28 bytes, so it can't follow offset and bytes fields.

So, it will look like

C: 32 bits, 0x23876289, magic (`NBD_EXTENDED_REQUEST_MAGIC`)
C: 16 bits, flags
C: 16 bits, type
C: 64 bits, handle
C: 32 bits, length of payload (unsigned)
C: 64 bits, offset
C: 64 bits, length of requested region
C: *length* bytes of payload data (if *length* is nonzero)

which is again, a bit strange, and the only benefit is one read instead of two 
on simple request handling (except for WRITE), when most of requests will be 
extended anyway.

So, I'm for first scheme and my original proposal, as it is more flexible for 
future extensions.. Hmm about non-offset-size commands:
I can imagine Qemu extension, which will export qapi block-layer interface 
through NBD, why not?



Ping? What do you think? Can we proceed with my patch as is? Or I didn't 
convince you?


--
Best regards,
Vladimir



reply via email to

[Prev in Thread] Current Thread [Next in Thread]