Re: [Qemu-devel] [Nbd] [PATCH 2/2] NBD proto: add GET_LBA_STATUS extension
Mon, 04 Apr 2016 12:32 +0200
On Monday 28 March 2016 09:58:52 Eric Blake wrote:
> On 03/25/2016 02:49 AM, Wouter Verhelst wrote:
> >> You may also want to add a rule that for all future extensions, any
> >> command that requires data in the server response (other than the server
> >> response to NBD_CMD_READ) must include a 32-bit length as the first
> >> field of its data payload, so that the server reply is always structured.
> > Right.
> >> Hmm, I still think it would be hard to write a wireshark dissector for
> >> server replies, since there is no indication whether a given reply will
> >> include data or not.
> > Well, a wireshark dissector already exists. However, it is very old; it
> > doesn't support the NBD_CMD_TRIM or NBD_CMD_FLUSH commands, it doesn't
> > support the NBD_CMD_FLAG_FUA option to NBD_CMD_WRITE, and it doesn't
> > deal with negotiation at all. It was written when newstyle negotiation
> > didn't exist yet, and scanning for INIT_PASSWD at the time was too hard,
> > according to the guy who wrote it (don't remember who that was).
> And I've never written a wireshark dissector myself, to know how easy or
> hard it would be to extend that work.
> >> The _client_ knows (well, any well-written client
> >> that uses a different value for 'handle' for every command sent to the
> >> server), because it can read the returned 'handle' field to know what
> >> command it matches to and therefore what data to expect; but a
> >> third-party observer seeing _just_ the server messages has no idea which
> >> server responses should have payload.
> > It can if it knows enough about the protocol, but granted, that makes it
> > harder for us to extend the protocol without breaking the dissector.
> >> Scanning the stream and assuming that a new reply starts each time
> >> NBD_REPLY_MAGIC is encountered is a close approximation, but hits
> >> false positives if the data being returned for NBD_CMD_READ contains
> >> the same magic number as part of its contents.
> > Indeed.
> One benefit of TCP: we can rely on packet boundaries (whether or not
> fragmentation is also happening); I'm not sure if UDP shares the same
> benefits (then again, I'm not even sure if UDP is usable for the NBD
> protocol, since we definitely rely on in-order delivery of packets: a
> read command can definitely return more bytes than even a jumbo frame
> can contain, even though we do allow out-of-order delivery of replies).
> So if the server always sends each reply as the start of its own packet
> (rather than coalescing multiple replies into a single network packet),
> and the dissector only looks for magic numbers at the start of a packet,
> then any server packet not starting with the magic number must be data
> payload, and you could potentially even avoid the false positives by
> choosing to break data replies into packets by adjusting packet lengths
> by one byte any time the next data chunk would otherwise happen to start
> with the same two bytes as the magic number. But I haven't actually
> tested any of this, to know if packet coalescing goes on, and I
> certainly don't think it is worth complicating the server to avoid false
> positive magic number detection by splitting read payloads across packet
> boundaries, just for the benefit of wire-sniffing.
> > I like this. However, before committing to it, I would like to see
> > Markus' feedback on that (explicit Cc added, even though he's on the
> > list, too).
> > We'd also need a way to communicate the ability to speak this protocol
> > from the kernel to userspace before telling the server that the client
> > understands something which maybe its kernel doesn't.
> proto.md already documents that ioctl()s exist for the user space to
> inform the kernel about options sent by the server prior to kicking off
> transmission phase, and that the NBD protocol itself does not go into
> full detail about available ioctl()s (the kernel/userspace coordination
> does not affect the protocol). It seems fairly straightforward for the
> kernel to supply another ioctl() that userspace can use to learn whether
> it is allowed or forbidden from advertising structured reply status
> during the handshake phase (including the case where the ioctl() is not
> present being treated as the client must not enable structured replies).
Depending on the implementation, we may not even need to communicate the
negotiated NBD protocol variant from userspace to the kernel. The
structured reply uses a different magic number, and the magic stays at
the beginning of the message, so the kernel could simply inspect the
magic to decide which kind of message it has received. Of course, that
would require an extra receive per reply, which may be too slow.
For the other direction, giving userspace information about the
capabilities of the kernel implementation, it may be better to use a
sysfs entry. All current ioctls operate on the exact nbd device they
are issued on, but implementation capabilities are a common property of
all nbd devices.
Pengutronix e.K.                           | Industrial Linux Solutions
Peiner Str. 6-8, 31137 Hildesheim, Germany | http://www.pengutronix.de/
Amtsgericht Hildesheim, HRA 2686           | Phone: +49-5121-206917-0
                                           | Fax:   +49-5121-206917-5555