Re: [PATCH 2/2] 9pfs: fix init_in_iov_from_pdu truncating size


From: Stefano Stabellini
Subject: Re: [PATCH 2/2] 9pfs: fix init_in_iov_from_pdu truncating size
Date: Wed, 13 May 2020 16:31:30 -0700 (PDT)
User-agent: Alpine 2.21 (DEB 202 2017-01-01)

On Wed, 13 May 2020, Christian Schoenebeck wrote:
> On Mittwoch, 13. Mai 2020 01:24:16 CEST Stefano Stabellini wrote:
> > Let me premise that this patch fixes an important bug, so I am not
> > asking you to do any more work to get this patch committed right now :-)
> 
> Actually, I realized that this overall issue is much more delicate to handle
> than I thought. My patch would still miss other important things, like e.g. it
> does not roll back the read / readdir streaming positions of the respective
> FID on 9P server side. -> data loss / undefined behaviour on subsequent read /
> readdir
> 
> Also the 'count' parameter in the Rread response would need to be updated as 
> well. -> undefined behaviour
> 
> For these and other reasons already described, I start to think that it is
> actually not a good idea to truncate the response at all, at least not at that
> final transport stage (as my or your patch do).
> 
> Question: is there really no way to handle this overall issue exclusively on
> Xen side, i.e. delaying response delivery until a large enough transport buffer
> becomes available, or acquiring the required buffer size on demand? Or by
> looking ahead at msize in advance (see below)?
> 
> From what I see now, I would recommend completely reverting response
> truncation on virtio transport side at least. Because so far I have not seen a
> single case with virtio where the transport buffer was too small for
> delivering any response. And as you can see, response truncation at late
> transport level opens all kinds of severe issues on other ends. It is not easy
> to do and requires a lot of code!
> 
> > But I think it would be good to test what would happen if the client did
> > a read on a directory with hundreds of entries, such as a maildir
> > directory. There has to be point where the number of directory entries
> > is larger than the shared buffer. What happens then?
> 
> I'm talking about virtio only now: that scenario does not happen. When client
> opens a 9P session, client transmits 'msize' as parameter for the 9P session,
> which is the "maximum message size" ever to be assumed on both sides for
> client requests and server responses. By default msize is 4 kiB (4096 bytes)
> with Linux clients, and our server would immediately error out if any client
> tries to open a session with 'msize' < 4 kiB (for similar reasons discussed;
> because some 9P response types cannot be truncated at all):
> 
> https://github.com/qemu/qemu/commit/e16453a31a00c1c0a199cab0617e8aa888f6419a#diff-f3e98ed0a65e27743b14785fa85b7d79
> 
> The Rread and Rreaddir responses are already truncated (in a sane and correct
> way) in 9p.c by server's request handler in case the response would exceed
> 'msize'. For that reason it is actually unexpected that transport would not be
> able to provide the required transport buffer size for any response.
>
> > I am guessing that we'll have to implement the "truncate at the
> > boundaries of individual entries" to get it right in all cases.
> > 
> > Given that it doesn't look like truncation would work right with
> > Rreaddir anyway today, I think it would be OK to fix it in a separate
> > patch.
> 
> If there is really no way with Xen to ensure there's always a buffer size 
> according to msize, 

Thank you for the long writeup. Let's see if we can come up with a
good plan.

The Xen transport
(https://xenbits.xenproject.org/docs/4.13-testing/misc/9pfs.html) has a
shared memory area with 64 pages, so 256KB. It is split in half: 128KB
for requests and 128KB for responses. 
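Just to spell out the arithmetic (a back-of-the-envelope sketch using the
XEN_FLEX_RING_SIZE macro from the Xen ring headers, not code taken from QEMU):

#include <stdio.h>

/* Sketch only: how the 64-page data area splits into the two halves.
 * XEN_PAGE_SHIFT / XEN_FLEX_RING_SIZE mirror the Xen ring headers. */
#define XEN_PAGE_SHIFT             12
#define XEN_FLEX_RING_SIZE(order)  (1UL << ((order) + XEN_PAGE_SHIFT - 1))

int main(void)
{
    unsigned int order = 6;                                  /* 64 pages     */
    unsigned long total = 1UL << (order + XEN_PAGE_SHIFT);   /* 256KB shared */
    unsigned long half = XEN_FLEX_RING_SIZE(order);          /* 128KB each   */

    printf("total=%luKB requests=%luKB responses=%luKB\n",
           total / 1024, half / 1024, half / 1024);
    return 0;
}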

The original error message is the following:

  xen be: 9pfs-0: xen be: 9pfs-0: Xen 9pfs request type 116 needs 126987 bytes, buffer has 126965
  Xen 9pfs request type 116needs 126987 bytes, buffer has 126965

and msize was set to 131072 in the Linux client.


So transport_size == msize. However, there can be multiple requests and
responses inflight at any given time. It means that even with
transport_size=128KB and msize=4KB we could still have so many outstanding
requests that the available memory is less than 4KB: for instance, 32
in-flight 4KB responses would already fill the 128KB response half, leaving
no room at all for the next one.

I admit that with transport_size=128KB and msize=4KB it would be so
unlikely to happen that it would be close to impossible to reproduce the
error. But theoretically it is possible.

If we can't come up with any better ideas, we could simply limit the
msize in the Linux client to something like transport_size/2 or
transport_size/4, but it doesn't feel right.
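To make it concrete, the clamp itself would be trivial; something like this
hypothetical helper (none of these names exist in the Linux client today):

/* Hypothetical sketch: cap the negotiated msize to a fraction of the
 * response ring so that a handful of in-flight replies cannot exhaust it. */
static unsigned int xen_9p_clamp_msize(unsigned int requested_msize,
                                       unsigned int ring_size)
{
    unsigned int limit = ring_size / 4;   /* e.g. 128KB / 4 = 32KB */

    return requested_msize < limit ? requested_msize : limit;
}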


> > I think we have to return here because the connection gets closed by
> > xen_9pfs_disconnect. I wonder if we should return -EAGAIN like you
> > suggested instead of calling xen_9pfs_disconnect() which is
> > irrecoverable. But that could be done later in a separate patch.
> 
> Even -EAGAIN would require a bunch of code, since the precise error response
> type depends a) on the protocol dialect (9p2000.L vs. 9p2000.u) and b) even
> on the precise request type. For some 9P request types it is not even
> allowed to return an error response at all.
> 
> As you can see it all boils down to one thing: better look for a solution for 
> Xen to deliver the required buffer size, or in worst case: yield the 
> coroutine, i.e. delay delivery of response until that can be assured by Xen.

That might be the best option. But would it have to be
xen_9pfs_init_in_iov_from_pdu calling qemu_coroutine_yield() in a loop?
Like this:


diff --git a/hw/9pfs/xen-9p-backend.c b/hw/9pfs/xen-9p-backend.c
index 18fe5b7c92..ef1db47d0c 100644
--- a/hw/9pfs/xen-9p-backend.c
+++ b/hw/9pfs/xen-9p-backend.c
@@ -197,18 +197,13 @@ static void xen_9pfs_init_in_iov_from_pdu(V9fsPDU *pdu,
     g_free(ring->sg);
 
     ring->sg = g_new0(struct iovec, 2);
-    xen_9pfs_in_sg(ring, ring->sg, &num, pdu->idx, *size);
 
+again:
+    xen_9pfs_in_sg(ring, ring->sg, &num, pdu->idx, *size);
     buf_size = iov_size(ring->sg, num);
-    if (buf_size  < P9_IOHDRSZ) {
-        xen_pv_printf(&xen_9pfs->xendev, 0, "Xen 9pfs request type %d"
-                "needs %zu bytes, buffer has %zu, less than minimum\n",
-                pdu->id, *size, buf_size);
-        xen_be_set_state(&xen_9pfs->xendev, XenbusStateClosing);
-        xen_9pfs_disconnect(&xen_9pfs->xendev);
-    }
     if (buf_size  < *size) {
-        *size = buf_size;
+        qemu_coroutine_yield();
+        goto again;
     }
 
     *piov = ring->sg;


How is the coroutine going to get scheduled again? Is it automatic or do
we need to call it again explicitly when we get the next notification
from the client, which should arrive as soon as the client finishes
reading the responses?
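
If it is not automatic, I guess the notification handler would have to wake
the coroutine up explicitly. Roughly along these lines (only a sketch: the
co_waiting field and this wakeup do not exist today, and I have not thought
about the locking):

/* In xen_9pfs_init_in_iov_from_pdu(), before yielding: remember which
 * coroutine is waiting for ring space (hypothetical co_waiting field). */
    ring->co_waiting = qemu_coroutine_self();
    qemu_coroutine_yield();
    ring->co_waiting = NULL;

/* In the event channel handler / bottom half, once the client has notified
 * us that it consumed responses, re-enter the waiting coroutine: */
    if (ring->co_waiting) {
        qemu_coroutine_enter(ring->co_waiting);
    }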


