qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v15 3/8] net/vmnet: implement shared mode (vmnet-shared)


From: Akihiko Odaki
Subject: Re: [PATCH v15 3/8] net/vmnet: implement shared mode (vmnet-shared)
Date: Mon, 7 Mar 2022 15:54:51 +0900

On Mon, Mar 7, 2022 at 1:52 PM Jason Wang <jasowang@redhat.com> wrote:
>
> On Mon, Mar 7, 2022 at 12:25 PM Akihiko Odaki <akihiko.odaki@gmail.com> wrote:
> >
> > On Mon, Mar 7, 2022 at 12:59 PM Jason Wang <jasowang@redhat.com> wrote:
> > >
> > > On Fri, Mar 4, 2022 at 12:37 PM Akihiko Odaki <akihiko.odaki@gmail.com> 
> > > wrote:
> > > >
> > > > On 2022/03/04 10:37, Jason Wang wrote:
> > > > > On Thu, Mar 3, 2022 at 11:43 PM Vladislav Yaroshchuk
> > > > > <vladislav.yaroshchuk@jetbrains.com> wrote:
> > > > >>
> > > > >>
> > > > >>
> > > > >> On Tue, Mar 1, 2022 at 11:21 AM Akihiko Odaki 
> > > > >> <akihiko.odaki@gmail.com> wrote:
> > > > >>>
> > > > >>> On 2022/03/01 17:09, Vladislav Yaroshchuk wrote:
> > > > >>>>       > Not sure that only one field is enough, cause
> > > > >>>>       > we may have two states on bh execution start:
> > > > >>>>       > 1. There are packets in vmnet buffer s->packets_buf
> > > > >>>>       >      that were rejected by qemu_send_async and waiting
> > > > >>>>       >      to be sent. If this happens, we should complete 
> > > > >>>> sending
> > > > >>>>       >      these waiting packets with qemu_send_async firstly,
> > > > >>>>       >      and after that we should call vmnet_read to get
> > > > >>>>       >      new ones and send them to QEMU;
> > > > >>>>       > 2. There are no packets in s->packets_buf to be sent to
> > > > >>>>       >      qemu, we only need to get new packets from vmnet
> > > > >>>>       >      with vmnet_read and send them to QEMU
> > > > >>>>
> > > > >>>>      In case 1, you should just keep calling 
> > > > >>>> qemu_send_packet_async.
> > > > >>>>      Actually
> > > > >>>>      qemu_send_packet_async adds the packet to its internal queue 
> > > > >>>> and calls
> > > > >>>>      the callback when it is consumed.
> > > > >>>>
> > > > >>>>
> > > > >>>> I'm not sure we can keep calling qemu_send_packet_async,
> > > > >>>> because as docs from net/queue.c says:
> > > > >>>>
> > > > >>>> /* [...]
> > > > >>>>    * If a sent callback is provided to send(), the caller must 
> > > > >>>> handle a
> > > > >>>>    * zero return from the delivery handler by not sending any more 
> > > > >>>> packets
> > > > >>>>    * until we have invoked the callback. Only in that case will we 
> > > > >>>> queue
> > > > >>>>    * the packet.
> > > > >>>>    *
> > > > >>>>    * If a sent callback isn't provided, we just drop the packet to 
> > > > >>>> avoid
> > > > >>>>    * unbounded queueing.
> > > > >>>>    */
> > > > >>>>
> > > > >>>> So after we did vmnet_read and read N packets
> > > > >>>> into temporary s->packets_buf, we begin calling
> > > > >>>> qemu_send_packet_async. If it returns 0 - it says
> > > > >>>> "no more packets until sent_cb called please".
> > > > >>>> At this moment we have N packets in s->packets_buf
> > > > >>>> and already queued K < N of them. But, packets K..N
> > > > >>>> are not queued and keep waiting for sent_cb to be sent
> > > > >>>> with qemu_send_packet_async.
> > > > >>>> Thus when sent_cb called, we should finish
> > > > >>>> our transfer of packets K..N from s->packets_buf
> > > > >>>> to qemu calling qemu_send_packet_async.
> > > > >>>> I meant this.
> > > > >>>
> > > > >>> I missed the comment. The description is contradicting with the 
> > > > >>> actual
> > > > >>> code; qemu_net_queue_send_iov appends the packet to the queue 
> > > > >>> whenever
> > > > >>> it cannot send one immediately.
> > > > >>>
> > > > >>
> > > > >> Yes, it appends, but (net/queue.c):
> > > > >> *  qemu_net_queue_send tries to deliver the packet
> > > > >>      immediately. If the packet cannot be delivered, the
> > > > >>      qemu_net_queue_append is called and 0 is returned
> > > > >>      to say the caller "the receiver is not ready, hold on";
> > > > >> *  qemu_net_queue_append does a probe before adding
> > > > >>      the packet to the queue:
> > > > >>      if (queue->nq_count >= queue->nq_maxlen && !sent_cb) {
> > > > >>          return; /* drop if queue full and no callback */
> > > > >>      }
> > > > >>
> > > > >> The queue is not infinite, so we have three cases:
> > > > >> 1. The queue is not full -> append the packet, no
> > > > >>      problems here
> > > > >> 2. The queue is full, no callback -> we cannot notify
> > > > >>      a caller when we're ready, so just drop the packet
> > > > >>      if we can't append it.
> > > > >> 3. The queue is full, callback present -> we can notify
> > > > >>      a caller when we are ready, so "let's queue this packet,
> > > > >>      but expect no more (!) packets is sent until I call
> > > > >>      sent_cb when the queue is ready"
> > > > >>
> > > > >> Therefore if we provide a callback and keep sending
> > > > >> packets if 0 is returned, this may cause unlimited(!)
> > > > >> queue growth. To prevent this, we should stop sending
> > > > >> packets and wait for notification callback to continue.
> > > > >
> > > > > Right.
> > > > >
> > > > >>
> > > > >> I don't see any contradiction with that comment.
> > > > >>
> > > > >>> Jason Wang, I saw you are in the MAINTAINERS for net/. Can you tell 
> > > > >>> if
> > > > >>> calling qemu_send_packet_async is allowed after it returns 0?
> > > > >>>
> > > > >>
> > > > >> It may be wrong, but I think it's not allowed to send
> > > > >> packets after qemu_send_packet_async returns 0.
> > > > >>
> > > > >> Jason Wang, can you confirm please?
> > > > >
> > > > > With a cb, we can't do this. All users with cb will disable the source
> > > > > polling and depend on the cb to re-read the polling.
> > > > > (tap/l2tpv3/socket).
> > > > >
> > > > > Without a cb, we can. As analyzed above, qemu_net_queue_append() can
> > > > > limit the number of packets queued in this case.
> > > >
> > > > vmnet can read multiple packets at once. What about such a case? Isn't
> > > > calling qemu_send_packet_async for already read packet and stopping
> > > > reading more fine?
> > >
> > > It should be fine, I remember I've asked whether or not the source
> > > could be disabled but I get the answer that it can't.
> > >
> > > Thanks
> >
> > It can, but there could be more packets before disabling the source
> > because vmnet reads multiple packets with one call. The procedure is:
> > call qemu_send_packet_async with a callback, and if it returns 0,
> > disable the source and keep calling qemu_send_packet_async with a
> > callback until all the packets already read are queued. Is this kind
> > of procedure allowed?
>
> So it might have several issues if we do this:
>
> 1) there's no guarantee that the following call to
> qemu_send_packet_async() will always return 0, this could happen if
> the device has some rx buffer after the first call to
> qemu_send_packet_asnyc(). Then we may end up with out-of order
> delivery.
> 2) calling with a cb may suppress the queue limit check, buggy driver
> may cause unlimited number of packets to be queued under heavy traffic
>
> It looks to me we'd better store those pending buffers in the vmnet,
> and when sent_cb is triggered, start from those buffers first. (Or
> simply drop the packets)

Thanks for the explanation. Vladislav had an idea of such an
implementation so it would be the option we would choose. Let me
review the design again.

On 2022/02/28 20:59, Vladislav Yaroshchuk wrote:
> It can be done kinda this way:
> ```
> struct s:
>      send_bh:                    bh
>      packets_buf:              array[packet]
>      ## Three new fields
>      send_enabled:           bool

send_enabled can be eliminated. When it is enabled, packets_send_pos
and packets_batch_size must be equal. They must not be equal
otherwise. packets_send_pos must represent the position of the packet
which is not sent yet, possibly in the process of sending.
vmnet_send_completed must call qemu_send_wrapper before scheduling
send_bh. bh_send should do nothing if s->packets_send_pos <
s->packets_batch_size.

>      packets_send_pos:    int
>      packets_batch_size:  int

The name packets_batch_size lacks the context that it is used for
sending. The relation between the two members is not obvious either.
Perhaps they could be named packets_send_current_pos and
packets_send_end_pos.

>
> fun bh_send(s):
>      ## Send disabled - do nothing
>      if not s->send_enabled:
>          return
>      ## If some packets are waiting to be sent in
>      ## s->packets_buf - send them
>      if s->packets_send_pos < s->packets_batch_size:
>          if not qemu_send_wrapper(s):
>              return
>
>      ## Try to read new packets from vmnet
>      s->packets_send_pos = 0
>      s->packets_batch_size = 0
>      s->packets_buf = vmnet_read(&s->packets_batch_size)
>      if s->packets_batch_size > 0:
>          # If something received - process them
>          qemu_send_wrapper(s)
> fun qemu_send_wrapper(s):
>      for i in s->packets_send_pos to s->packets_batch_size:
>          size = qemu_send_async(s->packets_buf[i],
>                                                    vmnet_send_completed)
>          if size == 0:
>              ## Disable packets processing until vmnet_send_completed
>              ## Save the state: packets to be sent now in s->packets_buf
>              ## in range s->packets_send_pos..s->packets_batch_size
>              s->send_enabled = false
>              s->packets_send_pos = i + 1
>              break
>          if size < 0:
>              ## On error just drop the packets
>              s->packets_send_pos = 0
>              s->packets_batch_size = 0

We could ignore just one packet. Although it is unlikely, something
specific to the packet may be wrong and other packets may be fine.

>              break
>
>       return s->send_enabled
>
>
> fun vmnet_send_completed(s):
>      ## Complete sending packets from s->packets_buf
>      s->send_enabled = true
>      qemu_bh_schedule(s->send_bh)
>
> ## From callback we only scheduling the bh
> vmnet.set_callback(callback = s: qemu_bh_schedule(s->send_bh))
> ```
>
> I think a simpler implementation may exist, but currently
> I see only this approach with the send_enabled flag and
> two more fields to save packets sending state.

Aside from the send_enabled flag, I agree it is the simplest.

Regards,
Akihiko Odaki



reply via email to

[Prev in Thread] Current Thread [Next in Thread]