qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

[Qemu-devel] Re: [PATCH] net: delay peer host device delete


From: Michael S. Tsirkin
Subject: [Qemu-devel] Re: [PATCH] net: delay peer host device delete
Date: Mon, 20 Sep 2010 20:59:13 +0200
User-agent: Mutt/1.5.20 (2009-12-10)

On Mon, Sep 20, 2010 at 01:19:48PM -0500, Anthony Liguori wrote:
> On 09/20/2010 01:14 PM, Anthony Liguori wrote:
> >Here's what makes sense to me:
> >
> >1) async device remove + poll device status/removal notification +
> >remove backend
> >
> >The management tool needs to determine when the device is gone and
> >remove the backend.
> >
> >2) sync device remove + remove backend
> >
> >Command does not return until device is removed
> >
> >3) async device and backend removal + poll device/backend removal
> >+ removal notification
> >
> >One command that removes the device and any associated backend.
> >We need to indicate to the management layer when this operation is
> >complete.
> >
> >I think (2) is the most elegant but also the most difficult to
> >implement today.  I think (1) is the least invasive to implement
> >but has the most management tool complexity.  (3) is probably the
> >best compromise in terms of complexity and ease of implementation.
> >
> >Just for comparison, your patch does:
> >
> >4) async device removal + remove backend
> >
> >Whereas remove backend may or may not cause removal depending on
> >whether device removal has happened.  So it's really async removal
> >but it doesn't happen deterministically on it's own.  What happens
> >if you call remove backend before starting async device removal?
> >What if the guest never removes the device?  What if a reset
> >happens?
> >
> >One advantage of (1) is that there is no tricky life cycle
> >considerations.  If we did (3), we would have to think through
> >what happens if a guest doesn't respond to an unplug request.
> 
> BTW, maybe the real problem is that device_del and pci hot unplug
> shouldn't be intimately related.
> 
> We could have a pci_unplug that requested the device be unplugged.
> However, unplugging didn't result in the device being deleted.
> Instead, device_del had to be explicitly called after pci_unplug
> succeeded.
> 
> IRL, if I recall correctly, there's a button that you press that
> sends the ACPI event to the OS.  There usually is an LED associated
> with the slot too and once the device has been unplugged by the OS,
> the LED goes from red to green (or something like that).  At that
> point, the human can physically remove the card from the slot.

PCI express spec has all that.

> You can also initiate the unplug from the OS without the ACPI event
> ever happening.  I suspect that in our current implementation, that
> means that we'll automatically delete the device which may have
> strange effects on management tools.
> 
> So it probably makes sense for our interface to present the same
> procedure.  What do you think?
> 
> Regards,
> 
> Anthony Liguori

We seem to have two discussions here. you speak about how an ideal hot plug
interface will look. This can involve new commands etc.
I speak about fixing existing ones so qemu and/or guest won't crash.
This requires fixing existing commands, unless we can't
fix them at all - which is demonstrably not the case.


> >Regards,
> >
> >Anthony Liguori
> >
> >>>IOW, if device_del returns and the device isn't actually deleted,
> >>>that's a bug and addressing it like this just means we'll trip over
> >>>it somewhere else.
> >>>
> >>>We'll have the same problem with drive_del.
> >>Let's fix it there as well then.
> >>
> >>>Regards,
> >>>
> >>>Anthony Liguori
> >>>
> >>>>>>Signed-off-by: Michael S. Tsirkin<address@hidden>
> >>>>>>---
> >>>>>>  net.c |   21 ++++++++++++++++++++-
> >>>>>>  net.h |    1 +
> >>>>>>  2 files changed, 21 insertions(+), 1 deletions(-)
> >>>>>>
> >>>>>>diff --git a/net.c b/net.c
> >>>>>>index 3d0fde7..10855d1 100644
> >>>>>>--- a/net.c
> >>>>>>+++ b/net.c
> >>>>>>@@ -286,12 +286,31 @@ void qemu_del_vlan_client(VLANClientState *vc)
> >>>>>>      if (vc->vlan) {
> >>>>>>          QTAILQ_REMOVE(&vc->vlan->clients, vc, next);
> >>>>>>      } else {
> >>>>>>+        /* Even if client will not be deleted yet,
> >>>>>>remove it from list so it
> >>>>>>+         * does not appear in monitor.  */
> >>>>>>+        QTAILQ_REMOVE(&non_vlan_clients, vc, next);
> >>>>>>+        /* Detect that guest-visible (NIC) peer is
> >>>>>>active, and delay deletion.
> >>>>>>+         * */
> >>>>>>+        if (vc->peer&&    vc->peer->info->type ==
> >>>>>>NET_CLIENT_TYPE_NIC) {
> >>>>>>+            NICState *nic = DO_UPCAST(NICState, nc, vc->peer);
> >>>>>>+            assert(!nic->peer_deleted);
> >>>>>>+            nic->peer_deleted = true;
> >>>>>>+            return;
> >>>>>>+        }
> >>>>>>          if (vc->send_queue) {
> >>>>>>              qemu_del_net_queue(vc->send_queue);
> >>>>>>          }
> >>>>>>-        QTAILQ_REMOVE(&non_vlan_clients, vc, next);
> >>>>>>          if (vc->peer) {
> >>>>>>              vc->peer->peer = NULL;
> >>>>>>+            /* If this is a guest-visible (NIC) device,
> >>>>>>+             * and peer has already been removed from monitor,
> >>>>>>+             * delete it here. */
> >>>>>>+            if (vc->info->type == NET_CLIENT_TYPE_NIC) {
> >>>>>>+                NICState *nic = DO_UPCAST(NICState, nc, vc);
> >>>>>>+                if (nic->peer_deleted) {
> >>>>>>+                    qemu_del_vlan_client(vc->peer);
> >>>>>>+                }
> >>>>>>+            }
> >>>>>>          }
> >>>>>>      }
> >>>>>>
> >>>>>>diff --git a/net.h b/net.h
> >>>>>>index 518cf9c..44c31a9 100644
> >>>>>>--- a/net.h
> >>>>>>+++ b/net.h
> >>>>>>@@ -72,6 +72,7 @@ typedef struct NICState {
> >>>>>>      VLANClientState nc;
> >>>>>>      NICConf *conf;
> >>>>>>      void *opaque;
> >>>>>>+    bool peer_deleted;
> >>>>>>  } NICState;
> >>>>>>
> >>>>>>  struct VLANState {
> >
> 



reply via email to

[Prev in Thread] Current Thread [Next in Thread]