qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [PATCH 0/3] v4 Decouple block device removal from devic


From: Markus Armbruster
Subject: Re: [Qemu-devel] [PATCH 0/3] v4 Decouple block device removal from device removal
Date: Fri, 05 Nov 2010 14:27:49 +0100
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/23.1 (gnu/linux)

Ryan Harper <address@hidden> writes:

> * Michael S. Tsirkin <address@hidden> [2010-11-03 16:46]:
>> On Wed, Nov 03, 2010 at 03:59:29PM -0500, Ryan Harper wrote:
>> > * Michael S. Tsirkin <address@hidden> [2010-11-03 13:03]:
>> > > On Wed, Nov 03, 2010 at 12:29:10PM -0500, Ryan Harper wrote:
>> > > > * Markus Armbruster <address@hidden> [2010-11-03 11:42]:
>> > > > > Ryan Harper <address@hidden> writes:
>> > > > > 
>> > > > > > * Michael S. Tsirkin <address@hidden> [2010-11-03 02:22]:
>> > > > > >> On Tue, Nov 02, 2010 at 03:23:38PM -0500, Ryan Harper wrote:
>> > > > > >> > * Michael S. Tsirkin <address@hidden> [2010-11-02 14:18]:
>> > > > > >> > > On Tue, Nov 02, 2010 at 02:01:08PM -0500, Ryan Harper wrote:
>> > > > > >> > > > > > > > I like the idea of disconnect; if part of the 
>> > > > > >> > > > > > > > device_del method was to
>> > > > > >> > > > > > > > invoke a disconnect method, we could implement that 
>> > > > > >> > > > > > > > for block, net, etc;
>> > > > > >> > > > > > > > 
>> > > > > >> > > > > > > > I'd think we'd want to send the notification, then 
>> > > > > >> > > > > > > > disconnect.
>> > > > > >> > > > > > > > Struggling with whether it's worth having some 
>> > > > > >> > > > > > > > reasonable timeout
>> > > > > >> > > > > > > > between notification and disconnect.  
>> > > > > >> > > > > > > 
>> > > > > >> > > > > > > The problem with this is that it has no analog in 
>> > > > > >> > > > > > > real world.
>> > > > > >> > > > > > > In real world, you can send some notifications to the 
>> > > > > >> > > > > > > guest, and you can
>> > > > > >> > > > > > > remove the card.  Tying them together is what created 
>> > > > > >> > > > > > > the problem in the
>> > > > > >> > > > > > > first place.
>> > > > > >> > > > > > > 
>> > > > > >> > > > > > > Timeouts can be implemented by management, maybe with 
>> > > > > >> > > > > > > a nice dialog
>> > > > > >> > > > > > > being shown to the user.
>> > > > > >> > > > > > 
>> > > > > >> > > > > > Very true.  I'm fine with forcing a disconnect during 
>> > > > > >> > > > > > the removal path
>> > > > > >> > > > > > prior to notification.  Do we want a new disconnect 
>> > > > > >> > > > > > method at the device
>> > > > > >> > > > > > level (pci)? or just use the existing removal callback 
>> > > > > >> > > > > > and call that
>> > > > > >> > > > > > during the initial hotremov event?
>> > > > > >> > > > > 
>> > > > > >> > > > > Not sure what you mean by that, but I don't see a device 
>> > > > > >> > > > > doing anything
>> > > > > >> > > > > differently wrt surprise or ordered removal. So probably 
>> > > > > >> > > > > the existing
>> > > > > >> > > > > callback should do. I don't think we need to talk about 
>> > > > > >> > > > > disconnect:
>> > > > > >> > > > > since we decided we are emulating device removal, let's 
>> > > > > >> > > > > call it
>> > > > > >> > > > > just that.
>> > > > > >> > > > 
>> > > > > >> > > > Because current the "removal" process depends on the guest 
>> > > > > >> > > > actually
>> > > > > >> > > > responding.  What I'm suggesting is that, in Marcus's term, 
>> > > > > >> > > > and what
>> > > > > >> > > > drive_unplug() implements, is to disconnect the host block 
>> > > > > >> > > > device from
>> > > > > >> > > > the guest device to prevent any further access to it in the 
>> > > > > >> > > > case the
>> > > > > >> > > > guest doesn't respond to the removal request made via ACPI.
>> > > > > >> > > > 
>> > > > > >> > > > Very specifically, what we're suggesting instead of the 
>> > > > > >> > > > drive_unplug()
>> > > > > >> > > > command so to complete the device removal operation without 
>> > > > > >> > > > waiting for
>> > > > > >> > > > the guest to respond; that's what's going to happen if we 
>> > > > > >> > > > invoke the
>> > > > > >> > > > response callback; it will appear as if the guest responded 
>> > > > > >> > > > whether it
>> > > > > >> > > > did or not.
>> > > > > >> > > > 
>> > > > > >> > > > What I was suggesting above was to instead of calling the 
>> > > > > >> > > > callback for
>> > > > > >> > > > handing the guest response was to add a device function 
>> > > > > >> > > > called
>> > > > > >> > > > disconnect which would remove any association of host 
>> > > > > >> > > > resources from
>> > > > > >> > > > guest resources before we notified the guest.  Thinking 
>> > > > > >> > > > about it again
>> > > > > >> > > > I'm not sure this is useful, but if we're going to remove 
>> > > > > >> > > > the device
>> > > > > >> > > > without the guests knowledge, I'm not sure how useful 
>> > > > > >> > > > sending the
>> > > > > >> > > > removal requests via ACPI is in the first place.
>> > > > > >> > > > 
>> > > > > >> > > > My feeling is that I'd like to have explicit control over 
>> > > > > >> > > > the disconnect
>> > > > > >> > > > from host resources separate from the device removal *if* 
>> > > > > >> > > > we're going to
>> > > > > >> > > > retain the guest notification.  If we don't care to notify 
>> > > > > >> > > > the guest,
>> > > > > >> > > > then we can just do device removal without notifying the 
>> > > > > >> > > > guest
>> > > > > >> > > > and be done with it.
>> > > > > >> > > 
>> > > > > >> > > I imagine management would typically want to do this:
>> > > > > >> > > 1. notify guest
>> > > > > >> > > 2. wait a bit
>> > > > > >> > > 3. remove device
>> > > > > >> > 
>> > > > > >> > Yes; but this argues for (1) being a separate command from (3)
>> > > > > >> 
>> > > > > >> Yes. Long term I think we will want a way to do that.
>> > > > > >> 
>> > > > > >> > unless we
>> > > > > >> > require (3) to include (1) and (2) in the qemu implementation.
>> > > > > >> > 
>> > > > > >> > Currently we implement:
>> > > > > >> > 
>> > > > > >> > 1. device_del (attempt to remove device)
>> > > > > >> > 2. notify guest
>> > > > > >> > 3. if guest responds, remove device
>> > > > > >> > 4. disconnect host resource from device on destruction
>> > > > > >> > 
>> > > > > >> > With my drive_unplug patch we do:
>> > > > > >> > 
>> > > > > >> > 1. disconnect host resource from device
>> > > > > >> 
>> > > > > >> This is what drive_unplug does, right?
>> > > > > >
>> > > > > > Correct.
>> > > > > >
>> > > > > >> 
>> > > > > >> > 2. device_del (attempt to remove device)
>> > > > > >> > 3. notify guest
>> > > > > >> > 4. if guest responds, remove device
>> > > > > >> > 
>> > > > > >> > I think we're suggesting to instead do (if we keep disconnect 
>> > > > > >> > as part of
>> > > > > >> > device_del)
>> > > > > >> > 
>> > > > > >> > 1. device_del (attemp to remove device)
>> > > > > >> > 2. notify guest
>> > > > > >> > 3. invoke device destruction callback resulting in disconnect 
>> > > > > >> > host resource from device
>> > > > > >> > 4. if guest responds, invoke device destruction path a second 
>> > > > > >> > time.
>> > > > > >> 
>> > > > > >> By response you mean eject?  No, this is not what I was 
>> > > > > >> suggesting.
>> > > > > >> I was really suggesting that your patch is fine :)
>> > > > > >> Sorry about confusion.
>> > > > > >
>> > > > > > I don't mean eject; I mean responding to the ACPI event by writing 
>> > > > > > a
>> > > > > > response to the PCI chipset which QEMU then in turn will invoke the
>> > > > > > qdev_unplug() path which ultimately kills the device and the Drive 
>> > > > > > and
>> > > > > > BlockState objects.
>> > > > > >
>> > > > > >> 
>> > > > > >> I was also saying that from what I hear, the pci express support
>> > > > > >> will at some point need interfaces to
>> > > > > >> - notify guest about device removal/addition
>> > > > > >> - get eject from guest
>> > > > > >> - remove device without talking to guest
>> > > > > >> - add device without talking to guest
>> > > > > >> - suppress device deletion on eject
>> > > > > >> 
>> > > > > >> All this can be generic and can work through express
>> > > > > >> configuration mechanisms or through acpi for pci.
>> > > > > >> But this is completely separate from unplugging
>> > > > > >> the host backend, which should be possible at any point.
>> > > > > >
>> > > > > > Yes.  I think we've worked out that we do want an independent
>> > > > > > unplug/disconnect mechanism rather than tying it to device_del.
>> > > > > >
>> > > > > > Marcus, it sounds like then you wanted to see a 
>> > > > > > net_unplug/disconnect
>> > > > > > and that instead of having device_del always succeed and replacing 
>> > > > > > it
>> > > > > > with a shell, we'd need to provide an explicit command to do the
>> > > > > > disconnect in a similar fashion to how we're doing drive_unplug?
>> > > > > 
>> > > > > I'm not sure I parse this.
>> > > > 
>> > > > You were asking for net and block disconnect to have similar 
>> > > > mechanisms.
>> > > > You mentioned the net fix for suprise removal was to have device_del()
>> > > > always succeed by replacing the device with a shell/zombie.  The
>> > > > drive_unplug() patch doesn't do the same thing; it doesn't affect the
>> > > > device_del() path at all, rather it provides mgmt apps a hook to
>> > > > directly disconnect host resource from guest resource.
>> > > 
>> > > Yes, the shell thing is just an implementation detail.
>> > 
>> > ok.  What qemu monitor command do I call for net delete to do the
>> > "disconnect/unplug"?
>> 
>> 
>> netdev_del
>
> OK.  With netdev_del and drive_unplug commands (not sure if we care to
> change the names to be similar, maybe blockdev_del) in qemu, we can then
> implement the following in libvirt:
>
> 1) detach-device invocation
> 2) issue device_del to QEMU
> 2a) notification is sent)
> 3) issue netdev_del/blockdev_del as appropriate for the device type
> 4) update guest XML to indicate device has been removed
>
> And a fancier version would look like:
>
> 1) detach-device invocation
> 2) issue device_del to QEMU
> 2a) notification is sent)
> 3) set a timeout for guest to respond
> 4) when timeout expires
> 4a) check if the pci device has been removed by quering QEMU
>     if it hasn't been removed, issue netdev_del/blockdev_del
> 5) update guest XML to indicate device has been removed
>
>
> in both cases, I think we'll also want a patch that validates that the
> pci slot is available before handing it out again; this will handle the
> case where the guest doesn't respond to the device removal request; our
> net/blockdev_del command will break the host/guest association, but we
> don't want to attempt to attach a device to the same slot.
>
> Marcus, do you think we're at a point where the mechanisms for
> explicitly revoking access to the host resource is consistent between
> net and block?
>
> If so, then I suppose I might have a consmetic patch to fix up the
> monitor command name to line up with the netdev_del.

I'd be fine with any of these:

1. A new command "device_disconnet ID" (or similar name) to disconnect
   device ID from any host parts.  Nice touch: you don't have to know
   about the device's host part(s) to disconnect it.  But it might be
   more work than the other two.

2. New commands netdev_disconnect, drive_disconnect (or similar names)
   to disconnect a host part from a guest device.  Like (1), except you
   have to point to the other end of the connection to cut it.

3. A new command "drive_del ID" similar to existing netdev_del.  This is
   (2) fused with delete.  Conceptual wart: you can't disconnect and
   keep the host part around.  Moreover, delete is slightly dangerous,
   because it renders any guest device still using the host part
   useless.

Do you need anything else from me to make progress?



reply via email to

[Prev in Thread] Current Thread [Next in Thread]