qemu-devel
Re: [PATCH] pci: Skip power-off reset when pending unplug


From: Michael S. Tsirkin
Subject: Re: [PATCH] pci: Skip power-off reset when pending unplug
Date: Wed, 22 Dec 2021 15:48:24 -0500

On Wed, Dec 22, 2021 at 12:08:09PM -0700, Alex Williamson wrote:
> On Tue, 21 Dec 2021 18:40:09 -0500
> "Michael S. Tsirkin" <mst@redhat.com> wrote:
> 
> > On Tue, Dec 21, 2021 at 09:36:56AM -0700, Alex Williamson wrote:
> > > On Mon, 20 Dec 2021 18:03:56 -0500
> > > "Michael S. Tsirkin" <mst@redhat.com> wrote:
> > >   
> > > > On Mon, Dec 20, 2021 at 11:26:59AM -0700, Alex Williamson wrote:  
> > > > > The below referenced commit introduced a change where devices under a
> > > > > root port slot are reset in response to removing power to the slot.
> > > > > > This improves emulation relative to bare metal when the slot is powered
> > > > > off, but introduces an unnecessary step when devices under that slot
> > > > > are slated for removal.
> > > > > 
> > > > > In the case of an assigned device, there are mandatory delays
> > > > > associated with many device reset mechanisms which can stall the hot
> > > > > > unplug operation.  Also, in cases where the unplug request is triggered
> > > > > via a release operation of the host driver, internal device locking in
> > > > > the host kernel may result in a failure of the device reset mechanism,
> > > > > which generates unnecessary log warnings.
> > > > > 
> > > > > Skip the reset for devices that are slated for unplug.
> > > > > 
> > > > > Cc: qemu-stable@nongnu.org
> > > > > > Fixes: d5daff7d3126 ("pcie: implement slot power control for pcie root ports")
> > > > > Signed-off-by: Alex Williamson <alex.williamson@redhat.com>    
> > > > 
> > > > I am not sure this is safe. IIUC pending_deleted_event
> > > > is normally set after host admin requested device removal,
> > > > while the reset could be triggered by guest for its own reasons
> > > > such as suspend or driver reload.  
> > > 
> > > Right, the case where I mention that we get the warning looks exactly
> > > like the admin doing a device eject, it calls qdev_unplug().  I'm not
> > > trying to prevent arbitrary guest resets of the device, in fact there
> > > are cases where the guest really should be able to reset the device,
> > > nested assignment in addition to the cases you mention.  Gerd noted
> > > that this was an unintended side effect of the referenced patch to
> > > reset devices that are imminently being removed.
> > >   
> > > > Looking at this some more, I am not sure I understand the
> > > > issue completely.
> > > > We have:
> > > > 
> > > >     if ((sltsta & PCI_EXP_SLTSTA_PDS) && (val & PCI_EXP_SLTCTL_PCC) &&
> > > >         (val & PCI_EXP_SLTCTL_PIC_OFF) == PCI_EXP_SLTCTL_PIC_OFF &&
> > > >         (!(old_slt_ctl & PCI_EXP_SLTCTL_PCC) ||
> > > >         (old_slt_ctl & PCI_EXP_SLTCTL_PIC_OFF) != PCI_EXP_SLTCTL_PIC_OFF)) {
> > > >         pcie_cap_slot_do_unplug(dev);
> > > >     }
> > > >     pcie_cap_update_power(dev);
> > > > 
> > > > so device unplug triggers first, reset follows and by that time
> > > > there should be no devices under the bus, if there are then
> > > > it's because guest did not clear the power indicator.  
> > > 
> > > Note that the unplug only triggers here if the Power Indicator Control
> > > is OFF, I see writes to SLTCTL in the following order:
> > > 
> > >  01f1 -> 02f1 -> 06f1 -> 07f1
> > > 
> > > So PIC changes to BLINK, then PCC changes the slot to OFF (this
> > > triggers the reset), then PIC changes to OFF triggering the unplug.
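The write sequence above can be decoded mechanically. What follows is a minimal sketch, not QEMU code: the bit values are assumptions taken from the usual PCIe Slot Control layout, `pic_name()` and `power_off_edge()` are hypothetical helpers, and `unplug_triggers()` copies only the SLTCTL part of the quoted condition, assuming the presence bit (PDS) is set.

```c
#include <stdbool.h>
#include <stdint.h>

/* Slot Control bits (names and values are assumptions for this sketch). */
#define PCI_EXP_SLTCTL_PIC_ON    0x0100  /* Power Indicator on */
#define PCI_EXP_SLTCTL_PIC_BLINK 0x0200  /* Power Indicator blinking */
#define PCI_EXP_SLTCTL_PIC_OFF   0x0300  /* Power Indicator off; also the field mask */
#define PCI_EXP_SLTCTL_PCC       0x0400  /* Power Controller Control: set = power off */

/* Hypothetical helper: name of the Power Indicator state encoded in val. */
const char *pic_name(uint16_t val)
{
    switch (val & PCI_EXP_SLTCTL_PIC_OFF) {
    case PCI_EXP_SLTCTL_PIC_ON:    return "on";
    case PCI_EXP_SLTCTL_PIC_BLINK: return "blink";
    case PCI_EXP_SLTCTL_PIC_OFF:   return "off";
    default:                       return "reserved";
    }
}

/* Hypothetical helper: true when this write turns slot power off (PCC 0 -> 1). */
bool power_off_edge(uint16_t old, uint16_t val)
{
    return !(old & PCI_EXP_SLTCTL_PCC) && (val & PCI_EXP_SLTCTL_PCC);
}

/* SLTCTL part of the quoted unplug condition; PDS is assumed to be set. */
bool unplug_triggers(uint16_t old, uint16_t val)
{
    return (val & PCI_EXP_SLTCTL_PCC) &&
           (val & PCI_EXP_SLTCTL_PIC_OFF) == PCI_EXP_SLTCTL_PIC_OFF &&
           (!(old & PCI_EXP_SLTCTL_PCC) ||
            (old & PCI_EXP_SLTCTL_PIC_OFF) != PCI_EXP_SLTCTL_PIC_OFF);
}
```

Under these assumptions, 01f1 -> 02f1 only moves the indicator to blink, 02f1 -> 06f1 is the power-off edge with the indicator still blinking, and 06f1 -> 07f1 is the first write for which `unplug_triggers()` returns true. That matches the ordering described above: the power-off reset comes one write before the unplug.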
> > > 
> > > The unnecessary reset that occurs here is universal.  Should the unplug
> > > be occurring when:
> > > 
> > >   (val & PCI_EXP_SLTCTL_PIC_OFF) != PCI_EXP_SLTCTL_PIC_ON
> > > 
> > > ?  
> > 
> > well blinking generally means "do not remove yet".
> 
> Blinking indicates that the slot is in a transition phase,

Well, the spec seems to state that blinking indicates it's waiting
to see that the user does not change their mind by pressing the
button again.

> which we
> could also interpret to mean that power has been removed and this is
> the time required for the power to settle.  By that token, it might be
> reasonable that a power state induced reset doesn't actually occur
> until the slot reaches both the slot power off and power indicator off
> state.

The reset is actually just an attempt to approximate power off.
So I'm not sure that is right: powering a device off and then on
is just a slow but reasonable way for guests to reset a device.



>  In that case we could reorganize things to let the unplug occur
> before the power transition.

Hmm you mean unplug on host immediately when it starts blinking?
But drivers are not notified at this point, are they?

>  Of course the original proposal also
> essentially supports this interpretation, the slot power off reset does
> not occur for devices with a pending unplug and those devices are
> removed after the slot transition grace period.

Meaning the patch you posted? It relies on guest doing a specific
thing though, and guest and host states are not synchronized.


I think it might work to defer reset if it's blinking until it actually
stops blinking. To me it seems a bit less risky, but again, in theory
some guest driver could use the power cycle reset while hotplug plays
with PIC waiting for the cancel button press.
E.g. I suspect your patch can be broken just by guest loading/unloading
driver in a loop while host also triggers plug/unplug.
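To make the deferral idea concrete, here is a minimal standalone model of it. This is only a sketch under stated assumptions, not a proposed patch: `struct slot_model` and `slot_write_sltctl()` are hypothetical, the bit values follow the usual Slot Control layout, and "reset" is just a counter.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed Slot Control bits: Power Indicator field and Power Controller Control. */
#define SLTCTL_PIC_MASK  0x0300
#define SLTCTL_PIC_BLINK 0x0200
#define SLTCTL_PCC       0x0400  /* set = slot power off */

/* Hypothetical model of one slot; none of this is QEMU state. */
struct slot_model {
    uint16_t sltctl;      /* last SLTCTL value written by the guest */
    bool reset_deferred;  /* power-off seen while the indicator was blinking */
    unsigned resets;      /* resets actually performed */
};

static bool pic_blinking(uint16_t val)
{
    return (val & SLTCTL_PIC_MASK) == SLTCTL_PIC_BLINK;
}

/* Apply one guest write, deferring the power-off reset while PIC blinks. */
void slot_write_sltctl(struct slot_model *s, uint16_t val)
{
    bool power_off_edge = !(s->sltctl & SLTCTL_PCC) && (val & SLTCTL_PCC);

    if (power_off_edge) {
        if (pic_blinking(val)) {
            s->reset_deferred = true;   /* wait until blinking stops */
        } else {
            s->resets++;                /* plain power off: reset right away */
        }
    } else if (s->reset_deferred && !pic_blinking(val)) {
        s->reset_deferred = false;
        if (val & SLTCTL_PCC) {
            s->resets++;                /* still powered off: run the postponed reset */
        }
        /* Power came back before blinking stopped: the reset is dropped. */
    }
    s->sltctl = val;
}
```

Starting from 01f1, the writes 02f1, 06f1, 07f1 perform no reset until the last one. The risky case is also visible: with 02f1, 06f1, 02f1, 01f1 (a guest power cycle while the indicator blinks) the model never resets at all, which is the kind of guest behaviour the deferral could break.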


> > > > So I am not sure how to fix the assignment issues as I'm not sure how
> > > > they trigger, but here is a wild idea: maybe it should support an API
> > > > for starting reset asynchronously, then if the following access is
> > > > trying to reset again that second reset can just be skipped, while any
> > > > other access will stall.  
> > > 
> > > As above, there's not a concurrency problem, so I don't see how an
> > > async API buys us anything.  
> > 
> > Well unplug resets the device again, right? Why is that reset not
> > problematic and this one is?
> 
> It has the same issue, but there's no log message generated that
> worries QE into marking this as a regression.

Well is the device already stopped from working at this point?
Prevented from getting and responding to guest accesses?
By something else?
Because this is what happens when it's powered off, isn't it?

>  Obviously the ideal
> outcome would be that we could reset the device under these conditions,
> but to this point we've only managed to introduce "try" semantics to
> the functions to prevent deadlock.  As this is a condition induced by
> corner case admin device handling, we've so far considered the reset
> failure acceptable.  Thanks,
> 
> Alex



