[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [Qemu-devel] [qemu-s390x] [PATCH v3 2/2] s390x/pci: Unplug remaining
From: |
David Hildenbrand |
Subject: |
Re: [Qemu-devel] [qemu-s390x] [PATCH v3 2/2] s390x/pci: Unplug remaining devices on pcihost reset |
Date: |
Tue, 29 Jan 2019 16:11:14 +0100 |
User-agent: |
Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Thunderbird/60.3.1 |
On 29.01.19 14:50, Pierre Morel wrote:
> On 29/01/2019 11:24, David Hildenbrand wrote:
>>>>> I'm wondering what the architecture says regarding those events -- can
>>>>> someone with access to the documentation comment?
>>>>
>>>> Ping. Any comments from the IBM folks?
>
> Hi,
>
> Sorry to have wait so long.
> At least Collin was faster.
>
>>>>
>>>
>>>
>>> So the idea here is that if we have a PCI device that is the process of
>>> being deconfigured and we are also in the middle of a reset, then let's
>>> accelerate deconfiguring of the PCI device during the reset. Makes sense.
>
> to me too.
> However, how do we ensure that the guest got time to respond to the
> first deconfigure request?
30 seconds, then the reboot. On a reboot, I don't see why we should give
a guest more time. "It's dead", rip out the card as the guest refused to
hand it back. (maybe it crashed! but after a reboot the guest state is
reset and baught back to life)
>
>>>
>>> Note:
>>>
>>> The callback function will deconfigure the the device and put it into
>>> standby mode. However, a PCI device should only go into standby from the
>>> *disabled state* (which it could already be in due to the unplug
>>> sequence), or from a *permanent error state* (something we should
>>> hopefully never see -- this means something went seriously wrong with
>>> the device).
>
> Not completely exact, the CHSC event 0x304, on the initiative of the
> host, force the "deconfigure state" from "configure state" generally,
> whatever sub-state it has (enabled/disabled/error...).
>
>>
>> Right, this should already have been checked before setting up the timer.
>
> Apropos timer, do we need a timer or wouldn't it be better to use a
> delay / a timer + condition?
I don't think we need a timer at all.
>
> AFAIU we get out of the unplug without waiting for any answer from the
> guest and we surely get the timer triggering after the reset has been done.
> That seems bad.
This is the case right now, correct.
>
>
>>
>>>
>>> Two things I'm concerned about:
>>>
>>> 1)
>>>
>>> What I would suggest is adding a check for the pbdev->state for
>>> ZPCI_FS_DISABLED || ZPCI_FS_PERMANENT_ERROR. If it is in either of these
>>> states, then we're safe to deconfigure and put into standby. If the
>>
>> We setup a timer if !ZPCI_FS_STANDBY and !ZPCI_FS_RESERVED.
>>
>> So for
>> - ZPCI_FS_DISABLED
>> - ZPCI_FS_ENABLED
>> - ZPCI_FS_BLOCKED
>> - ZPCI_FS_ERROR
>> - ZPCI_FS_PERMANENT_ERROR
>>
>> We setup a timer and simply go ahead and unplug the device when the
>> timer expires ("forced unplug").
>
> I agree only for ZPCI_FS_ENABLED why do we need to be smooth for other
> states?
> ZPCI_FS_DISABLED may be a candidate even I do not see the interrest but
> other states of the device should issue a HP_EVENT_DECONFIGURE_REQUEST
> and we do not need a timer (or any delay)
You can always expect that your guest driver is dead.
>
>>
>> Changing that behavior might be more invasive. Simply not unplugging in
>> s390_pcihost_timer_cb() on some of these states would mean that somebody
>> issued a request and that requests is simply lost/ignored. Not sure if
>> that is what we want. I think we need separate patches to change
>> something like that. Especially
>>
>> 1. What happens if the device was in ZPCI_FS_DISABLED, the guest ignores
>> the unplug request and moves the device to ZPCI_FS_ENABLED before the
>> timer expires? These are corner cases to consider.
>
> +1, we must ensure to do the work inside the unplug CB.
>
>>
>> 2. Do we need a timer at all? Now that Patch #1 introduces
>> unplug_requests, we are free to ignore requests from the user if the
>> guest is not reacting. I would really favor getting rid of the timer
>> completely. Was there a special reason why this was introduced?
>
> Yes, to let a chance to the guest to smoothly relinquish the device.
> (for example sync/clean the disk)
> However I do not think it is right implemented.
>
>>
>> No other architecture (e.g. ACPI) uses such a timer. They use a simple
>> flag to remember if a request is pending. I would really favor going
>> into that direction.
>
> I am not sure that the Intel architecture is a good example. :)
Right, we all learned that zPCI did it better. (sorry ;) )
>
> AFAIU they do not wait for the guest to have relinquish the device.
> Or do they?
> How long do they wait?
They wait for ever. And I guess we should do the same thing. If the
guest driver is broken (and this is really a rare scenario!) we would
not get the device back. Which is perfectly fine in my point of view. In
all other scenarios I guess the guest will simply respond in a timely
manner. And ripping out stuff from the guest always feels wrong. (yes
the channel subsystem works like that, but here we actually have a choice)
If we reboot, we can unplug the device. Otherwise, let's keep it simply
and don't use a timer.
Thanks!
--
Thanks,
David / dhildenb
- Re: [Qemu-devel] [PATCH v3 1/2] s390x/pci: Introduce unplug requests and split unplug handler, (continued)
[Qemu-devel] [PATCH v3 2/2] s390x/pci: Unplug remaining devices on pcihost reset, David Hildenbrand, 2019/01/21
- Re: [Qemu-devel] [PATCH v3 2/2] s390x/pci: Unplug remaining devices on pcihost reset, Cornelia Huck, 2019/01/23
- Re: [Qemu-devel] [PATCH v3 2/2] s390x/pci: Unplug remaining devices on pcihost reset, Cornelia Huck, 2019/01/28
- Re: [Qemu-devel] [qemu-s390x] [PATCH v3 2/2] s390x/pci: Unplug remaining devices on pcihost reset, Collin Walling, 2019/01/28
- Re: [Qemu-devel] [qemu-s390x] [PATCH v3 2/2] s390x/pci: Unplug remaining devices on pcihost reset, David Hildenbrand, 2019/01/29
- Re: [Qemu-devel] [qemu-s390x] [PATCH v3 2/2] s390x/pci: Unplug remaining devices on pcihost reset, Pierre Morel, 2019/01/29
- Re: [Qemu-devel] [qemu-s390x] [PATCH v3 2/2] s390x/pci: Unplug remaining devices on pcihost reset,
David Hildenbrand <=
- Re: [Qemu-devel] [qemu-s390x] [PATCH v3 2/2] s390x/pci: Unplug remaining devices on pcihost reset, Pierre Morel, 2019/01/29
- Re: [Qemu-devel] [qemu-s390x] [PATCH v3 2/2] s390x/pci: Unplug remaining devices on pcihost reset, David Hildenbrand, 2019/01/29
- Re: [Qemu-devel] [qemu-s390x] [PATCH v3 2/2] s390x/pci: Unplug remaining devices on pcihost reset, Cornelia Huck, 2019/01/29
- Re: [Qemu-devel] [qemu-s390x] [PATCH v3 2/2] s390x/pci: Unplug remaining devices on pcihost reset, Collin Walling, 2019/01/29