[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [PATCH v2] vfio-ccw: Permit missing IRQs
From: |
Eric Farman |
Subject: |
Re: [PATCH v2] vfio-ccw: Permit missing IRQs |
Date: |
Fri, 23 Apr 2021 12:24:57 -0400 |
On Fri, 2021-04-23 at 09:22 -0400, Matthew Rosato wrote:
> On 4/23/21 7:42 AM, Cornelia Huck wrote:
> > On Wed, 21 Apr 2021 17:20:53 +0200
> > Eric Farman <farman@linux.ibm.com> wrote:
> >
> > > Commit 690e29b91102 ("vfio-ccw: Refactor ccw irq handler")
> > > changed
> > > one of the checks for the IRQ notifier registration from saying
> > > "the host needs to recognize the only IRQ that exists" to saying
> > > "the host needs to recognize ANY IRQ that exists."
> > >
> > > And this worked fine, because the subsequent change to support
> > > the
> > > CRW IRQ notifier doesn't get into this code when running on an
> > > older
> > > kernel, thanks to a guard by a capability region. The later
> > > addition
> > > of the REQ(uest) IRQ by commit b2f96f9e4f5f ("vfio-ccw: Connect
> > > the
> > > device request notifier") broke this assumption because there is
> > > no
> > > matching capability region. Thus, running new QEMU on an older
> > > kernel fails with:
> > >
> > > vfio: unexpected number of irqs 2
> > >
> > > Let's adapt the message here so that there's a better clue of
> > > what
> > > IRQ is missing.
> > >
> > > Furthermore, let's make the REQ(uest) IRQ not fail when
> > > attempting
> > > to register it, to permit running vfio-ccw on a newer QEMU with
> > > an
> > > older kernel.
> > >
> > > Fixes: b2f96f9e4f5f ("vfio-ccw: Connect the device request
> > > notifier")
> > > Signed-off-by: Eric Farman <farman@linux.ibm.com>
> > > ---
> > >
> > > Notes:
> > > v1->v2:
> > > - Keep existing "invalid number of IRQs" message with
> > > adapted text [CH]
> > > - Move the "is this an error" test to the registration of
> > > the IRQ in
> > > question, rather than making it allowable for any IRQ
> > > mismatch [CH]
> > > - Drop Fixes tag for initial commit [EF]
> > >
> > > v1:
> > > 20210419184906.2847283-1-farman@linux.ibm.com/">https://lore.kernel.org/qemu-devel/20210419184906.2847283-1-farman@linux.ibm.com/
> > >
> > > hw/vfio/ccw.c | 12 +++++++-----
> > > 1 file changed, 7 insertions(+), 5 deletions(-)
> > >
> > > diff --git a/hw/vfio/ccw.c b/hw/vfio/ccw.c
> > > index b2df708e4b..400bc07fe2 100644
> > > --- a/hw/vfio/ccw.c
> > > +++ b/hw/vfio/ccw.c
> > > @@ -412,8 +412,8 @@ static void
> > > vfio_ccw_register_irq_notifier(VFIOCCWDevice *vcdev,
> > > }
> > >
> > > if (vdev->num_irqs < irq + 1) {
> > > - error_setg(errp, "vfio: unexpected number of irqs %u",
> > > - vdev->num_irqs);
> > > + error_setg(errp, "vfio: IRQ %u not available (number of
> > > irqs %u)",
> > > + irq, vdev->num_irqs);
> > > return;
> > > }
> > >
> > > @@ -696,13 +696,15 @@ static void vfio_ccw_realize(DeviceState
> > > *dev, Error **errp)
> > >
> > > vfio_ccw_register_irq_notifier(vcdev,
> > > VFIO_CCW_REQ_IRQ_INDEX, &err);
> > > if (err) {
> > > - goto out_req_notifier_err;
> > > + /*
> > > + * Report this error, but do not make it a failing
> > > condition.
> > > + * Lack of this IRQ in the host does not prevent normal
> > > operation.
> > > + */
> > > + error_report_err(err);
> > > }
> > >
> > > return;
> > >
> > > -out_req_notifier_err:
> > > - vfio_ccw_unregister_irq_notifier(vcdev,
> > > VFIO_CCW_CRW_IRQ_INDEX);
> > > out_crw_notifier_err:
> > > vfio_ccw_unregister_irq_notifier(vcdev,
> > > VFIO_CCW_IO_IRQ_INDEX);
> > > out_io_notifier_err:
> >
> > Patch looks good to me, but now I'm wondering: Is calling
> > vfio_ccw_unregister_irq_notifier() for an unregistered irq actually
> > safe? I think it is (our notifiers are always present, and we
> > should
>
> So the unregister really does 4 things of interest:
s/4/3/ :)
>
> 1) vfio_set_irq_signaling(VFIO_IRQ_SET_ACTION_TRIGGER)
> This will drive VFIO_DEVICE_SET_IRQS ioctl, and it looks to me like
> the
> VFIO_DEVICE_SET_IRQS ioctl should return -EINVAL if it's not
> registered
> with the kernel, which will in turn cause the vfio_set_irq_signaling
> to
> fast-path out on a return 0. That seems correct.
>
> 2) qemu_set_fd_handler
> If we never registered (or it was already unregistered) then fd
> should
> not show up in find_aio_handler() so we should also bail out fast on
> qemu_set_fd_handler()
>
> 3) event_notifier_cleanup
> Should bail out right away if already cleaned up before
> (!initialized)
>
> So, this looks OK to me.
+1 (Thanks for doing the research, Matt)
>
>
> > handle any ioctl failure gracefully), but worth a second look. In
> > fact,
> > we already unregister the crw irq unconditionally, so we would
> > likely
> > already have seen any problems for that one, so I assume all is
> > good.
> >
>
> But +1 on driving the path and making sure it works anyway (do a
> double-unregister?)
Yeah, works fine. Tried skipping the register of the CRW and double-
unregistering the IO IRQ.
I also tried a combination where I unconditionally unregister the REQ
IRQ, which obviously throws a message when it doesn't exist on the
host.
That might be nice to clean up so that adding new IRQs in the future is
more intuitive; I'll add it to the list unless you want me to address
it in a v2 of this. (Previously, the addition of the REQ IRQ needed to
add the cleanup of the CRW IRQ. So the next IRQ would need to cleanup
the REQ IRQ.)
Eric