qemu-block

From: Keith Busch
Subject: Re: completion timeouts with pin-based interrupts in QEMU hw/nvme
Date: Mon, 16 Jan 2023 21:58:13 -0700

On Mon, Jan 16, 2023 at 10:14:07PM +0100, Klaus Jensen wrote:
> I noticed that the Linux driver does not use the INTMS/INTMC registers
> to mask interrupts on the controller while processing CQEs. While not
> required by the spec, it is *recommended* in setups not using MSI-X to
> reduce the risk of spurious and/or missed interrupts.
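
(For reference, the two registers in question are the aggregate interrupt
mask set/clear pair from the NVMe register map; the names below match
include/linux/nvme.h:)

    enum {
            NVME_REG_INTMS = 0x000c, /* Interrupt Mask Set: write 1s to mask vectors */
            NVME_REG_INTMC = 0x0010, /* Interrupt Mask Clear: write 1s to unmask */
    };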

That's assuming completions are deferred to a bottom half. We don't do
that by default in Linux nvme, though you can ask the driver to do that
if you want.
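
(The opt-in is the use_threaded_interrupts module parameter in
drivers/nvme/host/pci.c -- from memory, so the permissions argument may
differ in your tree:)

    static int use_threaded_interrupts;
    module_param(use_threaded_interrupts, int, 0444);

    /* e.g. boot with nvme.use_threaded_interrupts=1 to have completions
     * handled in a threaded (bottom half) context. */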
 
> With the patch below, running 100 boot iterations, no timeouts were
> observed on QEMU emulated riscv64 or mips64.
>
> No changes are required in the QEMU hw/nvme interrupt logic.

Yeah, I can see why: it forces the irq line to deassert and then reassert,
just as we had forced to happen in the device-side patches. Still, none of
that is supposed to be necessary, but this idea of using these registers
is probably fine.
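
(To spell out the device side, here is a sketch of the pin-level logic as
I remember it from hw/nvme/ctrl.c -- field names may not be exact:)

    static void nvme_irq_check(NvmeCtrl *n)
    {
        uint32_t intms = ldl_le_p(&n->bar.intms);

        if (msix_enabled(&n->parent_obj)) {
            return;
        }

        /* The pin level is "any pending vector that is not masked", so the
         * host's INTMS write deasserts the line and the following INTMC
         * write reasserts it while CQEs are still pending -- exactly the
         * edge the host needs to see. */
        if (~intms & n->irq_status) {
            pci_irq_assert(&n->parent_obj);
        } else {
            pci_irq_deassert(&n->parent_obj);
        }
    }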

>  static irqreturn_t nvme_irq(int irq, void *data)
> +{
> +       struct nvme_queue *nvmeq = data;
> +       struct nvme_dev *dev = nvmeq->dev;
> +       u32 mask = 1 << nvmeq->cq_vector;
> +       irqreturn_t ret = IRQ_NONE;
> +       DEFINE_IO_COMP_BATCH(iob);
> +
> +       writel(mask, dev->bar + NVME_REG_INTMS);
> +
> +       if (nvme_poll_cq(nvmeq, &iob)) {
> +               if (!rq_list_empty(iob.req_list))
> +                       nvme_pci_complete_batch(&iob);
> +               ret = IRQ_HANDLED;
> +       }
> +
> +       writel(mask, dev->bar + NVME_REG_INTMC);
> +
> +       return ret;
> +}

If threaded interrupts are used, you'll want to do the masking in
nvme_irq_check(), then clear it in the threaded handler instead of doing
both in the same callback.
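
(An untested sketch of that split, reusing nvme_cqe_pending() and the
register names from your patch:)

    static irqreturn_t nvme_irq_check(int irq, void *data)
    {
            struct nvme_queue *nvmeq = data;
            struct nvme_dev *dev = nvmeq->dev;

            if (!nvme_cqe_pending(nvmeq))
                    return IRQ_NONE;

            /* Mask in hard-irq context; the threaded handler unmasks. */
            writel(1 << nvmeq->cq_vector, dev->bar + NVME_REG_INTMS);
            return IRQ_WAKE_THREAD;
    }

The threaded handler (your nvme_irq above) would then drop the INTMS write
and keep only the INTMC write after the CQ has been drained.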

> +static irqreturn_t nvme_irq_msix(int irq, void *data)
>  {
>         struct nvme_queue *nvmeq = data;
>         DEFINE_IO_COMP_BATCH(iob);
> @@ -1602,12 +1623,13 @@ static int queue_request_irq(struct nvme_queue *nvmeq)
>  {
>         struct pci_dev *pdev = to_pci_dev(nvmeq->dev->dev);
>         int nr = nvmeq->dev->ctrl.instance;
> +       irq_handler_t handler = pdev->msix_enabled ? nvme_irq_msix : nvme_irq;
> 
>         if (use_threaded_interrupts) {
>                 return pci_request_irq(pdev, nvmeq->cq_vector, nvme_irq_check,
> -                               nvme_irq, nvmeq, "nvme%dq%d", nr, nvmeq->qid);
> +                               handler, nvmeq, "nvme%dq%d", nr, nvmeq->qid);
>         } else {
> -               return pci_request_irq(pdev, nvmeq->cq_vector, nvme_irq,
> +               return pci_request_irq(pdev, nvmeq->cq_vector, handler,
>                                 NULL, nvmeq, "nvme%dq%d", nr, nvmeq->qid);
>         }
>  }
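
(For reference, the prototype from include/linux/pci.h -- handler runs in
hard-irq context; thread_fn, when non-NULL, runs in a kernel thread:)

    int pci_request_irq(struct pci_dev *dev, unsigned int nr,
                        irq_handler_t handler, irq_handler_t thread_fn,
                        void *dev_id, const char *fmt, ...);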