From: Bob Chen
Subject: Re: [Qemu-devel] [GPU and VFIO] qemu hang at startup, VFIO_IOMMU_MAP_DMA is extremely slow
Date: Tue, 26 Dec 2017 19:37:36 +0800

2017-12-26 18:51 GMT+08:00 Liu, Yi L <address@hidden>:
> > -----Original Message-----
> > From: Qemu-devel [mailto:qemu-devel-bounces+yi.l.liu=
> address@hidden
> > On Behalf Of Bob Chen
> > Sent: Tuesday, December 26, 2017 6:30 PM
> > To: address@hidden
> > Subject: [Qemu-devel] [GPU and VFIO] qemu hang at startup,
> > VFIO_IOMMU_MAP_DMA is extremely slow
> >
> > Hi,
> >
> > I have a host server with multiple GPU cards, and was assigning them to
> > qemu with VFIO.
> >
> > I found that when setting up the last free GPU, the qemu process would
> > hang
>
> Are all the GPUs in the same iommu group?
>
No, each of them is in its own IOMMU group.
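
For anyone who wants to check the grouping on their own host, here is a
minimal sketch in C. It assumes the usual Linux sysfs layout, where
/sys/bus/pci/devices/<BDF>/iommu_group is a symlink whose last path
component is the group number. The helper name iommu_group_of is
hypothetical, not something from QEMU or the kernel.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper: resolve an "iommu_group" symlink and copy the
 * last component of its target (the group number) into out.
 * Returns 0 on success, -1 if the link cannot be read or out is too small.
 */
int iommu_group_of(const char *link_path, char *out, size_t out_len)
{
    char target[4096];
    ssize_t n;
    const char *base;

    assert(link_path != NULL && out != NULL && out_len > 0);

    /* readlink() does not NUL-terminate, so leave room and do it here. */
    n = readlink(link_path, target, sizeof(target) - 1);
    if (n < 0) {
        return -1;
    }
    target[n] = '\0';

    /* The group number is whatever follows the final '/'. */
    base = strrchr(target, '/');
    base = base ? base + 1 : target;

    if (strlen(base) >= out_len) {
        return -1;
    }
    strcpy(out, base);
    return 0;
}
```

Calling it with "/sys/bus/pci/devices/0000:09:00.0/iommu_group" for each
GPU and comparing the results shows whether two devices share a group.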
>
> > there and took almost 10 minutes before finishing startup. I did some
> > digging with gdb, and found that the slowest part was the
> > hw/vfio/common.c:vfio_dma_map function call.
>
> This sets up the DMA mappings, which takes time. The function is called
> multiple times, so some delay is expected. By "the slowest part", do you
> mean that a single vfio_dma_map() call takes a long time, or that the
> passthrough setup as a whole spends a lot of time creating mappings? If a
> single call takes a lot of time, then it may be a problem.
>
Each vfio_dma_map() call takes 3 to 10 minutes.
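
A per-call duration like this can also be measured without gdb by
wrapping the call with a monotonic clock. The following is only a sketch
under assumptions: time_call_ns and dummy_map are hypothetical names, and
dummy_map stands in for the real work (for example the
VFIO_IOMMU_MAP_DMA ioctl), which is not reproduced here.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

/* Signature of the operation being timed (hypothetical). */
typedef int (*timed_fn)(void *opaque);

/*
 * Hypothetical helper: run fn(opaque) once and return the elapsed time
 * in nanoseconds, measured with CLOCK_MONOTONIC so that wall-clock
 * adjustments do not distort the result. The callee's return value is
 * stored in *ret_out when ret_out is not NULL.
 */
int64_t time_call_ns(timed_fn fn, void *opaque, int *ret_out)
{
    struct timespec start, end;
    int ret;

    assert(fn != NULL);

    clock_gettime(CLOCK_MONOTONIC, &start);
    ret = fn(opaque);
    clock_gettime(CLOCK_MONOTONIC, &end);

    if (ret_out != NULL) {
        *ret_out = ret;
    }
    return (int64_t)(end.tv_sec - start.tv_sec) * 1000000000LL
           + (int64_t)(end.tv_nsec - start.tv_nsec);
}

/* Stand-in for the slow operation; a real test would issue the ioctl. */
int dummy_map(void *opaque)
{
    (void)opaque;
    return 0;
}
```

Logging the returned value together with the iova and size of each
mapping would show whether one large region or many small ones account
for the delay.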
>
> Please paste your QEMU command line, which might help. The host dmesg
> output would also help.
>
cmd line:
After adding -device vfio-pci,host=09:00.0,multifunction=on,addr=0x15, qemu
hangs. Without this option, it starts immediately.
dmesg:
[Tue Dec 26 18:39:50 2017] vfio-pci 0000:09:00.0: enabling device (0400 -> 0402)
[Tue Dec 26 18:39:51 2017] vfio_ecap_init: 0000:09:00.0 hiding ecap address@hidden
[Tue Dec 26 18:39:51 2017] vfio_ecap_init: 0000:09:00.0 hiding ecap address@hidden
[Tue Dec 26 18:39:55 2017] kvm: zapping shadow pages for mmio generation wraparound
[Tue Dec 26 18:39:55 2017] kvm: zapping shadow pages for mmio generation wraparound
[Tue Dec 26 18:40:03 2017] kvm [74663]: vcpu0 ignored rdmsr: 0x345
Kernel:
3.10.0-514.16.1 CentOS 7.3
>
> >
> >
> > static int vfio_dma_map(VFIOContainer *container, hwaddr iova,
> >                         ram_addr_t size, void *vaddr, bool readonly)
> > {
> >     ...
> >     if (ioctl(container->fd, VFIO_IOMMU_MAP_DMA, &map) == 0 ||
> >         (errno == EBUSY && vfio_dma_unmap(container, iova, size) == 0 &&
> >          ioctl(container->fd, VFIO_IOMMU_MAP_DMA, &map) == 0)) {
> >         return 0;
> >     }
> >     ...
> > }
> >
> >
> > I was able to reproduce the hang on one of my hosts. I was setting up a
> > VM with 4GB of memory, while the host still had 16GB free. GPU physical
> > memory is 8GB.
>
> Does it happen when you only assign a single GPU?
>
Not sure. Didn't try multiple GPUs.
>
> > Also, this behavior was observed occasionally on other hosts, and the
> > common factor is that it always happened on the last free GPU.
> >
> >
> > Full stack trace file is attached. Looking forward to your help, thanks.
> >
> >
> > - Bob
>
> Regards,
> Yi L
>