Re: [Qemu-devel] [RFC 1/2] pci-dma-api-v1
From: Blue Swirl
Subject: Re: [Qemu-devel] [RFC 1/2] pci-dma-api-v1
Date: Thu, 27 Nov 2008 21:14:45 +0200
On 11/27/08, Andrea Arcangeli <address@hidden> wrote:
> Hello everyone,
>
> One major limitation for KVM today is the lack of a proper way to
> write drivers that lets the host OS DMA directly to and from guest
> physical memory, avoiding any intermediate copy. The only API
> provided to drivers seems to be cpu_physical_memory_rw, which forces
> all drivers to bounce, trash CPU caches and become memory bound.
> This new DMA API instead lets drivers use a pci_dma_sg method for SG
> I/O that translates the guest physical addresses to host virtual
> addresses and then invokes two operations: a submit method and a
> complete method. pci_dma_sg may have to bounce-buffer internally,
> and to limit the maximum bounce size it may have to submit the I/O
> in pieces with multiple submit calls. The patch adapts the ide.c HD
> driver to use this; once the cdrom is converted too, dma_buf_rw can
> be eliminated. As you can see, the new ide_dma_submit and
> ide_dma_complete code is much more readable than the previous
> rearming callback.
>
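A rough sketch of the submit/complete contract described above; every
name and signature below is an assumption drawn from the description,
not the actual patch:

#include <stdint.h>
#include <stddef.h>
#include <sys/uio.h>

/* Called once when the whole scatter-gather transfer has finished. */
typedef void (*dma_complete_fn)(void *opaque, int ret);

/* Called one or more times with host-virtual iovecs ready for I/O. */
typedef int (*dma_submit_fn)(void *opaque, struct iovec *iov, int iovcnt,
                             dma_complete_fn cb);

/* Provided by the DMA layer: translate the guest-physical SG list to
 * host-virtual addresses, bounce-buffering where needed, and drive the
 * transfer through the two callbacks above (hypothetical signature). */
int pci_dma_sg(uint64_t *guest_addrs, size_t *lens, int entries,
               dma_submit_fn submit, dma_complete_fn complete,
               void *opaque, int is_write);

/* What a converted ide.c might plug in. */
static int ide_dma_submit(void *opaque, struct iovec *iov, int iovcnt,
                          dma_complete_fn cb)
{
    /* hand the host-virtual iovecs to the block layer (e.g. a vectored
     * AIO read/write) and arrange for cb() to run when it finishes */
    return 0;
}

static void ide_dma_complete(void *opaque, int ret)
{
    /* raise the IDE interrupt, update status registers, free buffers */
}
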
> This is only tested with KVM so far, but qemu builds; in general
> there's nothing KVM-specific here (with the exception of a single
> kvm_enabled), so it should all work well for both.
>
> All we care about is the performance of the direct path, so I tried
> to avoid dynamic allocations there and stay out of glibc. The
> current logic doesn't satisfy me yet, but it should at least be
> faster than calling malloc (I'm still working on it, to avoid memory
> waste and to detect when more than one iovec should be cached). In
> case of instabilities, the first thing I recommend is setting
> MAX_IOVEC_IOVCNT to 0 to disable that logic ;). I also recommend
> testing with DEBUG_BOUNCE and with a max bounce buffer of 512.
> It's running stable in all modes so far. However, if ide.c ends up
> calling aio_cancel, things will likely fall apart, but that is
> entirely because of bdrv_aio_readv/writev and the astonishing lack
> of aio_readv/writev in glibc!
>
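One way to keep the direct path out of glibc, sketched here purely as
an illustration of the caching idea (MAX_IOVEC_IOVCNT is the knob
mentioned above; the structure and helpers are assumptions, not the
patch's code):

#include <stdlib.h>
#include <sys/uio.h>

#define MAX_IOVEC_IOVCNT 64   /* 0 would disable the cached fast path */

struct dma_iov_cache {
    struct iovec inline_iov[MAX_IOVEC_IOVCNT]; /* preallocated, no malloc */
    struct iovec *iov;                         /* inline_iov or heap copy */
    int iovcnt;
};

static struct iovec *dma_iov_get(struct dma_iov_cache *c, int iovcnt)
{
    c->iovcnt = iovcnt;
    if (MAX_IOVEC_IOVCNT && iovcnt <= MAX_IOVEC_IOVCNT) {
        c->iov = c->inline_iov;                /* fast path: no allocation */
    } else {
        c->iov = malloc(iovcnt * sizeof(struct iovec)); /* rare slow path */
    }
    return c->iov;
}

static void dma_iov_put(struct dma_iov_cache *c)
{
    if (c->iov != c->inline_iov) {
        free(c->iov);
    }
}
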
> Once we finish fixing storage performance with a real
> bdrv_aio_readv/writev (now a blocker issue), a pci_dma_single can be
> added for zero-copy networking (one NIC per VM, VMDq, IOV, etc.).
> The DMA API should allow for that too.
The previous similar attempt by Anthony at generic DMA using vectored
I/O was abandoned because the malloc/free overhead outweighed the
performance gain. Have you made any performance measurements? How does
this version compare to the previous ones?
I think the pci_ prefix can be removed; there is little that is
PCI-specific. For the Sparc32 IOMMU (and probably other IOMMUs), it
should be possible to register a function used in place of
cpu_physical_memory_rw, c_p_m_can_dma, etc. The goal is that it should
be possible to stack the DMA resolvers (think of devices behind a
number of buses).
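A rough sketch of that stacking idea; every name below is illustrative
only, and the real hook would presumably bottom out in
cpu_physical_memory_rw:

#include <stdint.h>

typedef uint64_t dma_addr_t;

typedef struct DMAResolver DMAResolver;
struct DMAResolver {
    /* translate/copy for this bus level, then hand off to the parent */
    void (*rw)(DMAResolver *r, dma_addr_t addr,
               uint8_t *buf, int len, int is_write);
    DMAResolver *parent;    /* NULL means raw guest-physical memory */
    void *opaque;           /* per-bus/IOMMU state, e.g. page tables */
};

static void dma_rw(DMAResolver *r, dma_addr_t addr,
                   uint8_t *buf, int len, int is_write)
{
    if (r) {
        /* e.g. the Sparc32 IOMMU remaps addr, then recurses on r->parent */
        r->rw(r, addr, buf, len, is_write);
    } else {
        /* bottom of the stack: plain guest-physical access */
        /* cpu_physical_memory_rw(addr, buf, len, is_write); */
    }
}

Each bus along the path would register one resolver, so a device behind
several buses simply walks the parent chain.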