
From: Andrea Arcangeli
Subject: Re: [Qemu-devel] [RFC 1/2] pci-dma-api-v1
Date: Fri, 28 Nov 2008 02:56:02 +0100

On Thu, Nov 27, 2008 at 09:14:45PM +0200, Blue Swirl wrote:
> The previous similar attempt by Anthony for generic DMA using vectored
> IO was abandoned because the malloc/free overhead was more than the

Even if there were dynamic allocations in the fast path, the overhead
of malloc/free is nothing compared to issuing and waiting for a host
kernel syscall every 4k, not to mention with O_DIRECT enabled, which
is the whole point of having a direct-DMA API that truly doesn't
pollute the cache. With O_DIRECT, without a real readv/writev, I/O
performance would be destroyed, dropping to something like 10 MB/sec
even on the fastest storage/CPU/RAM combinations.
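
To make the syscall arithmetic concrete, here is a minimal sketch
(not code from the patch; struct sg_entry and its layout are invented
for illustration) contrasting a single readv for the whole
scatter-gather list with one read per element:

/* Minimal sketch, not code from the patch: struct sg_entry and its
 * layout are invented for illustration.  O_DIRECT additionally
 * requires sector-aligned buffers and lengths, elided here. */
#include <sys/uio.h>
#include <unistd.h>
#include <stddef.h>

struct sg_entry { void *host_addr; size_t len; };

/* One syscall for the whole scatter-gather list: with O_DIRECT the
 * device DMAs straight into guest RAM without touching the CPU cache. */
static ssize_t sg_readv(int fd, struct sg_entry *sg, int nsg)
{
    struct iovec iov[nsg];
    for (int i = 0; i < nsg; i++) {
        iov[i].iov_base = sg[i].host_addr;
        iov[i].iov_len  = sg[i].len;
    }
    return readv(fd, iov, nsg);
}

/* The emulated fallback: one blocking syscall per element, and with
 * O_DIRECT each call waits for the disk before the next is issued. */
static ssize_t sg_read_loop(int fd, struct sg_entry *sg, int nsg)
{
    ssize_t total = 0;
    for (int i = 0; i < nsg; i++) {
        ssize_t r = read(fd, sg[i].host_addr, sg[i].len);
        if (r <= 0)
            return total ? total : r;
        total += r;
    }
    return total;
}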

So the question is how those benchmarks were run: with or without a
real readv/writev, and with or without O_DIRECT to truly eliminate
all CPU cache pollution from the memory copies?

About malloc: all we care about is the direct-IO fast path, and with
my patch there is no allocation whatsoever in the fast path. As for
the bounce layer, that is there for correctness only (purely to do
DMA to non-RAM, or to non-linear RAM ranges with non-RAM holes in
between), and we don't care about it in performance terms.
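
As a hedged sketch of that split (hypothetical names and a toy memory
model, not the patch's actual API): the fast path hands back a direct
pointer with zero allocations, and malloc only appears on the
correctness-only bounce fallback.

#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

typedef uint64_t dma_addr_t;

/* Toy model: guest-physical addresses below RAM_SIZE are linear RAM,
 * everything above stands in for MMIO or holes. */
#define RAM_SIZE (64 * 1024)
static uint8_t guest_ram[RAM_SIZE];

static void *guest_ram_map(dma_addr_t addr, size_t len)
{
    if (addr + len <= RAM_SIZE)
        return guest_ram + addr;
    return NULL; /* not linear RAM: caller must bounce */
}

/* Fast path: direct pointer, no allocation.  Slow path: allocate a
 * bounce buffer purely for correctness; performance is irrelevant. */
static void *dma_map(dma_addr_t addr, size_t len, void **bounce)
{
    void *direct = guest_ram_map(addr, len);
    if (direct) {
        *bounce = NULL;
        return direct;
    }
    return *bounce = malloc(len);
}

int main(void)
{
    void *bounce;

    dma_map(0x1000, 512, &bounce);
    printf("in-RAM request bounced: %s\n", bounce ? "yes" : "no");

    dma_map(RAM_SIZE + 0x1000, 512, &bounce);
    printf("MMIO request bounced:   %s\n", bounce ? "yes" : "no");
    free(bounce);
    return 0;
}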

> performance gain. Have you made any performance measurements? How does
> this version compare to the previous ones?

I ran some minor benchmarks, but it's basically futile to benchmark
with bdrv_aio_readv/writev_em.

> I think the pci_ prefix can be removed, there is little PCI specific.

Adding the pci_ prefix seemed to be a naming requirement from
previous threads on the topic. Before I learnt about that, I didn't
want a pci_ prefix either, so I can certainly agree with you ;).

There is nothing PCI-specific so far. Anyway, this is just a naming
matter; it's up to you to decide what you like :).

> For Sparc32 IOMMU (and probably other IOMMUS), it should be possible
> to register a function used in place of  cpu_physical_memory_rw,
> c_p_m_can_dma etc. The goal is that it should be possible to stack the
> DMA resolvers (think of devices behind a number of buses).

The real hardware being emulated wouldn't attach to both buses, I
think, so the driver can specify what kind of IOMMU it has (behind it
you can emulate whatever hardware you want, but the original device
is still either PCI or not-PCI). I personally don't see much
difference, as renaming later wouldn't be harder than a sed
script...
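
For what it's worth, a hedged sketch of the stacking idea you
describe (types and names invented here, not anything in the patch):
each bus layer owns a translate hook and delegates to its parent, so
an IOMMU is just one link in the chain and the device never needs to
know how many buses sit above it.

#include <stdint.h>
#include <stdio.h>

typedef uint64_t dma_addr_t;

typedef struct DMAResolver DMAResolver;
struct DMAResolver {
    /* Translate a device-visible address into the parent bus's space. */
    dma_addr_t (*translate)(DMAResolver *r, dma_addr_t addr);
    DMAResolver *parent;    /* NULL: result is guest-physical */
    dma_addr_t offset;      /* state for the toy translator below */
};

/* Walk the stack until we reach system memory. */
static dma_addr_t dma_resolve(DMAResolver *r, dma_addr_t addr)
{
    while (r) {
        addr = r->translate(r, addr);
        r = r->parent;
    }
    return addr;
}

/* Toy translator standing in for a real IOMMU page-table lookup. */
static dma_addr_t add_offset(DMAResolver *r, dma_addr_t addr)
{
    return addr + r->offset;
}

int main(void)
{
    DMAResolver host_bridge = { add_offset, NULL, 0x80000000ULL };
    DMAResolver iommu = { add_offset, &host_bridge, 0x1000 };

    /* Device DMA address 0x400 seen through the IOMMU, then the
     * host bridge, ending up guest-physical. */
    printf("0x%llx\n",
           (unsigned long long)dma_resolve(&iommu, 0x400));
    return 0;
}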



