[Top][All Lists]
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Qemu-devel] Re: [PATCH RFC] dma_rw.h (was Re: [PATCH 0/7] AMD IOMMU emu
From: |
Michael S. Tsirkin |
Subject: |
[Qemu-devel] Re: [PATCH RFC] dma_rw.h (was Re: [PATCH 0/7] AMD IOMMU emulation patchset v4) |
Date: |
Thu, 16 Sep 2010 11:20:43 +0200 |
User-agent: |
Mutt/1.5.20 (2009-12-10) |
On Thu, Sep 16, 2010 at 10:06:16AM +0300, Eduard - Gabriel Munteanu wrote:
> On Mon, Sep 13, 2010 at 10:01:20PM +0200, Michael S. Tsirkin wrote:
> > So I think the following will give the idea of what an API
> > might look like that will let us avoid the scary hacks in
> > e.g. the ide layer and other generic layers that need to do DMA,
> > without either binding us to pci, adding more complexity with
> > callbacks, or losing type safety with casts and void*.
> >
> > Basically we have DMADevice that we can use container_of on
> > to get a PCIDevice from, and DMAMmu that will get instanciated
> > in a specific MMU.
> >
> > This is not complete code - just a header - I might complete
> > this later if/when there's interest or hopefully someone interested
> > in iommu emulation will.
>
> Hi,
>
> I personally like this approach better. It also seems to make poisoning
> cpu_physical_memory_*() easier if we convert every device to this API.
> We could then ban cpu_physical_memory_*(), perhaps by requiring a
> #define and #ifdef-ing those declarations.
>
> > Notes:
> > the IOMMU_PERM_RW code seem unused, so I replaced
> > this with plain is_write. Is it ever useful?
>
> The original idea made provisions for stuff like full R/W memory maps.
> In that case cpu_physical_memory_map() would call the translation /
> checking function with perms == IOMMU_PERM_RW. That's not there yet so
> it can be removed at the moment, especially since it only affects these
> helpers.
>
> Also, I'm not sure if there are other sorts of accesses besides reads
> and writes we want to check or translate.
>
> > It seems that invalidate callback should be able to
> > get away with just a device, so I switched to that
> > from a void pointer for type safety.
> > Seems enough for the users I saw.
>
> I think this makes matters too complicated. Normally, a single DMADevice
> should be embedded within a <bus>Device,
No, DMADevice is a device that does DMA.
So e.g. a PCI device would embed one.
Remember, traslations are per device, right?
DMAMmu is part of the iommu object.
> so doing this makes it really
> hard to invalidate a specific map when there are more of them. It forces
> device code to act as a bus, provide fake 'DMADevice's for each map and
> dispatch translation to the real DMATranslateFunc. I see no other way.
>
> If you really want more type-safety (although I think this is a case of
> a true opaque identifying something only device code understands), I
> have another proposal: have a DMAMap embedded in the opaque. Example
> from dma-helpers.c:
>
> typedef struct {
> DMADevice *owner;
> [...]
> } DMAMap;
>
> typedef struct {
> [...]
> DMAMap map;
> [...]
> } DMAAIOCB;
>
> /* The callback. */
> static void dma_bdrv_cancel(DMAMap *map)
> {
> DMAAIOCB *dbs = container_of(map, DMAAIOCB, map);
>
> [...]
> }
>
> The upside is we only need to pass the DMAMap. That can also contain
> details of the actual map in case the device wants to release only the
> relevant range and remap the rest.
Fine.
Or maybe DMAAIOCB (just make some letters lower case: DMAIocb?).
Everyone will use it anyway, right?
> > I saw devices do stl_le_phys and such, these
> > might need to be wrapped as well.
>
> stl_le_phys() is defined and used only by hw/eepro100.c. That's already
> dealt with by converting the device.
>
I see. Need to get around to adding some prefix to it to make this clear.
> Thanks,
> Eduard
>
> > Signed-off-by: Michael S. Tsirkin <address@hidden>
> >
> > ---
> >
> > diff --git a/hw/dma_rw.h b/hw/dma_rw.h
> > new file mode 100644
> > index 0000000..d63fd17
> > --- /dev/null
> > +++ b/hw/dma_rw.h
> > @@ -0,0 +1,122 @@
> > +#ifndef DMA_RW_H
> > +#define DMA_RW_H
> > +
> > +#include "qemu-common.h"
> > +
> > +/* We currently only have pci mmus, but using
> > + a generic type makes it possible to use this
> > + e.g. from the generic ide code without callbacks. */
> > +typedef uint64_t dma_addr_t;
> > +
> > +typedef struct DMAMmu DMAMmu;
> > +typedef struct DMADevice DMADevice;
> > +
> > +typedef int DMATranslateFunc(DMAMmu *mmu,
> > + DMADevice *dev,
> > + dma_addr_t addr,
> > + dma_addr_t *paddr,
> > + dma_addr_t *len,
> > + int is_write);
> > +
> > +typedef int DMAInvalidateMapFunc(DMADevice *);
> > +struct DMAMmu {
> > + /* invalidate, etc. */
> > + DmaTranslateFunc *translate;
> > +};
> > +
> > +struct DMADevice {
> > + DMAMmu *mmu;
> > + DMAInvalidateMapFunc *invalidate;
> > +};
> > +
> > +void dma_device_init(DMADevice *, DMAMmu *, DMAInvalidateMapFunc *);
> > +
> > +static inline void dma_memory_rw(DMADevice *dev,
> > + dma_addr_t addr,
> > + void *buf,
> > + uint32_t len,
> > + int is_write)
> > +{
> > + uint32_t plen;
> > + /* Fast-path non-iommu.
> > + * More importantly, makes it obvious what this function does. */
> > + if (!dev->mmu) {
> > + cpu_physical_memory_rw(paddr, buf, plen, is_write);
> > + return;
> > + }
> > + while (len) {
> > + err = dev->mmu->translate(iommu, dev, addr, &paddr, &plen,
> > is_write);
> > + if (err) {
> > + return;
> > + }
> > +
> > + /* The translation might be valid for larger regions. */
> > + if (plen > len) {
> > + plen = len;
> > + }
> > +
> > + cpu_physical_memory_rw(paddr, buf, plen, is_write);
> > +
> > + len -= plen;
> > + addr += plen;
> > + buf += plen;
> > + }
> > +}
> > +
> > +void *dma_memory_map(DMADevice *dev,
> > + dma_addr_t addr,
> > + uint32_t *len,
> > + int is_write);
> > +void dma_memory_unmap(DMADevice *dev,
> > + void *buffer,
> > + uint32_t len,
> > + int is_write,
> > + uint32_t access_len);
> > +
> > +
> > ++#define DEFINE_DMA_LD(suffix, size)
> > \
> > ++uint##size##_t dma_ld##suffix(DMADevice *dev, dma_addr_t addr)
> > \
> > ++{
> > \
> > ++ int err;
> > \
> > ++ target_phys_addr_t paddr, plen;
> > \
> > ++ if (!dev->mmu) {
> > \
> > ++ return ld##suffix##_phys(addr, val);
> > \
> > ++ }
> > \
> > ++
> > \
> > ++ err = dev->mmu->translate(dev->bus->iommu, dev,
> > \
> > ++ addr, &paddr, &plen, IOMMU_PERM_READ);
> > \
> > ++ if (err || (plen < size / 8))
> > \
> > ++ return 0;
> > \
> > ++
> > \
> > ++ return ld##suffix##_phys(paddr);
> > \
> > ++}
> > ++
> > ++#define DEFINE_DMA_ST(suffix, size)
> > \
> > ++void dma_st##suffix(DMADevice *dev, dma_addr_t addr, uint##size##_t val)
> > \
> > ++{
> > \
> > ++ int err;
> > \
> > ++ target_phys_addr_t paddr, plen;
> > \
> > ++
> > \
> > ++ if (!dev->mmu) {
> > \
> > ++ st##suffix##_phys(addr, val);
> > \
> > ++ return;
> > \
> > ++ }
> > \
> > ++ err = dev->mmu->translate(dev->bus->iommu, dev,
> > \
> > ++ addr, &paddr, &plen, IOMMU_PERM_WRITE);
> > \
> > ++ if (err || (plen < size / 8))
> > \
> > ++ return;
> > \
> > ++
> > \
> > ++ st##suffix##_phys(paddr, val);
> > \
> > ++}
> > +
> > +DEFINE_DMA_LD(ub, 8)
> > +DEFINE_DMA_LD(uw, 16)
> > +DEFINE_DMA_LD(l, 32)
> > +DEFINE_DMA_LD(q, 64)
> > +
> > +DEFINE_DMA_ST(b, 8)
> > +DEFINE_DMA_ST(w, 16)
> > +DEFINE_DMA_ST(l, 32)
> > +DEFINE_DMA_ST(q, 64)
> > +
> > +#endif
> > diff --git a/hw/pci.h b/hw/pci.h
> > index 1c6075e..9737f0e 100644
> > --- a/hw/pci.h
> > +++ b/hw/pci.h
> > @@ -5,6 +5,7 @@
> > #include "qobject.h"
> >
> > #include "qdev.h"
> > +#include "dma_rw.h"
> >
> > /* PCI includes legacy ISA access. */
> > #include "isa.h"
> > @@ -119,6 +120,10 @@ enum {
> >
> > struct PCIDevice {
> > DeviceState qdev;
> > +
> > + /* For devices that do DMA. */
> > + DMADevice dma;
> > +
> > /* PCI config space */
> > uint8_t *config;
> >