
Re: [Qemu-ppc] VFIO: Not require to make VFIO_IOMMU_MAP_DMA for MMAPED PCI bars


From: Bhushan Bharat-R65777
Subject: Re: [Qemu-ppc] VFIO: Not require to make VFIO_IOMMU_MAP_DMA for MMAPED PCI bars
Date: Tue, 19 Feb 2013 04:56:09 +0000


> -----Original Message-----
> From: Alex Williamson [mailto:address@hidden
> Sent: Tuesday, February 19, 2013 9:55 AM
> To: Bhushan Bharat-R65777
> Cc: Yoder Stuart-B08248; Wood Scott-B07421; Graf Alexander-B36701; qemu-
> address@hidden; address@hidden
> Subject: Re: VFIO: Not require to make VFIO_IOMMU_MAP_DMA for MMAPED PCI bars
> 
> On Tue, 2013-02-19 at 04:05 +0000, Bhushan Bharat-R65777 wrote:
> >
> > > -----Original Message-----
> > > From: Alex Williamson [mailto:address@hidden
> > > Sent: Thursday, February 14, 2013 11:57 PM
> > > To: Bhushan Bharat-R65777
> > > Cc: Yoder Stuart-B08248; Wood Scott-B07421; Graf Alexander-B36701;
> > > qemu- address@hidden; address@hidden
> > > Subject: Re: VFIO: Not require to make VFIO_IOMMU_MAP_DMA for MMAPED
> > > PCI bars
> > >
> > > On Thu, 2013-02-14 at 08:22 +0000, Bhushan Bharat-R65777 wrote:
> > > > Hi Alex Williamson,
> > > >
> > > > I am able (with some hacks :)) to directly assign the e1000 PCI
> > > > device to a KVM guest using VFIO on a Freescale device. One of the
> > > > problems I am facing is the DMA mapping in the IOMMU for PCI
> > > > device BARs. On Freescale devices, the mmapped PCI device BARs are
> > > > not required to be mapped in the IOMMU. Typically the flow of an
> > > > in/out transaction (from the CPU) is:
> > > >
> > > > Incoming flow:
> > > >
> > > > -----|                     |----------|         |---------------|                   |-------------|
> > > > CORE |<----<------<-----<--| IOMMU    |<---<---<| PCI-Controller|<------<-----<----<| PCI device  |
> > > > -----|                     |----------|         |---------------|                   |-------------|
> > > >
> > > > Outgoing flow: the IOMMU is bypassed for outgoing transactions
> > > >
> > > > -----|                     |----------|         |---------------|                   |-------------|
> > > > CORE |>---->------>----|   | IOMMU    |    ->-->| PCI-Controller|>------>----->---->| PCI device  |
> > > > -----|                 |   |----------|   ^     |---------------|                   |-------------|
> > > >                        |                  |
> > > >                        |------------------|
> > >
> > > Mapping the BAR into the IOMMU isn't for this second use case; CPU-to-device
> > > accesses are never IOMMU translated.  Instead, it's for this:
> > >
> > > |----------|         |---------------|                   |-------------|
> > > | IOMMU    |<---<---<| PCI-Controller|<------<-----<----<| PCI device  |
> > > |----------|         |---------------|                   |-------------|
> > >       |
> > >       |              |---------------|                   |-------------|
> > >       +-->---->--- ->| PCI-Controller|>------>----->---->| PCI device  |
> > >                      |---------------|                   |-------------|
> > >
> > > It's perfectly fine to not support peer-to-peer DMA, but let's skip it only
> > > where it's not supported, in case it might actually work in some cases.
> > >
> > > > Also, because of some hardware limitations in our IOMMU, it is
> > > > difficult to map these BAR regions alongside the RAM (DDR) regions.
> > > > So on Freescale devices we want the VFIO_IOMMU_MAP_DMA ioctl to be
> > > > called for RAM (DDR) regions only and _not_ for these mmapped RAM
> > > > regions of the PCI device BARs. I understand that we need to add
> > > > these mmapped PCI BARs as RAM type via the QEMU memory_region_*()
> > > > API. So for that I tried to skip these regions in the VFIO memory
> > > > listener. Below are the changes which work for me. I am not sure
> > > > whether this is the correct approach, please suggest.
> > > >
> > > > -------------
> > > > diff --git a/hw/vfio_pci.c b/hw/vfio_pci.c
> > > > index c51ae67..63728d8 100644
> > > > --- a/hw/vfio_pci.c
> > > > +++ b/hw/vfio_pci.c
> > > > @@ -1115,9 +1115,35 @@ static int vfio_dma_map(VFIOContainer *container, hwaddr iova,
> > > >      return -errno;
> > > >  }
> > > >
> > > > -static bool vfio_listener_skipped_section(MemoryRegionSection *section)
> > > > +static int memory_region_is_mmap_bars(VFIOContainer *container,
> > > > +                                      MemoryRegionSection *section)
> > > >  {
> > > > -    return !memory_region_is_ram(section->mr);
> > > > +    VFIOGroup *group;
> > > > +    VFIODevice *vdev;
> > > > +    int i;
> > > > +
> > > > +    QLIST_FOREACH(group, &container->group_list, next) {
> > >
> > > I think you want to start at the global &group_list
> >
> > Why do you think global? My understanding is that the dma_map/unmap
> > operation is limited to the groups in the given container.
> 
> There can theoretically be more than one container, so if you're only 
> searching
> groups within a container then you may not be searching all the devices.
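For illustration, walking the global group_list would then look roughly like the
untested sketch below. The names vfio_section_is_device_bar() and
vfio_device_owns_mr() are made up here; the latter would compare section->mr
against the device's msix->mmap_mem and bars[i].mmap_mem regions (how to do that
comparison is the subject of the next point).

/* Untested sketch: start from the file-scope group_list in hw/vfio_pci.c so
 * that devices under every container are covered.  Helper names are
 * placeholders for this example only. */
static bool vfio_section_is_device_bar(MemoryRegionSection *section)
{
    VFIOGroup *group;
    VFIODevice *vdev;

    QLIST_FOREACH(group, &group_list, next) {    /* global list, not container->group_list */
        QLIST_FOREACH(vdev, &group->device_list, next) {
            if (vfio_device_owns_mr(vdev, section->mr)) {   /* made-up helper */
                return true;
            }
        }
    }
    return false;
}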
> 
> > > > +        QLIST_FOREACH(vdev, &group->device_list, next) {
> > > > +            if (vdev->msix->mmap_mem.ram_addr == section->mr->ram_addr)
> > >
> > > Note the comment in memory.h:
> > >
> > > struct MemoryRegion {
> > >     /* All fields are private - violators will be prosecuted */
> > >     ...
> > >     ram_addr_t ram_addr;
> > >
> > > You'll need to invent some kind of memory region comparison
> > > interfaces instead of accessing these private fields.
> >
> > You mean adding some API in memory.c?
> 
> Yes
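Something like the following, perhaps (untested; the name memory_region_same_ram
is invented for this discussion and is not an existing QEMU interface), so that
callers never reach into the private ram_addr field themselves:

/* Hypothetical helper for memory.c: compare two regions by their backing RAM
 * without the caller touching the private ram_addr field.  memory_region_is_ram()
 * is the existing API; the rest is illustrative only. */
bool memory_region_same_ram(MemoryRegion *a, MemoryRegion *b)
{
    if (!memory_region_is_ram(a) || !memory_region_is_ram(b)) {
        return false;
    }
    return a->ram_addr == b->ram_addr;   /* legal here: memory.c owns the field */
}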
> 
> > > > +                return 1;
> > > > +            for (i = 0; i < PCI_ROM_SLOT; i++) {
> > > > +                VFIOBAR *bar = &vdev->bars[i];
> > > > +                if (bar->mmap_mem.ram_addr == section->mr->ram_addr)
> > > > +                    return 1;
> > > > +            }
> > > > +        }
> > > > +    }
> > > > +
> > > > +    return 0;
> > > > +}
> > >
> > > It's a pretty heavyweight function; I think the memory API should
> > > help us differentiate MMIO from RAM.
> >
> > What about adding a bool in "struct MemoryRegion {":
> >
> >     bool require_iommu_map;
> >
> > There will be APIs to set/clear this bool, and by default every region
> > with "ram = true" will have "require_iommu_map = true".
> >
> > if (!(memory_region->ram == true && memory_region->require_iommu_map == true))
> >     skip_iommumap;
> 
> I don't think memory.c wants to care about requiring iommu mapping or not.
> Perhaps an mmio address space vs a ram address space.

Can you please describe what you mean by an MMIO address space vs a RAM address
space here? Which elements of struct MemoryRegion are you talking about?
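Is the distinction the one made when the region is created? If I read
hw/vfio_pci.c right, the mmap'd BAR is registered with
memory_region_init_ram_ptr(), which is why it currently looks like RAM to the
listener. A rough, untested sketch (simplified 1.4-era memory API signatures;
ExampleDev and example_mmio_ops are placeholders, not real code):

/* Illustrative only: the same device can expose both kinds of region, and
 * only the second one is "RAM" in the sense the VFIO listener tests for. */
static void example_init_regions(ExampleDev *d, void *bar_mmap, uint64_t size)
{
    /* MMIO: backed by read/write callbacks; memory_region_is_ram() is false */
    memory_region_init_io(&d->mmio, &example_mmio_ops, d, "example-mmio", size);

    /* RAM-backed: this is how the mmap'd BAR is registered today, so
     * memory_region_is_ram() is true and the VFIO listener cannot tell it
     * apart from guest DDR */
    memory_region_init_ram_ptr(&d->bar_mem, "example-bar-mmap", size, bar_mmap);
}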

> > >
> > > > +static bool vfio_listener_skipped_section(VFIOContainer *container,
> > > > +                                          MemoryRegionSection *section) {
> > > > +    if (!memory_region_is_ram(section->mr))
> > > > +       return 1;
> > >
> > > Need some kind of conditional here for platforms that support 
> > > peer-to-peer.
> > > Thanks,
> >
> > Would adding an iommu_type (VFIO_TYPE2_IOMMU), which does not require BARs
> > to be mapped in the IOMMU, help here?
> 
> "Type2" seems excessive, there's already an VFIO_IOMMU_GET_INFO ioctl that 
> could
> easily add a flag and maybe a field.  I think you're doing something with 
> iommu
> geometry, perhaps inferring the geometry somehow.
> Why not add that to the GET_INFO data and use that to decide whether to 
> restrict
> mapping to ram?  Thanks,

I agree that getting the IOMMU type via VFIO_IOMMU_GET_INFO and then using a
flag is a good idea. What I meant was that the flag would have a value of
either TYPE1 or TYPE2/FSL_PAMU, etc.

BTW: currently hw/vfio_pci.c uses the VFIO_TYPE1_IOMMU defines.
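As a rough illustration of consuming such a flag on the QEMU side (untested;
the VFIO_IOMMU_INFO_NO_MMIO_MAP name and value are invented here and are not
part of the VFIO UAPI, and the function name is a placeholder):

#include <sys/ioctl.h>
#include <linux/vfio.h>
#include <stdbool.h>

/* Hypothetical flag, invented for this sketch only. */
#define VFIO_IOMMU_INFO_NO_MMIO_MAP (1 << 7)

/* Ask the container's IOMMU driver whether mmap'd MMIO (device BAR) regions
 * should be left out of the IOMMU mapping. */
static bool vfio_iommu_skips_mmio(int container_fd)
{
    struct vfio_iommu_type1_info info = { .argsz = sizeof(info) };

    if (ioctl(container_fd, VFIO_IOMMU_GET_INFO, &info)) {
        return false;               /* be conservative if the ioctl fails */
    }
    return info.flags & VFIO_IOMMU_INFO_NO_MMIO_MAP;
}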

Thanks
-Bharat

> 
> Alex
> 

