Re: [Qemu-devel] RFC: vfio API changes needed for powerpc


From:	Scott Wood
Subject:	Re: [Qemu-devel] RFC: vfio API changes needed for powerpc
Date:	Tue, 2 Apr 2013 14:39:24 -0500

On 04/02/2013 12:32:00 PM, Yoder Stuart-B08248 wrote:

Alex,

We are in the process of implementing vfio-pci support for the Freescale
IOMMU (PAMU).  It is an aperture/window-based IOMMU and is quite different
than x86, and will involve creating a 'type 2' vfio implementation.

For each device's DMA mappings, PAMU has an overall aperture and a number
of windows.  All sizes and window counts must be powers of 2.  To
illustrate, below is a mapping for a 256MB guest, including guest memory
(backed by 64MB huge pages) and some windows for MSIs:

    Total aperture: 512MB
    # of windows: 8

    win gphys/
    #   iova        phys          size
    --- ----        ----          ----
    0   0x00000000  0xX_XX000000  64MB
    1   0x04000000  0xX_XX000000  64MB
    2   0x08000000  0xX_XX000000  64MB
    3   0x0C000000  0xX_XX000000  64MB
    4   0x10000000  0xf_fe044000  4KB    // msi bank 1
    5   0x14000000  0xf_fe045000  4KB    // msi bank 2
    6   0x18000000  0xf_fe046000  4KB    // msi bank 3
    7            -             -  disabled
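
The iova column above follows directly from the geometry. A minimal sketch of that arithmetic, assuming the windows are equal slices of the aperture (which is what the table shows) and that the aperture starts at iova 0; the function names are illustrative only and are not part of any kernel or vfio API:

```python
MB = 1 << 20


def is_power_of_two(n: int) -> bool:
    """Return True if n is a positive power of two."""
    return n > 0 and (n & (n - 1)) == 0


def window_layout(aperture_size: int, window_count: int, aperture_base: int = 0):
    """Return (index, iova, max_size) for each window slot in the aperture.

    Assumption: every window slot is aperture_size / window_count bytes,
    matching the 512MB / 8 windows example in the table.
    """
    if not is_power_of_two(aperture_size):
        raise ValueError("aperture size must be a power of 2")
    if not is_power_of_two(window_count):
        raise ValueError("window count must be a power of 2")
    window_size = aperture_size // window_count
    return [
        (i, aperture_base + i * window_size, window_size)
        for i in range(window_count)
    ]


if __name__ == "__main__":
    for index, iova, size in window_layout(512 * MB, 8):
        print(f"{index}   0x{iova:08X}  up to {size // MB}MB")
```

With a 512MB aperture and 8 windows, each slot is 64MB, so window 4 (the first MSI bank) lands at iova 0x10000000. A window may map less than its slot, as the 4KB MSI windows do.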

There are a couple of updates needed to the vfio user->kernel interface
that we would like your feedback on.

1.  IOMMU geometry

   The kernel IOMMU driver now has an interface (see domain_set_attr,
   domain_get_attr) that lets us set the domain geometry using
   "attributes".

   We want to expose that to user space, so envision needing a couple
   of new ioctls to do this:
        VFIO_IOMMU_SET_ATTR
        VFIO_IOMMU_GET_ATTR

Note that this means attributes need to be updated for user-API appropriateness, such as using fixed-size types.

2.   MSI window mappings

   The more problematic question is how to deal with MSIs.  We need to
   create mappings for up to 3 MSI banks that a device may need to target
   to generate interrupts.  The Linux MSI driver can allocate MSIs from
   the 3 banks any way it wants, and currently user space has no way of
   knowing which bank may be used for a given device.
   There are 3 options we have discussed and would like your direction:
   A.  Implicit mappings -- with this approach user space would not
       explicitly map MSIs.  User space would be required to set the
       geometry so that there are 3 unused windows (the last 3 windows)

Where does userspace get the number "3" from? E.g. on newer chips there are 4 MSI banks. Maybe future chips have even more.

   B.  Explicit mapping using DMA map flags.  The idea is that a new
       flag to DMA map (VFIO_DMA_MAP_FLAG_MSI) would mean that
       a mapping is to be created for the supplied iova.  No vaddr
       is given though.  So in the above example there would be a
       dma map at 0x10000000 for 24KB (and no vaddr).

A single 24 KiB mapping wouldn't work (and why 24KB? What if only one MSI group is involved in this VFIO group? What if four MSI groups are involved?). You'd need to either have a naturally aligned, power-of-two sized mapping that covers exactly the pages you want to map and no more, or you'd need to create a separate mapping for each MSI bank, and due to PAMU subwindow alignment restrictions these mappings could not be contiguous in iova-space.
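
To make the alignment point concrete, here is a rough sketch that finds the smallest naturally aligned, power-of-two region covering a physical range, using the three MSI bank addresses from the table (with the elided high digits taken as 0xf_fe0..., as written there). The 4KB page size and the helper name are assumptions for illustration, not PAMU driver code:

```python
PAGE_SIZE = 0x1000  # assumed 4KB pages, matching the 4KB MSI windows above


def smallest_aligned_cover(start: int, end: int):
    """Return (base, size) of the smallest naturally aligned power-of-two
    region that covers [start, end), where end is exclusive."""
    if start < 0 or start >= end:
        raise ValueError("need 0 <= start < end")
    size = PAGE_SIZE
    while True:
        base = start & ~(size - 1)  # round start down to a multiple of size
        if base + size >= end:
            return base, size
        size *= 2


if __name__ == "__main__":
    banks = [0xFFE044000, 0xFFE045000, 0xFFE046000]
    base, size = smallest_aligned_cover(banks[0], banks[-1] + PAGE_SIZE)
    extra_pages = size // PAGE_SIZE - len(banks)
    print(f"base=0x{base:x} size=0x{size:x} extra pages={extra_pages}")
```

For the three banks this yields a 16KB region at 0xffe044000, which also exposes the page at 0xffe047000 that nobody asked to map. A single bank is covered exactly by one 4KB page, which is why one mapping per bank avoids the problem.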

   C.  Explicit mapping using normal DMA map.  The last idea is that
       we would introduce a new ioctl to give user-space an fd to
       the MSI bank, which could be mmapped.  The flow would be
       something like this:

          -for each group user space calls new ioctl VFIO_GROUP_GET_MSI_FD
          -user space mmaps the fd, getting a vaddr
          -user space does a normal DMA map for desired iova
       This approach makes everything explicit, but adds a new ioctl
       applicable most likely only to the PAMU (type2 iommu).

The new ioctl isn't really specific to PAMU (or whatever "type2" is supposed to be, which nobody ever explains when I ask), so much as to the MSI implementation. It just exposes the MSI register as another device resource (well, technically a groupwide resource, unless we expose it on a per-device basis and provide enough information for userspace to recognize when it's the same for other devices in the group) to be mmapped, which userspace can choose to map in the IOMMU as well.

Note that in the explicit case, userspace would have to program the MSI iova into the PCI device's config space (or communicate the chosen address to the kernel so it can set the config space registers).
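
As a rough illustration of what "program the MSI iova" involves: a 64-bit-capable PCI MSI capability holds the message address as two little-endian dwords (Message Address and Message Upper Address). The sketch below only splits and packs a chosen iova. It assumes a 64-bit-capable capability, does not locate the capability or write config space, and leaves out the offset of the MSI register within the bank's page, which the thread does not specify. The function names are hypothetical.

```python
import struct


def msi_address_dwords(iova: int):
    """Split a 64-bit MSI target address into (lower, upper) dwords."""
    if not 0 <= iova < (1 << 64):
        raise ValueError("iova must fit in 64 bits")
    if iova & 0x3:
        raise ValueError("MSI message address must be dword aligned")
    return iova & 0xFFFFFFFF, iova >> 32


def pack_msi_address(iova: int) -> bytes:
    """Pack the two dwords in little-endian order, as config space stores them."""
    lower, upper = msi_address_dwords(iova)
    return struct.pack("<II", lower, upper)


if __name__ == "__main__":
    # iova of window 4 (msi bank 1) from the example mapping
    lower, upper = msi_address_dwords(0x10000000)
    print(f"lower=0x{lower:08x} upper=0x{upper:08x}")
```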


-Scott
