Re: [Qemu-devel] RFC: vfio API changes needed for powerpc


From: Stuart Yoder
Subject: Re: [Qemu-devel] RFC: vfio API changes needed for powerpc
Date: Tue, 2 Apr 2013 15:58:53 -0500

On Tue, Apr 2, 2013 at 3:47 PM, Scott Wood <address@hidden> wrote:
> On 04/02/2013 03:38:42 PM, Stuart Yoder wrote:
>>
>> On Tue, Apr 2, 2013 at 2:39 PM, Scott Wood <address@hidden>
>> wrote:
>> > On 04/02/2013 12:32:00 PM, Yoder Stuart-B08248 wrote:
>> >>
>> >> Alex,
>> >>
>> >> We are in the process of implementing vfio-pci support for the
>> >> Freescale
>> >> IOMMU (PAMU).  It is an aperture/window-based IOMMU and is quite
>> >> different
>> >> than x86, and will involve creating a 'type 2' vfio implementation.
>> >>
>> >> For each device's DMA mappings, PAMU has an overall aperture and a
>> >> number
>> >> of windows.  All sizes and window counts must be powers of 2.  To
>> >> illustrate,
>> >> below is a mapping for a 256MB guest, including guest memory (backed by
>> >> 64MB huge pages) and some windows for MSIs:
>> >>
>> >>     Total aperture: 512MB
>> >>     # of windows: 8
>> >>
>> >>     win gphys/
>> >>     #   iova        phys          size
>> >>     --- ----        ----          ----
>> >>     0   0x00000000  0xX_XX000000  64MB
>> >>     1   0x04000000  0xX_XX000000  64MB
>> >>     2   0x08000000  0xX_XX000000  64MB
>> >>     3   0x0C000000  0xX_XX000000  64MB
>> >>     4   0x10000000  0xf_fe044000  4KB    // msi bank 1
>> >>     5   0x14000000  0xf_fe045000  4KB    // msi bank 2
>> >>     6   0x18000000  0xf_fe046000  4KB    // msi bank 3
>> >>     7            -             -  disabled
>> >>
>> >> There are a couple of updates needed to the vfio user->kernel interface
>> >> that we would like your feedback on.
>> >>
>> >> 1.  IOMMU geometry
>> >>
>> >>    The kernel IOMMU driver now has an interface (see domain_set_attr,
>> >>    domain_get_attr) that lets us set the domain geometry using
>> >>    "attributes".
>> >>
>> >>    We want to expose that to user space, so we envision needing a couple
>> >>    of new ioctls to do this:
>> >>         VFIO_IOMMU_SET_ATTR
>> >>         VFIO_IOMMU_GET_ATTR
>> >
>> >
>> > Note that this means attributes need to be updated for user-API
>> > appropriateness, such as using fixed-size types.
>> >
>> >
>> >> 2.   MSI window mappings
>> >>
>> >>    The more problematic question is how to deal with MSIs.  We need to
>> >>    create mappings for up to 3 MSI banks that a device may need to
>> >> target
>> >>    to generate interrupts.  The Linux MSI driver can allocate MSIs from
>> >>    the 3 banks any way it wants, and currently user space has no way of
>> >>    knowing which bank may be used for a given device.
>> >>
>> >>    There are 3 options we have discussed and would like your direction:
>> >>
>> >>    A.  Implicit mappings -- with this approach user space would not
>> >>        explicitly map MSIs.  User space would be required to set the
>> >>        geometry so that there are 3 unused windows (the last 3 windows)
>> >
>> >
>> > Where does userspace get the number "3" from?  E.g. on newer chips there
>> > are
>> > 4 MSI banks.  Maybe future chips have even more.
>>
>> Ok, then make the number 4.   The chance of more MSI banks in future chips
>> is nil,
>
>
> What makes you so sure?  Especially since you seem to be presenting this as
> not specifically an MPIC API.
>
>
>> and if it ever happened user space could adjust.
>
>
> What bit of API is going to tell it that it needs to adjust?

Haven't thought through that completely, but I guess we could add an API
to return the number of MSI banks for type 2 iommus.
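
As a rough sketch of how user space might use such a bank count when
choosing a geometry: since window counts must be a power of 2, it could
add one window per MSI bank to the windows needed for guest memory and
round up.  The helper names below are hypothetical, and msi_banks stands
in for whatever a future query API would return; nothing here is an
existing vfio interface.

```c
#include <assert.h>
#include <stdint.h>

/* Round n up to the next power of two.  Requires 1 <= n <= 2^31. */
static uint32_t round_up_pow2(uint32_t n)
{
    uint32_t p = 1;

    while (p < n)
        p <<= 1;
    assert((p & (p - 1)) == 0);   /* result has exactly one bit set */
    return p;
}

/*
 * Hypothetical helper: number of windows to request so that the guest
 * memory windows plus one window per MSI bank all fit.
 */
static uint32_t windows_needed(uint32_t mem_windows, uint32_t msi_banks)
{
    return round_up_pow2(mem_windows + msi_banks);
}
```

With the example above (4 guest memory windows), 3 or 4 MSI banks both
give 8 windows, while 5 banks would push the geometry to 16.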

>> Also, practically speaking, since memory is typically allocated in powers
>> of 2, you need to approximately double the window geometry anyway.
>
>
> Only if your existing mapping needs fit exactly in a power of two.
>
>
>> >>    B.  Explicit mapping using DMA map flags.  The idea is that a new
>> >>        flag to DMA map (VFIO_DMA_MAP_FLAG_MSI) would mean that
>> >>        a mapping is to be created for the supplied iova.  No vaddr
>> >>        is given, though.  So in the above example there would be a
>> >>        DMA map at 0x10000000 for 24KB (and no vaddr).
>> >
>> >
>> > A single 24 KiB mapping wouldn't work (and why 24KB? What if only one
>> > MSI
>> > group is involved in this VFIO group?  What if four MSI groups are
>> > involved?).  You'd need to either have a naturally aligned, power-of-two
>> > sized mapping that covers exactly the pages you want to map and no more,
>> > or
>> > you'd need to create a separate mapping for each MSI bank, and due to
>> > PAMU
>> > subwindow alignment restrictions these mappings could not be contiguous
>> > in
>> > iova-space.
>>
>> You're right, a single 24KB mapping wouldn't work--  in the case of 3 MSI
>> banks
>> perhaps we could just do one 64MB*3 mapping to identify which windows
>> are used for MSIs.
>
>
> Where did the assumption of a 64MiB subwindow size come from?

The example I was using.   User space would need to create a
mapping for window_size * msi_bank_count.
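
To make the arithmetic concrete, here is a minimal sketch of how user
space could derive that region from the geometry.  It assumes the
aperture starts at iova 0 and that the MSI windows are contiguous; the
struct and function names are made up for illustration and are not part
of any existing API.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical description of the iova range reserved for MSI banks. */
struct msi_region {
    uint64_t iova;
    uint64_t size;
};

static int is_pow2(uint64_t v)
{
    return v != 0 && (v & (v - 1)) == 0;
}

/*
 * Compute the range covering `banks` windows starting at window index
 * `first_win`.  Returns 0 on success, -1 if the geometry is invalid or
 * the windows do not fit in the aperture.
 */
static int msi_region_for(uint64_t aperture, uint32_t nwin,
                          uint32_t first_win, uint32_t banks,
                          struct msi_region *out)
{
    uint64_t window_size;

    if (!is_pow2(aperture) || !is_pow2(nwin) || nwin > aperture)
        return -1;
    if (banks == 0 || first_win > nwin || banks > nwin - first_win)
        return -1;

    window_size = aperture / nwin;          /* 512MB / 8 = 64MB */
    out->iova = (uint64_t)first_win * window_size;
    out->size = (uint64_t)banks * window_size;
    assert(out->iova + out->size <= aperture);
    return 0;
}
```

For the 512MB, 8-window example with MSI banks in windows 4-6, this
yields iova 0x10000000 and a size of 0x0C000000 (3 * 64MB).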

Stuart
