From: Alexey Kardashevskiy
Subject: Re: [Qemu-ppc] [PATCH qemu 2/2] spapr_pci: Advertise 16M IOMMU pages when available
Date: Mon, 9 Jan 2017 13:06:03 +1100
On 03/01/17 10:41, David Gibson wrote:
> On Thu, Dec 22, 2016 at 04:22:12PM +1100, Alexey Kardashevskiy wrote:
>> On sPAPR, the IOMMU page size varies, and if QEMU is running with RAM
>> backed by hugepages, we can advertise this to the guest; this patch
>> does so.
>> Signed-off-by: Alexey Kardashevskiy <address@hidden>
>> ---
>>  hw/ppc/spapr_pci.c | 3 +++
>>  1 file changed, 3 insertions(+)
>> diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c
>> index fd6fc1d953..09244056fc 100644
>> --- a/hw/ppc/spapr_pci.c
>> +++ b/hw/ppc/spapr_pci.c
>> @@ -1505,6 +1505,9 @@ static void spapr_phb_realize(DeviceState *dev, Error **errp)
>>      }
>>      /* DMA setup */
>> +    /* This allows huge pages for IOMMU when guest is backed with huge pages */
>> +    sphb->page_size_mask |= qemu_getrampagesize();
> This doesn't look right - you're unconditionally enabling the host ram
> page size, regardless of anything else.  Instead the backing page size
> should be used to filter out those sizes which are possible from the
> list of those supported by the guest hardware.  This patch will give
> particularly odd results if you ran it on x86 with hugepages for
> example: it would advertise a 2M IOMMU page size, which could never
> exist on native POWER.

OK, I'll filter 16M out when it is passed to the PHB but not supported by
the host.
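
A rough sketch of what such filtering could look like, not the actual
follow-up patch. It assumes the policy "keep only the guest-supported sizes
that do not exceed the backing page size"; `sketch_filter_page_mask` and the
`SKETCH_PAGE_*` constants are hypothetical names for illustration, not
existing QEMU symbols.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical constants: IOMMU page sizes a pseries guest can negotiate. */
#define SKETCH_PAGE_4K   (UINT64_C(1) << 12)
#define SKETCH_PAGE_64K  (UINT64_C(1) << 16)
#define SKETCH_PAGE_16M  (UINT64_C(1) << 24)

/*
 * Hypothetical helper: start from the mask of sizes the guest hardware
 * supports and drop every size larger than the host backing page size.
 * Nothing is ever added to the mask, so a host-only size (e.g. 2M on x86)
 * can never be advertised.
 */
static uint64_t sketch_filter_page_mask(uint64_t guest_mask,
                                        uint64_t backing_page_size)
{
    /* The backing page size must be a non-zero power of two. */
    assert(backing_page_size != 0 &&
           (backing_page_size & (backing_page_size - 1)) == 0);

    /* All bits up to and including the backing page size bit. */
    uint64_t allowed = backing_page_size | (backing_page_size - 1);

    return guest_mask & allowed;
}
```

With a guest mask of 4K | 64K | 16M and 2M backing pages (the x86 hugepage
case), this yields 4K | 64K: 16M is dropped and 2M is never advertised.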

> Except... come to think of it, why is the backing RAM page size
> relevant at all? 

Because this is just an optimization/acceleration, and I'd think the user
wants to know whether it is actually accelerated or not. If I always allow
16M pages and QEMU is not backed with hugepages, then every H_PUT_TCE will
take the slow path and consume as much memory for TCEs as without
hugepages, and this will only be visible to the user if the TCE tracepoints
are enabled.

> Or rather.. I think VFIO should be able to cope with
> any guest IOMMU page size which is larger than the host ram page size

It could, I just do not see much benefit in it. A pseries guest can
negotiate 4K, 64K and 16M pages, and this seems to cover everything we
want, so why would we want to emulate an IOMMU page size?

> (although if it's much larger it could get expensive in the host
> tables).  This case would already be routine for ppc64 on x86, where
> the guest IOMMU page size is 64kiB, but the host page size is 4 kiB.
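
To put a number on "expensive in the host tables", here is a small
arithmetic sketch. It assumes that each guest IOMMU page must be covered by
one host mapping per host page; `sketch_host_entries_per_tce` is a
hypothetical helper, not a QEMU or VFIO function.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: number of host-page-sized mappings needed to back
 * a single guest IOMMU page when the guest page is at least as large as
 * the host page.
 */
static uint64_t sketch_host_entries_per_tce(uint64_t guest_iommu_page,
                                            uint64_t host_page)
{
    /* Only the "guest page >= host page" case is considered here. */
    assert(host_page != 0 && guest_iommu_page >= host_page);
    assert(guest_iommu_page % host_page == 0);

    return guest_iommu_page / host_page;
}
```

For the ppc64-on-x86 case mentioned above (64K guest IOMMU pages, 4K host
pages) this gives 16 host entries per TCE; a 16M guest IOMMU page on 4K
host pages would need 4096.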


