qemu-devel

Re: [Qemu-devel] [PATCH] intel-iommu: optimize nodmar memory regions


From: Peter Xu
Subject: Re: [Qemu-devel] [PATCH] intel-iommu: optimize nodmar memory regions
Date: Fri, 15 Mar 2019 14:02:59 +0800
User-agent: Mutt/1.10.1 (2018-07-13)

On Thu, Mar 14, 2019 at 11:36:39AM +0100, Sergio Lopez wrote:
> 
> Paolo Bonzini writes:
> 
> > On 13/03/19 12:45, Sergio Lopez wrote:
> >> 
> >> Peter Xu writes:
> >> 
> >>> Previously we have per-device system memory aliases when DMAR is
> >>> disabled by the system.  It will slow the system down if there are
> >>> lots of devices, especially when DMAR is disabled, because each of the
> >>> aliased system address spaces will contain O(N) slots, and rendering
> >>> such N address spaces will be O(N^2) complexity.
> >>>
> >>> This patch introduces a shared nodmar memory region and for each
> >>> device we only create an alias to the shared memory region.  With the
> >>> aliasing, QEMU memory core API will be able to detect when devices are
> >>> sharing the same address space (which is the nodmar address space)
> >>> when rendering the FlatViews and the total number of FlatViews can be
> >>> dramatically reduced when there are a lot of devices.
> >>>
> >>> Suggested-by: Paolo Bonzini <address@hidden>
> >>> Signed-off-by: Peter Xu <address@hidden>
> >>> ---
> >>>
> >>> Hi, Sergio,
> >>>
> >>> This patch implements the optimization that Paolo proposed in the
> >>> other thread.  Would you please try this patch to see whether it could
> >>> help for your case?  Thanks,
> >> 
> >> Hi,
> >> 
> >> I've just given it a try, and it fixes the issue here. The number of
> >> FlatViews goes down from 119 to 4, and the initialization time for PCI
> >> devices on the Guest is back to normal levels.
> >
> > You must be using "iommu=pt" then.  Can you also test performance
> > without it?  It should be fine even if the number of FlatViews goes back
> > to 119.
> 
> Hm... I don't have "iommu=pt" on either the Host or the Guest (the
> issue can also be perceived in SeaBIOS). After taking some traces of
> what QEMU is doing during that time, I'm quite convinced the slowness
> comes from having to construct that amount of FlatViews, each one with
> up to 200 regions.

Thanks for giving it a shot!

And yes, with iommu=pt the guest after Linux boots should be in the
same state as during BIOS, as long as the emulated VT-d device
declares "hardware passthrough" support (upstream QEMU has it on by
default, which corresponds to "-device intel-iommu,pt=on").

Regards,

-- 
Peter Xu


