Re: [Qemu-devel] [PATCH] intel-iommu: optimize nodmar memory regions
From: Peter Xu
Subject: Re: [Qemu-devel] [PATCH] intel-iommu: optimize nodmar memory regions
Date: Fri, 15 Mar 2019 14:02:59 +0800
User-agent: Mutt/1.10.1 (2018-07-13)

On Thu, Mar 14, 2019 at 11:36:39AM +0100, Sergio Lopez wrote:
>
> Paolo Bonzini writes:
>
> > On 13/03/19 12:45, Sergio Lopez wrote:
> >>
> >> Peter Xu writes:
> >>
> >>> Previously we had per-device system memory aliases when DMAR was
> >>> disabled by the system. This slows the system down when there are
> >>> lots of devices, because each of the aliased system address spaces
> >>> contains O(N) slots, and rendering N such address spaces has O(N^2)
> >>> complexity.
> >>>
> >>> This patch introduces a shared nodmar memory region and for each
> >>> device we only create an alias to the shared memory region. With the
> >>> aliasing, QEMU memory core API will be able to detect when devices are
> >>> sharing the same address space (which is the nodmar address space)
> >>> when rendering the FlatViews and the total number of FlatViews can be
> >>> dramatically reduced when there are a lot of devices.
> >>>
> >>> Suggested-by: Paolo Bonzini <address@hidden>
> >>> Signed-off-by: Peter Xu <address@hidden>
> >>> ---
> >>>
> >>> Hi, Sergio,
> >>>
> >>> This patch implements the optimization that Paolo proposed in the
> >>> other thread. Would you please try this patch to see whether it could
> >>> help for your case? Thanks,
> >>
> >> Hi,
> >>
> >> I've just given it a try and it fixes the issue here. The number of
> >> FlatViews goes down from 119 to 4, and the initialization time for PCI
> >> devices on the Guest is back to normal levels.
> >
> > You must be using "iommu=pt" then. Can you also test performance
> > without it? It should be fine even if the number of FlatViews goes back
> > to 119.
>
> Hm... I don't have "iommu=pt" in either the host or the guest (the
> issue can also be perceived in SeaBIOS). After taking some traces of
> what QEMU is doing during that time, I'm quite convinced the slowness
> comes from having to construct that many FlatViews, each one with
> up to 200 regions.

Thanks for giving it a shot!

And yes, with iommu=pt in the guest, the state after Linux boots should
probably be the same as during BIOS, as long as the emulated VT-d device
declares "hardware passthrough" support (upstream QEMU has it on by
default, which corresponds to "-device intel-iommu,pt=on").

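
To make the cost argument from the quoted patch description concrete, here
is a rough toy model in Python. It is only a sketch and not QEMU code: the
names Region, resolve_root, render_all, per_device_layout and shared_layout
are all made up for illustration, and it assumes that rendering a view costs
one unit of work per slot and that views are cached per resolved root.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(eq=False)
class Region:
    """Hypothetical stand-in for a memory region (not QEMU's MemoryRegion)."""
    name: str
    slots: int = 0                        # ranges walked when this region is rendered
    alias_of: Optional["Region"] = None   # set when the region only forwards elsewhere


def resolve_root(region: Region) -> Region:
    """Follow pure aliases down to the region that actually gets rendered."""
    while region.alias_of is not None:
        region = region.alias_of
    return region


def render_all(roots: List[Region]) -> Tuple[int, int]:
    """Render one view per distinct resolved root.

    Returns (number of views rendered, total slots walked).
    """
    rendered = set()
    work = 0
    for root in roots:
        target = resolve_root(root)
        if id(target) not in rendered:    # a view for this root already exists
            rendered.add(id(target))
            work += target.slots
    return len(rendered), work


def per_device_layout(n: int) -> List[Region]:
    """Assumed old layout: every device has its own root with about n slots."""
    return [Region(f"dev{i}-sysmem", slots=n) for i in range(n)]


def shared_layout(n: int) -> List[Region]:
    """Assumed new layout: every device aliases one shared nodmar region."""
    shared = Region("nodmar", slots=n)
    return [Region(f"dev{i}-nodmar-alias", alias_of=shared) for i in range(n)]
```

In this model, render_all(per_device_layout(100)) walks 100 * 100 slots to
build 100 views, while render_all(shared_layout(100)) builds a single view
of 100 slots. The absolute numbers mean nothing for a real guest; the point
is only the quadratic versus linear growth.
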
Regards,
--
Peter Xu