Re: [Qemu-devel] Flatview rendering scalability issue


From: Peter Xu
Subject: Re: [Qemu-devel] Flatview rendering scalability issue
Date: Tue, 12 Mar 2019 20:14:11 +0800
User-agent: Mutt/1.10.1 (2018-07-13)

On Tue, Mar 12, 2019 at 12:42:07PM +0100, Paolo Bonzini wrote:
> On 12/03/19 04:23, Peter Xu wrote:
> > On Mon, Mar 11, 2019 at 03:07:43PM +0100, Paolo Bonzini wrote:
> >> On 11/03/19 14:48, Sergio Lopez wrote:
> >>>> The initialization is O(n^2) because the guest initializes one device at
> >>>> a time, so you rebuild the FlatView first with 0 devices, then 1, then
> >>>> 2, etc.  This is very hard to fix, if at all possible.
> >>>>
> >>>> However, each FlatView creation should be O(n) where n is the number of
> >>>> devices currently configured.  Please check with "info mtree -f" that
> >>>> you only have a fixed number of FlatViews.  Old versions had one per 
> >>>> device.
> >>> I'm seeing 9 FVs with 1 PCI, and 119 with 100 PCIs.
> >>
> >> With
> >>
> >> $ eval qemu-system-x86_64 -M q35 \
> >>     -device\ e1000,id=n{1,2,3,4,5,6,7,8}{1,2,3}
> >>
> >> I only see 4 flat views ("system", "io", "memory", "(none)").
> >>
> >> Probably you are using intel-iommu?  Peter, it should be possible to
> >> reorganize the VT-d memory regions like this:
> >>
> >>     intel_iommu_ir (MMIO, not added to any container)
> >>
> >>     vtd_root_dmar (container)
> >>       intel_iommu_dmar (IOMMU), priority 0
> >>       alias to intel_iommu_ir, priority 1
> >>
> >>     vtd_root_nodmar
> >>       alias to get_system_memory(), priority 0
> >>       alias to intel_iommu_ir, priority 1
> >>
> >>     vtd_root_0 memory region (container)
> >>         vtd_root_dmar             # only one of these is enabled
> >>         vtd_root_nodmar
> >>
> >> where the vtd_root_dmar and vtd_root_nodmar memory regions are created
> >> in vtd_init once and for all.  Because all vtd_root_* memory regions
> >> have only one child, memory.c will recognize that they represent the
> >> same memory, and create at most two FlatViews (one for vtd_root_dmar,
> >> one for vtd_root_nodmar).
> > 
> > Yes, this sounds good.  The only thing I'm still uncertain about is
> > the IOMMU notifiers, which should be per-device (for real).
> 
> You're right.  However, the DMAR FlatView only has three sections so I
> suspect it's not a big deal if we keep it per-device.  You'd still have
> O(n) flatviews when the IOMMU is present and DMAR is enabled, but they
> would have a constant number of sections so the cost overall is still
> O(n) and not O(n^2).  If the IOMMU is present but DMAR is disabled, all
> VT-d address spaces would share the same FlatView vtd_root_nodmar, and
> that is the case where the performance loss currently happens.
> 
> The final scheme would be the same as above, with vtd_root_dmar
> replaced by vtd_root_dmar_%d.

Just to make sure I understand: do you mean to keep the DMAR-enabled
scenario as it is (which I think would be at least as slow as the
numbers Sergio reported), and only fix the case where the guest does
not enable DMAR?  I ask because, IIUC, people normally won't specify
"-device intel-iommu" unless they need IOMMU functionality at all
(and DMAR is the major one), and if it isn't specified then we should
not suffer the performance degradation in the first place?

(But hmm... I think it could benefit users who only need IR/x2apic...)

Btw, what do you think about my previous proposal?  Do you think it's
doable somehow?  (Making IOMMUMemoryRegion hold only an MR pointer
that points to the shared MRs.)

Thanks,

-- 
Peter Xu


