qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] Qemu and 32 PCIe devices


From: Michael S. Tsirkin
Subject: Re: [Qemu-devel] Qemu and 32 PCIe devices
Date: Wed, 9 Aug 2017 20:16:55 +0300

On Wed, Aug 09, 2017 at 09:26:11AM +0200, Paolo Bonzini wrote:
> On 09/08/2017 03:06, Laszlo Ersek wrote:
> >>   20.14%  qemu-system-x86_64                  [.] render_memory_region
> >>   17.14%  qemu-system-x86_64                  [.] subpage_register
> >>   10.31%  qemu-system-x86_64                  [.] int128_add
> >>    7.86%  qemu-system-x86_64                  [.] addrrange_end
> >>    7.30%  qemu-system-x86_64                  [.] int128_ge
> >>    4.89%  qemu-system-x86_64                  [.] int128_nz
> >>    3.94%  qemu-system-x86_64                  [.] phys_page_compact
> >>    2.73%  qemu-system-x86_64                  [.] phys_map_node_alloc
> 
> Yes, this is the O(n^3) thing.  An optimized build should be faster
> because int128 operations will be inlined and become much more efficient.
> 
> > With this patch, I only tested the "93 devices" case, as the slowdown
> > became visible to the naked eye from the trace messages, as the firmware
> > enabled more and more BARs / command registers (and inversely, the
> > speedup was perceivable when the firmware disabled more and more BARs /
> > command registers).
> 
> This is an interesting observation, and it's expected.  Looking at the
> O(n^3) complexity more in detail you have N operations, where the "i"th
> operates on "i" DMA address spaces, all of which have at least "i"
> memory regions (at least 1 BAR per device).
> 
> So the total cost is sum i=1..N i^2 = N(N+1)(2N+1)/6 = O(n^3).
> Expressing it as a sum shows why it gets slower as time progresses.
> 
> The solution is to note that those "i" address spaces are actually all
> the same, so we can get it down to sum i=1..N i = N(N+1)/2 = O(n^2).
> 
> Thanks,
> 
> Paolo

We'll probably run into more issues with the vIOMMU but I guess we
can look into it later.

Resolving addresses lazily somehow might be interesting. And would
the caching work that went in a while ago but got disabled
since we couldn't iron out all the small issues
help go in that direction somehow?

-- 
MST



reply via email to

[Prev in Thread] Current Thread [Next in Thread]