Re: [Qemu-devel] Qemu and 32 PCIe devices


From: Laszlo Ersek
Subject: Re: [Qemu-devel] Qemu and 32 PCIe devices
Date: Wed, 9 Aug 2017 12:56:53 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.2.1

On 08/09/17 12:16, Paolo Bonzini wrote:
> On 09/08/2017 12:00, Laszlo Ersek wrote:
>> On 08/09/17 09:26, Paolo Bonzini wrote:
>>> On 09/08/2017 03:06, Laszlo Ersek wrote:
>>>>>   20.14%  qemu-system-x86_64                  [.] render_memory_region
>>>>>   17.14%  qemu-system-x86_64                  [.] subpage_register
>>>>>   10.31%  qemu-system-x86_64                  [.] int128_add
>>>>>    7.86%  qemu-system-x86_64                  [.] addrrange_end
>>>>>    7.30%  qemu-system-x86_64                  [.] int128_ge
>>>>>    4.89%  qemu-system-x86_64                  [.] int128_nz
>>>>>    3.94%  qemu-system-x86_64                  [.] phys_page_compact
>>>>>    2.73%  qemu-system-x86_64                  [.] phys_map_node_alloc
>>>
>>> Yes, this is the O(n^3) thing.  An optimized build should be faster
>>> because int128 operations will be inlined and become much more efficient.
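(Side note, mostly to convince myself: if I read include/qemu/int128.h
right, then with CONFIG_INT128 these helpers are just thin static inline
wrappers around the compiler's __int128_t, roughly

    typedef __int128_t Int128;

    static inline Int128 int128_add(Int128 a, Int128 b) { return a + b; }
    static inline bool int128_ge(Int128 a, Int128 b) { return a >= b; }
    static inline bool int128_nz(Int128 a) { return a != 0; }

so an optimized build should fold them into their callers, and they only
show up as separate hot symbols in my unoptimized profile above.)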
>>>
>>>> With this patch, I only tested the "93 devices" case, as the slowdown
>>>> became visible to the naked eye in the trace messages while the firmware
>>>> enabled more and more BARs / command registers (and conversely, the
>>>> speedup was perceptible while the firmware disabled more and more BARs /
>>>> command registers).
>>>
>>> This is an interesting observation, and it's expected.  Looking at the
>>> O(n^3) complexity in more detail: you have N operations, where the "i"th
>>> one operates on "i" DMA address spaces, all of which have at least "i"
>>> memory regions (at least 1 BAR per device).
>>
>> - Can you please give me a pointer to the code where the "i"th operation
>> works on "i" DMA address spaces? (Not that I dream about patching *that*
>> code, wherever it may live :) )
> 
> It's all driven by actions of the guest.
> 
> Simply put, by the time you get to the "i"th command register, you have
> enabled bus-master DMA on "i" devices (so that "i" DMA address spaces
> are non-empty) and you have enabled BARs on "i" devices (so that their
> BARs are included in the address spaces).
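
(Let me spell out the arithmetic, to make sure I'm following: the "i"th
command register update then walks ~i address spaces with ~i regions
each, i.e. on the order of i^2 work, and

    sum_{i=1..N} i^2 = N(N+1)(2N+1)/6 ~= N^3/3

which is where the O(n^3) comes from. Please correct me if I've
misunderstood.)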
> 
>> - You mentioned that changing this is on the ToDo list. I couldn't find
>> it under <https://wiki.qemu.org/index.php/ToDo>. Is it tracked somewhere
>> else?
> 
> I've added it to https://wiki.qemu.org/index.php/ToDo/MemoryAPI (thanks
> for the nudge).

Thank you!

Allow me one last question -- why (and since when) does each device have
its own separate address space? Is that related to the virtual IOMMU?

Now that I look at the "info mtree" monitor output of a random VM, I see
the following "address-space"s:
- memory
- I/O
- cpu-memory
- a bunch of nameless ones, with top-level regions called
  "bus master container"
- several named "virtio-pci-cfg-as"
- KVM-SMRAM

I (sort of) understand MemoryRegions and aliases, but:
- I don't know why "memory" and "cpu-memory" exist separately, for example;
- I seem to remember that the "bunch of nameless ones" has not always
been there? (I could be totally wrong, of course.)

... There is one address_space_init() call in "hw/pci/pci.c", and it
comes (most recently) from commit 3716d5902d74 ("pci: introduce a bus
master container", 2017-03-13). The earliest commit that added it seems
to be 817dcc536898 ("pci: give each device its own address space",
2012-10-03). The commit messages do mention IOMMUs.
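
For the archive, here is my (simplified, possibly imprecise) paraphrase
of what those commits appear to set up per device: an always-present
container region serving as the root of the per-device address space,
with an alias to the actual DMA target inside it that only gets enabled
once the guest sets the bus-master bit in the command register. The
exact function and field names below are from memory, so please take
them with a grain of salt:

  #include "qemu/osdep.h"
  #include "hw/pci/pci.h"

  static void sketch_init_bus_master_as(PCIDevice *pci_dev)
  {
      /* DMA goes through the vIOMMU's address space when one is
       * configured, otherwise through plain system memory. */
      AddressSpace *dma_as = pci_device_iommu_address_space(pci_dev);

      /* The container exists from device creation on; it is what
       * shows up as "bus master container" in "info mtree". */
      memory_region_init(&pci_dev->bus_master_container_region,
                         OBJECT(pci_dev), "bus master container",
                         UINT64_MAX);

      /* The alias to the DMA target starts out disabled, so the
       * per-device address space stays empty until the firmware / OS
       * sets PCI_COMMAND_MASTER for the device. */
      memory_region_init_alias(&pci_dev->bus_master_enable_region,
                               OBJECT(pci_dev), "bus master",
                               dma_as->root, 0,
                               memory_region_size(dma_as->root));
      memory_region_set_enabled(&pci_dev->bus_master_enable_region,
                                false);
      memory_region_add_subregion(&pci_dev->bus_master_container_region,
                                  0, &pci_dev->bus_master_enable_region);

      /* ... and this is the per-device address space in question. */
      address_space_init(&pci_dev->bus_master_as,
                         &pci_dev->bus_master_container_region,
                         pci_dev->name);
  }

If that's roughly right, then a command register write that sets
PCI_COMMAND_MASTER ends up in memory_region_set_enabled(..., true),
which presumably is what triggers the address space rebuilds we've been
measuring above.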

Thanks!
Laszlo


