
Re: [Qemu-devel] The maximum limit of virtual network device


From: Marcel Apfelbaum
Subject: Re: [Qemu-devel] The maximum limit of virtual network device
Date: Thu, 6 Jul 2017 12:24:12 +0300
User-agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.12; rv:52.0) Gecko/20100101 Thunderbird/52.1.1

On 06/07/2017 11:31, Laszlo Ersek wrote:
Hi Jiaxin,

it's nice to see a question from you on qemu-devel! :)

On 07/06/17 08:20, Wu, Jiaxin wrote:
Hello experts,

We know QEMU can create multiple network devices in one guest with the
-device syntax. But I hit the failure below when trying to create more
than 30 virtual devices, each with a TAP backend:

qemu-system-x86_64: -device e1000: PCI: no slot/function available for
e1000, all in use.

The corresponding QEMU command shows as following:

sudo qemu-system-x86_64 \
   -pflash OVMF.fd \
   -global e1000.romfile="" \
   -netdev tap,id=hostnet0,ifname=tap0,script=no,downscript=no \
   -device e1000,netdev=hostnet0 \

[...]

   -netdev tap,id=hostnet29,ifname=tap29,script=no,downscript=no \
   -device e1000,netdev=hostnet29

From the above, is the maximum number of virtual network devices in one
guest about 29? If not, how can I avoid this failure? My use case is to
create more than 150 network devices in one guest. Please share your
comments on this.

You are seeing the above symptom because the above command line
instructs QEMU to do the following:
- use the i440fx machine type,
- use a single PCI bus (= the main root bridge),
- add the e1000 cards to separate slots (always using function 0) on
   that bus.

Accordingly, there are three things you can do to remedy this:

- Use the Q35 machine type and work with a PCI Express hierarchy rather
   than a PCI hierarchy. I'm mentioning this only for completeness,
   because it won't directly help your use case. But, I certainly want to
   highlight "docs/pcie.txt". Please read it sometime; it has nice
   examples and makes good points.

- Use multiple PCI bridges to attach the devices. For this, several ways
   are possible:

   - use multiple root buses, with the pxb or pxb-pcie devices (see
     "docs/pci_expander_bridge.txt" and "docs/pcie.txt", and the first
     sketch after this list)

   - use multiple normal PCI bridges

   - use multiple PCI Express root ports or downstream ports (but for
     this, you'll likely have to use the PCI Express variant of the
     e1000, namely e1000e)

- If you don't need hot-plug / hot-unplug, aggregate the e1000 NICs
   into multifunction PCI devices, eight functions per slot (see the
   second sketch after this list).
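
For the pxb route, here is a minimal sketch on i440fx, following the
example in "docs/pci_expander_bridge.txt" (the id, bus_nr and addr
values are arbitrary placeholders):

   -device pxb,id=pxb1,bus=pci.0,bus_nr=3 \
   -netdev tap,id=hostnet0,ifname=tap0,script=no,downscript=no \
   -device e1000,netdev=hostnet0,bus=pxb1,addr=0x1 \
[...]

Each pxb exposes an extra root bus, so the ~32-slot limit applies per
pxb rather than once for the whole machine.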
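
For the multifunction route, a minimal sketch (assuming i440fx; slot
0x05 is an arbitrary placeholder). Function 0 of the slot must be
marked multifunction=on; the remaining NICs occupy functions 1 through
7 of the same slot:

   -device e1000,netdev=hostnet0,addr=0x05.0,multifunction=on \
   -device e1000,netdev=hostnet1,addr=0x05.1 \
[...]
   -device e1000,netdev=hostnet7,addr=0x05.7 \

This packs eight NICs into a single slot, so the remaining ~29 free
slots are more than enough for 150 devices; the trade-off is that
hot-plug works per slot, not per function.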
Now, I would normally recommend sticking with i440fx for simplicity.
However, each PCI bridge requires 4KB of IO space (meaning (1 + 5) * 4KB
= 24KB), and OVMF on i440fx does not support that much (only 0x4000
bytes, i.e. 16KB). So, I'll recommend Q35 for IO space purposes; OVMF on
Q35 provides 0xA000 bytes (40KB).

So if we use OVMF, going for Q35 actually gives us more IO space, nice!
However, recommending Q35 for its IO space seems odd :)


For scaling higher than this, a PCI Express hierarchy should be used
with PCI Express devices that require no IO space at all. However, that
setup is even more problematic *for now*; please see section "3. IO
space issues" in "docs/pcie.txt". We have open OVMF and QEMU BZs for
limiting IO space allocation to cases when it is really necessary:

   https://bugzilla.redhat.com/show_bug.cgi?id=1344299
   https://bugzilla.redhat.com/show_bug.cgi?id=1434740
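
Once those are resolved, a pure PCI Express setup on Q35 could look
like this minimal sketch (the chassis numbers and ids are arbitrary
placeholders; e1000e is the PCI Express counterpart of the e1000):

   -device pcie-root-port,id=rp1,chassis=1,bus=pcie.0 \
   -device e1000e,netdev=hostnet0,bus=rp1 \
   -device pcie-root-port,id=rp2,chassis=2,bus=pcie.0 \
   -device e1000e,netdev=hostnet1,bus=rp2 \
[...]

Note that each root port itself occupies a slot on pcie.0, so scaling
to 150 NICs would additionally require multifunction root ports or
pxb-pcie.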

Therefore I guess the simplest example I can give now is:
- use Q35 (for a larger IO space),
- plug a DMI-PCI bridge into the root bridge,
- plug 5 PCI bridges into the DMI-PCI bridge,
- plug 31 NICs per PCI bridge, each NIC into a separate slot.


The setup looks OK to me (assuming OVMF is needed; otherwise
PC + pci-bridges would allow more devices), but
I do have a little concern.
We want to deprecate the dmi-pci bridge, since it does not support
hot-plug (for itself or for devices behind it).
Alexandr (CCed) is a GSoC student working on a generic
pcie-pci bridge that can (eventually) be hot-plugged
into a PCIe Root Port and keeps the machine cleaner.

See:
https://lists.gnu.org/archive/html/qemu-devel/2017-06/msg05498.html

If it is a "lab" project it doesn't really matter, but I wanted
to point out the direction.
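
For illustration, a minimal sketch of what that could look like once
the series lands (the pcie-pci-bridge device name is taken from the
series above; the ids are placeholders):

   -device pcie-root-port,id=rp1,chassis=1,bus=pcie.0 \
   -device pcie-pci-bridge,id=br1,bus=rp1 \
   -device e1000,netdev=hostnet0,bus=br1,addr=0x1 \
[...]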

Thanks,
Marcel

This follows the recommendation in section "2.3 PCI only hierarchy" of
"docs/pcie.txt" (slightly rewrapped here):

2.3 PCI only hierarchy
======================
Legacy PCI devices can be plugged into pcie.0 as Integrated Endpoints,
but, as mentioned in section 5, doing so means the legacy PCI
device in question will be incapable of hot-unplugging.
Besides that, use DMI-PCI Bridges (i82801b11-bridge) in combination
with PCI-PCI Bridges (pci-bridge) to start PCI hierarchies.

Prefer flat hierarchies. For most scenarios a single DMI-PCI Bridge
(having 32 slots) and several PCI-PCI Bridges attached to it (each
supporting also 32 slots) will support hundreds of legacy devices. The
recommendation is to populate one PCI-PCI Bridge under the DMI-PCI
Bridge until it is full, and then plug a new PCI-PCI Bridge...

Here's a command line. Please note that the OVMF boot may take quite a
long time with this, as the E3522X2.EFI driver from BootUtil (-D
E1000_ENABLE) binds all 150 e1000 NICs in succession! Watching the OVMF
debug log is recommended. The bridges are populated in order: bridges 1
through 4 are filled completely (31 NICs each, slots 0x1 to 0x1f), and
the remaining 26 NICs land on bridge-5, which is why the last NIC sits
at addr=0x1a.

qemu-system-x86_64 \
   \
   -machine q35,vmport=off,accel=kvm \
   -pflash OVMF.fd \
   -global e1000.romfile="" \
   -m 2048 \
   -debugcon file:debug.log \
   -global isa-debugcon.iobase=0x402 \
   \
   -netdev tap,id=hostnet0,ifname=tap0,script=no,downscript=no \
[...]
   -netdev tap,id=hostnet149,ifname=tap149,script=no,downscript=no \
   \
   -device i82801b11-bridge,id=dmi-pci-bridge \
   \
   -device pci-bridge,id=bridge-1,chassis_nr=1,bus=dmi-pci-bridge \
   -device pci-bridge,id=bridge-2,chassis_nr=2,bus=dmi-pci-bridge \
   -device pci-bridge,id=bridge-3,chassis_nr=3,bus=dmi-pci-bridge \
   -device pci-bridge,id=bridge-4,chassis_nr=4,bus=dmi-pci-bridge \
   -device pci-bridge,id=bridge-5,chassis_nr=5,bus=dmi-pci-bridge \
   \
   -device e1000,netdev=hostnet0,bus=bridge-1,addr=0x1.0 \
[...]
   -device e1000,netdev=hostnet149,bus=bridge-5,addr=0x1a.0

Thanks
Laszlo




