qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [RFC 0/5] pc: support user provided NIC naming/indexing


From: Igor Mammedov
Subject: Re: [RFC 0/5] pc: support user provided NIC naming/indexing
Date: Wed, 20 Jan 2021 13:07:14 +0100

On Sun, 17 Jan 2021 05:59:18 -0500
"Michael S. Tsirkin" <mst@redhat.com> wrote:

> On Fri, Jan 15, 2021 at 02:59:02AM +0100, Igor Mammedov wrote:
> > On Wed, 13 Jan 2021 07:09:56 -0500
> > "Michael S. Tsirkin" <mst@redhat.com> wrote:
> >   
> > > On Tue, Dec 22, 2020 at 06:39:29PM -0500, Igor Mammedov wrote:  
> > > > 
> > > > Series implements support for 'onboard' naming scheme for network
> > > > interfaces (1), which is based on PCI firmware spec and lets user
> > > > to explicitly specify index that will be used by guest to name
> > > > network interface, ex:
> > > >     -device e1000,acpi-index=33
> > > > should make guest rename NIC name to 'eno33' where 'eno' is default
> > > > prefix for this scheme.
> > > > 
> > > > Hope is that it will allow to simplify launching VMs from
> > > > template disk images with different set VM configurations
> > > > without need to reconfigure guest network intrfaces or
> > > > risk of loosing network connectivity.    
> > > 
> > > Questions:
> > > the spec says:
> > > Assignment of specific device names to multi-function devices installed 
> > > in expansion
> > > slots, and/or PCI or PCI Express devices that are hot-added to expansion 
> > > slots in operating system-
> > > environment would be handled in operating system-specific manner, and is 
> > > not specified via this
> > > specification.
> > > 
> > > Accordingly, link below says:
> > > " Names incorporating Firmware/BIOS provided index numbers for on-board 
> > > devices (example: eno1)"
> > > 
> > > to what extend does guest assume the index is for on-board devices?
> > > it seems things work for fedora but how confident are we that this
> > > will keep working.  
> > 
> > code itself is not limiting it to onboard devices in any way.
> > (I can only speculate here, reason for calling it onboard is that
> > on real hardware ACPI is mostly static tables and firmware provides
> > an option to set index for only built-in NICs).
> > Technically there is no reason to call in onboad though.  
> 
> Well there's an smbios thing that I believe says onboard
> in the spec. Come to think of it that one actually
> applies to multifunction devices too.
Can you point to the specific place in the spec, pls?

> 
> 
> > 
> > I'd believe it should work with any distribution that uses
> > recent enough systemd/udev (released starting from 2003).
> >   
> > > Further, code seems to only look at the slot level.
> > > According to this, and according to the spec, this does not work with
> > > multifunction devices, does it?  
> > 
> > we probably should disable it for multifunction devices,
> > any suggestions how to detect those in QEMU?  
> 
> QEMU_PCI_CAP_MULTIFUNCTION_BITNR ?
> 
> Enforcing this will help but can we make this self-documenting somehow?
what do you mean saying "to make it self-documenting somehow"?

> > > The link you supplied lists another option:
> > > "Names incorporating Firmware/BIOS provided PCI Express hotplug slot 
> > > index numbers (example: ens1)"
> > > these are under management control already ...   
> > 
> > with it interface name continues to depend on PCI topology (and 
> > theoretically
> > limited to PCI expess). That's becomes harder to consume as complexity grows
> > (i.e. mgmt needs to keep NIC in the same place for which guest image was 
> > configured for).
> > acpi-index doesn't impose such limitation.
> >
> > In case of 1 NIC, it could be moved anywhere within PCI hierarchy and guest
> > doesn't have to be reconfigured to account for new interface name
> > (i.e without loosing network connectivity - that's the actual issue coming 
> > from
> > upper layers that made me look into acpi-index approach).  
> 
> Could you describe the issue in a bit more detail in the commit
> log pls?
sure

> > In another words acpi index is easier to consume for users above libvirt
> > and frees mgmt hands in a way it could distribute PCI devices.
> > Even better would be if guest image could carry index as metadata
> >   
> > > Also if we ask users to supply the property on the slot then it seems
> > > that the property can be baked into ACPI when it's created instead of
> > > being loaded from host - we can avoid adding new registers, this seems
> > > preferable.  Could someone from management side chime in on whether that
> > > is sufficient?  
> > 
> > I did consider it (it would be simpler, but not much), however unless we 
> > disable
> > PCI hotplug for affected slots it won't work. (unplug device from such slot 
> > and
> > plug another in that place will still return boot time index.  
> 
> Well it's interesting that you mention it. Let's say you remove
> a device then plug in a different index immediately ...
> Guest gets a wrong name temporarily, right?
> Should we worry about that?
guest will get not a wrong name but rather name that user asked for.
(i.e. NIC with old index could be moved to other slot and still keep its name)


> > That's why I ended up with hotplug variant. 
> > 
> > I chose abusing existing PCI hotplug registers for it, 
> > but we can use a new range if that's preferred.   
> 
> Would it work for multi-function devices though?
> If yes that's a big advantage ...
Does ACPI hotplug work with multi-function devices (I thought not)?
If not, I'd prefer explicitly disable new feature
for multi-function devices (the code is heavily based on
ACPI hotplug infrastructure we already have).
(
in other words, I don't see immediate benefit in grabbing new registers
)


> > > More questions:
> > > 
> > > does all this affect windows guests at all?  
> > I don't know (spec shows examples that reminded
> > me about NIC naming which Windows use(s|d)).
> > But I won't bet on it.
> > If I recall correctly, for e1000 NIC, it didn't made any
> > difference in naming (pre-existing guest image).
> >   
> > > where does the "acpi index" terminology come from?
> > > the pci firmware spec talks about "instance number", right?  
> > it comes from linux kernel (that's how it's named in sysfs)
> > and systemd/udev uses it. So I tried to avoid making up another
> > one.  
> 
> OTOH that's guest specific ... and does not seem to imply
> the limitations unless you look at the code ...
"instance number" looks to me too generic (it's fine in context of
specific chapter in spec). While linux kernel's variant looks to me as
as better option that fits the bill.

If you don't object, I'd prefer 'acpi-index'.

> >   
> > > > For more detailed description/examples see patches [3-4/5]
> > > > 
> > > > 1)
> > > >  
> > > > https://www.freedesktop.org/wiki/Software/systemd/PredictableNetworkInterfaceNames/
> > > >  
> > > > 
> > > > Git repo for testing:
> > > >    https://github.com/imammedo/qemu/branches acpi-index-rfc
> > > > 
> > > > Igor Mammedov (5):
> > > >   acpi: add aml_to_decimalstring() and aml_call6() helpers
> > > >   tests: acpi: temporary whitelist DSDT changes
> > > >   pci: introduce apci-index property for PCI device
> > > >   pci: acpi: add _DSM method to PCI devices
> > > >   tests: acpi: update expected data files
> > > > 
> > > >  include/hw/acpi/aml-build.h                 |   3 +
> > > >  include/hw/acpi/pci.h                       |   1 +
> > > >  include/hw/acpi/pcihp.h                     |   7 +-
> > > >  include/hw/pci/pci.h                        |   1 +
> > > >  tests/qtest/bios-tables-test-allowed-diff.h |  21 +++++
> > > >  hw/acpi/aml-build.c                         |  28 +++++++
> > > >  hw/acpi/pci.c                               |  84 ++++++++++++++++++++
> > > >  hw/acpi/pcihp.c                             |  25 +++++-
> > > >  hw/i386/acpi-build.c                        |  31 +++++++-
> > > >  hw/pci/pci.c                                |   1 +
> > > >  tests/data/acpi/pc/DSDT                     | Bin 5065 -> 6023 bytes
> > > >  tests/data/acpi/pc/DSDT.acpihmat            | Bin 6390 -> 7348 bytes
> > > >  tests/data/acpi/pc/DSDT.bridge              | Bin 6924 -> 8689 bytes
> > > >  tests/data/acpi/pc/DSDT.cphp                | Bin 5529 -> 6487 bytes
> > > >  tests/data/acpi/pc/DSDT.dimmpxm             | Bin 6719 -> 7677 bytes
> > > >  tests/data/acpi/pc/DSDT.hpbridge            | Bin 5026 -> 5990 bytes
> > > >  tests/data/acpi/pc/DSDT.hpbrroot            | Bin 3084 -> 3177 bytes
> > > >  tests/data/acpi/pc/DSDT.ipmikcs             | Bin 5137 -> 6095 bytes
> > > >  tests/data/acpi/pc/DSDT.memhp               | Bin 6424 -> 7382 bytes
> > > >  tests/data/acpi/pc/DSDT.numamem             | Bin 5071 -> 6029 bytes
> > > >  tests/data/acpi/pc/DSDT.roothp              | Bin 5261 -> 6324 bytes
> > > >  tests/data/acpi/q35/DSDT                    | Bin 7801 -> 7863 bytes
> > > >  tests/data/acpi/q35/DSDT.acpihmat           | Bin 9126 -> 9188 bytes
> > > >  tests/data/acpi/q35/DSDT.bridge             | Bin 7819 -> 7911 bytes
> > > >  tests/data/acpi/q35/DSDT.cphp               | Bin 8265 -> 8327 bytes
> > > >  tests/data/acpi/q35/DSDT.dimmpxm            | Bin 9455 -> 9517 bytes
> > > >  tests/data/acpi/q35/DSDT.ipmibt             | Bin 7876 -> 7938 bytes
> > > >  tests/data/acpi/q35/DSDT.memhp              | Bin 9160 -> 9222 bytes
> > > >  tests/data/acpi/q35/DSDT.mmio64             | Bin 8932 -> 9024 bytes
> > > >  tests/data/acpi/q35/DSDT.numamem            | Bin 7807 -> 7869 bytes
> > > >  tests/data/acpi/q35/DSDT.tis                | Bin 8407 -> 8468 bytes
> > > >  31 files changed, 197 insertions(+), 5 deletions(-)
> > > > 
> > > > -- 
> > > > 2.27.0    
> > > 
> > >   
> 




reply via email to

[Prev in Thread] Current Thread [Next in Thread]