
From: Michael S. Tsirkin
Subject: Re: [Qemu-devel] Re: [PATCH] PCI: Bus number from the bridge, not the device
Date: Sun, 21 Nov 2010 16:48:44 +0200
User-agent: Mutt/1.5.21 (2010-09-15)

On Sun, Nov 21, 2010 at 02:50:14PM +0200, Gleb Natapov wrote:
> On Sun, Nov 21, 2010 at 01:53:26PM +0200, Michael S. Tsirkin wrote:
> > > > The guests.
> > > Which one? There are many guests. Your favorite?
> > > 
> > > > For CLI, we need an easy way to map a device in guest to the
> > > > device in qemu and back.
> > > Then use eth0, /dev/sdb, or even C:. Your way is no less broken, since
> > > what you are saying is "let's use the name that the guest assigned to a
> > > device".
> > 
> > No, I am saying let's use the name that our ACPI tables assigned.
> > 
> ACPI does not assign any name. At best, ACPI tables describe the resources
> used by a device.

Not only that. bus number and segment aren't resources as such.
They describe addressing.

> And not all guests qemu supports have support for ACPI. Qemu
> even supports machine types that do not support ACPI.

So? Different machines -> different names.

> > > > 
> > > > 
> > > > > It looks like you identify yourself with most of
> > > > > qemu users, but if most qemu users are like you then qemu does not
> > > > > have enough users :) Most users who consider themselves "advanced"
> > > > > may know what eth1 or /dev/sdb means. That doesn't mean we should
> > > > > provide "device_del eth1" or "device_add /dev/sdb" commands, though.
> > > > > 
> > > > > More important is that "domain" (encoded as a number like you used to)
> > > > > and "bus number" have no meaning from inside qemu.
> > > > > So while I have said many
> > > > > times that I don't care too much about the exact CLI syntax, it should
> > > > > at least make sense. It can use an id to specify the PCI bus in the
> > > > > CLI like this: device_del pci.0:1.1. Or it can even use a device id,
> > > > > like this: device_del pci.0:ide.0. Or it can use HW topology as in an
> > > > > OF device path. But doing ad-hoc device enumeration inside qemu and
> > > > > then using it for the CLI is not it.
> > > > > 
> > > > > > functionality in the guests.  Qemu is buggy at the moment in that
> > > > > > it uses the bus addresses assigned by the guest and not the ones in
> > > > > > ACPI, but that can be fixed.
> > > > > It looks like you confused ACPI _SEG with something it isn't.
> > > > 
> > > > Maybe I did. This is what linux does:
> > > > 
> > > > struct pci_bus * __devinit pci_acpi_scan_root(struct acpi_pci_root
> > > > *root)
> > > > {
> > > >         struct acpi_device *device = root->device;
> > > >         int domain = root->segment;
> > > >         int busnum = root->secondary.start;
> > > > 
> > > > And I think this is consistent with the spec.
> > > > 
> > > It means that one domain may include several host bridges.
> > > At that level a
> > > domain is defined as something that has a unique name for each device
> > > inside it, so no two buses in one segment/domain can have the same bus
> > > number. This is what the PCI spec tells you.
> > 
> > And that really is enough for CLI because all we need is locate the
> > specific slot in a unique way.
> > 
> At the qemu level we do not have bus numbers. They are assigned by a guest.
> So inside a guest domain:bus:slot.func points you to a device, but qemu
> itself does not enumerate buses.
> 
> > > And this further shows that using "domain" as defined by guest is very
> > > bad idea. 
> > 
> > As defined by ACPI, really.
> > 
> ACPI is part of the guest software and may not even be present in the
> guest. How is it relevant?

It's relevant because this is what guests use. To access the root
device with cf8/cfc you need to know the bus number assigned to it
by firmware. How that was assigned is of interest to the BIOS/ACPI but not
really interesting to the user or, I suspect, the guest OS.

> > > > > The ACPI spec
> > > > > says that a PCI segment group is a purely software concept managed by
> > > > > system firmware. In fact one segment may include multiple PCI host
> > > > > bridges.
> > > > 
> > > > It can't I think:
> > > Read _BBN definition:
> > >  The _BBN object is located under a PCI host bridge and must be unique for
> > >  every host bridge within a segment since it is the PCI bus number.
> > > 
> > > Clearly above speaks about multiple host bridge within a segment.
> > 
> > Yes, it looks like the firmware spec allows that.
> It even has an explicit example that shows it.
> 
> > 
> > > >         Multiple Host Bridges
> > > > 
> > > >         A platform may have multiple PCI Express or PCI-X host bridges.
> > > >         The base address for the MMCONFIG space for these host bridges
> > > >         may need to be allocated at different locations. In such cases,
> > > >         using the MCFG table and _CBA method as defined in this section
> > > >         means that each of these host bridges must be in its own PCI
> > > >         Segment Group.
> > > > 
> > > This is not from the ACPI spec,
> > 
> > PCI Firmware Specification 3.0
> > 
> > > but without going too deep into it, the above
> > > paragraph talks about a particular case where each host bridge must
> > > be in its own PCI Segment Group, which is definite proof that in other
> > > cases multiple host bridges can be in one segment group.
> > 
> > I stand corrected. I think you are right. But note that if they are,
> > they must have distinct bus numbers assigned by ACPI.
> ACPI does not assign any numbers.

For all root PCI devices the firmware must supply a _BBN value. This is the
bus number, isn't it? For nested buses, this is optional.

> The BIOS enumerates buses and assigns
> numbers.

There's no standard way to enumerate PCI root devices in the guest, AFAIK.
The spec says:
        Firmware must configure all Host Bridges in the systems, even if
        they are not connected to a console or boot device. Firmware must
        configure Host Bridges in order to allow operating systems to use the
        devices below the Host Bridges. This is because the Host Bridges
        programming model is not defined by the PCI Specifications. 


> ACPI, in the best case, describes to the OSPM what the BIOS did. Qemu sits
> one layer below all this and does not enumerate PCI buses. Even if we made
> it do so, there is no way to guarantee that the guest will enumerate them
> in the same order, since there is more than one way to do enumeration. I
> have repeated this to you numerous times already.

ACPI is really part of the motherboard. Calling it the guest just
confuses things. Guest OS can override bus numbering for nested buses
but not for root buses.

> > 
> > > > 
> > > > > _SEG
> > > > > is not what the OSPM uses to tie a HW resource to an ACPI resource.
> > > > > It uses _CRS (Current Resource Settings) for that, just like OF. No
> > > > > surprise there.
> > > > 
> > > > OSPM uses both I think.
> > > > 
> > > > All I see Linux do with _CRS is get the bus number range.
> > > So let's assume that the HW has two PCI host bridges and ACPI has:
> > >         Device(PCI0) {
> > >             Name (_HID, EisaId ("PNP0A03"))
> > >             Name (_SEG, 0x00)
> > >         }
> > >         Device(PCI1) {
> > >             Name (_HID, EisaId ("PNP0A03"))
> > >             Name (_SEG, 0x01)
> > >         }
> > > I.e. no _CRS to describe resources. How do you think the OSPM knows
> > > which of the two PCI host bridges is PCI0 and which one is PCI1?
> > 
> > You must be able to uniquely address any bridge using the combination of
> > _SEG and _BBN.
> 
> Not at all. And saying "you must be able" without actually showing how
> doesn't prove anything. _SEG is relevant only for those host bridges
> that support MMCONFIG (not all of them do, and none that qemu supports
> does yet). _SEG points to a specific entry in the MCFG table, and the MCFG
> entry holds the base address of the MMCONFIG space for the bridge (this
> address is configured by a guest). This is all _SEG really does, no magic
> at all. _BBN returns the bus number assigned by the BIOS to the host
> bridge. Nothing qemu-visible again.
> So _SEG and _BBN give you two numbers assigned by
> the guest FW. Nothing qemu can use to identify a device.

This FW is given to the guest by qemu. It only assigns bus numbers
because qemu told it to do so.

> > 
> > > > And the spec says, e.g.:
> > > > 
> > > >           the memory mapped configuration base
> > > >         address (always corresponds to bus number 0) for the PCI
> > > >         Segment Group of the host bridge is provided by _CBA and the
> > > >         bus range covered by the base address is indicated by the
> > > >         corresponding bus range specified in _CRS.
> > > > 
> > > Don't see how it is relevant. And _CBA is defined only for PCI Express.
> > > Let's solve the problem for PCI first and then move to PCI Express.
> > > Jumping from one to the other distracts us from the main discussion.
> > 
> > I think this is what confuses us.  As long as you are using cf8/cfc
> > there's no concept of a domain really.
> > Thus:
> >     /address@hidden
> > 
> > is probably enough for BIOS boot, because we'll need to make root bus
> > numbers unique for legacy guests/option ROMs.  But this is not a hardware
> > requirement and might become easier to ignore with EFI.
> > 
> > 
> You do not need MMCONFIG to have multiple PCI domains. You can have one
> configured via standard cf8/cfc and another one on ef8/efc and one more
> at mmio fce00000 and you can address all of them:
> /address@hidden
> /address@hidden
> /address@hidden
> 
> And each one of those PCI domains can have 256 subbridges.

Will common guests such as Windows or Linux be able to use them? This
seems to be outside the scope of the PCI Firmware specification, which
says that bus numbers must be unique.

> > > > 
> > > > > > 
> > > > > > That should be enough for e.g. device_del. We do have the need to
> > > > > > describe the topology when we interface with firmware, e.g. to 
> > > > > > describe
> > > > > > the ACPI tables themselves to qemu (this is what Gleb's patches deal
> > > > > > with), but that's probably the only case.
> > > > > > 
> > > > > Describing the HW topology is the only way to unambiguously describe
> > > > > a device to something or someone outside qemu and to have persistent
> > > > > device naming between different HW configurations.
> > > > 
> > > > Not really, since ACPI is a binary blob programmed by qemu.
> > > > 
> > > ACPI is part of the guest, not qemu.
> > 
> > Yes it runs in the guest but it's generated by qemu. On real hardware,
> > it's supplied by the motherboard.
> > 
> It is not generated by qemu. Parts of it depend on the HW and other parts
> depend on how the BIOS configures the HW. _BBN, for instance, is clearly
> defined to return the address assigned by the BIOS.

The BIOS is supplied with the motherboard, and in our case by qemu as well.
There's no standard way for the BIOS to assign a bus number to the PCI root,
so it does it in a device-specific way. Why should a management tool
or a CLI user care about these? As far as they are concerned
we could use some PV scheme to find root devices and assign bus
numbers, and it would be exactly the same.

> > > Just saying "not really" doesn't
> > > prove much. I still haven't seen any proposal from you that actually
> > > solves the problem. No, "let's use guest naming" is not it. There is no
> > > such thing as "The Guest".
> > > 
> > > --
> > >                   Gleb.
> > 
> > I am sorry if I didn't make this clear.  I think we should use the
> > domain:bus pair to name the root device. As these are unique and 
> > 
> You forgot to complete the sentence :) But you made it clear enough and
> it is incorrect. A domain:bus pair is not only not unique, it does not
> exist in qemu at all

Sure they do. The domain maps to an MCFG address for express; the bus is
used for cf8/cfc addressing. They are assigned by the BIOS, but since the
BIOS is supplied with the hardware the point is moot.

> and as such can't be used to address device. They are
> product of HW enumeration done by a guest OS just like eth0 or C:.
> 
> --
>                       Gleb.

There's a huge difference between BIOS and guest OS, and between bus
numbers of pci root and of nested bridges.

Describing hardware I/O ports makes sense if you are trying to
communicate data from qemu to the BIOS.  But the rest of the world might
not care.

-- 
MST


