[Top][All Lists]

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [PATCH v2] hw/arm/virt-acpi - reserve ECAM space as PNP

From: Ard Biesheuvel
Subject: Re: [Qemu-devel] [PATCH v2] hw/arm/virt-acpi - reserve ECAM space as PNP0C02 device
Date: Tue, 17 Jan 2017 07:47:21 +0000

On 16 January 2017 at 22:35, Laszlo Ersek <address@hidden> wrote:
> On 01/16/17 22:23, Ard Biesheuvel wrote:
>> On 16 January 2017 at 21:13, Laszlo Ersek <address@hidden> wrote:
>>> On 01/16/17 20:31, Ard Biesheuvel wrote:
>>>> On 16 January 2017 at 18:20, Peter Maydell <address@hidden> wrote:
>>>>> On 16 January 2017 at 17:30, Ard Biesheuvel <address@hidden> wrote:
>>>>>> On 16 January 2017 at 17:25, Peter Maydell <address@hidden> wrote:
>>>>>>> On 13 January 2017 at 17:32, Ard Biesheuvel <address@hidden> wrote:
>>>>>>>> Linux for arm64 v4.10 and later will complain if the ECAM config space 
>>>>>>>> is
>>>>>>>> not reserved in the ACPI namespace:
>>>>>>>>   acpi PNP0A08:00: [Firmware Bug]: ECAM area [mem 
>>>>>>>> 0x3f000000-0x3fffffff] not reserved in ACPI namespace
>>>>>>>> The rationale is that OSes that don't consume the MCFG table should 
>>>>>>>> still
>>>>>>>> be able to infer that the PCI config space MMIO region is occupied.
>>>>>>>> So update the ACPI table generation routine to add this reservation.
>>>>>>>> Signed-off-by: Ard Biesheuvel <address@hidden>
>>>>>>>> ---
>>>>>>>>  hw/arm/virt-acpi-build.c | 7 +++++++
>>>>>>>>  1 file changed, 7 insertions(+)
>>>>>>>> diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
>>>>>>>> index 085a61117378..50d52f685f68 100644
>>>>>>>> --- a/hw/arm/virt-acpi-build.c
>>>>>>>> +++ b/hw/arm/virt-acpi-build.c
>>>>>>>> @@ -310,6 +310,13 @@ static void acpi_dsdt_add_pci(Aml *scope, const 
>>>>>>>> MemMapEntry *memmap,
>>>>>>>>      Aml *dev_rp0 = aml_device("%s", "RP0");
>>>>>>>>      aml_append(dev_rp0, aml_name_decl("_ADR", aml_int(0)));
>>>>>>>>      aml_append(dev, dev_rp0);
>>>>>>>> +
>>>>>>>> +    Aml *dev_res0 = aml_device("%s", "RES0");
>>>>>>>> +    aml_append(dev_res0, aml_name_decl("_HID", 
>>>>>>>> aml_string("PNP0C02")));
>>>>>>>> +    crs = aml_resource_template();
>>>>>>>> +    aml_append(crs, aml_memory32_fixed(base_ecam, size_ecam, 
>>>>>>>> AML_READ_WRITE));
>>>>>>>> +    aml_append(dev_res0, aml_name_decl("_CRS", crs));
>>>>>>>> +    aml_append(dev, dev_res0);
>>>>>>>>      aml_append(scope, dev);
>>>>>>>>  }
>>>>>>> This needs to be controlled via the machine class back-compat
>>>>>>> machinery in hw/arm/virt.c so that it only happens for virt-2.9
>>>>>>> and later.
>>>>>> Why exactly?
>>>>> Because the "virt-2.8" machine has to present to the guest
>>>>> exactly what "virt" did as of the QEMU 2.8 release, including
>>>>> any bugs or missing things we happened to have in our ACPI
>>>>> tables. This allows cross-version compatibility (including
>>>>> VM migration). Drew will have a more detailed explanation
>>>>> if you need it.
>>>> I suspected as much.
>>>> But in this case, I am not sure if it is worth the trouble: the
>>>> generated data is only consumed at boot time by the firmware, and I
>>>> suppose migration involves freezing a VM, including whatever resident
>>>> firmware image was used to boot the OS, and so this is unlikely to
>>>> affect migration.
>>>> But I will let Drew explain ...
>>> The PCI Firmware Specification (rev 3.1) says in 4.1.2. "MCFG Table
>>> Description": "The resources can *optionally* be returned in [...]
>>> EFIGetMemoryMap as reserved memory [...]". (Emphasis mine.) Linux seems
>>> to *insist* on this kind of reservation however.
>> No, not at the UEFI level but at the ACPI level. Reservations in the
>> UEFI memory map describe memory not MMIO space
>>> PNP0C02 is "General ID for reserving resources required by PnP
>>> motherboard registers. (Not device specific.)", according to
>>> <http://www.plasma-online.de/english/identify/serial/pnp_id_pnp.html>.
>>> So what this patch does is reserve a memory area through ACPI,
>>> practically as an unspecified "platform resource".
>> This has been discussed at great length on the linux mailing lists
>> https://patchwork.kernel.org/patch/9453149/
>>> There's an alternative that's contained entirely in the firmware. You
>>> can cover the MMCONFIG area in ArmVirtQemu with an EfiReservedMemoryType
>>> memory map entry (by producing an appropriate memalloc HOB in PEI, or by
>>> calling the appropriate gDS memory space map functions in DXE). OVMF
>>> does the former (memalloc HOB).
>>> In ArmVirtQemu, we grab the MMCONFIG range from "pci-host-ecam-generic",
>>> from QEMU's DTB. If you don't dislike the idea, we could cover the range
>>> as well, right in "ArmVirtPkg/Library/FdtPciPcdProducerLib". That lib
>>> instance already sets the base address PCD, and makes sure that the
>>> relevant code is executed only once (in whatever driver module the
>>> library instance was built into). You could call the gDS functions
>>> mentioned above from that spot. (The library instance is already
>>> restricted to DXE_DRIVER and UEFI_DRIVER modules.)
>> In general, I think describing MMIO in the UEFI memory map is not very
>> useful, and counter to the spec, which mentions that the memory map
>> describes memory ("however it is used"), not memory *space* (unless
>> UEFI itself needs to access it to implement runtime services)
> The UEFI memory map will reflect allocations from the GCD memory space,
> for the Reserved and MMIO types. See "Figure 2. GCD Memory State
> Transitions" in "7.2.2 GCD Memory Resources", Vol2 of the PI spec.
> See also "9.7.1 UEFI Boot Services Dependencies" in the same,
> GetMemoryMap()
>   The GetMemoryMap() implementation must include into the UEFI memory
>   map all GCD map entries of types EfiGcdMemoryTypeReserved and
>   EfiPersistentMemory, and all GCD map entries of type
>   EfiGcdMemoryTypeMemoryMappedIo that have EFI_MEMORY_RUNTIME attribute
>   set.
> (Note that I wrote Reserved earlier, not MMIO.)

What the PI spec stipulates is irrelevant: the contract between the
firmware and the OS is in the UEFI and ACPI specifications, not in the
PI spec.

> However, you are right that *just* the UEFI memmap entry is not
> sufficient, according to the PCI firmware spec. (Regardless of the fact
> that in practice, just the memmap entry does keep Linux happy. Or is it
> about to change?)

The kernel uses the UEFI memory map for two purposes:
- finding out where memory is, and which parts are usable (i.e., non-reserved)
- setting up page tables to allow UEFI runtime services calls, which
may include MMIO mappings

This means that MMIO regions in the UEFI memory map are *not*
considered reservations. This is in line with the wording of the UEFI
spec, which mentions that the memory map describes memory, not MMIO
(with the exception of MMIO peripherals that the firmware needs to
access to implement the runtime services)

> Namely, looking again at the spot I quoted above (and it's also quoted
> in the kernel docs patch that you linked above, under ref [6]), we find
>     If the operating system does not natively comprehend reserving the
>     MMCFG region, the MMCFG region must be reserved by firmware. The
>     address range reported in the MCFG table or by _CBA method (see
>     Section 4.1.3) must be reserved by declaring a motherboard resource.
>     For most systems, the motherboard resource would appear at the root
>     of the ACPI namespace (under \_SB) in a node with a _HID of EISAID
>     (PNP0C02), and the resources in this case should not be claimed in
>     the root PCI bus’s _CRS. The resources can optionally be returned in
>     Int15 E820 or EFIGetMemoryMap as reserved memory but must always be
>     reported through ACPI as a motherboard resource.
> Therefore I agree that reserving the MMCONFIG area via a PNP0C02 object
> in QEMU's ACPI payload improves spec conformance.


> (Actually, the argument can be made for x86/Q35 as well. Adding Marcel
> and MST.)
> ... Beyond the machine-type dependency raised by Peter (which I gather
> is still being discussed), I suggest that the commit message of this
> patch quote the relevant passage from the PCI fw spec in full (see
> above, or in the kernel docs patch).

I will expand the commit message in the next respin

Thanks (and apologies for not cc'ing you in the first place),

reply via email to

[Prev in Thread] Current Thread [Next in Thread]