qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] 64-bit MMIO aperture expansion


From: Marcel Apfelbaum
Subject: Re: [Qemu-devel] 64-bit MMIO aperture expansion
Date: Fri, 21 Sep 2018 18:01:30 +0300
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.9.1



On 09/20/2018 05:49 PM, Laszlo Ersek wrote:
Hi Marcel,

Hi Laszlo,
I had to read this mail a few times...


this email should actually be an RFC patch. But RFC patches tend to turn
into real PATCHes (if the submitter is lucky, that is), and I can't
really promise sending multiple versions of a PATCH at this time. So
please consider this a "maybe bug report".

In commit 9fa99d2519cb ("hw/pci-host: Fix x86 Host Bridges 64bit PCI
hole", 2017-11-16) we added logic so that QEMU expand the 64-bit PCI
hole (for hotplug purposes), if (a) the firmware doesn't "configure" one
(via programming individual BARs with 64-bit addresses), or (b) the
firmware's programming results in an aperture smaller than we'd like
(32GB on Q35).

Right


We made sure that the aperture required by the firmware's programming
would never be shrunken or otherwise truncated by QEMU, so that's fine.
However, the expansion doesn't work as "widely" in all cases as it
should.

Consider the following three functions, at current master (= commit
19b599f7664b):

[hw/i386/pc.c]

/*
  * The 64bit pci hole starts after "above 4G RAM" and
  * potentially the space reserved for memory hotplug.
  */
uint64_t pc_pci_hole64_start(void)
{
     PCMachineState *pcms = PC_MACHINE(qdev_get_machine());
     PCMachineClass *pcmc = PC_MACHINE_GET_CLASS(pcms);
     MachineState *ms = MACHINE(pcms);
     uint64_t hole64_start = 0;

     if (pcmc->has_reserved_memory && ms->device_memory->base) {
         hole64_start = ms->device_memory->base;
         if (!pcmc->broken_reserved_end) {
             hole64_start += memory_region_size(&ms->device_memory->mr);
         }
     } else {
         hole64_start = 0x100000000ULL + pcms->above_4g_mem_size;
     }

     return ROUND_UP(hole64_start, 1 * GiB);
}
[hw/pci-host/q35.c]

/*
  * The 64bit PCI hole start is set by the Guest firmware
  * as the address of the first 64bit PCI MEM resource.
  * If no PCI device has resources on the 64bit area,
  * the 64bit PCI hole will start after "over 4G RAM" and the
  * reserved space for memory hotplug if any.
  */
static void q35_host_get_pci_hole64_start(Object *obj, Visitor *v,
                                           const char *name, void *opaque,
                                           Error **errp)
{
     PCIHostState *h = PCI_HOST_BRIDGE(obj);
     Q35PCIHost *s = Q35_HOST_DEVICE(obj);
     Range w64;
     uint64_t value;

     pci_bus_get_w64_range(h->bus, &w64);
     value = range_is_empty(&w64) ? 0 : range_lob(&w64);
     if (!value && s->pci_hole64_fix) {
         value = pc_pci_hole64_start();
     }
     visit_type_uint64(v, name, &value, errp);
}

/*
  * The 64bit PCI hole end is set by the Guest firmware
  * as the address of the last 64bit PCI MEM resource.
  * Then it is expanded to the PCI_HOST_PROP_PCI_HOLE64_SIZE
  * that can be configured by the user.
  */
static void q35_host_get_pci_hole64_end(Object *obj, Visitor *v,
                                         const char *name, void *opaque,
                                         Error **errp)
{
     PCIHostState *h = PCI_HOST_BRIDGE(obj);
     Q35PCIHost *s = Q35_HOST_DEVICE(obj);
     uint64_t hole64_start = pc_pci_hole64_start();
     Range w64;
     uint64_t value, hole64_end;

     pci_bus_get_w64_range(h->bus, &w64);
     value = range_is_empty(&w64) ? 0 : range_upb(&w64) + 1;
     hole64_end = ROUND_UP(hole64_start + s->mch.pci_hole64_size, 1ULL << 30);
     if (s->pci_hole64_fix && value < hole64_end) {
         value = hole64_end;
     }
     visit_type_uint64(v, name, &value, errp);
}

Now consider the following scenario:

- the firmware programs some BARs with 64-bit addresses such that the
   aperture that we deduce starts at 32GB,

- the guest has 4GB of RAM, and no DIMM hotplug range.

Consequences:

- Because the "32-bit RAM split" for Q35 is at 2GB, the
   pc_pci_hole64_start() function will return 6GB.

- The q35_host_get_pci_hole64_start() function will return 32GB. (It
   will not fall back to pc_pci_hole64_start() -- correctly -- because
   the firmware has programmed some BARs with 64-bit addresses.)

- The q35_host_get_pci_hole64_end() function *intends* to return 64GB,
   because -- let's say -- the guest assigned BARs covering the
   32GB..34GB range, which is 2GB in size, and we *intend* to round that
   size up to 32GB, so that 30GB be left for hotplug purposes. (This is
   the original intent of commit 9fa99d2519cb.)
- However, because we initialize "hole64_start" from
   pc_pci_hole64_start(), and not from q35_host_get_pci_hole64_start(),
   we add "mch.pci_hole64_size" (32GB by default) to 6GB (the end of
   RAM), and not to 32GB (the aperture base deduced from the firmware's
   programming). As a result, we'll extend the aperture end address only
   to 38GB, and not to 64GB.

Right, there is no reason to use pc_pci_hole64_start, it looks
like a plain bug. We diverged from pc and the fact that
q35_host_get_pci_hole64_star uses it is only an implementation
detail.

My suggestion is simply to initialize "hole64_start" from
q35_host_get_pci_hole64_start(), in the q35_host_get_pci_hole64_end()
function. If the firmware doesn't program 64-bit addresses, then this
change is a no-op -- q35_host_get_pci_hole64_start() will fall back to
pc_pci_hole64_start() in that case. Otherwise, the behavior will be
fixed.

Looks OK to me.

Now, there's another complication, obviously -- machine type compat. In
commit 9fa99d2519cb, we added the "pci_hole64_fix" compat property. I
assume the additional fix I'm proposing requires another compat
property?

We have to, is a guest visible change. I really don't like these compat
properties, but I don't see a way around it.

  Something like:

diff --git a/hw/pci-host/q35.c b/hw/pci-host/q35.c
index 02f95765880a..c02b128189cd 100644
--- a/hw/pci-host/q35.c
+++ b/hw/pci-host/q35.c
@@ -138,12 +138,14 @@ static void q35_host_get_pci_hole64_end(Object *obj, 
Visitor *v,
  {
      PCIHostState *h = PCI_HOST_BRIDGE(obj);
      Q35PCIHost *s = Q35_HOST_DEVICE(obj);
-    uint64_t hole64_start = pc_pci_hole64_start();
+    uint64_t hole64_start;
      Range w64;
      uint64_t value, hole64_end;

      pci_bus_get_w64_range(h->bus, &w64);
      value = range_is_empty(&w64) ? 0 : range_upb(&w64) + 1;
+    hole64_start = s->pci_hole64_fix2 ? q35_host_get_pci_hole64_start() :
+                                        pc_pci_hole64_start();
      hole64_end = ROUND_UP(hole64_start + s->mch.pci_hole64_size, 1ULL << 30);
      if (s->pci_hole64_fix && value < hole64_end) {
          value = hole64_end;
The same would apply to i440fx too.

I am lost here. The q35 PCI 64bit hole computation issue starts from the miss-use
of the PC conterpart functions. What is the problem with the PC?


What do you think?

I think is a clear bug, even we can say we promised "32G" hole and we do provide it.
But we clearly intended to have 32G-64G space in this case.

Michael, what do you think?

Thanks,
Marcel



Thanks
Laszlo




reply via email to

[Prev in Thread] Current Thread [Next in Thread]