qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH RFC 4/6] i386/pc: Keep PCI 64-bit hole within usable IOVA spa


From: Joao Martins
Subject: Re: [PATCH RFC 4/6] i386/pc: Keep PCI 64-bit hole within usable IOVA space
Date: Wed, 23 Jun 2021 14:22:29 +0100


On 6/23/21 1:30 PM, Igor Mammedov wrote:
> On Tue, 22 Jun 2021 16:49:03 +0100
> Joao Martins <joao.m.martins@oracle.com> wrote:
> 
>> pci_memory initialized by q35 and i440fx is set to a range
>> of 0 .. UINT64_MAX, and as a consequence when ACPI and pci-host
>> pick the hole64_start it does not account for allowed IOVA ranges.
>>
>> Rather than blindly returning, round up the hole64_start
>> value to the allowable IOVA range, such that it accounts for
>> the 1Tb hole *on AMD*. On Intel it returns the input value
>> for hole64 start.
> 
> wouldn't that work only in case where guest firmware hadn't
> mapped any PCI bars above 4Gb (possibly in not allowed region).
> 
> Looking at Seabios, it uses reserved_memory_end as the start
> of PCI64 window. Not sure about OVMF,
>  CCing Laszlo.
> 
Hmmm, perhaps only in the case that I don't advertise the reserved
region (i.e. mem size not being big enough to let the guest know in
the e820). But then that it begs the question, should we then always
advertise the said HT region as reserved regardless of memory size?

>> Suggested-by: David Edmondson <david.edmondson@oracle.com>
>> Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
>> ---
>>  hw/i386/pc.c         | 17 +++++++++++++++--
>>  hw/pci-host/i440fx.c |  4 +++-
>>  hw/pci-host/q35.c    |  4 +++-
>>  include/hw/i386/pc.h |  3 ++-
>>  4 files changed, 23 insertions(+), 5 deletions(-)
>>
>> diff --git a/hw/i386/pc.c b/hw/i386/pc.c
>> index 2e2ea82a4661..65885cc16037 100644
>> --- a/hw/i386/pc.c
>> +++ b/hw/i386/pc.c
>> @@ -1141,7 +1141,7 @@ void pc_memory_init(PCMachineState *pcms,
>>   * The 64bit pci hole starts after "above 4G RAM" and
>>   * potentially the space reserved for memory hotplug.
>>   */
>> -uint64_t pc_pci_hole64_start(void)
>> +uint64_t pc_pci_hole64_start(uint64_t size)
>>  {
>>      PCMachineState *pcms = PC_MACHINE(qdev_get_machine());
>>      PCMachineClass *pcmc = PC_MACHINE_GET_CLASS(pcms);
>> @@ -1155,12 +1155,25 @@ uint64_t pc_pci_hole64_start(void)
>>              hole64_start += memory_region_size(&ms->device_memory->mr);
>>          }
>>      } else {
>> -        hole64_start = 0x100000000ULL + x86ms->above_4g_mem_size;
>> +        if (!x86ms->above_1t_mem_size) {
>> +            hole64_start = 0x100000000ULL + x86ms->above_4g_mem_size;
>> +        } else {
>> +            hole64_start = x86ms->above_1t_maxram_start;
>> +        }
>>      }
> 
>> +    hole64_start = allowed_round_up(hole64_start, size);
> 
> I'm not sure but, might it cause problems if there were BARs placed
> by firmware in region below rounded up value?
> (i.e. ACPI description which uses PCI_HOST_PROP_PCI_HOLE_START property
> won't match whatever firmware programmed due to rounding pushing hole
> start up)
> 
But wouldn't then the problem be firmware ignoring E820 to avoid putting
firmware region where it shouldn't? Unless of course, it wasn't advertised
like I mentioned earlier.

>>      return ROUND_UP(hole64_start, 1 * GiB);
>>  }
>>  
>> +uint64_t pc_pci_hole64_start_aligned(uint64_t start, uint64_t size)
>> +{
>> +    if (nb_iova_ranges == DEFAULT_NR_USABLE_IOVAS) {
>> +        return start;
>> +    }
>> +    return allowed_round_up(start, size);
>> +}
>> +
>>  DeviceState *pc_vga_init(ISABus *isa_bus, PCIBus *pci_bus)
>>  {
>>      DeviceState *dev = NULL;
>> diff --git a/hw/pci-host/i440fx.c b/hw/pci-host/i440fx.c
>> index 28c9bae89944..e8eaebfe1034 100644
>> --- a/hw/pci-host/i440fx.c
>> +++ b/hw/pci-host/i440fx.c
>> @@ -163,8 +163,10 @@ static uint64_t 
>> i440fx_pcihost_get_pci_hole64_start_value(Object *obj)
>>      pci_bus_get_w64_range(h->bus, &w64);
>>      value = range_is_empty(&w64) ? 0 : range_lob(&w64);
>>      if (!value && s->pci_hole64_fix) {
>> -        value = pc_pci_hole64_start();
>> +        value = pc_pci_hole64_start(s->pci_hole64_size);
>>      }
>> +    /* This returns @value when not on AMD */
>> +    value = pc_pci_hole64_start_aligned(value, s->pci_hole64_size);
>>      return value;
>>  }
>>  
>> diff --git a/hw/pci-host/q35.c b/hw/pci-host/q35.c
>> index 2eb729dff585..d556eb965ddb 100644
>> --- a/hw/pci-host/q35.c
>> +++ b/hw/pci-host/q35.c
>> @@ -126,8 +126,10 @@ static uint64_t 
>> q35_host_get_pci_hole64_start_value(Object *obj)
>>      pci_bus_get_w64_range(h->bus, &w64);
>>      value = range_is_empty(&w64) ? 0 : range_lob(&w64);
>>      if (!value && s->pci_hole64_fix) {
>> -        value = pc_pci_hole64_start();
>> +        value = pc_pci_hole64_start(s->mch.pci_hole64_size);
>>      }
>> +    /* This returns @value when not on AMD */
>> +    value = pc_pci_hole64_start_aligned(value, s->mch.pci_hole64_size);
>>      return value;
>>  }
>>  
>> diff --git a/include/hw/i386/pc.h b/include/hw/i386/pc.h
>> index 73b8e2900c72..b924aef3a218 100644
>> --- a/include/hw/i386/pc.h
>> +++ b/include/hw/i386/pc.h
>> @@ -217,7 +217,8 @@ void pc_memory_init(PCMachineState *pcms,
>>                      MemoryRegion *system_memory,
>>                      MemoryRegion *rom_memory,
>>                      MemoryRegion **ram_memory);
>> -uint64_t pc_pci_hole64_start(void);
>> +uint64_t pc_pci_hole64_start(uint64_t size);
>> +uint64_t pc_pci_hole64_start_aligned(uint64_t value, uint64_t size);
>>  DeviceState *pc_vga_init(ISABus *isa_bus, PCIBus *pci_bus);
>>  void pc_basic_device_init(struct PCMachineState *pcms,
>>                            ISABus *isa_bus, qemu_irq *gsi,
> 



reply via email to

[Prev in Thread] Current Thread [Next in Thread]