qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH RFCv2 2/4] i386/pc: relocate 4g start to 1T where applicable


From: Joao Martins
Subject: Re: [PATCH RFCv2 2/4] i386/pc: relocate 4g start to 1T where applicable
Date: Mon, 21 Feb 2022 15:28:11 +0000


On 2/21/22 06:58, Igor Mammedov wrote:
> On Fri, 18 Feb 2022 17:12:21 +0000
> Joao Martins <joao.m.martins@oracle.com> wrote:
> 
>> On 2/14/22 15:31, Igor Mammedov wrote:
>>> On Mon, 14 Feb 2022 15:05:00 +0000
>>> Joao Martins <joao.m.martins@oracle.com> wrote:  
>>>> On 2/14/22 14:53, Igor Mammedov wrote:  
>>>>> On Mon,  7 Feb 2022 20:24:20 +0000
>>>>> Joao Martins <joao.m.martins@oracle.com> wrote:  
>>>>>> +{
>>>>>> +    PCMachineClass *pcmc = PC_MACHINE_GET_CLASS(pcms);
>>>>>> +    X86MachineState *x86ms = X86_MACHINE(pcms);
>>>>>> +    ram_addr_t device_mem_size = 0;
>>>>>> +    uint32_t eax, vendor[3];
>>>>>> +
>>>>>> +    host_cpuid(0x0, 0, &eax, &vendor[0], &vendor[2], &vendor[1]);
>>>>>> +    if (!IS_AMD_VENDOR(vendor)) {
>>>>>> +        return;
>>>>>> +    }
>>>>>> +
>>>>>> +    if (pcmc->has_reserved_memory &&
>>>>>> +       (machine->ram_size < machine->maxram_size)) {
>>>>>> +        device_mem_size = machine->maxram_size - machine->ram_size;
>>>>>> +    }
>>>>>> +
>>>>>> +    if ((x86ms->above_4g_mem_start + x86ms->above_4g_mem_size +
>>>>>> +         device_mem_size) < AMD_HT_START) {    
>>>>>     
>>>> And I was at two minds on this one, whether to advertise *always*
>>>> the 1T hole, regardless of relocation. Or account the size
>>>> we advertise for the pci64 hole and make that part of the equation
>>>> above. Although that has the flaw that the firmware at admin request
>>>> may pick some ludricous number (limited by maxphysaddr).  
>>>
>>> it this point we have only pci64 hole size (machine property),
>>> so I'd include that in equation to make firmware assign
>>> pci64 aperture above HT range.
>>>
>>> as for checking maxphysaddr, we can only check 'default' PCI hole
>>> range at this stage (i.e. 1Gb aligned hole size after all possible RAM)
>>> and hard error on it.
>>>   
>>
>> Igor, in the context of your comment above, I'll be introducing another
>> preparatory patch that adds up pci_hole64_size to pc_memory_init() such
>> that all used/max physaddr space checks are consolidated in pc_memory_init().
>>
>> To that end, the changes involve mainly moves the the pcihost qdev creation
>> to be before pc_memory_init(). Q35 just needs a 2-line order change. i440fx
>> needs slightly more of a dance to extract that from i440fx_init() and also
>> because most i440fx state is private (hence the new helper for size). But
>> the actual initialization of I440fx/q35 pci host is still after 
>> pc_memory_init(),
>> it is just to extra pci_hole64_size from the object + user passed args 
>> (-global etc).
> 
> Shuffling init order is looks too intrusive and in practice
> quite risky.

Yeah, it is an intrusive change sadly. Although, why would you consider it
risky (curious)? We are "only" moving this:

        qdev_new(host_type);

... located at the very top of i440fx_init() and called at the top of q35_host
initilization to be instead before pc_memory_init(). And that means that an 
instance of an
object gets made and its properties initialized i.e. @instance_init of q35 and 
i440fx and
its properties. I don't see a particular dependence in PC code to tell that this
would affect its surroundings parts.

The actual pcihost-related initialization is still kept entirely unchanged.

> How about moving maxphysaddr check to pc_machine_done() instead?
> (this way you won't have to move pcihost around)
> 
I can move it. But be there will be a slight disconnect between what 
pc_memory_init()
checks against "max used address"  between ... dictating if the 4G mem start 
should change
to 1T or not ...  and when the phys-bits check is actually made which includes 
the pci hole64.

For example, we create a guest with maxram 1009G (which 4G mem start would get 
at
unchanged) and then the pci_hole64 goes likely assigned across the rest until 
1023G (i.e.
across the HT region). Here it would need an extra check and fail if pci_hole64 
crosses
the HT region. Whereby if it was added in pc_memory_init() then we could just 
relocate to
1T and the guest creation would still proceed.

> 
>> Raw staging changes below the scissors mark so far.
>>
>> -->8--  
>>
>> diff --git a/hw/i386/pc.c b/hw/i386/pc.c
>> index b2e43eba1106..902977081350 100644
>> --- a/hw/i386/pc.c
>> +++ b/hw/i386/pc.c
>> @@ -875,7 +875,8 @@ static void x86_update_above_4g_mem_start(PCMachineState 
>> *pcms)
>>  void pc_memory_init(PCMachineState *pcms,
>>                      MemoryRegion *system_memory,
>>                      MemoryRegion *rom_memory,
>> -                    MemoryRegion **ram_memory)
>> +                    MemoryRegion **ram_memory,
>> +                    uint64_t pci_hole64_size)
>>  {
>>      int linux_boot, i;
>>      MemoryRegion *option_rom_mr;
>> diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c
>> index d9b344248dac..5a608e30e28f 100644
>> --- a/hw/i386/pc_piix.c
>> +++ b/hw/i386/pc_piix.c
>> @@ -91,6 +91,8 @@ static void pc_init1(MachineState *machine,
>>      MemoryRegion *pci_memory;
>>      MemoryRegion *rom_memory;
>>      ram_addr_t lowmem;
>> +    uint64_t hole64_size;
>> +    DeviceState *i440fx_dev;
>>
>>      /*
>>       * Calculate ram split, for memory below and above 4G.  It's a bit
>> @@ -164,9 +166,13 @@ static void pc_init1(MachineState *machine,
>>          pci_memory = g_new(MemoryRegion, 1);
>>          memory_region_init(pci_memory, NULL, "pci", UINT64_MAX);
>>          rom_memory = pci_memory;
>> +        i440fx_dev = qdev_new(host_type);
>> +        hole64_size = i440fx_pci_hole64_size(i440fx_dev);
>>      } else {
>>          pci_memory = NULL;
>>          rom_memory = system_memory;
>> +        i440fx_dev = NULL;
>> +        hole64_size = 0;
>>      }
>>
>>      pc_guest_info_init(pcms);
>> @@ -183,7 +189,7 @@ static void pc_init1(MachineState *machine,
>>      /* allocate ram and load rom/bios */
>>      if (!xen_enabled()) {
>>          pc_memory_init(pcms, system_memory,
>> -                       rom_memory, &ram_memory);
>> +                       rom_memory, &ram_memory, hole64_size);
>>      } else {
>>          pc_system_flash_cleanup_unused(pcms);
>>          if (machine->kernel_filename != NULL) {
>> @@ -199,7 +205,7 @@ static void pc_init1(MachineState *machine,
>>
>>          pci_bus = i440fx_init(host_type,
>>                                pci_type,
>> -                              &i440fx_state,
>> +                              i440fx_dev, &i440fx_state,
>>                                system_memory, system_io, machine->ram_size,
>>                                x86ms->below_4g_mem_size,
>>                                x86ms->above_4g_mem_size,
>> diff --git a/hw/i386/pc_q35.c b/hw/i386/pc_q35.c
>> index 1780f79bc127..b7cf44d4755e 100644
>> --- a/hw/i386/pc_q35.c
>> +++ b/hw/i386/pc_q35.c
>> @@ -203,12 +203,13 @@ static void pc_q35_init(MachineState *machine)
>>                              pcms->smbios_entry_point_type);
>>      }
>>
>> -    /* allocate ram and load rom/bios */
>> -    pc_memory_init(pcms, get_system_memory(), rom_memory, &ram_memory);
>> -
>>      /* create pci host bus */
>>      q35_host = Q35_HOST_DEVICE(qdev_new(TYPE_Q35_HOST_DEVICE));
>>
>> +    /* allocate ram and load rom/bios */
>> +    pc_memory_init(pcms, get_system_memory(), rom_memory, &ram_memory,
>> +                   q35_host->mch.pci_hole64_size);
>> +
>>      object_property_add_child(qdev_get_machine(), "q35", OBJECT(q35_host));
>>      object_property_set_link(OBJECT(q35_host), MCH_HOST_PROP_RAM_MEM,
>>                               OBJECT(ram_memory), NULL);
>> diff --git a/hw/pci-host/i440fx.c b/hw/pci-host/i440fx.c
>> index e08716142b6e..c5cc28250d5c 100644
>> --- a/hw/pci-host/i440fx.c
>> +++ b/hw/pci-host/i440fx.c
>> @@ -237,7 +237,15 @@ static void i440fx_realize(PCIDevice *dev, Error **errp)
>>      }
>>  }
>>
>> +uint64_t i440fx_pci_hole64_size(DeviceState *i440fx_dev)
>> +{
>> +        I440FXState *i440fx = I440FX_PCI_HOST_BRIDGE(i440fx_dev);
>> +
>> +        return i440fx->pci_hole64_size;
>> +}
>> +
>>  PCIBus *i440fx_init(const char *host_type, const char *pci_type,
>> +                    DeviceState *dev,
>>                      PCII440FXState **pi440fx_state,
>>                      MemoryRegion *address_space_mem,
>>                      MemoryRegion *address_space_io,
>> @@ -247,7 +255,6 @@ PCIBus *i440fx_init(const char *host_type, const char 
>> *pci_type,
>>                      MemoryRegion *pci_address_space,
>>                      MemoryRegion *ram_memory)
>>  {
>> -    DeviceState *dev;
>>      PCIBus *b;
>>      PCIDevice *d;
>>      PCIHostState *s;
>> @@ -255,7 +262,6 @@ PCIBus *i440fx_init(const char *host_type, const char 
>> *pci_type,
>>      unsigned i;
>>      I440FXState *i440fx;
>>
>> -    dev = qdev_new(host_type);
>>      s = PCI_HOST_BRIDGE(dev);
>>      b = pci_root_bus_new(dev, NULL, pci_address_space,
>>                           address_space_io, 0, TYPE_PCI_BUS);
>> diff --git a/include/hw/i386/pc.h b/include/hw/i386/pc.h
>> index 9c9f4ac74810..d8b9c4ebd748 100644
>> --- a/include/hw/i386/pc.h
>> +++ b/include/hw/i386/pc.h
>> @@ -158,7 +158,8 @@ void xen_load_linux(PCMachineState *pcms);
>>  void pc_memory_init(PCMachineState *pcms,
>>                      MemoryRegion *system_memory,
>>                      MemoryRegion *rom_memory,
>> -                    MemoryRegion **ram_memory);
>> +                    MemoryRegion **ram_memory,
>> +                    uint64_t pci_hole64_size);
>>  uint64_t pc_pci_hole64_start(void);
>>  DeviceState *pc_vga_init(ISABus *isa_bus, PCIBus *pci_bus);
>>  void pc_basic_device_init(struct PCMachineState *pcms,
>> diff --git a/include/hw/pci-host/i440fx.h b/include/hw/pci-host/i440fx.h
>> index f068aaba8fda..1299d6a2b0e4 100644
>> --- a/include/hw/pci-host/i440fx.h
>> +++ b/include/hw/pci-host/i440fx.h
>> @@ -36,7 +36,7 @@ struct PCII440FXState {
>>  #define TYPE_IGD_PASSTHROUGH_I440FX_PCI_DEVICE "igd-passthrough-i440FX"
>>
>>  PCIBus *i440fx_init(const char *host_type, const char *pci_type,
>> -                    PCII440FXState **pi440fx_state,
>> +                    DeviceState *dev, PCII440FXState **pi440fx_state,
>>                      MemoryRegion *address_space_mem,
>>                      MemoryRegion *address_space_io,
>>                      ram_addr_t ram_size,
>> @@ -45,5 +45,6 @@ PCIBus *i440fx_init(const char *host_type, const char 
>> *pci_type,
>>                      MemoryRegion *pci_memory,
>>                      MemoryRegion *ram_memory);
>>
>> +uint64_t i440fx_pci_hole64_size(DeviceState *i440fx_dev);
>>
>>  #endif
>>
> 



reply via email to

[Prev in Thread] Current Thread [Next in Thread]