qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [RFC] q35/mch: implement extended TSEG sizes


From: Laszlo Ersek
Subject: Re: [Qemu-devel] [RFC] q35/mch: implement extended TSEG sizes
Date: Tue, 6 Jun 2017 18:58:01 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.1.1

Gerd, can you please comment on this?

Thanks
Laszlo

On 05/30/17 23:26, Laszlo Ersek wrote:
> The q35 machine type currently lets the guest firmware select a 1MB, 2MB
> or 8MB TSEG (basically, SMRAM) size. In edk2/OVMF, we use 8MB, but even
> that is not enough when a lot of VCPUs (more than approx. 224) are
> configured -- SMRAM footprint scales largely proportionally with VCPU
> count.
> 
> Introduce a new property for "mch" called "extended-tseg-mbytes", which
> expresses (in megabytes) the user's choice of TSEG (SMRAM) size.
> 
> Invent a new, QEMU-specific register in the config space of the DRAM
> Controller, at offset 0x50, in order to allow guest firmware to query the
> TSEG (SMRAM) size.
> 
> According to Intel Document Number 316966-002, Table 5-1 "DRAM Controller
> Register Address Map (D0:F0)":
> 
>     Warning: Address locations that are not listed are considered Intel
>              Reserved registers locations. Reads to Reserved registers may
>              return non-zero values. Writes to reserved locations may
>              cause system failures.
> 
>              All registers that are defined in the PCI 2.3 specification,
>              but are not necessary or implemented in this component are
>              simply not included in this document. The
>              reserved/unimplemented space in the PCI configuration header
>              space is not documented as such in this summary.
> 
> Offsets 0x50 and 0x51 are not listed in Table 5-1. They are also not part
> of the standard PCI config space header. And they precede the capability
> list as well, which starts at 0xe0 for this device.
> 
> When the guest writes value 0xffff to this register, the value that can be
> read back is that of "mch.extended-tseg-mbytes" -- unless it remains
> 0xffff. The guest is required to write 0xffff first (as opposed to a
> read-only register) because PCI config space is generally not cleared on
> QEMU reset, and after S3 resume or reboot, new guest firmware running on
> old QEMU could read a guest OS-injected value from this register.
> 
> After reading the available "extended" TSEG size, the guest firmware may
> actually request that TSEG size by writing pattern 11b to the ESMRAMC
> register's TSEG_SZ bit-field. (The Intel spec referenced above defines
> only patterns 00b (1MB), 01b (2MB) and 10b (8MB); 11b is reserved.)
> 
> On the QEMU command line, the value can be set with
> 
>   -global mch.extended-tseg-mbytes=N
> 
> The default value for 2.10+ q35 machine types is 16. The value is limited
> to 0xfff (4095) at the moment, purely so that the product (4095 MB) can be
> stored to the uint32_t variable "tseg_size" in mch_update_smram(). Users
> are responsible for choosing sensible TSEG sizes.
> 
> On 2.9 and earlier q35 machine types, the default value is 0. This lets
> the 11b bit pattern in ESMRAMC.TSEG_SZ, and the register at offset 0x50,
> keep their original behavior.
> 
> When "extended-tseg-mbytes" is nonzero, the new register at offset 0x50 is
> set to that value on reset, for completeness.
> 
> PCI config space is migrated automatically, so no VMSD changes are
> necessary.
> 
> Cc: "Michael S. Tsirkin" <address@hidden>
> Cc: Gerd Hoffmann <address@hidden>
> Cc: Paolo Bonzini <address@hidden>
> Ref: https://bugzilla.redhat.com/show_bug.cgi?id=1447027
> Ref: https://lists.01.org/pipermail/edk2-devel/2017-May/010456.html
> Signed-off-by: Laszlo Ersek <address@hidden>
> ---
> 
> Notes:
>     I haven't yet written any OVMF code to interface with this; I figured
>     I'd ask for comments on the approach first. Thanks.
> 
>  include/hw/i386/pc.h      |  5 +++++
>  include/hw/pci-host/q35.h |  6 ++++++
>  hw/pci-host/q35.c         | 41 ++++++++++++++++++++++++++++++++++++++---
>  3 files changed, 49 insertions(+), 3 deletions(-)
> 
> diff --git a/include/hw/i386/pc.h b/include/hw/i386/pc.h
> index e447f5d8f4d1..78cd630f3133 100644
> --- a/include/hw/i386/pc.h
> +++ b/include/hw/i386/pc.h
> @@ -384,6 +384,11 @@ bool e820_get_entry(int, uint32_t, uint64_t *, uint64_t 
> *);
>  
>  #define PC_COMPAT_2_9 \
>      HW_COMPAT_2_9 \
> +    {\
> +        .driver   = "mch",\
> +        .property = "extended-tseg-mbytes",\
> +        .value    = stringify(0),\
> +    },\
>  
>  #define PC_COMPAT_2_8 \
>      HW_COMPAT_2_8 \
> diff --git a/include/hw/pci-host/q35.h b/include/hw/pci-host/q35.h
> index 53b6760c16c5..58983c00b32d 100644
> --- a/include/hw/pci-host/q35.h
> +++ b/include/hw/pci-host/q35.h
> @@ -60,6 +60,7 @@ typedef struct MCHPCIState {
>      uint64_t above_4g_mem_size;
>      uint64_t pci_hole64_size;
>      uint32_t short_root_bus;
> +    uint16_t ext_tseg_mbytes;
>  } MCHPCIState;
>  
>  typedef struct Q35PCIHost {
> @@ -91,6 +92,11 @@ typedef struct Q35PCIHost {
>  /* D0:F0 configuration space */
>  #define MCH_HOST_BRIDGE_REVISION_DEFAULT       0x0
>  
> +#define MCH_HOST_BRIDGE_EXT_TSEG_MBYTES        0x50
> +#define MCH_HOST_BRIDGE_EXT_TSEG_MBYTES_SIZE   2
> +#define MCH_HOST_BRIDGE_EXT_TSEG_MBYTES_QUERY  0xffff
> +#define MCH_HOST_BRIDGE_EXT_TSEG_MBYTES_MAX    0xfff
> +
>  #define MCH_HOST_BRIDGE_PCIEXBAR               0x60    /* 64bit register */
>  #define MCH_HOST_BRIDGE_PCIEXBAR_SIZE          8       /* 64bit register */
>  #define MCH_HOST_BRIDGE_PCIEXBAR_DEFAULT       0xb0000000
> diff --git a/hw/pci-host/q35.c b/hw/pci-host/q35.c
> index cd5c49616ef9..28cb97b60fa3 100644
> --- a/hw/pci-host/q35.c
> +++ b/hw/pci-host/q35.c
> @@ -134,7 +134,7 @@ static void q35_host_get_mmcfg_size(Object *obj, Visitor 
> *v, const char *name,
>      visit_type_uint32(v, name, &value, errp);
>  }
>  
> -static Property mch_props[] = {
> +static Property q35_host_props[] = {
>      DEFINE_PROP_UINT64(PCIE_HOST_MCFG_BASE, Q35PCIHost, parent_obj.base_addr,
>                          MCH_HOST_BRIDGE_PCIEXBAR_DEFAULT),
>      DEFINE_PROP_SIZE(PCI_HOST_PROP_PCI_HOLE64_SIZE, Q35PCIHost,
> @@ -154,7 +154,7 @@ static void q35_host_class_init(ObjectClass *klass, void 
> *data)
>  
>      hc->root_bus_path = q35_host_root_bus_path;
>      dc->realize = q35_host_realize;
> -    dc->props = mch_props;
> +    dc->props = q35_host_props;
>      /* Reason: needs to be wired up by pc_q35_init */
>      dc->user_creatable = false;
>      set_bit(DEVICE_CATEGORY_BRIDGE, dc->categories);
> @@ -369,7 +369,7 @@ static void mch_update_smram(MCHPCIState *mch)
>              tseg_size = 1024 * 1024 * 8;
>              break;
>          default:
> -            tseg_size = 0;
> +            tseg_size = 1024 * 1024 * (uint32_t)mch->ext_tseg_mbytes;
>              break;
>          }
>      } else {
> @@ -392,6 +392,17 @@ static void mch_update_smram(MCHPCIState *mch)
>      memory_region_transaction_commit();
>  }
>  
> +static void mch_update_ext_tseg_mbytes(MCHPCIState *mch)
> +{
> +    PCIDevice *pd = PCI_DEVICE(mch);
> +    uint8_t *reg = pd->config + MCH_HOST_BRIDGE_EXT_TSEG_MBYTES;
> +
> +    if (mch->ext_tseg_mbytes > 0 &&
> +        pci_get_word(reg) == MCH_HOST_BRIDGE_EXT_TSEG_MBYTES_QUERY) {
> +        pci_set_word(reg, mch->ext_tseg_mbytes);
> +    }
> +}
> +
>  static void mch_write_config(PCIDevice *d,
>                                uint32_t address, uint32_t val, int len)
>  {
> @@ -413,6 +424,11 @@ static void mch_write_config(PCIDevice *d,
>                         MCH_HOST_BRIDGE_SMRAM_SIZE)) {
>          mch_update_smram(mch);
>      }
> +
> +    if (ranges_overlap(address, len, MCH_HOST_BRIDGE_EXT_TSEG_MBYTES,
> +                       MCH_HOST_BRIDGE_EXT_TSEG_MBYTES_SIZE)) {
> +        mch_update_ext_tseg_mbytes(mch);
> +    }
>  }
>  
>  static void mch_update(MCHPCIState *mch)
> @@ -420,6 +436,7 @@ static void mch_update(MCHPCIState *mch)
>      mch_update_pciexbar(mch);
>      mch_update_pam(mch);
>      mch_update_smram(mch);
> +    mch_update_ext_tseg_mbytes(mch);
>  }
>  
>  static int mch_post_load(void *opaque, int version_id)
> @@ -457,6 +474,11 @@ static void mch_reset(DeviceState *qdev)
>      d->wmask[MCH_HOST_BRIDGE_SMRAM] = MCH_HOST_BRIDGE_SMRAM_WMASK;
>      d->wmask[MCH_HOST_BRIDGE_ESMRAMC] = MCH_HOST_BRIDGE_ESMRAMC_WMASK;
>  
> +    if (mch->ext_tseg_mbytes > 0) {
> +        pci_set_word(d->config + MCH_HOST_BRIDGE_EXT_TSEG_MBYTES,
> +                     MCH_HOST_BRIDGE_EXT_TSEG_MBYTES_QUERY);
> +    }
> +
>      mch_update(mch);
>  }
>  
> @@ -465,6 +487,12 @@ static void mch_realize(PCIDevice *d, Error **errp)
>      int i;
>      MCHPCIState *mch = MCH_PCI_DEVICE(d);
>  
> +    if (mch->ext_tseg_mbytes > MCH_HOST_BRIDGE_EXT_TSEG_MBYTES_MAX) {
> +        error_setg(errp, "invalid extended-tseg-mbytes value: %" PRIu16,
> +                   mch->ext_tseg_mbytes);
> +        return;
> +    }
> +
>      /* setup pci memory mapping */
>      pc_pci_as_mapping_init(OBJECT(mch), mch->system_memory,
>                             mch->pci_address_space);
> @@ -530,6 +558,12 @@ uint64_t mch_mcfg_base(void)
>      return MCH_HOST_BRIDGE_PCIEXBAR_DEFAULT;
>  }
>  
> +static Property mch_props[] = {
> +    DEFINE_PROP_UINT16("extended-tseg-mbytes", MCHPCIState, ext_tseg_mbytes,
> +                       16),
> +    DEFINE_PROP_END_OF_LIST(),
> +};
> +
>  static void mch_class_init(ObjectClass *klass, void *data)
>  {
>      PCIDeviceClass *k = PCI_DEVICE_CLASS(klass);
> @@ -538,6 +572,7 @@ static void mch_class_init(ObjectClass *klass, void *data)
>      k->realize = mch_realize;
>      k->config_write = mch_write_config;
>      dc->reset = mch_reset;
> +    dc->props = mch_props;
>      set_bit(DEVICE_CATEGORY_BRIDGE, dc->categories);
>      dc->desc = "Host bridge";
>      dc->vmsd = &vmstate_mch;
> 




reply via email to

[Prev in Thread] Current Thread [Next in Thread]