qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [PATCH v2 01/14] spapr: populate DRC entries for root d


From: Michael Roth
Subject: Re: [Qemu-devel] [PATCH v2 01/14] spapr: populate DRC entries for root dt node
Date: Mon, 20 Jan 2014 11:24:28 -0600
User-agent: alot/0.3.4

Quoting Alexey Kardashevskiy (2014-01-19 20:58:20)
> On 01/17/2014 07:51 AM, Michael Roth wrote:
> > Quoting Alexey Kardashevskiy (2013-12-15 22:54:42)
> >> On 12/16/2013 01:59 PM, Alexey Kardashevskiy wrote:
> >>> On 12/06/2013 09:32 AM, Michael Roth wrote:
> >>>> From: Nathan Fontenot <address@hidden>
> >>>>
> >>>> This add entries to the root OF node to advertise our PHBs as being
> >>>> DR-capable in according with PAPR specification.
> >>>>
> >>>> Each PHB is given a name of PHB<bus#>, advertised as a PHB type,
> >>>> and associated with a power domain of -1 (indicating to guests that
> >>>> power management is handled automatically by hardware).
> >>>>
> >>>> We currently allocate entries for up to 32 DR-capable PHBs, though
> >>>> this limit can be increased later.
> >>>>
> >>>> DrcEntry objects to track the state of the DR-connector associated
> >>>> with each PHB are stored in a 32-entry array, and each DrcEntry has
> >>>> in turn have a dynamically-sized number of child DR-connectors,
> >>>> which we will use later to track the state of DR-connectors
> >>>> associated with a PHB's physical slots.
> >>>>
> >>>> Signed-off-by: Nathan Fontenot <address@hidden>
> >>>> Signed-off-by: Michael Roth <address@hidden>
> >>>> ---
> >>>>  hw/ppc/spapr.c         |  132 
> >>>> ++++++++++++++++++++++++++++++++++++++++++++++++
> >>>>  include/hw/ppc/spapr.h |   33 ++++++++++++
> >>>>  2 files changed, 165 insertions(+)
> >>>>
> >>>> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> >>>> index 7e53a5f..ec3ba43 100644
> >>>> --- a/hw/ppc/spapr.c
> >>>> +++ b/hw/ppc/spapr.c
> >>>> @@ -81,6 +81,7 @@
> >>>>  #define HTAB_SIZE(spapr)        (1ULL << ((spapr)->htab_shift))
> >>>>  
> >>>>  sPAPREnvironment *spapr;
> >>>> +DrcEntry drc_table[SPAPR_DRC_TABLE_SIZE];
> >>>>  
> >>>>  int spapr_allocate_irq(int hint, bool lsi)
> >>>>  {
> >>>> @@ -276,6 +277,130 @@ static size_t create_page_sizes_prop(CPUPPCState 
> >>>> *env, uint32_t *prop,
> >>>>      return (p - prop) * sizeof(uint32_t);
> >>>>  }
> >>>>  
> >>>> +static void spapr_init_drc_table(void)
> >>>> +{
> >>>> +    int i;
> >>>> +
> >>>> +    memset(drc_table, 0, sizeof(drc_table));
> >>>> +
> >>>> +    /* For now we only care about PHB entries */
> >>>> +    for (i = 0; i < SPAPR_DRC_TABLE_SIZE; i++) {
> >>>> +        drc_table[i].drc_index = 0x2000001 + i;
> >>>> +    }
> >>>> +}
> >>>> +
> >>>> +DrcEntry *spapr_add_phb_to_drc_table(uint64_t buid, uint32_t state)
> >>>> +{
> >>>> +    DrcEntry *empty_drc = NULL;
> >>>> +    DrcEntry *found_drc = NULL;
> >>>> +    int i, phb_index;
> >>>> +
> >>>> +    for (i = 0; i < SPAPR_DRC_TABLE_SIZE; i++) {
> >>>> +        if (drc_table[i].phb_buid == 0) {
> >>>> +            empty_drc = &drc_table[i];
> >>>> +        }
> >>>> +
> >>>> +        if (drc_table[i].phb_buid == buid) {
> >>>> +            found_drc = &drc_table[i];
> >>>> +            break;
> >>>> +        }
> >>>> +    }
> >>>> +
> >>>> +    if (found_drc) {
> >>>> +        return found_drc;
> >>>> +    }
> >>>> +
> >>>> +    if (empty_drc) {
> >>>> +        empty_drc->phb_buid = buid;
> >>>> +        empty_drc->state = state;
> >>>> +        empty_drc->cc_state.fdt = NULL;
> >>>> +        empty_drc->cc_state.offset = 0;
> >>>> +        empty_drc->cc_state.depth = 0;
> >>>> +        empty_drc->cc_state.state = CC_STATE_IDLE;
> >>>> +        empty_drc->child_entries =
> >>>> +            g_malloc0(sizeof(DrcEntry) * SPAPR_DRC_PHB_SLOT_MAX);
> >>>> +        phb_index = buid - SPAPR_PCI_BASE_BUID;
> >>>> +        for (i = 0; i < SPAPR_DRC_PHB_SLOT_MAX; i++) {
> >>>> +            empty_drc->child_entries[i].drc_index =
> >>>> +                SPAPR_DRC_DEV_ID_BASE + (phb_index << 8) + (i << 3);
> >>>> +        }
> >>>> +        return empty_drc;
> >>>> +    }
> >>>> +
> >>>> +    return NULL;
> >>>> +}
> >>>> +
> >>>> +static void spapr_create_drc_dt_entries(void *fdt)
> >>>> +{
> >>>> +    char char_buf[1024];
> >>>> +    uint32_t int_buf[SPAPR_DRC_TABLE_SIZE + 1];
> >>>> +    uint32_t *entries;
> >>>> +    int offset, fdt_offset;
> >>>> +    int i, ret;
> >>>> +
> >>>> +    fdt_offset = fdt_path_offset(fdt, "/");
> >>>> +
> >>>> +    /* ibm,drc-indexes */
> >>>> +    memset(int_buf, 0, sizeof(int_buf));
> >>>> +    int_buf[0] = SPAPR_DRC_TABLE_SIZE;
> >>>> +
> >>>> +    for (i = 1; i <= SPAPR_DRC_TABLE_SIZE; i++) {
> >>>> +        int_buf[i] = drc_table[i-1].drc_index;
> >>>> +    }
> >>>> +
> >>>> +    ret = fdt_setprop(fdt, fdt_offset, "ibm,drc-indexes", int_buf,
> >>>> +                      sizeof(int_buf));
> >>>> +    if (ret) {
> >>>> +        fprintf(stderr, "Couldn't finalize ibm,drc-indexes property\n");
> >>>> +    }
> >>>> +
> >>>> +    /* ibm,drc-power-domains */
> >>>> +    memset(int_buf, 0, sizeof(int_buf));
> >>>> +    int_buf[0] = SPAPR_DRC_TABLE_SIZE;
> >>>> +
> >>>> +    for (i = 1; i <= SPAPR_DRC_TABLE_SIZE; i++) {
> >>>> +        int_buf[i] = 0xffffffff;
> >>>> +    }
> >>>> +
> >>>> +    ret = fdt_setprop(fdt, fdt_offset, "ibm,drc-power-domains", int_buf,
> >>>> +                      sizeof(int_buf));
> >>>> +    if (ret) {
> >>>> +        fprintf(stderr, "Couldn't finalize ibm,drc-power-domains 
> >>>> property\n");
> >>>> +    }
> >>>> +
> >>>> +    /* ibm,drc-names */
> >>>> +    memset(char_buf, 0, sizeof(char_buf));
> >>>> +    entries = (uint32_t *)&char_buf[0];
> >>>> +    *entries = SPAPR_DRC_TABLE_SIZE;
> >>>> +    offset = sizeof(*entries);
> >>>> +
> >>>> +    for (i = 0; i < SPAPR_DRC_TABLE_SIZE; i++) {
> >>>> +        offset += sprintf(char_buf + offset, "PHB %d", i + 1);
> >>>> +        char_buf[offset++] = '\0';
> >>>> +    }
> >>>> +
> >>>> +    ret = fdt_setprop(fdt, fdt_offset, "ibm,drc-names", char_buf, 
> >>>> offset);
> >>>> +    if (ret) {
> >>>> +        fprintf(stderr, "Couldn't finalize ibm,drc-names property\n");
> >>>> +    }
> >>>> +
> >>>> +    /* ibm,drc-types */
> >>>> +    memset(char_buf, 0, sizeof(char_buf));
> >>>> +    entries = (uint32_t *)&char_buf[0];
> >>>> +    *entries = SPAPR_DRC_TABLE_SIZE;
> >>>> +    offset = sizeof(*entries);
> >>>> +
> >>>> +    for (i = 0; i < SPAPR_DRC_TABLE_SIZE; i++) {
> >>>> +        offset += sprintf(char_buf + offset, "PHB");
> >>>> +        char_buf[offset++] = '\0';
> >>>> +    }
> >>>> +
> >>>> +    ret = fdt_setprop(fdt, fdt_offset, "ibm,drc-types", char_buf, 
> >>>> offset);
> >>>> +    if (ret) {
> >>>> +        fprintf(stderr, "Couldn't finalize ibm,drc-types property\n");
> >>>> +    }
> >>>> +}
> >>>> +
> >>>>  #define _FDT(exp) \
> >>>>      do { \
> >>>>          int ret = (exp);                                           \
> >>>> @@ -307,6 +432,8 @@ static void *spapr_create_fdt_skel(hwaddr 
> >>>> initrd_base,
> >>>>      int i, smt = kvmppc_smt_threads();
> >>>>      unsigned char vec5[] = {0x0, 0x0, 0x0, 0x0, 0x0, 0x80};
> >>>>  
> >>>> +    spapr_init_drc_table();
> >>>> +
> >>>>      fdt = g_malloc0(FDT_MAX_SIZE);
> >>>>      _FDT((fdt_create(fdt, FDT_MAX_SIZE)));
> >>>>  
> >>>> @@ -590,6 +717,7 @@ static void spapr_finalize_fdt(sPAPREnvironment 
> >>>> *spapr,
> >>>>      int ret;
> >>>>      void *fdt;
> >>>>      sPAPRPHBState *phb;
> >>>> +    DrcEntry *drc_entry;
> >>>>  
> >>>>      fdt = g_malloc(FDT_MAX_SIZE);
> >>>>  
> >>>> @@ -609,6 +737,8 @@ static void spapr_finalize_fdt(sPAPREnvironment 
> >>>> *spapr,
> >>>>      }
> >>>>  
> >>>>      QLIST_FOREACH(phb, &spapr->phbs, list) {
> >>>> +        drc_entry = spapr_add_phb_to_drc_table(phb->buid, 2 /* Unusable 
> >>>> */);
> >>>> +        g_assert(drc_entry);
> >>>>          ret = spapr_populate_pci_dt(phb, PHANDLE_XICP, fdt);
> >>>>      }
> >>>>  
> >>>> @@ -633,6 +763,8 @@ static void spapr_finalize_fdt(sPAPREnvironment 
> >>>> *spapr,
> >>>>          spapr_populate_chosen_stdout(fdt, spapr->vio_bus);
> >>>>      }
> >>>>  
> >>>> +    spapr_create_drc_dt_entries(fdt);
> >>>> +
> >>>>      _FDT((fdt_pack(fdt)));
> >>>>  
> >>>>      if (fdt_totalsize(fdt) > FDT_MAX_SIZE) {
> >>>> diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
> >>>> index b2f11e9..0f2e705 100644
> >>>> --- a/include/hw/ppc/spapr.h
> >>>> +++ b/include/hw/ppc/spapr.h
> >>>> @@ -299,6 +299,39 @@ typedef struct sPAPREnvironment {
> >>>>  #define KVMPPC_H_LOGICAL_MEMOP  (KVMPPC_HCALL_BASE + 0x1)
> >>>>  #define KVMPPC_HCALL_MAX        KVMPPC_H_LOGICAL_MEMOP
> >>>>  
> >>>> +/* For dlparable/hotpluggable slots */
> >>>> +#define SPAPR_DRC_TABLE_SIZE    32
> >>>> +#define SPAPR_DRC_PHB_SLOT_MAX  32
> >>>> +#define SPAPR_DRC_DEV_ID_BASE   0x40000000
> >>>> +
> >>>> +typedef struct ConfigureConnectorState {
> >>>> +    void *fdt;
> >>>> +    int offset_start;
> >>>> +    int offset;
> >>>> +    int depth;
> >>>> +    PCIDevice *dev;
> >>>> +    enum {
> >>>> +        CC_STATE_IDLE = 0,
> >>>> +        CC_STATE_PENDING = 1,
> >>>> +        CC_STATE_ACTIVE,
> >>>> +    } state;
> >>>> +} ConfigureConnectorState;
> >>>> +
> >>>> +typedef struct DrcEntry DrcEntry;
> >>>> +
> >>>> +struct DrcEntry {
> >>>> +    uint32_t drc_index;
> >>>> +    uint64_t phb_buid;
> >>>> +    void *fdt;
> >>>> +    int fdt_offset;
> >>>> +    uint32_t state;
> >>>> +    ConfigureConnectorState cc_state;
> >>>> +    DrcEntry *child_entries;
> >>>> +};
> >>>> +
> >>>> +extern DrcEntry drc_table[SPAPR_DRC_TABLE_SIZE];
> >>>> +DrcEntry *spapr_add_phb_to_drc_table(uint64_t buid, uint32_t state);
> >>>> +
> >>>>  extern sPAPREnvironment *spapr;
> >>>
> >>> So far we were trying to keep everything sPAPR-related in 
> >>> sPAPREnvironment.
> >>> Is @drc_table really that special?
> >>
> >>
> >> One more note - we are trying to add a "spapr" or "sPAPR" prefix to all
> >> global types defines in headers (such as sPAPRPHBState, spapr_pci_lsi,
> >> VIOsPAPRBus, sPAPREnvironment), it would be nice to have "spapr" in some
> >> form in these new types too.
> >>
> >> Or we could move the whole patch (except spapr_create_drc_dt_entries()) to
> >> hw/ppc/spapr_pci.c (and keep the original names) as it seems to be the only
> >> user of the whole DrcEntry and ConfigureConnectorState thing.
> >> And put a pointer to drc_table[] into @spapr (or make it static?)
> > 
> > That would work, but I think we'd need to move spapr_create_drc_dt_entries()
> > as well, or the bits that rely on DrcEntry at least. Though I worry
> > about scoping DrcEntry to spapr_pci.c at this early stage, as DR-capable
> > components other than PCI may come to rely on state that's captured by the
> > DrcEntry nodes, such as boot-time FDT generation and run-time management
> > (via ibm,configure-connector) of CPUs and memory.
> > 
> > Assuming that seems like a reasonable expectation, I think I'd prefer the
> > first option of using spapr-specific prefixes for global types and moving
> > drc_table into sPAPREnvironment
> 
> 
> I did not realize DRC is not just for PCI. How hard would it be to add hot
> plug support for a whole PHB? The current QEMU trend is to make QEMU
> monitor's "device_add" equal to the command line's "-device" which is not
> (yet) true for PHB but could be. Thanks.
> 

Would need to look at it a bit more closely to say for certain, but after
discussing it a bit Tyrel/Mike, I think the main considerations would be:

1) PHB hotplug/unplug would need to signal a different event type in it's
   check-exception/epow message, we have stubs in place for a PHB event type,
   so that's mostly a matter of adding special-casing in the hotplug callback
   for spapr-pci-host-bridge devices
2) The required properties for the OF node corresponding PHB will be different.
   Currently these are generated as part of the hotplug callback, and attached
   to the corresponding ConfigureConnectorState node to be fed to the guest
   via subsequent ibm,configure-connector RTAS calls, so we'd just hook the
   PHB's OF node generation code in there as.
3) The sysctl/kernel interface for handling PHB hotplug would be different,
   we'd be relying on the rpadlar kernel module
   (/sys/bus/pci/slots/control/add_slot) rather than rpaphp
   (/sys/bus/pci/slots/<slot>/power) or the PCI rescan fallback.
   This is mostly a matter of modifying the handling in the guest tools, namely
   in rtas_errd, to handle the event accordingly.

We also haven't done anything extensive using rpadlpar operations within qemu
guests, so there may be various odds/ends and possibly kernel changes needed to
get that working properly (as was the case for rpaphp, though thanks to the PCI
rescan workaround a new kernel isn't required for existing guests... a similar
fallback likely won't be available for rpadlpar)

But from a high-level view at least it seems fairly straight-forward. I'll see
if we can get a prototype working.

> 
> 
> >> The only remaining user of DrcEntry is spapr_hotplug_req_event() but this
> >> can be easily fixed by small helper like this:
> >>
> >> int spapr_phb_slot_to_drc_index(uint64_t buid, int slot)
> >> {
> >>         DrcEntry *drc_entry = spapr_phb_to_drc_entry(phb->buid);
> >>         if (!drc_entry) {
> >>                 drc_entry = spapr_add_phb_to_drc_table(phb->buid, 2);
> >>         }
> >>         return drc_entry->child_entries[slot].drc_index;
> >> }
> 
> 
> 
> 
> -- 
> Alexey




reply via email to

[Prev in Thread] Current Thread [Next in Thread]