qemu-arm
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

RE: Invalid blob size on NVDIMM hot-add (was: RE: [RFC PATCH 0/4] ARM vi


From: Shameerali Kolothum Thodi
Subject: RE: Invalid blob size on NVDIMM hot-add (was: RE: [RFC PATCH 0/4] ARM virt: ACPI memory hotplug support)
Date: Tue, 24 Sep 2019 16:38:36 +0000


> -----Original Message-----
> From: Laszlo Ersek [mailto:address@hidden]
> Sent: 24 September 2019 16:53
> To: Shameerali Kolothum Thodi <address@hidden>;
> Igor Mammedov <address@hidden>
> Cc: Auger Eric <address@hidden>; address@hidden;
> address@hidden; address@hidden; address@hidden;
> xuwei (O) <address@hidden>; Linuxarm <address@hidden>; Ard
> Biesheuvel <address@hidden>; Leif Lindholm (Linaro address)
> <address@hidden>
> Subject: Re: Invalid blob size on NVDIMM hot-add (was: RE: [RFC PATCH 0/4]
> ARM virt: ACPI memory hotplug support)
> 
> On 09/20/19 19:04, Shameerali Kolothum Thodi wrote:
> > Hi Laszlo/Igor,
> >
> > I spend some time to debug this further as I was rebasing the nvdimm
> > hot-add support patches on top of the ongoing pc-dimm hot add ones.
> >
> > Just to refresh the memory:
> >
> > https://patchwork.kernel.org/cover/10783589/
> >
> > " It is observed that hot adding nvdimm will results in guest reboot
> > failure. EDK2 fails to build the ACPI tables on reboot. Please find
> > below EDK2 log on Guest reboot after nvdimm hot-add,
> >
> > ProcessCmdAddChecksum: invalid checksum range in "etc/acpi/tables"
> > OnRootBridgesConnected: InstallAcpiTables: Protocol Error
> > "
> >
> > Please find below,
> >
> >> -----Original Message-----
> >> From: Laszlo Ersek [mailto:address@hidden]
> >> Sent: 05 March 2019 12:15
> >> To: Igor Mammedov <address@hidden>
> >> Cc: Shameerali Kolothum Thodi <address@hidden>;
> >> Auger Eric <address@hidden>; address@hidden;
> >> address@hidden; address@hidden;
> address@hidden;
> >> xuwei (O) <address@hidden>; Linuxarm <address@hidden>; Ard
> >> Biesheuvel <address@hidden>; Leif Lindholm (Linaro
> >> address) <address@hidden>
> >> Subject: Re: [RFC PATCH 0/4] ARM virt: ACPI memory hotplug support
> >>
> >> On 03/01/19 18:39, Igor Mammedov wrote:
> >>> On Fri, 1 Mar 2019 14:49:45 +0100
> >>> Laszlo Ersek <address@hidden> wrote:
> >>>
> >>>> On 02/28/19 15:02, Shameerali Kolothum Thodi wrote:
> >>>>
> >>>>> Ah..I missed the fact that, firmware indeed sees an update in the
> >>>>> blob len here (rounded or not) after reboot. So don't think x86
> >>>>> has the same issue and padding is not the right solution as Igor
> >>>>> explained in his reply.
> >>>>>
> >>>>> I will try to debug this further. Any pointers welcome.
> >>>>
> >>>> How about this.
> >>>>
> >>>> (1) The firmware looks up the fw_cfg file called "etc/table-loader"
> >>>> in the fw_cfg file directory (identified by constant selector key
> >>>> 0x0019, FW_CFG_FILE_DIR).
> >>>>
> >>>> (2) The directory entry, once found, tells the firmware two things
> >>>> simultaneously. The selector key, and the size of the blob.
> >>>>
> >>>> (3) The firmware selects the selector key from step (2).
> >>>>
> >>>> (4) QEMU regenerates the ACPI payload (as a select callback).
> >>>>
> >>>> (5) The firmware reads the number of bytes from the fw_cfg blob
> >>>> that it learned in step (2).
> >>>>
> >>>> Here's the problem. As long as QEMU used to perform step (4) only
> >>>> for the purpose of refreshing PCI resources in the ACPI payload,
> >>>> step (4) wouldn't *resize* the blob.
> >>>>
> >>>> However, if step (4) enlarges the blob, then the byte count that
> >>>> step (5) uses -- from step (2) -- for reading, is obsolete.
> >>
> >>> I've thought that was a problem with IO based fw_cfg, as reading
> >>> size/content were separates steps and that it was solved by DMA
> >>> based fw_cfg file read.
> >>
> >> The DMA backend is not relevant for this question, for two reasons:
> >>
> >> (a) The question whether the fw_cfg transfer takes places with port
> >> IO vs. DMA is hidden from the fw_cfg client code; that code goes
> >> through an abstract library API.
> >>
> >> (b) While the DMA method indeed lets the firmware specify the details
> >> of the transfer with one action, the issue is with the number of
> >> bytes that the firmware requests (that is, not with *how* the
> >> firmware requests the transfer). The firmware has to know the size of
> >> the transfer before it can initiate the transfer (regardless of port
> >> IO vs. DMA).
> >>
> >>
> >> My question is: assume the firmware item in question is selected, and
> >> the QEMU-side select callback runs (regenerating the ACPI payload).
> >> Does this action update the blob size in the fw_cfg file directory as
> >> well?
> >
> > I think it doesn't update the blob size on select callback which is
> > the root cause of this issue. And the reason looks like,
> > qemu_ram_resize() function returns without invoking the callback to
> > update the blob size.
> >
> > On boot up, Qemu builds the table and exposes it to guest,
> >       virt_acpi_setup()
> >         acpi_add_rom_blob()
> >           rom_add_blob()
> >             rom_set_mr()  --> mr is allocated here and ram_block
> used_length = HOST_PAGE_ALIGN(blob size);
> >             fw_cfg_add_file_callback()
> >               fw_cfg_add_bytes_callback() --> This uses the blob size
> passed into it.
> >
> > On select callback path,
> >
> > virt_acpi_build_update()
> >    acpi_ram_update()
> >     memory_region_ram_resize()
> >       qemu_ram_resize() -->. Here the newsize gets aligned to HOST_PAGE
> and callback is only called used_length != newsize.
> >
> > https://github.com/qemu/qemu/blob/master/exec.c#L2180
> >
> > Debug logs:
> > Initial boot:
> > ##QEMU_DEBUG## rom_add_blob: file etc/acpi/tables size 0x64f7
> > ##QEMU_DEBUG## fw_cfg_add_bytes_callback: key 0x21 len 0x64f7
> > ........
> > ........
> > ###UEFI#### InstallQemuFwCfgTables: "etc/table-loader" has FwCfgItem
> 0x27 size 0xD00
> > ##QEMU_DEBUG## virt_acpi_build_update:
> > ##QEMU_DEBUG## acpi_ram_update: size 0x64f7
> > ##QEMU_DEBUG## qemu_ram_resize: idstr /rom@etc/acpi/tables
> used_length  0x7000 newsize 0x7000 --> No callback.
> > .....
> > ######UEFI###### ProcessCmdAllocate:
> QemuFwCfgFindFile("etc/acpi/tables"): size 0x64F7 --> UEFI get the actual 
> size,
> which is fine for now.
> >
> > Hot-add nvdimms and reboot.
> >
> > root@ubuntu:/# reboot
> > .......
> > ........
> > ###UEFI#### InstallQemuFwCfgTables: "etc/table-loader" has FwCfgItem
> 0x27 size 0xD00
> > ##QEMU_DEBUG## virt_acpi_build_update:
> > ##QEMU_DEBUG## acpi_ram_update: size 0x6667 --> Size changed.
> > ##QEMU_DEBUG## qemu_ram_resize: idstr /rom@etc/acpi/tables
> used_length  0x7000 newsize 0x7000 --> newsize is still aligned to 0x7000 and
> no callback to update.
> > ......
> > ######UEFI###### ProcessCmdAllocate:
> QemuFwCfgFindFile("etc/acpi/tables"): size 0x64F7 -->UEFI still sees the old
> value and fails.
> >
> > This can be fixed by,
> >
> > diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c index
> f3bd45675b..79da3fd35d 100644
> > --- a/hw/arm/virt-acpi-build.c
> > +++ b/hw/arm/virt-acpi-build.c
> > @@ -854,6 +854,9 @@ void virt_acpi_build(VirtMachineState *vms,
> AcpiBuildTables *tables)
> >          build_rsdp(tables->rsdp, tables->linker, &rsdp_data);
> >      }
> >
> > +    g_array_set_size(tables_blob,
> > +
> TARGET_PAGE_ALIGN(acpi_data_len(tables_blob)));
> > +
> >      /* Cleanup memory that's no longer used. */
> >      g_array_free(table_offsets, true);
> >  }
> >
> > But I am not sure this is the best to way fix this issue (Or I am
> > missing something here).
> >
> > Please let me know.
> 
> The above QEMU patch, for virt_acpi_build(), may be necessary, but I
> don't think it is sufficient.
> 
> For the firmware to see the updated (enlarged) blob, two things are
> required:
> 
> (a) QEMU to update the blob size in the fw_cfg directory entry.
> 
> Note: to the firmware, it is totally irrelevant if QEMU updates some
> *other* value or field that reflects the fresh blob size. The only thing
> the firmware can see is the entry in the FW_CFG_FILE_DIR blob.
> 
> To illustrate the field I'm referring to, see:
> 
>     s->files->f[index].size   = cpu_to_be32(len);
> 
> in fw_cfg_add_file_callback().
> 
> See also:
> 
>             s->files->f[i].size   = cpu_to_be32(len);
> 
> in fw_cfg_modify_file().
> 
> That "size" field is what the firmware can see.

Agree. And what I am trying to illustrate above is why this update is not 
happening.
 
The Qemu path on select key from firmware is(I think),
 
fw_cfg_select()
 e->select_cb() -->Callback registered in virt_acpi_setup() --> 
acpi_add_rom_blob() ..... -> fw_cfg_add_file_callback()
  virt_acpi_build_update()
   acpi_ram_update()
    memory_region_ram_resize()
     qemu_ram_resize()
      block->resized()  --> Callback registered in virt_acpi_setup() -> 
acpi_add_rom_blob()-> rom_add_blob() -> rom_set_mr()
       fw_cfg_resized()
        fw_cfg_modify_file()
         s->files->f[i].size  = cpu_to_be32(len); Updates the size for firmware

But as I was trying to explain above, the callback function fw_cfg_resized() is 
not
invoked in qemu_ram_resize()because while adding the mr region(rom_set_mr()),
it uses the aligned size. And as a result if the updated/modified blob size is 
not greater
than the aligned size the qemu_ram_resize() returns immediately. 

int qemu_ram_resize(RAMBlock *block, ram_addr_t newsize, Error **errp) 
{ 
    assert(block); 

        newsize = HOST_PAGE_ALIGN(newsize); 

    if (block->used_length == newsize) { 
        return 0; 
    } 
...

    if (block->resized) { 
        block->resized(block->idstr, newsize, block->host);  ---> registered 
callback, fw_cfg_resized()
    } 
    return 0; 
}

> Note: *all* relevant fw_cfg files must have their "size" fields updated
> in the directory blob (FW_CFG_FILE_DIR). I.e. the requirement applies to
> both the linker-loader script, and to all blobs that are referenced (by
> name) by the commands in the linker-loader script.
> 
> 
> (b) The firmware to re-read the size from the directory, after selecting
> the key for the sake of ACPI regeneration.

Hmm...I am not sure this is required. In my testing with the above fix I 
mentioned,
I can see that firmware is reading the correct(modified) size on select.
I will double check this though.

Thanks,
Shameer

> I wrote:
> 
> >> If it does, then I can work around the problem in the firmware. I can
> >> add a re-lookup to the code after the item selection, in order to get
> >> the fresh blob size from the fw_cfg file directory. Then we can use
> >> that size for the actual transfer.
> >>
> >> This won't help old firmware on new QEMU, but at least new firmware
> >> on old QEMU will not be hurt (the re-fetching of the fw_cfg file
> >> directory will come with a small performance penalty, but
> >> functionally it will be a no-op).
> 
> Thus, the firmware patch in question would be:
> 
> | diff --git a/OvmfPkg/AcpiPlatformDxe/QemuFwCfgAcpi.c
> b/OvmfPkg/AcpiPlatformDxe/QemuFwCfgAcpi.c
> | index bc1a891dbaf1..07f70ffe158a 100644
> | --- a/OvmfPkg/AcpiPlatformDxe/QemuFwCfgAcpi.c
> | +++ b/OvmfPkg/AcpiPlatformDxe/QemuFwCfgAcpi.c
> | @@ -975,6 +975,24 @@ InstallQemuFwCfgTables (
> |    ORDERED_COLLECTION       *SeenPointers;
> |    ORDERED_COLLECTION_ENTRY *SeenPointerEntry, *SeenPointerEntry2;
> |
> | +  Status = QemuFwCfgFindFile ("etc/table-loader", &FwCfgItem,
> &FwCfgSize);
> | +  if (EFI_ERROR (Status)) {
> | +    return Status;
> | +  }
> | +
> | +  //
> | +  // By selecting "FwCfgItem", ask QEMU to regenerate the ACPI payload,
> with
> | +  // all PCI devices decoding their resources. Note: further selections
> | +  // of the same key will not repeat the patching.
> | +  //
> | +  EnablePciDecoding (&OriginalPciAttributes, &OriginalPciAttributesCount);
> | +  QemuFwCfgSelectItem (FwCfgItem);
> | +  RestorePciDecoding (OriginalPciAttributes, OriginalPciAttributesCount);
> | +
> | +  //
> | +  // The size of the script may have changed, possibly due to platform
> devices
> | +  // having been hot-plugged before platform reset. Re-read the size.
> | +  //
> |    Status = QemuFwCfgFindFile ("etc/table-loader", &FwCfgItem,
> &FwCfgSize);
> |    if (EFI_ERROR (Status)) {
> |      return Status;
> | @@ -989,10 +1007,8 @@ InstallQemuFwCfgTables (
> |    if (LoaderStart == NULL) {
> |      return EFI_OUT_OF_RESOURCES;
> |    }
> | -  EnablePciDecoding (&OriginalPciAttributes, &OriginalPciAttributesCount);
> |    QemuFwCfgSelectItem (FwCfgItem);
> |    QemuFwCfgReadBytes (FwCfgSize, LoaderStart);
> | -  RestorePciDecoding (OriginalPciAttributes, OriginalPciAttributesCount);
> |    LoaderEnd = LoaderStart + FwCfgSize / sizeof *LoaderEntry;
> |
> |    AllocationsRestrictedTo32Bit = NULL;
> 
> But, again, this only makes sense if QEMU implements (a).
> 
> Thanks
> Laszlo

reply via email to

[Prev in Thread] Current Thread [Next in Thread]