Re: [Qemu-devel] [PATCH v6 0/7] Add support for VM Generation ID
From: Michael S. Tsirkin
Subject: Re: [Qemu-devel] [PATCH v6 0/7] Add support for VM Generation ID
Date: Wed, 15 Feb 2017 22:09:48 +0200
On Wed, Feb 15, 2017 at 08:47:48PM +0100, Laszlo Ersek wrote:
> On 02/15/17 07:15, address@hidden wrote:
> > From: Ben Warren <address@hidden>
> >
> > This patch set adds support for passing a GUID to Windows guests. It
> > is a re-implementation of previous patch sets written by Igor Mammedov
> > et al, but this time passing the GUID data as a fw_cfg blob.
> >
> > This patch set has dependencies on new guest functionality, in
> > particular the support for a new linker-loader command and the ability
> > to write back data to QEMU over a DMA link. Work is in flight in both
> > SeaBIOS and OVMF to support this.
> >
> > v5->v6:
> > - Rebased to top of tree.
> > - Changed device from sysbus to a simple device. This removed the need
> > for adding dynamic sysbus support to pc_piix boards.
> > - Removed patch that introduced QWORD patching of AML.
> > - Removed ability to set GUID via QMP/HMP.
> > - Improved comments/documentation in code.
>
> So here's my testing with a RHEL-7 guest:
>
> (1) The command line option passed to QEMU is
>
> -device vmgenid,guid=00112233-4455-6677-8899-AABBCCDDEEFF
>
> This is the example GUID provided in the SMBIOS spec v3.0.0 (DSP0134),
> section 7.2.1 "System -- UUID". (SMBIOS is only relevant here because it
> codifies the fact that Microsoft consumes UUID in little-endian order.)
> The expected representation, according to the SMBIOS spec, is
>
> 33 22 11 00 55 44 77 66 88 99 AA BB CC DD EE FF
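The byte order above can be reproduced with Python's standard `uuid` module, whose `bytes_le` attribute stores the first three GUID fields little-endian. This is only an illustrative sketch; the helper name `guid_wire_bytes` is made up here and is not part of QEMU or OVMF.

```python
import uuid


def guid_wire_bytes(text: str) -> str:
    """Return the mixed-endian (Microsoft-style) byte layout of a GUID as hex."""
    # bytes_le swaps the first three fields (4, 2 and 2 bytes) to little-endian
    # and leaves the trailing 8 bytes in their textual order.
    raw = uuid.UUID(text).bytes_le
    return " ".join(f"{b:02X}" for b in raw)


print(guid_wire_bytes("00112233-4455-6677-8899-AABBCCDDEEFF"))
```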
>
> (2) Here's an excerpt from the OVMF log:
>
> > ProcessCmdAllocate: File="etc/vmgenid_guid" Alignment=0x1000 Zone=1
> > Size=0x1000 Address=0x7FE5C000
>
> This is where "etc/vmgenid_guid" is allocated and downloaded, the
> allocation address is 0x7FE5C000.
>
> > Select Item: 0x19
> > Select Item: 0x22
> > ProcessCmdAllocate: File="etc/acpi/tables" Alignment=0x40 Zone=1
> > Size=0x20000 Address=0x7E7AB000
> > ProcessCmdAddChecksum: File="etc/acpi/tables" ResultOffset=0x49 Start=0x40
> > Length=0x1403
> > ProcessCmdAddPointer: PointerFile="etc/acpi/tables"
> > PointeeFile="etc/acpi/tables" PointerOffset=0x1467 PointerSize=4
> > ProcessCmdAddPointer: PointerFile="etc/acpi/tables"
> > PointeeFile="etc/acpi/tables" PointerOffset=0x146B PointerSize=4
> > ProcessCmdAddChecksum: File="etc/acpi/tables" ResultOffset=0x144C
> > Start=0x1443 Length=0x74
> > ProcessCmdAddChecksum: File="etc/acpi/tables" ResultOffset=0x14C0
> > Start=0x14B7 Length=0x80
> > Select Item: 0x19
> > SaveCondensedWritePointerToS3Context: 0x002B/[0x00000000+8] := 0x7FE5C000
> > (0)
>
> This is where OVMF stashes the WRITE_POINTER command in "condensed"
> form, for S3. The fw_cfg selector value is 0x2B (for the fw_cfg file to
> be rewritten), the pointer is located at offset 0, has size 8, and the
> value to assign is 0x7FE5C000. And, this is #0 of the saved / condensed
> WRITE_POINTER commands.
>
> > Select Item: 0x2B
> > ProcessCmdWritePointer: PointerFile="etc/vmgenid_addr"
> > PointeeFile="etc/vmgenid_guid" PointerOffset=0x0 PointerSize=8
>
> This is where the WRITE_POINTER command is actually executed, during
> normal boot.
>
> > ProcessCmdAddPointer: PointerFile="etc/acpi/tables"
> > PointeeFile="etc/vmgenid_guid" PointerOffset=0x1561 PointerSize=4
>
> This is where we link "etc/vmgenid_guid" into VGIA.
>
> > ProcessCmdAddChecksum: File="etc/acpi/tables" ResultOffset=0x1540
> > Start=0x1537 Length=0xCA
> > ProcessCmdAddPointer: PointerFile="etc/acpi/tables"
> > PointeeFile="etc/acpi/tables" PointerOffset=0x1625 PointerSize=4
> > ProcessCmdAddPointer: PointerFile="etc/acpi/tables"
> > PointeeFile="etc/acpi/tables" PointerOffset=0x1629 PointerSize=4
> > ProcessCmdAddPointer: PointerFile="etc/acpi/tables"
> > PointeeFile="etc/acpi/tables" PointerOffset=0x162D PointerSize=4
> > ProcessCmdAddChecksum: File="etc/acpi/tables" ResultOffset=0x160A
> > Start=0x1601 Length=0x30
> > ProcessCmdAddPointer: PointerFile="etc/acpi/rsdp"
> > PointeeFile="etc/acpi/tables" PointerOffset=0x10 PointerSize=4
> > ProcessCmdAddChecksum: File="etc/acpi/rsdp" ResultOffset=0x8 Start=0x0
> > Length=0x24
> > InstallQemuFwCfgTables: unknown loader command: 0x0
> > InstallQemuFwCfgTables: unknown loader command: 0x0
> > InstallQemuFwCfgTables: unknown loader command: 0x0
> > InstallQemuFwCfgTables: unknown loader command: 0x0
> > InstallQemuFwCfgTables: unknown loader command: 0x0
> > InstallQemuFwCfgTables: unknown loader command: 0x0
> > InstallQemuFwCfgTables: unknown loader command: 0x0
> > InstallQemuFwCfgTables: unknown loader command: 0x0
> > InstallQemuFwCfgTables: unknown loader command: 0x0
> > InstallQemuFwCfgTables: unknown loader command: 0x0
> > InstallQemuFwCfgTables: unknown loader command: 0x0
> > InstallQemuFwCfgTables: unknown loader command: 0x0
> > InstallQemuFwCfgTables: unknown loader command: 0x0
> > InstallQemuFwCfgTables: unknown loader command: 0x0
> > InstallQemuFwCfgTables: unknown loader command: 0x0
> > Process2ndPassCmdAddPointer: checking for ACPI header in "etc/acpi/tables"
> > at 0x7E7AB000 (remaining: 0x20000): found "FACS" size 0x40
> > Process2ndPassCmdAddPointer: checking for ACPI header in "etc/acpi/tables"
> > at 0x7E7AB040 (remaining: 0x1FFC0): found "DSDT" size 0x1403
> > Process2ndPassCmdAddPointer: checking for ACPI header in "etc/vmgenid_guid"
> > at 0x7FE5C000 (remaining: 0x1000): not found; marking fw_cfg blob as opaque
>
> This is where the OVMF SDT Header Probe Suppressor does its job. (NB,
> the "opaque marking" has happened already in ProcessCmdWritePointer()
> too, above.)
>
> > Process2ndPassCmdAddPointer: checking for ACPI header in "etc/acpi/tables"
> > at 0x7E7AC443 (remaining: 0x1EBBD): found "FACP" size 0x74
> > Process2ndPassCmdAddPointer: checking for ACPI header in "etc/acpi/tables"
> > at 0x7E7AC4B7 (remaining: 0x1EB49): found "APIC" size 0x80
> > Process2ndPassCmdAddPointer: checking for ACPI header in "etc/acpi/tables"
> > at 0x7E7AC537 (remaining: 0x1EAC9): found "SSDT" size 0xCA
> > Process2ndPassCmdAddPointer: checking for ACPI header in "etc/acpi/tables"
> > at 0x7E7AC601 (remaining: 0x1E9FF): found "RSDT" size 0x30
> > TransferS3ContextToBootScript: boot script fragment saved,
> > ScratchBuffer=7FE4F018
>
> This is where the WRITE_POINTER commands, stashed earlier in condensed
> form, are translated to S3 Boot Script opcodes.
>
> > InstallQemuFwCfgTables: installed 5 tables
>
> Such as: FACS, DSDT, FACP, APIC, SSDT. OVMF recognizes RSDT and ignores
> it (it's handled by edk2 automatically).
>
> > InstallQemuFwCfgTables: freeing "etc/acpi/rsdp"
> > InstallQemuFwCfgTables: freeing "etc/acpi/tables"
>
> OVMF sees that the above two blobs have not been marked as "opaque" --
> they only contained ACPI tables, judged from the ADD_POINTER commands
> that pointed into them. So these two blobs are freed.
>
> Note that "etc/vmgenid_guid" is not freed.
>
> So, from the firmware log, everything looks OK.
>
> (3) I dumped the SSDT in the RHEL-7 guest:
>
> > /*
> > * Intel ACPI Component Architecture
> > * AML/ASL+ Disassembler version 20160527-64
> > * Copyright (c) 2000 - 2016 Intel Corporation
> > *
> > * Disassembling to symbolic ASL+ operators
> > *
> > * Disassembly of ssdt.dat, Wed Feb 15 19:21:11 2017
> > *
> > * Original Table Header:
> > * Signature "SSDT"
> > * Length 0x000000CA (202)
> > * Revision 0x01
> > * Checksum 0x1D
> > * OEM ID "BOCHS "
> > * OEM Table ID "VMGENID"
> > * OEM Revision 0x00000001 (1)
> > * Compiler ID "BXPC"
> > * Compiler Version 0x00000001 (1)
> > */
> > DefinitionBlock ("", "SSDT", 1, "BOCHS ", "VMGENID", 0x00000001)
> > {
> > Name (VGIA, 0x7FE5C000)
>
> Note that the value matches the value logged by the firmware in (2).
>
> > Scope (\_SB)
> > {
> > Device (VGEN)
> > {
> > Name (_HID, "QEMUVGID") // _HID: Hardware ID
> > Name (_CID, "VM_Gen_Counter") // _CID: Compatible ID
> > Name (_DDN, "VM_Gen_Counter") // _DDN: DOS Device Name
> > Method (_STA, 0, NotSerialized) // _STA: Status
> > {
> > Local0 = 0x0F
> > If (VGIA == Zero)
> > {
> > Local0 = Zero
> > }
> >
> > Return (Local0)
> > }
> >
> > Method (ADDR, 0, NotSerialized)
> > {
> > Local0 = Package (0x02) {}
> > Local0 [Zero] = (VGIA + 0x28)
> > Local0 [One] = Zero
> > Return (Local0)
> > }
> > }
> > }
> >
> > Method (\_GPE._E05, 0, NotSerialized) // _Exx: Edge-Triggered GPE
> > {
> > Notify (\_SB.VGEN, 0x80) // Status Change
> > }
> > }
>
> Looks good and matches the documentation.
>
> (4) To be sure, I checked the address against the guest dmesg, which
> contains a dump of the UEFI memory map:
>
> > [ 0.000000] efi: mem52: type=10, attr=0xf,
> > range=[0x000000007fe5a000-0x000000007fe5e000) (0MB)
>
> The page (4096 bytes) at 0x7FE5C000 falls into this range. Type=10 means
> EfiACPIMemoryNVS.
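The containment check in (4) is plain interval arithmetic. A small sketch, using only the numbers quoted above (the function name `page_in_range` is hypothetical):

```python
PAGE_SIZE = 0x1000  # 4096 bytes, the size of the "etc/vmgenid_guid" allocation


def page_in_range(page: int, start: int, end: int) -> bool:
    """True if the whole page lies inside the half-open range [start, end)."""
    return start <= page and page + PAGE_SIZE <= end


# Range taken from the dmesg line for mem52; address from the OVMF log.
print(page_in_range(0x7FE5C000, 0x7FE5A000, 0x7FE5E000))
```

The range spans 0x4000 bytes (16 KiB), which is why dmesg rounds it down to "0MB".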
>
> (5) At this point I dumped the guest RAM with the dump-guest-memory
> monitor command, opened it with "crash", and listed it:
>
> > crash> rd -p -8 0x7FE5C000 0x40
> > 7fe5c000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> > ................
> > 7fe5c010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> > ................
> > 7fe5c020: 00 00 00 00 00 00 00 00 33 22 11 00 55 44 77 66
> > ........3"..UDwf
> > 7fe5c030: 88 99 aa bb cc dd ee ff 00 00 00 00 00 00 00 00
> > ................
>
> We can see that the GUID starts at 0x7FE5C000 + 0x28, and also that the
> byte-level representation matches the little endian one given in (1).
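To double-check the dump by machine rather than by eye, the 0x40 bytes can be decoded back into the textual GUID. A minimal sketch, assuming the bytes shown by "crash" above; `guid_from_page` and `GUID_OFFSET` are names invented for this example:

```python
import uuid

GUID_OFFSET = 0x28  # matches the "VGIA + 0x28" computed by the ADDR method

# The 0x40 bytes listed by "crash": 0x28 zero bytes, the GUID, 8 zero bytes.
dump = bytes(0x28) + bytes.fromhex("33221100554477668899aabbccddeeff") + bytes(8)


def guid_from_page(page: bytes) -> uuid.UUID:
    """Decode the little-endian GUID stored at GUID_OFFSET in the blob."""
    return uuid.UUID(bytes_le=page[GUID_OFFSET:GUID_OFFSET + 16])


print(guid_from_page(dump))
```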
>
> This proves that the initial blob download worked fine.
>
> (6) Here I attached "gdb" to QEMU, set a breakpoint on
> vmgenid_handle_reset(), and allowed the inferior process to continue
> execution.
>
> Then I suspended and resumed the guest (ACPI S3). The breakpoint was hit
> during resume:
>
> > Breakpoint 1, vmgenid_handle_reset (opaque=0x7f2bd03c36e0) at
> > .../hw/acpi/vmgenid.c:205
> > 205 VmGenIdState *vms = VMGENID(opaque);
>
> First of all, before allowing QEMU to zero out the address blob, I
> listed the address and the contents of the address blob (here exploiting
> that my host is also little endian):
>
> > (gdb) print (void*)vms->vmgenid_addr_le
> > $2 = (void *) 0x7f2bd03c37b0
>
> > (gdb) print /x *(uint64_t*)vms->vmgenid_addr_le
> > $4 = 0x7fe5c000
>
> This proves that QEMU has the right address, matching the firmware log
> from (2), and the ACPI dump from (3).
>
> (7) At this point I allowed the inferior to proceed a bit:
>
> > (gdb) n
> > 207 memset(vms->vmgenid_addr_le, 0,
> > ARRAY_SIZE(vms->vmgenid_addr_le));
> > (gdb) n
> > 208 }
>
> I verified that the blob was zeroed:
>
> > (gdb) print /x *(uint64_t*)vms->vmgenid_addr_le
> > $5 = 0x0
>
> then allowed the inferior to run free.
>
> > (gdb) cont
> > Continuing.
>
> (8) New messages appeared in the firmware log:
>
> > S3ResumeExecuteBootScript()
> > PeiS3ResumeState - 7FF92B18
> > transfer control to Standalone Boot Script Executor
> > S3BootScriptExecute:
> > TableHeader - 0x7E7A7000
> > TableHeader.Version - 0x0001
> > TableHeader.TableLength - 0x000000ED
> > ExecuteBootScript - 7E7A700D
> > EFI_BOOT_SCRIPT_MEM_WRITE_OPCODE
> > BootScriptExecuteMemoryWrite - 0x7FE4F018, 0x00000010, 0x00000000
>
> Here the ACPI S3 Boot Script, prepared in
> TransferS3ContextToBootScript() -- see (2) -- creates a DMA access
> command for fw_cfg. The DMA access command is written to pre-reserved
> memory (see "ScratchBuffer" above).
>
> > S3BootScriptWidthUint8 - 0x7FE4F018 (0x00)
> > S3BootScriptWidthUint8 - 0x7FE4F019 (0x2B)
>
> The fw_cfg selector is 0x2B. (See under (2).)
>
> > S3BootScriptWidthUint8 - 0x7FE4F01A (0x00)
> > S3BootScriptWidthUint8 - 0x7FE4F01B (0x0C)
>
> This is a combined select+skip operation.
>
> > S3BootScriptWidthUint8 - 0x7FE4F01C (0x00)
> > S3BootScriptWidthUint8 - 0x7FE4F01D (0x00)
> > S3BootScriptWidthUint8 - 0x7FE4F01E (0x00)
> > S3BootScriptWidthUint8 - 0x7FE4F01F (0x00)
>
> The skip size is 0 bytes.
>
> > S3BootScriptWidthUint8 - 0x7FE4F020 (0x00)
> > S3BootScriptWidthUint8 - 0x7FE4F021 (0x00)
> > S3BootScriptWidthUint8 - 0x7FE4F022 (0x00)
> > S3BootScriptWidthUint8 - 0x7FE4F023 (0x00)
> > S3BootScriptWidthUint8 - 0x7FE4F024 (0x00)
> > S3BootScriptWidthUint8 - 0x7FE4F025 (0x00)
> > S3BootScriptWidthUint8 - 0x7FE4F026 (0x00)
> > S3BootScriptWidthUint8 - 0x7FE4F027 (0x00)
>
> The address is irrelevant for skip, so it's just nulled.
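The first four bytes written to the scratch buffer form the big-endian control word of the fw_cfg DMA access structure. The sketch below decodes them; the bit values are my understanding of QEMU's fw_cfg DMA interface (error 0x01, read 0x02, skip 0x04, select 0x08, write 0x10, selector in the upper 16 bits) and should be checked against the fw_cfg spec. The function name `decode_control` is hypothetical.

```python
import struct

# Assumed control bits of the fw_cfg DMA interface.
FLAGS = [
    (0x01, "error"),
    (0x02, "read"),
    (0x04, "skip"),
    (0x08, "select"),
    (0x10, "write"),
]


def decode_control(raw: bytes):
    """Split a 4-byte big-endian control word into (selector, flag names)."""
    (control,) = struct.unpack(">I", raw)
    selector = control >> 16
    names = [name for bit, name in FLAGS if control & bit]
    return selector, names


# Bytes logged at 0x7FE4F018..0x7FE4F01B.
print(decode_control(bytes([0x00, 0x2B, 0x00, 0x0C])))
```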
>
> > ExecuteBootScript - 7E7A7030
> > EFI_BOOT_SCRIPT_IO_WRITE_OPCODE
> > BootScriptExecuteIoWrite - 0x00000514, 0x00000002, 0x00000002
> > S3BootScriptWidthUint32 - 0x00000514 (0x00000000)
> > S3BootScriptWidthUint32 - 0x00000518 (0x18F0E47F)
>
> The Boot Script passes the DMA command to QEMU, by writing the address
> of the command buffer to IO ports 0x514 and 0x518, in BE byte order.
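The odd-looking 0x18F0E47F is just the command buffer address 0x7FE4F018 after byte swapping. A sketch of the arithmetic, assuming a little-endian guest CPU doing two 32-bit port writes (the helper name `dma_port_values` is made up):

```python
import struct


def dma_port_values(addr: int):
    """Return the 32-bit values a little-endian CPU must write to ports
    0x514 (high half) and 0x518 (low half) so that the device sees the
    64-bit address in big-endian byte order."""
    big_endian = struct.pack(">Q", addr)
    high, low = struct.unpack("<II", big_endian)
    return high, low


high, low = dma_port_values(0x7FE4F018)
print(f"0x514 <- 0x{high:08X}, 0x518 <- 0x{low:08X}")
```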
>
> > ExecuteBootScript - 7E7A704B
> > EFI_BOOT_SCRIPT_MEM_POLL_OPCODE
> > BootScriptExecuteMemPoll - 0x7FE4F018, 0x00000000FFFFFFFF,
> > 0x0000000000000000
> > S3BootScriptWidthUint32 - 0x7FE4F018
> > ExecuteBootScript - 7E7A7072
>
> This waits until the DMA command succeeds (reading back the Control
> field continuously until it reads as zero).
>
> > EFI_BOOT_SCRIPT_MEM_WRITE_OPCODE
> > BootScriptExecuteMemoryWrite - 0x7FE4F018, 0x00000018, 0x00000000
>
> This is another DMA access command for fw_cfg, prepared in the same
> pre-reserved buffer. This time
>
> > S3BootScriptWidthUint8 - 0x7FE4F018 (0x00)
> > S3BootScriptWidthUint8 - 0x7FE4F019 (0x00)
> > S3BootScriptWidthUint8 - 0x7FE4F01A (0x00)
> > S3BootScriptWidthUint8 - 0x7FE4F01B (0x10)
>
> we request a write operation,
>
> > S3BootScriptWidthUint8 - 0x7FE4F01C (0x00)
> > S3BootScriptWidthUint8 - 0x7FE4F01D (0x00)
> > S3BootScriptWidthUint8 - 0x7FE4F01E (0x00)
> > S3BootScriptWidthUint8 - 0x7FE4F01F (0x08)
>
> with a length of 8 bytes (big endian), matching the pointer size,
>
> > S3BootScriptWidthUint8 - 0x7FE4F020 (0x00)
> > S3BootScriptWidthUint8 - 0x7FE4F021 (0x00)
> > S3BootScriptWidthUint8 - 0x7FE4F022 (0x00)
> > S3BootScriptWidthUint8 - 0x7FE4F023 (0x00)
> > S3BootScriptWidthUint8 - 0x7FE4F024 (0x7F)
> > S3BootScriptWidthUint8 - 0x7FE4F025 (0xE4)
> > S3BootScriptWidthUint8 - 0x7FE4F026 (0xF0)
> > S3BootScriptWidthUint8 - 0x7FE4F027 (0x28)
>
> the data to transfer is located at 0x7FE4F028 (just below, tacked to the
> command buffer itself),
>
> > S3BootScriptWidthUint8 - 0x7FE4F028 (0x00)
> > S3BootScriptWidthUint8 - 0x7FE4F029 (0xC0)
> > S3BootScriptWidthUint8 - 0x7FE4F02A (0xE5)
> > S3BootScriptWidthUint8 - 0x7FE4F02B (0x7F)
> > S3BootScriptWidthUint8 - 0x7FE4F02C (0x00)
> > S3BootScriptWidthUint8 - 0x7FE4F02D (0x00)
> > S3BootScriptWidthUint8 - 0x7FE4F02E (0x00)
> > S3BootScriptWidthUint8 - 0x7FE4F02F (0x00)
>
> and the data to write is the original allocation address of the blob
> (0x7fe5c000).
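Put together, the 0x18 bytes written at 0x7FE4F018 are a 16-byte big-endian command header followed by the 8-byte little-endian pointer value. The sketch below rebuilds them for comparison with the log; the write bit value (0x10) is my reading of the fw_cfg DMA interface, and `build_write_command` is a hypothetical helper, not OVMF code.

```python
import struct

FW_CFG_DMA_CTL_WRITE = 0x10  # assumed value of the "write" control bit
HEADER_SIZE = 16             # control (4) + length (4) + address (8)


def build_write_command(buffer_addr: int, value: int) -> bytes:
    """Build the DMA write command plus payload as laid out in the log."""
    data_addr = buffer_addr + HEADER_SIZE  # payload sits right after the header
    header = struct.pack(">IIQ", FW_CFG_DMA_CTL_WRITE, 8, data_addr)
    payload = struct.pack("<Q", value)     # the guest address, little-endian
    return header + payload


cmd = build_write_command(0x7FE4F018, 0x7FE5C000)
print(" ".join(f"{b:02X}" for b in cmd))
```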
>
> > ExecuteBootScript - 7E7A709D
> > EFI_BOOT_SCRIPT_IO_WRITE_OPCODE
> > BootScriptExecuteIoWrite - 0x00000514, 0x00000002, 0x00000002
> > S3BootScriptWidthUint32 - 0x00000514 (0x00000000)
> > S3BootScriptWidthUint32 - 0x00000518 (0x18F0E47F)
> > ExecuteBootScript - 7E7A70B8
> > EFI_BOOT_SCRIPT_MEM_POLL_OPCODE
> > BootScriptExecuteMemPoll - 0x7FE4F018, 0x00000000FFFFFFFF,
> > 0x0000000000000000
> > S3BootScriptWidthUint32 - 0x7FE4F018
> > ExecuteBootScript - 7E7A70DF
>
> Same story as above: fire off the transfer and wait until it completes.
>
> > EFI_BOOT_SCRIPT_INFORMATION_OPCODE
> > BootScriptExecuteInformation - 0x7E7A70E6
> > BootScriptInformation: DE AD BE EF
> > ExecuteBootScript - 7E7A70EA
> > S3_BOOT_SCRIPT_LIB_TERMINATE_OPCODE
> > S3BootScriptDone - Success
> > [...]
>
> The DEADBEEF informational (no-op) opcode is something that OVMF appends
> to the very end for hysterical raisins.
>
> (9) Okay, so the guest is now resumed and running. Let's interrupt it in
> gdb again, and check the contents of the address blob again (we know the
> address of the address blob from step (6)):
>
> > ^C
> > Program received signal SIGINT, Interrupt.
> > 0x00007f2bbf1d1ebf in ppoll () from /lib64/libc.so.6
> > (gdb) print /x *(uint64_t*)0x7f2bd03c37b0
> > $6 = 0x7fe5c000
>
> Et voila.
>
> (10) I detached gdb from QEMU, and issued the following monitor command:
>
> > $ virsh qemu-monitor-command ovmf.rhel7 --hmp 'info vm-generation-id'
> > 00112233-4455-6677-8899-aabbccddeeff
>
> (11) I also booted a Windows Server 2012 R2 guest (Q35, broadcast SMI
> enabled) with a similar vmgenid device/parameter. According to Device
> Manager | System devices, "Microsoft Hyper-V Generation Counter" is
> working properly.
>
> I also tested S3 briefly, it worked okay. (I mentioned the SMI broadcast
> above because for that, OVMF generates an independent S3 Boot Script
> fragment.)
>
>
> I'll let someone else test live migration.
>
> For patches #1, #3, #4 and #5:
>
> Tested-by: Laszlo Ersek <address@hidden>
>
> I'll soon post the OVMF patches.
>
> Thanks!
> Laszlo
How do you feel about Igor's request to change WRITE_POINTER to add an
offset in there, so the guest can pass in the address of the GUID and
not the start of the table? Would that be a lot of work to add?
--
MST