[Qemu-devel] [RFC PATCH 0/4] nvdimm: enable flush hint address structure


From: Haozhong Zhang
Subject: [Qemu-devel] [RFC PATCH 0/4] nvdimm: enable flush hint address structure
Date: Fri, 31 Mar 2017 16:41:43 +0800

This patch series constructs the flush hint address structures for
nvdimm devices in QEMU.

It's of course not for 2.9. I send it out early in order to get
comments on one point I'm uncertain about (see the detailed explanation
below). Thanks in advance for any comments!


Background
---------------
The flush hint address structure is a substructure of NFIT that specifies
one or more addresses, called Flush Hint Addresses. Software can write
to any one of these flush hint addresses to cause any preceding writes
to the NVDIMM region to be flushed out of the intervening platform
buffers to the targeted NVDIMM. More details can be found in ACPI Spec
6.1, Section 5.2.25.8 "Flush Hint Address Structure".
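
As an illustration only (not code from this series), the byte layout of
one such structure can be sketched as below. The field order and sizes
follow the ACPI 6.1 table as best understood here (type 6, a 16-byte
header, then one 8-byte address per hint) and should be checked against
the spec; the function name and the example device handle are
hypothetical.

```python
import struct

NFIT_TYPE_FLUSH_HINT = 6  # NFIT structure type of the flush hint address structure
HEADER_SIZE = 16          # bytes that precede the first flush hint address


def build_flush_hint_structure(device_handle, hint_addresses):
    """Pack one flush hint address structure as little-endian bytes.

    Assumed layout: type (2), length (2), NFIT device handle (4),
    number of hint addresses (2), reserved (6), then 8 bytes per address.
    """
    count = len(hint_addresses)
    length = HEADER_SIZE + 8 * count
    header = struct.pack("<HHIH6x", NFIT_TYPE_FLUSH_HINT, length,
                         device_handle, count)
    body = b"".join(struct.pack("<Q", addr) for addr in hint_addresses)
    return header + body


if __name__ == "__main__":
    # Hypothetical handle 1 with two hint addresses, 64 bytes apart.
    blob = build_flush_hint_structure(1, [0x100000000, 0x100000040])
    print(len(blob), blob.hex())
```

A guest that finds this structure in NFIT may write to any of the listed
addresses to request a flush.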


Why is it RFC?
---------------
This series is marked RFC because I'm not sure whether the way it
allocates the guest flush hint addresses is right.

QEMU needs to trap guest accesses (at least for writes) to the flush
hint addresses in order to perform the necessary flush on the host
back store. Therefore, QEMU needs to create IO memory regions that
cover those flush hint addresses. In order to create those IO memory
regions, QEMU needs to know the flush hint addresses, or their offsets
relative to other known memory regions, in advance. So far, so good.
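
The trapping idea can be pictured with a toy model. This is a sketch
under stated assumptions, not QEMU's memory API: the classes IoRegion
and ToyAddressSpace and the helper make_flush_handler are hypothetical
names, and the flush on the back store is represented by a plain
callback (in a real setup it might be something like an fsync on the
backing file, which is also an assumption here).

```python
class IoRegion:
    """Toy stand-in for an IO memory region: writes invoke a handler."""

    def __init__(self, base, size, on_write):
        self.base = base
        self.size = size
        self.on_write = on_write

    def contains(self, addr):
        return self.base <= addr < self.base + self.size


class ToyAddressSpace:
    """Toy guest address space that dispatches writes to IO regions."""

    def __init__(self):
        self.regions = []

    def add_region(self, region):
        self.regions.append(region)

    def write(self, addr, data):
        """Dispatch a guest write; return True if an IO region trapped it."""
        for region in self.regions:
            if region.contains(addr):
                region.on_write(addr - region.base, data)
                return True
        return False  # ordinary memory write, nothing to trap


def make_flush_handler(flush_backend, log):
    """Build a write handler that records the access, then flushes."""
    def handler(offset, data):
        log.append((offset, data))
        flush_backend()
    return handler


if __name__ == "__main__":
    log = []
    space = ToyAddressSpace()
    handler = make_flush_handler(lambda: print("flush back store"), log)
    space.add_region(IoRegion(0x1000, 0x1000, handler))
    space.write(0x1004, 0)
    print(log)
```

The model only works because the region's base address is known when the
region is registered, which is exactly the requirement stated above.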

Flush hint addresses are in the guest address space. Looking at how
the current NVDIMM ACPI in QEMU allocates the DSM buffer, it's natural
to take the same way for flush hint addresses, i.e. let the guest
firmware allocate from free addresses and patch them in the flush hint
address structure. (*Please correct me if my understanding below is wrong.*)
However, the current allocation and pointer patching are transparent
to QEMU, so QEMU would be unaware of the flush hint addresses and
consequently would have no way to create the corresponding IO memory
regions to trap guest accesses.

Instead, this patch series moves the allocation of flush hint
addresses into QEMU:

1. (Patch 1) We reserve an address range after the end address of each
   nvdimm device. Its size is specified by the user via a new pc-dimm
   option 'reserved-size'.

   For the following example,
        -object memory-backend-file,id=mem0,size=4G,...
        -device nvdimm,id=dimm0,memdev=mem0,reserved-size=4K,...
        -device pc-dimm,id=dimm1,...
   if dimm0 is allocated to address N ~ N+4G, the address of dimm1
   will start from N+4G+4K or higher. N+4G ~ N+4G+4K is reserved for
   dimm0.

2. (Patch 4) When the NVDIMM ACPI code builds the flush hint address
   structure for each nvdimm device, it allocates the addresses from
   the reserved area above, e.g. the flush hint addresses of dimm0 are
   allocated in N+4G ~ N+4G+4K. The addresses are thus known to QEMU,
   so QEMU can easily create IO memory regions for them.

   If the reserved area is not present or too small, QEMU will report
   errors.
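
The address arithmetic of the two steps above can be sketched as
follows. This is an illustration under assumptions, not the code in the
patches: place_dimms and alloc_flush_hints are hypothetical helpers,
alignment rules are ignored, and spacing the hint addresses one cache
line (64 bytes) apart is an assumption.

```python
KIB = 1 << 10
GIB = 1 << 30


def place_dimms(start, dimms):
    """Lay out devices back to back, leaving each one's reserved range free.

    dimms is a list of (name, size, reserved_size) tuples. Returns a dict
    mapping name to (base, reserved_base), where reserved_base is the end
    address of the device and the start of its reserved range.
    """
    layout = {}
    addr = start
    for name, size, reserved_size in dimms:
        layout[name] = (addr, addr + size)
        addr += size + reserved_size
    return layout


def alloc_flush_hints(reserved_base, reserved_size, count, stride):
    """Pick count hint addresses from the reserved range, stride bytes apart."""
    needed = count * stride
    if needed > reserved_size:
        raise ValueError("reserved area is not present or too small")
    return [reserved_base + i * stride for i in range(count)]


if __name__ == "__main__":
    n = 4 * GIB  # hypothetical value of N
    layout = place_dimms(n, [("dimm0", 4 * GIB, 4 * KIB),
                             ("dimm1", 1 * GIB, 0)])
    base, reserved_base = layout["dimm0"]
    hints = alloc_flush_hints(reserved_base, 4 * KIB, count=2, stride=64)
    print([hex(h) for h in hints], hex(layout["dimm1"][0]))
```

With these inputs dimm1 starts at N+4G+4K, and both hint addresses fall
inside N+4G ~ N+4G+4K, matching the example in step 1.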


How to test?
---------------
Add the options 'flush-hint' and 'reserved-size' when creating an nvdimm
device, e.g.
    qemu-system-x86_64 -machine pc,nvdimm \
                       -m 4G,slots=4,maxmem=128G \
                       -object memory-backend-file,id=mem1,share,mem-path=/dev/pmem0 \
                       -device nvdimm,id=nv1,memdev=mem1,reserved-size=4K,flush-hint \
                       ...

The guest OS should be able to find a flush hint address structure in
NFIT. With a guest Linux kernel v4.8 or later, which supports flush
hints, and QEMU built with NVDIMM_DEBUG = 1 in include/hw/mem/nvdimm.h,
QEMU will print debug messages like
    nvdimm: Write Flush Hint: offset 0x0, data 0x1
    nvdimm: Write Flush Hint: offset 0x4, data 0x0
when Linux performs a flush via a flush hint address.



Haozhong Zhang (4):
  pc-dimm: add 'reserved-size' to reserve address range after the ending address
  nvdimm: add functions to initialize and perform flush on back store
  nvdimm acpi: record the cache line size in AcpiNVDIMMState
  nvdimm acpi: build flush hint address structure if required

 hw/acpi/nvdimm.c         | 111 ++++++++++++++++++++++++++++++++++++++++++++---
 hw/i386/pc.c             |   5 ++-
 hw/i386/pc_piix.c        |   2 +-
 hw/i386/pc_q35.c         |   2 +-
 hw/mem/nvdimm.c          |  48 ++++++++++++++++++++
 hw/mem/pc-dimm.c         |  48 ++++++++++++++++++--
 include/hw/mem/nvdimm.h  |  20 ++++++++-
 include/hw/mem/pc-dimm.h |   2 +
 8 files changed, 224 insertions(+), 14 deletions(-)

-- 
2.10.1



