[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [PATCH v4 0/3] nvdimm: Enable sync-dax property for nvdimm
From: |
Stefan Hajnoczi |
Subject: |
Re: [PATCH v4 0/3] nvdimm: Enable sync-dax property for nvdimm |
Date: |
Fri, 30 Apr 2021 16:08:40 +0100 |
On Fri, Apr 30, 2021 at 02:27:18PM +1000, David Gibson wrote:
> On Thu, Apr 29, 2021 at 10:02:23PM +0530, Aneesh Kumar K.V wrote:
> > On 4/29/21 9:25 PM, Stefan Hajnoczi wrote:
> > > On Wed, Apr 28, 2021 at 11:48:21PM -0400, Shivaprasad G Bhat wrote:
> > > > The nvdimm devices are expected to ensure write persistence during power
> > > > failure kind of scenarios.
> > > >
> > > > The libpmem has architecture specific instructions like dcbf on POWER
> > > > to flush the cache data to backend nvdimm device during normal writes
> > > > followed by explicit flushes if the backend devices are not synchronous
> > > > DAX capable.
> > > >
> > > > Qemu - virtual nvdimm devices are memory mapped. The dcbf in the guest
> > > > and the subsequent flush doesn't traslate to actual flush to the backend
> > > > file on the host in case of file backed v-nvdimms. This is addressed by
> > > > virtio-pmem in case of x86_64 by making explicit flushes translating to
> > > > fsync at qemu.
> > > >
> > > > On SPAPR, the issue is addressed by adding a new hcall to
> > > > request for an explicit flush from the guest ndctl driver when the
> > > > backend
> > > > nvdimm cannot ensure write persistence with dcbf alone. So, the approach
> > > > here is to convey when the hcall flush is required in a device tree
> > > > property. The guest makes the hcall when the property is found, instead
> > > > of relying on dcbf.
> > >
> > > Sorry, I'm not very familiar with SPAPR. Why add a hypercall when the
> > > virtio-nvdimm device already exists?
> > >
> >
> > On virtualized ppc64 platforms, guests use papr_scm.ko kernel drive for
> > persistent memory support. This was done such that we can use one kernel
> > driver to support persistent memory with multiple hypervisors. To avoid
> > supporting multiple drivers in the guest, -device nvdimm Qemu command-line
> > results in Qemu using PAPR SCM backend. What this patch series does is to
> > make sure we expose the correct synchronous fault support, when we back such
> > nvdimm device with a file.
> >
> > The existing PAPR SCM backend enables persistent memory support with the
> > help of multiple hypercall.
> >
> > #define H_SCM_READ_METADATA 0x3E4
> > #define H_SCM_WRITE_METADATA 0x3E8
> > #define H_SCM_BIND_MEM 0x3EC
> > #define H_SCM_UNBIND_MEM 0x3F0
> > #define H_SCM_UNBIND_ALL 0x3FC
> >
> > Most of them are already implemented in Qemu. This patch series implements
> > H_SCM_FLUSH hypercall.
>
> The overall point here is that we didn't define the hypercall. It was
> defined in order to support NVDIMM/pmem devices under PowerVM. For
> uniformity between PowerVM and KVM guests, we want to support the same
> hypercall interface on KVM/qemu as well.
Okay, that's fine. Now Linux and QEMU have multiple ways of doing this,
but it's fair enough if it's an existing platform hypercall.
Stefan
signature.asc
Description: PGP signature