qemu-devel

Re: [RFC PATCH V2 00/11] hw/block/nvme: support multi-path for ctrl/ns


From: Klaus Jensen
Subject: Re: [RFC PATCH V2 00/11] hw/block/nvme: support multi-path for ctrl/ns
Date: Tue, 19 Jan 2021 07:04:04 +0100

On Jan 19 12:21, Minwoo Im wrote:
> On 21-01-18 22:14:45, Klaus Jensen wrote:
> > On Jan 17 23:53, Minwoo Im wrote:
> > > Hello,
> > > 
> > > This patch series introduces NVMe subsystem device to support multi-path
> > > I/O in NVMe device model.  Two use-cases are supported along with this
> > > patch: Multi-controller, Namespace Sharing.
> > > 
> > > RFC V1 was discussed with Klaus and Keith; I really appreciate their
> > > help in giving this patch series a proper direction [1].
> > > 
> > > This patch series contains a few start-up refactoring patches, the
> > > first through the fifth, so that the nvme-ns device does not always
> > > rely on the nvme controller.  Because nvme-ns shall be able to be mapped
> > > at the subsystem level, not only at a single-controller level, it should
> > > provide generic initialization code: nvme_ns_setup() without NvmeCtrl.
> > > To do that, the first five patches remove the NvmeCtrl * instance
> > > argument from nvme_ns_setup().  I'd be happy if they are picked!
> > > 
> > > For controller and namespace devices, a 'subsys' property has been
> > > introduced to map them to a subsystem.  If multi-controller is needed,
> > > we can specify the same 'subsys' for each of the controllers.
> > > 
> > > For a namespace device, if 'subsys' is not given, just as before, it
> > > has to be provided with the 'bus' parameter to specify the nvme
> > > controller device to attach to; that is, the two are mutually exclusive.
> > > To share a namespace between or among controllers, nvme-ns should set
> > > the 'subsys' property to a single nvme subsystem instance.  To make a
> > > namespace private, we need to specify the 'bus' property rather
> > > than 'subsys'.
> > > 
> > > Of course, this series does not require any updates to the run command
> > > for existing users.
> > > 
> > > Please refer to the following example with nvme-cli output:
> > > 
> > > QEMU Run:
> > >   -device nvme-subsys,id=subsys0 \
> > >   -device nvme,serial=foo,id=nvme0,subsys=subsys0 \
> > >   -device nvme,serial=bar,id=nvme1,subsys=subsys0 \
> > >   -device nvme,serial=baz,id=nvme2,subsys=subsys0 \
> > >   -device nvme-ns,id=ns1,drive=drv10,nsid=1,subsys=subsys0 \
> > >   -device nvme-ns,id=ns2,drive=drv11,nsid=2,bus=nvme2 \
> > >   \
> > >   -device nvme,serial=qux,id=nvme3 \
> > >   -device nvme-ns,id=ns3,drive=drv12,nsid=3,bus=nvme3
> > > 
> > > nvme-cli:
> > >   root@vm:~/work# nvme list -v
> > >   NVM Express Subsystems
> > > 
> > >   Subsystem        Subsystem-NQN                  Controllers
> > >   ---------------- ------------------------------ -------------------
> > >   nvme-subsys1     nqn.2019-08.org.qemu:subsys0   nvme0, nvme1, nvme2
> > >   nvme-subsys3     nqn.2019-08.org.qemu:qux       nvme3
> > > 
> > >   NVM Express Controllers
> > > 
> > >   Device   SN   MN             FR   TxPort Address      Subsystem    Namespaces
> > >   -------- ---- -------------- ---- ------ ------------ ------------ ----------------
> > >   nvme0    foo  QEMU NVMe Ctrl 1.0  pcie   0000:00:06.0 nvme-subsys1 nvme1n1
> > >   nvme1    bar  QEMU NVMe Ctrl 1.0  pcie   0000:00:07.0 nvme-subsys1 nvme1n1
> > >   nvme2    baz  QEMU NVMe Ctrl 1.0  pcie   0000:00:08.0 nvme-subsys1 nvme1n1, nvme1n2
> > >   nvme3    qux  QEMU NVMe Ctrl 1.0  pcie   0000:00:09.0 nvme-subsys3
> > > 
> > >   NVM Express Namespaces
> > > 
> > >   Device       NSID     Usage                      Format           Controllers
> > >   ------------ -------- -------------------------- ---------------- -------------------
> > >   nvme1n1      1        134.22  MB / 134.22  MB    512   B +  0 B   nvme0, nvme1, nvme2
> > >   nvme1n2      2        268.44  MB / 268.44  MB    512   B +  0 B   nvme2
> > >   nvme3n1      3        268.44  MB / 268.44  MB    512   B +  0 B   nvme3
> > > 
> > > Summary:
> > >   - Refactored nvme-ns device not to rely on controller during the
> > >     setup.  [1/11 - 5/11]
> > >   - Introduced a nvme-subsys device model. [6/11]
> > >   - Create subsystem NQN based on subsystem. [7/11]
> > >   - Introduced multi-controller model. [8/11 - 9/11]
> > >   - Updated namespace sharing scheme to be based on nvme-subsys
> > >     hierarchy. [10/11 - 11/11]
> > > 
> > > Since RFC V1:
> > >   - Updated namespace sharing scheme to be based on nvme-subsys
> > >     hierarchy.
> > > 
> > 
> > Great stuff Minwoo. Thanks!
> > 
> > I'll pick up [01-05/11] directly since they are pretty trivial.
> 
> Thanks! I will prepare the next series based on that.
> 
> > The subsystem model looks pretty much like it should, I don't have a lot
> > of comments.
> > 
> > One thing that I considered, is if we should reverse the "registration"
> > and think about it as namespace attachment. The spec is about
> > controllers attaching to namespaces, not the other way around.
> > Basically, let the namespaces be configured first and register on the
> > subsystem (accumulating in a "namespaces" array), then have the
> > controllers register with the subsystem and attach to all "non-detached"
> > namespaces. This allows detached namespaces to "linger" in the subsystem
> > to be attached later on. If there are any private namespaces (like ns2
> > in your example above), it will be defined after the controller with the
> > bus=ctrlX parameter like normal.
> 
> Revisited spec. again.  5.19 says "The Namespace Attachment command is
> used to attach and detach controllers from a namespace.".  and 5.20 says
> "Host software uses the Namespace Attachment command to attach or detach
> a namespace to or from a controller. The create operation does not attach
> the namespace to a controller."
> 

Yeah ok, that is pretty inconsistent.

>       -device nvme-subsys,id=subsys0
>       -device nvme-ns,id=ns1,drive=<drv>,nsid=1,subsys=subsys0
>       -device nvme,id=nvme0,serial=foo,subsys=subsys0
> 
> In this case, the 'nvme0' controller will have no namespace at the
> initial time of the boot-up.  'nvme0' can be attached to the namespace
> 'ns1' with namespace attach command.  'nvme-ns' device is same as the
> 'create-ns' operation in a NVMe subsystem.  This makes sense as spec
> 5.19 says "from a namespace".
> 
>       -device nvme,id=nvme1,serial=bar,subsys=subsys0
>       -device nvme-ns,id=ns2,drive=<drv>,nsid=1,bus=nvme1
> 
> This case is for a private namespace directly attached to a controller.
> This makes sense as spec 5.20 says "to or from a controller".
> 
> All looks fine to me, but one thing I am wondering is how we can
> attach a controller to shared namespace(s) at the initial time.
> 

Ok, nevermind. I think we can get 'detached' functionality in either
case, so no need to increase complexity by requiring a change of define
order.

To support CNS 0x12 and 0x13 (Identify, Controller List), we need the
controllers registered and stored in the subsystem anyway.

So, can we add a 'namespaces' array on the subsystem to keep a list of
namespaces and add a 'detached' parameter on the nvme-ns device? If that
parameter is given, the device is not registered with the controllers.
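
To make that concrete, here is a rough, standalone sketch of what I have in
mind. It is not QEMU code: every name in it (SketchSubsys, SketchCtrl,
SketchNs, the sketch_* helpers and the fixed array sizes) is made up for
illustration, and the real thing would hang off the existing device models.
It assumes the current define order (controllers first, then namespaces) and
only shows that a 'detached' namespace lingers in the subsystem until it is
attached later.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical limits, chosen only to keep the sketch small. */
#define SKETCH_MAX_CTRLS 32
#define SKETCH_MAX_NS    256

typedef struct SketchNs {
    uint32_t nsid;
    bool detached;              /* the proposed 'detached' parameter */
} SketchNs;

typedef struct SketchCtrl {
    uint16_t cntlid;
    SketchNs *attached[SKETCH_MAX_NS + 1];    /* indexed by NSID, slot 0 unused */
} SketchCtrl;

typedef struct SketchSubsys {
    SketchCtrl *ctrls[SKETCH_MAX_CTRLS];      /* needed for controller lists */
    size_t num_ctrls;
    SketchNs *namespaces[SKETCH_MAX_NS + 1];  /* indexed by NSID, slot 0 unused */
} SketchSubsys;

static bool sketch_nsid_valid(uint32_t nsid)
{
    return nsid >= 1 && nsid <= SKETCH_MAX_NS;
}

/* Register a controller with the subsystem; returns 0, or -1 if full. */
static int sketch_subsys_register_ctrl(SketchSubsys *s, SketchCtrl *c)
{
    if (s->num_ctrls >= SKETCH_MAX_CTRLS) {
        return -1;
    }
    c->cntlid = (uint16_t)s->num_ctrls;
    s->ctrls[s->num_ctrls++] = c;
    return 0;
}

/* Attach one namespace to one controller (what a later attach would do). */
static int sketch_ctrl_attach(SketchCtrl *c, SketchNs *ns)
{
    if (!sketch_nsid_valid(ns->nsid)) {
        return -1;
    }
    c->attached[ns->nsid] = ns;
    return 0;
}

/*
 * Register a shared namespace. It is always recorded in the subsystem, but
 * it is attached to the already-registered controllers only when 'detached'
 * is not set.
 */
static int sketch_subsys_register_ns(SketchSubsys *s, SketchNs *ns)
{
    if (!sketch_nsid_valid(ns->nsid) || s->namespaces[ns->nsid]) {
        return -1;
    }
    s->namespaces[ns->nsid] = ns;
    if (ns->detached) {
        return 0;
    }
    for (size_t i = 0; i < s->num_ctrls; i++) {
        sketch_ctrl_attach(s->ctrls[i], ns);
    }
    return 0;
}

/* Number of controllers attached to nsid (the set CNS 0x12 would list). */
static size_t sketch_count_attached(const SketchSubsys *s, uint32_t nsid)
{
    size_t n = 0;

    if (!sketch_nsid_valid(nsid)) {
        return 0;
    }
    for (size_t i = 0; i < s->num_ctrls; i++) {
        if (s->ctrls[i]->attached[nsid]) {
            n++;
        }
    }
    return n;
}
```

With two controllers registered, a namespace registered without 'detached'
would show up on both, while one registered with 'detached' would only be
present in the subsystem's namespaces array until something attaches it.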
