qemu-block
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH RFC 00/13] hw/nvme: experimental user-creatable objects


From: Klaus Jensen
Subject: Re: [PATCH RFC 00/13] hw/nvme: experimental user-creatable objects
Date: Fri, 17 Sep 2021 10:02:38 +0200

On Sep 17 09:38, Kevin Wolf wrote:
> Am 17.09.2021 um 08:21 hat Klaus Jensen geschrieben:
> > On Sep 16 18:30, Klaus Jensen wrote:
> > > On Sep 16 14:41, Kevin Wolf wrote:
> > > > Am 14.09.2021 um 22:37 hat Klaus Jensen geschrieben:
> > > > > From: Klaus Jensen <k.jensen@samsung.com>
> > > > > 
> > > > > Hi,
> > > > > 
> > > > > This is an attempt at adressing a bunch of issues that have presented
> > > > > themselves since we added subsystem support. It's been brewing for a
> > > > > while now.
> > > > > 
> > > > > Fundamentally, I've come to the conclusion that modeling namespaces 
> > > > > and
> > > > > subsystems as "devices" is wrong. They should have been user-creatable
> > > > > objects. We've run into multiple issues with wrt. hotplugging due to 
> > > > > how
> > > > > namespaces hook up to the controller with a bus. The bus-based design
> > > > > made a lot of sense when we didn't have subsystem support and it 
> > > > > follows
> > > > > the design of hw/scsi. But, the problem here is that the bus-based
> > > > > design dictates a one parent relationship, and with shared namespaces,
> > > > > that is just not true. If the namespaces are considered to have a 
> > > > > single
> > > > > parent, that parent is the subsystem, not any specific controller.
> > > > > 
> > > > > This series adds a set of experimental user-creatable objects:
> > > > > 
> > > > >   -object x-nvme-subsystem
> > > > >   -object x-nvme-ns-nvm
> > > > >   -object x-nvme-ns-zoned
> > > > > 
> > > > > It also adds a new controller device (-device x-nvme-ctrl) that 
> > > > > supports
> > > > > these new objects (and gets rid of a bunch of deprecated and confusing
> > > > > parameters). This new approach has a bunch of benefits (other than 
> > > > > just
> > > > > fixing the hotplugging issues properly) - we also get support for some
> > > > > nice introspection through some new dynamic properties:
> > > > > 
> > > > >   (qemu) qom-get /machine/peripheral/nvme-ctrl-1 attached-namespaces
> > > > >   [
> > > > >       "/objects/nvm-1",
> > > > >       "/objects/zns-1"
> > > > >   ]
> > > > > 
> > > > >   (qemu) qom-list /objects/zns-1
> > > > >   type (string)
> > > > >   subsys (link<x-nvme-subsystem>)
> > > > >   nsid (uint32)
> > > > >   uuid (string)
> > > > >   attached-ctrls (str)
> > > > >   eui64 (string)
> > > > >   blockdev (string)
> > > > >   pi-first (bool)
> > > > >   pi-type (NvmeProtInfoType)
> > > > >   extended-lba (bool)
> > > > >   metadata-size (uint16)
> > > > >   lba-size (size)
> > > > >   zone-descriptor-extension-size (size)
> > > > >   zone-cross-read (bool)
> > > > >   zone-max-open (uint32)
> > > > >   zone-capacity (size)
> > > > >   zone-size (size)
> > > > >   zone-max-active (uint32)
> > > > > 
> > > > >   (qemu) qom-get /objects/zns-1 pi-type
> > > > >   "none"
> > > > > 
> > > > >   (qemu) qom-get /objects/zns-1 eui64
> > > > >   "52:54:00:17:67:a0:40:15"
> > > > > 
> > > > >   (qemu) qom-get /objects/zns-1 zone-capacity
> > > > >   12582912
> > > > > 
> > > > > Currently, there are no shortcuts, so you have to define the full
> > > > > topology to get it up and running. Notice that the topology is 
> > > > > explicit
> > > > > (the 'subsys' and 'attached-ctrls' links). There are no 'nvme-bus'
> > > > > anymore.
> > > > > 
> > > > >   -object x-nvme-subsystem,id=subsys0,subnqn=foo
> > > > >   -device x-nvme-ctrl,id=nvme-ctrl-0,serial=foo,subsys=subsys0
> > > > >   -device x-nvme-ctrl,id=nvme-ctrl-1,serial=bar,subsys=subsys0
> > > > >   -drive  id=nvm-1,file=nvm-1.img,format=raw,if=none,discard=unmap
> > > > >   -object 
> > > > > x-nvme-ns-nvm,id=nvm-1,blockdev=nvm-1,nsid=1,subsys=subsys0,attached-ctrls=nvme-ctrl-1
> > > > >   -drive  id=nvm-2,file=nvm-2.img,format=raw,if=none,discard=unmap
> > > > >   -object 
> > > > > x-nvme-ns-nvm,id=nvm-2,blockdev=nvm-2,nsid=2,subsys=subsys0,attached-ctrls=nvme-ctrl-0
> > > > 
> > > > I may be wrong here, but my first gut feeling when seeing this was that
> > > > referencing the controller device in the namespace object feels
> > > > backwards. Usually, we have objects that are created independently and
> > > > then the devices reference them.
> > > > 
> > > > Your need to use a machine_done notifier is probably related to that,
> > > > too, because it goes against the normal initialisation order, so you
> > > > have to wait. Error handling also isn't really possible in the notifier
> > > > any more, so this series seems to just print something to stderr, but
> > > > ignore the error otherwise.
> > > > 
> > > > Did you consider passing a list of namespaces to the controller device
> > > > instead?
> > > > 
> > > > I guess a problem that you have with both ways is that support for
> > > > list options isn't great in QemuOpts, which is still used both for
> > > > -object and -device in the system emulator...
> > > 
> > > Heh. Exactly. The ability to better support lists with -object through
> > > QAPI is why I did it like this...
> 
> I see. I really need to continue with the QAPIfication work, in the hope
> that devices will have the same level of support for lists in the
> future...
> 
> QOM really has usable support for lists, it's just the outer layer that
> doesn't support them well. I wonder how hard (or how stupid) it would be
> to support JSON syntax without QAPIfying things first.
> 
> > > Having the list of namespaces on the controller is preferable. I'll see
> > > what I can come up with.
> > 
> > There is also the issue that the x-nvme-ns-nvm -object needs a blockdev
> > - and the ordering is also a problem here. That also requires the
> > machine done notifier.
> 
> True, initialisation order can be a mess because QEMU tries to be clever
> and figure it out automatically. But I guess for this part you could
> just have an exception in object_create_early() to switch it to late
> initialisation when blockdevs are already there.

Oh! That's neat! Did not know about that, thanks!

> 
> Even when returning false in object_create_early(), this is still before
> device creation, so the question of whether to move the list to the
> device remains relevant.
> 

Yeah, but I think we can honestly get rid of the full blown flexibility
and just have all namespaces be attached to all controllers initially.
Should QEMU get better support for lists on qdevs, then we can add it.
I'm leaning towards that.


-- 
k



reply via email to

[Prev in Thread] Current Thread [Next in Thread]