[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [Qemu-devel] [PATCH v8 4/6] docs: Add Documentation for Mediated dev
From: |
Kirti Wankhede |
Subject: |
Re: [Qemu-devel] [PATCH v8 4/6] docs: Add Documentation for Mediated devices |
Date: |
Thu, 13 Oct 2016 14:52:09 +0530 |
On 10/13/2016 3:14 AM, Alex Williamson wrote:
> On Thu, 13 Oct 2016 00:32:48 +0530
> Kirti Wankhede <address@hidden> wrote:
>
>> On 10/12/2016 9:29 PM, Alex Williamson wrote:
>>> On Wed, 12 Oct 2016 20:43:48 +0530
>>> Kirti Wankhede <address@hidden> wrote:
>>>
>>>> On 10/12/2016 7:22 AM, Tian, Kevin wrote:
>>>>>> From: Kirti Wankhede [mailto:address@hidden
>>>>>> Sent: Wednesday, October 12, 2016 4:45 AM
>>>>>>>> +* mdev_supported_types:
>>>>>>>> + List of current supported mediated device types and its details
>>>>>>>> are added
>>>>>>>> +in this directory in following format:
>>>>>>>> +
>>>>>>>> +|- <parent phy device>
>>>>>>>> +|--- Vendor-specific-attributes [optional]
>>>>>>>> +|--- mdev_supported_types
>>>>>>>> +| |--- <type id>
>>>>>>>> +| | |--- create
>>>>>>>> +| | |--- name
>>>>>>>> +| | |--- available_instances
>>>>>>>> +| | |--- description /class
>>>>>>>> +| | |--- [devices]
>>>>>>>> +| |--- <type id>
>>>>>>>> +| | |--- create
>>>>>>>> +| | |--- name
>>>>>>>> +| | |--- available_instances
>>>>>>>> +| | |--- description /class
>>>>>>>> +| | |--- [devices]
>>>>>>>> +| |--- <type id>
>>>>>>>> +| |--- create
>>>>>>>> +| |--- name
>>>>>>>> +| |--- available_instances
>>>>>>>> +| |--- description /class
>>>>>>>> +| |--- [devices]
>>>>>>>> +
>>>>>>>> +[TBD : description or class is yet to be decided. This will change.]
>>>>>>>>
>>>>>>>
>>>>>>> I thought that in previous discussions we had agreed to drop
>>>>>>> the <type id> concept and use the name as the unique identifier.
>>>>>>> When reporting these types in libvirt we won't want to report
>>>>>>> the type id values - we'll want the name strings to be unique.
>>>>>>>
>>>>>>
>>>>>> The 'name' might not be unique but type_id will be. For example that Neo
>>>>>> pointed out in earlier discussion, virtual devices can come from two
>>>>>> different physical devices, end user would be presented with what they
>>>>>> had selected but there will be internal implementation differences. In
>>>>>> that case 'type_id' will be unique.
>>>>>>
>>>>>
>>>>> Hi, Kirti, my understanding is that Neo agreed to use an unique type
>>>>> string (if you still called it <type id>), and then no need of additional
>>>>> 'name' field which can be put inside 'description' field. See below quote:
>>>>>
>>>>
>>>> We had internal discussions about this within NVIDIA and found that
>>>> 'name' might not be unique where as 'type_id' would be unique. I'm
>>>> refering to Neo's mail after that, where Neo do pointed that out.
>>>>
>>>> https://lists.gnu.org/archive/html/qemu-devel/2016-09/msg07714.html
>>>
>>> Everyone not privy to those internal discussions, including me, seems to
>>> think we dropped type_id and that if a vendor does not have a stable
>>> name, they can compose some sort of stable type description based on the
>>> name+id, or even vendor+id, ex. NVIDIA-11. So please share why we
>>> haven't managed to kill off type_id yet. No matter what internal
>>> representation each vendor driver has of "type_id" it seems possible
>>> for it to come up with stable string to define a given configuration.
>>
>>
>> The 'type_id' is unique and the 'name' are not, the name is just a
>> virtual device name/ human readable name. Because at this moment Intel
>> can't define a proper GPU class, we have to add a 'description' field
>> there as well to represent the features of this virtual device, once we
>> have all agreed with the GPU class and its mandatory attributes, the
>> 'description' field can be removed. Here is an example,
>> type_id/type_name = NVIDIA_11,
>> name=M60-M0Q,
>> description=2560x1600, 2 displays, 512MB"
>>
>> Neo's previous comment only applies to the situation where we will have
>> the GPU class or optional attributes defined and recognized by libvirt,
>> since that is not going to happen any time soon, we will have to have
>> the new 'description' field, and we don't want to have it mixed up with
>> 'name' field.
>>
>> We can definitely have something like name+id as Alex recommended to
>> remove the 'name' field, but it will just require libvirt to have more
>> logic to parse that string.
>
> Let's use the mtty example driver provided in patch 5 so we can all
> more clearly see how the interfaces work. I'll start from the
> beginning of my experience and work my way to the type/name thing.
>
Thanks for looking into it and getting feel of it. And I hope this helps
to understand that 'name' and 'type_id' are different.
> (please add a modules_install target to the Makefile)
>
This is an example and I feel it should not be installed in
/lib/modules/../build path. This should be used to understand the
interface and the flow of mdev device management life cycle. User can
use insmod to load driver:
# insmod mtty.ko
> # modprobe mtty
>
> Now what? It seems like I need to have prior knowledge that this
> drivers supports mdev devices and I need to go hunt for them. We need
> to create a class (ex. /sys/class/mdev/) where a user can find all the
> devices that participate in this mediated device infrastructure. That
> would point me to /sys/devices/mtty.
>
You can find devices registered to mdev framework by searching for
'mdev_supported_types' directory at the leaf nodes of devices in
/sys/devices directory. Yes, we can have 'mdev' class and links to
devices which are registered to mdev framework in /sys/class/mdev/.
> # tree /sys/devices/mtty
> /sys/devices/mtty
> |-- mdev_supported_types
> | `-- mtty1
> | |-- available_instances (1)
> | |-- create
> | |-- devices
> | `-- name ("Dual-port-serial")
> |-- mtty_dev
> | `-- sample_mtty_dev ("This is phy device")
> |-- power
> | |-- async
> | |-- autosuspend_delay_ms
> | |-- control
> | |-- runtime_active_kids
> | |-- runtime_active_time
> | |-- runtime_enabled
> | |-- runtime_status
> | |-- runtime_suspended_time
> | `-- runtime_usage
> `-- uevent
>
> Ok, but that was boring, we really need to have at least 2 supported
> types to validate the interface, so without changing the actual device
> backing, I pretended to have a single port vs dual port:
>
> /sys/devices/mtty
> |-- mdev_supported_types
> | |-- mtty1
> | | |-- available_instances (24)
> | | |-- create
> | | |-- devices
> | | `-- name (Single-port-serial)
> | `-- mtty2
> | |-- available_instances (12)
> | |-- create
> | |-- devices
> | `-- name (Dual-port-serial)
> [snip]
>
> I arbitrarily decides I have 24 ports and each single port uses 1 port
> and each dual port uses 2 ports.
>
> Before I start creating devices, what are we going to key the libvirt
> XML on? Can we do anything to prevent vendors from colliding or do we
> have any way to suggest meaningful and unique type_ids?
Libvirt would have parent and type_id in XML. No two vendors can own
same parent device. So I don't think vendors would collide even having
same type_id, since <parent, type_id> pair would always be unique.
Presumably if
> we had a PCI device hosting this, we would be rooted at that parent
> device, ex. /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0. Maybe
> the type_id should automatically be prefixed by the vendor module name,
> ex. mtty-1, i915-foo, nvidia-bar. There's something missing for
> deterministically creating a "XYZ" device and knowing exactly what that
> means and finding a parent device that supports it.
>
We can prefix type_id with module name, i.e using dev->driver->name, but
<parent, type_id> pair is unique so I don't see much benefit in doing
that.
> Let's get to mdev creating...
>
> # uuidgen > mdev_supported_types/mtty2/create
> # tree /sys/devices/mtty
> /sys/devices/mtty
> |-- e68189be-700e-41f7-93a3-b5351e79c470
> | |-- driver -> ../../../bus/mdev/drivers/vfio_mdev
> | |-- iommu_group -> ../../../kernel/iommu_groups/63
> | |-- mtty2 -> ../mdev_supported_types/mtty2
> | |-- power
> | | |-- async
> | | |-- autosuspend_delay_ms
> | | |-- control
> | | |-- runtime_active_kids
> | | |-- runtime_active_time
> | | |-- runtime_enabled
> | | |-- runtime_status
> | | |-- runtime_suspended_time
> | | `-- runtime_usage
> | |-- remove
> | |-- subsystem -> ../../../bus/mdev
> | |-- uevent
> | `-- vendor
> | `-- sample_mdev_dev ("This is MDEV
> e68189be-700e-41f7-93a3-b5351e79c470")
> |-- mdev_supported_types
> | |-- mtty1
> | | |-- available_instances (22)
> | | |-- create
> | | |-- devices
> | | `-- name
> | `-- mtty2
> | |-- available_instances (11)
> | |-- create
> | |-- devices
> | | `-- e68189be-700e-41f7-93a3-b5351e79c470 ->
> ../../../e68189be-700e-41f7-93a3-b5351e79c470
> | `-- name
>
> The mdev device was created directly under the parent, which seems like
> it's going to get messy to me (ie. imagine dropping a bunch of uuids
> into a PCI parent device's sysfs directory, how does a user know what
> they are?).
>
That is the way devices are placed in sysfs. For example below devices:
80:01.0 PCI bridge: Intel Corporation Xeon E5/Core i7 IIO PCI Express
Root Port 1a (rev 07)
80:02.0 PCI bridge: Intel Corporation Xeon E5/Core i7 IIO PCI Express
Root Port 2a (rev 07)
80:03.0 PCI bridge: Intel Corporation Xeon E5/Core i7 IIO PCI Express
Root Port 3a in PCI Express Mode (rev 07)
80:04.0 System peripheral: Intel Corporation Xeon E5/Core i7 DMA Channel
0 (rev 07)
80:04.1 System peripheral: Intel Corporation Xeon E5/Core i7 DMA Channel
1 (rev 07)
80:04.2 System peripheral: Intel Corporation Xeon E5/Core i7 DMA Channel
2 (rev 07)
80:04.3 System peripheral: Intel Corporation Xeon E5/Core i7 DMA Channel
3 (rev 07)
80:04.4 System peripheral: Intel Corporation Xeon E5/Core i7 DMA Channel
4 (rev 07)
80:04.5 System peripheral: Intel Corporation Xeon E5/Core i7 DMA Channel
5 (rev 07)
80:04.6 System peripheral: Intel Corporation Xeon E5/Core i7 DMA Channel
6 (rev 07)
80:04.7 System peripheral: Intel Corporation Xeon E5/Core i7 DMA Channel
7 (rev 07)
80:05.0 System peripheral: Intel Corporation Xeon E5/Core i7 Address
Map, VTd_Misc, System Management (rev 07)
80:05.2 System peripheral: Intel Corporation Xeon E5/Core i7 Control
Status and Global Errors (rev 07)
80:05.4 PIC: Intel Corporation Xeon E5/Core i7 I/O APIC (rev 07)
In sysfs, those are in same parent folder of its parent root port:
# ls /sys/devices/pci0000\:80/ -l
total 0
drwxr-xr-x 8 root root 0 Oct 13 12:08 0000:80:01.0
drwxr-xr-x 7 root root 0 Oct 13 12:08 0000:80:02.0
drwxr-xr-x 6 root root 0 Oct 13 12:08 0000:80:03.0
drwxr-xr-x 6 root root 0 Oct 13 12:08 0000:80:04.0
drwxr-xr-x 6 root root 0 Oct 13 12:08 0000:80:04.1
drwxr-xr-x 6 root root 0 Oct 13 12:08 0000:80:04.2
drwxr-xr-x 6 root root 0 Oct 13 12:08 0000:80:04.3
drwxr-xr-x 6 root root 0 Oct 13 12:08 0000:80:04.4
drwxr-xr-x 6 root root 0 Oct 13 12:08 0000:80:04.5
drwxr-xr-x 6 root root 0 Oct 13 12:08 0000:80:04.6
drwxr-xr-x 6 root root 0 Oct 13 12:08 0000:80:04.7
drwxr-xr-x 3 root root 0 Oct 13 12:08 0000:80:05.0
drwxr-xr-x 3 root root 0 Oct 13 12:08 0000:80:05.2
drwxr-xr-x 3 root root 0 Oct 13 12:08 0000:80:05.4
lrwxrwxrwx 1 root root 0 Oct 13 13:25 firmware_node ->
../LNXSYSTM:00/LNXSYBUS:00/PNP0A08:01
drwxr-xr-x 3 root root 0 Oct 13 12:08 pci_bus
drwxr-xr-x 2 root root 0 Oct 13 12:08 power
-rw-r--r-- 1 root root 4096 Oct 13 13:25 uevent
> Under the device we have "mtty2", shouldn't that be
> "mdev_supported_type", which then links to mtty2? Otherwise a user
> needs to decode from the link what this attribute is.
>
I thought it should show type, so that by looking at 'ls' output user
should be able to find type_id.
> Also here's an example of those vendor sysfs entries per device. So
> long as the vendor never expects a tool like libvirt to manipulate
> attributes there, I can see how that could be pretty powerful.
>
Yes, it is good to have vendor specific entries, libvirt might not
report/use it. That would be more useful for system admin to get extra
information manually that libvirt doesn't report.
> Moving down to the mdev_supported_types, I've updated mtty so that it
> actually adjusts available instance, and we can now see a link under
> the devices for mtty2.
>
> Also worth noting is that a link for the device appears
> in /sys/bus/mdev/devices.
>
> BTW, specifying this device for QEMU vfio-pci is where the sysfsdev
> option comes into play:
>
> -device
> vfio-pci,sysfsdev=/sys/devices/mtty/e68189be-700e-41f7-93a3-b5351e79c470
>
> Which raises another question, we can tell through the vfio interfaces
> that this is exposes as a PCI device, by creating a container
> (open(/dev/vfio/vfio)), setting an iommu (ioctl(VFIO_SET_IOMMU)),
> adding the group to the container (ioctl(VFIO_GROUP_SET_CONTAINER)),
> getting the device (ioctl(VFIO_GROUP_GET_DEVICE_FD)), and finally
> getting the device info (ioctl(VFIO_DEVICE_GET_INFO)) and checking the
> flag bit that says the API is PCI. That's a long path to go and has
> stumbling blocks like the type of iommu that's available for the given
> platform. How do we make that manageable?
Do you want device type to be expressed in sysfs? Then that should be
done from vendor driver. vfio_mdev module is now a shim layer, so mdev
core or vfio_mdev module don't know what device type flag vendor driver
had set.
Thanks,
Kirti
> I don't think we want to
> create some artificial relationship that the type of the parent
> necessarily match the type of the child mdev, we've already broken that
> with a simple mdev tty driver.
>
> One more:
>
> # uuidgen > mdev_supported_types/mtty1/create
> # tree /sys/devices/mtty
> /sys/devices/mtty
> |-- a7ae17d1-2de4-44c2-ae58-20ae0a0befe8
> | |-- driver -> ../../../bus/mdev/drivers/vfio_mdev
> | |-- iommu_group -> ../../../kernel/iommu_groups/64
> | |-- mtty1 -> ../mdev_supported_types/mtty1
> | |-- power
> [snip]
> | |-- remove
> | |-- subsystem -> ../../../bus/mdev
> | |-- uevent
> | `-- vendor
> | `-- sample_mdev_dev ("This is MDEV
> a7ae17d1-2de4-44c2-ae58-20ae0a0befe8")
> |-- e68189be-700e-41f7-93a3-b5351e79c470
> | |-- driver -> ../../../bus/mdev/drivers/vfio_mdev
> | |-- iommu_group -> ../../../kernel/iommu_groups/63
> | |-- mtty2 -> ../mdev_supported_types/mtty2
> | |-- power
> [snip]
> | |-- remove
> | |-- subsystem -> ../../../bus/mdev
> | |-- uevent
> | `-- vendor
> | `-- sample_mdev_dev ("This is MDEV
> e68189be-700e-41f7-93a3-b5351e79c470")
> |-- mdev_supported_types
> | |-- mtty1
> | | |-- available_instances (21)
> | | |-- create
> | | |-- devices
> | | | `-- a7ae17d1-2de4-44c2-ae58-20ae0a0befe8 ->
> ../../../a7ae17d1-2de4-44c2-ae58-20ae0a0befe8
> | | `-- name
> | `-- mtty2
> | |-- available_instances (10)
> | |-- create
> | |-- devices
> | | `-- e68189be-700e-41f7-93a3-b5351e79c470 ->
> ../../../e68189be-700e-41f7-93a3-b5351e79c470
> | `-- name
>
> Hopefully as expected with the caveats for the first example.
>
> # echo 1 > a7ae17d1-2de4-44c2-ae58-20ae0a0befe8/remove
> # echo 1 > e68189be-700e-41f7-93a3-b5351e79c470/remove
>
> These do what they're supposed to, the devices are gone.
>
> Ok, I've identified some issues, let's figure out how to resolve them.
> Thanks,
>
> Alex
>
> (hack multi-port mtty patch attached)
>
- Re: [Qemu-devel] [PATCH v8 3/6] vfio iommu: Add support for mediated devices, (continued)
- [Qemu-devel] [PATCH v8 4/6] docs: Add Documentation for Mediated devices, Kirti Wankhede, 2016/10/10
- Re: [Qemu-devel] [PATCH v8 4/6] docs: Add Documentation for Mediated devices, Daniel P. Berrange, 2016/10/11
- Re: [Qemu-devel] [PATCH v8 4/6] docs: Add Documentation for Mediated devices, Kirti Wankhede, 2016/10/11
- Re: [Qemu-devel] [PATCH v8 4/6] docs: Add Documentation for Mediated devices, Tian, Kevin, 2016/10/11
- Re: [Qemu-devel] [PATCH v8 4/6] docs: Add Documentation for Mediated devices, Kirti Wankhede, 2016/10/12
- Re: [Qemu-devel] [PATCH v8 4/6] docs: Add Documentation for Mediated devices, Alex Williamson, 2016/10/12
- Re: [Qemu-devel] [PATCH v8 4/6] docs: Add Documentation for Mediated devices, Kirti Wankhede, 2016/10/12
- Re: [Qemu-devel] [PATCH v8 4/6] docs: Add Documentation for Mediated devices, Alex Williamson, 2016/10/12
- Re: [Qemu-devel] [PATCH v8 4/6] docs: Add Documentation for Mediated devices,
Kirti Wankhede <=
- Re: [Qemu-devel] [PATCH v8 4/6] docs: Add Documentation for Mediated devices, Alex Williamson, 2016/10/13
- Re: [Qemu-devel] [PATCH v8 4/6] docs: Add Documentation for Mediated devices, Paolo Bonzini, 2016/10/13
- Re: [Qemu-devel] [PATCH v8 4/6] docs: Add Documentation for Mediated devices, Alex Williamson, 2016/10/13
- Re: [Qemu-devel] [PATCH v8 4/6] docs: Add Documentation for Mediated devices, Kirti Wankhede, 2016/10/13
- Re: [Qemu-devel] [PATCH v8 4/6] docs: Add Documentation for Mediated devices, Alex Williamson, 2016/10/14
- Re: [Qemu-devel] [PATCH v8 4/6] docs: Add Documentation for Mediated devices, Tian, Kevin, 2016/10/12
Re: [Qemu-devel] [PATCH v8 4/6] docs: Add Documentation for Mediated devices, Jike Song, 2016/10/13
[Qemu-devel] [PATCH v8 5/6] Add simple sample driver for mediated device framework, Kirti Wankhede, 2016/10/10
[Qemu-devel] [PATCH v8 6/6] Add common functions for SET_IRQS and GET_REGION_INFO ioctls, Kirti Wankhede, 2016/10/10