[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [Qemu-arm] [Qemu-devel] [RFC v2 0/8] VIRTIO-IOMMU device
From: |
Jean-Philippe Brucker |
Subject: |
Re: [Qemu-arm] [Qemu-devel] [RFC v2 0/8] VIRTIO-IOMMU device |
Date: |
Wed, 12 Jul 2017 11:18:06 +0100 |
User-agent: |
Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.1.1 |
On 12/07/17 04:50, Bharat Bhushan wrote:
[...]
>> The size of the virtio_iommu_req_probe structure is variable, and depends
>> what fields the device implements. So the device initially computes the size
>> it
>> needs to fill virtio_iommu_req_probe, describes it in probe_size, and the
>> driver allocates that many bytes for virtio_iommu_req_probe.content[]
>>
>>>> * When device offers VIRTIO_IOMMU_F_PROBE, the driver should send
>> an
>>>> VIRTIO_IOMMU_T_PROBE request for each new endpoint.
>>>> * The driver allocates a device-writeable buffer of probe_size (plus
>>>> framing) and sends it as a VIRTIO_IOMMU_T_PROBE request.
>>>> * The device fills the buffer with various information.
>>>>
>>>> struct virtio_iommu_req_probe {
>>>> /* device-readable */
>>>> struct virtio_iommu_req_head head;
>>>> le32 device;
>>>> le32 flags;
>>>>
>>>> /* maybe also le32 content_size, but it must be equal to
>>>> probe_size */
>>>
>>> Can you please describe why we need to pass size of "probe_size" in probe
>> request?
>>
>> We don't. I don't think we should add this 'content_size' field unless there
>> is
>> a compelling reason to do so.
>>
>>>>
>>>> /* device-writeable */
>>>> u8 content[];
>>>
>>> I assume content_size above is the size of array "content[]" and max value
>> can be equal to probe_size advertised by device?
>>
>> probe_size is exactly the size of array content[]. The driver must allocate a
>> buffer of this size (plus the space needed for head, device, flags and tail).
>>
>> Then the device is free to leave parts of content[] empty. Field 'type' 0
>> will be
>> reserved and mark the end of the array.
>>
>>>> struct virtio_iommu_req_tail tail;
>>>> };
>>>>
>>>> I'm still struggling with the content and layout of the probe
>>>> request, and would appreciate any feedback. To be easily extended, I
>>>> think it should contain a list of fields of variable size:
>>>>
>>>> |0 15|16 31|32 N|
>>>> | type | length | values |
>>>>
>>>> 'length' might be made optional if it can be deduced from type, but
>>>> might make driver-side parsing more robust.
>>>>
>>>> The probe could either be done for each endpoint, or for each address
>>>> space. I much prefer endpoint because it is the smallest granularity.
>>>> The driver can then decide what endpoints to put together in the same
>>>> address space based on their individual capabilities. The
>>>> specification would described how each endpoint property is combined
>>>> when endpoints are put in the same address space. For example, take
>>>> the minimum of all PASID size, the maximum of all page granularities,
>>>> combine doorbell addresses, etc.
>>>>
>>>> If we did the probe on address spaces instead, the driver would have
>>>> to re-send a probe request each time a new endpoint is attached to an
>>>> existing address space, to see if it is still capable of page table
>>>> handover or if the driver just combined a VFIO and an emulated
>>>> endpoint by accident.
>>>>
>>>> ***
>>>>
>>>> Using this framework, the device can declare doorbell regions by
>>>> adding one or more RESV fields into the probe buffer:
>>>>
>>>> /* 'type' */
>>>> #define VIRTIO_IOMMU_PROBE_T_RESV 0x1
>>>>
>>>> /* 'values'. 'length' is sizeof(struct virtio_iommu_probe_resv) */
>>>> struct virtio_iommu_probe_resv {
>>>> le64 gpa;
>>>> le64 size;
>>>>
>>>> #define VIRTIO_IOMMU_PROBE_RESV_MSI 0x1
>>>> u8 type;
>
> To be sure I am understanding it correctly, Is this "type" in struct
> virtio_iommu_req_head?
No, virtio_iommu_req_head::type is the request type
(ATTACH/DETACH/MAP/UNMAP/PROBE).
Then virtio_iommu_probe_property::type is the property type (only RESV for
the moment).
And this is virtio_iommu_probe_resv::type, which is the type of the resv
region (MSI). I renamed it to 'subtype' below, but I think it still is
pretty confusing.
I did a number of changes to structures and naming when trying to
integrate it to the specification:
* Added 64 bytes of padding in virtio_iommu_req_probe, so that future
extensions can add fields in the device-readable part.
* renamed "RESV" to "RESV_MEM".
* The resv_mem property now looks like this:
struct virtio_iommu_probe_resv_mem {
u8 subtype;
u8 padding[3];
le32 flags;
le64 addr;
le64 size;
};
* subtype for MSI doorbells is now VIRTIO_IOMMU_PROBE_RESV_MEM_T_BYPASS
(because transactions to this region bypass the IOMMU). 'flags' contain a
hint VIRTIO_IOMMU_PROBE_RESV_MEM_F_MSI, telling the driver that this
region is used for MSIs.
Here is an example of a probe request returning an MSI doorbell property.
31 7 0
+---------------------------------+
| 0 | type | <- request type = PROBE (5)
+---------------------------------+
| device |
+---------------------------------+
: :
: (64B padding) :
: :
+---------------------------------+
^ | length = 24 | type = 1 | <- property type = RESV_MEM (1)
| +---------------------------------+
| | 0 |subtype | <- RESV_MEM subtype = BYPASS (1)
p| +---------------------------------+
r| | flags = MSI |
o| +---------------------------------+
b| | addr = 0xfee00000 |
e| | |
_| +---------------------------------+
s| | size = 0x00100000 |
i| | |
z| +---------------------------------+
e| | length | type | <- another property may start
| : : here
v : ... :
+---------------------------------+
| 0 | status | <- request tail
+---------------------------------+
I'll try to send the next version of the spec out as soon as possible.
Thanks,
Jean
> Thanks
> -Bharat
>
>>>
>>> type is 16 bit above?
>>
>> Ah, the naming isn't great. This is not the same as above, and could be
>> called
>> 'subtype' to avoid confusion. The above 16-bit type describes the field type,
>> e.g. struct virtio_iommu_probe_resv. I proposed 16-bit because it seems
>> easy to reach more than 255 kinds of endpoint properties, but
>> 65535 should do.
>>
>> This subtype describes which kind of resv region is described in the
>> structure.
>> For the moment there only is VIRTIO_IOMMU_PROBE_RESV_MSI, but we
>> could for example add resv regions that the driver should never use or that
>> it
>> should identity-map (equivalent to IOMMU_RESV_RESERVED/DIRECT in
>> Linux). I think 8 bits should be enough to contain any future types, unless
>> we
>> make this a bitfield. For identity-map, there may be an additional flags
>> field
>> describing the protection.
>>
>>>> };
>>>>
>>>> Such a region would be subject to the following rules:
>>>>
>>>> * Driver should not use any IOVA declared as RESV_MSI in a mapping.
>>>> * Device should leave any transaction matching a RESV_MSI region pass
>>>> through untranslated.
>>>> * If the device does not advertise any RESV region, then the driver
>>>> should assume that MSI doorbells, like any other GPA, must be mapped
>>>> with an arbitrary IOVA in order for the endpoint to access them.
>>>> * Given that the driver *should* perform a probe request if
>>>> available, and it *should* understand the
>> VIRTIO_IOMMU_PROBE_T_RESV
>>>> field, then this field tells the guest how it should handle MSI
>>>> doorbells, and whether it should map the address via MAP requests or
>> not.
>>>>
>>>> Does this make sense and did I overlook something?
>>>
>>> Overall it looks good to me. Do you have plans to implements this in virtio-
>> iommu driver and kvmtool?
>>
>> Yes, if there is no objection I'll try to formalize it and implement it
>> right away.
>>
>> Thanks,
>> Jean
- Re: [Qemu-arm] [Qemu-devel] [RFC v2 0/8] VIRTIO-IOMMU device, (continued)
- Re: [Qemu-arm] [Qemu-devel] [RFC v2 0/8] VIRTIO-IOMMU device, Tian, Kevin, 2017/07/14
- Re: [Qemu-arm] [Qemu-devel] [RFC v2 0/8] VIRTIO-IOMMU device, Bharat Bhushan, 2017/07/07
- Re: [Qemu-arm] [Qemu-devel] [RFC v2 0/8] VIRTIO-IOMMU device, Auger Eric, 2017/07/07
- Re: [Qemu-arm] [Qemu-devel] [RFC v2 0/8] VIRTIO-IOMMU device, Bharat Bhushan, 2017/07/07
- Re: [Qemu-arm] [Qemu-devel] [RFC v2 0/8] VIRTIO-IOMMU device, Jean-Philippe Brucker, 2017/07/07
- Re: [Qemu-arm] [Qemu-devel] [RFC v2 0/8] VIRTIO-IOMMU device, Bharat Bhushan, 2017/07/11
- Re: [Qemu-arm] [Qemu-devel] [RFC v2 0/8] VIRTIO-IOMMU device, Jean-Philippe Brucker, 2017/07/11
- Re: [Qemu-arm] [Qemu-devel] [RFC v2 0/8] VIRTIO-IOMMU device, Bharat Bhushan, 2017/07/11
- Re: [Qemu-arm] [Qemu-devel] [RFC v2 0/8] VIRTIO-IOMMU device,
Jean-Philippe Brucker <=
- Re: [Qemu-arm] [Qemu-devel] [RFC v2 0/8] VIRTIO-IOMMU device, Bharat Bhushan, 2017/07/12
- Re: [Qemu-arm] [Qemu-devel] [RFC v2 0/8] VIRTIO-IOMMU device, Jean-Philippe Brucker, 2017/07/12
- Re: [Qemu-arm] [Qemu-devel] [RFC v2 0/8] VIRTIO-IOMMU device, Bharat Bhushan, 2017/07/12
- Re: [Qemu-arm] [Qemu-devel] [RFC v2 0/8] VIRTIO-IOMMU device, Auger Eric, 2017/07/06
- Re: [Qemu-arm] [Qemu-devel] [RFC v2 0/8] VIRTIO-IOMMU device, Auger Eric, 2017/07/07
- Re: [Qemu-arm] [Qemu-devel] [RFC v2 0/8] VIRTIO-IOMMU device, Jean-Philippe Brucker, 2017/07/07
Re: [Qemu-arm] [Qemu-devel] [RFC v2 0/8] VIRTIO-IOMMU device, Tian, Kevin, 2017/07/05
Re: [Qemu-arm] [Qemu-devel] [RFC v2 0/8] VIRTIO-IOMMU device, Tian, Kevin, 2017/07/05