From: Tian, Kevin
Subject: RE: [PATCH V2 4/4] intel-iommu: PASID support
Date: Wed, 30 Mar 2022 08:00:04 +0000

> From: Jason Wang <jasowang@redhat.com>
> Sent: Tuesday, March 29, 2022 12:47 PM
> 
> On Mon, Mar 28, 2022 at 2:47 PM Tian, Kevin <kevin.tian@intel.com> wrote:
> >
> > > From: Jason Wang <jasowang@redhat.com>
> > > Sent: Monday, March 28, 2022 10:31 AM
> > >
> > > On Thu, Mar 24, 2022 at 4:54 PM Tian, Kevin <kevin.tian@intel.com> wrote:
> > > >
> > > > > From: Jason Wang
> > > > > Sent: Monday, March 21, 2022 1:54 PM
> > > > >
> > > > > This patch introduces ECAP_PASID via "x-pasid-mode". Based on the
> > > > > existing support for scalable mode, we need to implement the
> > > > > following missing parts:
> > > > >
> > > > > 1) tag VTDAddressSpace with PASID and support IOMMU/DMA translation
> > > > >    with PASID
> > > > > 2) tag IOTLB with PASID
> > > >
> > > > and an invalidation descriptor to flush the PASID IOTLB, which seems
> > > > missing in this patch.
> > >
> > > It existed in the previous version, but it looks like it will be used
> > > only for the first-level page table, which is not supported right now.
> > > So I deleted the code.
> >
> > You are right. But there is also the PASID-based device-TLB invalidation
> > descriptor, which is orthogonal to the 1st- vs. 2nd-level question. If we
> > don't want to break the spec with this series, then there needs to be a way
> > to prevent the user from setting both "device-iotlb" and "x-pasid-mode"
> > together.
> 
> Right, let me do it in the next version.
> 
> 
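For reference, something along these lines in vtd_decide_config() should be
enough (untested sketch; I'm guessing that the property fields are named
dt_supported for "device-iotlb" and pasid for "x-pasid-mode", so adjust to
whatever the patch actually uses):

    /* Reject the unsupported combination early, at realize time. */
    if (s->pasid && s->dt_supported) {
        error_setg(errp, "PASID-based device-TLB invalidation is not "
                   "implemented, so \"x-pasid-mode\" cannot be combined "
                   "with \"device-iotlb\"");
        return false;
    }
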
> >
> > >
> > > >
> > > > > 3) PASID cache and its flush
> > > > > 4) Fault recording with PASID
> > > > >
> > > > > For simplicity:
> > > > >
> > > > > 1) PASID cache is not implemented, so we can simply implement the
> > > > >    PASID cache flush as a nop.
> > > > > 2) Fault recording with PASID is not supported; NFR is not changed.
> > > > >
> > > > > None of the above is mandatory, and all of it could be implemented
> > > > > in the future.
> > > >
> > > > PASID cache is optional, but fault recording with PASID is required.
> > >
> > > Any pointer in the spec to say something like this? I think sticking
> > > to the NFR would be sufficient.
> >
> > I don't remember any place in the spec saying that fault recording with
> > PASID is not required when the PASID capability is exposed.
> 
> Ok, but as a spec it needs to clarify what is required for each capability.

It is clarified in 10.4.14 Fault Recording Registers:

  "PV: PASID Value": PASID value used by the faulted request.
  For requests with PASID, this field reports the PASID value that
  came with the request. Hardware implementations not supporting
  PASID (PASID field Clear in Extended Capability register) and not
  supporting RID_PASID (RPS field Clear in Extended Capability
  Register) implement this field as RsvdZ.

The above reflects that when the PASID capability is enabled, the PV field
should report the PASID value of the faulted request.

A similar description can be found for another field, "PP: PASID Present".
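
To be concrete, what fault recording with PASID needs is just filling PP/PV
when composing the upper 64 bits of the fault recording register, e.g. (rough
sketch only; the helper names are made up and the bit positions are my reading
of 10.4.14, i.e. PP at bit 95 and PV at bits 123:104 of the 128-bit register,
so please double-check against the spec):

    #include <stdbool.h>
    #include <stdint.h>

    /* Made-up helpers, expressed relative to the upper 64 bits (bits 127:64). */
    #define FRCD_HI_PP          (1ULL << (95 - 64))
    #define FRCD_HI_PV(pasid)   ((((uint64_t)(pasid)) & 0xfffffULL) << (104 - 64))

    static uint64_t frcd_hi_with_pasid(uint64_t hi, bool is_pasid, uint32_t pasid)
    {
        if (is_pasid) {
            hi |= FRCD_HI_PP | FRCD_HI_PV(pasid); /* PASID Present + 20-bit value */
        }
        return hi;
    }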

> 
> > If a fault is
> > triggered by a request with PASID, we do want to report this information
> > upward.
> 
> I tend to do it incrementally on top of this series (anyhow, at least
> RID2PASID is introduced before this series)

Yes, RID2PASID should have been recorded too but it's not done correctly.

If you do it in a separate series, does that imply you will introduce another
"x-pasid-fault" knob to guard the new logic related to PASID fault recording?

> 
> >
> > btw can you elaborate on why NFR matters to PASID? It is just the
> > number of fault recording registers...
> 
> I might be wrong, but I thought without increasing NFR we may lack
> sufficient room for reporting PASID.

I think they are orthogonal things. NFR only tells how many fault recording
registers are implemented; each 128-bit register already has room for the
PP/PV fields.

> 
> >
> > >
> > > > I'm fine with adding it incrementally but want to clarify the concept 
> > > > first.
> > >
> > > Yes, that's the plan.
> > >
> >
> > I have one open question which requires your input.
> >
> > While incrementally enabling things is indeed a common practice, one worry
> > is whether we will create too many control knobs in the staging process
> > and confuse the end user.
> 
> It should be fine as long as we use the "x-" prefix, which will
> eventually be removed.

Good to learn.

> 
> >
> > Earlier when Yi proposed Qemu changes for guest SVA [1] he aimed for a
> > coarse-grained knob design:
> > --
> >   Intel VT-d 3.0 introduces scalable mode, and it has a bunch of
> >   capabilities related to scalable mode translation, thus there are
> >   multiple combinations. While this vIOMMU implementation wants simplify
> >   it for user by providing typical combinations. User could config it by
> >   "x-scalable-mode" option. The usage is as below:
> >     "-device intel-iommu,x-scalable-mode=["legacy"|"modern"]"
> >
> >     - "legacy": gives support for SL page table
> >     - "modern": gives support for FL page table, pasid, virtual command
> >     -  if not configured, means no scalable mode support, if not proper
> >        configured, will throw error
> > --
> >
> > Which way do you prefer?
> >
> > [1] https://lists.gnu.org/archive/html/qemu-devel/2020-02/msg02805.html
> 
> My understanding is that, if we want to deploy Qemu in a production
> environment, we can't use the "x-" prefix. We need a full
> implementation of each cap.
> 
> E.g.:
> -device intel-iommu,first-level=on,scalable-mode=on etc.
> 

You meant each cap will get a separate control option?

But that way requires the management stack or admin to have deep
knowledge about how combinations of different capabilities work, e.g.
that just turning on scalable mode without first-level cannot support
vSVA on assigned devices. Is this a common practice when defining Qemu
parameters?
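
If we go that way, I guess the device will have to validate the combinations
anyway so the admin at least gets a clear error, e.g. in vtd_decide_config()
(rough sketch; first_level/scalable_mode are placeholders for whatever the
property fields end up being called):

    /* First-level translation only exists under scalable mode. */
    if (s->first_level && !s->scalable_mode) {
        error_setg(errp, "\"first-level\" requires \"scalable-mode\" to be on");
        return false;
    }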

Thanks
Kevin
