From: Alex Williamson
Subject: Re: [PATCH QEMU v25 13/17] vfio: create mapped iova list when vIOMMU is enabled
Date: Thu, 25 Jun 2020 11:40:39 -0600

On Thu, 25 Jun 2020 20:04:08 +0530
Kirti Wankhede <kwankhede@nvidia.com> wrote:

> On 6/25/2020 12:25 AM, Alex Williamson wrote:
> > On Sun, 21 Jun 2020 01:51:22 +0530
> > Kirti Wankhede <kwankhede@nvidia.com> wrote:
> >   
> >> Create a mapped iova list when vIOMMU is enabled. For each mapped iova,
> >> save the translated address. Add a node to the list on MAP and remove
> >> the node from the list on UNMAP.
> >> This list is used to track dirty pages during migration.  
> > 
> > This seems like a lot of overhead to support that the VM might migrate.
> > Is there no way we can build this when we start migration, for example
> > replaying the mappings at that time?  Thanks,
> >   
> 
> In my previous version I tried to go through the whole range and find a
> valid iotlb, as below:
> 
> +        if (memory_region_is_iommu(section->mr)) {
> +            iotlb = address_space_get_iotlb_entry(container->space->as, iova,
> +                                                  true, MEMTXATTRS_UNSPECIFIED);
> 
> When mapping doesn't exist, qemu throws error as below:
> 
> qemu-system-x86_64: vtd_iova_to_slpte: detected slpte permission error (iova=0x0, level=0x3, slpte=0x0, write=1)
> qemu-system-x86_64: vtd_iommu_translate: detected translation failure (dev=00:03:00, iova=0x0)
> qemu-system-x86_64: New fault is not recorded due to compression of faults

My assumption would have been that we use the replay mechanism, which
is known to work because we need to use it when we hot-add a device.
We'd make use of iommu_notifier_init() to create a new handler for this
purpose, then we'd walk our container->giommu_list and call
memory_region_iommu_replay() for each.
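
Roughly (an untested sketch only -- vfio_iova_list_notify() and
vfio_rebuild_iova_list() are placeholder names, not existing functions, and
the "record ..." step is whatever tracking structure we end up with):

static void vfio_iova_list_notify(IOMMUNotifier *n, IOMMUTLBEntry *iotlb)
{
    if ((iotlb->perm & IOMMU_RW) != IOMMU_NONE) {
        /* record (iotlb->iova, iotlb->translated_addr, iotlb->addr_mask + 1)
         * in the container's mapped-iova tracking structure */
    }
}

static void vfio_rebuild_iova_list(VFIOContainer *container)
{
    VFIOGuestIOMMU *giommu;

    QLIST_FOREACH(giommu, &container->giommu_list, giommu_next) {
        IOMMUNotifier n;
        int idx = memory_region_iommu_attrs_to_index(giommu->iommu,
                                                     MEMTXATTRS_UNSPECIFIED);

        iommu_notifier_init(&n, vfio_iova_list_notify, IOMMU_NOTIFIER_MAP, 0,
                            memory_region_size(MEMORY_REGION(giommu->iommu)) - 1,
                            idx);
        memory_region_iommu_replay(giommu->iommu, &n);
    }
}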

Peter, does this sound like the right approach to you?

> Secondly, it iterates through the whole range at IOMMU page size
> granularity, which is 4K, so it takes a long time and results in a large
> downtime. With this optimization, the downtime with vIOMMU is reduced
> significantly.

Right, but we amortize that overhead and the resulting bloat across the
99.9999% of the time that we're not migrating.  I wonder if we could
start up another thread to handle this when we enable dirty logging.  We
don't really need the result until we start processing the dirty
bitmap, right?  Also, if we're dealing with this many separate pages,
shouldn't we be using a tree rather than a list to give us O(logN)
rather than O(N)?
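
For example (sketch only -- "mapped_iovas", VFIOIovaRange and the comparator
below don't exist today), a GTree keyed on iova would give us O(logN)
insert/remove/lookup:

typedef struct VFIOIovaRange {
    hwaddr iova;
    hwaddr size;
    hwaddr translated_addr;
} VFIOIovaRange;

static gint vfio_iova_range_cmp(gconstpointer a, gconstpointer b,
                                gpointer user_data)
{
    hwaddr ia = ((const VFIOIovaRange *)a)->iova;
    hwaddr ib = ((const VFIOIovaRange *)b)->iova;

    return ia < ib ? -1 : (ia > ib ? 1 : 0);
}

/* created once, e.g. when the container is set up:
 *   container->mapped_iovas = g_tree_new_full(vfio_iova_range_cmp, NULL,
 *                                             g_free, NULL);
 * on MAP:
 *   VFIOIovaRange *r = g_new0(VFIOIovaRange, 1);
 *   r->iova = iova; r->size = size; r->translated_addr = vaddr;
 *   g_tree_insert(container->mapped_iovas, r, r);
 * on UNMAP:
 *   VFIOIovaRange key = { .iova = iova };
 *   g_tree_remove(container->mapped_iovas, &key);
 */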
 
> Other option I will try if I can check that if migration is supported 
> then only create this list.

Wouldn't we still have problems if we start with a guest IOMMU domain
with a device that doesn't support migration, hot-add a device that
does support migration, then hot-remove the original device?  Seems
like our list would only cover mappings made since the migration-capable
device was added.  Thanks,

Alex



