From: David Gibson
Subject: Re: [Qemu-devel] [RFC PATCH 03/10] vfio: Check guest IOVA ranges against host IOMMU capabilities
Date: Wed, 23 Sep 2015 21:07:06 +1000
User-agent: Mutt/1.5.23 (2014-03-12)

On Wed, Sep 23, 2015 at 12:10:46PM +0200, Thomas Huth wrote:
> On 17/09/15 15:09, David Gibson wrote:
> > The current vfio core code assumes that the host IOMMU is capable of
> > mapping any IOVA the guest wants to use to where we need.  However, real
> > IOMMUs generally only support translating a certain range of IOVAs (the
> > "DMA window") not a full 64-bit address space.
> > 
> > The common x86 IOMMUs support a wide enough range that guests are very
> > unlikely to go beyond it in practice.  However, the IOMMU used on IBM Power
> > machines - in the default configuration - supports only a much more limited
> > IOVA range, usually 0..2GiB.
> > 
> > If the guest attempts to set up an IOVA range that the host IOMMU can't
> > map, qemu won't report an error until it actually attempts to map a bad
> > IOVA.  If guest RAM is being mapped directly into the IOMMU (i.e. no guest
> > visible IOMMU) then this will show up very quickly.  If there is a guest
> > visible IOMMU, however, the problem might not show up until much later when
> > the guest actually attempts to DMA with an IOVA the host can't handle.
> > 
> > This patch adds a test so that we will detect earlier if the guest is
> > attempting to use IOVA ranges that the host IOMMU won't be able to deal
> > with.
> > 
> > For now, we assume that "Type1" (x86) IOMMUs can support any IOVA.  This is
> > incorrect, but no worse than what we have already.  We can't do better for
> > now because the Type1 kernel interface doesn't tell us what IOVA range the
> > IOMMU actually supports.
> > 
> > For the Power "sPAPR TCE" IOMMU, however, we can retrieve the supported
> > IOVA range and validate guest IOVA ranges against it, and this patch does
> > so.
> > 
> > Signed-off-by: David Gibson <address@hidden>
> > ---
> >  hw/vfio/common.c              | 42 +++++++++++++++++++++++++++++++++++++++---
> >  include/hw/vfio/vfio-common.h |  6 ++++++
> >  2 files changed, 45 insertions(+), 3 deletions(-)
> > 
> > diff --git a/hw/vfio/common.c b/hw/vfio/common.c
> > index 9953b9c..c37f1a1 100644
> > --- a/hw/vfio/common.c
> > +++ b/hw/vfio/common.c
> > @@ -344,14 +344,23 @@ static void vfio_listener_region_add(MemoryListener *listener,
> >      if (int128_ge(int128_make64(iova), llend)) {
> >          return;
> >      }
> > +    end = int128_get64(llend);
> > +
> > +    if ((iova < container->iommu_data.min_iova)
> > +        || ((end - 1) > container->iommu_data.max_iova)) {
> 
> (Too many parentheses for my taste ;-))

Yes, well, we've already established our tastes differ on that point.

> > +        error_report("vfio: IOMMU container %p can't map guest IOVA region"
> > +                     " 0x%"HWADDR_PRIx"..0x%"HWADDR_PRIx,
> > +                     container, iova, end - 1);
> > +        ret = -EFAULT; /* FIXME: better choice here? */
> 
> Maybe -EINVAL? ... but -EFAULT also sounds ok to me.

I try to avoid EINVAL unless it's clearly the only right choice.  So
many things use it that it tends to be very unhelpful when you get one.
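
For reference, the check under discussion can be sketched standalone
roughly as below.  This is only an illustration, not the patch itself:
the IovaWindow struct and iova_range_check() are hypothetical names
standing in for the new min_iova/max_iova fields and the test in
vfio_listener_region_add(), and plain uint64_t is assumed in place of
hwaddr.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical stand-in for the min_iova/max_iova fields in the patch. */
typedef struct {
    uint64_t min_iova;   /* lowest mappable IOVA, inclusive */
    uint64_t max_iova;   /* highest mappable IOVA, inclusive */
} IovaWindow;

/*
 * Return 0 if the region [iova, end) fits inside the window, or
 * -EFAULT if any part of it falls outside.
 *
 * 'end' is exclusive and must be greater than 'iova', mirroring the
 * early return for empty regions in the quoted hunk.  A region ending
 * exactly at 2^64 cannot be expressed with an exclusive 64-bit end;
 * this sketch does not attempt to handle that case.
 */
static int iova_range_check(const IovaWindow *w, uint64_t iova, uint64_t end)
{
    assert(w->min_iova <= w->max_iova);
    assert(iova < end);

    /* Compare the last byte of the region (end - 1) to the inclusive max. */
    if (iova < w->min_iova || end - 1 > w->max_iova) {
        return -EFAULT;
    }
    return 0;
}
```

With a window of 0..0x7fffffff (the usual default on Power mentioned
in the commit message), a region [0, 0x80000000) passes and anything
reaching past 2GiB is rejected.
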

> > +        goto fail;
> > +    }
> ...
> > @@ -712,6 +732,22 @@ static int vfio_connect_container(VFIOGroup *group, AddressSpace *as)
> >              ret = -errno;
> >              goto free_container_exit;
> >          }
> > +
> > +        /*
> > +         * FIXME: This only considers the host IOMMU's 32-bit window.
> > +         * At some point we need to add support for the optional
> > +         * 64-bit window and dynamic windows
> > +         */
> > +        info.argsz = sizeof(info);
> > +        ret = ioctl(fd, VFIO_IOMMU_SPAPR_TCE_GET_INFO, &info);
> > +        if (ret) {
> > +            error_report("vfio: VFIO_IOMMU_SPAPR_TCE_GET_INFO failed: %m");
> 
> Isn't that %m a glibc extension only? ... Well, this code likely only
> runs on Linux with a glibc, so it likely doesn't matter, I guess...

Yes, it is, but it's already used extensively within qemu.
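
If portability ever did matter here, the usual alternative is to pass
strerror(errno) explicitly.  A minimal sketch follows, using a
hypothetical format_errno_msg() helper (not an existing qemu function)
and snprintf() in place of error_report() so that it stands alone:

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/*
 * Format "vfio: <what> failed: <error text>" into buf without relying
 * on the glibc-specific %m conversion.  'err' is an errno value that
 * the caller captured right after the failing call.
 * Returns the snprintf() result.
 */
static int format_errno_msg(char *buf, size_t len, const char *what, int err)
{
    assert(buf != NULL && what != NULL);
    return snprintf(buf, len, "vfio: %s failed: %s", what, strerror(err));
}
```

A caller would save errno into a local immediately after the ioctl()
fails and pass that in.  Doing so also avoids depending on errno being
unchanged after the reporting call when it is later used for the
return value.
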

> > +            ret = -errno;
> > +            goto free_container_exit;
> > +        }
> > +        container->iommu_data.min_iova = info.dma32_window_start;
> > +        container->iommu_data.max_iova = container->iommu_data.min_iova
> > +            + info.dma32_window_size - 1;
> >      } else {
> >          error_report("vfio: No available IOMMU models");
> >          ret = -EINVAL;
> > diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h
> > index aff18cd..88ec213 100644
> > --- a/include/hw/vfio/vfio-common.h
> > +++ b/include/hw/vfio/vfio-common.h
> > @@ -71,6 +71,12 @@ typedef struct VFIOContainer {
> >          MemoryListener listener;
> >          int error;
> >          bool initialized;
> > +        /*
> > +         * FIXME: This assumes the host IOMMU can support only a
> > +         * single contiguous IOVA window.  We may need to generalize
> > +         * that in future
> > +         */
> > +        hwaddr min_iova, max_iova;
> 
> Should that maybe be dma_addr_t instead of hwaddr ?

Ah, yes it probably should.

> >      } iommu_data;
> >      QLIST_HEAD(, VFIOGuestIOMMU) giommu_list;
> >      QLIST_HEAD(, VFIOGroup) group_list;
> > 
> 
>  Thomas
> 

-- 
David Gibson                    | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
                                | _way_ _around_!
http://www.ozlabs.org/~dgibson
