[Top][All Lists]
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [Qemu-devel] [PATCHv2 1/3] virtio: add missing mb() on notification
From: |
Paolo Bonzini |
Subject: |
Re: [Qemu-devel] [PATCHv2 1/3] virtio: add missing mb() on notification |
Date: |
Tue, 24 Apr 2012 15:46:25 +0200 |
User-agent: |
Mozilla/5.0 (X11; Linux x86_64; rv:11.0) Gecko/20120329 Thunderbird/11.0.1 |
Il 23/04/2012 15:19, Michael S. Tsirkin ha scritto:
> During normal operation, virtio first writes a used index
> and then checks whether it should interrupt the guest
> by reading guest avail index/flag values.
>
> Guest does the reverse: writes the index/flag,
> then checks the used ring.
>
> The ordering is important: if host avail flag read bypasses the used
> index write, we could in effect get this timing:
>
> host avail flag read
> guest enable interrupts: avail flag write
> guest check used ring: ring is empty
> host used index write
>
> which results in a lost interrupt: guest will never be notified
> about the used ring update.
>
> This actually can happen when using kvm with an io thread,
> such that the guest vcpu and qemu run on different host cpus,
> and this has actually been observed in the field
> (but only seems to trigger on very specific processor types)
> with userspace virtio: vhost has the necessary smp_mb()
> in place to prevent the regordering, so the same workload stalls
> forever waiting for an interrupt with vhost=off but works
> fine with vhost=on.
>
> Insert an smp_mb barrier operation in userspace virtio to
> ensure the correct ordering.
> Applying this patch fixed the race condition we have observed.
> Tested on x86_64. I checked the code generated by the new macro
> for i386 and ppc but didn't run virtio.
>
> Signed-off-by: Michael S. Tsirkin <address@hidden>
> ---
> hw/virtio.c | 2 ++
> qemu-barrier.h | 23 ++++++++++++++++++++---
> 2 files changed, 22 insertions(+), 3 deletions(-)
>
> diff --git a/hw/virtio.c b/hw/virtio.c
> index f805790..6449746 100644
> --- a/hw/virtio.c
> +++ b/hw/virtio.c
> @@ -693,6 +693,8 @@ static bool vring_notify(VirtIODevice *vdev, VirtQueue
> *vq)
> {
> uint16_t old, new;
> bool v;
> + /* We need to expose used array entries before checking used event. */
> + mb();
mb() vs. smp_mb()?
> /* Always notify when queue is empty (when feature acknowledge) */
> if (((vdev->guest_features & (1 << VIRTIO_F_NOTIFY_ON_EMPTY)) &&
> !vq->inuse && vring_avail_idx(vq) == vq->last_avail_idx)) {
> diff --git a/qemu-barrier.h b/qemu-barrier.h
> index c11bb2b..f6722a8 100644
> --- a/qemu-barrier.h
> +++ b/qemu-barrier.h
> @@ -4,7 +4,7 @@
> /* Compiler barrier */
> #define barrier() asm volatile("" ::: "memory")
>
> -#if defined(__i386__) || defined(__x86_64__)
> +#if defined(__i386__)
>
> /*
> * Because of the strongly ordered x86 storage model, wmb() is a nop
> @@ -13,15 +13,31 @@
> * load/stores from C code.
> */
> #define smp_wmb() barrier()
> +/*
> + * We use GCC builtin if it's available, as that can use
> + * mfence on 32 bit as well, e.g. if built with -march=pentium-m.
> + * However, on i386, there seem to be known bugs as recently as 4.3.
> + * */
Do you know what those bugs are? Either add a pointer, or there is no
reason to have cruft that is only backed by hearsay.
> +#if defined(_GNUC__) && __GNUC__ >= 4 && __GNUC_MINOR__ >= 4
> +#define smp_mb() __sync_synchronize()
> +#else
> +#define smp_mb() asm volatile("lock; addl $0,0(%%esp) " ::: "memory")
> +#endif
> +
> +#elif defined(__x86_64__)
> +
> +#define smp_wmb() barrier()
> +#define smp_mb() asm volatile("mfence" ::: "memory")
>
> #elif defined(_ARCH_PPC)
>
> /*
> - * We use an eieio() for a wmb() on powerpc. This assumes we don't
> + * We use an eieio() for wmb() and mb() on powerpc. This assumes we don't
> * need to order cacheable and non-cacheable stores with respect to
> * each other
> */
> #define smp_wmb() asm volatile("eieio" ::: "memory")
> +#define smp_mb() asm volatile("eieio" ::: "memory")
smp_mb() is hwsync under PPC, but I would just trust GCC.
Paolo
>
> #else
>
> @@ -29,9 +45,10 @@
> * For (host) platforms we don't have explicit barrier definitions
> * for, we use the gcc __sync_synchronize() primitive to generate a
> * full barrier. This should be safe on all platforms, though it may
> - * be overkill.
> + * be overkill for wmb().
> */
> #define smp_wmb() __sync_synchronize()
> +#define smp_mb() __sync_synchronize()
>
> #endif
>
[Qemu-devel] [PATCHv2 2/3] virtio: add missing mb() on enable notification, Michael S. Tsirkin, 2012/04/23