Re: [Qemu-devel] [regression] dataplane: throughput -40% by commit 580b6


From: Paolo Bonzini
Subject: Re: [Qemu-devel] [regression] dataplane: throughput -40% by commit 580b6b2aa2
Date: Thu, 03 Jul 2014 12:29:35 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:24.0) Gecko/20100101 Thunderbird/24.6.0

On 03/07/2014 06:54, Ming Lei wrote:
On Thu, Jul 3, 2014 at 12:21 AM, Paolo Bonzini <address@hidden> wrote:
On 02/07/2014 17:45, Ming Lei wrote:
The attached debug patch skips aio_notify() if qemu_bh_schedule()
is running from the current AioContext, but it looks like 120K
writes are still triggered. (Without the patch, 400K can be
observed in the same test.)

Nice.  Another observation is that after aio_dispatch we'll always
re-evaluate everything (bottom halves, file descriptors and timeouts),

The idea is very good.

If aio_notify() is called from the first aio_dispatch() in aio_poll(),
ctx->notifier might need to be set, but that can be handled easily.

Yes, you can just move the atomic_inc/atomic_dec in aio_poll.

so we can skip the aio_notify if we're inside aio_dispatch.

So what about this untested patch:

diff --git a/aio-posix.c b/aio-posix.c
index f921d4f..a23d85d 100644
--- a/aio-posix.c
+++ b/aio-posix.c

#include "qemu/atomic.h"

@@ -124,6 +124,9 @@ static bool aio_dispatch(AioContext *ctx)
     AioHandler *node;
     bool progress = false;

+    /* No need to set the event notifier during aio_notify.  */
+    ctx->running++;
+
     /*
      * We have to walk very carefully in case qemu_aio_set_fd_handler is
      * called while we're walking.
@@ -169,6 +171,11 @@ static bool aio_dispatch(AioContext *ctx)
     /* Run our timers */
     progress |= timerlistgroup_run_timers(&ctx->tlg);

+    smp_wmb();
+    ctx->iter_count++;
+    smp_wmb();
+    ctx->running--;
+
     return progress;
 }

diff --git a/async.c b/async.c
index 5b6fe6b..1f56afa 100644
--- a/async.c
+++ b/async.c

#include "qemu/atomic.h"

@@ -249,7 +249,19 @@ ThreadPool *aio_get_thread_pool(AioContext *ctx)

 void aio_notify(AioContext *ctx)
 {
-    event_notifier_set(&ctx->notifier);
+    uint32_t iter_count;
+    do {
+        iter_count = ctx->iter_count;
+        /* Read ctx->iter_count before ctx->running.  */
+        smb_rmb();

s/smb/smp

+        if (!ctx->running) {
+            event_notifier_set(&ctx->notifier);
+            return;
+        }
+        /* Read ctx->running before ctx->iter_count.  */
+        smb_rmb();

s/smb/smp

+        /* ctx might have gone to sleep.  */
+    } while (iter_count != ctx->iter_count);
 }

Since both 'running' and 'iter_count' may be read without a lock,
something like ACCESS_ONCE() should be used to avoid compiler optimizations.

No, smp_rmb() is enough to avoid them. See also include/qemu/seqlock.h.

The first access to ctx->iter_count _could_ be protected by ACCESS_ONCE(),
which in QEMU we call atomic_read()/atomic_set(), but it's not necessary.
See docs/atomics.txt for a description of QEMU's atomic access functions.

In my test, it does decrease write() calls very much, and I hope
a formal version can be applied soon.

Can you take care of that (you can add my Signed-off-by), since you have the best testing environment? v5 of the plug/unplug series will be good to go, I think.

Paolo
