On Thu, Jul 3, 2014 at 12:21 AM, Paolo Bonzini <address@hidden> wrote:
Il 02/07/2014 17:45, Ming Lei ha scritto:
The attachment debug patch skips aio_notify() if qemu_bh_schedule
is running from current aio context, but looks there is still 120K
writes triggered. (without the patch, 400K can be observed in
same test)
Nice. Another observation is that after aio_dispatch we'll always
re-evaluate everything (bottom halves, file descriptors and timeouts),
The idea is very good.
If aio_notify() is called from the 1st aio_dispatch() in aio_poll(),
ctc->notifier might need to be set, but it can be handled easily.
so we can skip the aio_notify if we're inside aio_dispatch.
So what about this untested patch:
diff --git a/aio-posix.c b/aio-posix.c
index f921d4f..a23d85d 100644
--- a/aio-posix.c
+++ b/aio-posix.c
#include "qemu/atomic.h"
@@ -124,6 +124,9 @@ static bool aio_dispatch(AioContext *ctx)
AioHandler *node;
bool progress = false;
+ /* No need to set the event notifier during aio_notify. */
+ ctx->running++;
+
/*
* We have to walk very carefully in case qemu_aio_set_fd_handler is
* called while we're walking.
@@ -169,6 +171,11 @@ static bool aio_dispatch(AioContext *ctx)
/* Run our timers */
progress |= timerlistgroup_run_timers(&ctx->tlg);
+ smp_wmb();
+ ctx->iter_count++;
+ smp_wmb();
+ ctx->running--;
+
return progress;
}
diff --git a/async.c b/async.c
index 5b6fe6b..1f56afa 100644
--- a/async.c
+++ b/async.c
#include "qemu/atomic.h"
@@ -249,7 +249,19 @@ ThreadPool *aio_get_thread_pool(AioContext *ctx)
void aio_notify(AioContext *ctx)
{
- event_notifier_set(&ctx->notifier);
+ uint32_t iter_count;
+ do {
+ iter_count = ctx->iter_count;
+ /* Read ctx->iter_count before ctx->running. */
+ smb_rmb();
s/smb/smp
+ if (!ctx->running) {
+ event_notifier_set(&ctx->notifier);
+ return;
+ }
+ /* Read ctx->running before ctx->iter_count. */
+ smb_rmb();
s/smb/smp
+ /* ctx might have gone to sleep. */
+ } while (iter_count != ctx->iter_count);
}
Since both 'running' and 'iter_count' may be read lockless, something
like ACCESS_ONCE() should be used to avoid compiler optimization.