Re: [Qemu-devel] [regression] dataplane: throughput -40% by commit 580b6


From: Ming Lei
Subject: Re: [Qemu-devel] [regression] dataplane: throughput -40% by commit 580b6b2aa2
Date: Thu, 3 Jul 2014 19:50:43 +0800

On Thu, Jul 3, 2014 at 6:29 PM, Paolo Bonzini <address@hidden> wrote:
> On 03/07/2014 06:54, Ming Lei wrote:
>
>> On Thu, Jul 3, 2014 at 12:21 AM, Paolo Bonzini <address@hidden>
>> wrote:
>>>
>>> On 02/07/2014 17:45, Ming Lei wrote:
>>>>
>>>> The attached debug patch skips aio_notify() if qemu_bh_schedule()
>>>> is running from the current aio context, but it looks like there are
>>>> still 120K writes triggered. (Without the patch, 400K can be observed
>>>> in the same test.)
>>>
>>>
>>> Nice.  Another observation is that after aio_dispatch we'll always
>>> re-evaluate everything (bottom halves, file descriptors and timeouts),
>>
>>
>> The idea is very good.
>>
>> If aio_notify() is called from the 1st aio_dispatch() in aio_poll(),
>> ctx->notifier might need to be set, but that can be handled easily.
>
>
> Yes, you can just move the atomic_inc/atomic_dec in aio_poll.

If you mean moving the inc/dec of 'running' into aio_poll(), that won't work.
When aio_notify() sees 'running' set, it won't set the notifier, and
aio_poll() may then block in ppoll() without being woken up.

We can set a flag in aio_notify() when the notifier is bypassed, then
check that flag before ppoll() to decide whether we should block there.
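Something like this rough, untested sketch (the 'notify_skipped' field is
just a made-up name for illustration, and the barriers/atomics discussed
below are omitted for brevity):

    void aio_notify(AioContext *ctx)
    {
        if (!ctx->running) {
            event_notifier_set(&ctx->notifier);
        } else {
            /* remember that the notifier write was bypassed */
            ctx->notify_skipped = true;
        }
    }

    /* in aio_poll(), just before blocking in ppoll(): */
    if (ctx->notify_skipped) {
        ctx->notify_skipped = false;
        blocking = false;   /* work is already pending, don't sleep */
    }

That way a notification raised while dispatch is running is not lost, and
we still avoid the write() to the eventfd in the common case.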

>
>
>>> so we can skip the aio_notify if we're inside aio_dispatch.
>>>
>>> So what about this untested patch:
>>>
>>> diff --git a/aio-posix.c b/aio-posix.c
>>> index f921d4f..a23d85d 100644
>>> --- a/aio-posix.c
>>> +++ b/aio-posix.c
>>
>>
>> #include "qemu/atomic.h"
>>
>>> @@ -124,6 +124,9 @@ static bool aio_dispatch(AioContext *ctx)
>>>      AioHandler *node;
>>>      bool progress = false;
>>>
>>> +    /* No need to set the event notifier during aio_notify.  */
>>> +    ctx->running++;
>>> +
>>>      /*
>>>       * We have to walk very carefully in case qemu_aio_set_fd_handler is
>>>       * called while we're walking.
>>> @@ -169,6 +171,11 @@ static bool aio_dispatch(AioContext *ctx)
>>>      /* Run our timers */
>>>      progress |= timerlistgroup_run_timers(&ctx->tlg);
>>>
>>> +    smp_wmb();
>>> +    ctx->iter_count++;
>>> +    smp_wmb();
>>> +    ctx->running--;
>>> +
>>>      return progress;
>>>  }
>>>
>>> diff --git a/async.c b/async.c
>>> index 5b6fe6b..1f56afa 100644
>>> --- a/async.c
>>> +++ b/async.c
>>
>>
>> #include "qemu/atomic.h"
>>
>>> @@ -249,7 +249,19 @@ ThreadPool *aio_get_thread_pool(AioContext *ctx)
>>>
>>>  void aio_notify(AioContext *ctx)
>>>  {
>>> -    event_notifier_set(&ctx->notifier);
>>> +    uint32_t iter_count;
>>> +    do {
>>> +        iter_count = ctx->iter_count;
>>> +        /* Read ctx->iter_count before ctx->running.  */
>>> +        smb_rmb();
>>
>>
>> s/smb/smp
>>
>>> +        if (!ctx->running) {
>>> +            event_notifier_set(&ctx->notifier);
>>> +            return;
>>> +        }
>>> +        /* Read ctx->running before ctx->iter_count.  */
>>> +        smb_rmb();
>>
>>
>> s/smb/smp
>>
>>> +        /* ctx might have gone to sleep.  */
>>> +    } while (iter_count != ctx->iter_count);
>>>  }
>>
>>
>> Since both 'running' and 'iter_count' may be read locklessly, something
>> like ACCESS_ONCE() should be used to avoid compiler optimizations.
>
>
> No, smp_rmb() is enough to avoid them.  See also include/qemu/seqlock.h

Yes, there is ordering between the reads/writes of iter_count and running.
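
Putting the two sides of the patch above next to each other (with the
smp_rmb() spelling fixed) shows the seqlock-like pairing; this is just a
restatement of the quoted hunks, not new code:

    /* writer, in aio_dispatch(): */
    ctx->running++;
    /* ... run bottom halves, fd handlers and timers ... */
    smp_wmb();      /* order the dispatch work before iter_count++ */
    ctx->iter_count++;
    smp_wmb();      /* order iter_count++ before running-- */
    ctx->running--;

    /* reader, in aio_notify(): */
    do {
        iter_count = ctx->iter_count;
        smp_rmb();  /* read iter_count before running */
        if (!ctx->running) {
            event_notifier_set(&ctx->notifier);
            return;
        }
        smp_rmb();  /* read running before re-reading iter_count */
    } while (iter_count != ctx->iter_count);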

> The first access to ctx->iter_count _could_ be protected by ACCESS_ONCE(),
> which in QEMU we call atomic_read()/atomic_set(), but it's not necessary.
> See docs/atomics.txt for a description for QEMU's
> atomic access functions.
>
>
>> In my test it does decrease the number of write() calls significantly,
>> and I hope a formal version can be applied soon.
>
>
> Can you take care of that (you can add my Signed-off-by), since you have the
> best testing environment?  v5 of the plug/unplug series will be good to go,

OK, I will do it, include it in v5, and put your name as author with your
Signed-off-by.

BTW, I do not have the best testing environment; all plug/unplug related
tests are done on my laptop. :-)


Thanks,
-- 
Ming Lei


