Subject: Re: [Qemu-devel] [RFC PATCH 2/3] raw-posix: Convert Linux AIO submission to coroutines
From: Markus Armbruster
Date: Fri, 28 Nov 2014 09:59:00 +0100
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/24.3 (gnu/linux)

Ming Lei <address@hidden> writes:
> On 11/28/14, Markus Armbruster <address@hidden> wrote:
>> Ming Lei <address@hidden> writes:
>>
>>> Hi Kevin,
>>>
>>> On Wed, Nov 26, 2014 at 10:46 PM, Kevin Wolf <address@hidden> wrote:
>>>> This improves the performance of requests because an ACB doesn't need to
>>>> be allocated on the heap any more. It also makes the code nicer and
>>>> smaller.
>>>
>>> I am not sure it is good way for linux aio optimization:
>>>
>>> - for raw image with some constraint, coroutine can be avoided since
>>> io_submit() won't sleep most of times
>>>
>>> - handling one time coroutine takes much time than handling malloc,
>>> memset and free on small buffer, following the test data:
>>>
>>> -- 241ns per coroutine
>>
>> What do you mean by "coroutine" here? Create + destroy? Yield?
>
> Please see perf_cost() in tests/test-coroutine.c

    static __attribute__((noinline)) void perf_cost_func(void *opaque)
    {
        qemu_coroutine_yield();
    }

    static void perf_cost(void)
    {
        const unsigned long maxcycles = 40000000;
        unsigned long i = 0;
        double duration;
        unsigned long ops;
        Coroutine *co;

        g_test_timer_start();
        while (i++ < maxcycles) {
            co = qemu_coroutine_create(perf_cost_func);
            qemu_coroutine_enter(co, &i);
            qemu_coroutine_enter(co, NULL);
        }
        duration = g_test_timer_elapsed();
        ops = (long)(maxcycles / (duration * 1000));

        g_test_message("Run operation %lu iterations %f s, %luK operations/s, "
                       "%luns per coroutine",
                       maxcycles,
                       duration, ops,
                       (unsigned long)(1000000000 * duration) / maxcycles);
    }

This tests create, enter, yield, reenter, terminate, destroy. The cost
of create + destroy may well dominate.

If we create and destroy coroutines for each AIO request, we're doing it
wrong. I doubt Kevin's doing it *that* wrong ;)

Anyway, let's benchmark the real code instead of putting undue trust in
tests/test-coroutine.c micro-benchmarks.
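
To make the create + destroy versus enter + yield distinction concrete, here is a rough standalone sketch. It does not use QEMU's coroutine code: it assumes Linux with glibc and builds a toy coroutine on <ucontext.h>, and every name in it (DemoCo, demo_*, run_*, ns_per_iter) is made up for illustration. glibc's swapcontext() also saves and restores the signal mask, so absolute numbers will not match QEMU's; only the relative shape of the two loops is of interest.

```c
#define _GNU_SOURCE             /* assumption: Linux/glibc, for ucontext + clock_gettime */
#include <assert.h>
#include <stdlib.h>
#include <time.h>
#include <ucontext.h>

#define DEMO_STACK_SIZE (64 * 1024)

/* Hypothetical toy coroutine, not QEMU's Coroutine type. */
typedef struct {
    ucontext_t caller;      /* where to return to on yield */
    ucontext_t self;        /* the coroutine's own context */
    void *stack;
    unsigned long yields;   /* how often the body has yielded */
    int stop;               /* ask the body to terminate */
} DemoCo;

static DemoCo *current;     /* coroutine being entered */

static void demo_yield(DemoCo *co)
{
    swapcontext(&co->self, &co->caller);
}

/* Body: yield once per entry until told to stop, then return. */
static void demo_body(void)
{
    DemoCo *co = current;
    while (!co->stop) {
        co->yields++;
        demo_yield(co);
    }
}

static DemoCo *demo_create(void)
{
    DemoCo *co = calloc(1, sizeof(*co));
    assert(co != NULL);
    co->stack = malloc(DEMO_STACK_SIZE);
    assert(co->stack != NULL);
    if (getcontext(&co->self) != 0) {
        abort();
    }
    co->self.uc_stack.ss_sp = co->stack;
    co->self.uc_stack.ss_size = DEMO_STACK_SIZE;
    co->self.uc_link = &co->caller;  /* resume the caller when the body returns */
    makecontext(&co->self, demo_body, 0);
    return co;
}

static void demo_enter(DemoCo *co)
{
    current = co;
    swapcontext(&co->caller, &co->self);
}

static void demo_destroy(DemoCo *co)
{
    free(co->stack);
    free(co);
}

/* Same shape as perf_cost(): create, enter, reenter to terminate, destroy. */
unsigned long run_create_per_request(unsigned long n)
{
    unsigned long total = 0;
    for (unsigned long i = 0; i < n; i++) {
        DemoCo *co = demo_create();
        demo_enter(co);          /* runs until the first yield */
        co->stop = 1;
        demo_enter(co);          /* body sees stop and terminates */
        total += co->yields;
        demo_destroy(co);
    }
    return total;
}

/* One long-lived coroutine: only enter + yield per iteration. */
unsigned long run_reuse(unsigned long n)
{
    DemoCo *co = demo_create();
    for (unsigned long i = 0; i < n; i++) {
        demo_enter(co);
    }
    unsigned long total = co->yields;
    co->stop = 1;
    demo_enter(co);              /* let the body terminate before freeing */
    demo_destroy(co);
    return total;
}

/* Wall-clock nanoseconds per iteration for one of the loops above. */
double ns_per_iter(unsigned long (*bench)(unsigned long), unsigned long n)
{
    struct timespec a, b;
    clock_gettime(CLOCK_MONOTONIC, &a);
    unsigned long done = bench(n);
    clock_gettime(CLOCK_MONOTONIC, &b);
    assert(done == n);
    return ((b.tv_sec - a.tv_sec) * 1e9 + (b.tv_nsec - a.tv_nsec)) / n;
}
```

Calling ns_per_iter(run_create_per_request, n) and ns_per_iter(run_reuse, n) from a main() and comparing the two figures gives a feel for how much of a "per coroutine" number is really setup and teardown. It is still a micro-benchmark and no substitute for measuring the real code.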