From: Ming Lei
Subject: Re: [Qemu-devel] [PATCH v1 00/17] dataplane: optimization and multi virtqueue support
Date: Tue, 12 Aug 2014 16:12:03 +0800

On Tue, Aug 12, 2014 at 3:37 AM, Paolo Bonzini <address@hidden> wrote:
> On 10/08/2014 05:46, Ming Lei wrote:
>> Hi Kevin, Paolo, Stefan and all,
>>
>>
>> On Wed, 6 Aug 2014 10:48:55 +0200
>> Kevin Wolf <address@hidden> wrote:
>>
>>> On 06.08.2014 at 07:33, Ming Lei wrote:
>>
>>>
>>> Anyhow, the coroutine version of your benchmark is buggy: it leaks all
>>> coroutines instead of exiting them, so it can't make any use of the
>>> coroutine pool. On my laptop, I get this (where "fixed coro" is a
>>> version that simply removes the yield at the end):
>>>
>>>                 | bypass        | fixed coro    | buggy coro
>>> ----------------+---------------+---------------+--------------
>>> time            | 1.09s         | 1.10s         | 1.62s
>>> L1-dcache-loads | 921,836,360   | 932,781,747   | 1,298,067,438
>>> insns per cycle | 2.39          | 2.39          | 1.90
>>>
>>> This begs the question of whether you see a similar effect on a real
>>> qemu because the coroutine pool is still not big enough. With correct
>>> use of coroutines, the difference seems barely measurable even without
>>> any I/O involved.
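
To make that bug concrete, here is a minimal sketch of the kind of leak
Kevin means. The structure is illustrative only, not the actual benchmark
code; the qemu_coroutine_* calls follow the current API of the tree:

    #include "qemu/coroutine.h"

    static void coroutine_fn bench_entry(void *opaque)
    {
        /* ... per-iteration work ... */

        /* Buggy version: this trailing yield leaves the coroutine
         * suspended forever, so it is leaked and never returned to
         * the coroutine pool. */
        qemu_coroutine_yield();

        /* Fixed version: drop the yield and simply return, so the
         * terminated coroutine can be recycled by the pool on the
         * next qemu_coroutine_create(). */
    }

    static void one_iteration(void)
    {
        Coroutine *co = qemu_coroutine_create(bench_entry);
        qemu_coroutine_enter(co, NULL);
        /* With the trailing yield, co is still alive here, so every
         * iteration allocates a fresh coroutine and stack. */
    }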
>>
>> Now I have fixed the coroutine leak bug. The previous crypt benchmark
>> carried a rather heavy per-iteration load, which kept the operation
>> rate very low (~40K/sec), so I wrote a new and simpler one that
>> generates hundreds of thousands of operations per second; that rate
>> should match some fast storage devices, and it does show a non-trivial
>> effect from coroutines.
>>
>> In the extreme case where each iteration runs just a getppid() syscall,
>> only 3M operations/sec can be achieved with coroutines, while without
>> coroutines the number reaches 16M/sec: more than a 4x difference!
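
For concreteness, the loop is essentially the following; this is a
reconstruction rather than the exact benchmark code, and only getppid()
and the qemu_coroutine_* calls are the real APIs:

    #include <unistd.h>
    #include "qemu/coroutine.h"

    static void coroutine_fn co_entry(void *opaque)
    {
        getppid();                  /* the only work per iteration */
    }

    /* Coroutine path: one create/enter/terminate cycle per operation,
     * which is where the ~3M operations/sec figure comes from. */
    static void bench_coroutine(unsigned long n)
    {
        while (n--) {
            Coroutine *co = qemu_coroutine_create(co_entry);
            qemu_coroutine_enter(co, NULL);
        }
    }

    /* Direct path: a plain syscall per iteration, ~16M operations/sec. */
    static void bench_direct(unsigned long n)
    {
        while (n--) {
            getppid();
        }
    }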
>
> I should be on vacation, but I'm following a couple of threads on the
> mailing list and I'm a bit tired of hearing the same argument again
> and again...

I am sorry to interrupt your vacation and tire you, but this discussion
isn't simply repeating itself; something new comes up every time, or at
least most of the time.

>
> The different characteristics of asynchronous I/O vs. any synchronous workload
> are such that it is hard to be sure that microbenchmarks make sense.

I don't think it is related to asynchronous vs. synchronous I/O. There is
no sleep (or wait for completion) at all, and we can treat the benchmark
as AIO by considering the completion a nop in this case (AIO model:
submit and complete).

IMO the getppid() benchmark is a simple simulation of
bdrv_aio_readv/writev() with I/O plug/unplug, as far as coroutine usage
is concerned.
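
To spell out the analogy: each benchmark iteration stands in for one
submission whose completion is a nop. A rough sketch follows, where the
function and its parameters are placeholders, while bdrv_io_plug(),
bdrv_io_unplug() and bdrv_aio_readv() are the real block-layer API:

    #include "block/block.h"

    /* Completion treated as a nop: the pure submit-and-complete model. */
    static void nop_complete(void *opaque, int ret)
    {
    }

    /* One plugged batch of AIO submissions; bs, qiov and the sector
     * values are placeholders, not taken from a real benchmark. */
    static void submit_batch(BlockDriverState *bs, QEMUIOVector *qiov,
                             int64_t sector, int nb_sectors, int batch)
    {
        int i;

        bdrv_io_plug(bs);
        for (i = 0; i < batch; i++) {
            bdrv_aio_readv(bs, sector, qiov, nb_sectors,
                           nop_complete, NULL);
        }
        bdrv_io_unplug(bs);         /* flush the queued requests */
    }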

BTW, do you agree with the computation of the coroutine cost in my
previous mail? I don't think that computation depends on the I/O type.

>
> The patch below is basically the minimal change to bypass coroutines.
> Of course the block.c part is not acceptable as is (the change to
> refresh_total_sectors is broken, the others are just ugly), but it is
> a start. Please run it with your fio workloads, or write an aio-based
> version of a qemu-img/qemu-io *I/O* benchmark.

Could you explain why the new change was introduced?

I will hold off on it until we can agree on the coroutine cost
computation, because that is very important for this discussion.

Thank you again for taking the time for this discussion.

Thanks,


