qemu-devel

Re: [Qemu-devel] [regression] dataplane: throughout -40% by commit 580b6


From: Ming Lei
Subject: Re: [Qemu-devel] [regression] dataplane: throughout -40% by commit 580b6b2aa2
Date: Thu, 3 Jul 2014 00:13:28 +0800

On Wed, Jul 2, 2014 at 11:45 PM, Ming Lei <address@hidden> wrote:
> On Wed, Jul 2, 2014 at 4:54 PM, Stefan Hajnoczi <address@hidden> wrote:
>> On Tue, Jul 01, 2014 at 06:49:30PM +0200, Paolo Bonzini wrote:
>>> On 01/07/2014 16:49, Ming Lei wrote:
>>> >Let me provide some data when running randread(bs 4k, libaio)
>>> >from VM for 10sec:
>>> >
>>> >1), qemu.git/master
>>> >- write(): 731K
>>> >- rt_sigprocmask(): 417K
>>> >- read(): 21K
>>> >- ppoll(): 10K
>>> >- io_submit(): 5K
>>> >- io_getevents(): 4K
>>> >
>>> >2), qemu 2.0
>>> >- write(): 9K
>>> >- read(): 28K
>>> >- ppoll(): 16K
>>> >- io_submit(): 12K
>>> >- io_getevents(): 10K
>>> >
>>> >>> The sigprocmask can probably be optimized away since the thread's
>>> >>> signal mask remains unchanged most of the time.
>>> >>>
>>> >>> I'm not sure what is causing the write().
>>> >I am investigating it...
>>>
>>> I would guess sigprocmask is getcontext (from qemu_coroutine_new) and write
>>> is aio_notify (from qemu_bh_schedule).
>>
>> Aha!  We shouldn't be executing qemu_coroutine_new() very often since we
>> try to keep a freelist of coroutines.
>>
>> I think a tweak to the freelist could make the rt_sigprocmask() calls go
>> away since we should be reusing coroutines instead of allocating/freeing
>> them all the time.
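
A minimal sketch of the freelist idea, for illustration only. `Co`, `co_get()`, `co_put()` and `POOL_MAX` are hypothetical names, not QEMU's coroutine API, and the slow path merely stands in for the getcontext() work that is guessed above to be the source of the rt_sigprocmask() calls. The sketch is single-threaded; a pool shared between threads would need locking or a per-thread list.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a coroutine object; not QEMU's Coroutine. */
typedef struct Co {
    struct Co *next;   /* link used only while the object sits in the pool */
} Co;

#define POOL_MAX 64    /* assumed cap; size it to the expected queue depth */

static Co *pool_head;
static unsigned pool_size;
static unsigned long expensive_creates;  /* how often the slow path ran */

/* Slow path: stands in for the context/stack setup that is assumed
 * to be the expensive part of creating a coroutine. */
static Co *co_create_slow(void)
{
    expensive_creates++;
    return calloc(1, sizeof(Co));
}

/* Take a recycled object if one is available, else fall back. */
static Co *co_get(void)
{
    Co *co = pool_head;
    if (co) {
        pool_head = co->next;
        pool_size--;
        co->next = NULL;
        return co;
    }
    return co_create_slow();
}

/* Return an object to the pool; free it only when the pool is full. */
static void co_put(Co *co)
{
    assert(co != NULL);
    if (pool_size < POOL_MAX) {
        co->next = pool_head;
        pool_head = co;
        pool_size++;
    } else {
        free(co);
    }
}
```

As long as the number of in-flight requests stays below the cap, every co_get() after warm-up is served from the pool and the slow path is not entered again.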
>>
>>> Both can be eliminated by introducing a fast path in bdrv_aio_{read,write}v,
>>> that bypasses coroutines in the common case of no I/O throttling, no
>>> copy-on-write, etc.
>>
>> I tried that in 2012 and couldn't measure an improvement above the noise
>> threshold, although it was without dataplane.
>>
>> BTW, we cannot eliminate the BH because the block layer guarantees that
>> callbacks are not invoked with reentrancy.  They are always invoked
>> directly from the event loop through a BH.  This simplifies callers
>> since they don't need to worry about callbacks happening while they are
>> still in bdrv_aio_readv(), for example.
>>
>> Removing this guarantee (by making callers safe first) is orthogonal to
>> coroutines.  But it's hard to do since it requires auditing a lot of
>> code.
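
To make the no-reentrancy guarantee concrete, here is a small sketch under stated assumptions. `submit_request()`, `run_deferred()` and the fixed-size queue are hypothetical and much simpler than QEMU's BH machinery. The point is only that the completion callback never runs inside the submit call, even when the request could complete immediately; it runs on the next pass of the event loop.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef void CompletionFn(void *opaque, int ret);

typedef struct Deferred {
    CompletionFn *cb;
    void *opaque;
    int ret;
} Deferred;

#define MAX_DEFERRED 16          /* assumed small fixed queue for the sketch */
static Deferred deferred[MAX_DEFERRED];
static size_t n_deferred;

/* Submit path: queue the callback instead of calling it directly,
 * so the caller never sees its callback run before submit returns. */
static int submit_request(CompletionFn *cb, void *opaque)
{
    if (n_deferred == MAX_DEFERRED) {
        return -1;
    }
    deferred[n_deferred++] = (Deferred){ cb, opaque, 0 };
    return 0;
}

/* One event-loop pass: run what was queued before this pass started.
 * Callbacks queued by a callback wait for the next pass. */
static size_t run_deferred(void)
{
    Deferred batch[MAX_DEFERRED];
    size_t n = n_deferred;

    memcpy(batch, deferred, n * sizeof(batch[0]));
    n_deferred = 0;
    for (size_t i = 0; i < n; i++) {
        batch[i].cb(batch[i].opaque, batch[i].ret);
    }
    return n;
}

/* Example callback: counts completions through the opaque pointer. */
static void count_completion(void *opaque, int ret)
{
    (void)ret;
    (*(int *)opaque)++;
}
```

The cost of this guarantee is the extra trip through the event loop, which is where the wakeup write() discussed in this thread comes from.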
>>
>> Another idea is to skip aio_notify() when we're sure the event loop
>> isn't blocked in g_poll().  Doing this in a thread-safe and lockless way
>> might be tricky though.
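
One possible shape for such a lockless check, sketched with C11 atomics. All names here (`Loop`, `loop_schedule()`, `loop_prepare_to_block()`, and so on) are hypothetical, and the `wakeups` counter only stands in for the write() that a real notifier would issue. The loop thread announces that it is about to block and then re-checks for work; a producer publishes work and then checks the announcement. With sequentially consistent atomics at least one side observes the other's store, so a wakeup is not lost.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

typedef struct Loop {
    atomic_bool work_pending;    /* set by producers, cleared by the loop */
    atomic_bool about_to_block;  /* set by the loop thread around poll */
    atomic_ulong wakeups;        /* stand-in for the notifier write() */
} Loop;

static void loop_init(Loop *l)
{
    atomic_init(&l->work_pending, false);
    atomic_init(&l->about_to_block, false);
    atomic_init(&l->wakeups, 0);
}

/* Producer side: publish the work first, then check whether the
 * loop might be sleeping. Skip the wakeup if it is not. */
static void loop_schedule(Loop *l)
{
    atomic_store(&l->work_pending, true);
    if (atomic_load(&l->about_to_block)) {
        atomic_fetch_add(&l->wakeups, 1);
    }
}

/* Loop side: returns true if it is safe to block in poll, false if
 * work arrived in the meantime and the loop should run it instead. */
static bool loop_prepare_to_block(Loop *l)
{
    atomic_store(&l->about_to_block, true);
    if (atomic_load(&l->work_pending)) {
        atomic_store(&l->about_to_block, false);
        return false;
    }
    return true;
}

/* Loop side: call after poll returns. */
static void loop_after_poll(Loop *l)
{
    atomic_store(&l->about_to_block, false);
}

/* Loop side: consume the pending flag; true if there was work. */
static bool loop_take_work(Loop *l)
{
    return atomic_exchange(&l->work_pending, false);
}
```

The default sequentially consistent ordering is what makes the store-then-load pattern on both sides safe; weakening it to acquire/release would allow both sides to miss each other.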
>
> The attached debug patch skips aio_notify() if qemu_bh_schedule()
> is called from the current AIO context, but it looks like about 120K
> writes are still triggered. (Without the patch, 400K can be observed
> in the same test.)
>
> So are there other writes in the path that have not been found yet?

Those must be the writes that generate guest IRQs, which could easily
have been processed as a batch.
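
A rough sketch of what batching could look like, assuming the remaining writes are one notification per completed request. `Notifier`, `complete_request()` and `flush_notify()` are hypothetical names, and the `signals` counter stands in for the actual write():

```c
#include <assert.h>
#include <stdbool.h>

typedef struct Notifier {
    bool pending;            /* at least one completion since last signal */
    unsigned long signals;   /* stand-in for the per-IRQ write() */
} Notifier;

/* Record a completion without signalling the guest yet. */
static void complete_request(Notifier *n)
{
    n->pending = true;
}

/* Signal once for everything completed since the last flush. */
static void flush_notify(Notifier *n)
{
    if (n->pending) {
        n->pending = false;
        n->signals++;
    }
}

/* Handle a batch of completions, then notify a single time. */
static void process_completions(Notifier *n, unsigned count)
{
    for (unsigned i = 0; i < count; i++) {
        complete_request(n);
    }
    flush_notify(n);
}
```

With this shape, a batch of 32 completions costs one signal instead of 32, and an empty batch costs none.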


Thanks,
-- 
Ming Lei


