From: Avi Kivity
Subject: Re: [Qemu-devel] Block I/O outside the QEMU global mutex was "Re: [RFC PATCH 00/17] Support for multiple "AIO contexts""
Date: Tue, 09 Oct 2012 13:55:41 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:15.0) Gecko/20120911 Thunderbird/15.0.1

On 10/09/2012 01:08 PM, Paolo Bonzini wrote:
>> 
>> That's not strictly a coroutine issue.  Switching to ordinary threads
>> may make the problem worse, since there will clearly be contention.
> 
> The point is you don't need either coroutines or userspace threads if
> you use native AIO.  longjmp/setjmp is probably a smaller overhead
> compared to the many syscalls involved in poll+eventfd
> reads+io_submit+io_getevents, but it's also not cheap.  Also, if you
> process AIO in batches you risk overflowing the pool of free coroutines,
> which gets expensive real fast (allocate/free the stack, do the
> expensive getcontext/swapcontext instead of the cheaper longjmp/setjmp,
> etc.).
> 
> It seems better to sidestep the issue completely, it's a small amount of
> work.

Oh, I agree 100%: raw + native AIO wants to bypass coroutines/threads
completely.
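
To make that sequence concrete, here is a minimal standalone sketch (not
QEMU code) of the path Paolo describes: poll on an eventfd, read it,
io_submit, io_getevents, using libaio.  The file name, buffer size, queue
depth and the missing error handling are all placeholders.

#define _GNU_SOURCE             /* for O_DIRECT */
#include <libaio.h>
#include <sys/eventfd.h>
#include <poll.h>
#include <fcntl.h>
#include <unistd.h>
#include <stdint.h>
#include <stdlib.h>

int main(void)
{
    io_context_t ctx = 0;
    io_queue_init(128, &ctx);              /* io_setup() under the hood */

    int efd = eventfd(0, 0);
    int fd = open("disk.img", O_RDONLY | O_DIRECT);

    void *buf;
    posix_memalign(&buf, 512, 4096);       /* O_DIRECT wants aligned buffers */

    struct iocb cb, *cbs[1] = { &cb };
    io_prep_pread(&cb, fd, buf, 4096, 0);
    io_set_eventfd(&cb, efd);              /* completion signalled via eventfd */
    io_submit(ctx, 1, cbs);                /* one syscall per batch of requests */

    struct pollfd pfd = { .fd = efd, .events = POLLIN };
    poll(&pfd, 1, -1);                     /* wait for the completion notification */

    uint64_t count;
    read(efd, &count, sizeof(count));      /* drain the eventfd counter */

    struct io_event events[1];
    io_getevents(ctx, 1, 1, events, NULL); /* reap the completed request */

    io_queue_release(ctx);
    close(fd);
    close(efd);
    return 0;
}

No coroutine or thread switch appears anywhere in that loop; the cost is
purely the syscalls themselves.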

>> What is the I/O processing time we have?  If it's say 10 microseconds,
>> then we'll have 100,000 context switches per second assuming a device
>> lock and a saturated iothread (split into multiple threads).
> 
> Hopefully with a saturated dedicated iothread you would not have any
> context switches and a single CPU will be just dedicated to virtio
> processing.

I meant, if you break that saturated thread into multiple threads (in
order to break the 1 core limit), then you start to context switch badly.

> 
>> The coroutine work may have laid the groundwork for fine-grained
>> locking.  I'm doubtful we should use qcow when we want >100K IOPS though.
> 
> Yep.  Going away from coroutines is a solution in search of a problem,
> it will introduce several new variables (kernel scheduling, more
> expensive lock contention, starving the thread pool with locked threads,
> ...), all for a case where performance hardly matters.
> 
>>>>> I'm also curious about virtqueue_pop()/virtqueue_push() outside the
>>>>> QEMU mutex although that might be blocked by the current work around
>>>>> MMIO/PIO dispatch outside the global mutex.
>>>>
>>>> It is, yes.
>>>
>>> It should only require unlocked memory map/unmap, not MMIO dispatch.
>>> The MMIO/PIO bits are taken care of by ioeventfd.
>> 
>> The ring, or indirect descriptors, or the data, can all be on mmio.
>> IIRC the virtio spec forbids that, but the APIs have to be general.  We
>> don't have cpu_physical_memory_map_nommio() (or
>> address_space_map_nommio(), as soon as the coding style committee
>> ratifies struct literals).
> 
> cpu_physical_memory_map could still take the QEMU lock in the slow
> bounce-buffer case.  

You're right.  In fact this is a good opportunity to introduce lockless
lookups where the only optimized path is RAM -- ioeventfd provides a
lockless lookup of its own.
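
As a rough illustration (not QEMU code; the names RamBlockView and
guest_map are invented for this sketch), the split could look like the
following: an unlocked fast path that only resolves guest RAM, and a slow
path that takes the big lock for everything else (MMIO, bounce buffers).
The RAM snapshot is assumed stable for as long as the device thread runs,
which is exactly the hotunplug question below.

#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint64_t gpa_start;      /* guest-physical start of a RAM block */
    uint64_t size;
    uint8_t *host;           /* host virtual mapping of that block */
} RamBlockView;

static RamBlockView ram_view[16];   /* read-mostly snapshot of guest RAM */
static size_t ram_view_len;
static pthread_mutex_t global_mutex = PTHREAD_MUTEX_INITIALIZER;

/* Fast path: no locking, RAM only.  Returns NULL for anything not in RAM. */
static void *map_ram_only(uint64_t gpa, uint64_t len)
{
    for (size_t i = 0; i < ram_view_len; i++) {
        RamBlockView *rb = &ram_view[i];
        if (gpa >= rb->gpa_start && gpa + len <= rb->gpa_start + rb->size) {
            return rb->host + (gpa - rb->gpa_start);
        }
    }
    return NULL;
}

/* Slow path: take the big lock; bounce-buffer/MMIO handling elided. */
static void *map_slow(uint64_t gpa, uint64_t len)
{
    pthread_mutex_lock(&global_mutex);
    void *p = NULL;          /* ... bounce buffer or MMIO access here ... */
    pthread_mutex_unlock(&global_mutex);
    return p;
}

void *guest_map(uint64_t gpa, uint64_t len)
{
    void *p = map_ram_only(gpa, len);
    return p ? p : map_slow(gpa, len);
}

int main(void)
{
    static uint8_t backing[4096];
    ram_view[0] = (RamBlockView){ .gpa_start = 0x1000,
                                  .size = sizeof(backing),
                                  .host = backing };
    ram_view_len = 1;
    return guest_map(0x1000, 64) ? 0 : 1;   /* hits the lock-free RAM path */
}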

We could perhaps even avoid refcounting by shutting down the device
thread as part of hotunplug.

[could we also avoid refcounting by doing the equivalent of
stop_machine() during hotunplug?]

> BTW the block layer has been using struct literals
> for a long time and we're just as happy as you are about them. :)

So do upstream memory.c and the json tests.

-- 
error compiling committee.c: too many arguments to function
