qemu-devel

From: Paolo Bonzini
Subject: Re: [Qemu-devel] Block I/O outside the QEMU global mutex was "Re: [RFC PATCH 00/17] Support for multiple "AIO contexts""
Date: Tue, 09 Oct 2012 16:35:00 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:15.0) Gecko/20120911 Thunderbird/15.0.1

On 09/10/2012 16:24, Avi Kivity wrote:
> > But we are not Linux, and I think the tradeoffs are different for RCU in
> > Linux vs. QEMU.
> > 
> > For CPUs in the kernel, running user code is just one way to get things
> > done; QEMU threads are much more event driven, and their whole purpose
> > is to either run the guest or sleep, until "something happens" (VCPU
> > exit or readable fd).  In other words, QEMU threads should be able to
> > stay most of the time in KVM_RUN or select() for any workload (to some
> > approximation).
> 
> If you're streaming data (the saturated iothread from that other thread)
> or live migrating or have a block job with fast storage, this isn't
> necessarily true.  You could make sure each thread polls the rcu state
> periodically though.

Yep, that was the approximation part.

>> > Not just that: we do not need to minimize RCU critical sections, because
>> > anyway we want to minimize the time spent in QEMU, period.
>> > 
>> > So I believe that to some approximation, in QEMU we can completely
>> > ignore everything else, and behave as if threads were always under
>> > rcu_read_lock(), except if in KVM_RUN/select.  KVM_RUN and select are
>> > what Paul McKenney calls extended quiescent states, and in fact the
>> > following mapping works:
>> > 
>> >     rcu_extended_quiesce_start()     -> rcu_read_unlock();
>> >     rcu_extended_quiesce_end()       -> rcu_read_lock();
>> >     rcu_read_lock/unlock()           -> nop
>> > 
>> > This in turn means that dispatching inside the RCU critical section is
>> > not really bad.
> I believe you still cannot synchronize_rcu() while in an rcu critical
> section per the rcu documentation, even when lock/unlock map to nops.

Right.  In the userspace RCU library, synchronize_rcu() also calls
rcu_extended_quiesce_start/end() around the actual synchronization, so
that synchronize_rcu() does not wait for its own grace period.

Instead of a complete nop, rcu_read_lock/unlock() can just write to a
thread-local variable if you want to assert that synchronize_rcu() is
not called within a critical section.  Probably a good idea.
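
To make that concrete, here is a rough sketch of what I have in mind.
It is only an illustration: the two thread-local variables,
rcu_in_read_side() and wait_for_other_threads() are made-up names, not
existing QEMU or liburcu API, and a real implementation needs memory
barriers and a registry of reader threads, which I left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-thread state, for illustration only. */
static _Thread_local unsigned rcu_read_depth;        /* rcu_read_lock() nesting */
static _Thread_local bool rcu_thread_online = true;  /* false in KVM_RUN/select() */

/* Not a complete nop: only tracks nesting so that misuse can be asserted. */
static inline void rcu_read_lock(void)
{
    assert(rcu_thread_online);
    rcu_read_depth++;
}

static inline void rcu_read_unlock(void)
{
    assert(rcu_read_depth > 0);
    rcu_read_depth--;
}

static inline bool rcu_in_read_side(void)
{
    return rcu_read_depth > 0;
}

/* Call before KVM_RUN or select(): the thread holds no RCU references. */
static inline void rcu_extended_quiesce_start(void)
{
    assert(rcu_read_depth == 0);
    rcu_thread_online = false;
}

/* Call after KVM_RUN or select() returns. */
static inline void rcu_extended_quiesce_end(void)
{
    rcu_thread_online = true;
}

/* Placeholder: a real version waits until every other registered thread
 * has been seen in an extended quiescent state. */
static void wait_for_other_threads(void)
{
}

static void synchronize_rcu(void)
{
    /* Catches synchronize_rcu() inside a read-side critical section. */
    assert(rcu_read_depth == 0);

    /* Go quiescent ourselves so we do not wait for our own grace period. */
    rcu_extended_quiesce_start();
    wait_for_other_threads();
    rcu_extended_quiesce_end();
}
```

With this, the main loop would bracket select() (and the VCPU threads
KVM_RUN) with rcu_extended_quiesce_start/end(), and everything else
runs as if it were under rcu_read_lock().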

> Of course we can violate that and it wouldn't know a thing, but I prefer
> to stick to the established pattern.

I wasn't suggesting that, just evaluating the different tradeoffs QEMU
could make.  Reference counting is complicated because it has to apply
to all objects used as opaques, and we're using things other than the
DeviceState as opaques in many cases.

Paolo


