Re: [Qemu-devel] exec: Safe work in quiescent state


From: Sergey Fedorov
Subject: Re: [Qemu-devel] exec: Safe work in quiescent state
Date: Wed, 15 Jun 2016 23:05:03 +0300
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.8.0

On 15/06/16 18:25, alvise rigo wrote:
> On Wed, Jun 15, 2016 at 4:51 PM, Alex Bennée <address@hidden> wrote:
>> alvise rigo <address@hidden> writes:
>>> On Wed, Jun 15, 2016 at 2:59 PM, Sergey Fedorov <address@hidden> wrote:
>>>> On 10/06/16 00:51, Sergey Fedorov wrote:
>>>>> For certain kinds of tasks we might need a quiescent state to perform an
>>>>> operation safely. Quiescent state means no CPU thread executing, and
>>>>> probably BQL held as well. The tasks could include:
>> <snip>
>>>> Alvise's async_wait_run_on_cpu() [3]:
>>>> - uses the same queue as async_run_on_cpu();
>>>> - the CPU that requested the job is recorded in qemu_work_item;
>>>> - each CPU has a counter of such jobs it has requested;
>>>> - the counter is decremented upon job completion;
>>>> - only the target CPU is forced to exit the execution loop, i.e. the job
>>>> is not run in quiescent state;
>>> async_wait_run_on_cpu() kicks the target VCPU before calling
>>> cpu_exit() on the current VCPU, so all the VCPUs are forced to exit.
>>> Moreover, the current VCPU waits for all the tasks to be completed.
>> The effect of qemu_cpu_kick() for TCG is effectively just doing a
>> cpu_exit() anyway. Once done, any TCG code will exit on its next
>> intra-block transition.

I just meant that async_wait_run_on_cpu() does not stop all the
CPUs: it only affects the current CPU and the target CPU. So this
mechanism cannot be used for tb_flush().

>> <snip>
>>>> Distilling the requirements, the safe work mechanism should:
>>>> - support both system and user-mode emulation;
>>>> - allow scheduling an asynchronous operation to be performed outside
>>>> the CPU execution loop;
>>>> - guarantee that all CPUs are out of the execution loop before the
>>>> operation can begin;
>>> This requirement is probably not necessary if we need to query TLB
>>> flushes to other VCPUs, since every VCPU will flush its own TLB.
>>> For this reason we probably need two mechanisms:
>>> - The first allows a VCPU to query a job to all the others and wait
>>> for all of them to be done (like for global TLB flush)
>> Do we need to wait?
> Yes, otherwise the instruction (like MCR, which can perform TLB
> invalidation) is not completely emulated before executing the
> following one.

I think I need to specify this in the requirements: the CPU which
requested an asynchronous safe operation must exit its execution loop at
the end of the current TB and wait for operation completion. Then a guest
cross-CPU TLB invalidation instruction can force the end of the TB to
ensure that no further instructions get executed until the flush is
complete.

> It is also required during LL emulation, since it avoids possible race
> conditions.

As was pointed out in [1], LL can be implemented using such a "safe work
in quiescent state" mechanism.

[1] http://thread.gmane.org/gmane.comp.emulators.qemu/413978/focus=418664


>>> - The second allows a VCPU to perform a task in quiescent state i.e.
>>> the task starts and finishes when all VCPUs are out of the execution
>>> loop (translation buffer flush)
>> If you really want to ensure everything is done then you can exit the
>> block early. To get the sort of dsb() flush semantics mentioned you
>> simply:
>>
>>   - queue your async safe work
>>   - exit block on dsb()
>>
>>   This ensures by the time the TCG thread restarts for the next
>>   instruction all pending work has been flushed.

Indeed, if we kick the CPU which requested the job and just end the TB
at the DSB instruction, then the CPU will see the exit request and leave
its execution loop to wait for operation completion.
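
To illustrate why ending the TB at DSB matters, here is a toy
single-CPU model in C. It is only a sketch under stated assumptions and
not QEMU code: the instruction set (INSN_TLBI, INSN_DSB, INSN_B), the
function count_stale_insns() and the dsb_ends_tb switch are all
hypothetical names invented for this example. The model assumes that the
exit request is only noticed between TBs, and it counts the instructions
that execute after a DSB while a flush queued before that DSB has still
not run.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy guest instructions: B (branch) always ends a TB. */
enum insn { INSN_NOP, INSN_TLBI, INSN_DSB, INSN_B, INSN_END };

static const enum insn demo_prog[] = {
    INSN_TLBI, INSN_DSB, INSN_NOP, INSN_NOP, INSN_B, INSN_NOP, INSN_END
};

/*
 * Run prog on one toy CPU and count the instructions executed after a DSB
 * while a flush queued before that DSB had still not run.
 */
int count_stale_insns(const enum insn *prog, bool dsb_ends_tb)
{
    int pending = 0;            /* queued flushes that have not run yet */
    int stale = 0;
    bool exit_request = false;  /* set when safe work is queued */
    bool dsb_passed = false;    /* a DSB executed with a flush pending */
    size_t pc = 0;

    while (prog[pc] != INSN_END) {
        /* Execute one TB; exit_request is only checked between TBs. */
        bool end_of_tb = false;
        while (!end_of_tb && prog[pc] != INSN_END) {
            enum insn insn = prog[pc++];
            if (dsb_passed) {
                stale++;
            }
            switch (insn) {
            case INSN_TLBI:     /* queue the flush as async safe work */
                pending++;
                exit_request = true;
                break;
            case INSN_DSB:
                dsb_passed = dsb_passed || pending > 0;
                end_of_tb = dsb_ends_tb;
                break;
            case INSN_B:
                end_of_tb = true;
                break;
            default:
                break;
            }
        }
        /* TB boundary: leave the loop and let the queued work run. */
        if (exit_request) {
            pending = 0;
            exit_request = false;
            dsb_passed = false;
        }
    }
    return stale;
}
```

With demo_prog, count_stale_insns(demo_prog, true) returns 0, whereas
count_stale_insns(demo_prog, false) returns 3: the two NOPs and the
branch that follow the DSB run before the flush does.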

>>> Does this make sense?
>> I think we want one way of doing things for anything that is Cross CPU
>> and requires a degree of synchronisation. If it ends up being too
>> expensive then we can look at more efficient special casing solutions.
> OK, I agree that we should start with an approach that fits the two use cases.

So, refining the requirements, the safe work mechanism should:
- support both system and user-mode emulation;
- allow scheduling an asynchronous operation to be performed outside the
CPU execution loop;
- force all CPUs to exit the execution loop at the end of the currently
executing TB once an operation is scheduled;
- guarantee that all CPUs are out of the execution loop before the
operation can begin;
- guarantee that no CPU enters the execution loop until all the scheduled
operations are complete.
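
A minimal sketch of how the last four requirements could fit together,
using plain pthreads and C11 atomics. This is an illustration under
assumptions, not a proposed patch: schedule_safe_work(),
cpu_exec_start(), cpu_exec_end(), the global exit_request flag and the
fixed-size queue are hypothetical names chosen for the example, and the
"CPUs" are ordinary threads whose "TBs" are empty loop iterations. The
system versus user-mode distinction is not modelled. Operations run
with the lock held, so in this sketch an operation must not schedule
further work.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

#define NR_CPUS  4
#define MAX_WORK 16
typedef void (*safe_work_fn)(void *opaque);
struct safe_work { safe_work_fn fn; void *opaque; };

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t resume = PTHREAD_COND_INITIALIZER;
static struct safe_work queue[MAX_WORK];
static int pending;               /* queued operations (under lock) */
static int cpus_in_loop;          /* CPUs inside the loop (under lock) */
static atomic_bool exit_request;  /* polled by every CPU between TBs */

/* Run the queued operations once no CPU is executing. Caller holds lock. */
static void run_safe_work_locked(void)
{
    if (cpus_in_loop != 0 || pending == 0)
        return;
    for (int i = 0; i < pending; i++)
        queue[i].fn(queue[i].opaque);
    pending = 0;
    atomic_store(&exit_request, false);
    pthread_cond_broadcast(&resume);   /* let waiting CPUs re-enter */
}

/* Asynchronous: queue fn and ask all CPUs to leave the execution loop. */
bool schedule_safe_work(safe_work_fn fn, void *opaque)
{
    pthread_mutex_lock(&lock);
    bool ok = pending < MAX_WORK;
    if (ok) {
        queue[pending++] = (struct safe_work){ fn, opaque };
        atomic_store(&exit_request, true);
        run_safe_work_locked();        /* runs now only if no CPU executes */
    }
    pthread_mutex_unlock(&lock);
    return ok;
}

/* Entering the loop blocks until every scheduled operation is complete. */
void cpu_exec_start(void)
{
    pthread_mutex_lock(&lock);
    while (pending > 0)
        pthread_cond_wait(&resume, &lock);
    cpus_in_loop++;
    pthread_mutex_unlock(&lock);
}

/* The last CPU to leave the loop runs the queued operations. */
void cpu_exec_end(void)
{
    pthread_mutex_lock(&lock);
    cpus_in_loop--;
    run_safe_work_locked();
    pthread_mutex_unlock(&lock);
}

/* Demo only: an independent count of CPUs currently "executing TBs". */
static atomic_int executing, works_run;
static int requester_tag;

static void check_quiescent(void *opaque)
{
    (void)opaque;
    assert(atomic_load(&executing) == 0);   /* nobody is in the loop */
    atomic_fetch_add(&works_run, 1);
}

static void *cpu_thread(void *arg)
{
    for (int round = 0; round < 200; round++) {
        cpu_exec_start();
        atomic_fetch_add(&executing, 1);
        for (int tb = 0; tb < 50 && !atomic_load(&exit_request); tb++) {
            if (arg != NULL && tb == 10)    /* only the requester CPU */
                schedule_safe_work(check_quiescent, NULL);
        }
        atomic_fetch_sub(&executing, 1);
        cpu_exec_end();
    }
    return NULL;
}

/* Starts NR_CPUS threads; returns the number of safe operations run. */
int run_demo(void)
{
    pthread_t th[NR_CPUS];
    for (int i = 0; i < NR_CPUS; i++)
        pthread_create(&th[i], NULL, cpu_thread, i == 0 ? &requester_tag : NULL);
    for (int i = 0; i < NR_CPUS; i++)
        pthread_join(th[i], NULL);
    return atomic_load(&works_run);
}
```

In this sketch the requesting CPU is treated like any other: it sees
exit_request at the next TB boundary, leaves the loop, and cannot
re-enter until the queue is drained. Letting the last CPU out run the
work is just one possible choice; a dedicated thread could do it
equally well.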

Kind regards,
Sergey


