Re: [PATCH v11 02/14] accel: collecting TB execution count
From: Wu, Fei
Subject: Re: [PATCH v11 02/14] accel: collecting TB execution count
Date: Mon, 8 May 2023 18:02:20 +0800
User-agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:102.0) Gecko/20100101 Thunderbird/102.10.0

On 5/3/2023 4:28 PM, Richard Henderson wrote:
> On 4/21/23 14:24, Fei Wu wrote:
>> From: "Vanderson M. do Rosario" <vandersonmr2@gmail.com>
>>
>> If a TB has a TBS (TBStatistics) with the TB_EXEC_STATS
>> enabled, then we instrument the start code of this TB
>> to atomically count the number of times it is executed.
>> We count both the number of "normal" executions and atomic
>> executions of a TB.
>>
>> The execution count of the TB is stored in its respective
>> TBS.
>>
>> All TBStatistics are created by default with the flags from
>> default_tbstats_flag.
>>
>> Signed-off-by: Vanderson M. do Rosario <vandersonmr2@gmail.com>
>> Message-Id: <20190829173437.5926-3-vandersonmr2@gmail.com>
>> [AJB: Fix author]
>> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
>> ---
>> accel/tcg/cpu-exec.c | 6 ++++++
>> accel/tcg/tb-stats.c | 6 ++++++
>> accel/tcg/tcg-runtime.c | 8 ++++++++
>> accel/tcg/tcg-runtime.h | 2 ++
>> accel/tcg/translate-all.c | 7 +++++--
>> accel/tcg/translator.c | 10 ++++++++++
>> include/exec/gen-icount.h | 1 +
>> include/exec/tb-stats.h | 18 ++++++++++++++++++
>> 8 files changed, 56 insertions(+), 2 deletions(-)
>>
>> diff --git a/accel/tcg/cpu-exec.c b/accel/tcg/cpu-exec.c
>> index c815f2dbfd..d89f9fe493 100644
>> --- a/accel/tcg/cpu-exec.c
>> +++ b/accel/tcg/cpu-exec.c
>> @@ -25,6 +25,7 @@
>> #include "trace.h"
>> #include "disas/disas.h"
>> #include "exec/exec-all.h"
>> +#include "exec/tb-stats.h"
>> #include "tcg/tcg.h"
>> #include "qemu/atomic.h"
>> #include "qemu/rcu.h"
>> @@ -564,7 +565,12 @@ void cpu_exec_step_atomic(CPUState *cpu)
>> mmap_unlock();
>> }
>> + if (tb_stats_enabled(tb, TB_EXEC_STATS)) {
>> + tb->tb_stats->executions.atomic++;
>> + }
>
> The write is protected by the exclusive lock, but the read might be
> accessible from the monitor, iiuc. Which means you should use
> atomic_set(), for non-tearable write after non-atomic increment.
>
The writes are serialized, and 'atomic' is an aligned integer (unsigned
long), so should a read in parallel with the write really be a problem? It
returns the value either before or after the increment, never a partial one.
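As a rough illustration of the alternative Richard suggests, here is a sketch in plain C11 atomics rather than QEMU's own qatomic_set()/qatomic_read() macros. ExecCounter and both function names are made up for this example; the only assumption is that a single writer holds the exclusive lock while a reader such as the monitor may run concurrently.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical stand-in for the 'executions.atomic' field in the patch. */
typedef struct {
    atomic_ulong atomic;
} ExecCounter;

/*
 * Writer side. The caller is assumed to hold the exclusive lock, so there
 * is never a second writer. The increment itself is therefore not an
 * atomic read-modify-write; only the final store is atomic, which
 * guarantees that a concurrent reader cannot observe a torn value.
 */
static void exec_counter_inc_locked(ExecCounter *c)
{
    unsigned long v;

    assert(c);
    v = atomic_load_explicit(&c->atomic, memory_order_relaxed);
    atomic_store_explicit(&c->atomic, v + 1, memory_order_relaxed);
}

/* Reader side (for example a monitor command): takes no lock. */
static unsigned long exec_counter_read(ExecCounter *c)
{
    assert(c);
    return atomic_load_explicit(&c->atomic, memory_order_relaxed);
}
```

On common 64-bit hosts an aligned plain store is unlikely to tear in practice, but the C standard does not promise that for a plain 'unsigned long', which is what the explicit atomic store and load are meant to cover.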
>> @@ -148,3 +149,10 @@ void HELPER(exit_atomic)(CPUArchState *env)
>> {
>> cpu_loop_exit_atomic(env_cpu(env), GETPC());
>> }
>> +
>> +void HELPER(inc_exec_freq)(void *ptr)
>> +{
>> +    TBStatistics *stats = (TBStatistics *) ptr;
>> +    tcg_debug_assert(stats);
>> +    qatomic_inc(&stats->executions.normal);
>> +}
>
> Ug. Do we really need an atomic update?
>
> If we have multiple threads executing through the same TB, we'll get
> significant slow-down at the cost of not missing increments. If we
> allow a non-atomic update, we'll get much less slow-down at the cost of
> missing a few increments. But this is statistical only, so how much
> does it really matter?
>
This sounds reasonable to me. Alex, what's your opinion here?
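To make the trade-off concrete, the two kinds of update can be sketched side by side in plain C11 atomics. This is not the helper from the patch: inc_exact() and inc_lossy() are hypothetical names, and QEMU code would use its qatomic_* macros instead of <stdatomic.h>.

```c
#include <assert.h>
#include <stdatomic.h>

/*
 * Exact count: one atomic read-modify-write per execution. No increment
 * is ever lost, but threads running the same TB contend on the counter.
 */
static void inc_exact(atomic_ulong *n)
{
    assert(n);
    atomic_fetch_add_explicit(n, 1, memory_order_relaxed);
}

/*
 * Statistical count: a separate load and store. Two threads that race may
 * both read the same old value and both store old + 1, so one increment
 * is lost. Each individual load and store is still untearable.
 */
static void inc_lossy(atomic_ulong *n)
{
    unsigned long v;

    assert(n);
    v = atomic_load_explicit(n, memory_order_relaxed);
    atomic_store_explicit(n, v + 1, memory_order_relaxed);
}
```

With a single thread both variants give the same result; they only diverge when several vCPU threads execute the same TB at once.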
Richard, could you please review the rest of this series? So far I have
only seen your reviews on patches 01 and 02.
Thanks,
Fei.
>
> r~