On Wed, Jul 12, 2017 at 13:06:23 -1000, Richard Henderson wrote:
> You've got a problem here in that you're not including CF_COUNT_MASK in the
> hash and you dropped the flush when changing to parallel_cpus = true. That
> means you could find an old TB with CF_COUNT > 1.
>
> Not required for this patch set, but what I'd like to see eventually is
>
> (1) cpu_exec_step merged into cpu_exec_step_atomic for clarity.
> (2) callers of tb_gen_code add in CF_PARALLEL as needed; do not
>     pick it up from parallel_cpus within tb_gen_code.
> (3) target/*/translate.c uses CF_PARALLEL instead of parallel_cpus.
> (4) cpu_exec_step_atomic does the tb lookup and code gen outside
>     of the start_exclusive/end_exclusive lock.

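To make the stale-TB concern and item (2) concrete, here is a toy model
of the lookup. This is only a sketch, not QEMU code: the ToyTB type, the
toy_* helpers and the bit values are all made up for illustration. The
caller chooses the cflags (including CF_PARALLEL) and the lookup compares
them under a mask, so the effect of leaving the count field out of the
comparison is visible.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative bit layout only; not taken from the QEMU headers. */
#define CF_COUNT_MASK 0x00007fffu   /* instruction-count field */
#define CF_PARALLEL   0x00080000u   /* TB generated for a parallel context */

/* Hypothetical toy TB: only the fields that form the lookup key. */
typedef struct ToyTB {
    uint64_t pc;
    uint32_t cflags;
} ToyTB;

#define TOY_MAX_TBS 8
static ToyTB toy_tbs[TOY_MAX_TBS];
static size_t toy_ntbs;

/* Item (2): the caller decides on CF_PARALLEL and passes it in via cflags. */
static ToyTB *toy_tb_gen(uint64_t pc, uint32_t cflags)
{
    assert(toy_ntbs < TOY_MAX_TBS);
    ToyTB *tb = &toy_tbs[toy_ntbs++];
    tb->pc = pc;
    tb->cflags = cflags;
    return tb;
}

/*
 * Return the first TB whose pc matches and whose cflags agree with the
 * requested cflags on every bit selected by cmp_mask, or NULL if none does.
 */
static ToyTB *toy_tb_lookup(uint64_t pc, uint32_t cflags, uint32_t cmp_mask)
{
    for (size_t i = 0; i < toy_ntbs; i++) {
        ToyTB *tb = &toy_tbs[i];
        if (tb->pc == pc && ((tb->cflags ^ cflags) & cmp_mask) == 0) {
            return tb;
        }
    }
    return NULL;
}
```

With an old TB at some pc whose count field is 4, a request for a
single-instruction TB (count 1) that is compared only under CF_PARALLEL
returns the old TB. Comparing under CF_PARALLEL | CF_COUNT_MASK misses
instead, so the caller would generate a fresh one.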
I have implemented (1)-(4) for v2, which is almost ready to go. However,
I just noticed that tcg-op.c also checks parallel_cpus to decide whether
to emit a real atomic or a non-atomic op. Should we export the two
flavours of these ops to targets, since targets are the ones that can
check CF_PARALLEL? Or perhaps set a bit in the now-per-thread *tcg_ctx?
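
A rough sketch of the two options, to show what I mean. None of this is
the real tcg-op.c interface: the ToyOp enum, the ToyTCGContext struct,
its tb_cflags field and the toy_* functions are hypothetical names, and
the CF_PARALLEL value is illustrative.

```c
#include <assert.h>
#include <stdint.h>

#define CF_PARALLEL 0x00080000u   /* illustrative value */

/* Stand-in for "which op got emitted". */
typedef enum { TOY_OP_ADD_PLAIN, TOY_OP_ADD_ATOMIC } ToyOp;

/* Option A: the op layer exports both flavours ... */
static ToyOp toy_gen_add_plain(void)  { return TOY_OP_ADD_PLAIN; }
static ToyOp toy_gen_add_atomic(void) { return TOY_OP_ADD_ATOMIC; }

/* ... and each target picks one based on the cflags of the TB it translates. */
static ToyOp toy_target_gen_add(uint32_t tb_cflags)
{
    return (tb_cflags & CF_PARALLEL) ? toy_gen_add_atomic()
                                     : toy_gen_add_plain();
}

/* Option B: the (per-thread) context records the cflags of the current TB. */
typedef struct ToyTCGContext {
    uint32_t tb_cflags;   /* hypothetical field, set before translation */
} ToyTCGContext;

/* The op layer decides by itself; targets keep calling a single entry point. */
static ToyOp toy_ctx_gen_add(const ToyTCGContext *ctx)
{
    return (ctx->tb_cflags & CF_PARALLEL) ? TOY_OP_ADD_ATOMIC
                                          : TOY_OP_ADD_PLAIN;
}
```

Option A pushes the check into every target's translate.c, while option B
keeps it in one place at the cost of one more piece of state to set up
before each translation.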