From: Matheus K. Ferst
Subject: Re: [RFC PATCH] target/ppc: don't print TB in ppc_cpu_dump_state if it's not initialized
Date: Wed, 13 Jul 2022 15:28:45 -0300
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101 Thunderbird/91.9.1

On 12/07/2022 23:21, David Gibson wrote:
On Tue, Jul 12, 2022 at 06:13:44PM -0300, Daniel Henrique Barboza wrote:

On 7/12/22 16:25, Matheus Ferst wrote:
When using "-machine none", env->tb_env is not allocated, causing the
segmentation fault reported in issue #85 (launchpad bug #811683). To
avoid this problem, check if the pointer != NULL before calling the
methods to print TBU/TBL/DECR.

Resolves: https://gitlab.com/qemu-project/qemu/-/issues/85
Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
This patch fixes the reported problem, but may be an incomplete solution
since many other places dereference env->tb_env without checking for
NULL. AFAICS, "-machine none" is the only way to trigger this problem,
and I'm not familiar with the use-cases for this option.
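
As a rough illustration of the guard described above, here is a minimal, self-contained sketch. It is not the actual patch: DumpTbState, DumpCpuEnv and dump_state are hypothetical stand-ins for QEMU's real types and for ppc_cpu_dump_state, and the printed format is made up. The only point is that the TB/DECR line is skipped when the tb_env pointer is NULL.

```c
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-ins for QEMU's structures (assumption, not real code). */
typedef struct DumpTbState {
    uint64_t tb;    /* 64-bit time base (TBU:TBL) */
    uint32_t decr;  /* decrementer */
} DumpTbState;

typedef struct DumpCpuEnv {
    uint64_t nip;
    DumpTbState *tb_env;  /* NULL if no machine ever set up the time base */
} DumpCpuEnv;

/* Print the CPU state; returns the number of lines written. */
static int dump_state(FILE *f, const DumpCpuEnv *env)
{
    int lines = 0;

    fprintf(f, "NIP %016" PRIx64 "\n", env->nip);
    lines++;

    /* Only touch the time base if it was actually allocated. */
    if (env->tb_env != NULL) {
        fprintf(f, "TB %08" PRIx32 " %08" PRIx32 " DECR %" PRIu32 "\n",
                (uint32_t)(env->tb_env->tb >> 32),
                (uint32_t)env->tb_env->tb,
                env->tb_env->decr);
        lines++;
    }
    return lines;
}
```

With tb_env left NULL, the sketch prints only the NIP line instead of dereferencing the pointer.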

The "none" machine type is mainly used by libvirt to do introspection
of the available options/capabilities of the QEMU binary. It starts a QEMU
process like the following:

./x86_64-softmmu/qemu-system-x86_64 -S -no-user-config -nodefaults \
       -nographic -machine none,accel=kvm:tcg -daemonize

And then it uses QMP to probe the binary.

Aside from this libvirt usage I am not aware of anyone else using -machine
none extensively.

Right.  -machine none basically cannot work as a real machine for
POWER (maybe some other CPUs as well).  At least the more modern POWER
CPUs simply cannot boot without a bunch of supporting board/system
level elements, and there's not really a sane way to encode those into
individual emulated devices at present (maybe ever).

One of those things is that POWER expects the timebases to be
synchronized across all CPUs in the system, which obviously can't be
done locally to a single CPU chip.  It requires system level
operations, which is why it's handled by the machine type.

[Example: a typical sequence which might be handled in hardware by
  low-level firmware would be to use machine-specific board-level
  registers to suspend the clock pulse to the CPUs which drives the
  timebase, then write the same value to the TB on each CPU, then
  (atomically) restart the clock pulse using board registers again]
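
The sequence in the example above could be modelled roughly as follows. This is a toy simulation only: SyncBoard, board_tick, board_sync_timebases and SYNC_NUM_CPUS are hypothetical names, and real hardware would use machine-specific board registers rather than a boolean flag.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SYNC_NUM_CPUS 4  /* arbitrary CPU count for the sketch */

/* Toy model of a board whose single clock pulse drives every CPU's TB. */
typedef struct SyncBoard {
    bool clock_running;           /* stands in for a board-level register */
    uint64_t tb[SYNC_NUM_CPUS];   /* per-CPU time base */
} SyncBoard;

/* One clock pulse: time bases advance only while the clock is running. */
static void board_tick(SyncBoard *b)
{
    if (!b->clock_running) {
        return;
    }
    for (size_t i = 0; i < SYNC_NUM_CPUS; i++) {
        b->tb[i]++;
    }
}

/* Suspend the clock, write the same value to every TB, restart the clock. */
static void board_sync_timebases(SyncBoard *b, uint64_t value)
{
    b->clock_running = false;     /* 1. suspend the pulse */
    for (size_t i = 0; i < SYNC_NUM_CPUS; i++) {
        b->tb[i] = value;         /* 2. same value on each CPU */
    }
    b->clock_running = true;      /* 3. restart for all CPUs at once */
}
```

Because the clock is stopped while the values are written, no time base can advance between the first and the last write, which is the property that cannot be guaranteed from within a single CPU.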

So I guess it's safe to assume that it's impossible to run code with "-machine none", and then there would be no reason to check for NULL in the mtspr/mfspr path, right?

Should we stop assuming env->tb_env != NULL and add checks everywhere?
Or should we find a way to provide Time Base/Decrementer for
"-machine none"?

Are there other cases where env->tb_env can be NULL, aside from the case
reported in the bug?

If there are, I'd say that's a bug in the machine type.  Setting up
(and synchronizing) the timebase is part of the machine's job.

With "-machine none", it seems that the only places where it could happen are:

i) Monitor code: there are some other places where env->tb_env is used, like monitor_get_tb{u,l} and monitor_get_decr, so commands like "p $tbu" or "p $decr" cause a segfault too.

ii) mtspr/mfspr: it shouldn't be a problem if it's not possible to run code without a machine.

iii) gdbstub: we're not reading or setting TB{U,L} from gdb, which may be an issue on its own, but not related to #85.
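
For the monitor case in (i), a value-returning getter cannot simply skip the output, so it needs a fallback. A minimal sketch, assuming that returning 0 is acceptable when the time base was never set up (MonTbState, MonCpuEnv, get_tbu and get_tbl are hypothetical names, not the real monitor code):

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for QEMU's structures (assumption, not real code). */
typedef struct MonTbState {
    uint64_t tb;  /* 64-bit time base */
} MonTbState;

typedef struct MonCpuEnv {
    MonTbState *tb_env;  /* NULL with "-machine none" */
} MonCpuEnv;

/* Upper 32 bits of the time base, or 0 if there is no time base. */
static uint32_t get_tbu(const MonCpuEnv *env)
{
    if (env == NULL || env->tb_env == NULL) {
        return 0;  /* assumed fallback value */
    }
    return (uint32_t)(env->tb_env->tb >> 32);
}

/* Lower 32 bits of the time base, or 0 if there is no time base. */
static uint32_t get_tbl(const MonCpuEnv *env)
{
    if (env == NULL || env->tb_env == NULL) {
        return 0;  /* assumed fallback value */
    }
    return (uint32_t)env->tb_env->tb;
}
```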

I don't mind the bug fix, but I'm not fond of the idea of adding additional
checks because of this particular issue. I mean, the bug is using the 'prep'
machine that Thomas removed years ago in b2ce76a0730. If there's no other
foreseeable problem, that we care about, with env->tb_env being NULL, IMO
let's fix the bug and move on.

I'll send a v2 fixing the other segfault in monitor, and then I guess we have a complete solution. Thanks Daniel and David for the feedback.

Matheus K. Ferst
Instituto de Pesquisas ELDORADO <http://www.eldorado.org.br/>
Software Analyst
Aviso Legal - Disclaimer <https://www.eldorado.org.br/disclaimer.html>
