Re: [PATCH qemu v16] spapr: Implement Open Firmware client interface

From: Alexey Kardashevskiy
Subject: Re: [PATCH qemu v16] spapr: Implement Open Firmware client interface
Date: Thu, 1 Apr 2021 11:17:39 +1100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:87.0) Gecko/20100101 Thunderbird/87.0

On 31/03/2021 12:03, David Gibson wrote:
On Thu, Mar 25, 2021 at 02:25:33PM +1100, Alexey Kardashevskiy wrote:

On 25/03/2021 13:52, David Gibson wrote:
On Tue, Mar 23, 2021 at 01:58:30PM +1100, Alexey Kardashevskiy wrote:
The PAPR platform describes an OS environment that's presented by
a combination of a hypervisor and firmware. The features it specifies
require collaboration between the firmware and the hypervisor.

Since the beginning, the runtime component of the firmware (RTAS) has
been implemented as a 20 byte shim which simply forwards calls to
a hypercall implemented in qemu. The boot time firmware component is
SLOF - but a build that's specific to qemu, and has always needed to be
updated in sync with it. Even though we've managed to limit the amount
of runtime communication we need between qemu and SLOF, there's some,
and it has become increasingly awkward to handle as we've implemented
new features.

This implements a boot time OF client interface (CI) which is
enabled by a new "x-vof" pseries machine option (stands for "Virtual Open
Firmware"). When enabled, QEMU implements the custom H_OF_CLIENT hcall
which implements the Open Firmware Client Interface (OF CI). This allows
using a smaller stateless firmware which does not have to manage
the device tree.

The new "vof.bin" firmware image is included with source code under
pc-bios/. It also includes the RTAS blob.

This implements a handful of CI methods just to get -kernel/-initrd
working. In particular, this implements device tree fetching and a
simple memory allocator, "claim" (an OF CI memory allocator), and updates
"/memory@0/available" to tell the client which memory is available.

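For illustration only, the "claim" behaviour described above can be sketched as a tiny range allocator. This is not the code from the patch: the names claim_range, ClaimList, MAX_CLAIMS and CLAIM_FAIL are hypothetical, the real vof_claim() takes different arguments, and the sketch assumes that align == 0 means "claim exactly this address" while a non-zero align means "find any free aligned range".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_CLAIMS 16          /* hypothetical fixed limit for this sketch */
#define CLAIM_FAIL UINT64_MAX  /* stands in for the -1 failure value */

typedef struct {
    uint64_t start;
    uint64_t size;
} Claim;

typedef struct {
    Claim claims[MAX_CLAIMS];
    size_t count;
    uint64_t top;              /* end of the memory the client may use */
} ClaimList;

/* True if [start, start + size) intersects any existing claim. */
static bool overlaps(const ClaimList *cl, uint64_t start, uint64_t size)
{
    for (size_t i = 0; i < cl->count; i++) {
        const Claim *c = &cl->claims[i];

        if (start < c->start + c->size && c->start < start + size) {
            return true;
        }
    }
    return false;
}

/*
 * align == 0: claim exactly [virt, virt + size); fail if any of it is in use.
 * align != 0: ignore virt and return the lowest free range aligned to
 *             align, which must be a power of two.
 */
static uint64_t claim_range(ClaimList *cl, uint64_t virt, uint64_t size,
                            uint64_t align)
{
    uint64_t start = virt;

    if (size == 0 || size > cl->top || cl->count == MAX_CLAIMS) {
        return CLAIM_FAIL;
    }
    if (align != 0) {
        assert((align & (align - 1)) == 0);
        start = 0;
        while (start <= cl->top - size && overlaps(cl, start, size)) {
            start += align;
        }
    }
    if (start > cl->top - size || overlaps(cl, start, size)) {
        return CLAIM_FAIL;
    }
    cl->claims[cl->count++] = (Claim){ start, size };
    return start;
}
```

With a list initialised for 2G of RAM, claiming 0..e60 for the firmware succeeds the first time and fails the second time. A free-range list for "/memory@0/available" could be derived by walking the gaps between the recorded claims.
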
This implements changing some device tree properties which we know how
to deal with; the rest is ignored. To allow changes, this skips
fdt_pack() when x-vof=on as not packing the blob leaves some room for

In the absence of SLOF, this assigns phandles to device tree nodes to make
device tree traversal work.

When x-vof=on, this adds "/chosen" every time QEMU (re)builds a tree.

This adds basic support for instances, which are managed by a hash map
ihandle -> [phandle].
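
As a rough sketch of the ihandle -> phandle idea (not the patch's implementation, which uses a hash map), a fixed-size table is enough to show the mapping. The names InstanceTable, instance_open, instance_to_package and instance_close are hypothetical, and returning 0 for an unknown ihandle is an assumption of this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_INSTANCES 32  /* hypothetical limit for this sketch */

typedef struct {
    uint32_t phandle[MAX_INSTANCES];  /* 0 means the slot is free */
} InstanceTable;

/* Open an instance of the node behind phandle; returns an ihandle or 0. */
static uint32_t instance_open(InstanceTable *t, uint32_t phandle)
{
    assert(phandle != 0);
    for (size_t i = 0; i < MAX_INSTANCES; i++) {
        if (t->phandle[i] == 0) {
            t->phandle[i] = phandle;
            return (uint32_t)i + 1;   /* ihandle 0 is kept as "invalid" */
        }
    }
    return 0;
}

/* Look up the phandle an ihandle refers to; returns 0 if it is not open. */
static uint32_t instance_to_package(const InstanceTable *t, uint32_t ihandle)
{
    if (ihandle == 0 || ihandle > MAX_INSTANCES) {
        return 0;
    }
    return t->phandle[ihandle - 1];
}

/* Release an ihandle so that its slot can be reused. */
static void instance_close(InstanceTable *t, uint32_t ihandle)
{
    if (ihandle != 0 && ihandle <= MAX_INSTANCES) {
        t->phandle[ihandle - 1] = 0;
    }
}
```

Opening the same node twice yields two distinct ihandles that resolve to the same phandle, which is the property the client relies on.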

Before the guest starts, the memory in use is:
0..e60 - the initial firmware
8000..10000 - stack
400000.. - kernel
3ea0000.. - initramdisk

This OF CI does not implement "interpret".

Unlike SLOF, this does not format uninitialized nvram. Instead, this
includes a disk image with pre-formatted nvram.

With this basic support, this can only boot directly into a kernel.
However, this is just enough for the petitboot kernel and initramdisk to
boot from any possible source. Note this requires a reasonably recent guest
kernel with:

The immediate benefit is much faster boot time, which is especially
crucial with fully emulated early CPU bring up environments. Also this
may come in handy when/if GRUB-in-the-userspace sees the light of day.

This separates VOF and sPAPR in the hope that VOF bits may be reused by
other PowerPC boards which do not support pSeries.

This is coded on the assumption that later on we might be adding support for
booting from QEMU backends (blockdev is the first candidate) without
devices/drivers in between, as OF1275 does not require that and
it is quite easy to do so.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>

I have some comments below, but they're basically all trivial at this
point.  We've missed qemu-6.0 obviously, but I'm hoping I can merge
the next spin to my ppc-for-6.1 tree.


The example command line is:

/home/aik/pbuild/qemu-killslof-localhost-ppc64/qemu-system-ppc64 \
-nodefaults \
-chardev stdio,id=STDIO0,signal=off,mux=on \
-device spapr-vty,id=svty0,reg=0x71000110,chardev=STDIO0 \
-mon id=MON0,chardev=STDIO0,mode=readline \
-nographic \
-vga none \
-enable-kvm \
-m 2G \
-kernel pbuild/kernel-le-guest/vmlinux \
-initrd pb/rootfs.cpio.xz \
-drive id=DRIVE0,if=none,file=./p/qemu-killslof/pc-bios/vof-nvram.bin,format=raw \

Removing the need for a prebuilt NVRAM image is something I'd like to
see as a followup.

We do not _need_ NVRAM in the VM to begin with, or is this a requirement?

Actually.. I'm not certain.

Have you heard of using it, ever? What do people store in there in practice?

The whole VOF thing is more like a hack and I do not recall ever doing
anything useful with NVRAM.

If we really need it, then where do we format it - in QEMU or vof.bin? This
alone will trigger a (lengthy) discussion :)

I prefer qemu, but we can worry about that later.


+void spapr_vof_reset(SpaprMachineState *spapr, void *fdt,
+                     target_ulong *stack_ptr, Error **errp)
+{
+    Vof *vof = spapr->vof;
+    vof_cleanup(vof);
+    spapr_vof_client_dt_finalize(spapr, fdt);
+    if (vof_claim(spapr->fdt_blob, vof, 0, spapr->fw_size, 0) == -1) {
+        error_setg(errp, "Memory for firmware is in use");

This could probably be an assert, yes?  IIUC this is the very first
claim, so if this fails then we've placed things incorrectly in the
first place, so it's a code error rather than a user error.

Passing &error_fatal as errp is pretty much an assert, but more informative.

Not quite.  Passing &error_abort is similar to an assert, but
&error_fatal is not.  The rule is that error_abort or assert() should
be used for things that can only occur as a result of a bug in qemu
itself, whereas error_fatal and other errors should be used for things
where the failure may be because of user configuration, or something
wrong on the host or in the guest.

Since the VOF image is being provided by qemu and this is too early
for the guest to have messed with it, this counts as something that is
necessarily a problem in qemu itself.

vof.bin can be passed via "-bios" which is +1 for error_fatal imho.

Sorry, I missed this reply when I posted v18. Should I repost with error_abort? I do not care as much about this one.

