Re: [Qemu-ppc] [RFC] QEMU/KVM PowerPC: virtio and guest endianness
From: Alexander Graf
Subject: Re: [Qemu-ppc] [RFC] QEMU/KVM PowerPC: virtio and guest endianness
Date: Fri, 4 Oct 2013 13:54:25 +0200
On 04.10.2013, at 13:53, Paul Mackerras wrote:
> On Thu, Oct 03, 2013 at 04:29:52PM +0200, Greg Kurz wrote:
>> Hi,
>>
>> There has been some work on the topic lately but no agreement has
>> been reached yet. I want to consolidate the facts in a single mail
>> thread and restart the discussion. Please find below a recap of what we
>> have as of today:
>>
>> From a virtio POV, guest endianness is reflected by the endianness of
>> the interrupt vectors (ILE bit in the LPCR register). The guest kernel
>> relies on the H_SET_MODE_RESOURCE_LE hcall to set this bit, early in the
>> boot process.
>>
>> Rusty sent a patchset to qemu-devel@ that provides the necessary bits to
>> perform byteswapping in QEMU:
>>
>> http://patchwork.ozlabs.org/patch/266451/
>> http://patchwork.ozlabs.org/patch/266452/
>> http://patchwork.ozlabs.org/patch/266450/
>> (plus other enablement patches for virtio drivers, not essential for
>> the discussion).
>>
>> In non-KVM mode, QEMU implements H_SET_MODE_RESOURCE_LE and updates
>> its internal value for LPCR when the guest requests it. Rusty's patchset
>> works out-of-the-box in this mode: I could successfully set up and use a
>> 9p share over virtio transport (broader virtio testing still to be done
>> though).
>>
>> When using KVM, the story is different: QEMU is no longer in the
>> endianness change path, provided KVM has the following
>> patch from Anton:
>>
>> http://patchwork.ozlabs.org/patch/277079/
>>
>> There are *at least* two approaches to bring back endianness knowledge
>> to QEMU: polling (1) and propagation (2).
>>
>> (1) QEMU must retrieve LPCR from the kernel using the following API:
>>
>> http://patchwork.ozlabs.org/patch/273029/
>>
>> (2) KVM can return to the host, thus propagating
>> H_SET_MODE_RESOURCE_LE to QEMU. Laurent came up with a patch on
>> linuxppc-dev@ to do this:
>>
>> http://patchwork.ozlabs.org/patch/278590/
>>
>> I would say (1) is a standard and sane way of addressing the issue:
>> since the LPCR register value is held by KVM, it makes sense to
>> introduce an API to get/set it. Then, it is up to QEMU to use this API.
>>
>> We can dumbly do the polling in all the places where byteswapping
>> matters: it is clearly suboptimal, especially since the LPCR_ILE bit
>> doesn't change very often. Rusty suggested we can retrieve it at virtio
>> device reset time and cache it, since an endianness change after the
>> devices have started to be used is nonsensical.
>>
>> I have searched for an appropriate place to add the polling and I must
>> admit I did not find any... I am no QEMU expert but I suspect we would
>> need some kind of arch specific hook to be called from the virtio code
>> to do this... :-\ I hope I am wrong, please correct me if so.
>>
>> On the other hand, (2) looks a bit hacky: KVM usually returns to the
>> host when it cannot fully handle the hcall. Propagating may look like
>> a useless path to follow from a KVM POV. From a QEMU POV, things are
>> different: propagation will trigger the fallback code in QEMU, which
>> already works in non-KVM mode. Nothing more to be done.
>
> I don't mind particularly whether H_SET_MODE for the endianness
> setting gets handled in the kernel or in QEMU, but I don't think it
> should be handled in both. If you want QEMU to know about the
> endianness setting immediately, make the kernel version do nothing and
> get QEMU to handle it -- which if KVM is enabled will mean iterating
> over all vcpus and getting them all to send the new LPCR setting to
> the kernel via the SET_ONE_REG ioctl.
>
> However, I want the setting of breakpoint registers (CIABR and DAWR/X)
> via H_SET_MODE to happen in the kernel, preferably in real mode, since
> that can happen on context switch and thus needs to be quick.
I don't want to see a single hypercall be split across the QEMU/KVM barrier. So
if there's a reasonable incentive to handle H_SET_MODE in KVM, we should handle
all of it in KVM.
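
To make that concrete, here is a rough sketch of what "all of it in one place" could look like: a single handler that covers the breakpoint resources and the endianness resource together. This is a toy model, not kernel code. The resource numbers, the H_PARAMETER value and the LPCR_ILE mask are my recollection of the PAPR/Linux headers and should be treated as assumptions, the vcpu_model struct and h_set_mode_model() are hypothetical names, and the real hcall validates its arguments and returns more specific error codes.

```c
#include <stdint.h>

/* Assumed values; check against the real hvcall.h / reg.h before relying on them. */
#define H_SUCCESS                       0
#define H_PARAMETER                     (-4)
#define H_SET_MODE_RESOURCE_SET_CIABR   1
#define H_SET_MODE_RESOURCE_SET_DAWR    2
#define H_SET_MODE_RESOURCE_LE          4
#define H_SET_MODE_ENDIAN_BIG           0
#define H_SET_MODE_ENDIAN_LITTLE        1
#define LPCR_ILE                        (1ULL << 25)

/* Hypothetical per-vcpu state, reduced to what H_SET_MODE touches here. */
struct vcpu_model {
    uint64_t lpcr;
    uint64_t ciabr;
    uint64_t dawr;
    uint64_t dawrx;
};

/*
 * One handler for every resource: breakpoints only affect the calling
 * vcpu, while the endianness setting is applied to all vcpus.
 */
static long h_set_mode_model(struct vcpu_model *vcpus, int nr_vcpus, int self,
                             uint64_t mflags, uint64_t resource,
                             uint64_t value1, uint64_t value2)
{
    int i;

    switch (resource) {
    case H_SET_MODE_RESOURCE_SET_CIABR:
        vcpus[self].ciabr = value1;
        return H_SUCCESS;
    case H_SET_MODE_RESOURCE_SET_DAWR:
        vcpus[self].dawr = value1;
        vcpus[self].dawrx = value2;
        return H_SUCCESS;
    case H_SET_MODE_RESOURCE_LE:
        if (mflags != H_SET_MODE_ENDIAN_BIG &&
            mflags != H_SET_MODE_ENDIAN_LITTLE)
            return H_PARAMETER; /* simplification of the real error code */
        for (i = 0; i < nr_vcpus; i++) {
            if (mflags == H_SET_MODE_ENDIAN_LITTLE)
                vcpus[i].lpcr |= LPCR_ILE;
            else
                vcpus[i].lpcr &= ~LPCR_ILE;
        }
        return H_SUCCESS;
    default:
        return H_PARAMETER; /* simplification of the real error code */
    }
}
```

Whichever side ends up owning this switch, the point is that no case falls through to the other side of the QEMU/KVM boundary.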
Alex