Re: [Qemu-devel] [PATCH V1 1/1] tests: Add migration test for aarch64


From: Christoffer Dall
Subject: Re: [Qemu-devel] [PATCH V1 1/1] tests: Add migration test for aarch64
Date: Wed, 31 Jan 2018 17:53:23 +0100

On Wed, Jan 31, 2018 at 4:18 PM, Ard Biesheuvel
<address@hidden> wrote:
> On 31 January 2018 at 09:53, Christoffer Dall
> <address@hidden> wrote:
>> On Mon, Jan 29, 2018 at 10:32:12AM +0000, Marc Zyngier wrote:
>>> On 29/01/18 10:04, Peter Maydell wrote:
>>> > On 29 January 2018 at 09:53, Dr. David Alan Gilbert <address@hidden> 
>>> > wrote:
>>> >> * Peter Maydell (address@hidden) wrote:
>>> >>> On 26 January 2018 at 19:46, Dr. David Alan Gilbert <address@hidden> 
>>> >>> wrote:
>>> >>>> * Peter Maydell (address@hidden) wrote:
>>> >>>>> I think the correct fix here is that your test code should turn
>>> >>>>> its MMU on. Trying to treat guest RAM as uncacheable doesn't work
>>> >>>>> for Arm KVM guests (for the same reason that VGA device video memory
>>> >>>>> doesn't work). If it's RAM your guest has to arrange to map it as
>>> >>>>> Normal Cacheable, and then everything should work fine.
>>> >>>>
>>> >>>> Does this cause problems with migrating at just the wrong point during
>>> >>>> a VM boot?
>>> >>>
>>> >>> It wouldn't surprise me if it did, but I don't think I've ever
>>> >>> tried to provoke that problem...
>>> >>
>>> >> If you think it'll get the RAM contents wrong, it might be best to fail
>>> >> the migration if you can detect the cache is disabled in the guest.
>>> >
>>> > I guess QEMU could look at the value of the "MMU disabled/enabled" bit
>>> > in the guest's system registers, and refuse migration if it's off...
>>> >
>>> > (cc'd Marc, Christoffer to check that I don't have the wrong end
>>> > of the stick about how thin the ice is in the period before the
>>> > guest turns on its MMU...)
>>>
>>> Once MMU and caches are on, we should be in a reasonable place for QEMU
>>> to have a consistent view of the memory. The trick is to prevent the
>>> vcpus from changing that. A guest could perfectly well turn off its MMU
>>> at any given time if it needs to (and doing so is actually required on
>>> some HW if you want to mitigate the headline CVEs), and KVM won't know
>>> about that.
>>>
>>
>> (Clarification: KVM can detect this if it bothers to check the VCPU's
>> system registers, but we don't trap to KVM when the VCPU turns off its
>> caches, right?)
>>
>>> You may have to pause the vcpus before starting the migration, or
>>> introduce a new KVM feature that would automatically pause a vcpu that
>>> is trying to disable its MMU while the migration is on. This would
>>> involve trapping all the virtual memory related system registers, with
>>> an obvious cost. But that cost would be limited to the time it takes to
>>> migrate the memory, so maybe that's acceptable.
>>>
>> Is that even sufficient?
>>
>> What if the following happened: (1) the guest turns off its MMU, (2) the
>> guest writes some data directly to RAM, (3) QEMU stops the vcpu, (4) QEMU
>> reads guest RAM.  QEMU's view of guest RAM is now incorrect (stale,
>> incoherent, ...).
>>
>> I'm also not really sure that pausing one VCPU because it turned off its
>> MMU will go very well when trying to migrate a large VM (wouldn't the
>> other VCPUs start complaining that the stopped VCPU appears to be
>> dead?).  As a short-term 'fix' it's probably better to refuse migration
>> if you detect that a VCPU has turned off its MMU.
>>
>> On the larger scale of things, this appears to me to be another case of
>> us really needing some way to coherently access memory between QEMU and
>> the VM, but in the case of the VCPU turning off the MMU prior to
>> migration, we don't even know where it may have written data, and I'm
>> therefore not really sure what the 'proper' solution would be.
>>
>> (cc'ing Ard, who has thought about this problem before in the context
>> of UEFI and VGA.)
>>
>
> Actually, the VGA case is much simpler because the host is not
> expected to write to the framebuffer, only read from it, and the guest
> is not expected to create a cacheable mapping for it, so any
> incoherency can be trivially solved by cache invalidation on the host
> side. (Note that this has nothing to do with DMA coherency, but only
> with PCI MMIO BARs that are backed by DRAM in the host)

In the case of a running guest, the host will also only read from the
cached mapping.  Of course, at restore time the host will also write
through a cached mapping, but shouldn't the latter case be solvable by
having KVM clean the cache lines when faulting in any page?
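
Something along these lines, conceptually (just a sketch with a hard-coded
line size, not the in-kernel implementation; real code would derive the
line size from CTR_EL0 and use the kernel's existing cache helpers):

    /* Sketch only: clean one page of data cache to the point of coherency,
     * so that a non-cacheable reader (e.g. a guest running with its MMU
     * off) sees what the host wrote through its cached mapping. */
    #include <stdint.h>

    #define PAGE_SIZE    4096UL
    #define DCACHE_LINE  64UL

    static void clean_dcache_page(void *page)
    {
        uintptr_t addr = (uintptr_t)page;
        uintptr_t end  = addr + PAGE_SIZE;

        for (; addr < end; addr += DCACHE_LINE) {
            /* DC CVAC: clean data cache line by VA to PoC */
            asm volatile("dc cvac, %0" :: "r"(addr) : "memory");
        }
        asm volatile("dsb sy" ::: "memory");  /* complete the cleans */
    }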

>
> In the migration case, it is much more complicated, and I think
> capturing the state of the VM in a way that takes incoherency between
> caches and main memory into account is simply infeasible (i.e., the
> act of recording the state of guest RAM via a cached mapping may evict
> clean cachelines that are out of sync, and so it is impossible to
> record both the cached state *and* the delta with the uncached state)

This may be an incredibly stupid question (and I may have asked it
before), but why can't we clean+invalidate the guest page before
reading it and thereby obtain a coherent view of a page?
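
I.e. something like this on the host side before each page is copied into
the migration stream (again only a sketch: it assumes a 64-byte line and
that EL0 may issue DC CIVAC, which Linux allows by setting SCTLR_EL1.UCI,
and the helper is hypothetical rather than existing QEMU/KVM code):

    /* Sketch: clean+invalidate a guest page before reading it, so the
     * subsequent copy is serviced from main memory rather than from
     * possibly stale cache lines. */
    #include <stdint.h>
    #include <string.h>

    #define PAGE_SIZE    4096UL
    #define DCACHE_LINE  64UL

    static void capture_guest_page(void *dst, const void *guest_page)
    {
        uintptr_t addr = (uintptr_t)guest_page;
        uintptr_t end  = addr + PAGE_SIZE;

        for (; addr < end; addr += DCACHE_LINE) {
            /* DC CIVAC: clean+invalidate data cache line by VA */
            asm volatile("dc civac, %0" :: "r"(addr) : "memory");
        }
        asm volatile("dsb sy" ::: "memory");

        memcpy(dst, guest_page, PAGE_SIZE);
    }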

>
> I wonder how difficult it would be to
> a) enable trapping of the MMU system register when a guest CPU is
> found to have its MMU off at migration time
> b) allow the guest CPU to complete whatever it thinks it needs to be
> doing with the MMU off
> c) once it re-enables the MMU, proceed with capturing the memory state
>
I don't think this is impossible, but it would prevent migration of
things like bare-metal guests or weirdo unikernel guests running with
the MMU off, wouldn't it?

Thanks,
-Christoffer


