qemu-stable
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-stable] [Qemu-devel] [PATCH] fix guest physical bits to match


From: Eduardo Habkost
Subject: Re: [Qemu-stable] [Qemu-devel] [PATCH] fix guest physical bits to match host, to go beyond 1TB guests
Date: Tue, 16 Jul 2013 15:11:36 -0300
User-agent: Mutt/1.5.21 (2010-09-15)

On Tue, Jul 16, 2013 at 07:46:14PM +0200, Paolo Bonzini wrote:
> Il 16/07/2013 19:38, Eduardo Habkost ha scritto:
> > On Tue, Jul 16, 2013 at 07:22:01PM +0200, Andrea Arcangeli wrote:
> >> Without this patch the guest physical bits are advertised as 40, not
> >> 44 or more depending on the hardware capability of the host.
> >>
> >> That leads to guest kernel crashes with injection of page faults 9
> >> (see oops: 0009) as bits above 40 in the guest pagetables are
> >> considered reserved.
> >>
> >> exregion-0206 [324572448] [17] ex_system_memory_space: System-Memory 
> >> (width 32) R/W 0 Address=00000000FED00000
> >> BUG: unable to handle kernel paging request at ffffc9006030e000
> >> IP: [<ffffffff812fbb6f>] acpi_ex_system_memory_space_handler+0x23e/0x2cb
> >> PGD e01f875067 PUD 1001f075067 PMD e0178d8067 PTE 80000000fed00173
> >> Oops: 0009 [#1] SMP
> >>
> >> (see PUD with bit >=40 set)
> > 
> > I am not sure I understand what caused this: if we are advertising 40
> > physical bits to the guest, why are we ending up with a PUD with
> > bit >= 40 set?
> 
> Because we create a guest that has bigger memory than what we advertise
> in CPUID.

That means we never supported guests that large, doesn't it? Why is it a
qemu-stable candidate, then?

> 
> >>
> >> Signed-off-by: Andrea Arcangeli <address@hidden>
> >> Reported-by: Chegu Vinod <address@hidden>
> >> ---
> >>  target-i386/cpu.c | 6 ++++++
> >>  1 file changed, 6 insertions(+)
> >>
> >> diff --git a/target-i386/cpu.c b/target-i386/cpu.c
> >> index e3f75a8..0e65673 100644
> >> --- a/target-i386/cpu.c
> >> +++ b/target-i386/cpu.c
> >> @@ -2108,6 +2108,12 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t 
> >> index, uint32_t count,
> >>              /* 64 bit processor */
> >>  /* XXX: The physical address space is limited to 42 bits in exec.c. */
> >>              *eax = 0x00003028; /* 48 bits virtual, 40 bits physical */
> >> +            if (kvm_enabled()) {
> >> +                uint32_t _eax;
> >> +                host_cpuid(0x80000000, 0, &_eax, NULL, NULL, NULL);
> >> +                if (_eax >= 0x80000008)
> >> +                    host_cpuid(0x80000008, 0, eax, NULL, NULL, NULL);
> >> +            }
> > 
> > We can't expose a different virtual machine depending on host
> > capabilities. What if we live-migrate between hosts with different
> > physical address bit sizes?
> 
> Same as for vPMU or leaf 0xD: who knows.  In practice, this has an
> effect only for guests with 1T or more memory, otherwise the physical
> memory is smaller than 40 bits.

Leafs 0xA and 0xD are buggy, and we need to disable that "passthrough
mode" behavior by default, because it breaks live-migration. If we want
to expose new features to the guest we need to make those features
configurable, not expose them automatically depending on host
capabilities.

(I was aware of the problem on vPMU/0xA, but I didn't notice 0xD was
broken was well. Thanks for noting.)

For physical bit size, what about extending it in a backwards-compatible
way? Something like this:

    *eax = 0x0003000; /* 48 bits virtual */
    if (ram_size < 1TB) {
        physical_size = 40; /* Keeping backwards compatibility */
    } else if (ram_size < 4TB) {
        physical_size = 42;
    } else {
        abort();
    }
    if (supported_host_physical_size() < physical_size) {
        abort();
    }
    *eax |= physical_size;

(Of course, the abort() calls should be replaced with proper error
reporting)

> 
> Paolo
> 
> >>          } else {
> >>              if (env->features[FEAT_1_EDX] & CPUID_PSE36) {
> >>                  *eax = 0x00000024; /* 36 bits physical */
> >>
> > 
> 
> 

-- 
Eduardo



reply via email to

[Prev in Thread] Current Thread [Next in Thread]