Re: [Qemu-devel] [PATCH 4/5] x86: Allow physical address bits to be set

From: Paolo Bonzini
Subject: Re: [Qemu-devel] [PATCH 4/5] x86: Allow physical address bits to be set
Date: Fri, 17 Jun 2016 15:38:53 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Thunderbird/45.1.0

On 17/06/2016 15:18, Eduardo Habkost wrote:
> On Fri, Jun 17, 2016 at 09:15:06AM +0100, Dr. David Alan Gilbert wrote:
>> * Eduardo Habkost (address@hidden) wrote:
>>> On Thu, Jun 16, 2016 at 06:12:12PM +0100, Dr. David Alan Gilbert (git) wrote:
>>>> From: "Dr. David Alan Gilbert" <address@hidden>
>>>> Currently QEMU sets the x86 number of physical address bits to the
>>>> magic number 40.  This is only correct on some small AMD systems;
>>>> Intel systems tend to have 36, 39, 46 bits, and large AMD systems
>>>> tend to have 48.
>>>> Having the value different from your actual hardware is detectable
>>>> by the guest and in principle can cause problems;
>>> What kind of problems?
>>> Is it a problem to have something smaller than the actual
>>> hardware, or just if it's higher?
>> I'm a bit vague on the failure cases, but my understanding of the two
>> cases is:
>> Larger is a problem if the guest tries to map something to a high
>> address that's not addressable.

        (Note: this is a problem when migrating to hosts with _smaller_
        phys-bits)

>> Smaller is potentially a problem if the guest plays tricks with
>> what it thinks are spare bits in page tables but which are actually
>> interpreted.   I believe KVM plays a trick like this.

        (Note: this is a problem when migrating to hosts with _larger_
        phys-bits)

> If both smaller and larger are a problem, we have a much bigger
> problem than we thought. We need to confirm this.
> So, what happens if the guest plays tricks with bits 40-45 when QEMU
> sets the limit to 40 but we are running in a 46-bit host? Is it
> really a problem? I assumed it would be safe.

The guest expects a "reserved bit set" page fault, but doesn't get one.
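
To make the mismatch concrete, here is a minimal sketch (not QEMU or KVM
code) of which bits of a 64-bit page-table entry are reserved for a given
number of physical address bits. It assumes the usual x86-64 rule that bits
MAXPHYADDR..51 of an entry must be zero, and it assumes the hardware checks
against the host's value rather than the guest's. The function names are
made up for illustration.

```python
def reserved_pte_mask(phys_bits: int) -> int:
    """Return the mask of reserved physical-address bits in a 64-bit PTE.

    Assumes bits phys_bits..51 are reserved (must be zero).
    """
    if not 32 <= phys_bits <= 52:
        raise ValueError("phys_bits out of range for this sketch")
    all_addr_bits = (1 << 52) - 1          # bits 0..51
    usable_bits = (1 << phys_bits) - 1     # bits 0..phys_bits-1
    return all_addr_bits & ~usable_bits


def reserved_fault(pte: int, guest_bits: int, host_bits: int) -> tuple[bool, bool]:
    """Return (guest expects a reserved-bit fault, host actually raises one)."""
    guest_expects = bool(pte & reserved_pte_mask(guest_bits))
    host_raises = bool(pte & reserved_pte_mask(host_bits))
    return guest_expects, host_raises


# Guest told 40 bits, host really has 46: a trick using bit 42 goes unnoticed.
print(reserved_fault(1 << 42, guest_bits=40, host_bits=46))
# A trick using bit 51 is reserved under both views.
print(reserved_fault(1 << 51, guest_bits=40, host_bits=46))
```

Under these assumptions, a guest that sets bit 42 expects a fault the host
never delivers, while a guest that sets bit 51 gets the fault either way.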

>>    2) While we have maxmem settings to tell us the top of VM RAM, do
>>       we have anything that tells us the top of IO space? What happens
>>       when we hotplug a PCI card?
> (CCing Marcel and Michael, as we were discussing this recently.)
> That's a good question. When calculating how many bits the
> machine requires, machine code could choose to reserve a
> reasonable amount of space for hotplug by default.
> Whatever we choose as the default, in some corner cases (e.g.
> almost-32GB VMs running in a 39-bit host) we will still need to
> let the user choose between having extra space for hotplug and
> being able to safely migrate to 36-bit hosts.

No, unfortunately this is not possible.  If you set phys-bits <
host-phys-bits, the guest may expect some bits to be reserved when they
actually aren't.  In practice this doesn't cause trouble, for the reasons I
mentioned in my other message (tl;dr: 1) the trick is rarely used, though
KVM uses it; 2) guests that use bit 51 are safe in practice).  Still,
making phys-bits smaller than host-phys-bits is a bad idea.

Making the guest's phys-bits larger than host-phys-bits would be okay if
you reserve the area in the e820 and assume the guest doesn't touch it.
But it is not a great idea either, because e820 describes RAM, so you're
telling the guest "look, there's 64 TB of reserved RAM up there".
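
As a rough illustration of the size involved, the sketch below computes the
physical range a guest could address but the host could not, which is the
range that would have to be marked reserved. The 46-bit guest and 40-bit
host are example values chosen here, not figures from the patch, and the
helper name is hypothetical.

```python
TIB = 1 << 40  # one tebibyte in bytes


def unbacked_range(guest_bits: int, host_bits: int):
    """Return (start, length) of guest-addressable space beyond the host's
    physical address width, or None if the guest sees no such space."""
    if guest_bits <= host_bits:
        return None
    start = 1 << host_bits
    end = 1 << guest_bits
    return start, end - start


# Example values only: a guest told 46 bits on a host with 40 bits.
start, length = unbacked_range(46, 40)
print(f"reserve from {start // TIB} TiB, length {length // TIB} TiB")
print(f"total guest-visible space: {(1 << 46) // TIB} TiB")
```

A 46-bit guest address space is 64 TiB in total, so nearly all of it would
end up described as reserved in this example.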

>>    3) Is it better to stick to sizes that correspond to real hardware
>>       if you can?  For example I don't know of any machines with 37 bits
>>       - in practice I think it's best to stick with sizes that correspond
>>       to some real hardware.
> Yeah, "as small as possible" could be actually "the smallest
> possible value from a set of known-to-exist values". e.g. if we
> find out that we need 37 bits, it's probably better to simply use
> 39 bits.
> Choosing from a smaller set of values also makes corner cases
> (like the example above) less likely to happen.

Not really, because any value that doesn't match the host is
problematic, albeit in different ways.
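
The two failure modes discussed in this thread can be summarised in a small
sketch. This is only a restatement of the reasoning above, with a made-up
function name, and not a check that QEMU performs.

```python
def phys_bits_mismatch(guest_bits: int, host_bits: int) -> str:
    """Classify a guest/host physical-address-width mismatch."""
    if guest_bits == host_bits:
        return "match"
    if guest_bits < host_bits:
        # The guest believes bits guest_bits..host_bits-1 are reserved,
        # but the host treats them as valid address bits.
        return "guest-smaller: expected reserved-bit faults may not occur"
    # The guest believes it can address memory the host cannot reach.
    return "guest-larger: guest may map addresses the host cannot address"


for guest, host in [(40, 46), (46, 40), (39, 39)]:
    print(guest, host, phys_bits_mismatch(guest, host))
```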

