Re: [PATCH RFCv2 2/4] i386/pc: relocate 4g start to 1T where applicable


From: Gerd Hoffmann
Subject: Re: [PATCH RFCv2 2/4] i386/pc: relocate 4g start to 1T where applicable
Date: Tue, 15 Feb 2022 10:53:58 +0100

  Hi,

> I don't know what behavior should be if firmware tries to program
> PCI64 hole beyond supported phys-bits.

Well, you are basically f*cked.

Unfortunately there is no reliable way to figure out what phys-bits
actually is.  Because of that the firmware (both seabios and edk2) tries
to place the pci64 hole as low as possible.

The long version:

qemu advertises phys-bits=40 to the guest by default.  Probably because
this is what the first amd opteron processors had, assuming that it
would be a safe default.  Then intel came along, releasing processors
with phys-bits=36, and even recent (desktop-class) hardware has only
phys-bits=39.  So the guest can be told about address space the host
can't actually handle.  Boom.

End result is that edk2 uses a 32G pci64 window by default, which is
placed at the first 32G border beyond normal ram.  So for virtual
machines with up to ~ 30G ram (including reservations for memory
hotplug) the pci64 hole covers 32G -> 64G in guest physical address
space, which is low enough that it works on hardware with phys-bits=36.

If your VM has more than 32G of memory the pci64 hole will move up and
phys-bits=36 isn't enough any more, but given that you probably only do
that on beefier hosts which can take >= 64G of RAM and have a larger
physical address space, this heuristic works well enough in practice.
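
To make the arithmetic concrete, here is a minimal sketch of the
placement rule described above.  It is not the edk2 code; the function
names are made up for illustration, and ram_top is assumed to be the end
of RAM (plus hotplug reservations) in guest physical address space.

```python
GiB = 1 << 30

def pci64_window(ram_top, size=32 * GiB):
    """Place a window of `size` bytes at the first size-aligned
    border at or beyond ram_top.  Returns (base, end), end exclusive."""
    base = (ram_top + size - 1) // size * size
    return base, base + size

def fits(end, phys_bits):
    """True if an address range ending at `end` (exclusive) is
    addressable with the given number of physical address bits."""
    return end <= (1 << phys_bits)

# ~30G guest: window covers 32G -> 64G, still ok with phys-bits=36
small = pci64_window(30 * GiB)

# 40G guest: window moves to 64G -> 96G, phys-bits=36 is not enough
large = pci64_window(40 * GiB)
```

With these inputs fits(small[1], 36) is true, while fits(large[1], 36)
is false and fits(large[1], 39) is true.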

Changing the phys-bits behavior has been discussed on and off for years.
It's tricky to change for live migration compatibility reasons.

We got the host-phys-bits and host-phys-bits-limit properties, which
solve some of the phys-bits problems.

 * host-phys-bits=on makes sure the phys-bits advertised to the guest
   actually works.  It's off by default though for backward
   compatibility reasons (except microvm).  Also because turning it on
   breaks live migration of machines between hosts with different
   phys-bits.

 * host-phys-bits-limit can be used to tweak phys-bits to
   be lower than what the host supports.  Which can be used for
   live migration compatibility, i.e. if you have a pool of machines
   where some have 36 and some 39 you can limit phys-bits to 36 so
   live migration from 39 hosts to 36 hosts works.
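
A rough sketch of how the two properties combine, under the assumption
that a limit of 0 means "no limit" and that the limit can only lower the
value, never raise it.  The helper names are hypothetical, this is not
the qemu implementation.

```python
def effective_phys_bits(host_phys_bits, limit=0):
    """phys-bits the guest would see with host-phys-bits=on.
    Assumption: limit == 0 means host-phys-bits-limit is not set."""
    if limit and limit < host_phys_bits:
        return limit
    return host_phys_bits

def pool_limit(pool_phys_bits):
    """Largest limit that is safe on every host in a migration pool."""
    return min(pool_phys_bits)

# pool where some hosts have 36 and some have 39
limit = pool_limit([39, 36, 39])
guest_bits = [effective_phys_bits(h, limit) for h in (39, 36, 39)]
```

Every host in the pool then advertises the same value to the guest,
which is what makes migration from the 39 hosts to the 36 hosts work.
On the command line that would be something like
-cpu host,host-phys-bits=on,host-phys-bits-limit=36.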

What is missing:

 * Some way for the firmware to get a phys-bits value it can actually
   use.  One possible way would be to have a paravirtual bit somewhere
   telling whether host-phys-bits is enabled or not.

If edk2 could figure out what the usable (guest) physical address space
actually is it could:

  (a) make sure it never crosses that limit, and
  (b) pick better defaults, for example make the pci64 hole larger
      than 32G in case the available address space allows that.
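
As an illustration of (a) and (b), a sketch of what such a policy could
look like once a trustworthy phys-bits value is available.  The doubling
strategy and the function name are assumptions made up for this example,
not something edk2 does today.

```python
GiB = 1 << 30

def pick_pci64_window(ram_top, usable_phys_bits, default=32 * GiB):
    """Return (base, size) of the largest power-of-two multiple of
    `default` that fits below the address space limit when placed at
    its own alignment above ram_top, or None if nothing fits."""
    limit = 1 << usable_phys_bits
    best = None
    size = default
    while True:
        base = (ram_top + size - 1) // size * size
        if base + size > limit:
            break          # (a) never cross the limit
        best = (base, size)
        size *= 2          # (b) try a larger hole
    return best

tight = pick_pci64_window(30 * GiB, 36)   # only the default fits
roomy = pick_pci64_window(30 * GiB, 39)   # room for a larger hole
nofit = pick_pci64_window(40 * GiB, 36)   # default would cross limit
```

Here tight is a 32G window at 32G, roomy is a 256G window at 256G, and
nofit is None, so the firmware would know up front that the pci64 hole
can't be placed instead of silently programming unusable addresses.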

take care,
  Gerd
