From: Joao Martins
Subject: Re: [PATCH v3 4/6] i386/pc: relocate 4g start to 1T where applicable
Date: Fri, 25 Feb 2022 17:40:26 +0000

On 2/25/22 12:49, Michael S. Tsirkin wrote:
> On Fri, Feb 25, 2022 at 12:36:24PM +0000, Joao Martins wrote:
>> I am trying to approach this iteratively, starting by fixing AMD 1T+ guests
>> with something that is hopefully less painful to bear and unbreaks users
>> doing multi-TB guests on kernels >= 5.4, while for < 5.4 it would no longer
>> wrongly DMA map bad IOVAs that may lead to the guest's own spurious failures.
>> For the long term, qemu would need some sort of handling of a configurable
>> sparse map of all guest RAM, which currently does not exist (and it's
>> stuffed inside on a per-machine basis as you're aware). What I am unsure
>> about is the churn associated with it (compat, migration, mem-hotplug,
>> nvdimms, memory-backends) versus the benefit if it's "just" one class of
>> x86 platforms (Intel not affected) -- which is what I find attractive about
>> the past 2 revisions with their smaller change.
> 
> Right. I pondered this for a while and I wonder whether you considered
> making this depend on the guest cpu vendor and max phys bits. 

Hmmm, I am already considering phys-bits (or +host-phys-bits), but not
max_host_phys_bits, and I am not sure the latter is relevant for this case.
phys-bits is what we need to gate on, as that's what's ultimately exposed to
the guest based on the various -cpu options. I can bring back what v2 and
prior did: check whether phys-bits are enough for the relocation and bail
out if not.
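
For illustration, the kind of gate I have in mind looks roughly like the
sketch below (pc_max_used_gpa() is a placeholder name, not existing code;
error reporting just uses the usual Error API):

    /*
     * Sketch only: refuse to start if the highest guest physical address
     * (after any relocation) no longer fits in the vCPU's phys-bits.
     */
    static void pc_check_phys_bits(PCMachineState *pcms, X86CPU *cpu,
                                   Error **errp)
    {
        uint64_t max_gpa = pc_max_used_gpa(pcms); /* placeholder helper */
        uint64_t limit = (1ULL << cpu->phys_bits) - 1;

        if (max_gpa > limit) {
            error_setg(errp, "Address space limit 0x%" PRIx64 " < 0x%" PRIx64
                       ", phys-bits too low (%u)",
                       limit, max_gpa, cpu->phys_bits);
        }
    }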

> Things
> are easier to debug if the memory map is the same whatever the host. The
> guest vendor typically matches the host cpu vendor after all, and there
> just could be guests avoiding the reserved memory ranges on principle.
> 
Regarding the guest CPU vendor: if we gate on the guest CPU vendor alone, this
actually increases the set of guests it might affect, compared to just checking
for the existence of a host AMD IOMMU. Checking for the AMD IOMMU would exclude
no-host-IOMMU 1T AMD guests, which do not need to consider this HT reserved
range.

I can restrict this solely to the guest CPU being AMD (assuming
-cpu host is covered too), if folks have mixed feelings about checking
for a host AMD IOMMU.

To be clear, checking the guest CPU vendor alone would not capture the case of
using -cpu {Skylake,...} on an AMD host, so the failure would occur just
the same. I assume you're OK with that.
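
And to be explicit about what "guest CPU being AMD" means here: it is just
the guest's CPUID vendor string, independent of the host. A standalone sketch
of the check (QEMU would naturally use its own vendor fields/macros rather
than this helper):

    /*
     * Sketch only: the guest is considered AMD if CPUID leaf 0 reports
     * the "AuthenticAMD" vendor string (EBX, EDX, ECX concatenated).
     */
    #include <stdbool.h>
    #include <string.h>

    static bool guest_vendor_is_amd(const char vendor[12])
    {
        return memcmp(vendor, "AuthenticAMD", 12) == 0;
    }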

> We'll need a bunch of code comments explaining all this hackery, as well
> as machine type compat things, but that is par for the course.
> 
> Additionally, we could have a host check and then fail to init vdpa and
> vfio devices if the memory map will make some memory inaccessible.
> 
> Does this sound reasonable to others? Alex? Joao?
> 
The earlier part sounds reasonable.

Regarding the device init failure logic, I think the only one that might need
checking is vDPA, as vfio already validates that sort of thing (on >= 5.4).
Albeit, given how agnostic this is to the -device arguments passed, the memory
map gets adjusted to make it work for vfio/vdpa anyway (whether gated solely
on the guest AMD vendor or on host AMD IOMMU existence) ... so I am not sure
this is needed.
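
To restate why I think the per-device check may be redundant: once the gate
fires, the start of above-4G RAM simply moves past the HT hole, so neither
vfio nor vDPA should ever see RAM on those GPAs. Roughly (sketch only; the
range is the HyperTransport reserved area FD_0000_0000h - FF_FFFF_FFFFh,
helper name hypothetical):

    #include <stdint.h>

    #define AMD_HT_START        0xfd00000000ULL  /* 1012 GiB */
    #define AMD_ABOVE_1TB_START 0x10000000000ULL /* 1 TiB */

    /* Sketch: relocate the above-4G start to 1T if RAM would otherwise
     * intersect the AMD HT reserved range. */
    static uint64_t pick_above_4g_start(uint64_t start, uint64_t size)
    {
        if (start + size > AMD_HT_START) {
            return AMD_ABOVE_1TB_START;
        }
        return start;
    }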


