Re: [Qemu-ppc] [qemu-s390x] [Qemu-devel] [PATCH for-2.13] Clear mem_path

From: Christian Borntraeger
Subject: Re: [Qemu-ppc] [qemu-s390x] [Qemu-devel] [PATCH for-2.13] Clear mem_path if we fall back to anonymous RAM allocation
Date: Fri, 20 Apr 2018 09:13:52 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.6.0

On 04/19/2018 06:08 PM, Greg Kurz wrote:
> On Thu, 19 Apr 2018 16:11:37 +0200
> David Hildenbrand <address@hidden> wrote:
>> On 19.04.2018 15:34, Christian Borntraeger wrote:
>>> On 04/19/2018 02:58 PM, Cornelia Huck wrote:  
>>>> On Thu, 19 Apr 2018 14:33:18 +0200
>>>> Igor Mammedov <address@hidden> wrote:
>>>>> On Thu, 19 Apr 2018 17:21:23 +1000
>>>>> David Gibson <address@hidden> wrote:
>>>>>> If the -mem-path option is set, we attempt to map the guest's RAM from a
>>>>>> file in the given path; it's usually used to back guest RAM with hugepages.
>>>>>> If we're unable to (e.g. not enough free hugepages) then we fall back to
>>>>>> allocating normal anonymous pages.  This behaviour can be surprising, but a
>>>>>> comment in allocate_system_memory_nonnuma() suggests it's legacy behaviour
>>>>>> we can't change.
>>>>>> What really isn't ok, though, is that in this case we leave mem_path set.
>>>>>> That means functions which attempt to determine the pagesize of main RAM
>>>>>> can erroneously think it is hugepage based on the requested path, even
>>>>>> though it's not.
>>>>>> This is particularly bad for the pseries machine type.  KVM HV limitations
>>>>>> mean the guest can't use pagesizes larger than the host page size used to
>>>>>> back RAM.  That means that such a fallback, rather than merely giving
>>>>>> poorer performance than expected, will cause the guest to freeze up early in
>>>>>> boot as it attempts to use large page mappings that can't work.
>>>>>> This patch addresses the problem by clearing the mem_path variable when we
>>>>>> fall back to anonymous pages, meaning that subsequent attempts to
>>>>>> determine the RAM page size will get an accurate result.
>>>>>> Signed-off-by: David Gibson <address@hidden>
>>>>>> ---
>>>>>>  numa.c | 1 +
>>>>>>  1 file changed, 1 insertion(+)
>>>>>> Paolo et al, as with my earlier patches adding some extensions to the
>>>>>> helpers for determining backing page sizes, if there are no objections
>>>>>> can I get an ack to merge this via my ppc tree?
>>>>>> diff --git a/numa.c b/numa.c
>>>>>> index 1116c90af9..78a869e598 100644
>>>>>> --- a/numa.c
>>>>>> +++ b/numa.c
>>>>>> @@ -469,6 +469,7 @@ static void allocate_system_memory_nonnuma(MemoryRegion *mr, Object *owner,
>>>>>>              /* Legacy behavior: if allocation failed, fall back to
>>>>>>               * regular RAM allocation.
>>>>>>               */
>>>>>> +            mem_path = NULL;
>>>>>>              memory_region_init_ram_nomigrate(mr, owner, name, ram_size, &error_fatal);
>>>>>>          }
>>>>>>  #else    
>>>>> mem_path is also used by kvm_s390_apply_cpu_model(),
>>>>> and in ccw_init() memory is initialized before CPUs are
> Something similar happens with spapr: kvm_fixup_page_sizes() calls
> qemu_getrampagesize() during CPU start, which happens before the machine
> init calls allocate_system_memory_nonnuma(). Shouldn't we allocate memory
> before calling spapr_init_cpus() in spapr_machine_init() then ?
>>>>> so if QEMU was started with -mem-path, then before the patch the
>>>>> created CPU won't have CMM enabled and will print the warning:
>>>>>  "CMM will not be enabled because it is not compatible with hugetlbfs."
>>>>> and after patch it might enable CMM if we clear mem_path.
>>>>> So question is do we care about this?  
>>>> I don't quite remember the cmm semantics here -- Christian?  
>>> The CMMA interface does not work on large pages. I think the kernel will react
>>> with EFAULT in some cases (cmma migration and others) so qemu will probably fail
>>> unexpectedly.
>>> But this patch seems to only clear mem-path if we do not allocate at all from
>>> hugetlbfs. So things should be ok, no?
>> This even looks like the right thing to me, as hugetlbfs was never
>> supported.
> Unrelated to this patch, -mem-path can be passed something that doesn't sit
> in a hugetlbfs, in which case we use getpagesize()... is there a reason for
> kvm_s390_enable_cmma() to filter out this case as well ? Or should we rather
> check mem_path isn't NULL and points to a hugetlbfs ?

CMM is somewhat special, so I prefer to enable it only for non-mem-path memory,
since I know that it works for anonymous pages. I would rather whitelist other
mechanisms in the future if necessary.
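
To illustrate, here is a minimal sketch of how that policy interacts with the
fallback from the patch. It is not QEMU code: only the name mem_path comes from
the patch, while alloc_from_mem_path(), allocate_guest_ram(), cmma_allowed()
and the simulate_failure flag are hypothetical stand-ins made up for this
example.

```c
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for QEMU's global mem_path (the -mem-path argument, or NULL). */
static const char *mem_path;

/*
 * Hypothetical placeholder for the file-backed allocation attempt.
 * 'simulate_failure' exists only so the fallback path can be exercised.
 */
static bool alloc_from_mem_path(const char *path, bool simulate_failure)
{
    return path != NULL && !simulate_failure;
}

/* Returns true if guest RAM ended up backed by the -mem-path file. */
static bool allocate_guest_ram(bool simulate_failure)
{
    if (mem_path != NULL && alloc_from_mem_path(mem_path, simulate_failure)) {
        return true;
    }
    /*
     * Fallback to anonymous RAM: clear mem_path, as the patch does, so that
     * later checks see what was really allocated rather than what was asked.
     */
    mem_path = NULL;
    return false;
}

/* Whitelist policy: allow CMMA only for anonymous (non-mem-path) memory. */
static bool cmma_allowed(void)
{
    return mem_path == NULL;
}
```

With mem_path set and the allocation succeeding, cmma_allowed() stays false no
matter what filesystem the path sits on. Once the fallback has cleared
mem_path, it returns true, which matches the memory we actually got.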