qemu-devel
Re: [Qemu-devel] [PATCH QEMU] transparent hugepage support


From: Andrea Arcangeli
Subject: Re: [Qemu-devel] [PATCH QEMU] transparent hugepage support
Date: Thu, 11 Mar 2010 17:46:42 +0100

On Thu, Mar 11, 2010 at 04:28:04PM +0000, Paul Brook wrote:
> > > +         /*
> > > +          * Align on HPAGE_SIZE so "(gfn ^ pfn) &
> > > +          * (HPAGE_SIZE-1) == 0" to allow KVM to take advantage
> > > +          * of hugepages with NPT/EPT.
> > > +          */
> > > +         new_block->host = qemu_memalign(1 << TARGET_HPAGE_BITS, size);
> 
> This should not be target dependent. i.e. it should be the host page size.

Yep, I noticed. I'm not aware of an official way to get that
information out of the kernel (Hugepagesize in /proc/meminfo depends
on hugetlbfs, which in turn is not a dependency for transparent
hugepage support), but hey, I can add it myself to
/sys/kernel/mm/transparent_hugepage/hugepage_size !
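
A minimal sketch of how userland might consume such a file, assuming it
holds a single decimal byte count. The path is only the one proposed
above, not an existing interface, and the 2M fallback and the name
read_hpage_size() are assumptions made for illustration.

```c
#include <stdio.h>
#include <stddef.h>

/* Assumed fallback when the file is missing: 2 MiB (x86-64 PMD size). */
#define FALLBACK_HPAGE_SIZE ((size_t)2 << 20)

/*
 * Read a decimal byte count from `path` and return it.
 * Falls back to FALLBACK_HPAGE_SIZE if the file cannot be opened,
 * cannot be parsed, or does not contain a power of two.
 */
static size_t read_hpage_size(const char *path)
{
    unsigned long long val = 0;
    FILE *f = fopen(path, "r");

    if (!f) {
        return FALLBACK_HPAGE_SIZE;
    }
    if (fscanf(f, "%llu", &val) != 1 || val == 0 || (val & (val - 1)) != 0) {
        val = FALLBACK_HPAGE_SIZE;
    }
    fclose(f);
    return (size_t)val;
}
```

A caller would pass the sysfs path proposed above and use the result as
the alignment argument to qemu_memalign().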

> > That is a little wasteful.  How about a hint to mmap() requesting proper
> > alignment (MAP_HPAGE_ALIGN)?
> 
> I'd kinda hope that we wouldn't need to. i.e. the host kernel is smart enough 
> to automatically align large allocations anyway.

The kernel won't do that, and the main reason is to avoid creating more
vmas. It's more efficient to waste virtual space and have userland
allocate more than needed than to ask the kernel for alignment and
force it to create more vmas because of the holes that alignment
generates. Virtual memory costs nothing.
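
To illustrate the "allocate more than needed" approach, here is a rough
sketch that reserves size + align bytes with one mmap() and rounds the
pointer up in userland. The function name alloc_hpage_aligned() is
hypothetical, and the sketch never unmaps the slack, on the assumption
that keeping a single vma matters more than the wasted virtual space.

```c
#define _DEFAULT_SOURCE
#include <stdint.h>
#include <stddef.h>
#include <sys/mman.h>

/*
 * Map size + align bytes of anonymous memory and return the first
 * address inside the mapping that is a multiple of `align`.
 * `align` must be a power of two. The unused head and tail stay mapped
 * but untouched, so they consume virtual space only and the kernel
 * keeps one vma. Returns NULL on failure.
 */
static void *alloc_hpage_aligned(size_t size, size_t align)
{
    void *raw;
    uintptr_t addr;

    if (align == 0 || (align & (align - 1)) != 0 || size > SIZE_MAX - align) {
        return NULL;
    }
    raw = mmap(NULL, size + align, PROT_READ | PROT_WRITE,
               MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (raw == MAP_FAILED) {
        return NULL;
    }
    /* Round up to the next multiple of align. */
    addr = ((uintptr_t)raw + align - 1) & ~((uintptr_t)align - 1);
    return (void *)addr;
}
```

The returned pointer cannot be passed to free() or munmap() directly; a
real allocator would have to remember the raw mapping address as well.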

Also, khugepaged can later zero out the pte_none regions to create a
full segment all backed by hugepages; however, if we do that, khugepaged
will eat into the free memory space. At the moment I kept khugepaged a
zero-memory-footprint thing. But I'm currently adding an option called
collapse_unmapped to allow khugepaged to collapse unmapped pages too,
so if there are only 2/3 pages in the region before the memalign, they
can also be mapped by a large TLB entry to allow qemu to run faster.

> This is probably a useful optimization regardless of KVM.

HPAGE alignment is only useful with KVM because it can only pay off
with EPT/NPT. Transparent hugepage already works fine without it
(but OK, it'd be a micro-optimization for the first and last few pages
in the whole vma). This is why I made it conditional on
kvm_enabled(). I can remove the kvm_enabled() check if you worry about
the first and last pages in the huge anon vma.
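
For reference, the condition in the patch comment can be sketched as a
small predicate. This version works on byte addresses (guest physical
and host virtual) rather than on frame numbers, which is an assumption
made to keep the mask simple; the name hpage_congruent() is hypothetical.

```c
#include <stdint.h>
#include <stdbool.h>

/*
 * Return true if a guest physical address and the host virtual address
 * backing it have the same offset within a hugepage, i.e. they agree in
 * all bits below hpage_size. hpage_size must be a power of two.
 */
static bool hpage_congruent(uint64_t gpa, uintptr_t hva, uint64_t hpage_size)
{
    return ((gpa ^ (uint64_t)hva) & (hpage_size - 1)) == 0;
}
```

When guest RAM starts at a hugepage-aligned guest physical address,
aligning new_block->host on the hugepage size makes this hold for every
page in the block.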

OTOH the madvise(MADV_HUGEPAGE) is surely a good idea for qemu too. KVM
normally runs on 64-bit hosts, so it's no big deal if we waste 1M of
virtual memory here and there, but I thought on qemu you preferred not
to have alignment and to have the first few and last few pages in a vma
not backed by a large TLB entry. Ideally we should also align on hpage
size if sizeof(long) == 8. Not sure what's the recommended way to code
that though, and it'll make it a bit more complex for little good.
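
One possible way to code that, as a sketch only: pick the alignment from
sizeof(long) and always issue the madvise() when the header defines
MADV_HUGEPAGE. The 2M ASSUMED_HPAGE_SIZE constant and the name
alloc_guest_ram() are assumptions for illustration, not what the patch
does, and posix_memalign() stands in for qemu_memalign().

```c
#define _DEFAULT_SOURCE
#include <stdint.h>
#include <stdlib.h>
#include <stddef.h>
#include <unistd.h>
#include <sys/mman.h>

/* Assumed host hugepage size; a real build would query the host. */
#define ASSUMED_HPAGE_SIZE ((size_t)2 << 20)

/*
 * Allocate guest RAM. On 64-bit hosts align on the hugepage size, since
 * wasting virtual space is cheap there; on 32-bit hosts keep plain page
 * alignment. In both cases hint the kernel that hugepages are welcome.
 */
static void *alloc_guest_ram(size_t size)
{
    size_t align = (sizeof(long) == 8) ? ASSUMED_HPAGE_SIZE
                                       : (size_t)sysconf(_SC_PAGESIZE);
    void *p = NULL;

    if (posix_memalign(&p, align, size) != 0) {
        return NULL;
    }
#ifdef MADV_HUGEPAGE
    /* Advisory only: a kernel without the feature just returns an error. */
    (void)madvise(p, size, MADV_HUGEPAGE);
#endif
    return p;
}
```

The compiler folds the sizeof(long) test at build time, so no
preprocessor check on the host word size is needed.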
