qemu-devel

Re: 9 TiB vm memory creation


From: Ani Sinha
Subject: Re: 9 TiB vm memory creation
Date: Tue, 15 Feb 2022 15:18:45 +0530

On Tue, Feb 15, 2022 at 3:14 PM David Hildenbrand <david@redhat.com> wrote:
>
> On 15.02.22 10:40, Ani Sinha wrote:
> > On Tue, Feb 15, 2022 at 2:08 PM David Hildenbrand <david@redhat.com> wrote:
> >>
> >> On 15.02.22 09:12, Ani Sinha wrote:
> >>> On Tue, Feb 15, 2022 at 1:25 PM David Hildenbrand <david@redhat.com> wrote:
> >>>>
> >>>> On 15.02.22 08:00, Ani Sinha wrote:
> >>>>>
> >>>>>
> >>>>> On Mon, 14 Feb 2022, David Hildenbrand wrote:
> >>>>>
> >>>>>> On 14.02.22 13:36, Igor Mammedov wrote:
> >>>>>>> On Mon, 14 Feb 2022 10:54:22 +0530 (IST)
> >>>>>>> Ani Sinha <ani@anisinha.ca> wrote:
> >>>>>>>
> >>>>>>>> Hi Igor:
> >>>>>>>>
> >>>>>>>> I failed to spawn a 9 TiB VM. The max I could do was a 2 TiB VM on my
> >>>>>>>> system with the following command line before either the system
> >>>>>>>> destabilized or the OOM killer killed qemu:
> >>>>>>>>
> >>>>>>>> -m 2T,maxmem=9T,slots=1 \
> >>>>>>>> -object memory-backend-file,id=mem0,size=2T,mem-path=/data/temp/memfile,prealloc=off \
> >>>>>>>> -machine memory-backend=mem0 \
> >>>>>>>> -chardev file,path=/tmp/debugcon2.txt,id=debugcon \
> >>>>>>>> -device isa-debugcon,iobase=0x402,chardev=debugcon \
> >>>>>>>>
> >>>>>>>> I have attached the debugcon output from the 2 TiB VM.
> >>>>>>>> Are there any other command-line parameters or options I should try?
> >>>>>>>>
> >>>>>>>> thanks
> >>>>>>>> ani
> >>>>>>>
> >>>>>>> $ truncate -s 9T 9tb_sparse_disk.img
> >>>>>>> $ qemu-system-x86_64 -m 9T \
> >>>>>>>   -object memory-backend-file,id=mem0,size=9T,mem-path=9tb_sparse_disk.img,prealloc=off,share=on \
> >>>>>>>   -machine memory-backend=mem0
> >>>>>>>
> >>>>>>> works for me up to the GRUB menu; with sufficient guest kernel
> >>>>>>> persuasion (i.e. limiting the RAM size on the kernel command line to
> >>>>>>> something reasonable) you can boot a Linux guest on it and inspect the
> >>>>>>> SMBIOS tables comfortably.
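(One way to apply that guest-side limit is the kernel's mem= parameter with a
direct kernel boot; the kernel image and the 8G cap below are illustrative
only:

  qemu-system-x86_64 -m 9T \
    -object memory-backend-file,id=mem0,size=9T,mem-path=9tb_sparse_disk.img,prealloc=off,share=on \
    -machine memory-backend=mem0 \
    -kernel bzImage -append "mem=8G console=ttyS0"

The guest kernel then uses only 8G of the 9T the VM was given.)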
> >>>>>>>
> >>>>>>>
> >>>>>>> With KVM enabled it bails out with:
> >>>>>>>    qemu-system-x86_64: kvm_set_user_memory_region: KVM_SET_USER_MEMORY_REGION failed, slot=1, start=0x100000000, size=0x8ff40000000: Invalid argument
> >>>>>>>
> >>>>>
> >>>>> I have seen this on my system, but not always. Maybe I should have dug
> >>>>> deeper into why I don't see it every time.
> >>>>>
> >>>>>>> all of that on a host with 32G of RAM/no swap.
> >>>>>>>
> >>>>>
> >>>>> My system has 16 GiB of main memory, no swap.
> >>>>>
> >>>>>>
> >>>>>> #define KVM_MEM_MAX_NR_PAGES ((1UL << 31) - 1)
> >>>>>>
> >>>>>> ~8 TiB (7.999999...)
> >>>>>
> >>>>> That's not 8 TiB, that's 2 GiB. But yes, 0x8ff40000000 is certainly
> >>>>> greater than 2 GiB * 4K (assuming 4K pages).
> >>>>
> >>>> "pages" don't carry the unit "GiB/TiB", so I was talking about the
> >>>> actual size with 4k pages (your setup, I assume)
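(To sanity-check the arithmetic, a small C program, assuming 4 KiB pages; the
slot size is the size= value from the KVM error above:

  #include <stdio.h>
  #include <stdint.h>

  /* the limit David quoted */
  #define KVM_MEM_MAX_NR_PAGES ((1UL << 31) - 1)

  int main(void)
  {
      uint64_t page = 4096;                /* assuming 4 KiB pages */
      uint64_t limit = (uint64_t)KVM_MEM_MAX_NR_PAGES * page;
      uint64_t slot = 0x8ff40000000ULL;    /* size= from the failed memslot */

      printf("max slot: %llu bytes (~%.9f TiB)\n",
             (unsigned long long)limit, limit / 1099511627776.0);
      printf("requested: %llu pages, limit: %llu pages\n",
             (unsigned long long)(slot / page),
             (unsigned long long)KVM_MEM_MAX_NR_PAGES);
      return 0;
  }

This prints a limit of ~7.999999996 TiB and 2415132672 requested pages against
the 2147483647-page maximum, which is exactly why the registration fails.)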
> >>>
> >>> Yes, I got that after reading your email again.
> >>> The interesting question now is: how is Red Hat QE running a 9 TiB VM
> >>> with KVM?
> >>
> >> As I already indicated when noting that s390x only has single large NUMA
> >> nodes, x86 usually uses multiple NUMA nodes with this much memory.
> >> And QE seems to be using virtual NUMA nodes:
> >>
> >> Each of the 32 virtual NUMA nodes receives a:
> >>
> >>   -object memory-backend-ram,id=ram-node20,size=309237645312,host-nodes=0-31,policy=bind
> >>
> >> which results in a dedicated KVM memslot (just like each DIMM would)
> >>
> >>
> >> 32 * 309237645312 == 9 TiB :)
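(For the archives, a minimal sketch of that per-node layout; the node count
and per-node size come from the QE command line above, the rest is
illustrative, and the host-nodes binding and CPU-to-node assignment are
omitted for brevity:

  qemu-system-x86_64 -m 9T -smp 32 \
    -object memory-backend-ram,id=ram-node0,size=309237645312 \
    -numa node,nodeid=0,memdev=ram-node0 \
    -object memory-backend-ram,id=ram-node1,size=309237645312 \
    -numa node,nodeid=1,memdev=ram-node1 \
    ...                                    # and so on up to ram-node31

Each memdev backs one virtual NUMA node and gets its own KVM memslot, so every
slot stays well below the ~8 TiB per-slot maximum.)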
> >
> > Ah, I should have looked more closely at the other command lines before
> > shooting off the email. Yes, the limitation is per memslot, and they
> > have 32 slots, one per node.
> > OK, so should we call
> > kvm_set_max_memslot_size(KVM_SLOT_MAX_BYTES);
> > from the i386 kvm_arch_init()?
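(Roughly like this, mirroring the s390x precedent; this is an untested sketch,
and the 4 TiB split size is simply the value s390x picked to stay under
KVM_MEM_MAX_NR_PAGES while keeping slot boundaries nicely aligned.)

  /* hypothetical addition to target/i386/kvm/kvm.c */
  #define KVM_SLOT_MAX_BYTES (4ULL * 1024 * 1024 * 1024 * 1024)

  int kvm_arch_init(MachineState *ms, KVMState *s)
  {
      ...
      /* have the generic KVM code transparently split large memory
       * regions into multiple memslots of at most this size */
      kvm_set_max_memslot_size(KVM_SLOT_MAX_BYTES);
      ...
  }

kvm_set_max_memslot_size() already exists in accel/kvm and is what s390x
calls from its own kvm_arch_init().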
>
>
> As I said, I'm not a friend of these workarounds in user space.

Oh OK, I did not realize you were against s390x-like workarounds.


