Re: [Qemu-devel] Question about wrong ram-node0 reference


From: liujunjie (A)
Subject: Re: [Qemu-devel] Question about wrong ram-node0 reference
Date: Mon, 27 May 2019 12:51:00 +0000

Only one VM aborted among at least 20 VMs with the same configuration, and we
have not been able to reproduce the problem so far. (We know that reproducing
it is important for figuring out the problem; we have already tried adding
more VMs to reproduce it, but with no results yet.)
The qemu cmdline is as follows:
/usr/bin/qemu-kvm -name guest=instance-00025bf8,debug-threads=on -S -object 
secret,id=masterKey0,format=raw,file=/var/run/libvirt/qemu/domain-118-instance-00025bf8/master-key.aes
 -machine 
pc-i440fx-2.8,accel=kvm,usb=off,dump-guest-core=off,max-ram-below-4g=2G -cpu 
host,host-cache-info=on -m 131072 -realtime min_guarantee=131072,mlock=off -smp 
16,sockets=2,cores=4,threads=2 -object iothread,id=iothread1 -object 
iothread,id=iothread2 -object iothread,id=iothread3 -object 
iothread,id=iothread4 -object iothread,id=iothread5 -object 
iothread,id=iothread6 -object iothread,id=iothread7 -object 
iothread,id=iothread8 -object iothread,id=iothread9 -object 
iothread,id=iothread10 -object iothread,id=iothread11 -object 
iothread,id=iothread12 -object iothread,id=iothread13 -object 
iothread,id=iothread14 -object iothread,id=iothread15 -object 
iothread,id=iothread16 -object iothread,id=iothread17 -object 
iothread,id=iothread18 -object iothread,id=iothread19 -object 
iothread,id=iothread20 -object iothread,id=iothread21 -object 
iothread,id=iothread22 -object iothread,id=iothread23 -object 
iothread,id=iothread24 -object iothread,id=iothread25 -object 
iothread,id=iothread26 -object iothread,id=iothread27 -object 
iothread,id=iothread28 -object iothread,id=iothread29 -object 
memory-backend-file,id=ram-node0,prealloc=yes,mem-path=/dev/hugepages/libvirt/qemu/118-instance-00025bf8,share=yes,size=68719476736,host-nodes=0,policy=bind
 -numa node,nodeid=0,cpus=0-7,memdev=ram-node0 -object 
memory-backend-file,id=ram-node1,prealloc=yes,mem-path=/dev/hugepages/libvirt/qemu/118-instance-00025bf8,share=yes,size=68719476736,host-nodes=1,policy=bind
 -numa node,nodeid=1,cpus=8-15,memdev=ram-node1 -uuid 
6952c043-4e0c-4267-80c1-fac2e302443f -smbios type=1,manufacturer=OpenStack 
Foundation,product=OpenStack 
Nova,version=13.2.1-20181119144459,serial=c5cc21e6-1d3b-4587-8c1e-208a1d19a47e,uuid=6952c043-4e0c-4267-80c1-fac2e302443f,family=Virtual
 Machine -no-user-config -nodefaults -chardev 
socket,id=charmonitor,path=/var/run/libvirt/qemu/domain-118-instance-00025bf8/monitor.sock,server,nowait
 -mon chardev=charmonitor,id=monitor,mode=control -rtc 
base=2019-01-21T06:59:37,clock=vm,driftfix=slew -global 
kvm-pit.lost_tick_policy=delay -no-hpet -no-shutdown -boot strict=on -device 
pci-bridge,chassis_nr=1,id=pci.1,bus=pci.0,addr=0x3 -device 
pci-bridge,chassis_nr=2,id=pci.2,bus=pci.0,addr=0x4 -device 
pci-bridge,chassis_nr=3,id=pci.3,bus=pci.0,addr=0x5 -device 
pci-bridge,chassis_nr=4,id=pci.4,bus=pci.0,addr=0x6 -device 
pci-bridge,chassis_nr=5,id=pci.5,bus=pci.0,addr=0x7 -device 
pci-bridge,chassis_nr=6,id=pci.6,bus=pci.0,addr=0x8 -device 
pci-bridge,chassis_nr=7,id=pci.7,bus=pci.0,addr=0x9 -device 
pci-bridge,chassis_nr=8,id=pci.8,bus=pci.0,addr=0xa -device 
piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -device 
virtio-serial-pci,id=virtio-serial0,bus=pci.0,addr=0xb -drive 
file=/dev/mapper/648d06e72e68404a9401854e21409f3d-dm,format=raw,if=none,id=drive-virtio-disk0,serial=648d06e7-2e68-404a-9401-854e21409f3d,cache=none,aio=native
 -device 
virtio-blk-pci,scsi=off,bus=pci.2,addr=0x1,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1
 -chardev socket,id=charnet0,path=/var/run/vhost-user/tap4ba9f4eb-19 -netdev 
vhost-user,chardev=charnet0,queues=4,id=hostnet0 -device 
virtio-net-pci,mq=on,vectors=10,netdev=hostnet0,id=net0,mac=fa:16:3e:0f:ed:94,bus=pci.4,addr=0x3,bootindex=2
 -add-fd set=0,fd=45 -chardev file,id=charserial0,path=/dev/fdset/0,append=on 
-device isa-serial,chardev=charserial0,id=serial0 -chardev 
socket,id=charchannel0,path=/var/run/libvirt/qemu/instance-00025bf8.extend,server,nowait
 -device 
virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,name=org.qemu.guest_agent.1
 -chardev 
socket,id=charchannel1,path=/var/run/libvirt/qemu/instance-00025bf8.agent,server,nowait
 -device 
virtserialport,bus=virtio-serial0.0,nr=2,chardev=charchannel1,id=channel1,name=org.qemu.guest_agent.0
 -chardev 
socket,id=charchannel2,path=/var/run/libvirt/qemu/instance-00025bf8.hostd,server,nowait
 -device 
virtserialport,bus=virtio-serial0.0,nr=3,chardev=charchannel2,id=channel2,name=org.qemu.guest_agent.2
 -chardev 
socket,id=charchannel3,path=/var/run/libvirt/qemu/instance-00025bf8.upgraded,server,nowait
 -device 
virtserialport,bus=virtio-serial0.0,nr=4,chardev=charchannel3,id=channel3,name=org.qemu.guest_agent.3
 -device usb-tablet,id=input0,bus=usb.0,port=1 -vnc 172.28.5.246:3,password -k 
en-us -device cirrus-vga,id=video0,vgamem_mb=16,bus=pci.0,addr=0x2 -device 
vfio-pci,host=95:00.0,id=hostdev0,bus=pci.5,addr=0x1 -device 
vfio-pci,host=99:00.0,id=hostdev1,bus=pci.5,addr=0x2 -device 
vfio-pci,host=35:00.0,id=hostdev2,peer-clique-id=0,iomem=0x98000000-0x98ffffff:0x3e800000000-0x3ebffffffff:0x3ec00000000-0x3ec01ffffff,bus=pci.0,addr=0xc
 -device 
vfio-pci,host=39:00.0,id=hostdev3,peer-clique-id=0,iomem=0x92000000-0x92ffffff:0x3e000000000-0x3e3ffffffff:0x3e400000000-0x3e401ffffff,bus=pci.0,addr=0xd
 -global p2p.downstream_ports=28:10.0 28:14.0 -device 
virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0xe -NetInterruptAutobind 
-chardev 
file,id=seabios,path=/var/log/libvirt/qemu/instance-00025bf8.seabios,mux=off,append=on
 -device isa-debugcon,iobase=0x402,chardev=seabios -msg timestamp=on
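
As a quick sanity check of the memory layout in this command line, the two
memory-backend-file sizes can be compared against the -m value. This is only a
sketch: the helper name `summarize` is made up for illustration, and the 1 GiB
page size is taken from the "1G huge pages" mentioned in the quoted report
below, not from the command line itself.

```python
GIB = 1 << 30
MIB = 1 << 20


def summarize(backend_sizes, guest_mem_mib, hugepage_size=GIB):
    """Compare per-node backend sizes (bytes) with the -m value (MiB).

    hugepage_size is an assumption (1 GiB pages, per the quoted report).
    """
    total = sum(backend_sizes.values())
    return {
        "total_mib": total // MIB,
        "matches_guest_mem": total == guest_mem_mib * MIB,
        "pages_per_node": {
            name: size // hugepage_size for name, size in backend_sizes.items()
        },
    }


# Values copied from the command line: size=68719476736 per node, -m 131072.
report = summarize(
    {"ram-node0": 68719476736, "ram-node1": 68719476736},
    guest_mem_mib=131072,
)
print(report)
```

Both backends are configured identically apart from host-nodes, so the
command line alone does not explain why only ram-node0 is affected.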

> -----Original Message-----
> From: Igor Mammedov [mailto:address@hidden
> Sent: Monday, May 27, 2019 3:57 PM
> To: liujunjie (A) <address@hidden>
> Cc: address@hidden; address@hidden; address@hidden;
> address@hidden; Zhoujian (jay) <address@hidden>; fangying
> <address@hidden>; wangxin (U) <address@hidden>;
> Huangweidong (C) <address@hidden>
> Subject: Re: Question about wrong ram-node0 reference
> 
> On Sat, 25 May 2019 03:35:20 +0000
> "liujunjie (A)" <address@hidden> wrote:
> 
> > Hi, I have met a problem:
> >
> > The QEMU version is 2.8.1. The virtual machine is configured with 1G huge
> > pages, two NUMA nodes and four pass-through NVMe SSDs.
> >
> > After we started the VM, nothing was done apart from some QMP queries, and
> > QEMU aborted some months later.
> > After that, the VM was restarted, and the problem has not reproduced since.
> > The backtrace of the RCU thread is as follows:
> > (gdb) bt
> > #0  0x00007fd2695f0197 in raise () from /usr/lib64/libc.so.6
> > #1  0x00007fd2695f1888 in abort () from /usr/lib64/libc.so.6
> > #2  0x00007fd2695e9206 in __assert_fail_base () from /usr/lib64/libc.so.6
> > #3  0x00007fd2695e92b2 in __assert_fail () from /usr/lib64/libc.so.6
> > #4  0x0000000000476a84 in memory_region_finalize (obj=<optimized out>)
> >     at /home/abuild/rpmbuild/BUILD/qemu-kvm-2.8.1/memory.c:1512
> > #5  0x0000000000763105 in object_deinit (address@hidden,
> >     address@hidden) at qom/object.c:448
> > #6  0x0000000000763153 in object_finalize (data=0x1dc1fd0)
> >     at qom/object.c:462
> > #7  0x00000000007627cc in object_property_del_all (address@hidden)
> >     at qom/object.c:399
> > #8  0x0000000000763148 in object_finalize (data=0x1dc1f70)
> >     at qom/object.c:461
> > #9  0x0000000000764426 in object_unref (obj=<optimized out>)
> >     at qom/object.c:897
> > #10 0x0000000000473b6b in memory_region_unref (mr=<optimized out>)
> >     at /home/abuild/rpmbuild/BUILD/qemu-kvm-2.8.1/memory.c:1560
> > #11 0x0000000000473bc7 in flatview_destroy (view=0x7fc188b9cb90)
> >     at /home/abuild/rpmbuild/BUILD/qemu-kvm-2.8.1/memory.c:289
> > #12 0x0000000000843be0 in call_rcu_thread (opaque=<optimized out>)
> >     at util/rcu.c:279
> > #13 0x00000000008325c2 in qemu_thread_start (address@hidden)
> >     at util/qemu_thread_posix.c:496
> > #14 0x00007fd269983dc5 in start_thread () from /usr/lib64/libpthread.so.0
> > #15 0x00007fd2696b27bd in clone () from /usr/lib64/libc.so.6
> >
> > In this core, I found that the reference count of "/objects/ram-node0"
> > (ram-node0 is of type struct "HostMemoryBackendFile") equals 0, while the
> > reference count of "/objects/ram-node1" equals 129. More details can be
> > seen at the end of this email.
> >
> > I searched through the community and found a case with the same error
> > report:
> > https://mail.coreboot.org/pipermail/seabios/2017-September/011799.html
> > However, I did not configure pcie_pci_bridge. Besides, in that case qemu
> > aborted in the device initialization phase.
> That case doesn't seem relevant.
> 
> >
> > Also, I tried to find out what can reference "/objects/ram-node0", so as
> > to look for the one that may unreference it improperly. Most of the
> > references are taken in "render_memory_region" or "phys_section_add" when
> > the memory topology changes.
> > Later, the temporary flatviews are destroyed by the RCU thread, so the
> > unreference happens there, and the backtrace is similar to the one shown
> > above.
> > But I am not familiar with the details of this process, and it is hard to
> > keep track of these memory topology changes.
> >
> > My question is:
> > How can ram-node0's reference count drop to 0 while the virtual machine is
> > still running?
> >
> > Maybe someone who is familiar with memory_region_ref or
> > memory-backend-file can help me figure this out.
> > Any idea is appreciated.
> 
> Could you provide steps to reproduce (incl. command line)?
> 
> [...]
> > Thanks,
> > Junjie Liu
> >
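
To illustrate the question quoted above, here is a toy model of reference
counting with temporary views. It is not QEMU code: `RefCounted`, `View` and
`simulate` are hypothetical names, and the "still attached" check only stands
in for whatever the assertion in memory_region_finalize verifies. What it
tries to show is that one unbalanced unref can go unnoticed until a later,
perfectly legitimate unref (here, destroying a view, as the RCU thread does
via flatview_destroy in the backtrace) takes the count to 0.

```python
class RefCounted:
    """Toy stand-in for a reference-counted object (not QEMU's Object)."""

    def __init__(self, name):
        self.name = name
        self.refs = 1          # the owner holds the initial reference
        self.attached = True   # still in use by its owner
        self.finalized = False

    def ref(self):
        self.refs += 1

    def unref(self):
        assert self.refs > 0, "unref() with no references left"
        self.refs -= 1
        if self.refs == 0:
            self.finalize()

    def finalize(self):
        # A finalizer that refuses to run while the object is still in use.
        if self.attached:
            raise AssertionError(f"{self.name}: finalized while still attached")
        self.finalized = True


class View:
    """Toy stand-in for a temporary view that pins the objects it maps."""

    def __init__(self, objs):
        self.objs = list(objs)
        for obj in self.objs:
            obj.ref()

    def destroy(self):
        for obj in self.objs:
            obj.unref()


def simulate(stray_unref):
    backend = RefCounted("ram-node0")   # refs == 1
    old_view = View([backend])          # refs == 2
    new_view = View([backend])          # refs == 3
    if stray_unref:
        backend.unref()                 # the hypothetical bug; refs == 2
    old_view.destroy()                  # balanced unref
    try:
        new_view.destroy()              # balanced unref; may hit 0 if buggy
    except AssertionError as err:
        return str(err)
    return backend.refs
```

In this model the failure is raised from `new_view.destroy()`, which is itself
correct; the stray unref happened earlier and elsewhere. If something similar
occurred here, the RCU thread in the backtrace may only be where the imbalance
became visible, not where it was introduced.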
