Re: Question on memory commit during MR finalize()


From: Peter Xu
Subject: Re: Question on memory commit during MR finalize()
Date: Mon, 19 Jul 2021 15:59:15 -0400

On Mon, Jul 19, 2021 at 07:05:29PM +0000, Thanos Makatos wrote:
> > -----Original Message-----
> > From: Qemu-devel <qemu-devel-
> > bounces+thanos.makatos=nutanix.com@nongnu.org> On Behalf Of Thanos
> > Makatos
> > Sent: 19 July 2021 19:02
> > To: Peter Xu <peterx@redhat.com>
> > Cc: Paolo Bonzini <pbonzini@redhat.com>; John Levon
> > <john.levon@nutanix.com>; John G Johnson <john.g.johnson@oracle.com>;
> > Markus Armbruster <armbru@redhat.com>; QEMU Devel Mailing List
> > <qemu-devel@nongnu.org>
> > Subject: Re: Question on memory commit during MR finalize()
> > 
> > Omg I don't know how I missed that, of course I'll ignore SIGUSR1 and 
> > retest!
> > 
> > ________________________________________
> > From: Peter Xu <peterx@redhat.com>
> > Sent: Monday, 19 July 2021, 16:58
> > To: Thanos Makatos
> > Cc: Paolo Bonzini; Markus Armbruster; QEMU Devel Mailing List; John Levon;
> > John G Johnson
> > Subject: Re: Question on memory commit during MR finalize()
> > 
> > 
> > Hi, Thanos,
> > 
> > On Mon, Jul 19, 2021 at 02:38:52PM +0000, Thanos Makatos wrote:
> > > I can trivially trigger an assertion with a build where I merged the
> > > recent vfio-user patches
> > > (https://patchew.org/QEMU/cover.1626675354.git.elena.ufimtseva@oracle.com/)
> > > to master and then merged the result into your xzpeter/memory-sanity
> > > branch. I've pushed the branch here:
> > > https://github.com/tmakatos/qemu/tree/memory-sanity . I explain the
> > > repro steps below in case you want to take a look:
> > >
> > > Build as follows:
> > >
> > > ./configure --prefix=/opt/qemu-xzpeter --target-list=x86_64-softmmu --enable-kvm --enable-debug --enable-multiprocess && make -j `nproc` && make install
> > >
> > > Then build and run the GPIO sample from libvfio-user
> > > (https://github.com/nutanix/libvfio-user):
> > >
> > > libvfio-user/build/dbg/samples/gpio-pci-idio-16 -v /var/run/vfio-user.sock
> > >
> > > And then run QEMU as follows:
> > >
> > > gdb --args /opt/qemu-xzpeter/bin/qemu-system-x86_64 -cpu host -enable-kvm -smp 4 -m 2G -object memory-backend-file,id=mem0,size=2G,mem-path=/dev/hugepages,share=on,prealloc=yes -numa node,memdev=mem0 -kernel bionic-server-cloudimg-amd64-vmlinuz-generic -initrd bionic-server-cloudimg-amd64-initrd-generic -append 'console=ttyS0 root=/dev/sda1 single' -hda bionic-server-cloudimg-amd64-0.raw -nic user,model=virtio-net-pci -machine pc-q35-3.1 -device vfio-user-pci,socket=/var/run/vfio-user.sock -nographic
> > >
> > > I immediately get the following stack trace:
> > >
> > > Thread 5 "qemu-system-x86" received signal SIGUSR1, User defined signal 1.
> > 
> > This is SIGUSR1.  QEMU uses it for general vCPU IPIs (inter-processor interrupts).
> > 
> > > [Switching to Thread 0x7fffe6e82700 (LWP 151973)]
> > > __lll_lock_wait () at ../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:103
> > > 103     ../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S: No such file or directory.
> > > (gdb) bt
> > > #0  0x00007ffff655d29c in __lll_lock_wait () at ../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:103
> > > #1  0x00007ffff6558642 in __pthread_mutex_cond_lock (mutex=mutex@entry=0x5555568bb280 <qemu_global_mutex>) at ../nptl/pthread_mutex_lock.c:80
> > > #2  0x00007ffff6559ef8 in __pthread_cond_wait_common (abstime=0x0, mutex=0x5555568bb280 <qemu_global_mutex>, cond=0x555556cecc30) at pthread_cond_wait.c:645
> > > #3  0x00007ffff6559ef8 in __pthread_cond_wait (cond=0x555556cecc30, mutex=0x5555568bb280 <qemu_global_mutex>) at pthread_cond_wait.c:655
> > > #4  0x000055555604f717 in qemu_cond_wait_impl (cond=0x555556cecc30, mutex=0x5555568bb280 <qemu_global_mutex>, file=0x5555561ca869 "../softmmu/cpus.c", line=514) at ../util/qemu-thread-posix.c:194
> > > #5  0x0000555555d28a4a in qemu_cond_wait_iothread (cond=0x555556cecc30) at ../softmmu/cpus.c:514
> > > #6  0x0000555555d28781 in qemu_wait_io_event (cpu=0x555556ce02c0) at ../softmmu/cpus.c:425
> > > #7  0x0000555555e5da75 in kvm_vcpu_thread_fn (arg=0x555556ce02c0) at ../accel/kvm/kvm-accel-ops.c:54
> > > #8  0x000055555604feed in qemu_thread_start (args=0x555556cecc70) at ../util/qemu-thread-posix.c:541
> > > #9  0x00007ffff6553fa3 in start_thread (arg=<optimized out>) at pthread_create.c:486
> > > #10 0x00007ffff64824cf in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
> > 
> > Would you please add below to your ~/.gdbinit script?
> > 
> >   handle SIGUSR1 nostop noprint
> > 
> > Or just run without gdb and wait for it to crash with SIGABRT.
> > 
> > Thanks,
> > 
> > --
> > Peter Xu
> 
> Sorry for the bad brain day, here's your stack trace:
> 
> qemu-system-x86_64: ../softmmu/cpus.c:72: qemu_mutex_unlock_iothread_prepare: Assertion `!memory_region_has_pending_transaction()' failed.
> 
> Thread 1 "qemu-system-x86" received signal SIGABRT, Aborted.
> __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
> 50      ../sysdeps/unix/sysv/linux/raise.c: No such file or directory.
> (gdb) bt
> #0  0x00007ffff63c07bb in __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
> #1  0x00007ffff63ab535 in __GI_abort () at abort.c:79
> #2  0x00007ffff63ab40f in __assert_fail_base (fmt=0x7ffff650dee0 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", assertion=0x5555561ca880 "!memory_region_has_pending_transaction()", file=0x5555561ca869 "../softmmu/cpus.c", line=72, function=<optimized out>) at assert.c:92
> #3  0x00007ffff63b9102 in __GI___assert_fail (assertion=0x5555561ca880 "!memory_region_has_pending_transaction()", file=0x5555561ca869 "../softmmu/cpus.c", line=72, function=0x5555561caa60 <__PRETTY_FUNCTION__.37393> "qemu_mutex_unlock_iothread_prepare") at assert.c:101
> #4  0x0000555555d27c20 in qemu_mutex_unlock_iothread_prepare () at ../softmmu/cpus.c:72
> #5  0x0000555555d289f6 in qemu_mutex_unlock_iothread () at ../softmmu/cpus.c:507
> #6  0x0000555555dcb3d6 in vfio_user_send_recv (proxy=0x555557ac5560, msg=0x555557933d50, fds=0x0, rsize=40) at ../hw/vfio/user.c:88
> #7  0x0000555555dcd30a in vfio_user_dma_unmap (proxy=0x555557ac5560, unmap=0x7fffffffd8d0, bitmap=0x0) at ../hw/vfio/user.c:796
> #8  0x0000555555dabf5f in vfio_dma_unmap (container=0x555557a06fb0, iova=786432, size=2146697216, iotlb=0x0) at ../hw/vfio/common.c:501
> #9  0x0000555555dae12c in vfio_listener_region_del (listener=0x555557a06fc0, section=0x7fffffffd9f0) at ../hw/vfio/common.c:1249
> #10 0x0000555555d3d06d in address_space_update_topology_pass (as=0x5555568bbc80 <address_space_memory>, old_view=0x555556d6cfb0, new_view=0x555556d6c8b0, adding=false) at ../softmmu/memory.c:960
> #11 0x0000555555d3d62c in address_space_set_flatview (as=0x5555568bbc80 <address_space_memory>) at ../softmmu/memory.c:1062
> #12 0x0000555555d3d800 in memory_region_transaction_commit () at ../softmmu/memory.c:1124
> #13 0x0000555555b75a3e in mch_update_pam (mch=0x555556e80a40) at ../hw/pci-host/q35.c:344
> #14 0x0000555555b76068 in mch_update (mch=0x555556e80a40) at ../hw/pci-host/q35.c:504
> #15 0x0000555555b761d7 in mch_reset (qdev=0x555556e80a40) at ../hw/pci-host/q35.c:561
> #16 0x0000555555e93a95 in device_transitional_reset (obj=0x555556e80a40) at ../hw/core/qdev.c:1028
> #17 0x0000555555e956f8 in resettable_phase_hold (obj=0x555556e80a40, opaque=0x0, type=RESET_TYPE_COLD) at ../hw/core/resettable.c:182
> #18 0x0000555555e8e07c in bus_reset_child_foreach (obj=0x555556ebce80, cb=0x555555e955ca <resettable_phase_hold>, opaque=0x0, type=RESET_TYPE_COLD) at ../hw/core/bus.c:97
> #19 0x0000555555e953ff in resettable_child_foreach (rc=0x555556a07ab0, obj=0x555556ebce80, cb=0x555555e955ca <resettable_phase_hold>, opaque=0x0, type=RESET_TYPE_COLD) at ../hw/core/resettable.c:96
> #20 0x0000555555e9567e in resettable_phase_hold (obj=0x555556ebce80, opaque=0x0, type=RESET_TYPE_COLD) at ../hw/core/resettable.c:173
> #21 0x0000555555e920e0 in device_reset_child_foreach (obj=0x555556e802d0, cb=0x555555e955ca <resettable_phase_hold>, opaque=0x0, type=RESET_TYPE_COLD) at ../hw/core/qdev.c:366
> #22 0x0000555555e953ff in resettable_child_foreach (rc=0x555556ad2830, obj=0x555556e802d0, cb=0x555555e955ca <resettable_phase_hold>, opaque=0x0, type=RESET_TYPE_COLD) at ../hw/core/resettable.c:96
> #23 0x0000555555e9567e in resettable_phase_hold (obj=0x555556e802d0, opaque=0x0, type=RESET_TYPE_COLD) at ../hw/core/resettable.c:173
> #24 0x0000555555e8e07c in bus_reset_child_foreach (obj=0x555556beaac0, cb=0x555555e955ca <resettable_phase_hold>, opaque=0x0, type=RESET_TYPE_COLD) at ../hw/core/bus.c:97
> #25 0x0000555555e953ff in resettable_child_foreach (rc=0x555556b1ca70, obj=0x555556beaac0, cb=0x555555e955ca <resettable_phase_hold>, opaque=0x0, type=RESET_TYPE_COLD) at ../hw/core/resettable.c:96
> #26 0x0000555555e9567e in resettable_phase_hold (obj=0x555556beaac0, opaque=0x0, type=RESET_TYPE_COLD) at ../hw/core/resettable.c:173
> #27 0x0000555555e952b4 in resettable_assert_reset (obj=0x555556beaac0, type=RESET_TYPE_COLD) at ../hw/core/resettable.c:60
> #28 0x0000555555e951f8 in resettable_reset (obj=0x555556beaac0, type=RESET_TYPE_COLD) at ../hw/core/resettable.c:45
> #29 0x0000555555e95a37 in resettable_cold_reset_fn (opaque=0x555556beaac0) at ../hw/core/resettable.c:269
> #30 0x0000555555e93f40 in qemu_devices_reset () at ../hw/core/reset.c:69
> #31 0x0000555555c9eb04 in pc_machine_reset (machine=0x555556a4d9e0) at ../hw/i386/pc.c:1654
> #32 0x0000555555d381fb in qemu_system_reset (reason=SHUTDOWN_CAUSE_NONE) at ../softmmu/runstate.c:443
> #33 0x0000555555a787f2 in qdev_machine_creation_done () at ../hw/core/machine.c:1330
> #34 0x0000555555d4e09c in qemu_machine_creation_done () at ../softmmu/vl.c:2650
> #35 0x0000555555d4e16b in qmp_x_exit_preconfig (errp=0x5555568db1a0 <error_fatal>) at ../softmmu/vl.c:2673
> #36 0x0000555555d506be in qemu_init (argc=31, argv=0x7fffffffe268, envp=0x7fffffffe368) at ../softmmu/vl.c:3692
> #37 0x0000555555945cad in main (argc=31, argv=0x7fffffffe268, envp=0x7fffffffe368) at ../softmmu/main.c:49
> 
> This is where the vfio-user client in QEMU tells the vfio-user server (the 
> GPIO device) that this particular memory region is not available for DMA. 
> There are three vfio_dma_map() operations before this happens, and this 
> seems to be the very first vfio_dma_unmap() operation.
> 

Yes my branch is just for catching things like the stack above.

Here vfio_user_send_recv() looks tricky to me - it releases the BQL within a
memory update procedure, and IMHO it needs some serious justification for why
it can do so. For example, what if the memory layout changes while we're
waiting for the reply?  That can happen in parallel without the BQL held,
AFAICT.

Thanks,

-- 
Peter Xu



