Re: bdrv_drained_begin deadlock with io-threads
From: Kevin Wolf
Subject: Re: bdrv_drained_begin deadlock with io-threads
Date: Wed, 1 Apr 2020 20:12:56 +0200
User-agent: Mutt/1.12.1 (2019-06-15)

On 01.04.2020 at 17:37, Dietmar Maurer wrote:
> > > Is really nobody else able to reproduce this (somebody already tried to
> > > reproduce)?
> >
> > I can get hangs, but that's for job_completed(), not for starting the
> > job. Also, my hangs have a non-empty bs->tracked_requests, so it looks
> > like a different case to me.
>
> Please can you post the command line args of your VM? I use something like
>
> ./x86_64-softmmu/qemu-system-x86_64 -chardev
> 'socket,id=qmp,path=/var/run/qemu-server/101.qmp,server,nowait' -mon
> 'chardev=qmp,mode=control' -pidfile /var/run/qemu-server/101.pid -m
> 1024 -object 'iothread,id=iothread-virtioscsi0' -device
> 'virtio-scsi-pci,id=virtioscsi0,iothread=iothread-virtioscsi0' -drive
> 'file=/backup/disk3/debian-buster.raw,if=none,id=drive-scsi0,format=raw,cache=none,aio=native,detect-zeroes=on'
> -device
> 'scsi-hd,bus=virtioscsi0.0,channel=0,scsi-id=0,lun=0,drive=drive-scsi0,id=scsi0'
> -machine "type=pc,accel=kvm"
>
> Do you also run "stress-ng -d 5" inside the VM?

I'm not using the exact same test case, but something that I thought
would be similar enough. Specifically, I run the script below, which
boots from a RHEL 8 CD; in the rescue shell, I run 'dd if=/dev/zero
of=/dev/sda' while the script keeps starting and cancelling backup jobs
in the background.

Anyway, I finally managed to bisect my problem now (did it wrong the
first time) and got this result:

00e30f05de1d19586345ec373970ef4c192c6270 is the first bad commit
commit 00e30f05de1d19586345ec373970ef4c192c6270
Author: Vladimir Sementsov-Ogievskiy <address@hidden>
Date:   Tue Oct 1 16:14:09 2019 +0300

    block/backup: use backup-top instead of write notifiers

    Drop write notifiers and use filter node instead.

    = Changes =

    1. Add filter-node-name argument for backup qmp api. We have to do it
       in this commit, as 257 needs to be fixed.

    2. There are no more write notifiers here, so is_write_notifier
       parameter is dropped from block-copy paths.

    3. To sync with in-flight requests at job finish we now have drained
       removing of the filter, we don't need rw-lock.

    4. Block-copy is now using BdrvChildren instead of BlockBackends

    5. As backup-top owns these children, we also move block-copy state
       into backup-top's ownership.

    [...]

That's a pretty big change, and I'm not sure how it's related to
completed requests hanging in the thread pool instead of reentering the
file-posix coroutine. But I also tested it enough that I'm confident
it's really the first bad commit.

Maybe you want to check whether your problem starts at the same commit?
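
If it helps to automate that check, here is a rough sketch of a wrapper that maps a reproducer's outcome onto the exit codes `git bisect run` understands (0 = good, 125 = skip, other codes from 1 to 127 = bad). It is an illustration only and not what was used for the bisect above: the name `check_commit` is made up, and it assumes a reproducer that exits 0 on its own when no hang occurs, which the script further down does not do because its QMP loop never ends.

```shell
#!/bin/bash
# Hypothetical helper, not part of QEMU or of the original test setup.
# Usage: check_commit <timeout-seconds> <command> [args...]
check_commit() {
    local limit=$1
    shift
    local status=0
    # timeout(1) exits with 124 when it had to kill the command.
    timeout "$limit" "$@" || status=$?
    case $status in
        0)   return 0 ;;    # reproducer finished in time: mark commit good
        124) return 1 ;;    # reproducer was killed after the limit: treat as a hang, mark bad
        *)   return 125 ;;  # any other failure (e.g. broken build): skip this commit
    esac
}
```

With something like this sourced from a small script, `git bisect run` could call it once per commit, or you can simply run it by hand on 00e30f05de1d19586345ec373970ef4c192c6270 and on its parent. Treating every timeout as a hang is a simplification, so the limit has to be generous enough for a healthy run.
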
Kevin

#!/bin/bash

# Emit an endless QMP command stream: start a full backup, then cancel it.
qmp() {
    cat <<EOF
{'execute':'qmp_capabilities'}
EOF

    while true; do
        cat <<EOF
{ "execute": "drive-backup", "arguments": {
"job-id":"drive_image1","device": "drive_image1", "sync": "full", "target":
"/tmp/backup.raw" } }
EOF
        sleep 1
        cat <<EOF
{ "execute": "block-job-cancel", "arguments": { "device": "drive_image1"} }
EOF
        sleep 2
    done
}

# Create a 4G qcow2 image and write data to its first 2G.
./qemu-img create -f qcow2 /tmp/test.qcow2 4G
for i in $(seq 0 1); do echo "write ${i}G 1G"; done | ./qemu-io /tmp/test.qcow2

# Boot from the CD with the image behind a virtio-scsi controller that uses
# an iothread; the QMP commands above are fed in on stdin.
qmp | x86_64-softmmu/qemu-system-x86_64 \
    -enable-kvm \
    -machine pc \
    -m 1G \
    -object 'iothread,id=iothread-virtioscsi0' \
    -device 'virtio-scsi-pci,id=virtioscsi0,iothread=iothread-virtioscsi0' \
    -blockdev node-name=my_drive,driver=file,filename=/tmp/test.qcow2 \
    -blockdev driver=qcow2,node-name=drive_image1,file=my_drive \
    -device scsi-hd,drive=drive_image1,id=image1 \
    -cdrom ~/images/iso/RHEL-8.0-20190116.1-x86_64-dvd1.iso \
    -boot d \
    -qmp stdio -monitor vc