[PATCH v5 0/9] Fix scsi devices plug/unplug races w.r.t virtio-scsi iothread
Sun, 13 Sep 2020 19:02:50 +0300
This patch series is the result of my discussion with Paulo on
how to correctly fix the root cause of BZ #1812399.
The root cause of this bug is that the IO thread runs mostly unlocked
with respect to the main thread, on which device hotplug is done.
qdev_device_add first creates the device object, then places it on the bus,
and only then realizes it.
However, some drivers (currently only virtio-scsi) enumerate their child bus
devices on each request received from the guest, and that can happen on
the IO thread.
Thus we have a window in which a new device is on the bus but not yet realized,
and can be accessed by the virtio-scsi driver in that state.
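To make the window concrete, here is a minimal single-threaded sketch. It is not QEMU code: ToyDevice, ToyBus and the toy_* helpers are hypothetical stand-ins for the device, the bus and the lookup done on the request path. It only replays the ordering described above (attach to the bus, look up, realize) to show that a lookup without a realized check hands out a half-initialized device.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for a device and its parent bus (not QEMU types). */
typedef struct ToyDevice {
    int lun;
    bool realized;              /* set only by the last hotplug step */
    struct ToyDevice *next;
} ToyDevice;

typedef struct ToyBus {
    ToyDevice *children;        /* singly linked list of child devices */
} ToyBus;

/* Hotplug step 2: place the (already created) device on the bus. */
static void toy_bus_attach(ToyBus *bus, ToyDevice *dev)
{
    dev->next = bus->children;
    bus->children = dev;
}

/* Hotplug step 3: realize the device. */
static void toy_realize(ToyDevice *dev)
{
    dev->realized = true;
}

/* Request path: walk the child list and match on LUN, with no realized check. */
static ToyDevice *toy_find_naive(ToyBus *bus, int lun)
{
    for (ToyDevice *d = bus->children; d != NULL; d = d->next) {
        if (d->lun == lun) {
            return d;
        }
    }
    return NULL;
}

/*
 * Replays the problematic interleaving in one thread: the lookup runs
 * after the device is attached but before it is realized.
 * Returns true if the lookup exposed an unrealized device.
 */
static bool toy_window_is_observable(void)
{
    ToyBus bus = { .children = NULL };
    ToyDevice dev = { .lun = 0, .realized = false, .next = NULL };

    toy_bus_attach(&bus, &dev);                 /* step 2 */
    ToyDevice *seen = toy_find_naive(&bus, 0);  /* request arrives here */
    bool exposed = (seen != NULL) && !seen->realized;
    toy_realize(&dev);                          /* step 3, too late */

    return exposed;
}
```

In the real code the lookup runs on a different thread, so the same interleaving happens without any cooperation from the hotplug path.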
Fix that by doing two things:
1. Add partial RCU protection to the list of a bus's child devices.
This allows the scsi IO thread to safely enumerate the child devices
while it races with the hotplug placing the device on the bus.
2. Make scsi_device_find skip devices that are on the bus but not yet realized.
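The second point can be sketched roughly as follows. This is an illustration under assumptions, not the code from the patches: SketchDevice, sketch_set_realized and sketch_find are made-up names, and C11 atomics stand in for QEMU's own atomic helpers. The RCU protection of the child list from point 1 is deliberately left out, so the sketch covers only the visibility of the realized flag, not safe traversal while the list is being modified.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical device node; the realized flag is read from another thread. */
typedef struct SketchDevice {
    int lun;
    atomic_bool realized;
    struct SketchDevice *next;
} SketchDevice;

/*
 * Hotplug side: publish the realized state. The release store orders all
 * earlier initialization of the device before the flag becomes visible.
 */
static void sketch_set_realized(SketchDevice *dev, bool value)
{
    atomic_store_explicit(&dev->realized, value, memory_order_release);
}

/*
 * Request side: return the device with the given LUN only if it is
 * realized. The acquire load pairs with the release store above, so a
 * caller that gets a device back also sees its completed initialization.
 */
static SketchDevice *sketch_find(SketchDevice *head, int lun)
{
    for (SketchDevice *d = head; d != NULL; d = d->next) {
        if (d->lun == lun &&
            atomic_load_explicit(&d->realized, memory_order_acquire)) {
            return d;
        }
    }
    return NULL;
}
```

With this shape, a device that is linked on the list but not yet realized is simply invisible to the request path, which behaves as if the LUN did not exist until realize completes.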
Note that in the particular bug report the issue wasn't a race. Rather, due to
a combination of things, the .realize code managed to trigger IO on the
virtqueue midway through, which caused the virtio-scsi driver to access the
half-realized device. However, since the same can happen with a real IO thread,
this patch series fixes that case as well.
Changes from V4:
* Addressed review feedback
Changes from V3:
* Rebased to latest qemu
* Added a new patch to fix related race in scsi_target_emulate_report_luns
* Moved the non-realized device check to scsi core, since there is no
way a device driver will want to see non realized devices on a scsi bus.
(scsi-bus still needs this and can use an internal function)
* Split the patch that added drain_call_rcu and used it into a patch that only
adds it and a patch that uses it (no other changes, so I kept the Reviewed-by)
* Some tweaks to commits
This series was tested by adding a virtio-scsi drive with iothread,
then running fio stress job in the guest in a loop, and then adding/removing
the scsi drive on the host in the loop.
This test usually failed on the first iteration without this patch series,
and now it seems to work smoothly.
Maxim Levitsky (9):
  scsi/scsi_bus: switch search direction in scsi_device_find
  rcu: Implement drain_call_rcu
  device_core: use drain_call_rcu in in hmp_device_del/qmp_device_add
  device-core: use RCU for list of childs of a bus
  device-core: use atomic_set on .realized property
  scsi/scsi-bus: scsi_device_find: don't return unrealized devices
  scsi/scsi_bus: Add scsi_device_get
  virtio-scsi: use scsi_device_get
  scsi/scsi_bus: fix races in REPORT LUNS
 hw/core/bus.c          |  28 +++++---
 hw/core/qdev.c         |  56 +++++++++++----
 hw/scsi/scsi-bus.c     | 153 +++++++++++++++++++++++++++--------------
 hw/scsi/virtio-scsi.c  |  27 +++++---
 include/hw/qdev-core.h |  11 +++
 include/hw/scsi/scsi.h |   1 +
 include/qemu/rcu.h     |   1 +
 qdev-monitor.c         |  22 ++++++
 util/rcu.c             |  55 +++++++++++++++
 9 files changed, 267 insertions(+), 87 deletions(-)