[Top][All Lists]
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [Qemu-devel] [PATCH 1/2] Fix Block Hotplug race with drive_del()
From: |
Markus Armbruster |
Subject: |
Re: [Qemu-devel] [PATCH 1/2] Fix Block Hotplug race with drive_del() |
Date: |
Wed, 10 Nov 2010 13:48:03 +0100 |
User-agent: |
Gnus/5.13 (Gnus v5.13) Emacs/23.1 (gnu/linux) |
One real question, and a couple of nits.
Ryan Harper <address@hidden> writes:
> Block hot unplug is racy since the guest is required to acknowlege the ACPI
> unplug event; this may not happen synchronously with the device removal
> command
Well, I wouldn't call unplug "racy". It just takes an unpredictable
length of time, possibly forever. To make a race, you need to throw in
a client assuming (incorrectly) that unplug is instantaneous, as
described in your next paragraph.
Moreover, all PCI unplug is that way, not just block.
> This series aims to close a gap where by mgmt applications that assume the
> block resource has been removed without confirming that the guest has
> acknowledged the removal may re-assign the underlying device to a second guest
> leading to data leakage.
Yes, the incorrect assumption is a problem. But with that fixed (in the
management application), we run right into the next problem: there is no
way for the management application to reliably disconnect the guest from
a block device. And that's the problem you're fixing.
> This series introduces a new montor command to decouple asynchornous device
Typos "montor" and "asynchornous". You might want to use a spell
checker :)
Lines are a bit long. Recommend wrap at column 70.
> removal from restricting guest access to a block device. We do this by
> creating
> a new monitor command drive_del which maps to a bdrv_unplug() command which
> does a qemu_aio_flush; bdrv_flush() and bdrv_close(). Once complete,
> subsequent
> IO is rejected from the device and the guest will get IO errors but continue
> to
> function. In addition to preventing further IO, we clean up state pointers
> between host (BlockDriverState) and guest (DeviceInfo).
>
> A subsequent device removal command can be issued to remove the device, to
> which
> the guest may or maynot respond, but as long as the unplugged bit is set, no
> IO
"maynot" is not a word.
> will be sumbitted.
This suggests to drive_del before device_del, which makes the device
goes through a "broken device" state on its way to unplug. If the guest
accesses the device in that state, it gets I/O errors. Not nice.
Instead, I'd recommend device_del, wait for the device to go away,
drive_del on time out. If the guest reacts to the ACPI unplug promptly,
it's never exposed to the "broken device" state. Note: if the drive_del
fails because the device doesn't exist, we lost the race with the
automatic destruction, which is harmless. Ignore that error.
> Signed-off-by: Ryan Harper <address@hidden>
> ---
> block.c | 7 +++++++
> block.h | 1 +
> blockdev.c | 36 ++++++++++++++++++++++++++++++++++++
> blockdev.h | 1 +
> hmp-commands.hx | 18 ++++++++++++++++++
> 5 files changed, 63 insertions(+), 0 deletions(-)
>
> diff --git a/block.c b/block.c
> index 6b505fb..c76a796 100644
> --- a/block.c
> +++ b/block.c
> @@ -1328,6 +1328,13 @@ void bdrv_set_removable(BlockDriverState *bs, int
> removable)
> }
> }
>
> +void bdrv_unplug(BlockDriverState *bs)
> +{
> + qemu_aio_flush();
> + bdrv_flush(bs);
> + bdrv_close(bs);
> +}
> +
Unless we expect more users, I'd inline this into its only caller.
Matter of taste.
> int bdrv_is_removable(BlockDriverState *bs)
> {
> return bs->removable;
> diff --git a/block.h b/block.h
> index 78ecfac..581414c 100644
> --- a/block.h
> +++ b/block.h
> @@ -171,6 +171,7 @@ void bdrv_set_on_error(BlockDriverState *bs,
> BlockErrorAction on_read_error,
> BlockErrorAction on_write_error);
> BlockErrorAction bdrv_get_on_error(BlockDriverState *bs, int is_read);
> void bdrv_set_removable(BlockDriverState *bs, int removable);
> +void bdrv_unplug(BlockDriverState *bs);
> int bdrv_is_removable(BlockDriverState *bs);
> int bdrv_is_read_only(BlockDriverState *bs);
> int bdrv_is_sg(BlockDriverState *bs);
> diff --git a/blockdev.c b/blockdev.c
> index 6cb179a..ee8c2ec 100644
> --- a/blockdev.c
> +++ b/blockdev.c
> @@ -14,6 +14,8 @@
> #include "qemu-option.h"
> #include "qemu-config.h"
> #include "sysemu.h"
> +#include "hw/qdev.h"
> +#include "block_int.h"
>
> static QTAILQ_HEAD(drivelist, DriveInfo) drives =
> QTAILQ_HEAD_INITIALIZER(drives);
>
> @@ -597,3 +599,37 @@ int do_change_block(Monitor *mon, const char *device,
> }
> return monitor_read_bdrv_key_start(mon, bs, NULL, NULL);
> }
> +
> +int do_drive_del(Monitor *mon, const QDict *qdict, QObject **ret_data)
> +{
> + const char *id = qdict_get_str(qdict, "id");
> + BlockDriverState *bs;
> + Property *prop;
> +
> + bs = bdrv_find(id);
> + if (!bs) {
> + qerror_report(QERR_DEVICE_NOT_FOUND, id);
> + return -1;
> + }
> +
> + /* quiesce block driver; prevent further io */
> + bdrv_unplug(bs);
> +
> + /* clean up guest state from pointing to host resource by
> + * finding and removing DeviceState "drive" property */
> + for (prop = bs->peer->info->props; prop && prop->name; prop++) {
> + if ((prop->info->type == PROP_TYPE_DRIVE) &&
> + (*(BlockDriverState **)qdev_get_prop_ptr(bs->peer, prop) == bs))
> {
> + if (prop->info->free) {
> + prop->info->free(bs->peer, prop);
> + }
Does this null the drive property? I doubt it. Quick check in the
debugger?
The free callbacks generally don't zap the properties, because they run
from qdev_free().
> + }
> + }
> +
> + /* clean up host state pointing to guest resource by removing
> + * pointers to guest device in the BlockDriverState */
> + bdrv_delete(bs);
> +
> + return 0;
> +}
> +
> diff --git a/blockdev.h b/blockdev.h
> index 653affc..2a0559e 100644
> --- a/blockdev.h
> +++ b/blockdev.h
> @@ -51,5 +51,6 @@ int do_eject(Monitor *mon, const QDict *qdict, QObject
> **ret_data);
> int do_block_set_passwd(Monitor *mon, const QDict *qdict, QObject
> **ret_data);
> int do_change_block(Monitor *mon, const char *device,
> const char *filename, const char *fmt);
> +int do_drive_del(Monitor *mon, const QDict *qdict, QObject **ret_data);
>
> #endif
> diff --git a/hmp-commands.hx b/hmp-commands.hx
> index e5585ba..d6dc18c 100644
> --- a/hmp-commands.hx
> +++ b/hmp-commands.hx
> @@ -68,6 +68,24 @@ Eject a removable medium (use -f to force it).
> ETEXI
>
> {
> + .name = "drive_del",
> + .args_type = "id:s",
> + .params = "device",
> + .help = "remove host block device",
> + .user_print = monitor_user_noop,
> + .mhandler.cmd_new = do_drive_del,
> + },
> +
> +STEXI
> address@hidden delete @var{device}
> address@hidden delete
> +Remove host block device. The result is that guest generated IO is no longer
> +submitted against the host device underlying the disk. Once a drive has
> +been deleted, the QEMU Block layer returns -EIO which results in IO
> +errors in the guest for applications that are reading/writing to the device.
> +ETEXI
> +
> + {
> .name = "change",
> .args_type = "device:B,target:F,arg:s?",
> .params = "device filename [format]",
- [Qemu-devel] [PATCH 0/2] v6 Decouple block device removal from device removal, Ryan Harper, 2010/11/08
- [Qemu-devel] [PATCH 2/2] Add qmp version of drive_del, Ryan Harper, 2010/11/08
- [Qemu-devel] [PATCH 1/2] Fix Block Hotplug race with drive_del(), Ryan Harper, 2010/11/08
- Re: [Qemu-devel] [PATCH 1/2] Fix Block Hotplug race with drive_del(),
Markus Armbruster <=
- Re: [Qemu-devel] [PATCH 1/2] Fix Block Hotplug race with drive_del(), Ryan Harper, 2010/11/10
- Re: [Qemu-devel] [PATCH 1/2] Fix Block Hotplug race with drive_del(), Markus Armbruster, 2010/11/10
- Re: [Qemu-devel] [PATCH 1/2] Fix Block Hotplug race with drive_del(), Ryan Harper, 2010/11/10
- Re: [Qemu-devel] [PATCH 1/2] Fix Block Hotplug race with drive_del(), Ryan Harper, 2010/11/10
- Re: [Qemu-devel] [PATCH 1/2] Fix Block Hotplug race with drive_del(), Markus Armbruster, 2010/11/11
- Re: [Qemu-devel] [PATCH 1/2] Fix Block Hotplug race with drive_del(), Ryan Harper, 2010/11/11
- Re: [Qemu-devel] [PATCH 1/2] Fix Block Hotplug race with drive_del(), Markus Armbruster, 2010/11/11
[Qemu-devel] Re: [PATCH 0/2] v6 Decouple block device removal from device removal, Michael S. Tsirkin, 2010/11/09