[Top][All Lists]
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [Qemu-devel] [PATCH v2] Do not delete BlockDriverState when deleting
From: |
Markus Armbruster |
Subject: |
Re: [Qemu-devel] [PATCH v2] Do not delete BlockDriverState when deleting the drive |
Date: |
Thu, 24 Mar 2011 13:17:46 +0100 |
User-agent: |
Gnus/5.13 (Gnus v5.13) Emacs/23.1 (gnu/linux) |
Whoops, almost missed this. Best to cc: me to avoid that.
Ryan Harper <address@hidden> writes:
> * Markus Armbruster <address@hidden> [2011-03-15 04:48]:
>> Sorry for the long delay, I was out of action for a week.
>>
>> Ryan Harper <address@hidden> writes:
>>
>> > When removing a drive from the host-side via drive_del we currently have
>> > the
>> > following path:
>> >
>> > drive_del
>> > qemu_aio_flush()
>> > bdrv_close()
>> > drive_uninit()
>> > bdrv_delete()
>> >
>> > When we bdrv_delete() we end up qemu_free() the BlockDriverState pointer
>> > however, the block devices retain a copy of this pointer, see
>> > hw/virtio-blk.c:virtio_blk_init() where we s->bs = conf->bs.
>> >
>> > We now have a use-after-free situation. If the guest continues to issue IO
>> > against the device, and we've reallocated the memory that the
>> > BlockDriverState
>> > pointed at, then we will fail the bs->drv checks in the various bdrv_
>> > methods.
>>
>> "we will fail the bs->drv checks" is misleading, in my opinion. Here's
>> what happens:
>>
>> 1. bdrv_close(bs) zaps bs->drv, which makes any subsequent I/O get
>> dropped. Works as designed.
>>
>> 2. drive_uninit() frees the bs. Since the device is still connected to
>> bs, any subsequent I/O is a use-after-free.
>>
>> The value of bs->drv becomes unpredictable on free. As long as it
>> remains null, I/O still gets dropped. I/O crashes or worse once that
>> changed. Could be right on free, could be much later.
>>
>> If you respin anyway, please clarify your description.
>
> Sure. I wasn't planning a new version, but I'll update and send anyhow
> as I didn't see it get included in pull from the block branch.
>>
>> > To resolve this issue as simply as possible, we can chose to not actually
>> > delete the BlockDriverState pointer. Since bdrv_close() handles setting
>> > the drv
>> > pointer to NULL, we just need to remove the BlockDriverState from the QLIST
>> > that is used to enumerate the block devices. This is currently handled
>> > within
>> > bdrv_delete, so move this into it's own function, bdrv_remove().
>>
>> Why do we remove the BlockDriverState from bdrv_states? Because we want
>> drive_del make its *name* go away.
>>
>> Begs the question: is the code prepared for a BlockDriverState object
>> that isn't on bdrv_states? Turns out we're in luck: bdrv_new() already
>> creates such objects when the device_name is empty. This is used for
>> internal BlockDriverStates such as COW backing files. Your code makes
>> device_name empty when taking the object off bdrv_states, so we're good.
>>
>> Begs yet another question: how does the behavior of a BlockDriverState
>> change when it's taken off bdrv_states, and is that the behavior we
>> want? Changes:
>>
>> * bdrv_delete() no longer takes it off bdrv_states. Good.
>>
>> * bdrv_close_all(), bdrv_commit_all() and bdrv_flush_all() no longer
>> cover it. Okay, because bdrv_close(), bdrv_commit() and bdrv_flush()
>> do nothing anyway for closed BlockDriverStates.
>>
>> * "info block" and "info blockstats" no longer show it, because
>> bdrv_info() and bdrv_info_stats() no longer see it. Okay.
>>
>> * bdrv_find(), bdrv_next(), bdrv_iterate() no longer see it. Impact?
>> Please check their uses and report.
>
> 1 664 block-migration.c <<block_load>>
> bs = bdrv_find(device_name);
>
> - no longer see it. This is fine since we can't migrate a block
> device that has been removed
>
> 2 562 blockdev.c <<do_commit>>
> bs = bdrv_find(device);
>
> - do_commit won't see it in either when calling bdrv_commit_all()
> Fine as you mention above. If user specifies the device name
> we won't find it, that's OK because we can't commit data against
> a closed BlockDriverState.
>
> 3 587 blockdev.c <<do_snapshot_blkdev>>
> bs = bdrv_find(device);
>
> - OK, cannot take a snapshot against a deleted BlockDriverState
>
> 4 662 blockdev.c <<do_eject>>
> bs = bdrv_find(filename);
>
> - OK, cannot eject a deleted BlockDriverState;
>
> 5 676 blockdev.c <<do_block_set_passwd>>
> bs = bdrv_find(qdict_get_str(qdict, "device"));
>
> - OK, cannot set password a deleted BlockDriverState;
>
> 6 701 blockdev.c <<do_change_block>>
> bs = bdrv_find(device);
>
> - OK, cannot change the file/device of a deleted BlockDriverState;
>
> 7 732 blockdev.c <<do_drive_del>>
> bs = bdrv_find(id);
>
> - OK, cannot delete an already deleted Drive
>
> 8 783 blockdev.c <<do_block_resize>>
> bs = bdrv_find(device);
>
> - OK, cannot resize a deleted Drive
>
> 9 312 hw/qdev-properties.c <<parse_drive>>
> bs = bdrv_find(str);
>
> - Used when invoking qdev_prop_drive .parse method; parse method is
> invoked via
> qdev_device_add() which calls set_property() which invokes parse.
> AFAICT, this is OK
> since we won't be going down the device add path worrying about a
> deleted block device.
Thanks for checking!
>> > The result is that we can now invoke drive_del, this closes the file
>> > descriptors
>> > and sets BlockDriverState->drv to NULL which prevents futher IO to the
>> > device,
>> > and since we do not free BlockDriverState, we don't have to worry about
>> > the copy
>> > retained in the block devices.
>>
>> Yep. But there's one more question: is the BlockDriverState freed when
>> the device using it gets destroyed?
>>
>> qdev_free() runs prop->info->free() for all properties. The drive
>> property's free() is free_drive():
>>
>> static void free_drive(DeviceState *dev, Property *prop)
>> {
>> BlockDriverState **ptr = qdev_get_prop_ptr(dev, prop);
>>
>> if (*ptr) {
>> bdrv_detach(*ptr, dev);
>> blockdev_auto_del(*ptr);
>> }
>> }
>>
>> This should indeed delete the drive. But only if the property still
>> points to it. See below.
>>
>> > Reported-by: Marcus Armbruster <address@hidden>
>> > Signed-off-by: Ryan Harper <address@hidden>
>> > ---
>> > v1->v2
>> > - NULL bs->device_name after removing from list to prevent
>> > second removal.
>> >
>> > block.c | 12 +++++++++---
>> > block.h | 1 +
>> > blockdev.c | 2 +-
>> > 3 files changed, 11 insertions(+), 4 deletions(-)
>> >
>> > diff --git a/block.c b/block.c
>> > index 1544d81..0df9942 100644
>> > --- a/block.c
>> > +++ b/block.c
>> > @@ -697,14 +697,20 @@ void bdrv_close_all(void)
>> > }
>> > }
>> >
>> > +void bdrv_remove(BlockDriverState *bs)
>> > +{
>> > + if (bs->device_name[0] != '\0') {
>> > + QTAILQ_REMOVE(&bdrv_states, bs, list);
>> > + }
>> > + bs->device_name[0] = '\0';
>> > +}
>> > +
>>
>> I don't like this name. What's the difference between "delete" and
>> "remove"?
>>
>> The function zaps the device name. bdrv_make_anon()?
>
> bdrv_delist? bdrv_hide? I'm also fine with make_anon.
Any of the three works for me. "delist" seems the least obvious, but
that's a matter of taste.
>> > void bdrv_delete(BlockDriverState *bs)
>> > {
>> > assert(!bs->peer);
>> >
>> > /* remove from list, if necessary */
>> > - if (bs->device_name[0] != '\0') {
>> > - QTAILQ_REMOVE(&bdrv_states, bs, list);
>> > - }
>> > + bdrv_remove(bs);
>> >
>> > bdrv_close(bs);
>> > if (bs->file != NULL) {
>> > diff --git a/block.h b/block.h
>> > index 5d78fc0..8447397 100644
>> > --- a/block.h
>> > +++ b/block.h
>> > @@ -66,6 +66,7 @@ int bdrv_create(BlockDriver *drv, const char* filename,
>> > QEMUOptionParameter *options);
>> > int bdrv_create_file(const char* filename, QEMUOptionParameter *options);
>> > BlockDriverState *bdrv_new(const char *device_name);
>> > +void bdrv_remove(BlockDriverState *bs);
>> > void bdrv_delete(BlockDriverState *bs);
>> > int bdrv_file_open(BlockDriverState **pbs, const char *filename, int
>> > flags);
>> > int bdrv_open(BlockDriverState *bs, const char *filename, int flags,
>> > diff --git a/blockdev.c b/blockdev.c
>> > index 0690cc8..1f57b50 100644
>> > --- a/blockdev.c
>> > +++ b/blockdev.c
>> > @@ -760,7 +760,7 @@ int do_drive_del(Monitor *mon, const QDict *qdict,
>> > QObject **ret_data)
>>
>> Let me add a bit more context:
>>
>> bdrv_flush(bs);
>> bdrv_close(bs);
>>
>> /* clean up guest state from pointing to host resource by
>> * finding and removing DeviceState "drive" property */
>> if (bs->peer) {
>> for (prop = bs->peer->info->props; prop && prop->name; prop++) {
>> if (prop->info->type == PROP_TYPE_DRIVE) {
>> ptr = qdev_get_prop_ptr(bs->peer, prop);
>> if (*ptr == bs) {
>> bdrv_detach(bs, bs->peer);
>> *ptr = NULL;
>> break;
>> }
>> }
>> }
>> > }
>>
>> This zaps the drive property. A subsequent free_drive() will do
>> nothing. We leak the BlockDriverState on device unplug, I'm afraid.
>>
>> Any reason why we still want to zap the drive property?
>
> IIRC, it was to prevent qdev from keeping a ptr around to the bs; but if
> we're keeping the bs around anyhow, then I don't think we need to remove
> the property.
Agree.
> One last check would be to see what if the device still
> shows up in qtree if we don't remove the property.
I expect it to show up as "drive = " instead of "drive = <null>",
because:
static int print_drive(DeviceState *dev, Property *prop, char *dest, size_t
len)
{
BlockDriverState **ptr = qdev_get_prop_ptr(dev, prop);
return snprintf(dest, len, "%s",
*ptr ? bdrv_get_device_name(*ptr) : "<null>");
}
and
const char *bdrv_get_device_name(BlockDriverState *bs)
{
return bs->device_name;
}
If the empty right hand side bothers you, you can change print_drive()
to print something else then, say "<anon>", or "<gone>".
Looking forward to v3 :)