I agree that passing in QEMUIOVector to blk_aio_ioctl() as a holder of the void* buffer used in blk_aio_ioctl_entry() is unnecessary. But, as Kevin noted, read and write were using the QEMUIOVector in BlkRwCo.
To avoid changes to the callers of blk_aio_ioctl(), I’ll change blk_aio_prwv() to take a void pointer instead of QEMUIOVector* and use a union to hold the buffer in BlkRwCo.
On 22/11/2017 19:06, Kevin Wolf wrote:Am 22.11.2017 um 17:34 hat Paolo Bonzini geschrieben:
On 22/11/2017 16:33, Deepa Srinivasan wrote:
Starting qemu with the following arguments causes qemu to segfault: ... -device lsi,id=lsi0 -drive file=iscsi:<...>,format=raw,if=none,node-name= iscsi1 -device scsi-block,bus=lsi0.0,id=<...>,drive=iscsi1
This patch fixes blk_aio_ioctl() so it does not pass stack addresses to blk_aio_ioctl_entry() which may be invoked after blk_aio_ioctl() returns. More details about the bug follow.
blk_aio_ioctl() invokes blk_aio_prwv() with blk_aio_ioctl_entry as the coroutine parameter. blk_aio_prwv() ultimately calls aio_co_enter().
When blk_aio_ioctl() is executed from within a coroutine context (e.g. iscsi_bh_cb()), aio_co_enter() adds the coroutine (blk_aio_ioctl_entry) to the current coroutine's wakeup queue. blk_aio_ioctl() then returns.
When blk_aio_ioctl_entry() executes later, it accesses an invalid pointer: .... BlkRwCo *rwco = &acb->rwco;
rwco->ret = blk_co_ioctl(rwco->blk, rwco->offset, rwco->qiov->iov[0].iov_base); <--- qiov is invalid here ...
In the case when blk_aio_ioctl() is called from a non-coroutine context, blk_aio_ioctl_entry() executes immediately. But if bdrv_co_ioctl() calls qemu_coroutine_yield(), blk_aio_ioctl() will return. When the coroutine execution is complete, control returns to blk_aio_ioctl_entry() after the call to blk_co_ioctl(). There is no invalid reference after this point, but the function is still holding on to invalid pointers.
The fix is to allocate memory for the QEMUIOVector and struct iovec as part of the request struct which the IO buffer is part of. The memory for this struct is guaranteed to be valid till the AIO is completed.
Signed-off-by: Deepa Srinivasan <address@hidden> Signed-off-by: Konrad Rzeszutek Wilk <address@hidden> Reviewed-by: Mark Kanda <address@hidden> --- block/block-backend.c | 13 ++----------- hw/block/virtio-blk.c | 9 ++++++++- hw/scsi/scsi-disk.c | 10 +++++++++- hw/scsi/scsi-generic.c | 9 ++++++++- include/sysemu/block-backend.h | 2 +- 5 files changed, 28 insertions(+), 15 deletions(-)
diff --git a/block/block-backend.c b/block/block-backend.c index baef8e7..c275827 100644 --- a/block/block-backend.c +++ b/block/block-backend.c @@ -1472,19 +1472,10 @@ static void blk_aio_ioctl_entry(void *opaque) blk_aio_complete(acb); }
-BlockAIOCB *blk_aio_ioctl(BlockBackend *blk, unsigned long int req, void *buf, +BlockAIOCB *blk_aio_ioctl(BlockBackend *blk, unsigned long int req, QEMUIOVector *qiov, BlockCompletionFunc *cb, void *opaque)
I think this is not the best way to fix the bug, because it adds extra unnecessary code in the callers.
Perhaps you can change BlkRwCo's "qiov" field to "void *buf" and the same for blk_aio_prwv's "qiov" argument?
Then the QEMUIOVector is not needed at all, and blk_co_ioctl can just use rwco->buf.
But the same struct is used for read and write requests that do use an actual QEMUIOVector and not just a linear buffer.
Then let's call it "void *opaque", or make it a union (but I thinkthat's overkill).The QEMUIOVector pointer is opaque as far as blk_aio_prwv is concerned,and it is only created by blk_aio_ioctl for blk_aio_ioctl_entry toextract buf: rwco->ret = blk_co_ioctl(rwco->blk, rwco->offset, rwco->qiov->iov[0].iov_base);Exposing the fake QEMUIOVector to the callers of blk_aio_ioctl is muchuglier than using a void* for what is effectively a multi-type pointer.Paolo
|