[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [PATCH v2 4/5] vhost: check vring address before calling unmap
From: |
Dima Stepanov |
Subject: |
Re: [PATCH v2 4/5] vhost: check vring address before calling unmap |
Date: |
Tue, 12 May 2020 12:08:03 +0300 |
User-agent: |
Mutt/1.5.24 (2015-08-30) |
On Tue, May 12, 2020 at 11:26:11AM +0800, Jason Wang wrote:
>
> On 2020/5/11 下午5:11, Dima Stepanov wrote:
> >On Mon, May 11, 2020 at 11:05:58AM +0800, Jason Wang wrote:
> >>On 2020/4/30 下午9:36, Dima Stepanov wrote:
> >>>Since disconnect can happen at any time during initialization not all
> >>>vring buffers (for instance used vring) can be intialized successfully.
> >>>If the buffer was not initialized then vhost_memory_unmap call will lead
> >>>to SIGSEGV. Add checks for the vring address value before calling unmap.
> >>>Also add assert() in the vhost_memory_unmap() routine.
> >>>
> >>>Signed-off-by: Dima Stepanov <address@hidden>
> >>>---
> >>> hw/virtio/vhost.c | 27 +++++++++++++++++++++------
> >>> 1 file changed, 21 insertions(+), 6 deletions(-)
> >>>
> >>>diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
> >>>index ddbdc53..3ee50c4 100644
> >>>--- a/hw/virtio/vhost.c
> >>>+++ b/hw/virtio/vhost.c
> >>>@@ -314,6 +314,8 @@ static void vhost_memory_unmap(struct vhost_dev *dev,
> >>>void *buffer,
> >>> hwaddr len, int is_write,
> >>> hwaddr access_len)
> >>> {
> >>>+ assert(buffer);
> >>>+
> >>> if (!vhost_dev_has_iommu(dev)) {
> >>> cpu_physical_memory_unmap(buffer, len, is_write, access_len);
> >>> }
> >>>@@ -1132,12 +1134,25 @@ static void vhost_virtqueue_stop(struct vhost_dev
> >>>*dev,
> >>> vhost_vq_index);
> >>> }
> >>>- vhost_memory_unmap(dev, vq->used, virtio_queue_get_used_size(vdev,
> >>>idx),
> >>>- 1, virtio_queue_get_used_size(vdev, idx));
> >>>- vhost_memory_unmap(dev, vq->avail, virtio_queue_get_avail_size(vdev,
> >>>idx),
> >>>- 0, virtio_queue_get_avail_size(vdev, idx));
> >>>- vhost_memory_unmap(dev, vq->desc, virtio_queue_get_desc_size(vdev,
> >>>idx),
> >>>- 0, virtio_queue_get_desc_size(vdev, idx));
> >>>+ /*
> >>>+ * Since the vhost-user disconnect can happen during initialization
> >>>+ * check if vring was initialized, before making unmap.
> >>>+ */
> >>>+ if (vq->used) {
> >>>+ vhost_memory_unmap(dev, vq->used,
> >>>+ virtio_queue_get_used_size(vdev, idx),
> >>>+ 1, virtio_queue_get_used_size(vdev, idx));
> >>>+ }
> >>>+ if (vq->avail) {
> >>>+ vhost_memory_unmap(dev, vq->avail,
> >>>+ virtio_queue_get_avail_size(vdev, idx),
> >>>+ 0, virtio_queue_get_avail_size(vdev, idx));
> >>>+ }
> >>>+ if (vq->desc) {
> >>>+ vhost_memory_unmap(dev, vq->desc,
> >>>+ virtio_queue_get_desc_size(vdev, idx),
> >>>+ 0, virtio_queue_get_desc_size(vdev, idx));
> >>>+ }
> >>
> >>Any reason not checking hdev->started instead? vhost_dev_start() will set it
> >>to true if virtqueues were correctly mapped.
> >>
> >>Thanks
> >Well i see it a little bit different:
> > - vhost_dev_start() sets hdev->started to true before starting
> > virtqueues
> > - vhost_virtqueue_start() maps all the memory
> >If we hit the vhost disconnect at the start of the
> >vhost_virtqueue_start(), for instance for this call:
> > r = dev->vhost_ops->vhost_set_vring_base(dev, &state);
> >Then we will call vhost_user_blk_disconnect:
> > vhost_user_blk_disconnect()->
> > vhost_user_blk_stop()->
> > vhost_dev_stop()->
> > vhost_virtqueue_stop()
> >As a result we will come in this routine with the hdev->started still
> >set to true, but if used/avail/desc fields still uninitialized and set
> >to 0.
>
>
> I may miss something, but consider both vhost_dev_start() and
> vhost_user_blk_disconnect() were serialized in main loop. Can this really
> happen?
Yes, consider the case when we start the vhost-user-blk device:
vhost_dev_start->
vhost_virtqueue_start
And we got a disconnect in the middle of vhost_virtqueue_start()
routine, for instance:
1000 vq->num = state.num = virtio_queue_get_num(vdev, idx);
1001 r = dev->vhost_ops->vhost_set_vring_num(dev, &state);
1002 if (r) {
1003 VHOST_OPS_DEBUG("vhost_set_vring_num failed");
1004 return -errno;
1005 }
--> Here we got a disconnect <--
1006
1007 state.num = virtio_queue_get_last_avail_idx(vdev, idx);
1008 r = dev->vhost_ops->vhost_set_vring_base(dev, &state);
1009 if (r) {
1010 VHOST_OPS_DEBUG("vhost_set_vring_base failed");
1011 return -errno;
1012 }
As a result call to vhost_set_vring_base will call the disconnect
routine. The backtrace log for SIGSEGV is as follows:
Thread 4 "qemu-system-x86" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7ffff2ea9700 (LWP 183150)]
0x00007ffff4d60840 in ?? () from /lib/x86_64-linux-gnu/libc.so.6
(gdb) bt
#0 0x00007ffff4d60840 in ?? () from /lib/x86_64-linux-gnu/libc.so.6
#1 0x000055555590fd90 in flatview_write_continue (fv=0x7fffec4a2600,
addr=0, attrs=..., ptr=0x0, len=1028, addr1=0,
l=1028, mr=0x555556b1b310) at ./exec.c:3142
#2 0x000055555590fe98 in flatview_write (fv=0x7fffec4a2600, addr=0,
attrs=..., buf=0x0, len=1028) at ./exec.c:3177
#3 0x00005555559101ed in address_space_write (as=0x555556893940
<address_space_memory>, addr=0, attrs=..., buf=0x0,
len=1028) at ./exec.c:3268
#4 0x0000555555910caf in address_space_unmap (as=0x555556893940
<address_space_memory>, buffer=0x0, len=1028,
is_write=true, access_len=1028) at ./exec.c:3592
#5 0x0000555555910d82 in cpu_physical_memory_unmap (buffer=0x0,
len=1028, is_write=true, access_len=1028) at ./exec.c:3613
#6 0x0000555555a16fa1 in vhost_memory_unmap (dev=0x7ffff22723e8,
buffer=0x0, len=1028, is_write=1, access_len=1028)
at ./hw/virtio/vhost.c:318
#7 0x0000555555a192a2 in vhost_virtqueue_stop (dev=0x7ffff22723e8,
vdev=0x7ffff22721a0, vq=0x55555770abf0, idx=0) at ./hw/virtio/vhost.c:1136
#8 0x0000555555a1acc0 in vhost_dev_stop (hdev=0x7ffff22723e8,
vdev=0x7ffff22721a0) at ./hw/virtio/vhost.c:1702
#9 0x00005555559b6532 in vhost_user_blk_stop (vdev=0x7ffff22721a0)
at ./hw/block/vhost-user-blk.c:196
#10 0x00005555559b6b73 in vhost_user_blk_disconnect (dev=0x7ffff22721a0)
at ./hw/block/vhost-user-blk.c:365
#11 0x00005555559b6c4f in vhost_user_blk_event (opaque=0x7ffff22721a0,
event=CHR_EVENT_CLOSED) at ./hw/block/vhost-user-blk.c:384
#12 0x0000555555e65f7e in chr_be_event (s=0x555556b182e0,
event=CHR_EVENT_CLOSED)
at chardev/char.c:60
#13 0x0000555555e6601a in qemu_chr_be_event (s=0x555556b182e0,
event=CHR_EVENT_CLOSED) at chardev/char.c:80
#14 0x0000555555e6eef3 in tcp_chr_disconnect_locked (chr=0x555556b182e0)
at chardev/char-socket.c:488
#15 0x0000555555e6e23f in tcp_chr_write (chr=0x555556b182e0,
buf=0x7ffff2ea8220 "\n", len=20) at chardev/char-socket.c:178
#16 0x0000555555e6616c in qemu_chr_write_buffer (s=0x555556b182e0,
buf=0x7ffff2ea8220 "\n", len=20, offset=0x7ffff2ea8150, write_all=true)
at chardev/char.c:120
#17 0x0000555555e662d9 in qemu_chr_write (s=0x555556b182e0,
buf=0x7ffff2ea8220 "\n",
len=20, write_all=true) at chardev/char.c:155
#18 0x0000555555e693cc in qemu_chr_fe_write_all (be=0x7ffff2272360,
buf=0x7ffff2ea8220 "\n", len=20) at chardev/char-fe.c:53
#19 0x0000555555a1c489 in vhost_user_write (dev=0x7ffff22723e8,
msg=0x7ffff2ea8220, fds=0x0, fd_num=0) at ./hw/virtio/vhost-user.c:350
#20 0x0000555555a1d325 in vhost_set_vring (dev=0x7ffff22723e8,
request=10, ring=0x7ffff2ea8520) at ./hw/virtio/vhost-user.c:660
#21 0x0000555555a1d4c6 in vhost_user_set_vring_base (dev=0x7ffff22723e8,
ring=0x7ffff2ea8520) at ./hw/virtio/vhost-user.c:704
#22 0x0000555555a18c1b in vhost_virtqueue_start (dev=0x7ffff22723e8,
vdev=0x7ffff22721a0, vq=0x55555770abf0, idx=0) at ./hw/virtio/vhost.c:1009
#23 0x0000555555a1a9f5 in vhost_dev_start (hdev=0x7ffff22723e8,
vdev=0x7ffff22721a0)
at ./hw/virtio/vhost.c:1639
#24 0x00005555559b6367 in vhost_user_blk_start (vdev=0x7ffff22721a0)
at ./hw/block/vhost-user-blk.c:150
#25 0x00005555559b6653 in vhost_user_blk_set_status (vdev=0x7ffff22721a0,
status=15 '\017')
at ./hw/block/vhost-user-blk.c:233
#26 0x0000555555a1072d in virtio_set_status (vdev=0x7ffff22721a0, val=15
'\017')
at ./hw/virtio/virtio.c:1956
---Type <return> to continue, or q <return> to quit---
So while we inside vhost_user_blk_start() (frame#24) we are calling
vhost_user_blk_disconnect() (frame#10). And for this call the
dev->started field will be set to true.
(gdb) frame 8
#8 0x0000555555a1acc0 in vhost_dev_stop (hdev=0x7ffff22723e8,
vdev=0x7ffff22721a0)
at ./hw/virtio/vhost.c:1702
1702 vhost_virtqueue_stop(hdev,
(gdb) p hdev->started
$1 = true
It isn't an easy race to reproduce: to hit a disconnect at this window.
We were able to hit it during long testing on one of the reconnect
iteration. After it we were able to reproduce it with 100% by adding a
sleep() call inside qemu to make the race window bigger.
>
> Thanks
>
>
> >
> >>
> >>> }
> >>> static void vhost_eventfd_add(MemoryListener *listener,
>