
Re: [PATCH v2 4/5] vhost: check vring address before calling unmap


From: Dima Stepanov
Subject: Re: [PATCH v2 4/5] vhost: check vring address before calling unmap
Date: Tue, 12 May 2020 12:08:03 +0300
User-agent: Mutt/1.5.24 (2015-08-30)

On Tue, May 12, 2020 at 11:26:11AM +0800, Jason Wang wrote:
> 
> On 2020/5/11 5:11 PM, Dima Stepanov wrote:
> >On Mon, May 11, 2020 at 11:05:58AM +0800, Jason Wang wrote:
> >>On 2020/4/30 9:36 PM, Dima Stepanov wrote:
> >>>Since disconnect can happen at any time during initialization, not all
> >>>vring buffers (for instance the used vring) can be initialized successfully.
> >>>If a buffer was not initialized, the vhost_memory_unmap call will lead
> >>>to SIGSEGV. Add checks for the vring address value before calling unmap.
> >>>Also add an assert() in the vhost_memory_unmap() routine.
> >>>
> >>>Signed-off-by: Dima Stepanov <address@hidden>
> >>>---
> >>>  hw/virtio/vhost.c | 27 +++++++++++++++++++++------
> >>>  1 file changed, 21 insertions(+), 6 deletions(-)
> >>>
> >>>diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
> >>>index ddbdc53..3ee50c4 100644
> >>>--- a/hw/virtio/vhost.c
> >>>+++ b/hw/virtio/vhost.c
> >>>@@ -314,6 +314,8 @@ static void vhost_memory_unmap(struct vhost_dev *dev, void *buffer,
> >>>                                 hwaddr len, int is_write,
> >>>                                 hwaddr access_len)
> >>>  {
> >>>+    assert(buffer);
> >>>+
> >>>      if (!vhost_dev_has_iommu(dev)) {
> >>>          cpu_physical_memory_unmap(buffer, len, is_write, access_len);
> >>>      }
> >>>@@ -1132,12 +1134,25 @@ static void vhost_virtqueue_stop(struct vhost_dev *dev,
> >>>                                                  vhost_vq_index);
> >>>      }
> >>>-    vhost_memory_unmap(dev, vq->used, virtio_queue_get_used_size(vdev, idx),
> >>>-                       1, virtio_queue_get_used_size(vdev, idx));
> >>>-    vhost_memory_unmap(dev, vq->avail, virtio_queue_get_avail_size(vdev, idx),
> >>>-                       0, virtio_queue_get_avail_size(vdev, idx));
> >>>-    vhost_memory_unmap(dev, vq->desc, virtio_queue_get_desc_size(vdev, idx),
> >>>-                       0, virtio_queue_get_desc_size(vdev, idx));
> >>>+    /*
> >>>+     * Since the vhost-user disconnect can happen during initialization
> >>>+     * check if vring was initialized, before making unmap.
> >>>+     */
> >>>+    if (vq->used) {
> >>>+        vhost_memory_unmap(dev, vq->used,
> >>>+                           virtio_queue_get_used_size(vdev, idx),
> >>>+                           1, virtio_queue_get_used_size(vdev, idx));
> >>>+    }
> >>>+    if (vq->avail) {
> >>>+        vhost_memory_unmap(dev, vq->avail,
> >>>+                           virtio_queue_get_avail_size(vdev, idx),
> >>>+                           0, virtio_queue_get_avail_size(vdev, idx));
> >>>+    }
> >>>+    if (vq->desc) {
> >>>+        vhost_memory_unmap(dev, vq->desc,
> >>>+                           virtio_queue_get_desc_size(vdev, idx),
> >>>+                           0, virtio_queue_get_desc_size(vdev, idx));
> >>>+    }
> >>
> >>Any reason for not checking hdev->started instead? vhost_dev_start() will set it
> >>to true if virtqueues were correctly mapped.
> >>
> >>Thanks
> >Well, I see it a bit differently:
> >  - vhost_dev_start() sets hdev->started to true before starting
> >    virtqueues
> >  - vhost_virtqueue_start() maps all the memory
> >If we hit the vhost disconnect at the start of
> >vhost_virtqueue_start(), for instance at this call:
> >   r = dev->vhost_ops->vhost_set_vring_base(dev, &state);
> >then we will end up calling vhost_user_blk_disconnect:
> >   vhost_user_blk_disconnect()->
> >     vhost_user_blk_stop()->
> >       vhost_dev_stop()->
> >         vhost_virtqueue_stop()
> >As a result we come into this routine with hdev->started still set to
> >true, but with the used/avail/desc fields still uninitialized and set
> >to 0.
> 
> 
> I may be missing something, but considering that both vhost_dev_start() and
> vhost_user_blk_disconnect() are serialized in the main loop, can this really
> happen?
Yes, consider the case when we start the vhost-user-blk device:
  vhost_dev_start->
    vhost_virtqueue_start
And we get a disconnect in the middle of the vhost_virtqueue_start()
routine, for instance:
  1000     vq->num = state.num = virtio_queue_get_num(vdev, idx);
  1001     r = dev->vhost_ops->vhost_set_vring_num(dev, &state);
  1002     if (r) {
  1003         VHOST_OPS_DEBUG("vhost_set_vring_num failed");
  1004         return -errno;
  1005     }
  --> Here we get a disconnect <--
  1006 
  1007     state.num = virtio_queue_get_last_avail_idx(vdev, idx);
  1008     r = dev->vhost_ops->vhost_set_vring_base(dev, &state);
  1009     if (r) {
  1010         VHOST_OPS_DEBUG("vhost_set_vring_base failed");
  1011         return -errno;
  1012     }
As a result, the call to vhost_set_vring_base will trigger the disconnect
routine. The backtrace for the SIGSEGV is as follows:
  Thread 4 "qemu-system-x86" received signal SIGSEGV, Segmentation fault.
  [Switching to Thread 0x7ffff2ea9700 (LWP 183150)]
  0x00007ffff4d60840 in ?? () from /lib/x86_64-linux-gnu/libc.so.6
  (gdb) bt
  #0  0x00007ffff4d60840 in ?? () from /lib/x86_64-linux-gnu/libc.so.6
  #1  0x000055555590fd90 in flatview_write_continue (fv=0x7fffec4a2600,
      addr=0, attrs=..., ptr=0x0, len=1028, addr1=0, 
      l=1028, mr=0x555556b1b310) at ./exec.c:3142
  #2  0x000055555590fe98 in flatview_write (fv=0x7fffec4a2600, addr=0,
      attrs=..., buf=0x0, len=1028) at ./exec.c:3177
  #3  0x00005555559101ed in address_space_write (as=0x555556893940
      <address_space_memory>, addr=0, attrs=..., buf=0x0, 
      len=1028) at ./exec.c:3268
  #4  0x0000555555910caf in address_space_unmap (as=0x555556893940
      <address_space_memory>, buffer=0x0, len=1028, 
      is_write=true, access_len=1028) at ./exec.c:3592
  #5  0x0000555555910d82 in cpu_physical_memory_unmap (buffer=0x0,
      len=1028, is_write=true, access_len=1028) at ./exec.c:3613
  #6  0x0000555555a16fa1 in vhost_memory_unmap (dev=0x7ffff22723e8,
      buffer=0x0, len=1028, is_write=1, access_len=1028)
      at ./hw/virtio/vhost.c:318
  #7  0x0000555555a192a2 in vhost_virtqueue_stop (dev=0x7ffff22723e8,
      vdev=0x7ffff22721a0, vq=0x55555770abf0, idx=0) at ./hw/virtio/vhost.c:1136
  #8  0x0000555555a1acc0 in vhost_dev_stop (hdev=0x7ffff22723e8,
      vdev=0x7ffff22721a0) at ./hw/virtio/vhost.c:1702
  #9  0x00005555559b6532 in vhost_user_blk_stop (vdev=0x7ffff22721a0)
      at ./hw/block/vhost-user-blk.c:196
  #10 0x00005555559b6b73 in vhost_user_blk_disconnect (dev=0x7ffff22721a0)
      at ./hw/block/vhost-user-blk.c:365
  #11 0x00005555559b6c4f in vhost_user_blk_event (opaque=0x7ffff22721a0,
      event=CHR_EVENT_CLOSED) at ./hw/block/vhost-user-blk.c:384
  #12 0x0000555555e65f7e in chr_be_event (s=0x555556b182e0,
      event=CHR_EVENT_CLOSED) at chardev/char.c:60
  #13 0x0000555555e6601a in qemu_chr_be_event (s=0x555556b182e0,
      event=CHR_EVENT_CLOSED) at chardev/char.c:80
  #14 0x0000555555e6eef3 in tcp_chr_disconnect_locked (chr=0x555556b182e0)
      at chardev/char-socket.c:488
  #15 0x0000555555e6e23f in tcp_chr_write (chr=0x555556b182e0,
      buf=0x7ffff2ea8220 "\n", len=20) at chardev/char-socket.c:178
  #16 0x0000555555e6616c in qemu_chr_write_buffer (s=0x555556b182e0,
      buf=0x7ffff2ea8220 "\n", len=20, offset=0x7ffff2ea8150, write_all=true)
      at chardev/char.c:120
  #17 0x0000555555e662d9 in qemu_chr_write (s=0x555556b182e0,
      buf=0x7ffff2ea8220 "\n", len=20, write_all=true) at chardev/char.c:155
  #18 0x0000555555e693cc in qemu_chr_fe_write_all (be=0x7ffff2272360,
      buf=0x7ffff2ea8220 "\n", len=20) at chardev/char-fe.c:53
  #19 0x0000555555a1c489 in vhost_user_write (dev=0x7ffff22723e8,
      msg=0x7ffff2ea8220, fds=0x0, fd_num=0) at ./hw/virtio/vhost-user.c:350
  #20 0x0000555555a1d325 in vhost_set_vring (dev=0x7ffff22723e8,
      request=10, ring=0x7ffff2ea8520) at ./hw/virtio/vhost-user.c:660
  #21 0x0000555555a1d4c6 in vhost_user_set_vring_base (dev=0x7ffff22723e8,
      ring=0x7ffff2ea8520) at ./hw/virtio/vhost-user.c:704
  #22 0x0000555555a18c1b in vhost_virtqueue_start (dev=0x7ffff22723e8,
      vdev=0x7ffff22721a0, vq=0x55555770abf0, idx=0) at ./hw/virtio/vhost.c:1009
  #23 0x0000555555a1a9f5 in vhost_dev_start (hdev=0x7ffff22723e8,
      vdev=0x7ffff22721a0) at ./hw/virtio/vhost.c:1639
  #24 0x00005555559b6367 in vhost_user_blk_start (vdev=0x7ffff22721a0)
      at ./hw/block/vhost-user-blk.c:150
  #25 0x00005555559b6653 in vhost_user_blk_set_status (vdev=0x7ffff22721a0,
      status=15 '\017') at ./hw/block/vhost-user-blk.c:233
  #26 0x0000555555a1072d in virtio_set_status (vdev=0x7ffff22721a0,
      val=15 '\017') at ./hw/virtio/virtio.c:1956

So while we are inside vhost_user_blk_start() (frame #24) we end up calling
vhost_user_blk_disconnect() (frame #10), and for this call the
dev->started field is already set to true:
  (gdb) frame 8
  #8  0x0000555555a1acc0 in vhost_dev_stop (hdev=0x7ffff22723e8,
      vdev=0x7ffff22721a0) at ./hw/virtio/vhost.c:1702
  1702            vhost_virtqueue_stop(hdev,
  (gdb) p hdev->started
  $1 = true
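
To make the ordering explicit, here is a minimal standalone sketch (not QEMU
code; all names below are made up for illustration) of why gating on
hdev->started alone is not enough, and why the per-vring checks from the
patch avoid the crash:

  /* race-sketch.c -- compile with: cc -o race-sketch race-sketch.c */
  #include <assert.h>
  #include <stdbool.h>
  #include <stddef.h>
  #include <stdio.h>

  struct fake_vq {
      void *desc;
      void *avail;
      void *used;              /* all NULL until the vring is mapped */
  };

  struct fake_dev {
      bool started;
      struct fake_vq vq;
  };

  /* Stand-in for vhost_memory_unmap(): must never see a NULL buffer. */
  static void fake_unmap(void *buffer)
  {
      assert(buffer);
      printf("unmap %p\n", buffer);
  }

  /* Stand-in for vhost_virtqueue_stop() with the guards from the patch. */
  static void fake_virtqueue_stop(struct fake_dev *dev)
  {
      if (dev->vq.used) {
          fake_unmap(dev->vq.used);
      }
      if (dev->vq.avail) {
          fake_unmap(dev->vq.avail);
      }
      if (dev->vq.desc) {
          fake_unmap(dev->vq.desc);
      }
  }

  int main(void)
  {
      struct fake_dev dev = {0};

      dev.started = true;  /* "started" is set before the vrings are mapped */
      /* ...disconnect arrives here, before desc/avail/used are assigned... */
      if (dev.started) {
          /* Without the NULL checks this would unmap NULL and crash. */
          fake_virtqueue_stop(&dev);
      }
      return 0;
  }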

This race isn't easy to reproduce: the disconnect has to hit exactly this
window. We hit it during long testing on one of the reconnect iterations.
After that we were able to reproduce it 100% of the time by adding a
sleep() call inside qemu to make the race window bigger.
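
For reference, the kind of sleep() instrumentation that makes it reproducible
could look roughly like this in the excerpt above (illustrative only, not part
of the patch; needs <unistd.h>):

      r = dev->vhost_ops->vhost_set_vring_num(dev, &state);
      if (r) {
          VHOST_OPS_DEBUG("vhost_set_vring_num failed");
          return -errno;
      }

      /* Debug only: widen the race window so the vhost-user backend can be
       * killed while vhost_virtqueue_start() sits between the two calls. */
      sleep(5);

      state.num = virtio_queue_get_last_avail_idx(vdev, idx);
      r = dev->vhost_ops->vhost_set_vring_base(dev, &state);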

> 
> Thanks
> 
> 
> >
> >>
> >>>  }
> >>>  static void vhost_eventfd_add(MemoryListener *listener,
> 


