Re: [Qemu-devel] [PATCH 0/3] vhost-user reconnect
From: Marc-André Lureau
Subject: Re: [Qemu-devel] [PATCH 0/3] vhost-user reconnect
Date: Mon, 20 Aug 2018 15:11:19 +0200
Hi
On Mon, Aug 20, 2018 at 2:51 PM, Yury Kotov <address@hidden> wrote:
> 16.08.2018, 18:36, "Marc-André Lureau" <address@hidden>:
>> On Thu, Aug 16, 2018 at 5:32 PM, Yury Kotov <address@hidden> wrote:
>>> We are using QEMU (2.12.0) with SPDK (18.04.1) over vhost-user to emulate
>>> block devices. One of our use cases is to restart SPDK without restarting
>>> the VM (for updates and the like). We tried to use the 'reconnect' option
>>> for the '-chardev' device:
>>> -object memory-backend-file,id=mem0,size=1G,mem-path=/dev/hugepages,share=on \
>>> -numa node,memdev=mem0 \
>>> -chardev socket,id=spdk_vhost_blk1,path=/var/tmp/vhost.1,reconnect=10 \
>>> -device vhost-user-blk-pci,chardev=spdk_vhost_blk1,num-queues=4
>>>
>>> After this, vhost-user-blk initialization fails with the error below:
>>> qemu-system-x86_64: -device ...: Failed to set msg fds.
>>> qemu-system-x86_64: -device ...: vhost-user-blk: vhost initialization failed: Operation not permitted
>>>
>>> We got the same error with the latest QEMU (c542a9f9794ec8e0bc3f).
>>>
>>> We investigated and found two issues:
>>>
>>> 1. The reconnect option postpones the first connection until the machine
>>> init done event. But we need this connection during vhost blk device
>>> initialization, which happens before machine init done is handled.
>>>
>>> 2. If the connection is forced, the reconnection succeeds after an SPDK
>>> restart, but the virtual queue will not start. The reason is that the
>>> virtual queue initialization commands have to be resent:
>>> * VHOST_USER_SET_FEATURES
>>> * VHOST_USER_SET_MEM_TABLE
>>> * VHOST_USER_SET_VRING_NUM
>>> * VHOST_USER_SET_VRING_BASE
>>> * VHOST_USER_SET_VRING_ADDR
>>> * VHOST_USER_SET_VRING_KICK
>>> * VHOST_USER_SET_VRING_CALL
>>>
>>> The patch set resolves both of these issues.
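
The resend sequence listed in issue 2 can be sketched as a small replay loop. This is only an illustration, not the code from the patch set: the enum, the `send_fn` callback, and `replay_after_reconnect` are hypothetical names, and the split into "once per device" and "once per queue" requests follows the vhost-user protocol, where the SET_VRING_* requests address a single vring.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical identifiers that mirror the request names quoted above.
 * They are NOT QEMU's actual enum values or types. */
enum replay_req {
    REQ_SET_FEATURES,
    REQ_SET_MEM_TABLE,
    REQ_SET_VRING_NUM,
    REQ_SET_VRING_BASE,
    REQ_SET_VRING_ADDR,
    REQ_SET_VRING_KICK,
    REQ_SET_VRING_CALL
};

/* Hypothetical transport callback: sends one request to the backend.
 * queue is -1 for device-wide requests. Returns 0 on success. */
typedef int (*send_fn)(enum replay_req req, int queue, void *opaque);

/* Device-wide state, resent once after each reconnect. */
static const enum replay_req dev_reqs[] = {
    REQ_SET_FEATURES, REQ_SET_MEM_TABLE
};

/* Per-queue state, resent for every virtqueue, in the order listed above. */
static const enum replay_req vq_reqs[] = {
    REQ_SET_VRING_NUM, REQ_SET_VRING_BASE, REQ_SET_VRING_ADDR,
    REQ_SET_VRING_KICK, REQ_SET_VRING_CALL
};

/* Replays all initialization requests after the socket has reconnected.
 * Returns 0 on success, or the first nonzero value returned by send_req. */
int replay_after_reconnect(int num_queues, send_fn send_req, void *opaque)
{
    size_t i;
    int q, ret;

    assert(send_req != NULL);

    for (i = 0; i < sizeof(dev_reqs) / sizeof(dev_reqs[0]); i++) {
        ret = send_req(dev_reqs[i], -1, opaque);
        if (ret != 0) {
            return ret;
        }
    }
    for (q = 0; q < num_queues; q++) {
        for (i = 0; i < sizeof(vq_reqs) / sizeof(vq_reqs[0]); i++) {
            ret = send_req(vq_reqs[i], q, opaque);
            if (ret != 0) {
                return ret;
            }
        }
    }
    return 0;
}

/* Example callback that only counts requests; opaque points to an int. */
int count_send(enum replay_req req, int queue, void *opaque)
{
    (void)req;
    (void)queue;
    ++*(int *)opaque;
    return 0;
}
```

With `num-queues=4` as in the command line above, calling `replay_after_reconnect(4, count_send, &n)` issues 2 device-wide requests plus 5 requests for each of the 4 queues. A real implementation would send the saved feature bits, memory regions, and vring state rather than just counting.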
>>>
>>> Test case:
>>>
>>> 1. Start fio process (inside VM):
>>> fio --name test --ioengine=libaio --iodepth=64 --bs=4096 \
>>> --rw=randrw --direct=1 --sync=1 --verify=md5 \
>>> --size=64M --filename=/dev/vda --loops=100
>>>
>>> 2. Restart SPDK many times.
>>> We expect fio to pause while SPDK restarts and to continue working
>>> once the restart completes.
>>>
>>> 3. fio process completed successfully without any error.
>>
>> Can you write a test case in vhost-user-test.c ? (perhaps under
>> QTEST_VHOST_USER_FIXME scope...)
>>
>
> This is a great idea and we were definitely going to do that in the coming
> couple of weeks. We thought we could make a follow-up commit with the
> necessary tests a bit later though, since we first need to figure out the
> state of the vhost-user tests in general before we can try to add anything
> new, and that will take some time. So far we have stress-tested these
> fixes manually.
Yes, some vhost-user tests are disabled by default (sadly because of
Travis CI constraints, not because of an actual bug), and it's easy to
introduce regressions. I sent a related series "[PATCH 0/4] Fix socket
chardev regression" to make it work again.
> Do you suggest we wait with this series as well until we have all tests
> ready? Or do we proceed now and make a follow-up series with vhost-user tests
> later, as we suggested?
I would rather have the tests with the series.
>
>>> Yury Kotov (3):
>>> chardev: prevent extra connection attempt in tcp_chr_machine_done_hook
>>> vhost: refactor vhost_dev_start and vhost_virtqueue_start
>>> vhost-user: add reconnect support for vhost-user
>>>
>>> chardev/char-socket.c | 5 +-
>>> hw/virtio/vhost-user.c | 65 ++++++++++++--
>>> hw/virtio/vhost.c | 223 +++++++++++++++++++++++++++++++---------------
>>> include/hw/virtio/vhost.h | 2 +
>>> 4 files changed, 215 insertions(+), 80 deletions(-)
>>>
>>> --
>>> 2.7.4