On 2018/12/6 9:59 PM, Michael S. Tsirkin wrote:
On Thu, Dec 06, 2018 at 09:57:22PM +0800, Jason Wang wrote:
On 2018/12/6 2:35 PM, address@hidden wrote:
From: Xie Yongji <address@hidden>
This patchset aims to let qemu reconnect to the vhost-user-blk
backend after the backend crashes or restarts.
Patch 1 tries to implement the sync connection for
"reconnect socket".
Patch 2 introduces a new message, VHOST_USER_SET_VRING_INFLIGHT,
which offers shared memory to the backend so that it can record
its inflight I/O.
Patches 3 and 4 are the corresponding libvhost-user patches for
patch 2; they make libvhost-user support VHOST_USER_SET_VRING_INFLIGHT.
Patch 5 lets vhost-user-blk reconnect to the backend when the
connection is closed.
Patch 6 tells qemu that we support reconnecting now.
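To make the inflight-tracking idea behind patch 2 concrete, here is a minimal sketch in C. It is not the layout used by the series: the struct, the QUEUE_SIZE constant and the inflight_* helpers are all hypothetical names chosen for illustration. It assumes the shared region holds one flag per descriptor head, set when the backend fetches a request and cleared when the request completes, so that a restarted backend can find the requests it still owes.

```c
#include <assert.h>
#include <stdint.h>

#define QUEUE_SIZE 8 /* hypothetical ring size, for illustration only */

/* Hypothetical per-queue region shared between qemu and the backend.
 * It outlives a backend crash because qemu keeps the mapping alive. */
struct inflight_region {
    uint8_t inflight[QUEUE_SIZE]; /* 1 while descriptor head i is in flight */
};

/* Mark a descriptor head as fetched but not yet completed. */
static void inflight_begin(struct inflight_region *r, uint16_t head)
{
    assert(head < QUEUE_SIZE);
    r->inflight[head] = 1;
}

/* Clear the mark once the request has been completed. */
static void inflight_end(struct inflight_region *r, uint16_t head)
{
    assert(head < QUEUE_SIZE);
    r->inflight[head] = 0;
}

/* After a restart, collect the heads that were fetched but never
 * completed. Returns how many heads were written to out[]. */
static int inflight_recover(const struct inflight_region *r, uint16_t *out)
{
    int n = 0;
    for (uint16_t i = 0; i < QUEUE_SIZE; i++) {
        if (r->inflight[i]) {
            out[n++] = i;
        }
    }
    return n;
}
```

A restarted backend would call inflight_recover() on the region it receives from qemu and resubmit the returned heads before processing new requests.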
To use it, we could start qemu with:
qemu-system-x86_64 \
-chardev socket,id=char0,path=/path/vhost.socket,reconnect=1,wait \
-device vhost-user-blk-pci,chardev=char0
and start vhost-user-blk backend with:
vhost-user-blk -b /path/file -s /path/vhost.socket
Then we can restart vhost-user-blk at any time while the VM is running.
I wonder whether it would be better to handle this at the level of the virtio
protocol itself instead of at the vhost-user level. E.g. exposing last_avail_idx
to the driver might be sufficient?
Another possible issue is that it looks like you need to deal with different
kinds of ring layouts, e.g. packed virtqueues.
Thanks
I'm not sure I understand your comments here.
All these would be guest-visible extensions.
It looks like not; it only introduces a shared memory region between qemu and
the vhost-user backend?
Possible for sure but how is this related to
a patch supporting transparent reconnects?
I might be missing something. My understanding is that we support transparent
reconnects, but we can't deduce an accurate last_avail_idx, and that is
the capability this series tries to add. To me, this series is functionally
equivalent to exposing last_avail_idx (or avail_idx_cons) in the available
ring. The information would then be inside guest memory, so the vhost-user
backend can access it and update it directly. I believe this is what some
modern NICs do as well (but there the index is in an MMIO area, of course).
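As a rough illustration of the alternative described above, the sketch below keeps a consumer index next to the available ring, so a restarted backend can read it back instead of guessing. Nothing here is an existing virtio layout: the struct, the avail_idx_cons field placement, RING_SIZE and the helper names are assumptions made for the example.

```c
#include <assert.h>
#include <stdint.h>

#define RING_SIZE 8 /* hypothetical ring size, for illustration only */

/* Hypothetical available ring that also carries a consumer index. */
struct avail_ring_sketch {
    uint16_t idx;             /* written by the driver: next slot to fill */
    uint16_t ring[RING_SIZE]; /* descriptor heads published by the driver */
    uint16_t avail_idx_cons;  /* assumed field: written by the backend */
};

/* Entries published but not yet consumed. The 16-bit subtraction
 * stays correct when either index wraps around. */
static uint16_t ring_pending(const struct avail_ring_sketch *a)
{
    return (uint16_t)(a->idx - a->avail_idx_cons);
}

/* Consume one entry and advance the shared consumer index.
 * Returns the descriptor head, or -1 if the ring is empty. */
static int ring_consume(struct avail_ring_sketch *a)
{
    if (ring_pending(a) == 0) {
        return -1;
    }
    uint16_t head = a->ring[a->avail_idx_cons % RING_SIZE];
    a->avail_idx_cons++;
    return head;
}
```

Because avail_idx_cons lives in memory that survives the backend process, a new backend instance could resume from it directly. Note that this sketch only records what has been fetched; it says nothing about requests that were fetched but not yet completed.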