From: Stefan Hajnoczi
Subject: Re: [Qemu-devel] [RFC 0/2] virtio-vhost-user: add virtio-vhost-user device
Date: Tue, 6 Feb 2018 14:13:34 +0000
User-agent: Mutt/1.9.1 (2017-09-22)

On Tue, Feb 06, 2018 at 12:42:36PM +0000, Wang, Wei W wrote:
> On Tuesday, February 6, 2018 5:32 PM, Stefan Hajnoczi wrote:
> > On Tue, Feb 06, 2018 at 01:28:25AM +0000, Wang, Wei W wrote:
> > > On Tuesday, February 6, 2018 12:26 AM, Stefan Hajnoczi wrote:
> > > > On Fri, Feb 02, 2018 at 09:08:44PM +0800, Wei Wang wrote:
> > > > > On 02/02/2018 01:08 AM, Michael S. Tsirkin wrote:
> > > > > > On Tue, Jan 30, 2018 at 08:09:19PM +0800, Wei Wang wrote:
> > > > > > > Issues:
> > > > > > > Suppose we have both the vhost and virtio-net set up, and
> > > > > > > vhost pmd <-> virtio-net pmd communication works well. Now,
> > > > > > > vhost pmd exits (virtio-net pmd is still there). Some time
> > > > > > > later, we re-run vhost pmd, the vhost pmd doesn't know the
> > > > > > > virtqueue addresses of the virtio-net pmd, unless the
> > > > > > > virtio-net pmd reloads to start the 2nd phase of the
> > > > > > > vhost-user protocol. So the second run of the vhost
> > > > > > > pmd won't work.
> > > > > > >
> > > > > > > Any thoughts?
> > > > > > >
> > > > > > > Best,
> > > > > > > Wei
> > > > > > So vhost in qemu must resend all configuration on reconnect.
> > > > > > Does this address the issues?
> > > > > >
> > > > >
> > > > > Yes, but the issues are
> > > > > 1) there is no reconnecting when a pmd exits (the socket
> > > > > connection seems still on at the device layer);
> > > >
> > > > This is how real hardware works too. If the driver suddenly stops
> > > > running then the device remains operational. When the driver is
> > > > started again it resets the device and initializes it.
> > > >
> > > > > 2) If we find a way to break the QEMU layer socket connection
> > > > > when pmd exits and get it reconnect, virtio-net device still
> > > > > won't send all the configure when reconnecting, because socket
> > > > > connecting only triggers phase 1 of vhost-user negotiation (i.e.
> > > > > vhost_user_init). Phase 2 is triggered after the driver loads
> > > > > (i.e. vhost_net_start). If the virtio-net pmd doesn't reload,
> > > > > there are no phase 2 messages (like virtqueue addresses which
> > > > > are allocated by the pmd). I think we need to think more about
> > > > > this before moving forward.
> > > >
> > > > Marc-André: How does vhost-user reconnect work when the master
> > > > goes away and a new master comes online? Wei found that the QEMU
> > > > slave implementation only does partial vhost-user initialization
> > > > upon reconnect, so the new master doesn't get the virtqueue
> > > > address and related information.
> > > > Is this a QEMU bug?
> > >
> > > Actually we are discussing the slave (vhost is the slave, right?) going
> > > away. When a slave exits and some moment later a new slave runs, the master
> > > (virtio-net) won't send the virtqueue addresses to the new vhost slave.
> >
> > Yes, apologies for the typo. s/QEMU slave/QEMU master/
> >
> > Yesterday I asked Marc-André for help on IRC and we found the code
> > path where the QEMU master performs phase 2 negotiation upon
> > reconnect. It's not obvious but the qmp_set_link() calls in
> > net_vhost_user_event() will do it.
> >
> > I'm going to try to reproduce the issue you're seeing now. Will let
> > you know what I find.
> >
>
> OK. Thanks. I observed no messages after re-run virtio-vhost-user pmd, and
> found there is no re-connection event happening in the device side.
>
> I also tried to switch the role of client/server - virtio-net to run a server
> socket, and virtio-vhost-user to run the client, and it seems the current
> code fails to run that way. The reason is the virtio-net side
> vhost_user_get_features() doesn't return. On the vhost side, I don't see
> virtio_vhost_user_deliver_m2s being invoked to deliver the GET_FEATURES
> message. I'll come back to continue later.

This morning I reached the conclusion that reconnection is currently
broken in the QEMU vhost-user master.  It's a bug in the QEMU vhost-user
master implementation, not a design or protocol problem.

On my machine the following QEMU command-line does not launch because
vhost-user.c gets stuck while trying to connect/negotiate:

  qemu -M accel=kvm -cpu host -m 1G \
       -object memory-backend-file,id=mem0,mem-path=/var/tmp/foo,size=1G,share=on \
       -numa node,memdev=mem0 \
       -drive if=virtio,file=test.img,format=raw \
       -chardev socket,id=chardev0,path=vhost-user.sock,reconnect=1 \
       -netdev vhost-user,chardev=chardev0,id=netdev0 \
       -device virtio-net-pci,netdev=netdev0

Commit c89804d674e4e3804bd3ac1fe79650896044b4e8 ("vhost-user: wait until
backend init is completed") broke reconnect by introducing a call to
qemu_chr_fe_wait_connected().

qemu_chr_fe_wait_connected() doesn't work together with -chardev
...,reconnect=1.  This is because reconnect=1 connects asynchronously
and then qemu_chr_fe_wait_connected() connects synchronously (if the
async connect hasn't completed yet).  This means there will be 2 sockets
connecting to the vhost-user slave!

The virtio-vhost-user slave accepts the first connection but never
receives any data because the QEMU master is trying to use the 2nd
socket instead.
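
The socket-level effect described above can be reproduced outside QEMU.
The following is only a minimal sketch, assuming a server that services
the first accepted connection; it does not use QEMU's chardev code.  The
function name, the socket path, and the b"GET_FEATURES" payload are
placeholders for illustration, not the vhost-user wire format.

```python
import os
import socket
import tempfile


def demo_double_connect(timeout=0.5):
    """Connect twice to one UNIX socket server, talk only on the 2nd socket.

    Returns (data seen on the 1st accepted connection or None,
             data seen on the 2nd accepted connection).
    """
    tmpdir = tempfile.mkdtemp()
    path = os.path.join(tmpdir, "vhost-user.sock")  # hypothetical path
    socks = []
    try:
        server = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        socks.append(server)
        server.bind(path)
        server.listen(5)

        # Stands in for the asynchronous connect started by reconnect=1.
        async_sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        socks.append(async_sock)
        async_sock.connect(path)

        # Stands in for the additional synchronous connect.
        sync_sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        socks.append(sync_sock)
        sync_sock.connect(path)

        # The client only ever talks on the second socket.
        sync_sock.sendall(b"GET_FEATURES")

        # The server accepts connections in the order they were made.
        first_conn, _ = server.accept()
        socks.append(first_conn)
        first_conn.settimeout(timeout)
        try:
            first_data = first_conn.recv(64)
        except socket.timeout:
            first_data = None  # nothing ever arrives on the first socket

        second_conn, _ = server.accept()
        socks.append(second_conn)
        second_conn.settimeout(timeout)
        second_data = second_conn.recv(64)

        return first_data, second_data
    finally:
        for s in socks:
            s.close()
        if os.path.exists(path):
            os.unlink(path)
        os.rmdir(tmpdir)


if __name__ == "__main__":
    print(demo_double_connect())
```

A server that only looks at the first accepted connection waits forever
in this situation, while the request sits unread on the second one.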

Reconnection probably worked when Marc-André implemented it since QEMU
wasn't using qemu_chr_fe_wait_connected().

Marc-André: How do you think this should be fixed?

Stefan