[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [Qemu-devel] [PATCH 05/24] vhost: change some assert() for error_rep
From: |
Michael S. Tsirkin |
Subject: |
Re: [Qemu-devel] [PATCH 05/24] vhost: change some assert() for error_report() or silent fail |
Date: |
Sat, 25 Jun 2016 01:38:01 +0300 |
On Fri, Jun 24, 2016 at 03:25:30PM +0200, Marc-André Lureau wrote:
> On Thu, Jun 23, 2016 at 7:13 PM, Michael S. Tsirkin <address@hidden> wrote:
> >> > If it's ok and we can recover, then why should we print errors?
> >>
> >> To me, the current disconnect handling is not handled cleanly. There
> >> is not much/nothing in the protocol that tells when and how you can
> >> disconnect. If qemu vhost-user reconnection works today, it is mostly
> >> by luck. Imho, we should print errors if any call to vhost user fails
> >> to help further analysis of broken behaviour.
> >
> > But we decided disconnect is OK, did we not? So now a failure like this
> > is just expected. It's not broken behaviour anymore ...
>
> As you know, there are many ways qemu or the vm can break when the
> backend is disconnected. For now, it would help a lot if we had a bit
> of error messages in this case. But do we really want to support
> spurious disconnect/reconnect at any time? It's going to be
> challenging, to maintain, to test... Is it really worthwhile? I doubt
> it. I think it would be easier and more future-proof to have a
> dedicated vhost-user request for that and only do a best effort for
> the ungraceful disconnect.
ungraceful might take a while. But I don't see a need for
message exchanges for shutdown. Do you?
> >> And we shoud have a
> >> clean mechanism to handle shutdown/disconnect/reconnect.
> >
> >
> > At some level this is something that's missing here.
> >
> > How about we patch docs/specs/vhost-user.txt
> > and describe how is reconnection supposed to work?
> >
> > All we have is:
> > If Master is unable to send the full message or receives a wrong
> > reply
> > it will close the connection. An optional reconnection mechanism
> > can be
> > implemented.
>
> This text is there from the original commit but it doesn't say how to
> reconnect and or how to recover from the ring. I don't have enough
> experience with virtio in general to say when it is acceptable for
> backend to be gone (not processing the ring), I can imagine the
> implementation vary widely, and the requirements depend on the kind of
> device.
>
> If we limit the spec to vhost-user protocol only and leave it open on
> how each virtio device should support ungraceful disconnect/reconnect,
> then it sounds reasonable to document that after such disconnect, on
> reconnect:
> - the server assumes the backend has no knowledge of previous connection
> - the state can be changed between reconnections (new ring state, mem table
> etc)
> - the server will check feature compatibility with previous backend
> and reject incompatible backends
> - backend must restore ring processing from used->idx and ignore
> VHOST_USER_SET_VRING_BASE (or should we change and document that in
> this case the ring base is updated from used->idx?)
I think it's cleaner to change VHOST_USER_SET_VRING_BASE to get stuff
from used->idx.
> (it's probably not a complete list, I am not good at imagining all
> possible cases)
used ring must be flushed before disconnect (so all entries
before used->idx have been processed, and no entries
after used->idx have been processed).
If enabled, dirty log must be flushed before disconnect.
>
> --
> Marc-André Lureau
- Re: [Qemu-devel] [PATCH 08/24] vhost-user: return a read error, (continued)
[Qemu-devel] [PATCH 09/24] vhost: make vhost_log_put() idempotent, marcandre . lureau, 2016/06/21
[Qemu-devel] [PATCH 05/24] vhost: change some assert() for error_report() or silent fail, marcandre . lureau, 2016/06/21
[Qemu-devel] [PATCH 10/24] vhost: call vhost_log_put() on cleanup, marcandre . lureau, 2016/06/21
[Qemu-devel] [PATCH 12/24] vhost: make vhost_dev_cleanup() idempotent, marcandre . lureau, 2016/06/21
[Qemu-devel] [PATCH 13/24] vhost-net: always call vhost_dev_cleanup() on failure, marcandre . lureau, 2016/06/21
[Qemu-devel] [PATCH 14/24] vhost: don't assume opaque is a fd, use backend cleanup, marcandre . lureau, 2016/06/21
[Qemu-devel] [PATCH 11/24] vhost: add vhost device only after all success, marcandre . lureau, 2016/06/21
[Qemu-devel] [PATCH 06/24] vhost-user: check vhost_user_write() return value, marcandre . lureau, 2016/06/21