qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [PATCH 11/18] vhost-user: add shutdown support


From: Michael S. Tsirkin
Subject: Re: [Qemu-devel] [PATCH 11/18] vhost-user: add shutdown support
Date: Mon, 2 May 2016 13:45:31 +0300

On Sun, May 01, 2016 at 02:04:42PM -0700, Yuanhan Liu wrote:
> On Sun, May 01, 2016 at 02:37:19PM +0300, Michael S. Tsirkin wrote:
> > On Fri, Apr 29, 2016 at 10:48:35AM -0700, Yuanhan Liu wrote:
> > > > But, I
> > > > would worry first about a backend that crashes that it may corrupt the
> > > > VM memory too...
> > > 
> > > Not quite sure I follow this. But, backend just touches the virtio
> > > related memory, so it will do no harm to the VM?
> > 
> > It crashed so apparently it's not behaving as designed -
> > how can we be sure it does not harm the VM?
> 
> Hi Michael,
> 
> What I know so far, a crash might could corrupt the virtio memory in two
> ways:
> 
> - vring_used_elem might be in stale state when we are in the middle of
>   updating used vring. Say, we might have updated the "id" field, but
>   leaving "len" untouched.
> 
> - vring desc buff might also be in stale state, when we are copying
>   data into it.


- a random write into VM memory due to backend bug corrupts it.

> However, the two issues will not be real issue, as used->idx is not
> updated in both case. Thefore, those corrupted memory will not be
> processed by guest. So, no harm I'd say. Or, am I missing anything?

Why is backend crashing? It shouldn't so maybe this means it's
memory is corrupt. That is the claim.

> BTW, Michael, what's your thoughts on the whole reconnect stuff?
> 
>       --yliu

My thoughts are that we need to split these patchsets up.

There are several issues here:


1. Graceful disconnect
- One should be able to do vmstop, disconnect, connect then vm start
  This looks like a nice intermediate step.
- Why would we always assume it's always remote initiating the disconnect?
  Monitor commands for disconnecting seem called for.
  

2. Unexpected disconnect
- If buffers are processed in order or flushed before socket close,
  then on disconnect status can be recovered
  from ring alone. If that is of interest, we might want to add
  a protocol flag for that. F_DISCONNECT_FLUSH ? Without this,
  only graceful disconnect or reset with guest's help can be supported.
- As Marc-André said, without graceful shutdown it is not enough to
  handle ring state though.  We must handle socket getting disconnected
  in the middle of send/receive.  While it's more work,
  it does seem like a nice interface to support.
- I understand the concern that existing guests do not handle NEED_RESET
  status. One way to fix this would be a new feature bit. If NEED_RESET not
  handled, request hot-unplug instead.

3. Running while disconnected
- At the moment, we wait on vm start for remote to connect,
  if we support disconnecting backend without stopping
  we probably should also support starting it without waiting
  for connection
- Guests expect tx buffers to be used in a timely manner, thus:
- If vm is running we must use them in qemu, maybe discarding packets
  in the process.
  There already are commands for link down, I'm not sure there's value
  in doing this automatically in qemu.
- Alternatively, we should also have an option to stop vm automatically (like 
we do on
  disk errors) to reduce number of dropped packets.

4. Reconnecting
- When acting as a server, we might want an option to go back to
  listening state, but it should not be the only option,
  there should be a monitor command for moving it back to
  listening state.
- When acting as a client, auto-reconnect might be handy sometimes, but should 
not be the only
  option, there should be a way to manually request connect, possibly to
  a different target, so a monitor command for re-connecting seems called for.
- We'll also need monitor events for disconnects so management knows it
  must re-connect/restart listening.
- If we stopped VM, there could be an option to restart on reconnect.


-- 
MST



reply via email to

[Prev in Thread] Current Thread [Next in Thread]