From: Stefan Hajnoczi
Subject: Re: [RFC v4 PATCH 30/49] multi-process: send heartbeat messages to remote
Date: Thu, 21 Nov 2019 12:19:14 +0000
User-agent: Mutt/1.12.1 (2019-06-15)

On Wed, Nov 13, 2019 at 11:01:07AM -0500, Jag Raman wrote:
>
>
> On 11/11/2019 11:27 AM, Stefan Hajnoczi wrote:
> > On Thu, Oct 24, 2019 at 05:09:11AM -0400, Jagannathan Raman wrote:
> > > +static void broadcast_msg(MPQemuMsg *msg, bool need_reply)
> > > +{
> > > + PCIProxyDev *entry;
> > > + unsigned int pid;
> > > + int wait;
> > > +
> > > + QLIST_FOREACH(entry, &proxy_dev_list.devices, next) {
> > > + if (need_reply) {
> > > + wait = eventfd(0, EFD_NONBLOCK);
> > > + msg->num_fds = 1;
> > > + msg->fds[0] = wait;
> > > + }
> > > +
> > > + mpqemu_msg_send(entry->mpqemu_link, msg, entry->mpqemu_link->com);
> > > + if (need_reply) {
> > > + pid = (uint32_t)wait_for_remote(wait);
> >
> > Sometimes QEMU really needs to wait for the remote process before it can
> > make progress. I think this is not one of those cases though.
> >
> > Since QEMU is event-driven it's problematic to invoke blocking system
> > calls. The remote process might not respond for a significant amount of
> > time. Other QEMU threads will be held up waiting for the QEMU global
> > mutex in the meantime (because we hold it!).
>
> There are places where we wait synchronously for the remote process.
> However, these synchronous waits carry a timeout to prevent the hang
> situation you described above.
>
> We will add an error recovery in the future. That is, we will respawn
> the remote process if the QEMU times out waiting for it.

Even with a timeout, the event loop is blocked until the remote process
replies or the timeout expires. In the meantime timers will be delayed
by a large amount, the monitor will be unresponsive, etc.

> >
> > Please implement heartbeat/ping asynchronously. The wait eventfd should
> > be read by an event loop fd handler instead. That way QEMU can continue
> > with running the VM while waiting for the remote process.
>
> In the current implementation, the heartbeat/ping is asynchronous.
> start_heartbeat_timer() sets up a timer to perform the ping.
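
The timer only decides when the ping is sent. The heartbeat/ping itself
is still synchronous because broadcast_msg() blocks in wait_for_remote()
until the remote process replies.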
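
As a minimal sketch of the asynchronous approach (not code from the patch
series), the send path can return right after creating the eventfd, and the
reply can be consumed later by an fd handler. HeartbeatState and all three
function names below are hypothetical, and the poll()-based loop is only a
stand-in for QEMU's main loop, where the handler would instead be registered
with something like qemu_set_fd_handler(). It also assumes the reply value
arrives as the eventfd counter.

```c
#include <assert.h>
#include <poll.h>
#include <stdbool.h>
#include <stdint.h>
#include <sys/eventfd.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical per-device heartbeat state; not a type from the patch. */
typedef struct HeartbeatState {
    int wait_fd;         /* eventfd the remote process signals on reply */
    bool reply_pending;  /* true between sending a ping and seeing the reply */
    uint64_t last_reply; /* value read from the eventfd */
} HeartbeatState;

/* Send side: create the eventfd and return immediately, without waiting. */
static int heartbeat_ping_start(HeartbeatState *hb)
{
    assert(hb != NULL);
    hb->last_reply = 0;
    hb->wait_fd = eventfd(0, EFD_NONBLOCK);
    if (hb->wait_fd < 0) {
        hb->reply_pending = false;
        return -1;
    }
    hb->reply_pending = true;
    /* A real implementation would pass wait_fd to the remote process here. */
    return 0;
}

/* fd handler: runs only once the event loop reports wait_fd as readable. */
static void heartbeat_reply_ready(HeartbeatState *hb)
{
    uint64_t val;

    if (read(hb->wait_fd, &val, sizeof(val)) == (ssize_t)sizeof(val)) {
        hb->last_reply = val;
        hb->reply_pending = false;
        close(hb->wait_fd);
        hb->wait_fd = -1;
    }
}

/*
 * One iteration of a toy event loop. With timeout_ms == 0 it never blocks.
 * Returns true if the reply handler was dispatched.
 */
static bool event_loop_iterate(HeartbeatState *hb, int timeout_ms)
{
    struct pollfd pfd = { .fd = hb->wait_fd, .events = POLLIN };

    if (!hb->reply_pending) {
        return false;
    }
    if (poll(&pfd, 1, timeout_ms) > 0 && (pfd.revents & POLLIN)) {
        heartbeat_reply_ready(hb);
        return true;
    }
    return false;
}
```

With this shape the timer callback only calls heartbeat_ping_start(), and a
missing reply is detected by checking reply_pending when the next timer
fires, so nothing waits while holding the global mutex.
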
Stefan