qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [RFC 0/2]: qemu-ga: drop automatic reaper


From: Michael Roth
Subject: Re: [Qemu-devel] [RFC 0/2]: qemu-ga: drop automatic reaper
Date: Fri, 20 Apr 2012 12:26:10 -0500
User-agent: Mutt/1.5.21 (2010-09-15)

On Fri, Apr 20, 2012 at 10:36:38AM -0300, Luiz Capitulino wrote:
> On Fri, 20 Apr 2012 14:07:16 +0200
> Michal Privoznik <address@hidden> wrote:
> 
> > >> But, I think if we tell users we'll *only* send response on error,
> > >> we should do our part to *not* send the responses, rather than relying
> > >> on them having implemented the reset mechanism to throw them away after
> > >> guest wake-up. What we could do is allow a command to set a flag, after
> > >> it reaps it's child (in the case of suspend this would be after
> > >> wake-up, for shutdown it'd basically be a no-op, but worth adding
> > >> for readability sake), to have qemu-ga not send a response. We'd
> > >> implement it similarly to how we did ga_set_response_delimited().
> > > 
> > > Fine with me. Stating that "do not wait for an OK response, because none
> > > will be sent" sounds clearer than "an OK response may, or may not be
> > > emitted. Or it may be emitted when the VM resumes".
> > 
> > 
> > Just to make this clear: this "report-only-error" behavior concerns only
> > guest-suspend-* and guest-shutdown commands, right? Because otherwise,
> > if we enable such behavior for all commands (e.g. fsfreeze) I think we
> > are entering the world of pain.
> 
> Exactly, this would only concern the suspend and shutdown commands.
> 
> > From user POV there is a huge difference between those 2 sets of
> > commands (suspend/shutdown on one side, the others on the other side):
> > - the first emits an event on qemu monitor, so libvirt can catch that
> > and confirm suspend/shutdown has succeeded
> 
> Oops, this is a different subject but there's a problem here. Events are just
> hints, they shouldn't be your definitive source of information.
> 
> For shutdown and suspend-disk, I think that the best indication that
> the command has succeeded is that the VM will successfully exit. We could
> also have a special exit status code for suspend-to-disk, because the
> command could run in parallel with the user powering off the VM and libvirt
> wouldn't know that (and would think the VM is suspended, while it's really
> powered-off).

Do we have a way of differentiating a guest shutting down as a result of
shutdown vs. suspend-to-disk? AFAIK they both have the same ending state. If
so I'm not sure we can say anything other than "the machine may or may not
resume when you start it up, depending on whether or not the suspend-to-disk
completed before the machine shut down".

So for both shutdown and suspend-to-disk, i think the best we can do it
treat qemu exit as an indicator of success (from the client perspective,
at least).

And if we can do that, shutdown and suspend-to-disk are pretty
straighforward: we either get an error response, or a timeout. Errors
are easy to report, timeouts: you just keep issuing the reset sequence
and attempting to re-establish a connection. If the machine is hung this
can continue indefinitely, which is fine if your reset sequence also
has a timeout and allows periodic execution of libvirt to it can do
whatever it normally does if a guest is hung.

> 
> For suspend-ram and suspend-hybrid, we're missing a 'suspended' RunState.
> The event serve as a good hint and you can use it, but if it's lost for some
> reason (eg. libvirt crashes before it's received) then libvirt can learn the
> VM state by issuing query-status.

Is it pretty trivial to plumb QEVENT_SUSPEND/QEVENT_WAKEUP to a
RunState? If so, checking query-status should be similar to checking for
qemu process exit. If you don't see a transition after a certain
threshold, you transition back into qemu-ga reset sequence loop, and
treat the machine the same way libvirt would treat any hung guest.

The one downside to query-status vs. an event is that there's a race
window where the suspend succeeds but gets woken up before we see it in
a suspended state, and event would work around this, but there may as
you said be instances where we miss it.

> 
> Now, going back to the original subject. I have to admit that I'm not sure
> what's the best way to go here.
> 
> I'll try to recapitulate (for myself and for those that may be confused) I'll 
> be
> verbose a bit.
> 
> We have two qemu-ga commands that are special: guest-shutdown and the 
> guest-suspend
> family. They are special because they shut down the VM or suspend its 
> execution
> (meaning that the world of qemu-ga is gone or gets completely frozen).
> 
> Today, shutdown is an asynchronous operation: qemu-ga gets the shutdown 
> process
> started and returns to accept new commands. For qemu-ga clients, this implies:
> 
>  1. errors in the shutdown operation are not reported back
>  2. qemu-ga doesn't block
> 
> For qemu-ga this implies having a way to automatically reap terminated 
> children
> processes.
> 
> The guest-suspend commands do the same when executing the suspend operation,
> but before they do that they need to query the VM for suspend support and
> this is done by executing pm-is-supported (if available). This fact shouldn't
> be visible to qemu-ga clients, but it has two internal implications:
> 
>  1. The operation is half synchronous and half asynchronous
>  2. In order to bypass the automatic process reaper in qemu-ga when executing
>     pm-is-supported, we have to play tricks that makes the suspend code
>     more complex than it should be
> 
> We have two options:
> 
>  1. Keep the current behavior (explained above, shutdown is async, suspend
>     is half sync half async). For libvirt this means nothing changes, for
>     qemu-ga this means more complex code
> 
>  2. Change everything to be synchronous (this series). This essentially means:
> 
>      A. errors are going to reported back
>      B. qemu-ga will block
>      C. we avoid all the dirty tricks, and qemu-ga code becomes simpler
>      D. In theory, this should be a compatible change due to the end of world
>         nature of the commands involved
> 
> A third possible option would be to have asynchronous support. But I'm not
> sure whether this would fit well and how complex this would be (specially
> because of fork()).
> 
> 



reply via email to

[Prev in Thread] Current Thread [Next in Thread]