[Qemu-devel] qemu-ga: conclusions on shutdown & suspend behavior


From: Luiz Capitulino
Subject: [Qemu-devel] qemu-ga: conclusions on shutdown & suspend behavior
Date: Tue, 24 Apr 2012 15:19:41 -0300

Hi there,

At the risk of being repetitive, I'm going to summarize the problems found in
qemu-ga's shutdown and suspend commands, along with the solutions we've
discussed over the last few days.

Gleb and Igor, you may be interested in items 2 and 4.

Basically, we have four issues:

 1. The guest-shutdown and guest-suspend-* commands are unable to detect errors
    while performing their operation. That is, qemu-ga will report success to
    clients even if an error happens while shutting down or suspending.

    This happens because the operation is executed in a child process and,
    to avoid blocking, qemu-ga doesn't wait() for its child processes.

    Possible solutions:

        A. Don't fix this and preserve qemu-ga's non-blocking behavior
        B. Change qemu-ga to wait() for its children and report errors. This
           makes the command a blocking call
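
    Solution B could look roughly like the sketch below. It is written in
    Python for brevity and is not qemu-ga code: the function name
    run_and_report and the shape of the reply dictionaries are made up for
    illustration. The only point is that the parent waits for the child and
    turns a failure into an error reply instead of reporting success
    unconditionally.

```python
import subprocess
import sys


def run_and_report(argv):
    """Run argv in a child process, wait for it, and build a client reply.

    Hypothetical helper: the reply shapes below are illustrative only.
    """
    try:
        # subprocess.run() blocks until the child exits (fork + exec + wait).
        result = subprocess.run(
            argv, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
        )
    except OSError as exc:
        # The child could not be started at all (e.g. command not found).
        return {"error": {"desc": f"failed to execute {argv[0]}: {exc.strerror}"}}

    if result.returncode < 0:
        # A negative return code means the child was killed by a signal.
        return {"error": {"desc": f"{argv[0]} killed by signal {-result.returncode}"}}
    if result.returncode != 0:
        return {"error": {"desc": f"{argv[0]} exited with status {result.returncode}"}}
    return {"return": {}}


if __name__ == "__main__":
    # Stand-in for a shutdown/suspend helper that fails.
    print(run_and_report([sys.executable, "-c", "import sys; sys.exit(7)"]))
```

    The cost is the one named in B: the call does not return until the child
    has exited.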

 2. The guest-shutdown and guest-suspend-* commands may not emit a success
    response. Actually, the guest-suspend-* commands may emit a response
    after the guest resumes.

    This happens because the guest may shut down or suspend before qemu-ga
    is able to emit a success response.

    Solution: Change qemu-ga to never emit a success response. Clients should
    check for success as follows:

       o guest-suspend-disk: if the guest suspends through ACPI, expect an
         exit status of 3 (an arbitrarily chosen number); otherwise, expect
         an exit status of 0
       o guest-suspend-ram or hybrid: wait for the SUSPEND event and/or
         poll for a RunState change to suspended (the RunState change doesn't
         exist upstream yet; I will submit a patch)
       o guest-shutdown: expect an exit status of 0
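
    On the client side, the checks for item 2 could be sketched as below.
    This is Python for brevity and makes several assumptions: the function
    names are made up, exit_status is taken to be the exit status of the
    QEMU process (implied above but not spelled out), events is whatever
    list of event names the client has collected so far, and "hybrid" is
    assumed to mean the guest-suspend-hybrid command. The value 3 is the
    arbitrary number proposed above, not an existing interface.

```python
# Arbitrary exit status proposed in this message for "suspended to disk
# through ACPI"; not an existing QEMU interface.
ACPI_SUSPEND_DISK_STATUS = 3


def operation_succeeded(command, exit_status=None, events=()):
    """Decide whether a shutdown/suspend command succeeded (hypothetical)."""
    if command == "guest-shutdown":
        return exit_status == 0
    if command == "guest-suspend-disk":
        # 3 if the guest suspended through ACPI, 0 otherwise.
        return exit_status in (0, ACPI_SUSPEND_DISK_STATUS)
    if command in ("guest-suspend-ram", "guest-suspend-hybrid"):
        # The guest keeps running under QEMU, so look for the event instead.
        return "SUSPEND" in events
    raise ValueError(f"unexpected command: {command}")


def suspended_to_disk_via_acpi(exit_status):
    """True if the exit status says the guest suspended to disk via ACPI."""
    return exit_status == ACPI_SUSPEND_DISK_STATUS


if __name__ == "__main__":
    print(operation_succeeded("guest-suspend-disk", exit_status=3))
    print(operation_succeeded("guest-suspend-ram", events=["SUSPEND"]))
```

    The second helper is the part a management layer could use to tell a
    suspend to disk apart from a plain shutdown.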

 3. There's a possible race in the suspend code while trying to detect
    suspend support in the guest.

    This happens because the suspend code got complex while trying to
    preserve qemu-ga's non-blocking behavior described in item 1.

    Possible solutions:

        A. Just fix the race (which makes the code more complex)
        B. Do solution 1.B. (which also simplifies the code considerably)

 4. Libvirt faces a problem when a device is hot-plugged and user space then
    suspends to disk: if libvirt is not told to make the new device
    persistent, it will be unable to correctly resume the VM later, since
    its command line won't include the newly added device.

    This happens because libvirt doesn't know the VM suspended to disk.

    Solution: Implement the solution for item 2 above (i.e. exit with a
              different exit status, e.g. 3). There isn't much to be done
              if the guest doesn't suspend through ACPI.

    PS: This problem is outside qemu-ga's realm, but it would be interesting
        to find a "unified" solution.


