[Qemu-devel] qemu-ga: conclusions on shutdown & suspend behavior
From: Luiz Capitulino
Date: Tue, 24 Apr 2012 15:19:41 -0300
Hi there,
At the risk of being repetitive, I'm going to summarize the problems found in
qemu-ga's shutdown and suspend commands, and the solutions we've discussed
over the last few days.
Gleb and Igor, you may be interested in items 2 and 4.
Basically, we have four issues:
1. The guest-shutdown and guest-suspend-* commands are unable to detect errors
while performing their operation. That is, qemu-ga will report success to
clients even if an error happens while shutting down or suspending.
This happens because the operation is executed in a child process and
qemu-ga doesn't wait() for its child processes, to avoid blocking.
Possible solutions:
A. Don't fix this and preserve qemu-ga's non-blocking behavior
B. Change qemu-ga to wait() for its children and report errors. This
implies that the commands become blocking calls
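To make the trade-off in solution B concrete, here is a minimal sketch of the
fork-then-wait pattern it implies. This is not qemu-ga code: run_and_wait() is
a hypothetical helper, and the real commands would need their own error
reporting on top of it.

```c
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper, not part of qemu-ga: run argv[0] in a child process
 * and block until it exits. Returns the child's exit status (0..255), or -1
 * if the child could not be created, could not be waited for, or did not
 * exit normally (for example, it was killed by a signal).
 */
static int run_and_wait(char *const argv[])
{
    pid_t pid;
    int status;

    assert(argv != NULL && argv[0] != NULL);

    pid = fork();
    if (pid < 0) {
        return -1;                 /* fork() failed */
    }
    if (pid == 0) {
        execvp(argv[0], argv);     /* only returns on error */
        _exit(127);                /* conventional "could not exec" status */
    }

    /* Parent: this is the blocking part that solution A avoids. */
    while (waitpid(pid, &status, 0) < 0) {
        if (errno != EINTR) {
            return -1;
        }
    }

    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

With something like this, a non-zero return from the helper is what would let
the command report an error to the client instead of unconditionally
reporting success.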
2. The guest-shutdown and guest-suspend-* commands may not emit a success
response. Actually, the guest-suspend-* commands may emit a response
after the guest resumes.
This happens because the guest may shut down or suspend before qemu-ga is
able to emit a success response.
Solution: Change qemu-ga to never emit a success response. Clients should
do the following to check for success:
o guest-suspend-disk: if the guest suspends through ACPI, expect an exit
status of 3 (an arbitrarily chosen number). Otherwise, an exit status of 0
o guest-suspend-ram or hybrid: wait for the SUSPEND event and/or
poll for a RunState change to suspended (the RunState change doesn't
exist upstream yet; I will submit a patch)
o guest-shutdown: an exit status of 0
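The client-side rules above can be sketched as a small lookup. This is only an
illustration: all names are hypothetical, and I'm assuming the exit status
being checked is the one the client observes when the VM's process goes away.

```c
#include <assert.h>

/* Exit status proposed above for an ACPI suspend-to-disk (arbitrary). */
#define EXIT_STATUS_S4 3

/* Hypothetical names for illustration; none of these exist in qemu-ga. */
enum ga_command {
    GA_SHUTDOWN,
    GA_SUSPEND_DISK,
    GA_SUSPEND_RAM,
    GA_SUSPEND_HYBRID
};

enum ga_outcome {
    GA_OK,            /* shut down, or suspended to disk without ACPI */
    GA_OK_ACPI_DISK,  /* suspended to disk through ACPI */
    GA_CHECK_EVENT,   /* no exit status to check: watch SUSPEND/RunState */
    GA_FAILED         /* unexpected exit status */
};

static enum ga_outcome classify_exit(enum ga_command cmd, int exit_status)
{
    assert(exit_status >= 0 && exit_status <= 255);

    switch (cmd) {
    case GA_SHUTDOWN:
        return exit_status == 0 ? GA_OK : GA_FAILED;
    case GA_SUSPEND_DISK:
        if (exit_status == EXIT_STATUS_S4) {
            return GA_OK_ACPI_DISK;
        }
        return exit_status == 0 ? GA_OK : GA_FAILED;
    case GA_SUSPEND_RAM:
    case GA_SUSPEND_HYBRID:
        /* The guest keeps running; success is signalled by an event. */
        return GA_CHECK_EVENT;
    }
    return GA_FAILED;
}
```

Note that an exit status of 0 after guest-suspend-disk can't be told apart
from a plain shutdown, which is why only the ACPI case gets its own value.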
3. There's a possible race in suspend code while trying to detect suspend
support in the guest.
This happens because the suspend code got complex while trying to
preserve qemu-ga's non-blocking behavior described in item 1.
Possible solutions:
A. Just fix the race (which makes the code more complex)
B. Do solution 1.B. (which also simplifies the code considerably)
4. Libvirt is facing a problem when hot plugging a device and then user-space
suspends to disk: if libvirt is not told to make the new device persistent,
then it will be unable to correctly resume the VM later, since its
command-line won't have the newly added device.
This happens because libvirt doesn't know the VM suspended to disk.
Solution: Implement the solution for item 2 above (i.e. exit with a
different exit status, e.g. 3). There isn't much to be done
if the guest doesn't suspend through ACPI.
PS: This problem is out of qemu-ga's realm, but it's interesting to
find a "unified" solution.