qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

[Qemu-devel] Re: [PATCH 04/22] savevm: do_loadvm(): Always resume the VM


From: Kevin Wolf
Subject: [Qemu-devel] Re: [PATCH 04/22] savevm: do_loadvm(): Always resume the VM
Date: Wed, 21 Apr 2010 17:27:04 +0200
User-agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.1.9) Gecko/20100330 Fedora/3.0.4-1.fc12 Thunderbird/3.0.4

Am 21.04.2010 17:08, schrieb Luiz Capitulino:
> On Wed, 21 Apr 2010 15:28:16 +0200
> Kevin Wolf <address@hidden> wrote:
> 
>> Am 20.04.2010 23:09, schrieb Luiz Capitulino:
>>> do_loadvm(), which implements the 'loadvm' Monitor command, pauses
>>> the emulation to load the saved VM, however it will only resume
>>> it if the loading succeeds.
>>>
>>> In other words, if the user issues 'loadvm' and it fails, the
>>> end result will be an error message and a paused VM.
>>>
>>> This seems an undesirable side effect to me because, most of the
>>> time, if a Monitor command fails the best thing we can do is to
>>> leave the VM as it were before the command was executed.
>>
>> I completely agree with Juan here, this is broken.
> 
>  Yeah, it's an RFC. I decided to send it as is because I needed feedback as
> this series is a snowball.

Which is the right thing to do, so we can find such problems. :-)

>> If you could leave the VM as it was before, just like you describe
>> above, everything would be okay. But in fact you can't tell where the
>> error occurred. You may still have the old state; or you may have loaded
>> the snapshot on one disk, but not on the other one; or you may have
>> loaded snapshots for all disks, but only half of the devices.
>>
>> We must not run a machine in such an unknown state. I'd even go further
>> and say that after a failed loadvm we must even prevent that a user uses
>> the cont command to resume manually.
> 
>  Maybe 'info status' should have a specific status for this too.
> 
>  (Assuming we don't want to radically call exit(1)).

A different status that disallows cont sounds good. But exit() would
work for me as well and is probably easier. However, I'm not sure if it
would work well for management.

>>> FIXME: This will try to run a potentially corrupted image, the
>>>        solution is to split load_vmstate() in two and only keep
>>>        the VM paused if qemu_loadvm_state() fails.
>>
>> Your suggestion of having a prepare function that doesn't change any
>> state looks fine to me. It could just check if the snapshot is there and
>> contains VM state. This should cover all of the trivial cases where
>> recovery is really as easy as resuming the old state.
> 
>  That's exactly what I want to do.
> 
>> Another problem that I see is that it's really hard to define what an
>> error is. Current code print "Warning: ..." for all non-primary images
>> and continues with loading the snapshot. I'm really unsure what the
>> right behaviour would be if a snapshot doesn't exist on a secondary
>> image (note that this is not the CD-ROM case, these drives are already
>> excluded by bdrv_has_snapshot as they are read-only).
> 
>  Defining the right behavior and deciding what to expose have been
> proven very difficult tasks in the QMP world.

There's nothing QMP specific about it. The problem has always been
there, QMP just means that you involuntarily get to review all of it.

Kevin




reply via email to

[Prev in Thread] Current Thread [Next in Thread]