qemu-devel

Re: [Qemu-devel] [RFC] Consistent Snapshots Idea


From: shu ming
Subject: Re: [Qemu-devel] [RFC] Consistent Snapshots Idea
Date: Mon, 21 Nov 2011 22:27:46 +0800
User-agent: Mozilla/5.0 (Windows NT 5.1; rv:8.0) Gecko/20111105 Thunderbird/8.0

On 2011-11-21 20:31, Avi Kivity wrote:
On 11/21/2011 02:01 PM, Richard Laager wrote:
I'm not an expert on the architecture of KVM, so perhaps this is a QEMU
question. If so, please let me know and I'll ask on a different list.
It is a qemu question, yes (though fork()ing a guest also relates to kvm).

Background:

Assuming the block layer can make instantaneous snapshots of a guest's
disk (e.g. lvcreate -s), one can get "crash consistent" (i.e. as if the
guest crashed) snapshots. To get a "fully consistent" snapshot, you need
to shut down the guest. For production VMs, this is obviously not ideal.

Idea:

What if KVM/QEMU were to fork() the guest and shut down one copy?

KVM/QEMU would momentarily halt the execution of the guest and take a
writable, instantaneous snapshot of each block device. Then it would
fork(). The parent would resume execution as normal. The child would
redirect disk writes to the snapshot(s). The RAM should have
copy-on-write behavior as with any other fork()ed process. Other
resources like the network, display, sound, serial, etc. would simply be
disconnected/bit-bucketed. Finally, the child would resume guest
execution and send the guest an ACPI power button press event. This
would cause the guest OS to perform an orderly shutdown.

I believe this would provide consistent snapshots in the vast majority
of real-world scenarios, in a way that is independent of the guest OS
and its applications.
Interesting idea.  Will the guest actually shut down nicely without a
network?  Things like NFS mounts will break.

Do the child and parent processes run in parallel? What happens if the parent process tries to access the block device? It looks like the child process will write to a snapshot file, but where will the parent process write?
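
To make the question concrete, here is how I read the proposal, as a toy sketch in Python rather than QEMU code. The assumptions are mine: plain files stand in for the block devices, shutil.copyfile() stands in for an instantaneous snapshot such as lvcreate -s, and fork_with_snapshot() and its parameters are hypothetical names. Under this reading the parent keeps writing to the original device while the child writes only to the snapshot.

```python
import os
import shutil


def fork_with_snapshot(disk_path, snap_path, child_data, parent_data):
    """Toy model of the fork idea; files stand in for block devices."""
    # Stand-in for the instantaneous snapshot (e.g. lvcreate -s),
    # taken while the "guest" is paused, i.e. before the fork.
    shutil.copyfile(disk_path, snap_path)

    pid = os.fork()
    if pid == 0:
        # Child: every further write is redirected to the snapshot.
        code = 1
        try:
            with open(snap_path, "ab") as snap:
                snap.write(child_data)
            code = 0
        finally:
            os._exit(code)  # never fall back into the parent's code path

    # Parent: resumes as normal and keeps writing to the original device.
    with open(disk_path, "ab") as disk:
        disk.write(parent_data)

    # Reap the child and report its exit status.
    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status)
```

If that reading is right, the two processes never touch the same image after the fork, so they can run in parallel without seeing each other's writes.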


Implementation Nits:

       * A timeout on the child process would likely be a good idea.
       * It'd probably be best to disconnect the network (i.e. tell the
         guest the cable is unplugged) to avoid long timeouts. Likewise
         for the hardware flow-control lines on the serial port.
This is actually critical, otherwise the guest will shutdown(2) all
sockets and confuse the clients.
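
The timeout nit above could be sketched roughly as follows. This is a minimal illustration in Python, assuming the forked copy is an ordinary child process identified by its pid; wait_with_timeout() and its parameters are hypothetical names, not anything QEMU provides.

```python
import os
import signal
import time


def wait_with_timeout(pid, timeout_s, poll_s=0.05):
    """Wait for child `pid`; kill it if it outlives `timeout_s` seconds.

    Returns True if the child exited on its own, False if it was killed.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        # WNOHANG makes waitpid return (0, 0) while the child is still running.
        done, _ = os.waitpid(pid, os.WNOHANG)
        if done == pid:
            return True
        if time.monotonic() >= deadline:
            break
        time.sleep(poll_s)

    # The shutdown did not finish in time (e.g. a hung NFS unmount).
    os.kill(pid, signal.SIGKILL)
    os.waitpid(pid, 0)  # reap the killed child so it does not linger as a zombie
    return False
```

A False return would mean the snapshot is only crash consistent, so the caller should probably record that rather than treat it as a clean shutdown.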

       * For correctness, fdatasync()ing or similar might be necessary
         after halting execution and before creating the snapshots.
Microsoft guests have an API to quiesce storage prior to a snapshot, and
I think there is work to bring this to Linux guests.  So it should be
possible to get consistent snapshots even without this, but it takes
more integration.
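
For the host-side flush in the last nit (not the guest-side quiesce API), a rough sketch in Python might look like the following. The assumptions are mine: the disk image is a regular file already open as file descriptor fd, shutil.copyfile() stands in for the real snapshot mechanism, and flush_then_snapshot() is a hypothetical name. os.fdatasync() is the Python wrapper for fdatasync(2) and is available on Linux.

```python
import os
import shutil


def flush_then_snapshot(fd, disk_path, snap_path):
    """Flush pending data for `fd` to storage, then snapshot the image."""
    # Push written data (and the metadata needed to read it back)
    # out of the host page cache before the snapshot is taken.
    os.fdatasync(fd)
    # Stand-in for the real snapshot step (e.g. lvcreate -s).
    shutil.copyfile(disk_path, snap_path)
```

With a plain file copy the page cache would make the data visible even without the sync; the flush matters when the snapshot is taken at the block layer, underneath the filesystem that holds the image.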





