From: Avi Kivity
Subject: Re: [Qemu-devel] Re: [patch 2/3] Add support for live block copy
Date: Wed, 02 Mar 2011 15:00:07 +0200
User-agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.2.13) Gecko/20101209 Fedora/3.1.7-0.35.b3pre.fc14 Lightning/1.0b3pre Thunderbird/3.1.7

On 03/02/2011 02:39 PM, Anthony Liguori wrote:
Here is where your race is:

2. Management sends a switch command
3. QEMU receives switch command
4. QEMU stops doubling IO and switches to the destination
5. QEMU sends acknowledgement of switch command
6. Management receives acknowledge of switch command
7. Management changes internal state definition to reflect the new destination

If QEMU or the management tool crashes after step 4 and before step 6, when the management tool restarts QEMU with the source image, data loss will have occurred (and potentially corruption if a flush had happened).

No. After step 2, any qemu restart will be with the destination image. If the management tool restarts, it can query the state (or just re-issue the switch command, which is idempotent).
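
To illustrate the point about idempotence, here is a minimal toy sketch in Python. It is not QEMU code: ToyQemu, ToyManagement, restart_image and the image names are all hypothetical, and it assumes the management tool records the destination as the restart image before it sends the switch command, which is one way to read "after step 2, any qemu restart will be with the destination image".

```python
class ToyQemu:
    """Hypothetical stand-in for the QEMU side of the switch protocol."""

    def __init__(self, image):
        self.active_image = image
        self.switch_count = 0  # number of switches that actually changed state

    def switch(self, destination):
        # Idempotent: a repeated switch to the same destination is a no-op
        # that still returns an acknowledgement.
        if self.active_image != destination:
            self.active_image = destination
            self.switch_count += 1
        return "ack"

    def query(self):
        return self.active_image


class ToyManagement:
    """Hypothetical management tool that may lose the acknowledgement."""

    def __init__(self, source):
        self.restart_image = source  # image used if QEMU has to be restarted

    def send_switch(self, qemu, destination, lose_ack=False):
        # Assumption: the restart image is updated before the command is sent.
        self.restart_image = destination
        ack = qemu.switch(destination)
        return None if lose_ack else ack

    def recover(self, qemu):
        # After a management restart, simply re-issue the switch.
        return qemu.switch(self.restart_image)


qemu = ToyQemu("source.img")
mgmt = ToyManagement("source.img")
first = mgmt.send_switch(qemu, "dest.img", lose_ack=True)  # ack is lost
second = mgmt.recover(qemu)                                # retry is harmless
```

In this sketch the lost acknowledgement costs nothing: the retry returns an ack, QEMU has switched exactly once, and a restart at any point after the command was sent would use the destination image.
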
This all boils down to the Two Generals Problem[1]. It's simply not fixable without making one end reliable and that means that someone needs to fsync() something *after* the switchover happens but before the first write happens. That can be QEMU (Avi's RAID proposal and my state file proposal) or it can be the management tool (if we introduce synchronous events).
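
For reference, the "fsync() something after the switchover but before the first write" idea could look roughly like the following Python sketch. The state file name, its JSON layout and both function names are assumptions made for illustration only, not the format of any actual proposal in this thread; the directory fsync assumes a POSIX system.

```python
import json
import os
import tempfile


def record_switch(state_path, active_image):
    """Durably record the active image before any write goes to it."""
    directory = os.path.dirname(os.path.abspath(state_path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "w") as f:
            json.dump({"active_image": active_image}, f)
            f.flush()
            os.fsync(f.fileno())        # file contents reach stable storage
        os.replace(tmp_path, state_path)  # atomic rename over the old state
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
    dir_fd = os.open(directory, os.O_RDONLY)
    try:
        os.fsync(dir_fd)                # make the rename itself durable
    finally:
        os.close(dir_fd)


def image_to_open(state_path, default_image):
    """On restart, prefer the recorded image over the original source."""
    try:
        with open(state_path) as f:
            return json.load(f)["active_image"]
    except FileNotFoundError:
        return default_image
```

Whichever side calls record_switch() becomes the reliable end: once it returns, a restart reads the destination from the state file regardless of whether the acknowledgement was delivered.
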
The two problems are not equivalent. Once the management tool receives acknowledgement that the switch occurred, the protocol terminates.
-- error compiling committee.c: too many arguments to function