Re: [Qemu-devel] Re: [patch 2/3] Add support for live block copy

qemu-devel

From:	Avi Kivity
Subject:	Re: [Qemu-devel] Re: [patch 2/3] Add support for live block copy
Date:	Wed, 02 Mar 2011 18:30:36 +0200
User-agent:	Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.2.13) Gecko/20101209 Fedora/3.1.7-0.35.b3pre.fc14 Lightning/1.0b3pre Thunderbird/3.1.7

On 03/01/2011 05:51 PM, Anthony Liguori wrote:

Do a hot unplug of a network device with upstream libvirt with acpiphp unloaded, consult libvirt and then consult the monitor to see who has the right view of the guest's config.
libvirt is right and the monitor is wrong.
On real hardware, calling _EJ0 doesn't affect the configuration one little bit (if I understand it correctly). It just turns off power to the slot. If you power-cycle, the card will be there.
It's up to the hardware vendor. Since it's ACPI, it can result in any number of operations. Usually, there's some logic to flip on an LED or something.
There's nothing that prevents a vendor from ejecting the card. My point is that there aren't cleanly separated lines in the real world.

We can implement our virtual hardware like real hardware, or we can do some new stuff, and break our management model in the process.

Unless I'm hallucinating, you're suggesting quite a bit more. A revolution in how qemu is to be managed.
Let me take another route to see if I can't persuade you.
First, let's clarify your proposal. You want to introduce a new block format
that references two block devices. It may also store a dirty bitmap to keep
track of which blocks are out of sync.  Hopefully, it goes without saying
that the dirty bitmap is strictly optional (it's a performance optimization) so
let's ignore it.


(as was related elsewhere, the state is also optional)


Your format, as a text file, looks like:

[raid1]
primary=diska.img
secondary=diskb.img
active=primary

To use it, here's the sequence:

0) qemu uses disk A for a block device

1) create a raid1 block device pointing to disk A and disk B.

2) management tool asks qemu to use the new raid1 block device.

3) qemu acks (2)

4) at some point, the mirror completes, writes are going to both disks

5) qemu sends out an event indicating that the disks are in sync

6) management tool then sends a command to fail over to disk B

7) qemu acks (6)
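
The sequence above can be sketched as a small state machine seen from the management tool's side. This is only an illustration: the state names, message names, and the `UNACKED` set are hypothetical labels, not part of qemu, QMP, or libvirt. It makes visible the two windows in which a command has been sent but not yet acknowledged.

```python
from enum import Enum, auto


class Step(Enum):
    """Management tool's view of the switch-over (names are illustrative)."""
    ON_DISK_A = auto()           # step 0
    MIRROR_REQUESTED = auto()    # step 2 sent, step 3 (ack) not yet seen
    MIRRORING = auto()           # step 3 seen, mirror still catching up
    IN_SYNC = auto()             # step 5 event seen
    FAILOVER_REQUESTED = auto()  # step 6 sent, step 7 (ack) not yet seen
    ON_DISK_B = auto()           # step 7 seen


# Allowed transitions, keyed by (current step, message).
TRANSITIONS = {
    (Step.ON_DISK_A, "send_use_raid1"): Step.MIRROR_REQUESTED,
    (Step.MIRROR_REQUESTED, "ack"): Step.MIRRORING,
    (Step.MIRRORING, "event_in_sync"): Step.IN_SYNC,
    (Step.IN_SYNC, "send_failover"): Step.FAILOVER_REQUESTED,
    (Step.FAILOVER_REQUESTED, "ack"): Step.ON_DISK_B,
}

# Steps with a command in flight: after a crash here, the tool cannot
# tell from its own records whether qemu had already applied the command.
UNACKED = {Step.MIRROR_REQUESTED, Step.FAILOVER_REQUESTED}


def advance(step, message):
    """Return the next step, or raise if the message is out of order."""
    try:
        return TRANSITIONS[(step, message)]
    except KeyError:
        raise ValueError(f"{message!r} is not valid in {step.name}") from None


def run(messages, start=Step.ON_DISK_A):
    """Feed a list of messages through the state machine."""
    step = start
    for message in messages:
        step = advance(step, message)
    return step
```

Calling `run(["send_use_raid1"])` leaves the tool in a step that belongs to `UNACKED`, which is one of the two crash windows.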

We're making the management tool the "authoritative" source of how to launch
QEMU. That means that the management tool ultimately determines which command
line to relaunch QEMU with.

Here are the races:

A) If QEMU crashes between (2) and (3), it may have issued a write to the
   new raid1 block device before the management tool sees (3).  If this
   happens, when the management tool restarts QEMU with disk A, we're left
   with a dangling raid1 block device.  Not a critical failure, but not ideal.


You can restart qemu with the RAID1 blockdev.

B) If QEMU crashes between (6) and (7), QEMU may have started writing to
   disk B before the management tool sees (7).  This means that the
   management tool will create the guest with the raid1 block device, which
   is no longer the correct disk.  This could fail in subtly bad ways.
   Depending on how read is implemented (if you try to do striping for
   instance), bad data could be returned.  You could try to implement a
   policy of always reading from B if the block has been copied, but this
   gets hairy really quickly.  It's definitely not RAID1 anymore.


As related elsewhere, you restart qemu with image B.

The trick is to partition the problem into idempotent commands; these allow you to recover from any failure.
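
A minimal sketch of what "idempotent commands" could mean here, under assumptions of my own: the configuration is modelled as a plain dictionary, and `ensure_mirror`, `ensure_active`, `image_to_launch`, and `replay` are hypothetical helpers, not qemu or libvirt interfaces. Each command states a desired end state instead of a change, so re-issuing it after a crash, without knowing whether the first attempt landed, gives the same result.

```python
def ensure_mirror(config, primary, secondary):
    """Idempotent: make the device a mirror of (primary, secondary).

    An already selected active leg is left alone, so replaying this
    command after a failover does not undo the failover.
    """
    new = dict(config)
    new["primary"] = primary
    new["secondary"] = secondary
    new.setdefault("active", "primary")
    return new


def ensure_active(config, which):
    """Idempotent: select which leg is the authoritative image."""
    if which not in ("primary", "secondary"):
        raise ValueError(f"unknown leg: {which!r}")
    new = dict(config)
    new["active"] = which
    return new


def image_to_launch(config):
    """Image a restarted qemu would be pointed at."""
    return config[config["active"]]


def replay(config, journal):
    """Re-apply every journalled command in order."""
    for command, args in journal:
        config = command(config, *args)
    return config


journal = [
    (ensure_mirror, ("diska.img", "diskb.img")),
    (ensure_active, ("secondary",)),
]
once = replay({}, journal)
twice = replay(once, journal)  # e.g. re-issued after a crash
```

Here `once == twice`, and `image_to_launch(once)` is `diskb.img`, matching "you restart qemu with image B".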

You may observe that the problem is not the RAID1 mechanism, but switching
between a normal device and the RAID1 mechanism. It would then be wise to say,
let's always use this image format. Since that eliminates the race, we don't
really need the copy bitmap anymore.
Now we're left with a simple format that just refers to two filenames. However,
a block device is more than just a filename.  It needs a format, cache
settings, etc. So let's put this all in the RAID1 block format. We also need
a way to indicate which block device is selected.
Let's make it a text file for purposes of discussion. It will look something
like:

[primary]
filename=diska.img
cache=none
format=raw

[secondary]
filename=diskb.img
cache=writethrough
format=qcow2

[global]
active=primary

Since we might want to mirror multiple drives at once, we should probably
support having multiple drives configured, which means we need not just
a single active entry, but an entry associated with a particular device.

Or you have one file per RAID-1 image set. This is important because images are not associated with a qemu instance. You can hot-unplug an image from one qemu and hot-plug it into another.


[drive "diskA"]
filename=diska.img
cache=none
format=raw

[drive "diskB"]
filename=diskb.img
cache=writethrough
format=qcow2

[device "vda"]
drive=diskB
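
To make the layout above concrete, here is a sketch that resolves a device to its drive properties. It uses Python's `configparser` purely as a stand-in reader for this discussion format; `resolve_device` and `section_id` are hypothetical helpers, and nothing here is qemu's own parser.

```python
import configparser

CONFIG_TEXT = """
[drive "diskA"]
filename=diska.img
cache=none
format=raw

[drive "diskB"]
filename=diskb.img
cache=writethrough
format=qcow2

[device "vda"]
drive=diskB
"""


def section_id(kind, name):
    """Build a section header such as: drive "diskA"."""
    return f'{kind} "{name}"'


def resolve_device(text, device):
    """Follow device -> drive and return that drive's properties."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    drive_name = parser[section_id("device", device)]["drive"]
    return dict(parser[section_id("drive", drive_name)])


vda = resolve_device(CONFIG_TEXT, "vda")
```

With the text above, `vda["filename"]` is `diskb.img`; switching the active image for "vda" amounts to changing the single `drive=` line.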

And this is exactly what I'm proposing.


It's exactly what I'm opposing.  Making qemu manage all this stuff.

It's really the natural generalization
of what you're proposing.

So basically, the only differences are:

 1) always use the new RAID1 format
 2) drop the progress bitmap
 3) support multiple devices per file
 4) let drive properties be specified beyond filename

All reasonable things to do.

Well, I dislike 3, and the whole "qemu is authoritative source of configuration" thing.


--
error compiling committee.c: too many arguments to function
