Re: [Qemu-devel] [RFC PATCH] replication agent module

qemu-devel

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [RFC PATCH] replication agent module

From:	Dor Laor
Subject:	Re: [Qemu-devel] [RFC PATCH] replication agent module
Date:	Wed, 08 Feb 2012 10:49:22 +0200
User-agent:	Mozilla/5.0 (X11; Linux x86_64; rv:10.0) Gecko/20120131 Thunderbird/10.0

On 02/08/2012 08:10 AM, Ori Mamluk wrote:

On 07/02/2012 17:47, Paolo Bonzini wrote:

On 02/07/2012 03:48 PM, Ori Mamluk wrote:

The current streaming code in QEMU only deals with the former.
Streaming to a remote server would not be supported.

I need it at the same time. The Rephub reads either the full volume or
parts of, and concurrently protects new IOs.


Why can't QEMU itself stream the full volume in the background, and
send that together with any new I/O? Is it because the rephub knows
which parts are out-of-date and need recovery? In that case, as a
first approximation the rephub can pass the sector at which streaming
should start.

Yes - it's because rephub knows. The parts that need recovery may be a
series of random IOs that were lost because of a network outage
somewhere along the replication pipe.
Easy to think of it as a bitmap holding the not-yet-replicated IOs. The
rephub occasionally reads those areas to 'sync' them, so in effect the
rephub needs read access - it's not really to trigger streaming from an
offset.


But I'm also starting to wonder whether it would be simpler to use
existing replication code. DRBD is more feature-rich, and you can use
it over loopback or NBD devices (respectively raw and non-raw), and
also store the replication metadata on a file using the loopback
device. Ceph even has a userspace library and support within QEMU.

I think there are two immediate problems that drbd poses:
1. Our replication is not a simple mirror - it maintains history. I.e.
you can recover to any point in time in the last X hours (usually 24) at
a granularity of about 5 seconds.
To be able to do that and keep the replica consistent we need to be
notified for each IO.


Can you please elaborate some more in the exact details -

In theory, you can build a setup where the drbd (or nbd) copy on thedestination side write to a intermediate image and every such write istrapped locally on the destination and you may not immediately propagatethat to the disk image the VM sees.

2. drbd is 'below' all the Qemu block layers - if the protected volume
is qcow2 then drbd doesn't get the raw IOs, right?

That's one of the major caveats in drbd/iscsi/nbd - there is no supportfor block level snapshots[1]. I wonder if the scsi protocol hassomething like this so we'll get efficient replication of qcow2/lvmsnapshots that their base is already shared. If we'll gain suchfunctionality, we'll benefit of it for storage vm motion solution too.

Another issue w/ drbd is that a continuous backup solution requires todo consistent snapshot and call file system freeze and sync it w/ thecurrent block IO transfer. DRBD doesn't do that nor the other protocols.Of course DRBD can be enhanced but it will take allot more time.

A third requirement and similar to above is to group snapshots ofseveral VMs so a consistent _cross vm application view_ will be created.It demands some control over IO tagging.

To summarize, IMHO drbd (which I used successfully 6 years ago and Ilove) is not drop&replace solution to this case.I recommend we either to fit the nbd/iscsi case and improve our vmstorage motion on the way or worse case develop proprietary logic thatcan live out side of qemu using IO tapping interface, similar to theguidelines Ori outlines.


Thanks,
Dor

[1] Check the far too basic approach for snapshots:http://www.drbd.org/users-guide/s-lvm-snapshots.html

Ori

[Prev in Thread]

Current Thread

[Next in Thread]

[Qemu-devel] [RFC PATCH] replication agent module, Ori Mamluk, 2012/02/07
- Re: [Qemu-devel] [RFC PATCH] replication agent module, Anthony Liguori, 2012/02/07
  - Re: [Qemu-devel] [RFC PATCH] replication agent module, Dor Laor, 2012/02/07
    - Re: [Qemu-devel] [RFC PATCH] replication agent module, Ori Mamluk, 2012/02/07
    - Re: [Qemu-devel] [RFC PATCH] replication agent module, Anthony Liguori, 2012/02/07
    - Re: [Qemu-devel] [RFC PATCH] replication agent module, Ori Mamluk, 2012/02/07
    - Re: [Qemu-devel] [RFC PATCH] replication agent module, Paolo Bonzini, 2012/02/07
    - Re: [Qemu-devel] [RFC PATCH] replication agent module, Ori Mamluk, 2012/02/07
    - Re: [Qemu-devel] [RFC PATCH] replication agent module, Paolo Bonzini, 2012/02/07
    - Re: [Qemu-devel] [RFC PATCH] replication agent module, Ori Mamluk, 2012/02/08
    - Re: [Qemu-devel] [RFC PATCH] replication agent module, Dor Laor <=
    - Re: [Qemu-devel] [RFC PATCH] replication agent module, Stefan Hajnoczi, 2012/02/08
    - Re: [Qemu-devel] [RFC PATCH] replication agent module, Kevin Wolf, 2012/02/08
    - Re: [Qemu-devel] [RFC PATCH] replication agent module, Ori Mamluk, 2012/02/08
    - Re: [Qemu-devel] [RFC PATCH] replication agent module, Kevin Wolf, 2012/02/08
    - Re: [Qemu-devel] [RFC] Replication agent design (was [RFC PATCH] replication agent module), Ori Mamluk, 2012/02/08
    - Re: [Qemu-devel] [RFC] Replication agent design (was [RFC PATCH] replication agent module), Stefan Hajnoczi, 2012/02/08
    - Re: [Qemu-devel] [RFC] Replication agent design (was [RFC PATCH] replication agent module), Stefan Hajnoczi, 2012/02/08
    - Re: [Qemu-devel] [RFC] Replication agent design (was [RFC PATCH] replication agent module), Ori Mamluk, 2012/02/19
    - Re: [Qemu-devel] [RFC] Replication agent design (was [RFC PATCH] replication agent module), Paolo Bonzini, 2012/02/20
    - [Qemu-devel] BlockDriverState stack and BlockListeners (was: [RFC] Replication agent design), Kevin Wolf, 2012/02/21

Prev by Date: [Qemu-devel] [Bug 824650] Re: Latest GIT assert error in arp_table.c
Next by Date: Re: [Qemu-devel] [RFC PATCH] replication agent module
Previous by thread: Re: [Qemu-devel] [RFC PATCH] replication agent module
Next by thread: Re: [Qemu-devel] [RFC PATCH] replication agent module
Index(es):
- Date
- Thread