From: Eric Blake
Subject: Re: [Qemu-devel] Block job commands in QEMU 1.2 [v2, including support for replication]
Date: Thu, 24 May 2012 10:57:05 -0600
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:12.0) Gecko/20120430 Thunderbird/12.0.1

On 05/24/2012 07:41 AM, Paolo Bonzini wrote:
> changes from v1:
> - added per-job iostatus
> - added description of persistent dirty bitmap
> The same content is also at
> http://wiki.qemu.org/Features/LiveBlockMigration/1.2

> * query-block-jobs: BlockJobInfo gets two new fields, paused and
> io-status.  The job-specific iostatus is completely separate from the
> block device iostatus.

For mirror jobs, is whether we are mirroring still determined by
whether 'len'=='offset'?
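
A minimal sketch of the check being asked about, assuming each entry returned by query-block-jobs has been decoded into a dict with 'type', 'len' and 'offset' keys. The helper name and the "mirror" type string are assumptions made for illustration, not something this thread confirms.

```python
def in_mirroring_phase(job):
    """Return True if a mirror job has caught up with its source.

    `job` is one decoded BlockJobInfo entry. The steady state is inferred
    purely from len == offset, which is the behaviour in question.
    """
    return job.get("type") == "mirror" and job["len"] == job["offset"]


# Hypothetical sample data; the device names are made up.
jobs = [
    {"type": "mirror", "device": "virtio0", "len": 1024, "offset": 1024},
    {"type": "mirror", "device": "virtio1", "len": 1024, "offset": 512},
]
steady = [j["device"] for j in jobs if in_mirroring_phase(j)]
print(steady)
```

Note that the result is only a snapshot: nothing here prevents the job from leaving the mirroring phase right after the query returns.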

> * drive-mirror: activates mirroring to a second block device (optionally
> creating the image on that second block device).  Compared to the
> earlier versions, the "full" argument is replaced by an enum option
> "sync" with three values:
> - top: copies data in the topmost image to the destination
> - full: copies data from all images to the destination
> - dirty: copies clusters that are marked in the dirty bitmap to the
> destination (see below)

That is different from the earlier interface, but at least RHEL used the
name __com.redhat_drive-mirror, so libvirt can cope with the difference.
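
One way a management client could cope with the two names is sketched below. It assumes the set of supported command names has already been collected (for example from query-commands). The argument names 'device' and 'target', and the mapping of the new "sync" values onto the older boolean "full", are assumptions for illustration only.

```python
UPSTREAM = "drive-mirror"
RHEL = "__com.redhat_drive-mirror"


def pick_mirror_command(available):
    """Return the mirror command name to use, preferring the upstream one."""
    for name in (UPSTREAM, RHEL):
        if name in available:
            return name
    return None


def build_mirror_request(available, device, target, sync):
    """Build a QMP request dict for whichever mirror command exists."""
    if sync not in ("top", "full", "dirty"):
        raise ValueError("unknown sync mode: %r" % (sync,))
    name = pick_mirror_command(available)
    if name is None:
        raise RuntimeError("no mirror command available")
    args = {"device": device, "target": target}
    if name == UPSTREAM:
        args["sync"] = sync
    else:
        # Assumption: the older interface only knows the boolean "full".
        if sync == "dirty":
            raise ValueError("older interface has no dirty mode")
        args["full"] = (sync == "full")
    return {"execute": name, "arguments": args}


request = build_mirror_request({RHEL}, "virtio0", "/mnt/dest/diskname.img", "top")
print(request)
```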

> * block-job-complete: force completion of mirroring and switching of the
> device to the target, not related to the rest of the proposal.
> Synchronously opens backing files if needed, asynchronously completes
> the job.

Can this be made part of a 'transaction'?  Likewise, can
'block-job-cancel' be made part of a 'transaction'?  Having those two
commands transactionable means that you could copy multiple disks at the
same point in time (block-job-cancel) or pivot multiple disks leaving
the former files consistent at the same point in time
(block-job-complete).  It doesn't have to be done in the first round,
but we should make sure we are not precluding this for future growth.
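
To make the request concrete, here is a sketch of what a grouped pivot or cancel might look like on the wire. This is purely hypothetical: neither 'block-job-cancel' nor 'block-job-complete' is claimed to be a supported 'transaction' action here, and the helper and device names are made up. Only the general shape of a 'transaction' request (a list of actions, each with a type and data) is taken as given.

```python
import json

GROUPABLE = ("block-job-cancel", "block-job-complete")


def build_group_request(action, devices):
    """Build one 'transaction' request applying `action` to every device.

    Hypothetical: assumes the two job commands were accepted as
    transaction action types, which is the feature being requested.
    """
    if action not in GROUPABLE:
        raise ValueError("not a groupable job command: %r" % (action,))
    if not devices:
        raise ValueError("need at least one device")
    actions = [{"type": action, "data": {"device": d}} for d in devices]
    return {"execute": "transaction", "arguments": {"actions": actions}}


request = build_group_request("block-job-complete", ["virtio0", "virtio1"])
print(json.dumps(request, indent=2))
```

The point of grouping is that all disks would pivot (or stop copying) at one point in time, rather than one QMP command after another.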

Also, for the purposes of copying but not pivoting, you only have a safe
copy if 'len'=='offset' at the time of the cancel.  But now that you are
adding the possibility of mirroring reverting to copying, there is a
race: I can probe and see that we are in mirroring, then issue a
'block-job-cancel' to effect a copy operation, but in the meantime
things reverted, and the cancel ends up leaving me with an incomplete
copy.  Maybe 'block-job-complete' should be given an optional boolean
parameter.  By default, or if the parameter is true, we pivot.  If it is
false, we do the same as 'block-job-cancel' to effect a safe copy if we
are in mirroring, and error out if we are not.  That leaves
'block-job-cancel' as a way to always cancel a job, but no longer as a
safe way to guarantee a copy operation.
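
The proposed semantics can be modelled with a small simulation. Nothing below is QEMU code: the class, the state names and the parameter name 'pivot' are all hypothetical, chosen only to show how folding the "are we mirroring?" check into the command itself closes the race.

```python
class JobError(Exception):
    """Raised when a command cannot be honoured in the current phase."""


class MirrorJob:
    """Toy model of a mirror job; not a real QEMU structure."""

    def __init__(self, length, offset):
        self.len = length
        self.offset = offset
        self.state = "running"

    def mirroring(self):
        return self.len == self.offset


def block_job_cancel(job):
    """Always stops the job. Returns whether the copy happened to be safe."""
    safe = job.mirroring()
    job.state = "cancelled"
    return safe


def block_job_complete(job, pivot=True):
    """Check the phase and act in one step, so no probe/act race exists."""
    if not job.mirroring():
        raise JobError("job is not in the mirroring phase")
    job.state = "pivoted" if pivot else "copied"
    return job.state


# A job that fell back to copying after the client last probed it.
reverted = MirrorJob(length=1024, offset=512)
try:
    block_job_complete(reverted, pivot=False)
except JobError:
    print("copy refused; job left", reverted.state)
```

With cancel alone, the client would have ended the job and silently kept an incomplete copy; with the checked variant the job keeps running and the client learns that it must wait.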

> Persistent dirty bitmap
> =======================
> A persistent dirty bitmap can be used by management for two reasons.
> When mirroring is used for continuous replication of storage, to record
> I/O operations that happened while the replication server is not
> connected or unavailable.  When mirroring is used for storage migration,
> to check after a management crash whether the VM must be restarted with
> the source or the destination.

Is there a particular file format for the dirty bitmap?  Is there a
header, or is it just a straight bitmap, where the size of the file is
an exact function of the size of the file that it maps?
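
If it is a straight bitmap, the relationship would be simple arithmetic. The sketch below assumes one bit per fixed-size chunk of the image, rounded up to whole bytes, plus an optional fixed-size header. The granularity, the header size and the function name are assumptions, since the format is exactly what is being asked about.

```python
def bitmap_file_size(image_size, granularity, header_size=0):
    """Bytes needed for a one-bit-per-chunk bitmap covering the image."""
    if image_size < 0 or granularity <= 0 or header_size < 0:
        raise ValueError("sizes must be non-negative, granularity positive")
    bits = (image_size + granularity - 1) // granularity  # round up
    return header_size + (bits + 7) // 8                  # round up to bytes


# Example: a 10 GiB image tracked at a hypothetical 64 KiB granularity.
size = bitmap_file_size(10 * 1024**3, 64 * 1024)
print(size)
```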

> If management crashes between (6) and (7), it can examine the dirty
> bitmap on disk.  If it is all-zeros,

Obviously, this would be all-zeros in the map portion of the file; any
header portion would not impact this.

> management can restart the virtual
> machine with /mnt/dest/diskname.img.  If it has even a single nonzero bit,


> management can restart the virtual machine with the persistent dirty
> bitmap enabled, and later issue again a drive-mirror command to restart
> from step 4.
> Paolo
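
The recovery decision described above could be sketched as follows, assuming the bitmap file has been read into memory and that any header occupies a known number of leading bytes. The header size, the function names and the returned labels are hypothetical.

```python
def map_is_clean(data, header_size=0):
    """True if every bit of the map portion (after the header) is zero."""
    if header_size < 0 or header_size > len(data):
        raise ValueError("header_size out of range")
    return not any(data[header_size:])


def restart_plan(data, header_size=0):
    """Pick the recovery path after a management crash."""
    if map_is_clean(data, header_size):
        return "restart-with-destination"
    # Something is still dirty: keep the bitmap enabled and mirror again.
    return "restart-with-bitmap-and-remirror"


# A nonzero header byte must not count as dirty data.
print(restart_plan(b"\x01\x00\x00\x00", header_size=1))
print(restart_plan(b"\x00\x00\x10\x00"))
```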

Eric Blake   address@hidden    +1-919-301-3266
Libvirt virtualization library http://libvirt.org
