
Re: [Qemu-devel] Qemu and Changed Block Tracking


From: Eric Blake
Subject: Re: [Qemu-devel] Qemu and Changed Block Tracking
Date: Wed, 22 Feb 2017 06:32:05 -0600
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Thunderbird/45.7.0

On 02/22/2017 02:45 AM, Peter Lieven wrote:
>> A bit outdated now, but:
>> http://wiki.qemu-project.org/Features/IncrementalBackup
>>
>> and also a summary I wrote not too far back (PDF):
>> https://drive.google.com/file/d/0B3CFr1TuHydWalVJaEdPaE5PbFE
>>
>> and I'm sure the Virtuozzo developers could chime in on this subject,
>> but basically we do have something similar in the works, as eblake says.
> 
> Hi John, Hi Erik,

It's Eric, but you're not the first to make that typo :)

> 
> thanks for your feedback. Are you both the ones working primarily on this topic?
> If there is anything to review or help needed, please let me know.
> 
> My 2 cents:
> One thing I had in mind, if no image fleecing is available but the dirty
> bitmap is fetched externally, would be a feature to put a write lock on a
> block device.

The whole idea is to use a dirty bitmap coupled with image fleecing,
where the point-in-time of the image fleecing is done at a window where
the guest I/O is quiescent in order to get a stable fleecing point.  We
already support write locks (guest quiescence) using qga to do fsfreeze.
You want the time that guest I/O is frozen to be as small as possible
(in particular, the Windows implementation of quiescence will fail if
you hold things frozen for more than a couple of seconds).

Right now, the qcow2 image format does not track write generations, and
I don't think we plan on adding that directly into qcow2.  However, you
can externally simulate write generations by keeping track of how many
image fleecing points you have created (each fleecing point is another
write generation).
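
A minimal sketch of that external bookkeeping, in Python. Everything here
(the GenerationTracker class, its method names, the bitmap names) is
hypothetical and lives entirely in the backup tool, not in qemu or qcow2;
it only shows that counting fleecing points is enough to number write
generations.

```python
class GenerationTracker:
    """Hypothetical helper: numbers write generations by counting fleecing points."""

    def __init__(self):
        self.generation = 0        # 0 means "no fleecing point yet" (full backup)
        self.fleece_points = {}    # generation number -> bitmap frozen at that point

    def record_fleece_point(self, bitmap_name):
        """Register a new fleecing point and return its generation number."""
        self.generation += 1
        self.fleece_points[self.generation] = bitmap_name
        return self.generation

    def bitmaps_since(self, last_backup_gen):
        """Bitmaps covering every write newer than the given generation."""
        return [name
                for gen, name in sorted(self.fleece_points.items())
                if gen > last_backup_gen]


tracker = GenerationTracker()
gen_a = tracker.record_fleece_point("bitmap0")
gen_b = tracker.record_fleece_point("bitmap1")
```

A backup that last ran at generation gen_a would then only need the
bitmaps returned by tracker.bitmaps_since(gen_a).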


> In this case something like this via QMP (and external software) should work:
> ---8<---
>  gen =  write generation of last backup (or 0 for full backup)
>  do {
>      nextgen = fetch current write generation (via QMP)
>      dirtymap = send all blocks whose write generation is greater than 'gen' (via QMP)

No, we are NOT going to send dirty information via QMP.  Rather, we are
going to send it via NBD's extension NBD_CMD_BLOCK_STATUS.  The idea is
that a client connects and asks which qemu blocks are dirty, then uses
that information to read only the dirty blocks.
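
A rough sketch of the client side of that idea, in Python. It does not
speak NBD at all: the (length, is_dirty) extent list stands in for
whatever a block status reply would carry, and read_at/write_at are
hypothetical callbacks for the export and the backup target. The point is
only that the client merges the dirty extents and reads nothing else.

```python
def dirty_ranges(extents, start=0):
    """Turn consecutive (length, is_dirty) extents into merged (offset, length) ranges."""
    ranges = []
    offset = start
    for length, is_dirty in extents:
        if is_dirty:
            if ranges and ranges[-1][0] + ranges[-1][1] == offset:
                # Adjacent to the previous dirty range: extend it.
                prev_off, prev_len = ranges[-1]
                ranges[-1] = (prev_off, prev_len + length)
            else:
                ranges.append((offset, length))
        offset += length
    return ranges


def copy_dirty(read_at, write_at, extents):
    """Copy only the dirty ranges; returns the number of bytes copied."""
    copied = 0
    for offset, length in dirty_ranges(extents):
        write_at(offset, read_at(offset, length))
        copied += length
    return copied


# Usage with in-memory buffers standing in for the export and the backup.
src = bytearray(b"AAAA" + b"BBBB" + b"CCCC")
dst = bytearray(len(src))

def read_at(offset, length):
    return bytes(src[offset:offset + length])

def write_at(offset, data):
    dst[offset:offset + len(data)] = data

copied = copy_dirty(read_at, write_at, [(4, False), (4, True), (4, False)])
```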

>      dirtycnt = 0
>      foreach block in dirtymap {
>                copy to backup via external software
>                dirtycnt++
>      }
>      gen = nextgen
>  } while (dirtycnt < X)         <--- to achieve this, throttling or similar might be needed
> 
> fsfreeze (optional)
> write lock (via QMP)
> backupgen = fetch current write generation (via QMP)
> dirtymap = send all blocks whose write generation is greater than 'gen' (via QMP)
> foreach block in dirtymap {
>                copy to backup via external software
> }
> unlock (via QMP)
> fsthaw (optional)
> --->8---

That is too long for the guest to be frozen.  Rather, the flow is more like:

1. Set up bitmap0 to track all writes since the last point in time.
2. fsfreeze (optional).
3. Run a transaction to pivot to a new bitmap1 (effectively freezing
   bitmap0 as the point in time we are interested in).
4. fsthaw.
5. Connect via NBD with a request to view the data at the bitmap0 point
   in time: read the bitmap, then read the sectors that the bitmap says
   are dirty.
6. Clean up bitmap0 (qemu can finally delete any point-in-time sectors
   that were copied off due to any writes after the thaw).
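
The flow above can be illustrated with a toy model in Python. This is not
qemu code and uses no qemu API: ToyDisk and all of its methods are made up
for illustration, blocks hold small integers instead of data, and the
freeze/thaw steps appear only as comments. It shows why the guest only
needs to be frozen around the pivot, while the copy happens afterwards.

```python
class ToyDisk:
    """Toy model of dirty bitmaps plus one fleecing point (hypothetical, not qemu)."""

    def __init__(self, nblocks):
        self.blocks = [0] * nblocks   # current guest-visible contents
        self.active = set()           # active dirty bitmap (bitmap0, later bitmap1)
        self.frozen = None            # bitmap frozen at the fleecing point
        self.saved = {}               # point-in-time copies of overwritten blocks

    def guest_write(self, idx, value):
        # While a fleecing point exists, keep the old data before overwriting it.
        if self.frozen is not None and idx not in self.saved:
            self.saved[idx] = self.blocks[idx]
        self.blocks[idx] = value
        self.active.add(idx)

    def pivot(self):
        """Freeze the active bitmap and start tracking into a fresh one."""
        self.frozen, self.active = self.active, set()
        self.saved = {}

    def read_point_in_time(self, idx):
        """Read a block as it was at the moment of the pivot."""
        return self.saved.get(idx, self.blocks[idx])

    def cleanup(self):
        """Drop the frozen bitmap and the preserved point-in-time blocks."""
        self.frozen = None
        self.saved = {}


disk = ToyDisk(8)
disk.guest_write(1, 11)          # tracked by bitmap0
disk.guest_write(5, 55)          # tracked by bitmap0
# fsfreeze would go here (optional)
disk.pivot()                     # bitmap0 frozen, bitmap1 now active
# fsthaw would go here; the guest keeps running during the copy
disk.guest_write(5, 99)          # old value 55 is preserved for the backup
disk.guest_write(2, 22)          # only lands in bitmap1
backup = {idx: disk.read_point_in_time(idx) for idx in sorted(disk.frozen)}
pending_for_next_backup = set(disk.active)
disk.cleanup()
```

In this model the backup sees blocks 1 and 5 as they were at the pivot,
and the writes made after the thaw are left for the next incremental run.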

> As far as I understand, CBT in VMware is not just a dirty bitmap, but
> also write generation tracking for blocks (size 64 KB or whatever)

Write generation is a matter of tracking which bitmaps and points in
time you fleeced from.

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org
