


From: Alex Bligh
Subject: Re: [Qemu-devel] [Nbd] [PATCH v2] doc: Add NBD_CMD_BLOCK_STATUS extension
Date: Mon, 4 Apr 2016 19:06:01 +0100

On 4 Apr 2016, at 17:39, Eric Blake <address@hidden> wrote:

> +    This command is meant to operate in tandem with other (non-NBD)
> +    channels to the server.  Generally, a "dirty" block is a block
> +    that has been written to by someone, but the exact meaning of "has
> +    been written" is left to the implementation.  For example, a
> +    virtual machine monitor could provide a (non-NBD) command to start
> +    tracking blocks written by the virtual machine.  A backup client
> +    can then connect to an NBD server provided by the virtual machine
> +    monitor and use `NBD_CMD_BLOCK_STATUS` with the
> +    `NBD_FLAG_STATUS_DIRTY` bit set in order to read only the dirty
> +    blocks that the virtual machine has changed.
> +
> +    An implementation that doesn't track the "dirtiness" state of
> +    blocks MUST either fail this command with `EINVAL`, or mark all
> +    blocks as dirty in the descriptor that it returns.  Upon receiving
> +    an `NBD_CMD_BLOCK_STATUS` command with the flag
> +    `NBD_FLAG_STATUS_DIRTY` set, the server MUST return the dirtiness
> +    status of the device, where the status field of each descriptor is
> +    determined by the following bit:
> +
> +      - `NBD_STATE_CLEAN` (bit 2); if set, the block represents a
> +        portion of the file that is still clean because it has not
> +        been written; if clear, the block represents a portion of the
> +        file that is dirty, or where the server could not otherwise
> +        determine its status.

A couple of questions:

1. I am not sure that the block dirtiness and the zero/allocation/hole thing
   always have the same natural blocksize. It's pretty easy to imagine
   a server whose natural blocksize is a disk sector (and can therefore
   report presence of zeroes to that resolution) but where 'dirtiness'
   was maintained independently at a less fine-grained level. Maybe
   that suggests 2 commands would be useful.

2. Given the communication is out of band, how is it realistically
   possible to sync this backup? You'll ask for all the dirty blocks,
   but whilst the command is being executed (as well as immediately
   after the reply) further blocks may be dirtied. So your reply
   always overestimates what is clean (probably the wrong way around).
   Furthermore, the next time you do a 'backup', you don't know whether
   a block is dirty because it was already dirty at the previous backup,
   or because it has been written to since then.
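
To make point 1 concrete, here is a minimal sketch of a server that
reports zeroes per sector but tracks dirtiness in coarser chunks. The
512-byte sector, the 64 KiB granularity and both function names are
assumptions chosen for illustration; nothing here comes from the patch.

```python
SECTOR_SIZE = 512               # assumed resolution of zero/hole reporting
DIRTY_GRANULARITY = 64 * 1024   # assumed resolution of dirty tracking


def dirty_chunks_for_write(offset, length):
    """Return the dirty-tracking chunk indices touched by a write."""
    first = offset // DIRTY_GRANULARITY
    last = (offset + length - 1) // DIRTY_GRANULARITY
    return set(range(first, last + 1))


def dirty_extents(chunks):
    """Merge chunk indices into (offset, length) byte extents."""
    extents = []
    for chunk in sorted(chunks):
        start = chunk * DIRTY_GRANULARITY
        if extents and extents[-1][0] + extents[-1][1] == start:
            # Adjacent to the previous extent: grow it instead of adding one.
            prev_start, prev_len = extents[-1]
            extents[-1] = (prev_start, prev_len + DIRTY_GRANULARITY)
        else:
            extents.append((start, DIRTY_GRANULARITY))
    return extents


# A single-sector write at 70 KiB dirties the whole 64 KiB chunk around it.
chunks = dirty_chunks_for_write(offset=70 * 1024, length=SECTOR_SIZE)
print(dirty_extents(chunks))  # [(65536, 65536)]
```

The dirty extent boundaries fall on 64 KiB multiples while the
zero/hole extents could fall on any sector, so a single descriptor
list would have to be split at both sets of boundaries.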

If I were designing a backup protocol (off the top of my head) I'd
make all commands return a monotonic 64-bit counter of the number of
writes to the disk since some arbitrary time, and provide a 'GETDIRTY'
command that took a counter value and returned all blocks whose last
write has a counter greater than that value.
That way I could precisely get the writes that were executed since
any particular read. You'd allow it to be 'slack' and include things
in that list that might not have changed (i.e. false positives) but
not false negatives.
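
A rough sketch of that idea, assuming a per-block record of the counter
value at the last write. The class name, the method names and the
4096-byte block size are hypothetical; this is not an NBD API, and only
'GETDIRTY' corresponds to the command suggested above.

```python
class WriteTracker:
    """Toy model of a disk that stamps each block with a write counter."""

    def __init__(self, block_size=4096):
        self.block_size = block_size
        self.counter = 0        # monotonic write counter (64-bit on the wire)
        self.last_write = {}    # block index -> counter value of latest write

    def write(self, offset, length):
        """Record a write; the reply carries the new counter value."""
        self.counter += 1
        first = offset // self.block_size
        last = (offset + length - 1) // self.block_size
        for block in range(first, last + 1):
            self.last_write[block] = self.counter
        return self.counter

    def read(self, offset, length):
        """Reads do not advance the counter; the reply reports its value."""
        return self.counter

    def get_dirty(self, since):
        """GETDIRTY: blocks whose last write is newer than `since`."""
        return sorted(b for b, c in self.last_write.items() if c > since)


disk = WriteTracker()
disk.write(0, 4096)               # counter 1, touches block 0
disk.write(8192, 100)             # counter 2, touches block 2
checkpoint = disk.read(0, 4096)   # backup client remembers counter 2
disk.write(4096, 8192)            # counter 3, touches blocks 1 and 2
print(disk.get_dirty(checkpoint))  # [1, 2]
```

A real server could track at a coarser granularity and report more
blocks than strictly changed, which keeps the "false positives but no
false negatives" property.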

-- 
Alex Bligh






