qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [Nbd] [PATCH 2/2] NBD proto: add GET_LBA_STATUS extensi


From: Paolo Bonzini
Subject: Re: [Qemu-devel] [Nbd] [PATCH 2/2] NBD proto: add GET_LBA_STATUS extension
Date: Thu, 24 Mar 2016 12:55:51 +0100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.6.0


On 23/03/2016 18:58, Wouter Verhelst wrote:
>> +To provide such class of information, `GET_LBA_STATUS` extension adds new
>> +`NBD_CMD_GET_LBA_STATUS` command which returns a list of LBA ranges with
>> +their respective states.
>> +
>> +* `NBD_CMD_GET_LBA_STATUS` (7)
>> +
>> +    An LBA range status query request. Length and offset define the range
>> +    of interest. The server MUST reply with a reply header, followed
>> +    immediately by the following data:
> 
> As Eric noted, please expand LBA at least once.

Let's just use "block" (e.g. NBD_CMD_GET_BLOCK_STATUS).

>> +      - 32 bits, length of parameter data that follow (unsigned)
>> +      - zero or more LBA status descriptors, each having the following
>> +        structure:
>> +
>> +        * 64 bits, offset (unsigned)
>> +        * 32 bits, length (unsigned)
>> +        * 16 bits, status (unsigned)
>> +
>> +    unless an error condition has occurred.
>> +

Can we just return one descriptor?  That would simplify the protocol a bit.

>> +    the provisioning state of the device. The following provisionnig states
>> +    are defined for the command:
>> +
>> +      - `NBD_STATE_ALLOCATED` (0x0), LBA extent is present on the block 
>> device;
>> +      - `NBD_STATE_ZEROED` (0x1), LBA extent is present on the block device
>> +        and contains zeroes;
> 
> Presumably this should be "contains only zeroes"?

Yes, or "reads as zeroes".

> Also, this may end up being a fairly expensive call for the server to
> process. Is it really useful?

It's always okay for the server to omit NBD_STATE_ZERO, but it can be
useful if the state is known for some reason.  For example, file system
holes are always zero, but unallocated blocks on a block device are not
always zero.

However, let's make these bits, so that

NBD_STATE_ALLOCATED (0x1), LBA extent is present on the block device
NBD_STATE_ZERO (0x2), LBA extent will read as zeroes

and you can have NBD_STATE_ALLOCATED|NBD_STATE_ZERO.  File systems do
have the concept of unwritten extents which would be represented like
that.  The API to access the information (the FIEMAP ioctl) is broken,
but perhaps in the future a non-broken API could be added---for example
SEEK_ZERO and SEEK_NONZERO values for lseek's "whence" argument.

>> +      - `NBD_STATE_DEALLOCATED` (0x2), LBA extent is not present on the
>> +        block device. A client MUST NOT make any assumptions about the
>> +        contents of the extent.
>> +
>> +    2. Block dirtiness state
>> +
>> +    Upon receiving an `NBD_CMD_GET_LBA_STATUS` command with command flags
>> +    field set to `NBD_FLAG_GET_DIRTY` (0x1), the server MUST return
>> +    the dirtiness status of the device. The following dirtiness states
>> +    are defined for the command:
>> +
>> +      - `NBD_STATE_DIRTY` (0x0), LBA extent is dirty;
>> +      - `NBD_STATE_CLEAN` (0x1), LBA extent is clean.
>> +
>> +    Generic NBD client implementation without knowledge of a particular NBD
>> +    server operation MUST NOT make any assumption on the meaning of the
>> +    NBD_STATE_DIRTY or NBD_STATE_CLEAN states.
> 
> That makes it a useless call. A server can read /dev/random to decide
> whether to send STATE_DIRTY or STATE_CLEAN, and still be compliant with
> this spec.
> 
> Either the spec should define what it means for a block to be in a dirty
> state, or it should not talk about it.

Here is my attempt:

    This command is meant to operate in tandem with other (non-NBD)
    channels to the server.  Generally, a "dirty" block is a block that
    has been written to by someone, but the exact meaning of "has been
    written" is left to the implementation.  For example, a virtual
    machine monitor could provide a (non-NBD) command to start tracking
    blocks written by the virtual machine.  A backup client then can
    connect to an NBD server provided by the virtual machine monitor
    and use NBD_CMD_GET_BLOCK_STATUS only read blocks that the virtual
    machine has changed.

    An implementation that doesn't track the "dirtiness" state of blocks
    MUST either fail this command with EINVAL, or mark all blocks as
    dirty in the descriptor that it returns.

Paolo



reply via email to

[Prev in Thread] Current Thread [Next in Thread]