
Re: [Qemu-block] NBD structured reads vs. block size


From: Eric Blake
Subject: Re: [Qemu-block] NBD structured reads vs. block size
Date: Tue, 28 Aug 2018 15:41:24 -0500
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.9.1

Revisiting this:

On 08/01/2018 09:41 AM, Eric Blake wrote:
Rich Jones pointed me to questionable behavior in qemu's NBD server implementation today. qemu advertises a minimum block size of 512 to any client that promises to honor block sizes. But when it serves up a raw file whose size is not aligned to a sector boundary, attempting to read the final portion of the file results in a structured read with two chunks: the first carries the data up to the end of the actual file, and the second reports a hole for the rest of the sector. If a client is promising to obey block sizes on its requests, it seems odd that the server is allowed to send a result that is not also aligned to block sizes.

Right now, the NBD spec says that when structured replies are in use, then for a structured read:

     The server MAY split the reply into any number of content chunks;
     each chunk MUST describe at least one byte, although to minimize
     overhead, the server SHOULD use chunks with lengths and offsets as
     an integer multiple of 512 bytes, where possible (the first and
     last chunk of an unaligned read being the most obvious places for
     an exception).

I'm wondering if we should tighten that to require that the server partition the reply chunks to be aligned to the advertised minimum block size. At that point, qemu should either advertise 1 instead of 512 as the minimum size when serving up an unaligned file, or else send the final partial sector as a single data chunk rather than trying to report the last few bytes as a hole.
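To make the two options concrete, here is a minimal Python sketch. It is not qemu code; the names `split_at_eof`, `is_aligned`, and `align_chunks` are hypothetical, and the 1000-byte file is an assumed example. It models the two-chunk reply described above and the alternative of sending the final partial sector as one data chunk.

```python
from typing import List, Tuple

# (kind, offset, length); kind is "data" or "hole". Hypothetical model, not the wire format.
Chunk = Tuple[str, int, int]


def split_at_eof(offset: int, length: int, file_size: int) -> List[Chunk]:
    """Model the reply described above: data up to the real EOF, then a hole."""
    data_len = max(0, min(offset + length, file_size) - offset)
    chunks: List[Chunk] = []
    if data_len:
        chunks.append(("data", offset, data_len))
    if data_len < length:
        chunks.append(("hole", offset + data_len, length - data_len))
    return chunks


def is_aligned(chunks: List[Chunk], min_block: int) -> bool:
    """True if every chunk offset and length is a multiple of min_block."""
    return all(off % min_block == 0 and ln % min_block == 0
               for _, off, ln in chunks)


def align_chunks(chunks: List[Chunk], min_block: int) -> List[Chunk]:
    """Re-partition chunks on min_block boundaries.

    Assumes the chunks are contiguous and that the overall range starts and
    ends on a min_block boundary. A block that contains any data is reported
    as data; only blocks that are entirely hole stay holes.
    """
    if not chunks:
        return []
    start = chunks[0][1]
    end = chunks[-1][1] + chunks[-1][2]
    out: List[Chunk] = []
    for blk in range(start, end, min_block):
        blk_end = blk + min_block
        has_data = any(kind == "data" and off < blk_end and off + ln > blk
                       for kind, off, ln in chunks)
        kind = "data" if has_data else "hole"
        if out and out[-1][0] == kind:
            # Extend the previous chunk instead of emitting a new one.
            prev_kind, prev_off, prev_len = out[-1]
            out[-1] = (prev_kind, prev_off, prev_len + min_block)
        else:
            out.append((kind, blk, min_block))
    return out


# Assumed example: a 1000-byte file exported as 1024 bytes, client reads the last sector.
reply = split_at_eof(offset=512, length=512, file_size=1000)
print(reply, is_aligned(reply, 512))
print(align_chunks(reply, 512))
```

With a minimum block size of 1 the two-chunk reply already passes `is_aligned`, which corresponds to the first option; `align_chunks` corresponds to the second.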

For comparison, on block status, we require:

     The server SHOULD use descriptor
     lengths that are an integer multiple of 512 bytes where possible
     (the first and last descriptor of an unaligned query being the
     most obvious places for an exception), and MUST use descriptor
     lengths that are an integer multiple of any advertised minimum
     block size.

And qemu as a client currently hangs up on any server that violates that requirement on block status. That is, when qemu as the server tries to send a block status that is not aligned to the advertised block size, qemu as the client flags it as an invalid server, which means qemu as server is currently broken. So I'm thinking we should copy that requirement onto servers for reads as well.
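The client-side check on block status can be sketched roughly as follows. This is an illustration only, not qemu's implementation: `InvalidServerReply` and `validate_block_status` are hypothetical names, and descriptors are modelled as plain lengths.

```python
from typing import Iterable


class InvalidServerReply(Exception):
    """Hypothetical error a client might raise before dropping the connection."""


def validate_block_status(lengths: Iterable[int], min_block: int) -> int:
    """Reject any descriptor whose length is not a multiple of min_block.

    Returns the total number of bytes described when all descriptors pass.
    """
    total = 0
    for index, length in enumerate(lengths):
        if length % min_block != 0:
            raise InvalidServerReply(
                f"descriptor {index} has length {length}, "
                f"not a multiple of minimum block size {min_block}")
        total += length
    return total


# Aligned descriptors are accepted.
print(validate_block_status([512, 1024], 512))

# A 488-byte data extent followed by a 24-byte hole is rejected.
try:
    validate_block_status([488, 24], 512)
except InvalidServerReply as err:
    print("invalid server:", err)
```

Copying the requirement onto reads would mean a client could apply the same kind of check to the lengths of structured read chunks.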

Vladimir pointed out that the problem is not necessarily just limited to the implicit hole at the end of a file that was rounded up to sector size. Another case where sub-region changes occur in qemu is where you have a backing file with 512-byte hole granularity:

     qemu-img create -f qcow2 -o cluster_size=512 backing.qcow2 100M

and an overlay with larger granularity:

     qemu-img create -f qcow2 -b backing.qcow2 -F qcow2 -o cluster_size=4k active.qcow2

On a cluster where the top layer defers to the underlying layer, the contents can alternate between holes and data at sector boundaries that fall inside a single cluster of the top layer. As long as qemu advertises a minimum block size of 512 rather than the cluster size, this isn't a problem. But if qemu were to change to report the qcow2 cluster size as its minimum I/O (rather than merely its preferred I/O, because it can do read-modify-write on data smaller than a cluster), this would be another case where unaligned replies might confuse a client.
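A small Python sketch of that layering case, under assumed values: the extent layout is invented for illustration (one 4k overlay cluster that defers entirely to a backing file alternating data and holes every 512 bytes), and `alternating_extents` and `aligned_to` are hypothetical helpers, not qemu functions.

```python
from typing import List, Tuple

SECTOR = 512     # backing file cluster size from the example above
CLUSTER = 4096   # overlay cluster size from the example above

Extent = Tuple[str, int, int]  # (kind, offset, length)


def alternating_extents(cluster_start: int) -> List[Extent]:
    """Extents for one overlay cluster whose contents come from the backing
    file, alternating data and hole at every sector (an assumed layout)."""
    return [("data" if i % 2 == 0 else "hole",
             cluster_start + i * SECTOR,
             SECTOR)
            for i in range(CLUSTER // SECTOR)]


def aligned_to(extents: List[Extent], min_block: int) -> bool:
    """True if every extent offset and length is a multiple of min_block."""
    return all(off % min_block == 0 and length % min_block == 0
               for _, off, length in extents)


extents = alternating_extents(0)
print(aligned_to(extents, SECTOR))   # advertised minimum of 512
print(aligned_to(extents, CLUSTER))  # advertised minimum of the 4k cluster size
```

The same eight extents satisfy a 512-byte minimum but violate a 4096-byte one, which is the situation described for a server that advertised the cluster size as its minimum.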

--
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org


