From: Michael Roth
Subject: [Qemu-devel] [PATCH 51/67] block: Let write zeroes fallback work even with small max_transfer
Date: Wed, 14 Dec 2016 18:44:45 -0600
From: Eric Blake <address@hidden>
Commit 443668ca rewrote the write_zeroes logic to guarantee that
an unaligned request never crosses a cluster boundary. But
in the rewrite, the new code assumed that at most one iteration
would be needed to get to an alignment boundary.
However, it is easy to trigger an assertion failure: the Linux
kernel makes loopback devices advertise a max_transfer of only
64k. Any operation that falls back to plain writes rather than
more efficient zeroing must obey max_transfer during that fallback.
When a format with clusters larger than 64k is layered atop the
file protocol accessing a loopback device, an unaligned head may
therefore require multiple iterations of the write fallback before
reaching an aligned boundary.
Test case:
$ qemu-img create -f qcow2 -o cluster_size=1M file 10M
$ losetup /dev/loop2 /path/to/file
$ qemu-io -f qcow2 /dev/loop2
qemu-io> w 7m 1k
qemu-io> w -z 8003584 2093056
In fairness to Denis (as the original listed author of the culprit
commit), the faulty logic for at most one iteration is probably all
my fault in reworking his idea. But the solution is to restore what
was in place prior to that commit: when dealing with an unaligned
head or tail, iterate as many times as necessary while fragmenting
the operation at max_transfer boundaries.
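
To illustrate the arithmetic behind the test case, here is a minimal
standalone sketch of the head-handling loop. It is not QEMU code:
head_iterations() is a hypothetical helper that only mirrors the
"num = MIN(MIN(count, max_transfer), alignment - head)" computation from
the patch, and the example values assume that alignment equals the 1M
cluster size and that max_transfer is the 64k loopback limit.

```c
#include <assert.h>
#include <stdint.h>

#define MIN(a, b) ((a) < (b) ? (a) : (b))

/* Hypothetical helper, not part of QEMU: counts how many requests are
 * issued before the offset reaches an alignment boundary (or the request
 * is used up), fragmenting the unaligned head at max_transfer. */
static int head_iterations(int64_t offset, int64_t count,
                           int64_t alignment, int64_t max_transfer)
{
    int64_t head = offset % alignment;
    int iterations = 0;

    assert(alignment > 0 && max_transfer > 0);

    while (count > 0 && head) {
        /* Same shape as the patched computation of num. */
        int64_t num = MIN(MIN(count, max_transfer), alignment - head);

        /* head becomes 0 once the next boundary is reached. */
        head = (head + num) % alignment;
        offset += num;
        count -= num;
        iterations++;
    }
    return iterations;
}
```

For "w -z 8003584 2093056" the head is 8003584 % 1048576 = 663552, so
385024 bytes remain before the next cluster boundary. With a 64k
max_transfer, head_iterations(8003584, 2093056, 1048576, 65536) returns
6: five 65536-byte requests followed by one of 57344 bytes. The old
"num = MIN(count, alignment - head)" would instead have produced a single
385024-byte request, well above max_transfer.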
Reported-by: Ed Swierk <address@hidden>
CC: address@hidden
CC: Denis V. Lunev <address@hidden>
Signed-off-by: Eric Blake <address@hidden>
Reviewed-by: Max Reitz <address@hidden>
Signed-off-by: Kevin Wolf <address@hidden>
(cherry picked from commit b2f95feec5e4d546b932848dd421ec3361e8ef77)
Signed-off-by: Michael Roth <address@hidden>
---
block/io.c | 13 ++++++++-----
1 file changed, 8 insertions(+), 5 deletions(-)
diff --git a/block/io.c b/block/io.c
index e579eda..959e140 100644
--- a/block/io.c
+++ b/block/io.c
@@ -1179,6 +1179,8 @@ static int coroutine_fn bdrv_co_do_pwrite_zeroes(BlockDriverState *bs,
int max_write_zeroes = MIN_NON_ZERO(bs->bl.max_pwrite_zeroes, INT_MAX);
int alignment = MAX(bs->bl.pwrite_zeroes_alignment,
bs->bl.request_alignment);
+ int max_transfer = MIN_NON_ZERO(bs->bl.max_transfer,
+ MAX_WRITE_ZEROES_BOUNCE_BUFFER);
assert(alignment % bs->bl.request_alignment == 0);
head = offset % alignment;
@@ -1194,9 +1196,12 @@ static int coroutine_fn bdrv_co_do_pwrite_zeroes(BlockDriverState *bs,
* boundaries.
*/
if (head) {
- /* Make a small request up to the first aligned sector. */
- num = MIN(count, alignment - head);
- head = 0;
+ /* Make a small request up to the first aligned sector. For
+ * convenience, limit this request to max_transfer even if
+ * we don't need to fall back to writes. */
+ num = MIN(MIN(count, max_transfer), alignment - head);
+ head = (head + num) % alignment;
+ assert(num < max_write_zeroes);
} else if (tail && num > alignment) {
/* Shorten the request to the last aligned sector. */
num -= tail;
@@ -1222,8 +1227,6 @@ static int coroutine_fn bdrv_co_do_pwrite_zeroes(BlockDriverState *bs,
if (ret == -ENOTSUP) {
/* Fall back to bounce buffer if write zeroes is unsupported */
- int max_transfer = MIN_NON_ZERO(bs->bl.max_transfer,
- MAX_WRITE_ZEROES_BOUNCE_BUFFER);
BdrvRequestFlags write_flags = flags & ~BDRV_REQ_ZERO_WRITE;
if ((flags & BDRV_REQ_FUA) &&
--
1.9.1