Hi all!
This is the last part of the original
"[RFC 00/24] backup performance: block_status + async" series; the
preparations are already merged.
The series turns backup into a series of block_copy_async calls covering
the whole disk, so we get block-status based parallel async requests
out of the box, which gives a performance gain:
-----------------  ----------------  -------------  --------------------------  --------------------------  ----------------  -------------------------------
                   mirror(upstream)  backup(new)    backup(new, no-copy-range)  backup(new, copy-range-1w)  backup(upstream)  backup(upstream, no-copy-range)
hdd-ext4:hdd-ext4  18.86 +- 0.11     45.50 +- 2.35  19.22 +- 0.09               19.51 +- 0.09               22.85 +- 5.98     19.72 +- 0.35
hdd-ext4:ssd-ext4  8.99 +- 0.02      9.30 +- 0.01   8.97 +- 0.02                9.02 +- 0.02                9.68 +- 0.26      9.84 +- 0.12
ssd-ext4:hdd-ext4  9.09 +- 0.11      9.34 +- 0.10   9.34 +- 0.10                8.99 +- 0.01                11.37 +- 0.37     11.47 +- 0.30
ssd-ext4:ssd-ext4  4.07 +- 0.02      5.41 +- 0.05   4.05 +- 0.01                8.35 +- 0.58                9.83 +- 0.64      8.62 +- 0.35
hdd-xfs:hdd-xfs    18.90 +- 0.19     43.26 +- 2.47  19.62 +- 0.14               19.38 +- 0.16               19.55 +- 0.26     19.62 +- 0.12
hdd-xfs:ssd-xfs    8.93 +- 0.12      9.35 +- 0.03   8.93 +- 0.08                8.93 +- 0.05                9.79 +- 0.30      9.55 +- 0.15
ssd-xfs:hdd-xfs    9.15 +- 0.07      9.74 +- 0.28   9.29 +- 0.03                9.08 +- 0.05                10.85 +- 0.31     10.91 +- 0.30
ssd-xfs:ssd-xfs    4.06 +- 0.01      4.93 +- 0.02   4.04 +- 0.01                8.17 +- 0.42                9.52 +- 0.49      8.85 +- 0.46
ssd-ext4:nbd       9.96 +- 0.11      11.45 +- 0.15  11.45 +- 0.02               17.22 +- 0.06               34.45 +- 1.35     35.16 +- 0.37
nbd:ssd-ext4       9.84 +- 0.02      9.84 +- 0.04   9.80 +- 0.06                18.96 +- 0.06               30.89 +- 0.73     31.46 +- 0.21
-----------------  ----------------  -------------  --------------------------  --------------------------  ----------------  -------------------------------
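To put the gain into numbers, here is a small sketch that divides the
backup(upstream, no-copy-range) mean by the backup(new, no-copy-range) mean
for a few rows. The means are copied from the table; treating them as run
times where lower is better is an assumption (it is consistent with the
conclusions drawn from the table), and the helper names are illustrative
only.

```python
# Mean values copied from the benchmark table (error margins are dropped).
# Assumption: the values are run times, so lower is better.
RESULTS = {
    # test case: (backup(new, no-copy-range), backup(upstream, no-copy-range))
    "hdd-ext4:hdd-ext4": (19.22, 19.72),
    "ssd-ext4:ssd-ext4": (4.05, 8.62),
    "ssd-xfs:ssd-xfs": (4.04, 8.85),
    "ssd-ext4:nbd": (11.45, 35.16),
    "nbd:ssd-ext4": (9.80, 31.46),
}


def speedup(new, upstream):
    """Return how many times faster the new code is (upstream / new)."""
    return upstream / new


def speedups(results):
    """Map each test case to its speedup, rounded to two decimals."""
    return {case: round(speedup(new, up), 2)
            for case, (new, up) in results.items()}


if __name__ == "__main__":
    for case, ratio in speedups(RESULTS).items():
        print(f"{case:<18} {ratio:.2f}x")
```

With these inputs the ratio is about 1.03 for hdd-ext4:hdd-ext4, a bit over
2 for the ssd:ssd cases and a bit over 3 for the nbd cases.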
The table shows that copy_range interacts badly with parallel async
requests. copy_range brings a real performance gain only on filesystems
that support it, like btrfs. But even on such a filesystem, I'm not sure
that it is a good default behavior: if the copy is offloaded, so that no
data is really copied and the backup just links to the same blocks as the
original, then further writes from the guest will lead to fragmentation
of the guest disk, while the aim of backup is to operate transparently
for the guest.
So, in addition to this series, I also suggest disabling copy_range by
default.
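For readers unfamiliar with the approach, the idea of covering the whole
disk with a bounded number of parallel async copy requests can be sketched
in Python with asyncio. This is only an illustration, not the QEMU
implementation: copy_chunk and copy_disk are hypothetical names, the "disk"
is a bytearray, block-status handling is not modeled, and the meaning given
here to max_chunk and max_workers (names borrowed from the patch titles) is
an assumption.

```python
import asyncio


async def copy_chunk(src, dst, offset, length):
    """Copy one chunk; stands in for a single async copy request."""
    await asyncio.sleep(0)  # yield to the event loop, as real I/O would
    dst[offset:offset + length] = src[offset:offset + length]


async def copy_disk(src, dst, max_chunk, max_workers):
    """Cover the whole 'disk' with chunks, at most max_workers in flight."""
    limit = asyncio.Semaphore(max_workers)

    async def worker(offset):
        async with limit:
            # The last chunk may be shorter than max_chunk.
            length = min(max_chunk, len(src) - offset)
            await copy_chunk(src, dst, offset, length)

    await asyncio.gather(
        *(worker(offset) for offset in range(0, len(src), max_chunk))
    )


if __name__ == "__main__":
    source = bytearray(range(256)) * 16
    target = bytearray(len(source))
    asyncio.run(copy_disk(source, target, max_chunk=100, max_workers=4))
    print(target == source)
```

The semaphore is what bounds the parallelism: all chunk tasks are created
up front, but only max_workers of them hold the semaphore and copy at any
moment.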
===
How to test:
Prepare images:
In each directory where you want to place source and target images,
prepare the images with:
    for img in test-source test-target; do
        ./qemu-img create -f raw $img 1000M
        ./qemu-img bench -c 1000 -d 1 -f raw -s 1M -w --pattern=0xff $img
    done
Prepare a similar image for the NBD server, and start it somewhere with:
    qemu-nbd --persistent --nocache -f raw IMAGE
Then run the benchmark, like this:
    ./bench-backup.py \
        --qemu new:../../x86_64-softmmu/qemu-system-x86_64 \
            upstream:/work/src/qemu/up-backup-block-copy-master/x86_64-softmmu/qemu-system-x86_64 \
        --dir hdd-ext4:/test-a hdd-xfs:/test-b ssd-ext4:/ssd ssd-xfs:/ssd-b \
        --test $(for fs in ext4 xfs; do
                     echo hdd-$fs:hdd-$fs hdd-$fs:ssd-$fs ssd-$fs:hdd-$fs ssd-$fs:ssd-$fs
                 done) \
        --nbd 192.168.100.2 --test ssd-ext4:nbd nbd:ssd-ext4
(you may simply reduce the number of directories/test cases; use --help for
help)
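The shell loop passed to --test in the command above expands to the list of
source:target pairs for the local directories. As a minimal sketch, the
same list can be built in Python; local_test_cases is a hypothetical helper
written for this illustration and is not part of bench-backup.py.

```python
def local_test_cases(filesystems=("ext4", "xfs")):
    """Build source:target pairs the same way the shell loop does."""
    cases = []
    for fs in filesystems:
        cases += [
            f"hdd-{fs}:hdd-{fs}",
            f"hdd-{fs}:ssd-{fs}",
            f"ssd-{fs}:hdd-{fs}",
            f"ssd-{fs}:ssd-{fs}",
        ]
    return cases


if __name__ == "__main__":
    # Prints the arguments that follow the first --test option.
    print(" ".join(local_test_cases()))
```

Passing a shorter tuple, for example local_test_cases(("ext4",)), gives the
reduced set of test cases mentioned above.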
===
Note that I have included here
"[PATCH] block/block-copy: block_copy_dirty_clusters: fix failure check"
which was previously sent separately but is still untouched on the mailing
list. It may still be applied separately.
Vladimir Sementsov-Ogievskiy (20):
block/block-copy: block_copy_dirty_clusters: fix failure check
iotests: 129 don't check backup "busy"
qapi: backup: add x-use-copy-range parameter
block/block-copy: More explicit call_state
block/block-copy: implement block_copy_async
block/block-copy: add max_chunk and max_workers parameters
block/block-copy: add ratelimit to block-copy
block/block-copy: add block_copy_cancel
blockjob: add set_speed to BlockJobDriver
job: call job_enter from job_user_pause
qapi: backup: add x-max-chunk and x-max-workers parameters
iotests: 56: prepare for backup over block-copy
iotests: 129: prepare for backup over block-copy
iotests: 185: prepare for backup over block-copy
iotests: 219: prepare for backup over block-copy
iotests: 257: prepare for backup over block-copy
backup: move to block-copy
block/block-copy: drop unused argument of block_copy()
simplebench: bench_block_job: add cmd_options argument
simplebench: add bench-backup.py
qapi/block-core.json | 11 +-
block/backup-top.h | 1 +
include/block/block-copy.h | 45 +++-
include/block/block_int.h | 8 +
include/block/blockjob_int.h | 2 +
block/backup-top.c | 6 +-
block/backup.c | 170 ++++++++------
block/block-copy.c | 183 ++++++++++++---
block/replication.c | 1 +
blockdev.c | 10 +
blockjob.c | 6 +
job.c | 1 +
scripts/simplebench/bench-backup.py | 132 +++++++++++
scripts/simplebench/bench-example.py | 2 +-
scripts/simplebench/bench_block_job.py | 13 +-
tests/qemu-iotests/056 | 8 +-
tests/qemu-iotests/129 | 3 +-
tests/qemu-iotests/185 | 3 +-
tests/qemu-iotests/185.out | 2 +-
tests/qemu-iotests/219 | 13 +-
tests/qemu-iotests/257 | 1 +
tests/qemu-iotests/257.out | 306 ++++++++++++-------------
22 files changed, 640 insertions(+), 287 deletions(-)
create mode 100755 scripts/simplebench/bench-backup.py