qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [PATCH v2 2/2] block: avoid recursive block_status call


From: Vladimir Sementsov-Ogievskiy
Subject: Re: [Qemu-devel] [PATCH v2 2/2] block: avoid recursive block_status call if possible
Date: Tue, 28 May 2019 06:39:32 +0000

27.05.2019 18:13, Max Reitz wrote:
> On 08.04.19 18:26, Vladimir Sementsov-Ogievskiy wrote:
>> drv_co_block_status digs bs->file for additional, more accurate search
>> for hole inside region, reported as DATA by bs since 5daa74a6ebc.
>>
>> This accuracy is not free: assume we have qcow2 disk. Actually, qcow2
>> knows, where are holes and where is data. But every block_status
>> request calls lseek additionally. Assume a big disk, full of
>> data, in any iterative copying block job (or img convert) we'll call
>> lseek(HOLE) on every iteration, and each of these lseeks will have to
>> iterate through all metadata up to the end of file. It's obviously
>> ineffective behavior. And for many scenarios we don't need this lseek
>> at all.
>>
>> However, lseek is needed when we have metadata-preallocated image.
>>
>> So, let's detect metadata-preallocation case and don't dig qcow2's
>> protocol file in other cases.
>>
>> The idea is to compare allocation size in POV of filesystem with
>> allocations size in POV of Qcow2 (by refcounts). If allocation in fs is
>> significantly lower, consider it as metadata-preallocation case.
>>
>> 102 iotest changed, as our detector can't detect shrinked file as
>> metadata-preallocation, which don't seem to be wrong, as with metadata
>> preallocation we always have valid file length.
>>
>> Other two iotests tiny changed QMP output sequence, which should be
>> exactly because skipped lseek at mirror beginning.
>>
>> Suggested-by: Denis V. Lunev <address@hidden>
>> Signed-off-by: Vladimir Sementsov-Ogievskiy <address@hidden>
>> ---
>>   block/qcow2.h              |  4 ++++
>>   include/block/block.h      |  8 +++++++-
>>   block/io.c                 |  9 ++++++++-
>>   block/qcow2-refcount.c     | 32 ++++++++++++++++++++++++++++++++
>>   block/qcow2.c              | 11 +++++++++++
>>   tests/qemu-iotests/102     |  2 +-
>>   tests/qemu-iotests/102.out |  3 ++-
>>   tests/qemu-iotests/141.out |  2 +-
>>   tests/qemu-iotests/144.out |  2 +-
>>   9 files changed, 67 insertions(+), 6 deletions(-)
> 
> For me, this patch breaks iotests 141 (for qed) and 211 (for vdi):
> 
>> 141 1s ...        [17:11:53] [17:11:53] - output mismatch (see 141.out.bad)
>> --- tests/qemu-iotests/141.out 2019-05-27 17:11:43.327664282 +0200
>> +++ build/tests/qemu-iotests/141.out.bad       2019-05-27 17:11:53.949439880 
>> +0200
>> @@ -42,9 +42,9 @@
>>   {"return": {}}
>>   {"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, 
>> "event": "JOB_STATUS_CHANGE", "data": {"status": "created", "id": "job0"}}
>>   {"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, 
>> "event": "JOB_STATUS_CHANGE", "data": {"status": "running", "id": "job0"}}
>> -{"return": {}}
>>   {"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, 
>> "event": "JOB_STATUS_CHANGE", "data": {"status": "ready", "id": "job0"}}
>>   {"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, 
>> "event": "BLOCK_JOB_READY", "data": {"device": "job0", "len": 0, "offset": 
>> 0, "speed": 0, "type": "commit"}}
>> +{"return": {}}
>>   {"error": {"class": "GenericError", "desc": "Node 'drv0' is busy: block 
>> device is in use by block job: commit"}}
>>   {"return": {}}
>>   {"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, 
>> "event": "JOB_STATUS_CHANGE", "data": {"status": "waiting", "id": "job0"}}
> 
> and
> 
>> 211 5s ...        [17:11:54] [17:11:58] - output mismatch (see 211.out.bad)
>> --- tests/qemu-iotests/211.out 2019-05-22 19:58:34.870273427 +0200
>> +++ build/tests/qemu-iotests/211.out.bad       2019-05-27 17:11:58.259348827 
>> +0200
>> @@ -55,8 +55,7 @@
>>   virtual size: 32 MiB (33554432 bytes)
>>   cluster_size: 1048576
>>   
>> -[{ "start": 0, "length": 3072, "depth": 0, "zero": false, "data": true, 
>> "offset": 1024},
>> -{ "start": 3072, "length": 33551360, "depth": 0, "zero": true, "data": 
>> true, "offset": 4096}]
>> +[{ "start": 0, "length": 33554432, "depth": 0, "zero": false, "data": true, 
>> "offset": 1024}]
>>   
>>   === Invalid BlockdevRef ===
> 
> Doesn’t look too bad, but still, broken iotests are broken iotests. :/
> 


You are right, thanks for pointing to this! So, here is my investigation:

about 211: aha, looks like I just didn't implement metadata preallocation 
detection for vdi,
and default behavior is changed. I don't know vdi code, but after fast look, it 
seems that
it may be as simple as (and it fixes the test):


diff --git a/block/vdi.c b/block/vdi.c
index d7ef6628e7..a72333dcf4 100644
--- a/block/vdi.c
+++ b/block/vdi.c
@@ -542,7 +542,9 @@ static int coroutine_fn 
vdi_co_block_status(BlockDriverState *bs,
      *map = s->header.offset_data + (uint64_t)bmap_entry * s->block_size +
          index_in_block;
      *file = bs->file->bs;
-    return BDRV_BLOCK_DATA | BDRV_BLOCK_OFFSET_VALID;
+
+    return BDRV_BLOCK_DATA | BDRV_BLOCK_OFFSET_VALID |
+            (s->header.image_type == VDI_TYPE_STATIC ? BDRV_BLOCK_RECURSE : 0);
  }



Or, we can rollback vdi behavior to pre-patch state:



diff --git a/block/vdi.c b/block/vdi.c
index d7ef6628e7..074256d845 100644
--- a/block/vdi.c
+++ b/block/vdi.c
@@ -542,7 +542,7 @@ static int coroutine_fn 
vdi_co_block_status(BlockDriverState *bs,
      *map = s->header.offset_data + (uint64_t)bmap_entry * s->block_size +
          index_in_block;
      *file = bs->file->bs;
-    return BDRV_BLOCK_DATA | BDRV_BLOCK_OFFSET_VALID;
+    return BDRV_BLOCK_DATA | BDRV_BLOCK_OFFSET_VALID | BDRV_BLOCK_RECURSE;
  }

  static int coroutine_fn




about 141: aha, it's a difference between qcow2 and qed, because of changed io 
sequence for
qcow2, as Kevin described "What happens is that 
qcow2_detect_metadata_preallocation() causes
additional I/O by reading in the refcount block.". I think correct way to fix 
it is to filter
{"return": {}} elements, as we can't be sure where they occur among events:




diff --git a/tests/qemu-iotests/141 b/tests/qemu-iotests/141
index 2197a82d45..2704695921 100755
--- a/tests/qemu-iotests/141
+++ b/tests/qemu-iotests/141
@@ -58,16 +58,20 @@ test_blockjob()
                }}}" \
          'return'

+    # We may get {"return": {}} result of the following command among events in
+    # the following output or after all events in the next output (if $2 is an
+    # event, not 'return'). So, filter it here and in the following output to
+    # avoid race in test output.
      _send_qemu_cmd $QEMU_HANDLE \
          "$1" \
          "$2" \
-        | _filter_img_create
+        | _filter_img_create | _filter_qmp_empty_return

      # We want this to return an error because the block job is still running
      _send_qemu_cmd $QEMU_HANDLE \
          "{'execute': 'blockdev-del',
            'arguments': {'node-name': 'drv0'}}" \
-        'error' | _filter_generated_node_ids
+        'error' | _filter_generated_node_ids | _filter_qmp_empty_return

      _send_qemu_cmd $QEMU_HANDLE \
          "{'execute': 'block-job-cancel',
diff --git a/tests/qemu-iotests/141.out b/tests/qemu-iotests/141.out
index 4d71d9dcae..dbd3bdef6c 100644
--- a/tests/qemu-iotests/141.out
+++ b/tests/qemu-iotests/141.out
@@ -10,7 +10,6 @@ Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=1048576 
backing_file=TEST_DIR/m.
  Formatting 'TEST_DIR/o.IMGFMT', fmt=IMGFMT size=1048576 
backing_file=TEST_DIR/t.IMGFMT backing_fmt=IMGFMT
  {"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": 
"JOB_STATUS_CHANGE", "data": {"status": "created", "id": "job0"}}
  {"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": 
"JOB_STATUS_CHANGE", "data": {"status": "running", "id": "job0"}}
-{"return": {}}
  {"error": {"class": "GenericError", "desc": "Node drv0 is in use"}}
  {"return": {}}
  {"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": 
"JOB_STATUS_CHANGE", "data": {"status": "aborting", "id": "job0"}}
@@ -27,7 +26,6 @@ Formatting 'TEST_DIR/o.IMGFMT', fmt=IMGFMT size=1048576 
backing_file=TEST_DIR/t.
  {"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": 
"JOB_STATUS_CHANGE", "data": {"status": "running", "id": "job0"}}
  {"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": 
"JOB_STATUS_CHANGE", "data": {"status": "ready", "id": "job0"}}
  {"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": 
"BLOCK_JOB_READY", "data": {"device": "job0", "len": 0, "offset": 0, "speed": 
0, "type": "mirror"}}
-{"return": {}}
  {"error": {"class": "GenericError", "desc": "Node 'drv0' is busy: block 
device is in use by block job: mirror"}}
  {"return": {}}
  {"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": 
"JOB_STATUS_CHANGE", "data": {"status": "waiting", "id": "job0"}}
@@ -42,7 +40,6 @@ Formatting 'TEST_DIR/o.IMGFMT', fmt=IMGFMT size=1048576 
backing_file=TEST_DIR/t.
  {"return": {}}
  {"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": 
"JOB_STATUS_CHANGE", "data": {"status": "created", "id": "job0"}}
  {"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": 
"JOB_STATUS_CHANGE", "data": {"status": "running", "id": "job0"}}
-{"return": {}}
  {"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": 
"JOB_STATUS_CHANGE", "data": {"status": "ready", "id": "job0"}}
  {"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": 
"BLOCK_JOB_READY", "data": {"device": "job0", "len": 0, "offset": 0, "speed": 
0, "type": "commit"}}
  {"error": {"class": "GenericError", "desc": "Node 'drv0' is busy: block 
device is in use by block job: commit"}}
@@ -61,7 +58,6 @@ wrote 1048576/1048576 bytes at offset 0
  {"return": {}}
  {"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": 
"JOB_STATUS_CHANGE", "data": {"status": "created", "id": "job0"}}
  {"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": 
"JOB_STATUS_CHANGE", "data": {"status": "running", "id": "job0"}}
-{"return": {}}
  {"error": {"class": "GenericError", "desc": "Node drv0 is in use"}}
  {"return": {}}
  {"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": 
"JOB_STATUS_CHANGE", "data": {"status": "aborting", "id": "job0"}}
@@ -77,7 +73,6 @@ wrote 1048576/1048576 bytes at offset 0
  {"return": {}}
  {"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": 
"JOB_STATUS_CHANGE", "data": {"status": "created", "id": "job0"}}
  {"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": 
"JOB_STATUS_CHANGE", "data": {"status": "running", "id": "job0"}}
-{"return": {}}
  {"error": {"class": "GenericError", "desc": "Node drv0 is in use"}}
  {"return": {}}
  {"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": 
"JOB_STATUS_CHANGE", "data": {"status": "aborting", "id": "job0"}}
diff --git a/tests/qemu-iotests/common.filter b/tests/qemu-iotests/common.filter
index 35fddc746f..8e9235d6fe 100644
--- a/tests/qemu-iotests/common.filter
+++ b/tests/qemu-iotests/common.filter
@@ -219,5 +219,10 @@ _filter_nbd()
          -e 's#\(foo\|PORT/\?\|.sock\): Failed to .*$#\1#'
  }

+_filter_qmp_empty_return()
+{
+    grep -v '{"return": {}}'
+}
+
  # make sure this script returns success
  true




-- 
Best regards,
Vladimir

reply via email to

[Prev in Thread] Current Thread [Next in Thread]