qemu-devel


From: Max Reitz
Subject: Re: [Qemu-devel] [PATCH v3 1/2] block: sync bdrv_co_get_block_status_above() with bdrv_is_allocated_above()
Date: Tue, 20 Sep 2016 01:18:12 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Thunderbird/45.2.0

On 2016-09-15 at 18:34, Denis V. Lunev wrote:
They should work very similar, covering same areas if backing store is
shorter than the image. This change is necessary for the followup patch
switching to bdrv_get_block_status_above() in mirror to avoid assert
in check_block.

This change should be made very carefully. Let us assume that we have
top image and 2 backing stores L0->L1->L2.
  L0: --------------
  L1: -------
  L2: -------=======
The data marked as '=' in L2 should not appear as BDRV_BLOCK_ALLOCATED
and we should return it as filled in L0 image with properly calculated
*pnum value.

Signed-off-by: Denis V. Lunev <address@hidden>
CC: Stefan Hajnoczi <address@hidden>
CC: Fam Zheng <address@hidden>
CC: Kevin Wolf <address@hidden>
CC: Max Reitz <address@hidden>
CC: Jeff Cody <address@hidden>
---
 block/io.c | 25 ++++++++++++++++++++-----
 1 file changed, 20 insertions(+), 5 deletions(-)

diff --git a/block/io.c b/block/io.c
index 420944d..067d465 100644
--- a/block/io.c
+++ b/block/io.c
@@ -1741,18 +1741,33 @@ static int64_t coroutine_fn bdrv_co_get_block_status_above(BlockDriverState *bs,
         BlockDriverState **file)
 {
     BlockDriverState *p;
-    int64_t ret = 0;
+    int64_t ret = 0, res = nb_sectors;

It's not wrong to make res an int64_t, but an int is sufficient.


     assert(bs != base);
     for (p = bs; p != base; p = backing_bs(p)) {
-        ret = bdrv_co_get_block_status(p, sector_num, nb_sectors, pnum, file);
-        if (ret < 0 || ret & BDRV_BLOCK_ALLOCATED) {
-            break;
+        int sc;
+        ret = bdrv_co_get_block_status(p, sector_num, nb_sectors, &sc, file);
+        if (ret < 0) {
+            return ret;
+        } else if (ret & BDRV_BLOCK_ALLOCATED) {
+            *pnum = sc;
+            return ret;
+        }
+
+        if (res > sc && (p == bs || sector_num + sc < p->total_sectors)) {
+            res = sc;

This definitely requires some comments, because it took me a long time to figure out why "res" needs to be a separate variable from "nb_sectors" and why this condition is written the way it is.

So what I think this does is:

Basically, we want to return our final nb_sectors in *pnum. But we can have the situation you noted in the commit message: a short intermediate layer, with the bottom layer having some data allocated beyond the end of that intermediate layer.

Now, when we pass through that intermediate layer, we need to shorten nb_sectors so that we don't query anything beyond its end, because whatever the lower layers contain past that point is not visible from the top layer anyway.

But we also want to remember that all of this area appears as unallocated to the top layer, so we have to keep a second variable ("res") which retains this information.

Therefore, nb_sectors is always exactly the range we want to query, and "res" is the range we know to appear unallocated. This condition here tries to adjust "res" so that it conforms to that specification.

However, I'm not quite sure it actually does that. Let's take the case from your commit message:

L0: --------------
L1: -------
L2: -------=======

Let's say we invoke this function in the range [0, 14] (that is, sector_num = 0 and nb_sectors = 14). After passing through L0, res is 14 and nb_sectors is 14. After L1, res is still 14, but nb_sectors is 7. So far, so good.

But when passing through L2, "sc" will be 7 (and it will actually always be 7, regardless of what comes past sector 7, because nb_sectors is 7). Since L2 is larger than just 7 sectors, we will now reduce res to 7 as well (because sector_num + sc (= 0 + 7 = 7) < p->total_sectors (= 14)).

Therefore, we will set *pnum to 7. That doesn't seem too bad to me, but we could have achieved the same result by just setting *pnum to nb_sectors, without having to track the separate "res" variable.
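To make the walkthrough above easy to re-check, here is a minimal toy model of the loop from this hunk. It is only a sketch: Layer, toy_get_block_status(), toy_status_above() and commit_message_example() are hypothetical names, ALLOCATED stands in for BDRV_BLOCK_ALLOCATED, and the status helper assumes that a query is clamped at the end of the layer and reports a single run of uniform allocation status. It is not the real bdrv_co_get_block_status().

```c
#include <assert.h>
#include <stdint.h>

#define MIN(a, b) ((a) < (b) ? (a) : (b))
#define ALLOCATED 1 /* stand-in for BDRV_BLOCK_ALLOCATED */

/* Hypothetical toy layer: sectors >= alloc_start are allocated. */
typedef struct Layer {
    int64_t total_sectors;
    int64_t alloc_start;
} Layer;

/* Assumed model of a per-layer status query: clamp the request at the end
 * of the layer and report one run with uniform allocation status. */
static int64_t toy_get_block_status(const Layer *l, int64_t sector_num,
                                    int nb_sectors, int *pnum)
{
    int64_t n;

    if (sector_num >= l->total_sectors) {
        *pnum = 0;
        return 0;
    }
    n = MIN((int64_t)nb_sectors, l->total_sectors - sector_num);
    if (sector_num >= l->alloc_start) {
        *pnum = (int)n;
        return ALLOCATED;
    }
    *pnum = (int)MIN(n, l->alloc_start - sector_num);
    return 0;
}

/* The loop from the v3 hunk, over an array of layers (index 0 is the top,
 * so "i == 0" plays the role of "p == bs"). */
static int64_t toy_status_above(const Layer *layers, int count,
                                int64_t sector_num, int nb_sectors, int *pnum)
{
    int64_t ret = 0, res = nb_sectors;
    int i;

    for (i = 0; i < count; i++) {
        const Layer *p = &layers[i];
        int sc;

        ret = toy_get_block_status(p, sector_num, nb_sectors, &sc);
        if (ret < 0) {
            return ret;
        } else if (ret & ALLOCATED) {
            *pnum = sc;
            return ret;
        }
        if (res > sc && (i == 0 || sector_num + sc < p->total_sectors)) {
            res = sc;
        }
        nb_sectors = MIN(nb_sectors, sc);
        if (nb_sectors == 0) {
            break;
        }
    }
    *pnum = (int)res;
    return ret;
}

/* L0/L1/L2 from the commit message, queried at sector 0 for 14 sectors. */
static int commit_message_example(void)
{
    const Layer layers[] = {
        { 14, 14 }, /* L0: 14 sectors, nothing allocated */
        {  7,  7 }, /* L1:  7 sectors, nothing allocated */
        { 14,  7 }, /* L2: 14 sectors, the '=' tail is allocated */
    };
    int pnum = -1;
    int64_t ret = toy_status_above(layers, 3, 0, 14, &pnum);

    assert(ret == 0);
    return pnum;
}
```

Under these assumptions, commit_message_example() returns 7, which is also the final value of nb_sectors in the loop.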


Thus, I'm not quite sure what the point of this is. "res" will only be longer than "nb_sectors" as long as the layers get shorter or stay the same length when going downwards. As soon as one layer is longer than the one above it, "res" will probably be truncated to "sc" (which is going to be the same value as "nb_sectors", unless bdrv_co_get_block_status() returns a *pnum > nb_sectors).
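The claim about layer lengths can be checked with an even smaller sketch that only tracks the two counters. Everything here is hypothetical: Walk and walk_unallocated() are illustrative names, every layer is assumed to be unallocated throughout, and each per-layer query is assumed to report sc = min(nb_sectors, sectors left in that layer).

```c
#include <assert.h>
#include <stdint.h>

#define MIN(a, b) ((a) < (b) ? (a) : (b))

/* Final values of the two counters from the v3 loop. */
typedef struct Walk {
    int64_t res;
    int64_t nb_sectors;
} Walk;

/* Hypothetical helper: walk a chain of fully unallocated layers, given only
 * their lengths (index 0 is the top layer). */
static Walk walk_unallocated(const int64_t *total_sectors, int count,
                             int64_t sector_num, int64_t nb_sectors)
{
    Walk w = { nb_sectors, nb_sectors };
    int i;

    for (i = 0; i < count; i++) {
        int64_t left = total_sectors[i] - sector_num;
        /* Assumed query result: clamped at the end of this layer. */
        int64_t sc = left > 0 ? MIN(w.nb_sectors, left) : 0;

        /* Same condition as in the patch, with "i == 0" for "p == bs". */
        if (w.res > sc && (i == 0 || sector_num + sc < total_sectors[i])) {
            w.res = sc;
        }
        w.nb_sectors = MIN(w.nb_sectors, sc);
        if (w.nb_sectors == 0) {
            break;
        }
    }
    return w;
}
```

In this model, a chain whose lengths only shrink going down (14, 7, 4) ends with res = 14 and nb_sectors = 4, while any chain in which a lower layer is longer than the current nb_sectors, such as (14, 7, 14) or (14, 7, 4, 14), ends with res equal to nb_sectors.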

I'm not sure whether I'm missing something here, though.

         }
+
         /* [sector_num, pnum] unallocated on this layer, which could be only

The "pnum" here should be changed to "sc".

          * the first part of [sector_num, nb_sectors].  */
-        nb_sectors = MIN(nb_sectors, *pnum);
+        nb_sectors = MIN(nb_sectors, sc);
+
+        if (nb_sectors == 0) {
+            break;

While I can see that in this case ret would be 0, I think it wouldn't hurt to add an explicit "ret = 0;" here, too.

Max

+        }
     }
+
+    *pnum = res;
     return ret;
 }





