Re: [PATCH v3 08/10] nbd/server: introduce NBDExtentArray

From:	Vladimir Sementsov-Ogievskiy
Subject:	Re: [PATCH v3 08/10] nbd/server: introduce NBDExtentArray
Date:	Tue, 21 Jan 2020 10:25:24 +0000
20.01.2020 23:20, Eric Blake wrote:
> On 12/19/19 4:03 AM, Vladimir Sementsov-Ogievskiy wrote:
>> Introduce NBDExtentArray class, to handle extents list creation in more
>> controlled way and with less OUT parameters in functions.
> 
> s/less/fewer/
> 
>>
>> Signed-off-by: Vladimir Sementsov-Ogievskiy <address@hidden>
>> ---
>>   nbd/server.c | 201 ++++++++++++++++++++++++++++-----------------------
>>   1 file changed, 109 insertions(+), 92 deletions(-)
>>
>> diff --git a/nbd/server.c b/nbd/server.c
>> index a4b348eb32..cc722adc31 100644
>> --- a/nbd/server.c
>> +++ b/nbd/server.c
>> @@ -1909,27 +1909,89 @@ static int coroutine_fn nbd_co_send_sparse_read(NBDClient *client,
>>       return ret;
>>   }
>> +typedef struct NBDExtentArray {
>> +    NBDExtent *extents;
>> +    unsigned int nb_alloc;
>> +    unsigned int count;
>> +    uint64_t total_length;
>> +    bool converted; /* extents are converted to BE, no more changes allowed */
>> +} NBDExtentArray;
>> +
> 
> Looks good.
> 
>> +static NBDExtentArray *nbd_extent_array_new(unsigned int nb_alloc)
>> +{
>> +    NBDExtentArray *ea = g_new0(NBDExtentArray, 1);
>> +
>> +    ea->nb_alloc = nb_alloc;
>> +    ea->extents = g_new(NBDExtent, nb_alloc);
> 
> I guess g_new() is okay rather than g_new0(), as long as we are careful not to 
> read that uninitialized memory.
> 
>> +
>> +    return ea;
>> +}
>> +
>> +static void nbd_extent_array_free(NBDExtentArray *ea)
>> +{
>> +    g_free(ea->extents);
>> +    g_free(ea);
>> +}
>> +G_DEFINE_AUTOPTR_CLEANUP_FUNC(NBDExtentArray, nbd_extent_array_free);
>> +
>> +/* Further modifications of the array after conversion are abandoned */
>> +static void nbd_extent_array_convert_to_be(NBDExtentArray *ea)
>> +{
>> +    int i;
>> +
>> +    if (ea->converted) {
>> +        return;
>> +    }
> 
> Would this be better as assert(!ea->converted), to ensure we aren't buggy in 
> our usage? ...

No: by the time we get here, the array may or may not have already been
converted automatically by nbd_extent_array_add().

But your question shows that my design is weird. Now I think it's better to
add a separate boolean field to ea for nbd_extent_array_add() safety, instead
of reusing .converted.
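
A minimal standalone sketch of that idea, using simplified stand-in types
rather than the real nbd/server.c code. The names ExtentArray, can_add,
converted_to_be and to_be32() are assumptions for illustration only. With two
flags, a failed add only closes the array, so the conversion can assert that
it runs exactly once:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

typedef struct Extent {
    uint32_t length;
    uint32_t flags;
} Extent;

typedef struct ExtentArray {
    Extent *extents;
    unsigned int nb_alloc;
    unsigned int count;
    uint64_t total_length;
    bool can_add;         /* hypothetical: false after a failed add or conversion */
    bool converted_to_be; /* hypothetical: true once fields are big-endian */
} ExtentArray;

/* Portable host-to-big-endian helper, standing in for cpu_to_be32() */
static uint32_t to_be32(uint32_t v)
{
    unsigned char b[4] = {
        (unsigned char)(v >> 24), (unsigned char)(v >> 16),
        (unsigned char)(v >> 8), (unsigned char)v
    };
    uint32_t out;

    memcpy(&out, b, sizeof(out));
    return out;
}

static ExtentArray *extent_array_new(unsigned int nb_alloc)
{
    ExtentArray *ea;

    assert(nb_alloc > 0);
    ea = calloc(1, sizeof(*ea));
    if (!ea) {
        abort();
    }
    ea->extents = malloc(nb_alloc * sizeof(Extent));
    if (!ea->extents) {
        abort();
    }
    ea->nb_alloc = nb_alloc;
    ea->can_add = true;
    return ea;
}

static void extent_array_free(ExtentArray *ea)
{
    free(ea->extents);
    free(ea);
}

/* Conversion happens exactly once, so it can assert instead of returning */
static void extent_array_convert_to_be(ExtentArray *ea)
{
    unsigned int i;

    assert(!ea->converted_to_be);
    ea->converted_to_be = true;
    ea->can_add = false;

    for (i = 0; i < ea->count; i++) {
        ea->extents[i].flags = to_be32(ea->extents[i].flags);
        ea->extents[i].length = to_be32(ea->extents[i].length);
    }
}

/* Returns 0 on success, -1 when full; a failure closes the array */
static int extent_array_add(ExtentArray *ea, uint32_t length, uint32_t flags)
{
    assert(ea->can_add);

    if (!length) {
        return 0;
    }

    /* Extend previous extent if flags match and the sum fits in 32 bits */
    if (ea->count > 0 && flags == ea->extents[ea->count - 1].flags) {
        uint64_t sum = (uint64_t)length + ea->extents[ea->count - 1].length;

        if (sum <= UINT32_MAX) {
            ea->extents[ea->count - 1].length = (uint32_t)sum;
            ea->total_length += length;
            return 0;
        }
    }

    if (ea->count >= ea->nb_alloc) {
        ea->can_add = false;
        return -1;
    }

    ea->total_length += length;
    ea->extents[ea->count] = (Extent) {.length = length, .flags = flags};
    ea->count++;
    return 0;
}
```

In this sketch the caller is expected to stop adding after the first -1 and
then call the conversion once before sending the reply.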

> 
>> +    ea->converted = true;
>> +
>> +    for (i = 0; i < ea->count; i++) {
>> +        ea->extents[i].flags = cpu_to_be32(ea->extents[i].flags);
>> +        ea->extents[i].length = cpu_to_be32(ea->extents[i].length);
>> +    }
>> +}
>> +
>>   /*
>> - * Populate @extents from block status. Update @bytes to be the actual
>> - * length encoded (which may be smaller than the original), and update
>> - * @nb_extents to the number of extents used.
>> - *
>> - * Returns zero on success and -errno on bdrv_block_status_above failure.
>> + * Add extent to NBDExtentArray. If extent can't be added (no available space),
>> + * return -1.
>> + * For safety, when returning -1 for the first time, the array is converted
>> + * to BE and further modifications are abandoned.
>>    */
>> -static int blockstatus_to_extents(BlockDriverState *bs, uint64_t offset,
>> -                                  uint64_t *bytes, NBDExtent *extents,
>> -                                  unsigned int *nb_extents)
>> +static int nbd_extent_array_add(NBDExtentArray *ea,
>> +                                uint32_t length, uint32_t flags)
>>   {
>> -    uint64_t remaining_bytes = *bytes;
>> -    NBDExtent *extent = extents, *extents_end = extents + *nb_extents;
>> -    bool first_extent = true;
>> +    assert(!ea->converted);
> 
> ...especially since you assert here.
> 
>> +
>> +    if (!length) {
>> +        return 0;
>> +    }
>> +
>> +    /* Extend previous extent if flags are the same */
>> +    if (ea->count > 0 && flags == ea->extents[ea->count - 1].flags) {
>> +        ea->extents[ea->count - 1].length += length;
>> +        ea->total_length += length;
>> +        return 0;
>> +    }
> 
> The NBD spec states that NBD_CMD_BLOCK_STATUS with flag NBD_CMD_FLAG_REQ_ONE 
> must not exceed the original length of the client's request, but that when 
> the flag is not present, the final extent may indeed go beyond the client's 
> request.  I see two potential problems here:
> 
> 1) I don't see any check that extending .length does not exceed the client's 
> request if NBD_CMD_FLAG_REQ_ONE was set (we can sort of tell if that is the 
> case based on whether nb_alloc is 1 or greater than 1, but not directly here, 
> and it seems like this is a better place to do a common check than to make 
> each caller repeat it).

We have two callers. blockstatus_to_extents() can't exceed the requested range,
and bitmaps_to_extents() has its own check. If we want to move the check into
nbd_extent_array_add(), we need to enhance its interface so that it can add
only part of an extent. And how should that be handled?
Mark the array "closed" after the first partially applied extent, but return
success? Then we'd have to change the assertion at the start of _add:
s/assert(ea->can_add)/if (!ea->can_add) {return -1}/. Or return the count of
bytes actually applied to the caller?

I doubt that this is a good idea. It seems simpler to keep the NBD extent array
unaware of the length limit, keeping in mind that the following patch will
eliminate any overrun of the requested range.

> 
> 2) I don't see any check that extending .length does not exceed 32 bits.  If 
> the client requested status of 3.5G, but the caller divides that into two 
> extent additions of 3G each and with the same flags, we could end up 
> overflowing the 32-bit reply field (not necessarily fatal except when the 
> overflow is exactly at 4G, because as long as the server is making progress, 
> the client will eventually get all data; it is only when the overflow wraps 
> to exactly 0 that we quit making progress). 32-bit overflow is one case where 
> the server HAS to return back-to-back extents with the same flags (if it is 
> going to return information on that many bytes, rather than truncating its 
> reply to just the first extent < 4G).

Good catch. I'll write it like this:

      /* Extend previous extent if flags are the same */
      if (ea->count > 0 && flags == ea->extents[ea->count - 1].flags) {
          uint64_t sum = (uint64_t)length + ea->extents[ea->count - 1].length;

          if (sum <= UINT32_MAX) {
              ea->extents[ea->count - 1].length = sum;
              ea->total_length += length;
              return 0;
          }
      }
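
To make the boundary concrete, here is a small standalone sketch of the same
check outside of QEMU. The helper names try_extend_length() and
naive_extend_length() are hypothetical and exist only for this illustration.
The first does the addition in 64 bits and refuses to merge when the result
does not fit in the 32-bit length field; the second shows the wrap that the
check avoids:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: grow *prev_length by length only if the result
 * still fits in 32 bits. Returns false (and leaves *prev_length alone)
 * when the caller has to start a new extent instead.
 */
static bool try_extend_length(uint32_t *prev_length, uint32_t length)
{
    uint64_t sum = (uint64_t)*prev_length + length;

    if (sum > UINT32_MAX) {
        return false;
    }
    *prev_length = (uint32_t)sum;
    return true;
}

/* The unchecked variant: unsigned 32-bit addition silently wraps */
static uint32_t naive_extend_length(uint32_t prev_length, uint32_t length)
{
    return prev_length + length;
}
```

With these, two 3G additions are refused as a merge, so a second extent with
the same flags is emitted, while the naive variant turns 2G + 2G into a
zero-length extent, which is the no-progress case you describe.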

> 
>> +
>> +    if (ea->count >= ea->nb_alloc) {
>> +        nbd_extent_array_convert_to_be(ea);
>> +        return -1;
>> +    }
>> +
>> +    ea->total_length += length;
>> +    ea->extents[ea->count] = (NBDExtent) {.length = length, .flags = flags};
>> +    ea->count++;
>> -    assert(*nb_extents);
>> -    while (remaining_bytes) {
>> +    return 0;
>> +}
>> +
>> +static int blockstatus_to_extents(BlockDriverState *bs, uint64_t offset,
>> +                                  uint64_t bytes, NBDExtentArray *ea)
>> +{
>> +    while (bytes) {
>>           uint32_t flags;
>>           int64_t num;
>> -        int ret = bdrv_block_status_above(bs, NULL, offset, remaining_bytes,
>> -                                          &num, NULL, NULL);
>> +        int ret = bdrv_block_status_above(bs, NULL, offset, bytes, &num,
>> +                                          NULL, NULL);
>>           if (ret < 0) {
>>               return ret;
>> @@ -1938,60 +2000,37 @@ static int blockstatus_to_extents(BlockDriverState *bs, uint64_t offset,
>>           flags = (ret & BDRV_BLOCK_ALLOCATED ? 0 : NBD_STATE_HOLE) |
>>                   (ret & BDRV_BLOCK_ZERO      ? NBD_STATE_ZERO : 0);
>> -        if (first_extent) {
>> -            extent->flags = flags;
>> -            extent->length = num;
>> -            first_extent = false;
>> -        } else if (flags == extent->flags) {
>> -            /* extend current extent */
>> -            extent->length += num;
>> -        } else {
>> -            if (extent + 1 == extents_end) {
>> -                break;
>> -            }
>> -
>> -            /* start new extent */
>> -            extent++;
>> -            extent->flags = flags;
>> -            extent->length = num;
>> +        if (nbd_extent_array_add(ea, num, flags) < 0) {
>> +            return 0;
>>           }
>> -        offset += num;
>> -        remaining_bytes -= num;
>> -    }
> 
> However, I _do_ like the refactoring on making the rest of the code easier to 
> read.
> 
>> -
>> -    extents_end = extent + 1;
>> -    for (extent = extents; extent < extents_end; extent++) {
>> -        extent->flags = cpu_to_be32(extent->flags);
>> -        extent->length = cpu_to_be32(extent->length);
>> +        offset += num;
>> +        bytes -= num;
>>       }
>> -    *bytes -= remaining_bytes;
>> -    *nb_extents = extents_end - extents;
>> -
>>       return 0;
>>   }
> 
> I think this needs v4 to fix the boundary cases, but I like where it is 
> headed.
> 


-- 
Best regards,
Vladimir