qemu-block
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v2 3/4] qcow2: add zoned emulation capability


From: Stefan Hajnoczi
Subject: Re: [PATCH v2 3/4] qcow2: add zoned emulation capability
Date: Tue, 22 Aug 2023 15:48:14 -0400

On Mon, Aug 14, 2023 at 04:58:01PM +0800, Sam Li wrote:
> By adding zone operations and zoned metadata, the zoned emulation
> capability enables full emulation support of zoned device using
> a qcow2 file. The zoned device metadata includes zone type,
> zoned device state and write pointer of each zone, which is stored
> to an array of unsigned integers.
> 
> Each zone of a zoned device makes state transitions following
> the zone state machine. The zone state machine mainly describes
> five states, IMPLICIT OPEN, EXPLICIT OPEN, FULL, EMPTY and CLOSED.
> READ ONLY and OFFLINE states will generally be affected by device
> internal events. The operations on zones cause corresponding state
> changing.
> 
> Zoned devices have a limit on zone resources, which puts constraints on
> write operations into zones.
> 
> Signed-off-by: Sam Li <faithilikerun@gmail.com>
> ---
>  block/qcow2.c          | 676 ++++++++++++++++++++++++++++++++++++++++-
>  block/qcow2.h          |   2 +
>  docs/interop/qcow2.txt |   2 +
>  3 files changed, 678 insertions(+), 2 deletions(-)
> 
> diff --git a/block/qcow2.c b/block/qcow2.c
> index c1077c4a4a..5ccf79cbe7 100644
> --- a/block/qcow2.c
> +++ b/block/qcow2.c
> @@ -194,6 +194,164 @@ qcow2_extract_crypto_opts(QemuOpts *opts, const char 
> *fmt, Error **errp)
>      return cryptoopts_qdict;
>  }
>  
> +#define QCOW2_ZT_IS_CONV(wp)    (wp & 1ULL << 59)
> +
> +static inline int qcow2_get_wp(uint64_t wp)
> +{
> +    /* clear state and type information */
> +    return ((wp << 5) >> 5);
> +}
> +
> +static inline int qcow2_get_zs(uint64_t wp)
> +{
> +    return (wp >> 60);
> +}
> +
> +static inline void qcow2_set_wp(uint64_t *wp, BlockZoneState zs)
> +{
> +    uint64_t addr = qcow2_get_wp(*wp);
> +    addr |= ((uint64_t)zs << 60);
> +    *wp = addr;
> +}
> +
> +/*
> + * File wp tracking: reset zone, finish zone and append zone can
> + * change the value of write pointer. All zone operations will change
> + * the state of that/those zone.
> + * */
> +static inline void qcow2_wp_tracking_helper(int index, uint64_t wp) {
> +    /* format: operations, the wp. */
> +    printf("wps[%d]: 0x%x\n", index, qcow2_get_wp(wp)>>BDRV_SECTOR_BITS);
> +}
> +
> +/*
> + * Perform a state assignment and a flush operation that writes the new wp
> + * value to the dedicated location of the disk file.
> + */
> +static int qcow2_write_wp_at(BlockDriverState *bs, uint64_t *wp,
> +                             uint32_t index, BlockZoneState zs) {
> +    BDRVQcow2State *s = bs->opaque;
> +    int ret;
> +
> +    qcow2_set_wp(wp, zs);
> +    ret = bdrv_pwrite(bs->file, s->zoned_header.zonedmeta_offset
> +        + sizeof(uint64_t) * index, sizeof(uint64_t), wp, 0);
> +
> +    if (ret < 0) {
> +        goto exit;

Should *wp be restored to its original value to undo the effect of
qcow2_set_wp()?

> +    }
> +    qcow2_wp_tracking_helper(index, *wp);
> +    return ret;
> +
> +exit:
> +    error_report("Failed to write metadata with file");
> +    return ret;
> +}
> +
> +static int qcow2_check_active(BlockDriverState *bs)

Please rename this to qcow2_check_active_zones() to avoid confusion with
other uses "active" in qcow2.

> +{
> +    BDRVQcow2State *s = bs->opaque;
> +
> +    if (!s->zoned_header.max_active_zones) {
> +        return 0;
> +    }
> +
> +    if (s->nr_zones_exp_open + s->nr_zones_imp_open + s->nr_zones_closed
> +        < s->zoned_header.max_active_zones) {
> +        return 0;
> +    }
> +
> +    return -1;
> +}

(This function could return a bool instead of 0/-1 since it doesn't
really need an int.)

> +
> +static int qcow2_check_open(BlockDriverState *bs)

qcow2_check_open_zones() or, even better, qcow2_can_open_zone().

> +{
> +    BDRVQcow2State *s = bs->opaque;
> +    int ret;
> +
> +    if (!s->zoned_header.max_open_zones) {
> +        return 0;
> +    }
> +
> +    if (s->nr_zones_exp_open + s->nr_zones_imp_open
> +        < s->zoned_header.max_open_zones) {
> +        return 0;
> +    }
> +
> +    if(s->nr_zones_imp_open) {
> +        ret = qcow2_check_active(bs);
> +        if (ret == 0) {
> +            /* TODO: it takes O(n) time complexity (n = nr_zones).
> +             * Optimizations required. */

One solution is to keep an implicitly open list. Then this operation is
O(1).

> +            /* close one implicitly open zones to make it available */
> +            for (int i = s->zoned_header.zone_nr_conv;
> +            i < bs->bl.nr_zones; ++i) {
> +                uint64_t *wp = &s->wps->wp[i];
> +                if (qcow2_get_zs(*wp) == BLK_ZS_IOPEN) {
> +                    ret = qcow2_write_wp_at(bs, wp, i, BLK_ZS_CLOSED);

I'm wondering if it's correct to store the zone state persistently in
the qcow2 file. If the guest or QEMU crashes, then zones will be left in
states like EOPEN. Since the guest software will have forgotten about
explicitly opened zones, the guest would need to recover zone states.
I'm not sure if existing software is designed to do that.

Damien: Should the zone state be persistent?

> +                    if (ret < 0) {
> +                        return ret;
> +                    }
> +                    s->wps->wp[i] = *wp;
> +                    s->nr_zones_imp_open--;
> +                    s->nr_zones_closed++;
> +                    break;
> +                }
> +            }
> +            return 0;

This returns 0 if there are no IOPEN zones available. It needs to return
an error when there are not enough resources available.

> +        }
> +        return ret;
> +    }
> +
> +    return -1;

This function mixes 0/-1 with -errno return values. -1 is -EPERM, but I
think that's not what you want.

Attachment: signature.asc
Description: PGP signature


reply via email to

[Prev in Thread] Current Thread [Next in Thread]