
From: Eric Blake
Subject: Re: [Qemu-block] [PATCH 3/8] quorum: Implement .bdrv_co_readv/writev
Date: Mon, 21 Nov 2016 11:58:46 -0600
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Thunderbird/45.4.0

On 11/21/2016 11:31 AM, Kevin Wolf wrote:
> This converts the quorum block driver from implementing callback-based
> interfaces for read/write to coroutine-based ones. This is the first
> step that will allow us further simplification of the code.
> 
> Signed-off-by: Kevin Wolf <address@hidden>
> ---
>  block/quorum.c | 192 ++++++++++++++++++++++++++++++++++-----------------------
>  1 file changed, 115 insertions(+), 77 deletions(-)
> 

> @@ -174,14 +162,14 @@ static bool quorum_64bits_compare(QuorumVoteValue *a, QuorumVoteValue *b)
>  static QuorumAIOCB *quorum_aio_get(BlockDriverState *bs,
>                                     QEMUIOVector *qiov,
>                                     uint64_t sector_num,
> -                                   int nb_sectors,
> -                                   BlockCompletionFunc *cb,
> -                                   void *opaque)
> +                                   int nb_sectors)
>  {
>      BDRVQuorumState *s = bs->opaque;
> -    QuorumAIOCB *acb = qemu_aio_get(&quorum_aiocb_info, bs, cb, opaque);
> +    QuorumAIOCB *acb = g_new(QuorumAIOCB, 1);

Worth using g_new0() here...

>      int i;
>  
> +    acb->co = qemu_coroutine_self();
> +    acb->bs = bs;
>      acb->sector_num = sector_num;
>      acb->nb_sectors = nb_sectors;
>      acb->qiov = qiov;
> @@ -191,6 +179,7 @@ static QuorumAIOCB *quorum_aio_get(BlockDriverState *bs,
>      acb->rewrite_count = 0;
>      acb->votes.compare = quorum_sha256_compare;
>      QLIST_INIT(&acb->votes.vote_list);
> +    acb->has_completed = false;
>      acb->is_read = false;
>      acb->vote_ret = 0;

...to eliminate 0-assignments here? Not a show-stopper to leave it
as-is, though.
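
Something like this, purely as a sketch based on the fields visible in
this hunk (the qcrs setup and the rest of the function are elided):

/* Hypothetical g_new0() variant: the whole struct starts out zeroed,
 * so the explicit 0/false assignments become redundant. */
static QuorumAIOCB *quorum_aio_get(BlockDriverState *bs,
                                   QEMUIOVector *qiov,
                                   uint64_t sector_num,
                                   int nb_sectors)
{
    QuorumAIOCB *acb = g_new0(QuorumAIOCB, 1);

    acb->co = qemu_coroutine_self();
    acb->bs = bs;
    acb->sector_num = sector_num;
    acb->nb_sectors = nb_sectors;
    acb->qiov = qiov;
    acb->votes.compare = quorum_sha256_compare;
    QLIST_INIT(&acb->votes.vote_list);
    /* rewrite_count, has_completed, is_read, vote_ret (and any other
     * zero/false fields) are already zeroed by g_new0(). */

    return acb;
}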


> -static BlockAIOCB *read_fifo_child(QuorumAIOCB *acb);
> +static int read_fifo_child(QuorumAIOCB *acb);
>  
>  static void quorum_copy_qiov(QEMUIOVector *dest, QEMUIOVector *source)
>  {
> @@ -272,14 +261,14 @@ static void quorum_report_bad_acb(QuorumChildRequest *sacb, int ret)
>      QuorumAIOCB *acb = sacb->parent;
>      QuorumOpType type = acb->is_read ? QUORUM_OP_TYPE_READ : QUORUM_OP_TYPE_WRITE;
>      quorum_report_bad(type, acb->sector_num, acb->nb_sectors,
> -                      sacb->aiocb->bs->node_name, ret);
> +                      sacb->bs->node_name, ret);
>  }
>  
> -static void quorum_fifo_aio_cb(void *opaque, int ret)
> +static int quorum_fifo_aio_cb(void *opaque, int ret)
>  {
>      QuorumChildRequest *sacb = opaque;
>      QuorumAIOCB *acb = sacb->parent;
> -    BDRVQuorumState *s = acb->common.bs->opaque;
> +    BDRVQuorumState *s = acb->bs->opaque;
>  
>      assert(acb->is_read && s->read_pattern == QUORUM_READ_PATTERN_FIFO);
>  
> @@ -288,8 +277,7 @@ static void quorum_fifo_aio_cb(void *opaque, int ret)
>  
>          /* We try to read next child in FIFO order if we fail to read */
>          if (acb->children_read < s->num_children) {
> -            read_fifo_child(acb);
> -            return;
> +            return read_fifo_child(acb);
>          }

Question unrelated to this patch: in FIFO mode, are we doing work
sequentially or in parallel?  That is, does the quorum code kick off all
children simultaneously, then wait until the first child answers with
success (and abort all remaining children) or failure (at which point
the second child may already have an answer ready)?  Or does it only
kick off the first child, wait for a response, and not start the second
child until after the first child fails?  I guess one way has more
potentially wasted work (and is a stress test of our ability to cancel
work on secondary children), while the other has higher latencies, so
maybe it is something that a future quorum patch may want to make
configurable?

>  
> -static BlockAIOCB *read_fifo_child(QuorumAIOCB *acb)
> +static int read_fifo_child(QuorumAIOCB *acb)
>  {
> -    BDRVQuorumState *s = acb->common.bs->opaque;
> +    BDRVQuorumState *s = acb->bs->opaque;
>      int n = acb->children_read++;
> +    int ret;
>  
> -    acb->qcrs[n].aiocb = bdrv_aio_readv(s->children[n], acb->sector_num,
> -                                        acb->qiov, acb->nb_sectors,
> -                                        quorum_fifo_aio_cb, &acb->qcrs[n]);
> +    acb->qcrs[n].bs = s->children[n]->bs;
> +    ret = bdrv_co_preadv(s->children[n], acb->sector_num * BDRV_SECTOR_SIZE,
> +                         acb->nb_sectors * BDRV_SECTOR_SIZE, acb->qiov, 0);
> +    ret = quorum_fifo_aio_cb(&acb->qcrs[n], ret);

Somewhat answering myself - it looks like the current FIFO approach is
high-latency rather than parallel, in that at most one child request is
in flight at a time.
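
For reference, the converted path boils down to something like the
following sequential loop (a simplified sketch pieced together only from
the hunks quoted above, with the recursion between read_fifo_child() and
quorum_fifo_aio_cb() folded into one function):

/* FIFO-mode read after the conversion: children are tried strictly one
 * after another, so at most one request is in flight. */
static int read_fifo_child(QuorumAIOCB *acb)
{
    BDRVQuorumState *s = acb->bs->opaque;
    int ret;

    do {
        int n = acb->children_read++;

        acb->qcrs[n].bs = s->children[n]->bs;
        ret = bdrv_co_preadv(s->children[n],
                             acb->sector_num * BDRV_SECTOR_SIZE,
                             acb->nb_sectors * BDRV_SECTOR_SIZE,
                             acb->qiov, 0);
        if (ret < 0) {
            /* Report the bad child and fall through to the next one in
             * FIFO order, as quorum_fifo_aio_cb() does. */
            quorum_report_bad_acb(&acb->qcrs[n], ret);
        }
    } while (ret < 0 && acb->children_read < s->num_children);

    return ret;
}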

The conversion itself looks sane;
Reviewed-by: Eric Blake <address@hidden>

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org


