qemu-block
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-block] [PATCH v2 01/13] blockjob: record time of last entrance


From: Max Reitz
Subject: Re: [Qemu-block] [PATCH v2 01/13] blockjob: record time of last entrance
Date: Wed, 7 Feb 2018 23:05:40 +0100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.5.2

On 2018-01-19 21:58, John Snow wrote:
> The mirror job makes a semi-inaccurate record of the last time we yielded
> by recording the last time we left a "pause", but this doesn't always
> correlate to the time we actually last successfully ceded control.
> 
> Record the time we last *exited* a yield centrally. In other words, record
> the time we began execution of this job to know how long we have been
> selfish for.
> 
> Signed-off-by: John Snow <address@hidden>
> ---
>  block/mirror.c           | 8 ++------
>  blockjob.c               | 2 ++
>  include/block/blockjob.h | 5 +++++
>  3 files changed, 9 insertions(+), 6 deletions(-)
> 
> diff --git a/block/mirror.c b/block/mirror.c
> index c9badc1203..88f4e8964d 100644
> --- a/block/mirror.c
> +++ b/block/mirror.c
> @@ -63,7 +63,6 @@ typedef struct MirrorBlockJob {
>      QSIMPLEQ_HEAD(, MirrorBuffer) buf_free;
>      int buf_free_count;
>  
> -    uint64_t last_pause_ns;
>      unsigned long *in_flight_bitmap;
>      int in_flight;
>      int64_t bytes_in_flight;
> @@ -596,8 +595,7 @@ static void mirror_throttle(MirrorBlockJob *s)
>  {
>      int64_t now = qemu_clock_get_ns(QEMU_CLOCK_REALTIME);
>  
> -    if (now - s->last_pause_ns > SLICE_TIME) {
> -        s->last_pause_ns = now;
> +    if (now - s->common.last_enter_ns > SLICE_TIME) {
>          block_job_sleep_ns(&s->common, 0);
>      } else {
>          block_job_pause_point(&s->common);
> @@ -769,7 +767,6 @@ static void coroutine_fn mirror_run(void *opaque)
>  
>      mirror_free_init(s);
>  
> -    s->last_pause_ns = qemu_clock_get_ns(QEMU_CLOCK_REALTIME);
>      if (!s->is_none_mode) {
>          ret = mirror_dirty_init(s);
>          if (ret < 0 || block_job_is_cancelled(&s->common)) {
> @@ -803,7 +800,7 @@ static void coroutine_fn mirror_run(void *opaque)
>           * We do so every SLICE_TIME nanoseconds, or when there is an error,
>           * or when the source is clean, whichever comes first.
>           */
> -        delta = qemu_clock_get_ns(QEMU_CLOCK_REALTIME) - s->last_pause_ns;
> +        delta = qemu_clock_get_ns(QEMU_CLOCK_REALTIME) - 
> s->common.last_enter_ns;

The horror.

>          if (delta < SLICE_TIME &&
>              s->common.iostatus == BLOCK_DEVICE_IO_STATUS_OK) {
>              if (s->in_flight >= MAX_IN_FLIGHT || s->buf_free_count == 0 ||
> @@ -878,7 +875,6 @@ static void coroutine_fn mirror_run(void *opaque)
>              delay_ns = (s->in_flight == 0 && cnt == 0 ? SLICE_TIME : 0);
>              block_job_sleep_ns(&s->common, delay_ns);
>          }
> -        s->last_pause_ns = qemu_clock_get_ns(QEMU_CLOCK_REALTIME);
>      }

Hmmm.  So you're now relying on block_job_sleep_ns() updating
last_enter_ns.  But it will only do so if we actually do sleep, which we
will not if the block job has been cancelled.

Then, last_enter_ns will stay at its current value indefinitely because
neither block_job_sleep_ns() nor block_job_pause_point() will yield.
OK, so at this point we should leave the mirror loop.  There are three
points where this is done:

(1) If s->ret < 0.  OK, let's hope it isn't.
(2) If cnt == 0 && should_complete.  We'll come back to that.
(3) If !s->synced && block_job_is_cancelled(...).  So basically now, if
the job has not emitted a READY event yet.

But if we have emitted the READY, we have to wait for cnt == 0 &&
should_complete (note that should_complete in turn will only be set if
s->in_flight == 0 && cnt == 0).  But unless delta < SLICE_TIME, we will
never do another mirror_iteration(), so unless we have already started
the necessary requests, they will never be started and we will loop forever.

So try this on master (I prefixed the QMP lines with in/out depending on
the direction -- note that it's important to have the block-job-cancel
on the same line the human-monitor-command finishes on so they are
executed right after each other!):

$ ./qemu-img create -f qcow2 foo.qcow2 64M
$ x86_64-softmmu/qemu-system-x86_64 -qmp stdio
In: {"QMP": {"version": {"qemu": {"micro": 50, "minor": 11, "major": 2},
"package": " (v2.9.0-632-g4a52d43-dirty)"}, "capabilities": []}}
Out: {"execute":"qmp_capabilities"}
In: {"return": {}}
Out: {"execute":"object-add","arguments":
      {"qom-type":"throttle-group","id":"tg0","props":{"limits":
         {"bps-total":16777216}}}}
In: {"return": {}}
Out: {"execute":"blockdev-add","arguments":
      {"node-name":"source","driver":"qcow2","file":
       {"driver":"file","filename":"foo.qcow2"}}}
In: {"return": {}}
Out: {"execute":"blockdev-add","arguments":
      {"node-name":"target","driver":"throttle",
       "throttle-group":"tg0","file":
       {"driver":"null-co","size":67108864}}}
In: {"return": {}}
Out: {"execute":"blockdev-mirror","arguments":
     {"job-id":"mirror","device":"source","target":"target",
      "sync":"full"}}
In: {"return": {}}
In: {"timestamp": {"seconds": 1518040566, "microseconds": 658111},
     "event": "BLOCK_JOB_READY", "data":
     {"device": "mirror", "len": 67108864, "offset": 67108864,
      "speed": 0, "type": "mirror"}}
Out: {"execute":"human-monitor-command","arguments":
      {"command-line":"qemu-io source \"write 0 64M\""}
     }{"execute":"block-job-cancel","arguments":{"device":"mirror"}}
In: wrote 67108864/67108864 bytes at offset 0
In: 64 MiB, 1 ops; 0.0310 sec (2.011 GiB/sec and 32.1823 ops/sec)
In: {"return": ""}
In: {"return": {}}
In: {"timestamp": {"seconds": 1518040578, "microseconds": 626669},
     "event": "BLOCK_JOB_COMPLETED", "data":
     {"device": "mirror", "len": 134217728, "offset": 134217728,
      "speed": 0, "type": "mirror"}}

But with your patch, the BLOCK_JOB_COMPLETED never appears (it should
take four seconds).

>  
>  immediate_exit:
> diff --git a/blockjob.c b/blockjob.c
> index f5cea84e73..2a9ff66b95 100644
> --- a/blockjob.c
> +++ b/blockjob.c
> @@ -321,6 +321,7 @@ void block_job_start(BlockJob *job)
>      job->pause_count--;
>      job->busy = true;
>      job->paused = false;
> +    job->last_enter_ns = qemu_clock_get_ns(QEMU_CLOCK_REALTIME);
>      bdrv_coroutine_enter(blk_bs(job->blk), job->co);
>  }
>  
> @@ -786,6 +787,7 @@ static void block_job_do_yield(BlockJob *job, uint64_t ns)
>      job->busy = false;
>      block_job_unlock();
>      qemu_coroutine_yield();
> +    job->last_enter_ns = qemu_clock_get_ns(QEMU_CLOCK_REALTIME);
>  
>      /* Set by block_job_enter before re-entering the coroutine.  */
>      assert(job->busy);
> diff --git a/include/block/blockjob.h b/include/block/blockjob.h
> index 00403d9482..e965845c94 100644
> --- a/include/block/blockjob.h
> +++ b/include/block/blockjob.h
> @@ -141,6 +141,11 @@ typedef struct BlockJob {
>       */
>      QEMUTimer sleep_timer;
>  
> +    /**
> +     * Timestamp of the last yield

A yield has two timestamps, one pre-yield, one post-yield.  Maybe it
should be noted that this records the post-yield one because that's not
what I'd assume just based on the comment, but OTOH the name of the
variable should be enough to clear that up.

Max

> +     */
> +    uint64_t last_enter_ns;
> +
>      /** Non-NULL if this job is part of a transaction */
>      BlockJobTxn *txn;
>      QLIST_ENTRY(BlockJob) txn_list;
> 


Attachment: signature.asc
Description: OpenPGP digital signature


reply via email to

[Prev in Thread] Current Thread [Next in Thread]