qemu-block
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH 6/6] jobs: ensure sleep in job_sleep_ns is fully performed


From: Vladimir Sementsov-Ogievskiy
Subject: Re: [PATCH 6/6] jobs: ensure sleep in job_sleep_ns is fully performed
Date: Thu, 10 Feb 2022 18:02:23 +0300
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101 Thunderbird/91.2.0

08.02.2022 18:36, Emanuele Giuseppe Esposito wrote:
If a drain happens while a job is sleeping, the timeout
gets cancelled and the job continues once the drain ends.
This is especially bad for the sleep performed in commit and stream
jobs, since that is dictated by ratelimit to maintain a certain speed.

Basically the execution path is the following:
1. job calls job_sleep_ns, and yield with a timer in @ns ns.
2. meanwhile, a drain is executed, and
    child_job_drained_{begin/end} could be executed as ->drained_begin()
    and ->drained_end() callbacks.
    Therefore child_job_drained_begin() enters the job, that continues
    execution in job_sleep_ns() and calls job_pause_point_locked().
3. job_pause_point_locked() detects that we are in the middle of a
    drain, and firstly deletes any existing timer and then yields again,
    waiting for ->drained_end().
4. Once draining is finished, child_job_drained_end() runs and resumes
    the job. At this point, the timer has been lost and we just resume
    without checking if enough time has passed.

This fix implies that from now onwards, job_sleep_ns will force the job
to sleep @ns, even if it is wake up (purposefully or not) in the middle
of the sleep. Therefore qemu-iotests test might run a little bit slower,
depending on the speed of the job. Setting a job speed to values like "1"
is not allowed anymore (unless you want to wait forever).

Because of this fix, test_stream_parallel() in tests/qemu-iotests/030
takes too long, since speed of stream job is just 1024 and before
it was skipping all the wait thanks to the drains. Increase the
speed to 256 * 1024. Exactly the same happens for test 151.

Instead we need to sleep less in test_cancel_ready() test-blockjob.c,
so that the job will be able to exit the sleep and transition to ready
before the main loop asserts.

Signed-off-by: Emanuele Giuseppe Esposito<eesposit@redhat.com>

Acked-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>

--
Best regards,
Vladimir



reply via email to

[Prev in Thread] Current Thread [Next in Thread]