[Top][All Lists]

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

[Qemu-block] [PATCH 0/4] throttle: Race condition fixes and test cases

From: Alberto Garcia
Subject: [Qemu-block] [PATCH 0/4] throttle: Race condition fixes and test cases
Date: Thu, 2 Aug 2018 17:50:22 +0300

Hi all,

here are the patches that I promised yesterday.

I was originally thinking to propose this for the v3.0 release, but
after debugging and fixing the problem I think that it's not
essential (details below).

The important patch is the second one. The first and the third are
just test cases and the last is an alternative solution for the bug
that Stefan fixed in 6fccbb475bc6effc313ee9481726a1748b6dae57.

There are details in the patches themselves, but here's an explanation
of the problem: consider a scenario with two drives A and B that are
part of the same throttle group. Both of them have throttled requests
and they're waiting for a timer that is set on drive A.

    (timer here) -->  [A]  ---  req1, req2
                      [B]  ---  req3

If we drain drive [A] (e.g. by disabling its I/O limits) then its
queue is restarted. req1 is processed immediately, and before
finishing it calls schedule_next_request(). This follows the
round-robin algorithm, selects req3 and puts a timer in [B].

But we're still not done with draining [A], and now we have a
BDRV_POLL_WHILE() loop at the end of bdrv_do_drained_begin() waiting
for req2 to finish. That won't happen until the timer in [B] fires and
req3 is done. If there are more drives in the group and more requests
in the queue this can take a while. That's why disabling a drive's I/O
limits can be noticeably slow: we disabled the I/O limits but they're
still being enforced in practice.

The QEMU I/O tests run in qtest mode (with QEMU_CLOCK_VIRTUAL). The
clock must be advanced manually, which means that the scenario that I
just described hangs QEMU because BDRV_POLL_WHILE() loops forever (you
can reproduce this with patch 3). In a real world scenario this only
results in the aforementioned slowdown (probably negligible in
practice), which is not a critical thing, and that's why I think it's
safe to keep the current code for QEMU 3.

I think that's all. Questions and commend are welcome.


Alberto Garcia (4):
  qemu-iotests: Test removing a throttle group member with a pending
  throttle-groups: Skip the round-robin if a member is being drained
  qemu-iotests: Update 093 to improve the draining test
  throttle-groups: Don't allow timers without throttled requests

 block/throttle-groups.c    | 41 +++++++++++++++++++++++++---------
 tests/qemu-iotests/093     | 55 ++++++++++++++++++++++++++++++++++++++++++++++
 tests/qemu-iotests/093.out |  4 ++--
 3 files changed, 88 insertions(+), 12 deletions(-)


reply via email to

[Prev in Thread] Current Thread [Next in Thread]