From: Alberto Garcia
Subject: Re: [Qemu-devel] [PATCH] throttle-groups: fix hang when group member leaves
Date: Tue, 31 Jul 2018 18:47:53 +0200
User-agent: Notmuch/0.18.2 (http://notmuchmail.org) Emacs/24.4.1 (i586-pc-linux-gnu)

On Wed 04 Jul 2018 04:54:10 PM CEST, Stefan Hajnoczi wrote:
> Throttle groups consist of members sharing one throttling state
> (including bps/iops limits).  Round-robin scheduling is used to ensure
> fairness.  If a group member already has a timer pending then other
> group members do not schedule their own timers.  The next group
> member will have its turn when the existing timer expires.
> A hang may occur when a group member leaves while it had a timer
> scheduled.

Ok, I can reproduce this if I run fio with iodepth=1.

We're draining the BDS before removing it from a throttle group, and
therefore there cannot be any pending requests.

So the problem seems to be that when throttle_co_drain_begin() runs the
pending requests from a member using throttle_group_co_restart_queue(),
it simply uses qemu_co_queue_next() and doesn't touch the timer at all.

So it can happen that there's a request in the queue waiting for a
timer, and after that call the request is gone but the timer remains.
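
The sequence above can be illustrated with a toy model. This is only a
sketch, not QEMU code: ToyGroup, ToyMember and the toy_* functions are
hypothetical names, and a plain bool stands in for the real timer. Only
the idea of a group-wide "a timer is armed" flag is borrowed from the
discussion.

```c
#include <stdbool.h>

/* Toy model, not QEMU code: one group-wide flag plus per-member state. */
typedef struct {
    bool any_timer_armed;   /* some member of the group has a timer pending */
} ToyGroup;

typedef struct {
    int queued;             /* requests waiting for the timer to fire */
    bool timer_pending;     /* this member owns the armed timer */
} ToyMember;

/* A throttled request is queued; a timer is armed only if none is armed. */
static void toy_submit(ToyGroup *g, ToyMember *m)
{
    m->queued++;
    if (!g->any_timer_armed) {
        g->any_timer_armed = true;
        m->timer_pending = true;
    }
}

/* Drain as described above: restart the queue, leave the timer alone. */
static void toy_drain(ToyGroup *g, ToyMember *m)
{
    (void)g;                /* the group flag is deliberately not touched */
    m->queued = 0;
}

/* True when the queue is empty but the member's timer is still armed. */
static bool toy_timer_is_stale(const ToyGroup *g, const ToyMember *m)
{
    return g->any_timer_armed && m->timer_pending && m->queued == 0;
}
```

In this model, once a member has been drained, a second member that
submits a request still sees the flag set and does not arm a timer of
its own.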

The current patch is perhaps not worth touching at this point (we're
about to release QEMU 3.0), but I think that a better solution would be
to either

a) cancel the existing timer and reset tg->any_timer_armed on the given
   tgm after throttle_group_co_restart_queue() and before
   schedule_next_request() if the queue is empty.

b) force the existing timer to run immediately instead of calling
   throttle_group_co_restart_queue(). Seems cleaner, but I haven't tried
   this one yet.
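
Option (a) could look roughly like this in a toy model. Again this is a
sketch under assumptions, not a patch: SketchGroup, SketchMember and
the sketch_* functions are hypothetical, a bool stands in for the
timer, and the real code would still need to call
schedule_next_request() afterwards.

```c
#include <stdbool.h>

/* Toy model, not QEMU code. */
typedef struct {
    bool any_timer_armed;   /* some member of the group has a timer pending */
} SketchGroup;

typedef struct {
    int queued;             /* requests waiting for the timer to fire */
    bool timer_pending;     /* this member owns the armed timer */
} SketchMember;

/* A throttled request is queued; a timer is armed only if none is armed. */
static void sketch_submit(SketchGroup *g, SketchMember *m)
{
    m->queued++;
    if (!g->any_timer_armed) {
        g->any_timer_armed = true;
        m->timer_pending = true;
    }
}

/* Option (a): restart the queue, then drop the timer that has no waiters. */
static void sketch_drain_with_cancel(SketchGroup *g, SketchMember *m)
{
    m->queued = 0;                      /* restart all queued requests */
    if (m->timer_pending) {             /* queue is empty at this point */
        m->timer_pending = false;       /* stands in for cancelling the timer */
        g->any_timer_armed = false;     /* let other members arm their own */
    }
}
```

With the flag reset, the next member that submits a request arms its
own timer instead of waiting for one that no longer has anything to
wake up.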

I'll explore them a bit and send a patch.

