[Top][All Lists]

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [PATCH] qemu-coroutine-lock: fix co_queue multi-adding

From: Bin Wu
Subject: Re: [Qemu-devel] [PATCH] qemu-coroutine-lock: fix co_queue multi-adding bug
Date: Tue, 10 Feb 2015 09:08:22 +0800
User-agent: Mozilla/5.0 (Windows NT 6.1; rv:24.0) Gecko/20100101 Thunderbird/24.2.0

On 2015/2/9 18:12, Kevin Wolf wrote:
> Am 09.02.2015 um 10:36 hat Bin Wu geschrieben:
>> On 2015/2/9 16:12, Fam Zheng wrote:
>>> On Sat, 02/07 17:51, w00214312 wrote:
>>>> From: Bin Wu <address@hidden>
>>>> When a coroutine holds a lock, other coroutines who want to get
>>>> the lock must wait on a co_queue by adding themselves to the
>>>> CoQueue. However, if a waiting coroutine is woken up with the
>>>> lock still be holding by other coroutine, this waiting coroutine
>>> Could you explain who wakes up the waiting coroutine? Maybe the bug is that
>>> it shouldn't be awaken in the first place.
>> During the mirror phase with nbd devices, if we send a cancel command or
>> physical network breaks down, the source qemu process will receive a readable
>> event and the main loop will invoke nbd_reply_ready to deal with it. This
>> function finds the connection is down and then goes into
>> nbd_teardown_connection. nbd_teardown_connection wakes up all working 
>> coroutines
>> by nbd_recv_coroutines_enter_all. These coroutines include the one which 
>> holds
>> the sending lock, the ones which wait for the lock, and the ones which wait 
>> for
>> receiving messages.
> This is the bug. It's not allowed to reenter a coroutine if you don't
> know its state. NBD needs a fix, not the the generic coroutine
> infrastructure.
> If we want to change anything in the lock implementation, it should be
> adding an assertion to catch such violations of the rule. (Untested, but
> I think the assertion should hold true.)
> Kevin
> diff --git a/qemu-coroutine-lock.c b/qemu-coroutine-lock.c
> index e4860ae..25fc111 100644
> --- a/qemu-coroutine-lock.c
> +++ b/qemu-coroutine-lock.c
> @@ -123,9 +123,8 @@ void coroutine_fn qemu_co_mutex_lock(CoMutex *mutex)
>      trace_qemu_co_mutex_lock_entry(mutex, self);
> -    while (mutex->locked) {
> -        qemu_co_queue_wait(&mutex->queue);
> -    }
> +    qemu_co_queue_wait(&mutex->queue);
> +    assert(!mutex->locked);
>      mutex->locked = true;
> .

I see. paolo's patch in his first reply can fix the bug in NBD (not in
coroutine). I will complete that patch to do some tests and send another version

Bin Wu

reply via email to

[Prev in Thread] Current Thread [Next in Thread]