Re: [Qemu-devel] [PATCH] fix the co_queue multi-adding bug
From: Bin Wu
Subject: Re: [Qemu-devel] [PATCH] fix the co_queue multi-adding bug
Date: Tue, 10 Feb 2015 14:34:47 +0800
User-agent: Mozilla/5.0 (Windows NT 6.1; rv:24.0) Gecko/20100101 Thunderbird/24.2.0
On 2015/2/9 17:23, Paolo Bonzini wrote:
>
>
> On 07/02/2015 10:51, w00214312 wrote:
>> From: Bin Wu <address@hidden>
>>
>> When we test drive_mirror between different hosts over NBD devices,
>> we find that the qemu process sometimes crashes during the cancel phase.
>> The crash core file shows the following stack, which indicates that
>> a coroutine re-enter error occurred:
>
> This bug can probably be fixed simply by delaying the setting of
> recv_coroutine.
>
> What are the symptoms if you only apply your "qemu-coroutine-lock: fix
> co_queue multi-adding bug" patch but not "qemu-coroutine: fix
> qemu_co_queue_run_restart error"?
These two patches solve two different problems:
- "qemu-coroutine-lock: fix co_queue multi-adding bug" fixes the coroutine
re-enter problem, which occurs when we send a cancel command just after
drive_mirror has started.
- "qemu-coroutine: fix qemu_co_queue_run_restart error" fixes a segfault
that occurs during the drive_mirror phase when two VMs copy large files
between each other.
>
> Can you try the patch below? (Compile-tested only).
>
> diff --git a/block/nbd-client.c b/block/nbd-client.c
> index 6e1c97c..23d6a71 100644
> --- a/block/nbd-client.c
> +++ b/block/nbd-client.c
> @@ -104,10 +104,21 @@ static int nbd_co_send_request(NbdClientSession *s,
> QEMUIOVector *qiov, int offset)
> {
> AioContext *aio_context;
> - int rc, ret;
> + int rc, ret, i;
>
> qemu_co_mutex_lock(&s->send_mutex);
> +
> + for (i = 0; i < MAX_NBD_REQUESTS; i++) {
> + if (s->recv_coroutine[i] == NULL) {
> + s->recv_coroutine[i] = qemu_coroutine_self();
> + break;
> + }
> + }
> +
> + assert(i < MAX_NBD_REQUESTS);
> + request->handle = INDEX_TO_HANDLE(s, i);
> s->send_coroutine = qemu_coroutine_self();
> +
> aio_context = bdrv_get_aio_context(s->bs);
> aio_set_fd_handler(aio_context, s->sock,
> nbd_reply_ready, nbd_restart_write, s);
> @@ -164,8 +175,6 @@ static void nbd_co_receive_reply(NbdClientSession *s,
> static void nbd_coroutine_start(NbdClientSession *s,
> struct nbd_request *request)
> {
> - int i;
> -
> /* Poor man semaphore. The free_sema is locked when no other request
> * can be accepted, and unlocked after receiving one reply. */
> if (s->in_flight >= MAX_NBD_REQUESTS - 1) {
> @@ -174,15 +183,7 @@ static void nbd_coroutine_start(NbdClientSession *s,
> }
> s->in_flight++;
>
> - for (i = 0; i < MAX_NBD_REQUESTS; i++) {
> - if (s->recv_coroutine[i] == NULL) {
> - s->recv_coroutine[i] = qemu_coroutine_self();
> - break;
> - }
> - }
> -
> - assert(i < MAX_NBD_REQUESTS);
> - request->handle = INDEX_TO_HANDLE(s, i);
> +    /* s->recv_coroutine[i] is set as soon as we take the send_mutex. */
> }
>
> static void nbd_coroutine_end(NbdClientSession *s,
>
>
>
--
Bin Wu