[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [Qemu-devel] [PATCH] migration: avoid starting a new migration task
From: |
Paolo Bonzini |
Subject: |
Re: [Qemu-devel] [PATCH] migration: avoid starting a new migration task while the previous one still exist |
Date: |
Wed, 06 Nov 2013 10:04:36 +0100 |
User-agent: |
Mozilla/5.0 (X11; Linux x86_64; rv:17.0) Gecko/20130923 Thunderbird/17.0.9 |
Il 06/11/2013 02:50, Zhanghaoyu (A) ha scritto:
>>>>> Avoid starting a new migration task while the previous one still
>>>> exist.
>>>>
>>>> Can you explain how to reproduce the problem?
>>>>
>>> When network disconnection between source and destination happened,
>>> the migration thread stuck at below stack,
>>> Then I cancel the migration task, the migration state in qemu will be set
>>> to MIG_STATE_CANCELLED, so the migration job in libvirt quits.
>>> Then I perform migration again, at this time, the network reconnected
>>> successfully, since the TCP timeout retransmission, above stack will not
>>> return immediately, so two migration tasks exist at the same time.
>>> And still worse, source qemu will crash, because of accessing the NULL
>>> pointer in qemu_bh_schedule(s->cleanup_bh); statement in latter migration
>>> task, since the "s->cleanup_bh" had been deleted by previous migration task.
>>
>> Thanks for explaining. CANCELLING looks like a useful addition.
>>
>> Why do you need both CANCELLING and COMPLETING? The COMPLETED state should
>> be set only after all I/O is done.
>
> There is a period of time from the timing of setting COMPLETED state to that
> of migration task exits,
> so it's problematic in COMPLETED->CANCELLED transition, but if applying your
> below proposal, the problem gone.
>
> do {
> old_state = s->state;
> if (old_state != MIG_STATE_SETUP && old_state != MIG_STATE_ACTIVE) {
> break;
> }
> migrate_set_state(s, old_state, MIG_STATE_CANCELLED);
> } while (s->state != MIG_STATE_CANCELLED);
Ok.
>> I agree with Eric that the CANCELLING state should not be exposed via QMP.
>> "info migrate" and "query-migrate" can keep showing "active" for maximum
>> backwards compatibility.
>>
>> More comments below.
>>
>>
>>> - if (s->state != MIG_STATE_COMPLETED) {
>>> + if (s->state != MIG_STATE_COMPLETING) {
>>> qemu_savevm_state_cancel();
>>> + if (s->state == MIG_STATE_CANCELLING) {
>>> + migrate_set_state(s, MIG_STATE_CANCELLING,
>>> MIG_STATE_CANCELLED);
>>> + }
>>
>> I think you can remove the "if" and unconditionally call migrate_set_state.
>
> Do you mean to remove the "if (s->state == MIG_STATE_CANCELLING)" ?
> The s->state probably is MIG_STATE_ERROR here, is it okay to unconditionally
> call migrate_set_state?
migrate_set_state has atomic_cmpxchg so it has an "implicit" if, but
you're right it's clearer this way.
Paolo
> Thanks,
> Zhang Haoyu
>
>>
>>> + }else {
>>> + migrate_set_state(s, MIG_STATE_COMPLETING,
>>> + MIG_STATE_COMPLETED);
>>> }
>>>
>>> notifier_list_notify(&migration_state_notifiers, s); }
>>>
>>> -static void migrate_set_state(MigrationState *s, int old_state, int
>>> new_state) -{
>>> - if (atomic_cmpxchg(&s->state, old_state, new_state) == new_state) {
>>> - trace_migrate_set_state(new_state);
>>> - }
>>> -}
>>> -
>>> void migrate_fd_error(MigrationState *s) {
>>> DPRINTF("setting error state\n"); @@ -328,7 +337,7 @@ static void
>>> migrate_fd_cancel(MigrationState *s) {
>>> DPRINTF("cancelling migration\n");
>>>
>>> - migrate_set_state(s, s->state, MIG_STATE_CANCELLED);
>>> + migrate_set_state(s, s->state, MIG_STATE_CANCELLING);
>>
>> Here probably we want something like
>>
>> do {
>> old_state = s->state;
>> if (old_state != MIG_STATE_SETUP && old_state != MIG_STATE_ACTIVE) {
>> break;
>> }
>> migrate_set_state(s, old_state, MIG_STATE_CANCELLING);
>> } while (s->state != MIG_STATE_CANCELLING);
>>
>> to avoid a bogus COMPLETED->CANCELLED transition. Please separate the patch
>> in two parts:
>>
>> (1) the first uses the above code, with CANCELLED instead of CANCELLING
>>
>> (2) the second, similar to the one you have posted, introduces the new
>> CANCELLING state
>>
>> Thanks,
>>
>> Paolo
Re: [Qemu-devel] [PATCH] migration: avoid starting a new migration task while the previous one still exist, Eric Blake, 2013/11/04