[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [PATCH v2 2/2] migration: Replace the return path retry logic
From: |
Fabiano Rosas |
Subject: |
Re: [PATCH v2 2/2] migration: Replace the return path retry logic |
Date: |
Wed, 02 Aug 2023 17:04:45 -0300 |
Peter Xu <peterx@redhat.com> writes:
>> + if (await_return_path_close_on_source(s)) {
>> + trace_migration_return_path_pause_err();
>> + return MIG_THR_ERR_FATAL;
>> + }
>
> I see that here on return path failures we'll bail out, and actually it's
> against the instinction (that when pause it should have failed, so it's
> weird why it's returning 0).
>
> So how about above suggestion, plus here we just call
> await_return_path_close_on_source(), without caring about the retval?
So you are suggesting to remove the knowledge of the retry entirely from
the thread. It just reports the error and the postcopy_pause takes the
responsibility of ignoring it when we want to retry... It could be
clearer that way indeed.
>> +
>> migrate_set_state(&s->state, s->state,
>> MIGRATION_STATUS_POSTCOPY_PAUSED);
>>
>> @@ -2566,12 +2542,6 @@ static MigThrError postcopy_pause(MigrationState *s)
>> if (s->state == MIGRATION_STATUS_POSTCOPY_RECOVER) {
>> /* Woken up by a recover procedure. Give it a shot */
>>
>> - /*
>> - * Firstly, let's wake up the return path now, with a new
>> - * return path channel.
>> - */
>> - qemu_sem_post(&s->postcopy_pause_rp_sem);
>> -
>> /* Do the resume logic */
>> if (postcopy_do_resume(s) == 0) {
>> /* Let's continue! */
>> @@ -3259,7 +3229,7 @@ void migrate_fd_connect(MigrationState *s, Error
>> *error_in)
>> * QEMU uses the return path.
>> */
>> if (migrate_postcopy_ram() || migrate_return_path()) {
>> - if (open_return_path_on_source(s, !resume)) {
>> + if (open_return_path_on_source(s)) {
>> error_report("Unable to open return-path for postcopy");
>> migrate_set_state(&s->state, s->state, MIGRATION_STATUS_FAILED);
>> migrate_fd_cleanup(s);
>> @@ -3320,7 +3290,6 @@ static void migration_instance_finalize(Object *obj)
>> qemu_sem_destroy(&ms->rate_limit_sem);
>> qemu_sem_destroy(&ms->pause_sem);
>> qemu_sem_destroy(&ms->postcopy_pause_sem);
>> - qemu_sem_destroy(&ms->postcopy_pause_rp_sem);
>> qemu_sem_destroy(&ms->rp_state.rp_sem);
>> qemu_sem_destroy(&ms->rp_state.rp_pong_acks);
>> qemu_sem_destroy(&ms->postcopy_qemufile_src_sem);
>> @@ -3340,7 +3309,6 @@ static void migration_instance_init(Object *obj)
>> migrate_params_init(&ms->parameters);
>>
>> qemu_sem_init(&ms->postcopy_pause_sem, 0);
>> - qemu_sem_init(&ms->postcopy_pause_rp_sem, 0);
>> qemu_sem_init(&ms->rp_state.rp_sem, 0);
>> qemu_sem_init(&ms->rp_state.rp_pong_acks, 0);
>> qemu_sem_init(&ms->rate_limit_sem, 0);
>> diff --git a/migration/migration.h b/migration/migration.h
>> index b7c8b67542..e78db5361c 100644
>> --- a/migration/migration.h
>> +++ b/migration/migration.h
>> @@ -382,7 +382,6 @@ struct MigrationState {
>>
>> /* Needed by postcopy-pause state */
>> QemuSemaphore postcopy_pause_sem;
>> - QemuSemaphore postcopy_pause_rp_sem;
>> /*
>> * Whether we abort the migration if decompression errors are
>> * detected at the destination. It is left at false for qemu
>> diff --git a/migration/trace-events b/migration/trace-events
>> index 5259c1044b..19ec649d1d 100644
>> --- a/migration/trace-events
>> +++ b/migration/trace-events
>> @@ -164,6 +164,7 @@ migration_rate_limit_pre(int ms) "%d ms"
>> migration_rate_limit_post(int urgent) "urgent: %d"
>> migration_return_path_end_before(void) ""
>> migration_return_path_end_after(int rp_error) "%d"
>> +migration_return_path_pause_err(void) ""
>
> If it should never trigger, it shouldn't need a tracepoint. It needs an
> assertion if we're 100% confident, or error_report_once() perhaps would be
> more suitable.
It would trigger when a rp error happened that wasn't related to the
QEMUFile. If we go with your suggestion above, then this goes away.
- [PATCH v2 1/2] migration: Split await_return_path_close_on_source, (continued)
- [PATCH v2 1/2] migration: Split await_return_path_close_on_source, Fabiano Rosas, 2023/08/02
- Re: [PATCH v2 1/2] migration: Split await_return_path_close_on_source, Peter Xu, 2023/08/02
- Re: [PATCH v2 1/2] migration: Split await_return_path_close_on_source, Fabiano Rosas, 2023/08/02
- Re: [PATCH v2 1/2] migration: Split await_return_path_close_on_source, Peter Xu, 2023/08/02
- Re: [PATCH v2 1/2] migration: Split await_return_path_close_on_source, Fabiano Rosas, 2023/08/03
- Re: [PATCH v2 1/2] migration: Split await_return_path_close_on_source, Peter Xu, 2023/08/03
- Re: [PATCH v2 1/2] migration: Split await_return_path_close_on_source, Daniel P . Berrangé, 2023/08/03
- Re: [PATCH v2 1/2] migration: Split await_return_path_close_on_source, Peter Xu, 2023/08/03
[PATCH v2 2/2] migration: Replace the return path retry logic, Fabiano Rosas, 2023/08/02