From: Daniel P. Berrangé
Subject: Re: [RFC 1/1] migration: Update error description whenever migration fails
Date: Thu, 4 May 2023 09:16:42 +0100
User-agent: Mutt/2.2.9 (2022-11-12)

On Wed, May 03, 2023 at 08:31:16PM +0000, tejus.gk wrote:
> There are places in the code where the migration is marked failed with
> MIGRATION_STATUS_FAILED, but the failure reason is never updated. Hence
> libvirt doesn't know why the migration failed when it queries for it.
>
> Signed-off-by: tejus.gk <tejus.gk@nutanix.com>
> ---
> migration/migration.c | 8 ++++++++
> 1 file changed, 8 insertions(+)
>
> diff --git a/migration/migration.c b/migration/migration.c
> index feb5ab7493..0d7d34bf4d 100644
> --- a/migration/migration.c
> +++ b/migration/migration.c
> @@ -1665,8 +1665,11 @@ void qmp_migrate(const char *uri, bool has_blk, bool blk,
> }
> error_setg(errp, QERR_INVALID_PARAMETER_VALUE, "uri",
> "a valid migration protocol");
> + error_setg(&local_err, QERR_INVALID_PARAMETER_VALUE, "uri",
> + "a valid migration protocol");
> migrate_set_state(&s->state, MIGRATION_STATUS_SETUP,
> MIGRATION_STATUS_FAILED);
> + migrate_set_error(s, local_err);
> block_cleanup_parameters();
> return;
Most of this "} else {" block duplicates what is done in
the following "if (local_err)" block. As such I think this
should be deleted and replaced with merely
    } else {
        error_setg(&local_err, QERR_INVALID_PARAMETER_VALUE, "uri",
                   "a valid migration protocol");
        block_cleanup_parameters();
    }
...so we just fall through to the local_err cleanup block.
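
To illustrate the shape I mean, here is a rough stand-alone sketch. It is not QEMU code: SketchError, SketchState and the sketch_* helpers are hypothetical stand-ins for Error, MigrationState and the real error/state helpers, and the "tcp:" check merely stands in for the protocol dispatch. The point is only that every branch sets local_err and a single block afterwards handles the failure.

```c
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for QEMU's Error object. */
typedef struct SketchError {
    char msg[128];
} SketchError;

/* Hypothetical miniature of MigrationState: a status flag and the stored reason. */
typedef struct SketchState {
    bool failed;
    char error[128];
} SketchState;

/* Stand-in for error_setg(): allocate an error and hand it back via errp. */
static void sketch_error_set(SketchError **errp, const char *msg)
{
    SketchError *err = calloc(1, sizeof(*err));
    if (!err) {
        abort();
    }
    snprintf(err->msg, sizeof(err->msg), "%s", msg);
    *errp = err;
}

/* The single failure path: mark failed, record the reason, free the error. */
static void sketch_fail(SketchState *s, SketchError *err)
{
    s->failed = true;
    snprintf(s->error, sizeof(s->error), "%s", err->msg);
    free(err);
}

static bool sketch_migrate(SketchState *s, const char *uri)
{
    SketchError *local_err = NULL;

    if (strncmp(uri, "tcp:", 4) == 0) {
        /* A real transport would start here and might set local_err. */
    } else {
        /* Unknown protocol: only describe the error, do not handle it here. */
        sketch_error_set(&local_err, "uri: expected a valid migration protocol");
    }

    /* Every branch falls through to this one cleanup block. */
    if (local_err) {
        sketch_fail(s, local_err);
        return false;
    }
    return true;
}
```

With this layout the unknown-protocol branch cannot forget to record the failure reason, because it never touches the state itself.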
> }
> @@ -2059,6 +2062,7 @@ static int postcopy_start(MigrationState *ms)
> int64_t bandwidth = migrate_max_postcopy_bandwidth();
> bool restart_block = false;
> int cur_state = MIGRATION_STATUS_ACTIVE;
> + Error *local_err = NULL;
>
> if (migrate_postcopy_preempt()) {
> migration_wait_main_channel(ms);
> @@ -2203,8 +2207,10 @@ static int postcopy_start(MigrationState *ms)
> ret = qemu_file_get_error(ms->to_dst_file);
> if (ret) {
> error_report("postcopy_start: Migration stream errored");
> + error_setg(&local_err, "postcopy_start: Migration stream errored");
There is an earlier place in this method which also calls
error_report, which you've not changed to call migrate_set_error.
Even crazier, the caller of postcopy_start() also calls
error_report(), but with a useless error message.
Also, nothing frees the local_err object once it is set.
IMHO, the postcopy_start() method should be changed to accept
an "Error **errp" parameter, and then the caller should be
responsible for calling error_report_err and migrate_set_error.
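
Roughly, the ownership split could look like the stand-alone sketch below. Again this is not QEMU code: PcError, PcState and the pc_* functions are hypothetical stand-ins, with the snprintf into the state playing the role of migrate_set_error (which, as far as I recall, keeps its own copy) and the fprintf plus free playing the role of error_report_err.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for QEMU's Error object. */
typedef struct PcError {
    char msg[128];
} PcError;

/* Hypothetical miniature of MigrationState. */
typedef struct PcState {
    char error[128];   /* reason a management app would query */
    int reported;      /* how many times an error was printed */
} PcState;

/* Stand-in for error_setg(). */
static void pc_error_set(PcError **errp, const char *msg)
{
    PcError *err = calloc(1, sizeof(*err));
    if (!err) {
        abort();
    }
    snprintf(err->msg, sizeof(err->msg), "%s", msg);
    *errp = err;
}

/* Callee: only describes the failure; it neither reports nor stores it. */
static int pc_postcopy_start(int stream_errored, PcError **errp)
{
    if (stream_errored) {
        pc_error_set(errp, "postcopy_start: Migration stream errored");
        return -1;
    }
    return 0;
}

/* Caller: owns the error, so it records it, reports it once, and frees it. */
static int pc_caller(PcState *s, int stream_errored)
{
    PcError *local_err = NULL;

    if (pc_postcopy_start(stream_errored, &local_err) < 0) {
        /* Record first (copies the message), in place of migrate_set_error. */
        snprintf(s->error, sizeof(s->error), "%s", local_err->msg);
        /* Then report and free, in place of error_report_err. */
        fprintf(stderr, "%s\n", local_err->msg);
        s->reported++;
        free(local_err);
        return -1;
    }
    return 0;
}
```

Because the callee never prints anything, there is exactly one message per failure, and it is the same text that ends up in the migration state.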
> migrate_set_state(&ms->state, MIGRATION_STATUS_POSTCOPY_ACTIVE,
> MIGRATION_STATUS_FAILED);
> + migrate_set_error(ms, local_err);
> }
>
> trace_postcopy_preempt_enabled(migrate_postcopy_preempt());
> @@ -3233,7 +3239,9 @@ void migrate_fd_connect(MigrationState *s, Error *error_in)
> if (migrate_postcopy_ram() || migrate_return_path()) {
> if (open_return_path_on_source(s, !resume)) {
> error_report("Unable to open return-path for postcopy");
> + error_setg(&local_err, "Unable to open return-path");
Having two different error messages is bad, and again nothing frees
the local_err object. Remove the error_report call and instead call
error_report_err(local_err), which does free the object.
> migrate_set_state(&s->state, s->state, MIGRATION_STATUS_FAILED);
> + migrate_set_error(s, local_err);
> migrate_fd_cleanup(s);
> return;
> }
> --
> 2.22.3
>
>
With regards,
Daniel
--
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|