[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [Qemu-devel] [PATCH RFC v6 6/7] migration: fix some error handling
From: |
Dr. David Alan Gilbert |
Subject: |
Re: [Qemu-devel] [PATCH RFC v6 6/7] migration: fix some error handling |
Date: |
Wed, 31 Oct 2018 16:30:41 +0000 |
User-agent: |
Mutt/1.10.1 (2018-07-13) |
* Fei Li (address@hidden) wrote:
>
>
> On 10/31/2018 03:49 AM, Dr. David Alan Gilbert wrote:
> > * Fei Li (address@hidden) wrote:
> > > Add error handling for qemu_ram_foreach_migratable_block() when
> > > it fails.
> > >
> > > Always call migrate_set_error() to set the error state without relying
> > > on whether multifd_save_cleanup() succeeds. As the passed &local_err
> > > is never used in multifd_save_cleanup(), remove it.
> > >
> > > Signed-off-by: Fei Li <address@hidden>
> > > ---
> > > migration/migration.c | 5 +----
> > > migration/postcopy-ram.c | 3 +++
> > > migration/ram.c | 7 +++----
> > > migration/ram.h | 2 +-
> > > 4 files changed, 8 insertions(+), 9 deletions(-)
> > >
> > > diff --git a/migration/migration.c b/migration/migration.c
> > > index 87dfc7374f..3b8b7ab4f9 100644
> > > --- a/migration/migration.c
> > > +++ b/migration/migration.c
> > > @@ -1377,7 +1377,6 @@ static void migrate_fd_cleanup(void *opaque)
> > > qemu_savevm_state_cleanup();
> > > if (s->to_dst_file) {
> > > - Error *local_err = NULL;
> > > QEMUFile *tmp;
> > > trace_migrate_fd_cleanup();
> > > @@ -1388,9 +1387,7 @@ static void migrate_fd_cleanup(void *opaque)
> > > }
> > > qemu_mutex_lock_iothread();
> > > - if (multifd_save_cleanup(&local_err) != 0) {
> > > - error_report_err(local_err);
> > > - }
> > > + multifd_save_cleanup();
> > > qemu_mutex_lock(&s->qemu_file_lock);
> > > tmp = s->to_dst_file;
> > > s->to_dst_file = NULL;
> > > diff --git a/migration/postcopy-ram.c b/migration/postcopy-ram.c
> > > index e5c02a32c5..4ca2bc02b3 100644
> > > --- a/migration/postcopy-ram.c
> > > +++ b/migration/postcopy-ram.c
> > > @@ -1117,6 +1117,9 @@ int
> > > postcopy_ram_enable_notify(MigrationIncomingState *mis)
> > > /* Mark so that we get notified of accesses to unwritten areas */
> > > if (qemu_ram_foreach_migratable_block(ram_block_enable_notify,
> > > mis)) {
> > > + error_report("ram_block_enable_notify failed");
> > > + close(mis->userfault_event_fd);
> > > + close(mis->userfault_fd);
> > I don't think these close() calls are safe. This code is just after
> > starting the fault thread, and the fault thread has a poll() call on
> > these fd's, so we can't close them until we've instructed that thread
> > to exit.
> >
> > We should fall out through postcopy_ram_incoming_cleanup, and because
> > the thread exists it should do a notify to the thread, a join and then
> > only later do the close calls.
> >
> > Dave
> I see the postcopy_ram_incoming_cleanup() already include the
> notify & join the fault thread & close these two fds and other more cleanup,
> thus that means directly replace the above two close() with
> postcopy_ram_incoming_cleanup() is ok, right? :)
Yes; I think I'd do that in loadvm_postcopy_handle_listen perhaps.
It's a weird case, normally if it fails before 'listen' then we call
cleanup in process_incoming_migration_co, but after listen we call it
at the bottom of the listen thread. Here we've got a half-state where
we're trying and failing to enter listen.
> BTW, does the parameter "&local_err" for multifd_save_cleanup(&local_err) is
> reserved for some reason? As I notice no code is using this error parameter,
> and if it is reserved for nothing I'd like to delete it just as the second
> paragraph
> of the commit message said. :)
Yes, that's OK; it might be better as a separate patch; I suspect maybe
in an earlier version it had an error case.
Dave
> Have a nice day, thanks
> Fei
> >
> > > return -1;
> > > }
> > > diff --git a/migration/ram.c b/migration/ram.c
> > > index 8f03afe228..57cb1bee3d 100644
> > > --- a/migration/ram.c
> > > +++ b/migration/ram.c
> > > @@ -922,7 +922,7 @@ static void multifd_send_terminate_threads(Error *err)
> > > }
> > > }
> > > -int multifd_save_cleanup(Error **errp)
> > > +int multifd_save_cleanup(void)
> > > {
> > > int i;
> > > int ret = 0;
> > > @@ -1082,9 +1082,8 @@ static void multifd_new_send_channel_async(QIOTask
> > > *task, gpointer opaque)
> > > if (qio_task_propagate_error(task, &local_err)) {
> > > migrate_set_state(&s->state, s->state, MIGRATION_STATUS_FAILED);
> > > - if (multifd_save_cleanup(&local_err) != 0) {
> > > - migrate_set_error(migrate_get_current(), local_err);
> > > - }
> > > + multifd_save_cleanup();
> > > + migrate_set_error(migrate_get_current(), local_err);
> > > } else {
> > > p->c = QIO_CHANNEL(sioc);
> > > qio_channel_set_delay(p->c, false);
> > > diff --git a/migration/ram.h b/migration/ram.h
> > > index 046d3074be..0d1215209e 100644
> > > --- a/migration/ram.h
> > > +++ b/migration/ram.h
> > > @@ -43,7 +43,7 @@ uint64_t ram_bytes_remaining(void);
> > > uint64_t ram_bytes_total(void);
> > > int multifd_save_setup(void);
> > > -int multifd_save_cleanup(Error **errp);
> > > +int multifd_save_cleanup(void);
> > > int multifd_load_setup(void);
> > > int multifd_load_cleanup(Error **errp);
> > > bool multifd_recv_all_channels_created(void);
> > > --
> > > 2.13.7
> > >
> > --
> > Dr. David Alan Gilbert / address@hidden / Manchester, UK
> >
>
--
Dr. David Alan Gilbert / address@hidden / Manchester, UK
[Qemu-devel] [PATCH RFC v6 4/7] migration: fix some segmentation faults when using multifd, Fei Li, 2018/10/29
[Qemu-devel] [PATCH RFC v6 6/7] migration: fix some error handling, Fei Li, 2018/10/29
[Qemu-devel] [PATCH RFC v6 7/7] qemu_thread_create: propagate the error to callers to handle, Fei Li, 2018/10/29