From: zhanghailiang
Subject: [Qemu-devel] [PATCH COLO-Frame v19 16/22] COLO: Shutdown related socket fd while do failover
Date: Thu, 1 Sep 2016 11:24:19 +0800
If the network connection between the primary host and the secondary host
breaks while the COLO/COLO incoming thread is blocked in read()/write() on
a socket fd, the error is not detected until the connection times out,
which can take a long time.
Here we shut down all the related socket file descriptors in the failover
BH to wake up the blocked operation. Besides, the corresponding file
descriptors must be closed only after the failover BH has shut them down;
otherwise the BH may shut down an fd that has already been reused by
another thread.
Signed-off-by: zhanghailiang <address@hidden>
Signed-off-by: Li Zhijian <address@hidden>
Reviewed-by: Dr. David Alan Gilbert <address@hidden>
Cc: Dr. David Alan Gilbert <address@hidden>
---
v19:
- fix the title
v17:
- Rename colo_sem to colo_exit_sem.
v13:
- Add Reviewed-by tag
- Use semaphore to notify colo/colo incoming loop that
failover work is finished.
v12:
- Shutdown both QEMUFile's fd though they may use the
same fd. (Dave's suggestion)
v11:
- Only shut down the fd once
---
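For readers unfamiliar with the mechanism, here is a minimal standalone
sketch (not QEMU code) of the idea behind this patch: calling shutdown() on
a socket wakes up a thread that is blocked in recv() on it, without
releasing the descriptor. The names reader_arg, blocked_reader and
shutdown_wakes_reader are hypothetical, and the sketch assumes Linux socket
semantics and POSIX threads.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper state: the fd to read from and recv()'s result. */
struct reader_arg {
    int fd;
    ssize_t result;
};

static void *blocked_reader(void *opaque)
{
    struct reader_arg *arg = opaque;
    char buf[16];

    /* The peer never sends data, so this blocks until shutdown(). */
    arg->result = recv(arg->fd, buf, sizeof(buf), 0);
    return NULL;
}

/*
 * Returns the value recv() reported in the reader thread after the
 * socket was shut down, or -2 if the setup itself failed.
 */
static int shutdown_wakes_reader(void)
{
    int sv[2];
    pthread_t tid;
    struct reader_arg arg = { .fd = -1, .result = -2 };
    struct timespec delay = { .tv_sec = 0, .tv_nsec = 100 * 1000 * 1000 };

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0) {
        return -2;
    }
    arg.fd = sv[0];
    if (pthread_create(&tid, NULL, blocked_reader, &arg) != 0) {
        close(sv[0]);
        close(sv[1]);
        return -2;
    }

    /* Give the reader time to block; the result does not depend on it. */
    nanosleep(&delay, NULL);

    /* Wake the reader without closing the fd it is still using. */
    shutdown(sv[0], SHUT_RDWR);
    pthread_join(tid, NULL);

    /* Only now is it safe to release the descriptors. */
    close(sv[0]);
    close(sv[1]);
    return (int)arg.result;
}
```

Under these assumptions the reader's recv() returns 0 (end of stream) as
soon as the socket is shut down, instead of waiting for a network timeout.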
include/migration/migration.h | 3 +++
migration/colo.c | 43 +++++++++++++++++++++++++++++++++++++++++++
2 files changed, 46 insertions(+)
diff --git a/include/migration/migration.h b/include/migration/migration.h
index f4b215a..9406218 100644
--- a/include/migration/migration.h
+++ b/include/migration/migration.h
@@ -113,6 +113,7 @@ struct MigrationIncomingState {
QemuThread colo_incoming_thread;
/* The coroutine we should enter (back) after failover */
Coroutine *migration_incoming_co;
+ QemuSemaphore colo_incoming_sem;
/* See savevm.c */
LoadStateEntry_Head loadvm_handlers;
@@ -183,6 +184,8 @@ struct MigrationState
QSIMPLEQ_HEAD(src_page_requests, MigrationSrcPageRequest)
src_page_requests;
/* The RAMBlock used in the last src_page_request */
RAMBlock *last_req_rb;
+ /* The semaphore is used to notify COLO thread that failover is finished */
+ QemuSemaphore colo_exit_sem;
/* The last error that occurred */
Error *error;
diff --git a/migration/colo.c b/migration/colo.c
index f1fb2ef..fc89438 100644
--- a/migration/colo.c
+++ b/migration/colo.c
@@ -59,6 +59,18 @@ static void secondary_vm_do_failover(void)
/* recover runstate to normal migration finish state */
autostart = true;
}
+ /*
+ * Make sure the COLO incoming thread is not blocked in recv() or
+ * send(). If mis->from_src_file and mis->to_src_file use the same
+ * fd, the second shutdown() will return -1; we ignore this value
+ * because it is harmless.
+ */
+ if (mis->from_src_file) {
+ qemu_file_shutdown(mis->from_src_file);
+ }
+ if (mis->to_src_file) {
+ qemu_file_shutdown(mis->to_src_file);
+ }
old_state = failover_set_state(FAILOVER_STATUS_HANDLING,
FAILOVER_STATUS_COMPLETED);
@@ -67,6 +79,8 @@ static void secondary_vm_do_failover(void)
"secondary VM", old_state);
return;
}
+ /* Notify COLO incoming thread that failover work is finished */
+ qemu_sem_post(&mis->colo_incoming_sem);
/* For Secondary VM, jump to incoming co */
if (mis->migration_incoming_co) {
qemu_coroutine_enter(mis->migration_incoming_co);
@@ -81,6 +95,18 @@ static void primary_vm_do_failover(void)
migrate_set_state(&s->state, MIGRATION_STATUS_COLO,
MIGRATION_STATUS_COMPLETED);
+ /*
+ * Wake up the COLO thread, which may be blocked in recv() or send().
+ * s->rp_state.from_dst_file and s->to_dst_file may use the same fd,
+ * but shutting down the fd twice is harmless.
+ */
+ if (s->to_dst_file) {
+ qemu_file_shutdown(s->to_dst_file);
+ }
+ if (s->rp_state.from_dst_file) {
+ qemu_file_shutdown(s->rp_state.from_dst_file);
+ }
+
old_state = failover_set_state(FAILOVER_STATUS_HANDLING,
FAILOVER_STATUS_COMPLETED);
if (old_state != FAILOVER_STATUS_HANDLING) {
@@ -88,6 +114,8 @@ static void primary_vm_do_failover(void)
old_state);
return;
}
+ /* Notify COLO thread that failover work is finished */
+ qemu_sem_post(&s->colo_exit_sem);
}
void colo_do_failover(MigrationState *s)
@@ -362,6 +390,14 @@ out:
qemu_fclose(fb);
+ /* Hopefully the wait here will not be too long */
+ qemu_sem_wait(&s->colo_exit_sem);
+ qemu_sem_destroy(&s->colo_exit_sem);
+ /*
+ * Must be called after the failover BH has completed;
+ * otherwise the failover BH may shut down the wrong fd, one that
+ * was reused by another thread after we released it here.
+ */
if (s->rp_state.from_dst_file) {
qemu_fclose(s->rp_state.from_dst_file);
}
@@ -370,6 +406,7 @@ out:
void migrate_start_colo_process(MigrationState *s)
{
qemu_mutex_unlock_iothread();
+ qemu_sem_init(&s->colo_exit_sem, 0);
migrate_set_state(&s->state, MIGRATION_STATUS_ACTIVE,
MIGRATION_STATUS_COLO);
colo_process_checkpoint(s);
@@ -408,6 +445,8 @@ void *colo_process_incoming_thread(void *opaque)
uint64_t value;
Error *local_err = NULL;
+ qemu_sem_init(&mis->colo_incoming_sem, 0);
+
migrate_set_state(&mis->state, MIGRATION_STATUS_ACTIVE,
MIGRATION_STATUS_COLO);
@@ -517,6 +556,10 @@ out:
qemu_fclose(fb);
}
+ /* Hopefully the wait here will not be too long */
+ qemu_sem_wait(&mis->colo_incoming_sem);
+ qemu_sem_destroy(&mis->colo_incoming_sem);
+ /* Must be called after failover BH is completed */
if (mis->to_src_file) {
qemu_fclose(mis->to_src_file);
}
--
1.8.3.1
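
As an illustration of the ordering this patch enforces, the following
standalone sketch (not QEMU code) uses a POSIX semaphore in place of
QemuSemaphore. The worker closes its fd only after the failover path has
shut the fd down and posted the semaphore, so the failover path can never
call shutdown() on a descriptor number that has already been released and
reused. The names chan, worker, failover and run_failover_demo are
hypothetical, and the sketch assumes Linux with unnamed POSIX semaphores.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <pthread.h>
#include <semaphore.h>
#include <sys/socket.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical channel state shared by the worker and the failover path. */
struct chan {
    int fd;
    sem_t exit_sem;            /* posted when failover work is finished */
    int failover_done;         /* set by failover() before the post */
    int closed_after_failover; /* recorded by the worker before close() */
};

static void *worker(void *opaque)
{
    struct chan *c = opaque;
    char buf[16];

    /* Blocks until data arrives or the socket is shut down. */
    while (recv(c->fd, buf, sizeof(buf), 0) > 0) {
        /* A real loop would process the received data here. */
    }

    /* Wait until the failover path is done with the fd. */
    while (sem_wait(&c->exit_sem) < 0 && errno == EINTR) {
        /* Retry if interrupted by a signal. */
    }

    /* Safe to release: nobody will call shutdown() on this fd any more. */
    c->closed_after_failover = c->failover_done;
    close(c->fd);
    c->fd = -1;
    return NULL;
}

static void failover(struct chan *c)
{
    shutdown(c->fd, SHUT_RDWR); /* wake the worker if it is blocked */
    c->failover_done = 1;
    sem_post(&c->exit_sem);     /* tell the worker it may close the fd */
}

/* Returns 1 if the fd was closed only after failover finished, -1 on error. */
static int run_failover_demo(void)
{
    int sv[2];
    pthread_t tid;
    struct chan c = { .fd = -1 };
    struct timespec delay = { .tv_sec = 0, .tv_nsec = 100 * 1000 * 1000 };

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0) {
        return -1;
    }
    c.fd = sv[0];
    if (sem_init(&c.exit_sem, 0, 0) != 0) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }
    if (pthread_create(&tid, NULL, worker, &c) != 0) {
        sem_destroy(&c.exit_sem);
        close(sv[0]);
        close(sv[1]);
        return -1;
    }

    nanosleep(&delay, NULL); /* let the worker block in recv() */
    failover(&c);
    pthread_join(tid, NULL);

    sem_destroy(&c.exit_sem);
    close(sv[1]);            /* sv[0] was closed by the worker */
    return c.closed_after_failover;
}
```

In this sketch the semaphore is destroyed by the caller after joining the
worker, whereas the patch destroys it in the waiting thread right after
qemu_sem_wait(); either placement keeps the close after the post.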