
[Qemu-commits] [qemu/qemu] 458fec: migration: provide an error message t


From: Richard Henderson
Subject: [Qemu-commits] [qemu/qemu] 458fec: migration: provide an error message to migration_c...
Date: Thu, 04 Nov 2021 03:32:47 -0700

  Branch: refs/heads/master
  Home:   https://github.com/qemu/qemu
  Commit: 458fecca80963b4c2c2164889d817542d2cece4f
      
https://github.com/qemu/qemu/commit/458fecca80963b4c2c2164889d817542d2cece4f
  Author: Laurent Vivier <lvivier@redhat.com>
  Date:   2021-11-03 (Wed, 03 Nov 2021)

  Changed paths:
    M migration/migration.c
    M migration/migration.h
    M migration/ram.c

  Log Message:
  -----------
  migration: provide an error message to migration_cancel()

This avoids calling migrate_get_current() in the caller, since
migration_cancel() already needs the pointer to the current
migration state.

Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>


  Commit: fa0b31d585b66e0a9e6e9e72edf729ad2f10e6fd
      
https://github.com/qemu/qemu/commit/fa0b31d585b66e0a9e6e9e72edf729ad2f10e6fd
  Author: yuxiating <yuxiating@huawei.com>
  Date:   2021-11-03 (Wed, 03 Nov 2021)

  Changed paths:
    M migration/migration.c

  Log Message:
  -----------
  migration: initialise compression_counters for a new migration

If a migration with compression fails or is canceled, querying
compression_counters during the next migration with compression returns
wrong values.

Signed-off-by: yuxiating <yuxiating@huawei.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
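
A minimal sketch of the idea, not the actual patch: the counters are
cleared whenever a new migration begins, so a failed or canceled attempt
leaves nothing behind for the next query. CompressionCountersSketch,
counters_sketch, account_compressed_page() and migration_begin_sketch()
are hypothetical names invented for this illustration; they are not QEMU
identifiers.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for QEMU's compression statistics. */
typedef struct {
    uint64_t pages;
    uint64_t compressed_size;
} CompressionCountersSketch;

CompressionCountersSketch counters_sketch;

/* Accounts one compressed page, as a running migration would. */
void account_compressed_page(uint64_t compressed_bytes)
{
    counters_sketch.pages++;
    counters_sketch.compressed_size += compressed_bytes;
}

/* Called at the start of every new migration: drop leftovers from the last one. */
void migration_begin_sketch(void)
{
    memset(&counters_sketch, 0, sizeof(counters_sketch));
}

/* Walks through the scenario described in the commit message. */
void demo_failed_then_new_migration(void)
{
    migration_begin_sketch();
    account_compressed_page(100);          /* first migration makes progress */
    assert(counters_sketch.pages == 1);

    /* The first migration fails or is canceled; a new one starts. */
    migration_begin_sketch();
    assert(counters_sketch.pages == 0);    /* no stale value is reported */
}
```

Without the reset in migration_begin_sketch(), the second query in the
demo would still see the page counted by the first attempt.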


  Commit: 02abee3d5129e1430af35ba4a174980a5a22d64e
      
https://github.com/qemu/qemu/commit/02abee3d5129e1430af35ba4a174980a5a22d64e
  Author: Juan Quintela <quintela@redhat.com>
  Date:   2021-11-03 (Wed, 03 Nov 2021)

  Changed paths:
    M migration/savevm.c

  Log Message:
  -----------
  migration: Zero migration compression counters

Based on the previous patch from yuxiating <yuxiating@huawei.com>.

Signed-off-by: Juan Quintela <quintela@redhat.com>


  Commit: ae4c2099351a6ea5774428a0884ace32595b341c
      
https://github.com/qemu/qemu/commit/ae4c2099351a6ea5774428a0884ace32595b341c
  Author: Rao, Lei <lei.rao@intel.com>
  Date:   2021-11-03 (Wed, 03 Nov 2021)

  Changed paths:
    M migration/colo.c
    M net/colo-compare.c

  Log Message:
  -----------
  Some minor optimizations for COLO

Signed-off-by: Lei Rao <lei.rao@intel.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>


  Commit: aa505f8e0e04ddb404c00b907b176a6204cbb3c7
      
https://github.com/qemu/qemu/commit/aa505f8e0e04ddb404c00b907b176a6204cbb3c7
  Author: Rao, Lei <lei.rao@intel.com>
  Date:   2021-11-03 (Wed, 03 Nov 2021)

  Changed paths:
    M migration/migration.c

  Log Message:
  -----------
  Fixed qemu crash when guest power off in COLO mode

This patch fixes the following crash:
qemu-system-x86_64: invalid runstate transition: 'shutdown' -> 'running'
Aborted (core dumped)
The gdb backtrace is as follows:
0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
1  0x00007faa3d613859 in __GI_abort () at abort.c:79
2  0x000055c5a21268fd in runstate_set (new_state=RUN_STATE_RUNNING) at vl.c:723
3  0x000055c5a1f8cae4 in vm_prepare_start () at /home/workspace/colo-qemu/cpus.c:2206
4  0x000055c5a1f8cb1b in vm_start () at /home/workspace/colo-qemu/cpus.c:2213
5  0x000055c5a2332bba in migration_iteration_finish (s=0x55c5a4658810) at migration/migration.c:3376
6  0x000055c5a2332f3b in migration_thread (opaque=0x55c5a4658810) at migration/migration.c:3527
7  0x000055c5a251d68a in qemu_thread_start (args=0x55c5a5491a70) at util/qemu-thread-posix.c:519
8  0x00007faa3d7e9609 in start_thread (arg=<optimized out>) at pthread_create.c:477
9  0x00007faa3d710293 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Signed-off-by: Lei Rao <lei.rao@intel.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>


  Commit: 684bfd1820d45435ee4956e5a3ed231121b36634
      
https://github.com/qemu/qemu/commit/684bfd1820d45435ee4956e5a3ed231121b36634
  Author: Rao, Lei <lei.rao@intel.com>
  Date:   2021-11-03 (Wed, 03 Nov 2021)

  Changed paths:
    M migration/migration.c

  Log Message:
  -----------
  Fixed SVM hang when do failover before PVM crash

This patch fixes the following hang:
    Thread 1 (Thread 0x7f34ee738d80 (LWP 11212)):
    #0 __pthread_clockjoin_ex (threadid=139847152957184, thread_return=0x7f30b1febf30, clockid=<optimized out>, abstime=<optimized out>, block=<optimized out>) at pthread_join_common.c:145
    #1 0x0000563401998e36 in qemu_thread_join (thread=0x563402d66610) at util/qemu-thread-posix.c:587
    #2 0x00005634017a79fa in process_incoming_migration_co (opaque=0x0) at migration/migration.c:502
    #3 0x00005634019b59c9 in coroutine_trampoline (i0=63395504, i1=22068) at util/coroutine-ucontext.c:115
    #4 0x00007f34ef860660 in ?? () at ../sysdeps/unix/sysv/linux/x86_64/__start_context.S:91 from /lib/x86_64-linux-gnu/libc.so.6
    #5 0x00007f30b21ee730 in ?? ()
    #6 0x0000000000000000 in ?? ()

    Thread 13 (Thread 0x7f30b3dff700 (LWP 11747)):
    #0  __lll_lock_wait (futex=futex@entry=0x56340218ffa0 <qemu_global_mutex>, private=0) at lowlevellock.c:52
    #1  0x00007f34efa000a3 in _GI__pthread_mutex_lock (mutex=0x56340218ffa0 <qemu_global_mutex>) at ../nptl/pthread_mutex_lock.c:80
    #2  0x0000563401997f99 in qemu_mutex_lock_impl (mutex=0x56340218ffa0 <qemu_global_mutex>, file=0x563401b7a80e "migration/colo.c", line=806) at util/qemu-thread-posix.c:78
    #3  0x0000563401407144 in qemu_mutex_lock_iothread_impl (file=0x563401b7a80e "migration/colo.c", line=806) at /home/workspace/colo-qemu/cpus.c:1899
    #4  0x00005634017ba8e8 in colo_process_incoming_thread (opaque=0x563402d664c0) at migration/colo.c:806
    #5  0x0000563401998b72 in qemu_thread_start (args=0x5634039f8370) at util/qemu-thread-posix.c:519
    #6  0x00007f34ef9fd609 in start_thread (arg=<optimized out>) at pthread_create.c:477
    #7  0x00007f34ef924293 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

    The QEMU main thread is holding the lock:
    (gdb) p qemu_global_mutex
    $1 = {lock = {_data = {lock = 2, __count = 0, __owner = 11212, __nusers = 9, __kind = 0, __spins = 0, __elision = 0, __list = {_prev = 0x0, __next = 0x0}},
     __size = "\002\000\000\000\000\000\000\000\314+\000\000\t", '\000' <repeats 26 times>, __align = 2}, file = 0x563401c07e4b "util/main-loop.c", line = 240,
    initialized = true}

From the call trace, we can see that this is a deadlock: the QEMU main thread
holds the global mutex while it waits for the COLO thread to end, and the COLO
thread wants to acquire the same global mutex. So we should release
qemu_global_mutex before waiting for the COLO thread to end.

Signed-off-by: Lei Rao <lei.rao@intel.com>
Reviewed-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
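
The locking pattern described above can be sketched with plain POSIX
threads. This is an illustration under assumptions, not the QEMU change
itself: global_lock_sketch stands in for qemu_global_mutex,
worker_sketch() for the COLO incoming thread, and
join_worker_without_deadlock() for the joining code in the main thread.
All of these names are hypothetical.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for qemu_global_mutex. */
pthread_mutex_t global_lock_sketch = PTHREAD_MUTEX_INITIALIZER;
bool worker_finished_sketch = false;

/* Stand-in for the COLO thread: it needs the global lock before it can exit. */
void *worker_sketch(void *opaque)
{
    (void)opaque;
    pthread_mutex_lock(&global_lock_sketch);
    worker_finished_sketch = true;
    pthread_mutex_unlock(&global_lock_sketch);
    return NULL;
}

/*
 * Stand-in for the main thread, which owns the global lock when it decides
 * to wait for the worker. Joining while still holding the lock would block
 * forever, so the lock is dropped around the join and retaken afterwards.
 */
bool join_worker_without_deadlock(void)
{
    pthread_t worker;

    pthread_mutex_lock(&global_lock_sketch);      /* main thread owns the lock */
    if (pthread_create(&worker, NULL, worker_sketch, NULL) != 0) {
        pthread_mutex_unlock(&global_lock_sketch);
        return false;
    }

    pthread_mutex_unlock(&global_lock_sketch);    /* release before waiting */
    int rc = pthread_join(worker, NULL);
    assert(rc == 0);
    pthread_mutex_lock(&global_lock_sketch);      /* retake for the code that follows */

    bool ok = worker_finished_sketch;
    pthread_mutex_unlock(&global_lock_sketch);
    return ok;
}
```

Removing the unlock/lock pair around pthread_join() reproduces the shape of
the hang in the trace: one thread blocked in the join, the other blocked on
the mutex.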


  Commit: ac183dac96603005ef2985d0e2ea2eb84fe2e03b
      
https://github.com/qemu/qemu/commit/ac183dac96603005ef2985d0e2ea2eb84fe2e03b
  Author: Rao, Lei <lei.rao@intel.com>
  Date:   2021-11-03 (Wed, 03 Nov 2021)

  Changed paths:
    M migration/colo.c

  Log Message:
  -----------
  colo: fixed 'Segmentation fault' when the simplex mode PVM poweroff

The GDB stack is as follows:
Program terminated with signal SIGSEGV, Segmentation fault.
0  object_class_dynamic_cast (class=0x55c8f5d2bf50, typename=0x55c8f2f7379e "qio-channel") at qom/object.c:832
         if (type->class->interfaces &&
[Current thread is 1 (Thread 0x7f756e97eb00 (LWP 1811577))]
(gdb) bt
0  object_class_dynamic_cast (class=0x55c8f5d2bf50, typename=0x55c8f2f7379e "qio-channel") at qom/object.c:832
1  0x000055c8f2c3dd14 in object_dynamic_cast (obj=0x55c8f543ac00, typename=0x55c8f2f7379e "qio-channel") at qom/object.c:763
2  0x000055c8f2c3ddce in object_dynamic_cast_assert (obj=0x55c8f543ac00, typename=0x55c8f2f7379e "qio-channel", file=0x55c8f2f73780 "migration/qemu-file-channel.c", line=117, func=0x55c8f2f73800 <__func__.18724> "channel_shutdown") at qom/object.c:786
3  0x000055c8f2bbc6ac in channel_shutdown (opaque=0x55c8f543ac00, rd=true, wr=true, errp=0x0) at migration/qemu-file-channel.c:117
4  0x000055c8f2bba56e in qemu_file_shutdown (f=0x7f7558070f50) at migration/qemu-file.c:67
5  0x000055c8f2ba5373 in migrate_fd_cancel (s=0x55c8f4ccf3f0) at migration/migration.c:1699
6  0x000055c8f2ba1992 in migration_shutdown () at migration/migration.c:187
7  0x000055c8f29a5b77 in main (argc=69, argv=0x7fff3e9e8c08, envp=0x7fff3e9e8e38) at vl.c:4512

The root cause is that migrate_fd_cancel() still tries to shut down
from_dst_file after it has already been closed by qemu_fclose() in
colo_process_checkpoint(). So we should set
s->rp_state.from_dst_file = NULL after qemu_fclose().

Signed-off-by: Lei Rao <lei.rao@intel.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
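
A minimal sketch of the close-then-clear pattern, assuming simplified
stand-in types. FileSketch, ReturnPathSketch, file_close_sketch(),
checkpoint_cleanup_sketch() and cancel_sketch() are hypothetical names for
this illustration and do not match QEMU's real structures or functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the file object and the return-path state. */
typedef struct { int fd; } FileSketch;
typedef struct { FileSketch *from_dst_file; } ReturnPathSketch;

/* Releases the file object, as a close routine would. */
void file_close_sketch(FileSketch *f)
{
    assert(f != NULL);
    free(f);
}

/* Checkpoint-side cleanup: close the file and clear the pointer. */
void checkpoint_cleanup_sketch(ReturnPathSketch *rp)
{
    if (rp->from_dst_file) {
        file_close_sketch(rp->from_dst_file);
        rp->from_dst_file = NULL;   /* later users see "no file", not a dangling pointer */
    }
}

/* Cancel-side path: touches the file only if it is still open. */
bool cancel_sketch(ReturnPathSketch *rp)
{
    if (!rp->from_dst_file) {
        return false;               /* nothing left to shut down */
    }
    rp->from_dst_file->fd = -1;     /* stand-in for shutting the channel down */
    return true;
}
```

If checkpoint_cleanup_sketch() left the pointer untouched, cancel_sketch()
would dereference freed memory, which is the same shape as the crash in the
stack above.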


  Commit: 04dd89169b94789aacde0a9b29943e8614879343
      
https://github.com/qemu/qemu/commit/04dd89169b94789aacde0a9b29943e8614879343
  Author: Rao, Lei <lei.rao@intel.com>
  Date:   2021-11-03 (Wed, 03 Nov 2021)

  Changed paths:
    M migration/colo.c

  Log Message:
  -----------
  Removed the qemu_fclose() in colo_process_incoming_thread

After the live migration, the related fd is cleaned up in
migration_incoming_state_destroy(), so the qemu_fclose()
in colo_process_incoming_thread is not necessary.

Signed-off-by: Lei Rao <lei.rao@intel.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>


  Commit: 2b9f6bf36c7483a07e45cc20cb0bb794769fb6d1
      
https://github.com/qemu/qemu/commit/2b9f6bf36c7483a07e45cc20cb0bb794769fb6d1
  Author: Rao, Lei <lei.rao@intel.com>
  Date:   2021-11-03 (Wed, 03 Nov 2021)

  Changed paths:
    M migration/colo.c

  Log Message:
  -----------
  Changed the last-mode to none of first start COLO

When COLO is first started, the last-mode is as follows:
{ "execute": "query-colo-status" }
{"return": {"last-mode": "primary", "mode": "primary", "reason": "none"}}

This last-mode is unreasonable. After the patch, it is changed to the
following:
{ "execute": "query-colo-status" }
{"return": {"last-mode": "none", "mode": "primary", "reason": "none"}}

Signed-off-by: Lei Rao <lei.rao@intel.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>


  Commit: e5fdf920964b65678798960d8b3a55453c2e9094
      
https://github.com/qemu/qemu/commit/e5fdf920964b65678798960d8b3a55453c2e9094
  Author: Lukas Straub <lukasstraub2@web.de>
  Date:   2021-11-03 (Wed, 03 Nov 2021)

  Changed paths:
    M migration/ram.c

  Log Message:
  -----------
  colo: Don't dump colo cache if dump-guest-core=off

One might set dump-guest-core=off to make coredumps smaller while
still being able to debug many QEMU bugs. Extend this option to the colo
cache.

Signed-off-by: Lukas Straub <lukasstraub2@web.de>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
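
On Linux, a memory region can be excluded from core dumps with
madvise(MADV_DONTDUMP). The sketch below shows that mechanism on an
anonymous mapping that stands in for the colo cache. It is an illustration
under assumptions, not the code from this commit: alloc_cache_sketch() and
its dump_guest_core parameter are hypothetical names, and the example is
Linux-specific.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE   /* exposes MAP_ANONYMOUS and MADV_DONTDUMP on glibc */
#endif
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/mman.h>

/*
 * Allocates an anonymous buffer standing in for the colo cache. When
 * dump_guest_core is false, the kernel is asked to leave the buffer out
 * of core dumps. Returns NULL on failure.
 */
void *alloc_cache_sketch(size_t len, bool dump_guest_core)
{
    assert(len > 0);

    void *buf = mmap(NULL, len, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (buf == MAP_FAILED) {
        return NULL;
    }
    if (!dump_guest_core && madvise(buf, len, MADV_DONTDUMP) != 0) {
        munmap(buf, len);
        return NULL;
    }
    return buf;
}
```

The buffer stays fully usable by the process; only its presence in a core
dump changes.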


  Commit: 64153ca613d0a50d1301eae4bd895aade001fcca
      
https://github.com/qemu/qemu/commit/64153ca613d0a50d1301eae4bd895aade001fcca
  Author: Rao, Lei <lei.rao@intel.com>
  Date:   2021-11-03 (Wed, 03 Nov 2021)

  Changed paths:
    M net/colo-compare.c
    M net/colo.c
    M net/colo.h
    M net/filter-rewriter.c

  Log Message:
  -----------
  Optimized the function of fill_connection_key.

Remove some unnecessary code to improve the performance of
the filter-rewriter module.

Signed-off-by: Lei Rao <lei.rao@intel.com>
Reviewed-by: Zhang Chen <chen.zhang@intel.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@trasno.org>


  Commit: 752e235464d62d31f14a9790b4b24e396c86bb0e
      
https://github.com/qemu/qemu/commit/752e235464d62d31f14a9790b4b24e396c86bb0e
  Author: Richard Henderson <richard.henderson@linaro.org>
  Date:   2021-11-04 (Thu, 04 Nov 2021)

  Changed paths:
    M migration/colo.c
    M migration/migration.c
    M migration/migration.h
    M migration/ram.c
    M migration/savevm.c
    M net/colo-compare.c
    M net/colo.c
    M net/colo.h
    M net/filter-rewriter.c

  Log Message:
  -----------
  Merge remote-tracking branch 'remotes/juanquintela/tags/migration-20211102-pull-request' into staging

Migration Pull request

Hi

These are the pending migration patches on the list:
- Provide an error message for migration_cancel by Laurent
- Don't dump colo cache when dump-guest-core=off by Lukas
- Initialise compression_counters for new migration by Yuxiating
  On top of that I added another missing initialization
- COLO optimizations and crash fixes by Rao.

Please, apply.

# gpg: Signature made Wed 03 Nov 2021 04:45:35 AM EDT
# gpg:                using RSA key 1899FF8EDEBF58CCEE034B82F487EF185872D723
# gpg: Good signature from "Juan Quintela <quintela@redhat.com>" [full]
# gpg:                 aka "Juan Quintela <quintela@trasno.org>" [full]

* remotes/juanquintela/tags/migration-20211102-pull-request:
  Optimized the function of fill_connection_key.
  colo: Don't dump colo cache if dump-guest-core=off
  Changed the last-mode to none of first start COLO
  Removed the qemu_fclose() in colo_process_incoming_thread
  colo: fixed 'Segmentation fault' when the simplex mode PVM poweroff
  Fixed SVM hang when do failover before PVM crash
  Fixed qemu crash when guest power off in COLO mode
  Some minor optimizations for COLO
  migration: Zero migration compression counters
  migration: initialise compression_counters for a new migration
  migration: provide an error message to migration_cancel()

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>


Compare: https://github.com/qemu/qemu/compare/b1fd92137e4d...752e235464d6


