[Top][All Lists]

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [PATCH v3] migration/block:limit the time used for bloc

From: 858585 jemmy
Subject: Re: [Qemu-devel] [PATCH v3] migration/block:limit the time used for block migration
Date: Sat, 8 Apr 2017 21:17:58 +0800

On Fri, Apr 7, 2017 at 7:34 PM, Stefan Hajnoczi <address@hidden> wrote:
> On Fri, Apr 07, 2017 at 09:30:33AM +0800, 858585 jemmy wrote:
>> On Thu, Apr 6, 2017 at 10:02 PM, Stefan Hajnoczi <address@hidden> wrote:
>> > On Wed, Apr 05, 2017 at 05:27:58PM +0800, address@hidden wrote:
>> >
>> > A proper solution is to refactor the synchronous code to make it
>> > asynchronous.  This might require invoking the system call from a
>> > thread pool worker.
>> >
>> yes, i agree with you, but this is a big change.
>> I will try to find how to optimize this code, maybe need a long time.
>> this patch is not a perfect solution, but can alleviate the problem.
> Let's try to understand the problem fully first.

when migrate the vm with high speed, i find vnc response slowly sometime.
not only vnc response slowly, virsh console aslo response slowly sometime.
and the guest os block io performance is also reduce.

the bug can be reproduce by this command:
virsh migrate-setspeed 165cf436-312f-47e7-90f2-f8aa63f34893 900
virsh migrate --live 165cf436-312f-47e7-90f2-f8aa63f34893
--copy-storage-inc qemu+ssh://

and --copy-storage-all have no problem.
virsh migrate --live 165cf436-312f-47e7-90f2-f8aa63f34893
--copy-storage-all qemu+ssh://

compare the difference between --copy-storage-inc and
--copy-storage-all. i find out the reason is
mig_save_device_bulk invoke bdrv_is_allocated, but bdrv_is_allocated
is synchronous and maybe wait
for a long time.

i write this code to measure the time used by  brdrv_is_allocated()

 279     static int max_time = 0;
 280     int tmp;

 288             clock_gettime(CLOCK_MONOTONIC_RAW, &ts1);
 289             ret = bdrv_is_allocated(blk_bs(bb), cur_sector,
 290                                     MAX_IS_ALLOCATED_SEARCH, &nr_sectors);
 291             clock_gettime(CLOCK_MONOTONIC_RAW, &ts2);
 294             tmp =  (ts2.tv_sec - ts1.tv_sec)*1000000000L
 295                            + (ts2.tv_nsec - ts1.tv_nsec);
 296             if (tmp > max_time) {
 297                max_time=tmp;
 298                fprintf(stderr, "max_time is %d\n", max_time);
 299             }

the test result is below:

 max_time is 37014
 max_time is 1075534
 max_time is 17180913
 max_time is 28586762
 max_time is 49563584
 max_time is 103085447
 max_time is 110836833
 max_time is 120331438

bdrv_is_allocated is called after qemu_mutex_lock_iothread.
and the main thread is also call qemu_mutex_lock_iothread.
so cause the main thread maybe wait for a long time.

   if (bmds->shared_base) {
        /* Skip unallocated sectors; intentionally treats failure as
         * an allocated sector */
        while (cur_sector < total_sectors &&
               !bdrv_is_allocated(blk_bs(bb), cur_sector,
                                  MAX_IS_ALLOCATED_SEARCH, &nr_sectors)) {
            cur_sector += nr_sectors;

#0  0x00007f107322f264 in __lll_lock_wait () from /lib64/libpthread.so.0
#1  0x00007f107322a508 in _L_lock_854 () from /lib64/libpthread.so.0
#2  0x00007f107322a3d7 in pthread_mutex_lock () from /lib64/libpthread.so.0
#3  0x0000000000949ecb in qemu_mutex_lock (mutex=0xfc51a0) at
#4  0x0000000000459e58 in qemu_mutex_lock_iothread () at /root/qemu/cpus.c:1516
#5  0x0000000000945322 in os_host_main_loop_wait (timeout=28911939) at
#6  0x00000000009453f2 in main_loop_wait (nonblocking=0) at util/main-loop.c:517
#7  0x00000000005c76b4 in main_loop () at vl.c:1898
#8  0x00000000005ceb77 in main (argc=49, argv=0x7fff921182b8,
envp=0x7fff92118448) at vl.c:4709

> Stefan

reply via email to

[Prev in Thread] Current Thread [Next in Thread]