[Qemu-devel] Re: [PATCH 09/10] Exit loop if we have been there too long

qemu-devel

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

[Qemu-devel] Re: [PATCH 09/10] Exit loop if we have been there too long

From:	Juan Quintela
Subject:	[Qemu-devel] Re: [PATCH 09/10] Exit loop if we have been there too long
Date:	Wed, 24 Nov 2010 12:01:51 +0100
User-agent:	Gnus/5.13 (Gnus v5.13) Emacs/23.2 (gnu/linux)

"Michael S. Tsirkin" <address@hidden> wrote:
> On Wed, Nov 24, 2010 at 12:03:06AM +0100, Juan Quintela wrote:
>> From: Juan Quintela <address@hidden>
>> 
>> cheking each 64 pages is a random magic number as good as any other.
>> We don't want to test too many times, but on the other hand,
>> qemu_get_clock_ns() is not so expensive either.
>> 
>
> Could you please explain what's the problem this fixes?
> I would like to see an API that documents the contract
> we are making with the backend.

buffered_file is an "abstraction" that uses a buffer.

live migration code (remember it can't sleep, it runs on the main loop)
stores its "stuff" on that buffer.  And a timer writes that buffer to
the fd that is associated with migration.

This design is due to the main_loop/no threads qemu model.

buffered_file timer runs each 100ms.  And we "try" to measure channel
bandwidth from there.  If we are not able to run the timer, all the
calculations are wrong, and then stalls happens.

>> @@ -269,6 +272,19 @@ int ram_save_live(Monitor *mon, QEMUFile *f, int stage, 
>> void *opaque)
>>          if (bytes_sent == 0) { /* no more blocks */
>>              break;
>>          }
>> +    /* we want to check in the 1st loop, just in case it was the 1st time
>> +           and we had to sync the dirty bitmap.
>> +           qemu_get_clock_ns() is a bit expensive, so we only check each 
>> some
>> +           iterations
>> +    */
>> +        if ((i & 63) == 0) {
>> +            uint64_t t1 = (qemu_get_clock_ns(rt_clock) - t0) / 1000000;
>
> This adds even more non-determinism to savevm behaviour.  If bandwidth
> limit is higth enough, I expect it to just keep going.

If we find a row of 512MB of zero pages together (and that happens if
you have a 64GB iddle guest, then you can spent more than 3seconds to
fill the default bandwith).  After that everything that uses the main
loop has had stalls.

>> +            if (t1 > buffered_file_interval/2) {
>
> arch_init should not depend on buffered_file implementation IMO.
>
> Also - / 2?

We need to run a timer each 100ms.  For times look at the 0/6 patch.
We can't spent more that 50ms in each function.  It is something that
should happen for all funnctions called from io_handlers.

>> +                printf("big delay %ld milliseconds, %d iterations\n", t1, 
>> i);
>
> Is this a debugging aid?

I left that on purpose, to show that it happens a lot.  There is no
DEBUG_ARCH or DEBUG_RAM around, I can create them if you preffer.  But
notice that this is something that shouldn't happen (but it happens).

DPRINTF for that file should be a good idea, will do.

>> +            break;
>> +        }
>> +    }
>> +        i++;
>>      }
>> 
>>      t0 = qemu_get_clock_ns(rt_clock) - t0;
>> -- 
>> 1.7.3.2
>>

[Prev in Thread]

Current Thread

[Next in Thread]

[Qemu-devel] [PATCH 03/10] Add printf debug to savevm, (continued)
- [Qemu-devel] [PATCH 03/10] Add printf debug to savevm, Juan Quintela, 2010/11/23
  - Message not available
    - Message not available
    - [Qemu-devel] Re: [PATCH 03/10] Add printf debug to savevm, Juan Quintela, 2010/11/30
- [Qemu-devel] [PATCH 06/10] Only calculate expected_time for stage 2, Juan Quintela, 2010/11/23
- [Qemu-devel] [PATCH 07/10] ram_save_remaining() returns an uint64_t, Juan Quintela, 2010/11/23
  - Message not available
    - [Qemu-devel] Re: [PATCH 07/10] ram_save_remaining() returns an uint64_t, Juan Quintela, 2010/11/30
  - Message not available
    - Message not available
    - [Qemu-devel] Re: [PATCH 07/10] ram_save_remaining() returns an uint64_t, Anthony Liguori, 2010/11/30
- [Qemu-devel] [PATCH 08/10] Count nanoseconds with uint64_t not doubles, Juan Quintela, 2010/11/23
  - Message not available
    - [Qemu-devel] Re: [PATCH 08/10] Count nanoseconds with uint64_t not doubles, Juan Quintela, 2010/11/30
- [Qemu-devel] [PATCH 09/10] Exit loop if we have been there too long, Juan Quintela, 2010/11/23
  - [Qemu-devel] Re: [PATCH 09/10] Exit loop if we have been there too long, Michael S. Tsirkin, 2010/11/24
    - [Qemu-devel] Re: [PATCH 09/10] Exit loop if we have been there too long, Juan Quintela <=
    - [Qemu-devel] Re: [PATCH 09/10] Exit loop if we have been there too long, Michael S. Tsirkin, 2010/11/24
    - [Qemu-devel] Re: [PATCH 09/10] Exit loop if we have been there too long, Paolo Bonzini, 2010/11/24
    - [Qemu-devel] Re: [PATCH 09/10] Exit loop if we have been there too long, Michael S. Tsirkin, 2010/11/24
    - Message not available
    - [Qemu-devel] Re: [PATCH 09/10] Exit loop if we have been there too long, Juan Quintela, 2010/11/30
  - Message not available
    - Message not available
    - [Qemu-devel] Re: [PATCH 09/10] Exit loop if we have been there too long, Anthony Liguori, 2010/11/30
    - [Qemu-devel] Re: [PATCH 09/10] Exit loop if we have been there too long, Paolo Bonzini, 2010/11/30
    - [Qemu-devel] Re: [PATCH 09/10] Exit loop if we have been there too long, Anthony Liguori, 2010/11/30
    - [Qemu-devel] Re: [PATCH 09/10] Exit loop if we have been there too long, Juan Quintela, 2010/11/30
    - [Qemu-devel] Re: [PATCH 09/10] Exit loop if we have been there too long, Avi Kivity, 2010/11/30
    - [Qemu-devel] Re: [PATCH 09/10] Exit loop if we have been there too long, Anthony Liguori, 2010/11/30

Prev by Date: Re: [Qemu-devel] [Patch] Small fix for qemu APIC for Mac OS X support
Next by Date: [Qemu-devel] Re: [PATCH 02/10] Add buffered_file_internal constant
Previous by thread: [Qemu-devel] Re: [PATCH 09/10] Exit loop if we have been there too long
Next by thread: [Qemu-devel] Re: [PATCH 09/10] Exit loop if we have been there too long
Index(es):
- Date
- Thread